
How to expose FastAPI endpoints as tools with MCP and consume them from Gemini

6 min read · Jun 2, 2025


Do you know how cool Large Language Models (LLMs) are? They can chat, write code, brainstorm ideas… but they live in their own digital world. What if you want them to interact with your applications, fetch real-time data, or perform actions like making a reservation? That’s where “tools” come in.

Tools are extensions or functions that enable LLMs to interact with the real world. Think of them as a toolbox for your AI. They let an LLM get information not in its training data (like checking a database or calling an API for weather info) or execute actions (like booking an appointment).

But here’s a challenge: tools are often built using specific languages and frameworks. If your LLM is in Python and your tool is in JavaScript or C#, how do you get them to talk to each other? This is a classic interoperability problem.

This is exactly the problem the Model Context Protocol (MCP) was created to solve.

What is MCP?

Born out of Anthropic (the folks behind Claude), MCP is a standardized protocol designed to help AI models (LLMs) communicate with data sources and tools. It provides a common way for two components to communicate, defining how information is exchanged, which methods are available, how security is handled, and more.

The core idea is to provide a way to connect LLMs with the origins of data, tools, resources, context, and prompts. This allows LLMs to discover and utilize these external capabilities.

The general architecture involves MCP Clients (which can be LLMs, but also development environments, cloud tools, etc.) talking to MCP Servers. These servers can access external resources like databases, remote or local data, or APIs. The servers can run locally or be hosted remotely, exposing their tools.

Exposing FastAPI Endpoints as Tools with FastAPI-MCP

If you’re working with Python, chances are you’ve encountered or used FastAPI. It’s a super popular framework for building web applications and microservices in Python, known for its good practices, modern features, and great performance.

The awesome news is that you can easily expose your existing FastAPI endpoints as tools for LLMs using a specific extension: FastAPI-MCP. This means you can leverage the microservices you might already have deployed.

Here’s the magic:

  1. You have your standard FastAPI application, defining endpoints using decorators like @app.get("/weather/{country}"). These endpoints can fetch data (like weather) or perform actions.
  2. When defining these endpoints, it’s crucial to provide good descriptions, names, and parameter definitions. FastAPI’s Operation ID is particularly useful here, giving the tool a clear name.
  3. You create an instance of FastApiMCP, passing your FastAPI application to it.
  4. You can specify which operations (endpoints) you want to include or exclude from being exposed as tools. This is great if you only want to expose a subset of your API.
  5. Parameters like describe_all_responses and describe_full_response_schema control how much response-schema detail is included in each tool’s description, telling the MCP client (like an LLM) exactly what output the tool will provide, alongside the input schema FastAPI already generates. This schema information is essential for LLMs to understand how to use the tool.
  6. Finally, you “mount” the FastApiMCP instance onto your application, as shown in the sketch below.

That’s it! With just a few lines of code, your FastAPI application starts acting as an MCP server, exposing its chosen endpoints as discoverable tools.

FastAPI-MCP Example
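
As a rough illustration, here is a minimal sketch of such a setup. The weather endpoint and its payload are invented for this example, and the FastApiMCP parameter names may vary between versions of fastapi-mcp, so double-check them against the release you install:

```python
from fastapi import FastAPI
from fastapi_mcp import FastApiMCP

app = FastAPI()

# A plain FastAPI endpoint; the operation_id becomes the tool name.
@app.get("/weather/{country}", operation_id="get_weather_country")
async def get_weather(country: str) -> dict:
    """Return (fake) current weather information for the given country."""
    # In a real service this would query a weather API or a database.
    return {"country": country, "temperature_c": 24, "conditions": "sunny"}

# Wrap the app and choose how much schema detail to expose to MCP clients.
mcp = FastApiMCP(
    app,
    name="Weather MCP",
    describe_all_responses=True,
    describe_full_response_schema=True,
)

# Mount the MCP server; by default it is served under /mcp on the same app.
mcp.mount()
```

Run it with uvicorn (for example, uvicorn main:app --port 8000) and the MCP endpoint used in the rest of this article, http://127.0.0.1:8000/mcp, becomes available alongside the regular REST routes.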

Checking Things Out with the MCP Inspector

Before hooking up an LLM, how do you test if your MCP server is working correctly and exposing the tools as expected? Enter the MCP Inspector.

The Inspector is a helpful tool (available for Python and TypeScript/JavaScript) that lets you inspect and test an MCP server. You can run it from the command line, pointing it to your running MCP server’s URL (specifically the /mcp path).

It acts like a client, allowing you to see the tools your server is exposing (like get_weather_country). You can then use the Inspector to call these tools directly and see the response, confirming that your server logic is functioning correctly. It's a handy way to test your server without building a separate client application.
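
For example, with the server from the sketch above running locally, you might launch the Inspector like this (the exact command and flags depend on the Inspector version you have installed):

```
# Launch the MCP Inspector UI (Node.js required), then point it at
# your server's MCP endpoint, e.g. http://127.0.0.1:8000/mcp
npx @modelcontextprotocol/inspector
```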

MCP Inspector
MCP Inspector — Tool Response

Using MCP Tools from Gemini

Now for the exciting part: getting an LLM, like Google’s Gemini, to use your newly exposed tools! Gemini is a multimodal model, meaning it can handle various types of content, such as text, images, and more.

Here’s the general flow:

  1. You set up your Gemini client in Python, including providing your API key.
  2. You define the connection parameters for the MCP server. Since the server might be running elsewhere, you typically use the mcp-proxy command with the server's URL (e.g., http://127.0.0.1:8000/mcp).
  3. You establish a session with the MCP server using these parameters. This session allows two-way communication (reading and writing).
  4. The client asks the MCP server for the list of available tools using the list_tools method.
  5. The server returns the definitions of the exposed tools, including their names, descriptions, and input/output schemas.
  6. Crucially, the client code translates these tool definitions into a format that Gemini understands, specifically FunctionDeclaration objects, including the name, description, and parameters (like country of type string for the weather tool).
  7. You then provide these formatted tool definitions to the Gemini model when you send it the user’s query (e.g., “What is the weather in Mexico?”).
  8. Gemini analyzes the user’s query and the tools it has been given. Based on the tool descriptions and the query, it decides if one of the tools can help fulfill the request. This is the LLM’s reasoning process.
  9. If Gemini determines a tool is needed (e.g., the Get Weather Country tool for a weather question), it doesn't try to execute it. Instead, it responds to the client indicating that a function call is needed and provides the name of the function to call and the necessary arguments (like country: "Mexico").
  10. Your client code receives this function_call response from Gemini.
  11. Your client code then calls the appropriate tool on the MCP server, using the call_tool method of the session, passing the name and arguments provided by Gemini. The client is responsible for executing the tool.
  12. The MCP server executes the underlying FastAPI endpoint.
  13. The MCP server returns the result (e.g., a JSON object with weather data) back to your client code.
  14. Your client code takes this result from the MCP server and sends it back to Gemini as part of the conversation history, specifying that it’s a response from a function call. You include the original question, Gemini’s request to call the function, and the function’s response.
  15. Gemini receives the tool’s output and uses this new information to formulate a final, natural language response to the user. For example, it can parse the weather JSON and tell the user the temperature, conditions, wind speed, etc., in a human-readable format.

This two-step process (LLM asks the client to call the tool, client calls the tool and gives the result back to LLM) allows the LLM to leverage external data and actions without directly executing the code itself.
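
To make that loop concrete, here is a rough sketch of the manual integration using the google-genai and mcp Python packages, talking to the weather server from the earlier example through mcp-proxy. The model name, the URL, the simplified schema cleanup, and the way the weather result is pulled out of the tool response are assumptions for illustration, not the only way to do it:

```python
import asyncio
import os

from google import genai
from google.genai import types
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Steps 1-3: connect to the MCP server through mcp-proxy
# (stdio on our side, HTTP to the remote /mcp endpoint).
server_params = StdioServerParameters(
    command="mcp-proxy",
    args=["http://127.0.0.1:8000/mcp"],
)


def to_function_declaration(tool) -> types.FunctionDeclaration:
    """Steps 4-6: translate an MCP tool definition into a Gemini FunctionDeclaration."""
    # Gemini accepts a subset of JSON Schema; keep only the keys it understands.
    # A more robust converter would also clean nested properties.
    schema = {k: v for k, v in (tool.inputSchema or {}).items()
              if k in ("type", "properties", "required")}
    return types.FunctionDeclaration(
        name=tool.name,
        description=tool.description or "",
        parameters=schema,
    )


async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            tools_result = await session.list_tools()
            gemini_tool = types.Tool(
                function_declarations=[to_function_declaration(t) for t in tools_result.tools]
            )

            # Steps 7-9: send the question plus the tool definitions;
            # Gemini may answer with a function_call instead of text.
            question = "What is the weather in Mexico?"
            response = await client.aio.models.generate_content(
                model="gemini-2.0-flash",
                contents=question,
                config=types.GenerateContentConfig(tools=[gemini_tool]),
            )
            part = response.candidates[0].content.parts[0]
            if not part.function_call:
                print(response.text)
                return

            # Steps 10-13: the client executes the requested tool on the MCP server.
            fc = part.function_call
            result = await session.call_tool(fc.name, arguments=dict(fc.args))

            # Steps 14-15: hand the tool output back to Gemini, together with the
            # original question and its function_call turn, so it can write the answer.
            follow_up = await client.aio.models.generate_content(
                model="gemini-2.0-flash",
                contents=[
                    types.Content(role="user", parts=[types.Part.from_text(text=question)]),
                    response.candidates[0].content,
                    types.Content(
                        role="user",
                        parts=[types.Part.from_function_response(
                            name=fc.name,
                            # Assumes the tool returned a single text (JSON) content item.
                            response={"result": result.content[0].text},
                        )],
                    ),
                ],
                config=types.GenerateContentConfig(tools=[gemini_tool]),
            )
            print(follow_up.text)


asyncio.run(main())
```

In a real client you would loop until Gemini stops requesting function calls, since a single query can trigger several tool invocations.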

Manual integration of Gemini with MCP tools
Gemini SDKs have built-in support for MCP

Conclusions

Combining the power of LLMs like Gemini with frameworks like FastAPI through protocols like MCP opens up exciting possibilities. You can now easily connect your AI models to your existing services, giving them access to real-world data and the ability to perform real-world actions.

Using fastapi-mcp makes exposing your Python-based microservices as tools straightforward, and the MCP Inspector helps you test everything out. The detailed flow of using these tools from an LLM like Gemini shows how these components work together to enable truly intelligent applications that go beyond simple chat.

This is just one way to build MCP servers and clients. Stay tuned for more ways to integrate!

What do you think about MCP and integrating LLMs with your services? Let me know in the comments! 👇



Written by Juan Guillermo Gómez Torres

Tech Lead in Wordbox. @GDGCali Founder. @DevHackCali Founder. Firebase & GCP & Kotlin & AI @GoogleDevExpert
