Very frustrating experience with Gemini 2.5 function calling performance

The function calling behavior of Gemini models has become completely unreliable and unpredictable:

  • Function calling used to work somewhat reliably, but recently it has stopped working almost entirely.
  • Instead of invoking the function, the model simply follows the schema and instructions and generates a plain-text response without actually calling it (verified by inspecting the API response).
  • Occasionally, with the same query, the model will randomly use function calling again, but the output is often worse, ignoring the instructions and schema more than when it generates a text-only response (!)
  • gemini-2.5-flash-preview-04-17 is noticeably better than the production gemini-2.5-flash at following instructions and schemas, when it works at all. The fact that a preview model outperforms production raises concerns about stability and release practices.

This erratic behavior makes it impossible to build or trust production systems on top of these APIs.

Ongoing Bug:
The long-standing bug where the model starts its response with a ```json code block—even when explicitly instructed not to—remains unresolved after several months. This forces us to implement unreliable workarounds.
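Until the bug is fixed, the only option is post-processing. This is a minimal sketch of the kind of brittle workaround it forces on callers, assuming the payload is meant to be JSON:

```python
import json
import re

def parse_model_json(text: str):
    """Strip a leading ```json fence the model emits despite instructions,
    then parse the remaining payload as JSON."""
    cleaned = re.sub(r"^\s*```(?:json)?\s*", "", text)   # drop opening fence
    cleaned = re.sub(r"\s*```\s*$", "", cleaned)          # drop closing fence
    return json.loads(cleaned)

# A response wrapped in a fence the prompt explicitly asked the model to omit:
print(parse_model_json('```json\n{"status": "ok"}\n```'))  # → {'status': 'ok'}
```

It also has to pass through responses that are already clean, since the fence only appears some of the time.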

What is the reason for this regression/change of behavior? Will it be fixed?
Will Google ensure that production models match or exceed the reliability and quality of preview versions?

Gemini models are exceptional, IMO, but we can't build any serious application if this is the kind of performance and reliability we get.

Hello,

Could you please share your code so that we can reproduce your issue and report it to the engineering team?

How can I share the code? It's a fairly complex assistant with a layered agent architecture and a large schema for function calling. Should we jump on a call to share details and logs? That's probably the easiest way. IMO you need to work with design partners like us to get these issues resolved fast; it's difficult to test everything otherwise. Let me know.

Hello,

If you could share part of your code, that would be helpful, as we could then try to reproduce your issue. In the meantime, I have some suggestions that may help:

Function calls are highly dependent on the prompt and the tool descriptions, so you can try improving your tool descriptions and providing more specific instructions for when to call each function. If you always want a function call, there is also a feature to force one (example code below).

from google import genai
from google.genai import types

# power_disco_ball, start_music, and dim_lights are your existing
# function declarations (schemas defined elsewhere).

# Configure the client and tools
client = genai.Client()
house_tools = [
    types.Tool(function_declarations=[power_disco_ball, start_music, dim_lights])
]
config = types.GenerateContentConfig(
    tools=house_tools,
    # Disable the SDK's automatic function calling so you handle calls yourself.
    automatic_function_calling=types.AutomaticFunctionCallingConfig(
        disable=True
    ),
    # Force the model to call 'any' function, instead of chatting.
    tool_config=types.ToolConfig(
        function_calling_config=types.FunctionCallingConfig(mode='ANY')
    ),
)
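If forcing any function is too blunt, ANY mode can also be narrowed to specific functions. A sketch, assuming 'power_disco_ball' is one of your declared tools (check the SDK reference for the exact field name in your version):

```python
from google.genai import types

# Force a function call, but only allow the model to pick from this list.
# 'power_disco_ball' is a placeholder for one of your declared functions.
tool_config = types.ToolConfig(
    function_calling_config=types.FunctionCallingConfig(
        mode="ANY",
        allowed_function_names=["power_disco_ball"],
    )
)
```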

I tried what you suggested, but this is what I got :frowning:

Full Google API Error Object: {
  "error": {
    "code": 500,
    "message": "An internal error has occurred. Please retry or report in Troubleshooting guide | Gemini API | Google AI for Developers",
    "status": "INTERNAL"
  }
}
GoogleProvider Error: Google API error: An internal error has occurred. Please retry or report in Troubleshooting guide  |  Gemini API  |  Google AI for Developers
:collision: Error in message generation: Error: Google API error: An internal error has occurred. Please retry or report in Troubleshooting guide  |  Gemini API  |  Google AI for Developers
at GoogleProvider.generateChatCompletion
at async handleSecondStageConfigGeneration
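A 500 INTERNAL is nominally transient, so until the root cause is fixed the practical mitigation is retrying with backoff. A minimal sketch, not tied to any particular SDK; the `ApiError` class and the `code` attribute are assumptions about how a provider wrapper might surface errors:

```python
import random
import time

def call_with_retry(send, max_attempts=4, base_delay=1.0):
    """Retry a zero-argument request callable on HTTP 500 errors,
    with exponential backoff plus a little jitter."""
    for attempt in range(max_attempts):
        try:
            return send()
        except Exception as exc:
            code = getattr(exc, "code", None)
            if code != 500 or attempt == max_attempts - 1:
                raise  # not transient, or out of retries
            time.sleep(base_delay * (2 ** attempt) + random.random() * 0.1)

# Demo: a flaky call that fails twice with a 500, then succeeds.
class ApiError(Exception):
    def __init__(self, code):
        super().__init__(f"HTTP {code}")
        self.code = code

attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ApiError(500)
    return "ok"

print(call_with_retry(flaky, base_delay=0.01))  # → ok
```

This obviously doesn't fix the regression; it just keeps a production flow alive while the error rate is elevated.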

When I force the model to use the tools, the API call fails, and this is the request body that was sent (the user query plus the function-call config and schema):

:wrench: GoogleProvider: API request failed
:wrench: Request body: {
  "contents": [
    {
      "role": "user",
      "parts": [
        { "text": "…[User question]…" }
      ]
    }
  ],
  "generationConfig": {
    "temperature": 0,
    "maxOutputTokens": 8192
  },
  "systemInstruction": {
    "parts": [
      { "text": "\n …[Prompt]…" }
    ]
  },
  "tools": [
    {
      "function_declarations": [
        { …[Function schema]… }
      ]
    }
  ]
}
When I use my previous flow, i.e. without forcing, the request succeeds, but with all the issues I mentioned.

:wrench: GoogleProvider: API request successful

This is the code used to force the function call:

  if (options.force_function_call) {
    requestBody.toolConfig = {
      functionCallingConfig: {
        mode: "ANY" // This forces the model to call a function.
      }
    };
    console.log(`🔧 GoogleProvider: Function calling mode: FORCED (ANY)`);
  }

Again, the major issue here is that this worked without problems until about a week ago. Now the model returns text instead of calling the function (while still following the provided schema). It's a bug in the API/model.
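For anyone hitting the same regression: you can at least detect it deterministically by checking whether a candidate contains functionCall parts or only text. The field names follow the REST generateContent response shape; the sample candidates below are made up:

```python
def extract_function_calls(candidate: dict) -> list:
    """Return the functionCall parts of one response candidate, if any."""
    parts = candidate.get("content", {}).get("parts", [])
    return [p["functionCall"] for p in parts if "functionCall" in p]

# The buggy behavior: schema-shaped text, but no actual call.
text_only = {"content": {"parts": [{"text": '{"looks_like": "my schema"}'}]}}
assert extract_function_calls(text_only) == []

# The expected behavior: a real functionCall part.
called = {"content": {"parts": [
    {"functionCall": {"name": "start_music", "args": {"bpm": 120}}}
]}}
print(extract_function_calls(called)[0]["name"])  # → start_music
```

Logging this per request is how we measured the drop-off in actual function calls.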

And also, the model keeps starting every response with a ```json block even when explicitly asked not to; this has been going on for a couple of months with no resolution, forcing us into brittle workarounds.