Gemini Live API Tool Calling Issues – Inconsistent Behavior and Empty Tool Responses

Hi everyone,

I’m facing issues with the Gemini Live API related to tool (function) calling:

Main Issues:

  1. Incorrect Execution: Sometimes, despite declaring tools correctly for function calling, the model treats them as executable code instead of calling the tool.
  2. Empty Responses: Even when a tool is called and returns data, the model sometimes responds with nothing, as if the output was ignored.
  3. Tool Outputs Field: In some responses, it just shows "tool_outputs" without integrating the result naturally into the conversation.

What I’ve Done:

  • Validated the tool schema.
  • Logged tool responses on the backend.
  • Tested with multiple prompt variations.

Is this a known issue, or is there a workaround to ensure consistent tool behavior?

Hey @Mithun_Palanisamy, do you mind sharing sample code that reproduces the issue? That will help us investigate and resolve it more quickly.

Thanks.

I’m seeing the same. Actually I never got a proper tool call response, but ALWAYS executable code, although that feature should be disabled:

// this is our wrapper code, but the config is passed through as is:
	geminiLiveProcessor := agent_runtime.NewGeminiLiveProcessor(
		agent_runtime.WithGeminiModel("gemini-2.0-flash-live-001"),
		agent_runtime.WithGeminiClientConfig(func(config *genai.ClientConfig) {
			config.APIKey = geminiApiKey
		}),
		agent_runtime.WithLiveConfig(func(config *genai.LiveConnectConfig) {
			// default, but make explicit for test
			config.ResponseModalities = []genai.Modality{genai.ModalityAudio}
			config.SystemInstruction = &genai.Content{
				Role: "system",
				Parts: []*genai.Part{{
					Text: "You are an assistant helping the user to remember things. " +
						"You can only remember things by using your available tool functions to store memories. " +
						"NEVER try to write or execute code.",
				}},
			}
		}),
	)
// our base config
		liveConfig: &genai.LiveConnectConfig{
			// we can only use either text or audio, not both
			ResponseModalities:       []genai.Modality{genai.ModalityAudio},
			InputAudioTranscription:  &genai.AudioTranscriptionConfig{},
			OutputAudioTranscription: &genai.AudioTranscriptionConfig{},
			RealtimeInputConfig: &genai.RealtimeInputConfig{
				AutomaticActivityDetection: &genai.AutomaticActivityDetection{
					Disabled:                 false,
					StartOfSpeechSensitivity: genai.StartSensitivityLow,
					EndOfSpeechSensitivity:   genai.EndSensitivityLow,
					PrefixPaddingMs:          utils.Ptr[int32](100),
					SilenceDurationMs:        utils.Ptr[int32](300),
				},
				ActivityHandling: genai.ActivityHandlingStartOfActivityInterrupts,
				TurnCoverage:     genai.TurnCoverageTurnIncludesOnlyActivity,
			},
			SpeechConfig: &genai.SpeechConfig{
				LanguageCode: "en-US",
				VoiceConfig: &genai.VoiceConfig{
					PrebuiltVoiceConfig: &genai.PrebuiltVoiceConfig{
						VoiceName: "Puck",
					},
				},
			},
		},
// this is also passed through basically as-is as a genai.BehaviorBlocking tool function definition
	geminiLiveProcessor.RegisterFunction(agent_runtime.FunctionDefinition{
		Name:        "store_memory_in_db",
		Description: "stores a memory for later use",
		Parameters: map[string]any{
			"type":     "object",
			"required": []string{"memory"},
			"properties": map[string]any{
				"memory": map[string]any{
					"type": "string",
				},
			},
		},
	})

And then we do finally receive a model turn with an ExecutableCode part, but not a FunctionCall part.

Also @GUNAND_MAYANGLAMBAM, this example actually makes it look like this is expected… Is that the case?
Get_started_LiveAPI_tools.ipynb - Colab

@Martin_at_Cobbery try adding GoogleSearch() to the tools you declare. It looks a bit weird, but that resolved the issue for me.

Thanks, but for me this didn’t make a difference (and TBH I would have hated adding Google search access…). Maybe the Go and Python SDKs are different. I’m on google.golang.org/genai v1.11.1

I can confirm that this happens to us as well. The function is called about 1 out of 20 times for us, and sometimes the model outputs audio of the function’s description instead. You can find a GitHub issue from other people having the same problem here. I think at this point we can also confirm that this happens with other SDKs as well, not just the Python one.

Hey folks, I found the issue (code is from the Go SDK, but it should be similar in others):
There is websocketMessage.ServerContent.ModelTurn.Parts[0].FunctionCall, which almost never gets set, and idk what it is even supposed to indicate.
BUT there is also websocketMessage.ToolCall.FunctionCalls, which always works and contains what we actually expect!
The model still also fills ...Part.ExecutableCode, which I think is just an artifact of how function calling is implemented internally. Once you then send the tool call result, it even produces a fake ...Part.CodeExecutionResult which contains your result!