fix: Force AUDIO modality for native-audio models in /run_live (#4206) #4232

AryanBagade · 2026-01-22T03:45:53Z

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

Closes: adk web: Native-audio models fail when TEXT modality is requested #4206

Testing Plan

Unit Tests:

I have added or updated unit tests for my change.
All unit tests pass locally.

pytest results:
35 passed, 71 warnings in 3.90s

Added 3 new tests:

test_native_audio_model_forces_audio_modality - Verifies native-audio models force AUDIO modality
test_non_native_audio_model_keeps_requested_modality - Verifies regular models keep requested modality
test_agent_without_model_attribute - Verifies BaseAgent (no model attr) doesn't crash

Manual End-to-End (E2E) Tests:
To test manually:

Create an agent with model="gemini-live-2.5-flash-native-audio"
Connect to /run_live WebSocket with modalities=["TEXT"]
Verify the modality is forced to ["AUDIO"] instead of failing

Full E2E testing was not possible because it requires access to the gemini-live-2.5-flash-native-audio model, which is not publicly available.
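The manual steps above could be scripted up to the point of opening the connection. The sketch below only builds the WebSocket URL. The host, port, and query parameter names (`app_name`, `user_id`, `session_id`, `modalities`) are assumptions for illustration and should be checked against the server's actual /run_live signature.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_run_live_url(
    app_name: str,
    user_id: str,
    session_id: str,
    modalities: list[str],
    base_url: str = "ws://localhost:8000",  # assumed local dev server address
) -> str:
  """Builds a /run_live URL; the parameter names are assumed, not confirmed."""
  query = urlencode(
      {
          "app_name": app_name,
          "user_id": user_id,
          "session_id": session_id,
          "modalities": modalities,
      },
      doseq=True,  # repeat the `modalities` key once per requested modality
  )
  return f"{base_url}/run_live?{query}"


url = build_run_live_url(
    "test_native_audio_app", "user", "session_id", ["TEXT"]
)
requested = parse_qs(urlsplit(url).query)["modalities"]
print(url)
```

A WebSocket client of your choice could then connect to `url`; with the fix applied, the server is expected to run the session with AUDIO rather than the requested TEXT.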

Checklist

I have read the CONTRIBUTING.md document.
I have performed a self-review of my own code.
I have commented my code, particularly in hard-to-understand areas.
I have added tests that prove my fix is effective or that my feature works.
New and existing unit tests pass locally with my changes.
[] I have manually tested my changes end-to-end.
Any dependent changes have been merged and published in downstream modules.

Additional context

Added hasattr() check to safely handle BaseAgent instances that don't have a model attribute (improvement over issue's proposed solution).
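As a rough illustration of why the guard matters, the sketch below uses two hypothetical stand-in classes (not the real BaseAgent or LlmAgent) and a hypothetical helper name, `extract_model_name`. Reading `agent.model` directly on the first class would raise AttributeError; the guarded version falls back to an empty string.

```python
class AgentWithoutModel:
  """Stand-in for a BaseAgent subclass that defines no `model` attribute."""

  name = "base_agent"


class AgentWithModel:
  """Stand-in for an LlmAgent configured with a model name string."""

  name = "native_audio_agent"
  model = "gemini-live-2.5-flash-native-audio"


def extract_model_name(agent: object) -> str:
  """Returns the agent's model name, or "" when none is available."""
  if hasattr(agent, "model") and isinstance(agent.model, str):
    return agent.model
  return ""


names = [
    extract_model_name(AgentWithoutModel()),
    extract_model_name(AgentWithModel()),
]
```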

…e#4206) Signed-off-by: Aryan Bagade <[email protected]>

gemini-code-assist · 2026-01-22T03:46:08Z

Summary of Changes

Hello @AryanBagade, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where native-audio models fail in the /run_live endpoint when a client requests the TEXT modality. The changes automatically force the AUDIO modality for these models. The update also handles agents that have no model attribute without crashing.

Highlights

Modality Enforcement for Native-Audio Models: The /run_live WebSocket endpoint now explicitly checks if an agent uses a 'native-audio' model. If detected, the response modality is automatically forced to AUDIO, ensuring compatibility and preventing errors for models that exclusively support audio.
Robust Handling of Agent Models: Improved robustness by adding a hasattr() check when accessing the model attribute of an agent. This prevents crashes for BaseAgent instances or other agents that might not define a model attribute.
New Unit Tests: Three new unit tests have been added to validate the new modality forcing logic: one for native-audio models, one for regular models, and one for agents without a model attribute, ensuring comprehensive coverage.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request correctly forces the AUDIO modality for native-audio models in the /run_live endpoint. The implementation is sound and includes a safe check for the model attribute. The new unit tests cover the different cases for native-audio, non-native, and agents without a model attribute. My main feedback is regarding the structure of the tests, which currently test implementation details and involve code duplication. I've suggested a refactoring that would improve test quality and maintainability.

gemini-code-assist · 2026-01-22T03:48:15Z

tests/unittests/cli/test_fast_api.py

+def test_native_audio_model_forces_audio_modality():
+  """Test that native-audio models force AUDIO modality in run_live endpoint."""
+  from google.adk.cli.adk_web_server import AdkWebServer
+
+  # Create an LlmAgent with a native-audio model
+  native_audio_agent = LlmAgent(
+      name="native_audio_agent",
+      model="gemini-live-2.5-flash-native-audio",
+  )
+
+  class NativeAudioAgentLoader:
+
+    def load_agent(self, app_name):
+      return native_audio_agent
+
+    def list_agents(self):
+      return ["test_native_audio_app"]
+
+  session_service = AsyncMock()
+  session = Session(
+      id="session_id",
+      app_name="test_native_audio_app",
+      user_id="user",
+      state={},
+      events=[],
+  )
+  session_service.get_session.return_value = session
+
+  adk_web_server = AdkWebServer(
+      agent_loader=NativeAudioAgentLoader(),
+      session_service=session_service,
+      memory_service=MagicMock(),
+      artifact_service=MagicMock(),
+      credential_service=MagicMock(),
+      eval_sets_manager=MagicMock(),
+      eval_set_results_manager=MagicMock(),
+      agents_dir=".",
+  )
+
+  # Verify the _get_root_agent method works correctly
+  root_agent = adk_web_server._get_root_agent(native_audio_agent)
+  assert root_agent.model == "gemini-live-2.5-flash-native-audio"
+  assert "native-audio" in root_agent.model
+
+  # Verify the model name detection logic
+  model_name = (
+      root_agent.model
+      if hasattr(root_agent, "model") and isinstance(root_agent.model, str)
+      else ""
+  )
+  assert "native-audio" in model_name
+
+  # Test with App wrapping the agent
+  app_agent = App(name="test_app", root_agent=native_audio_agent)
+  root_agent_from_app = adk_web_server._get_root_agent(app_agent)
+  assert root_agent_from_app.model == "gemini-live-2.5-flash-native-audio"
+
+
+def test_non_native_audio_model_keeps_requested_modality():
+  """Test that non-native-audio models keep the requested modality."""
+  from google.adk.cli.adk_web_server import AdkWebServer
+
+  # Create an LlmAgent with a regular model (not native-audio)
+  regular_agent = LlmAgent(
+      name="regular_agent",
+      model="gemini-2.5-flash",
+  )
+
+  class RegularAgentLoader:
+
+    def load_agent(self, app_name):
+      return regular_agent
+
+    def list_agents(self):
+      return ["test_regular_app"]
+
+  adk_web_server = AdkWebServer(
+      agent_loader=RegularAgentLoader(),
+      session_service=MagicMock(),
+      memory_service=MagicMock(),
+      artifact_service=MagicMock(),
+      credential_service=MagicMock(),
+      eval_sets_manager=MagicMock(),
+      eval_set_results_manager=MagicMock(),
+      agents_dir=".",
+  )
+
+  root_agent = adk_web_server._get_root_agent(regular_agent)
+  model_name = (
+      root_agent.model
+      if hasattr(root_agent, "model") and isinstance(root_agent.model, str)
+      else ""
+  )
+
+  # For regular models, the modality should NOT be forced to AUDIO
+  assert "native-audio" not in model_name
+
+
+def test_agent_without_model_attribute():
+  """Test handling of agents without model attribute (BaseAgent)."""
+  from google.adk.cli.adk_web_server import AdkWebServer
+
+  # Create a BaseAgent (which doesn't have a model attribute)
+  base_agent = DummyAgent(name="base_agent")
+
+  class BaseAgentLoader:
+
+    def load_agent(self, app_name):
+      return base_agent
+
+    def list_agents(self):
+      return ["test_base_app"]
+
+  adk_web_server = AdkWebServer(
+      agent_loader=BaseAgentLoader(),
+      session_service=MagicMock(),
+      memory_service=MagicMock(),
+      artifact_service=MagicMock(),
+      credential_service=MagicMock(),
+      eval_sets_manager=MagicMock(),
+      eval_set_results_manager=MagicMock(),
+      agents_dir=".",
+  )
+
+  root_agent = adk_web_server._get_root_agent(base_agent)
+
+  # BaseAgent (DummyAgent) doesn't have a model attribute
+  model_name = (
+      root_agent.model
+      if hasattr(root_agent, "model") and isinstance(root_agent.model, str)
+      else ""
+  )
+
+  # Should default to empty string when no model attribute
+  assert model_name == ""
+  assert "native-audio" not in model_name
+


These new tests are a good start, but they have a couple of issues that make them less effective and harder to maintain:

They don't test the actual behavior. For example, test_native_audio_model_forces_audio_modality asserts that the model name contains native-audio, but it doesn't verify that the response_modalities are actually forced to ['AUDIO']. The tests are verifying implementation details rather than the behavior described by their names.

There is significant code duplication. The logic to extract the model_name is copied from adk_web_server.py and then duplicated across all three new tests. This makes the tests brittle; if the implementation in adk_web_server.py changes, these tests will need to be updated in multiple places, or worse, they might pass while the actual logic is broken.

A better approach would be to refactor the logic for determining the effective_modalities in adk_web_server.py into its own helper method. This would make the logic in run_agent_live cleaner and allow you to test the behavior directly and without duplication.

For example, you could add a method to AdkWebServer (this would be outside the current diff, in adk_web_server.py):

def _get_effective_modalities(
    self, app_name: str, requested_modalities: list[str]
) -> list[str]:
  """Determines the effective modalities, forcing AUDIO for native-audio models."""
  agent_or_app = self.agent_loader.load_agent(app_name)
  root_agent = self._get_root_agent(agent_or_app)
  model = getattr(root_agent, "model", None)
  model_name = model if isinstance(model, str) else ""
  if "native-audio" in model_name:
    return ["AUDIO"]
  return requested_modalities

Then your tests could be simplified to something like this, which is much more direct and robust:

def test_native_audio_model_forces_audio_modality():
  # ... setup adk_web_server with NativeAudioAgentLoader ...
  modalities = adk_web_server._get_effective_modalities("test_app", ["TEXT"])
  assert modalities == ["AUDIO"]


def test_non_native_audio_model_keeps_requested_modality():
  # ... setup adk_web_server with RegularAgentLoader ...
  modalities = adk_web_server._get_effective_modalities("test_app", ["TEXT"])
  assert modalities == ["TEXT"]


def test_agent_without_model_attribute():
  # ... setup adk_web_server with BaseAgentLoader ...
  modalities = adk_web_server._get_effective_modalities("test_app", ["TEXT"])
  assert modalities == ["TEXT"]

This would make the tests much stronger and easier to maintain.
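One way to see the difference between testing behavior and testing implementation details is a table-driven check that only looks at inputs and outputs. The sketch below is self-contained and uses a hypothetical `FakeServer` stand-in rather than the real AdkWebServer; the helper body mirrors the suggestion above, but the class, its constructor, and the app names are assumptions for illustration.

```python
from types import SimpleNamespace


class FakeServer:
  """Hypothetical stand-in for AdkWebServer holding agents by app name."""

  def __init__(self, agents: dict[str, object]):
    self._agents = agents

  def _get_effective_modalities(
      self, app_name: str, requested_modalities: list[str]
  ) -> list[str]:
    root_agent = self._agents[app_name]
    model = getattr(root_agent, "model", None)
    model_name = model if isinstance(model, str) else ""
    if "native-audio" in model_name:
      return ["AUDIO"]
    return requested_modalities


server = FakeServer({
    "native": SimpleNamespace(model="gemini-live-2.5-flash-native-audio"),
    "regular": SimpleNamespace(model="gemini-2.5-flash"),
    "no_model": SimpleNamespace(),  # agent without a `model` attribute
})

# (app name, requested modalities, expected effective modalities)
CASES = [
    ("native", ["TEXT"], ["AUDIO"]),
    ("regular", ["TEXT"], ["TEXT"]),
    ("no_model", ["TEXT"], ["TEXT"]),
]

results = {
    app: server._get_effective_modalities(app, requested)
    for app, requested, _ in CASES
}
for app, _, expected in CASES:
  assert results[app] == expected, app
```

Because the assertions never repeat the model-name extraction, the helper's internals can change without touching the test table.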

Got it, will fix this!

Signed-off-by: Aryan Bagade <[email protected]>

AryanBagade · 2026-01-22T04:55:29Z

/gemini review

gemini-code-assist

Code Review

This pull request correctly forces the AUDIO modality for native-audio models by adding a check in _get_effective_modalities. The implementation is sound and includes comprehensive unit tests for various scenarios.

My main feedback is to refactor _get_effective_modalities to improve performance by avoiding redundant agent loading. The agent is currently loaded twice per request in run_agent_live. I've provided suggestions to pass the already-loaded agent object to the new method. This will make the implementation more efficient.

gemini-code-assist · 2026-01-22T04:56:53Z

src/google/adk/cli/adk_web_server.py

+  def _get_effective_modalities(
+      self, app_name: str, requested_modalities: List[str]
+  ) -> List[str]:
+    """Determines effective modalities, forcing AUDIO for native-audio models.
+
+    Native-audio models only support AUDIO modality. This method detects
+    native-audio models by checking if the model name contains "native-audio"
+    and forces AUDIO modality for those models.
+
+    Args:
+      app_name: The name of the application/agent.
+      requested_modalities: The modalities requested by the client.
+
+    Returns:
+      The effective modalities to use.
+    """
+    agent_or_app = self.agent_loader.load_agent(app_name)
+    root_agent = self._get_root_agent(agent_or_app)
+    model = getattr(root_agent, "model", None)
+    model_name = model if isinstance(model, str) else ""
+    if "native-audio" in model_name:
+      return ["AUDIO"]
+    return requested_modalities


To improve efficiency, this method can be refactored to accept a root_agent: BaseAgent object directly instead of app_name: str. This avoids reloading the agent, as the root_agent is already available in the run_agent_live function after get_runner_async is called. Reloading the agent can be an expensive operation.

  def _get_effective_modalities(
      self, root_agent: BaseAgent, requested_modalities: List[str]
  ) -> List[str]:
    """Determines effective modalities, forcing AUDIO for native-audio models.

    Native-audio models only support AUDIO modality. This method detects
    native-audio models by checking if the model name contains "native-audio"
    and forces AUDIO modality for those models.

    Args:
      root_agent: The root agent of the application.
      requested_modalities: The modalities requested by the client.

    Returns:
      The effective modalities to use.
    """
    model = getattr(root_agent, "model", None)
    model_name = model if isinstance(model, str) else ""
    if "native-audio" in model_name:
      return ["AUDIO"]
    return requested_modalities

gemini-code-assist · 2026-01-22T04:56:53Z

src/google/adk/cli/adk_web_server.py

+        effective_modalities = self._get_effective_modalities(
+            app_name, modalities
+        )
+        run_config = RunConfig(response_modalities=effective_modalities)


Following the suggested refactoring of _get_effective_modalities to accept a root_agent object, you should update this call site to pass runner.app.root_agent. This avoids loading the agent twice.

Note that you will also need to update the corresponding unit tests for _get_effective_modalities to reflect this change.

        effective_modalities = self._get_effective_modalities(
            runner.app.root_agent, modalities
        )
        run_config = RunConfig(response_modalities=effective_modalities)
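The cost of the redundant load can be made visible with a loader that counts its calls. This is a minimal sketch with hypothetical names (`CountingLoader`, `modalities_by_app_name`, `modalities_by_root_agent`); it assumes, as the review describes, that the agent has already been loaded once by the time modalities are resolved.

```python
from types import SimpleNamespace


class CountingLoader:
  """Hypothetical loader that records how often load_agent is called."""

  def __init__(self, agent: object):
    self._agent = agent
    self.load_calls = 0

  def load_agent(self, app_name: str) -> object:
    self.load_calls += 1
    return self._agent


def _pick(root_agent: object, requested: list[str]) -> list[str]:
  model = getattr(root_agent, "model", None)
  model_name = model if isinstance(model, str) else ""
  return ["AUDIO"] if "native-audio" in model_name else requested


def modalities_by_app_name(loader, app_name, requested):
  # Variant that takes an app name: triggers another load on every call.
  return _pick(loader.load_agent(app_name), requested)


def modalities_by_root_agent(root_agent, requested):
  # Variant that takes the already-loaded agent: no loader access needed.
  return _pick(root_agent, requested)


loader = CountingLoader(
    SimpleNamespace(model="gemini-live-2.5-flash-native-audio")
)
root_agent = loader.load_agent("test_app")  # the one load assumed to exist

by_name = modalities_by_app_name(loader, "test_app", ["TEXT"])
calls_after_by_name = loader.load_calls  # initial load plus the reload

by_agent = modalities_by_root_agent(root_agent, ["TEXT"])
calls_after_by_agent = loader.load_calls  # unchanged: no additional load
```

Both variants return the same modalities; only the app-name variant increases the load count.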

Signed-off-by: Aryan Bagade <[email protected]>

AryanBagade · 2026-01-22T05:01:48Z

/gemini review

gemini-code-assist

Code Review

This pull request correctly identifies and fixes an issue where native-audio models could be used with incorrect modalities in /run_live. The approach of adding a _get_effective_modalities helper function is clean and isolates the logic well. The accompanying unit tests are thorough and cover the intended fix for various scenarios.

My main feedback is on the implementation of _get_effective_modalities, which doesn't fully handle all possible types for the model attribute on an agent, potentially leading to the fix not being applied in all cases. I've left a specific suggestion to make this more robust.

Overall, this is a good fix that improves the robustness of the web server.

gemini-code-assist · 2026-01-22T05:03:20Z

src/google/adk/cli/adk_web_server.py

+    model = getattr(root_agent, "model", None)
+    model_name = model if isinstance(model, str) else ""


The current logic for extracting the model name only handles the case where the model attribute is a string. The LlmAgent.model attribute can also be a BaseLlm object, in which case isinstance(model, str) would be false, model_name would become an empty string, and the check for "native-audio" would fail.

To make this more robust, you should also handle the case where model is an object (like BaseLlm) that has a model string attribute. It would also be beneficial to add a test case for an LlmAgent initialized with a BaseLlm object to ensure full coverage.

Suggested change

-    model = getattr(root_agent, "model", None)
-    model_name = model if isinstance(model, str) else ""
+    model = getattr(root_agent, "model", None)
+    model_name = ""
+    if isinstance(model, str):
+      model_name = model
+    elif hasattr(model, "model") and isinstance(getattr(model, "model"), str):
+      model_name = getattr(model, "model")
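The additional test case the review asks for could look roughly like the following. `FakeLlm` and `FakeAgent` are hypothetical stand-ins for a BaseLlm-style object and an agent, and `resolve_model_name` and `effective_modalities` are illustrative names, so this is a sketch of the idea rather than the project's actual test.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeLlm:
  """Hypothetical stand-in for a BaseLlm-like object with a `model` string."""

  model: str


@dataclass
class FakeAgent:
  """Hypothetical stand-in for an agent whose `model` is a str or an object."""

  model: Any = None


def resolve_model_name(root_agent: object) -> str:
  """Extracts a model name from a string or from a wrapped model object."""
  model = getattr(root_agent, "model", None)
  if isinstance(model, str):
    return model
  nested = getattr(model, "model", None)
  return nested if isinstance(nested, str) else ""


def effective_modalities(
    root_agent: object, requested: list[str]
) -> list[str]:
  """Forces AUDIO when the resolved model name contains "native-audio"."""
  if "native-audio" in resolve_model_name(root_agent):
    return ["AUDIO"]
  return requested


wrapped = FakeAgent(model=FakeLlm(model="gemini-live-2.5-flash-native-audio"))
plain = FakeAgent(model="gemini-2.5-flash")
empty = FakeAgent()

assert effective_modalities(wrapped, ["TEXT"]) == ["AUDIO"]
assert effective_modalities(plain, ["TEXT"]) == ["TEXT"]
assert effective_modalities(empty, ["TEXT"]) == ["TEXT"]
```

With the string-only check, the `wrapped` case would resolve to an empty model name and keep TEXT, which is the gap the suggested change closes.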

fix: Force AUDIO modality for native-audio models in /run_live (googl…

a446492

…e#4206) Signed-off-by: Aryan Bagade <[email protected]>

adk-bot added the "live" label ([Component] This issue is related to live, voice and video chat) Jan 22, 2026

gemini-code-assist bot reviewed Jan 22, 2026

fix: Refactor modality logic into helper method

9fc0b22

Signed-off-by: Aryan Bagade <[email protected]>

gemini-code-assist bot reviewed Jan 22, 2026

fix: Pass root_agent directly to avoid redundant loading

45ea2bc

Signed-off-by: Aryan Bagade <[email protected]>

gemini-code-assist bot reviewed Jan 22, 2026
