
Allow setting a custom max context window for Google Gemini API provider (and/or universal max context window) #3717


Open · 3 of 4 tasks
anojndr opened this issue May 18, 2025 · 11 comments · May be fixed by #4360

Labels: enhancement (New feature or request) · feature request (Feature request, not a bug) · Issue - In Progress (Someone is actively working on this. Should link to a PR soon.)
anojndr commented May 18, 2025

What problem does this proposed feature solve?

Currently, users cannot change the max context window for the Google Gemini API provider in Roo Code—it is fixed at 1 million tokens. This is problematic for users on the Gemini free tier, where the tokens-per-minute (TPM) limit is 250k. As a result, it's easy to hit provider-side limits and get errors, especially when Roo tries to use the full 1M context window.

Additionally, Roo Code's "Intelligently condense the context window" feature is based on the max context window setting. If users could set a lower max context window (e.g., 250k), the condensation/summarization would trigger at the right time for their actual usage limits, making the feature much more useful and preventing API errors.

Describe the proposed solution in detail

  • Add an option in Roo Code settings to set a custom max context window for the Google Gemini API provider, similar to how it works for OpenAI-compatible providers.
    • This could be a per-provider setting, or (even better) a universal/global max context window setting that applies to all providers unless overridden.
  • When set, Roo Code should respect this limit for all context management, including:
    • How much context is sent in each request to Gemini
    • When to trigger "Intelligently condense the context window" (so it summarizes before hitting the user-defined limit, not the hardcoded 1M)
    • Any UI warnings or token usage displays should reflect the user-set limit
  • Ideally, this setting should be easy to find and adjust, with a sensible default (e.g., 1M for Gemini, but user-overridable); one possible shape for the setting is sketched below.
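For concreteness, a minimal sketch of one possible shape for such a setting. The names here are illustrative assumptions, not Roo Code's actual configuration schema:

```typescript
// Hypothetical setting shape -- illustrative names, not Roo Code's real schema.
interface ProviderContextSettings {
	/** The model's advertised context window, e.g. 1_048_576 for Gemini. */
	modelContextWindow: number;
	/** Optional user override; when set, it caps the window everywhere it is used. */
	maxContextWindowOverride?: number;
}

/** The effective window is the smaller of the model's limit and the user's cap. */
function effectiveContextWindow(s: ProviderContextSettings): number {
	return s.maxContextWindowOverride !== undefined
		? Math.min(s.maxContextWindowOverride, s.modelContextWindow)
		: s.modelContextWindow;
}

// A free-tier Gemini user capping at their 250k TPM limit:
effectiveContextWindow({ modelContextWindow: 1_048_576, maxContextWindowOverride: 250_000 });
// => 250000
```

Condensation triggers, per-request truncation, and UI token displays would then all read from effectiveContextWindow() rather than the raw model limit.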

Technical considerations or implementation details (optional)

No response

Describe alternatives considered (if any)

  • The only current workaround is to manually keep conversations short or start new threads, which is disruptive and doesn't allow users to take full advantage of Roo Code's context management features.
  • Another alternative is to only allow this for OpenAI-compatible providers, but this leaves Gemini users at a disadvantage.
  • A universal/global max context window setting would be a good alternative, as it would help users who switch between providers or use multiple models.

Additional Context & Mockups

  • This feature would especially benefit users on the Gemini free tier (250k TPM), but also anyone who wants more control over context size for cost or performance reasons.
  • It would make the "Intelligently condense the context window" feature much more effective, since condensation would happen at the right time for the user's actual limits.

Proposal Checklist

  • I have searched existing Issues and Discussions to ensure this proposal is not a duplicate.
  • This proposal is for a specific, actionable change intended for implementation (not a general idea).
  • I understand that this proposal requires review and approval before any development work begins.

Are you interested in implementing this feature if approved?

  • Yes, I would like to contribute to implementing this feature.
@anojndr anojndr added the enhancement New feature or request label May 18, 2025
@dosubot dosubot bot added the feature request Feature request, not a bug label May 18, 2025
@hannesrudolph hannesrudolph moved this from New to Issue [Unassigned] in Roo Code Roadmap May 21, 2025
hannesrudolph (Collaborator) commented May 21, 2025

@mrubens Approved?

@hannesrudolph hannesrudolph added the Issue - Unassigned / Actionable Clear and approved. Available for contributors to pick up. label May 21, 2025
@canrobins13

It is now possible to set a threshold (percentage of the context window) at which automatic condensing is triggered. Does that satisfy this requirement?

anojndr (Author) commented May 29, 2025

> It is now possible to set a threshold (percentage of the context window) at which automatic condensing is triggered. Does that satisfy this requirement?

Yeah

@canrobins13

Closing as complete since it is now possible to set a threshold (percentage of the context window) at which automatic condensing is triggered, and there is a manual condense button.

@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap May 29, 2025
@github-project-automation github-project-automation bot moved this from Issue [Unassigned] to Done in Roo Code Roadmap May 29, 2025
@hannesrudolph hannesrudolph reopened this Jun 2, 2025
@github-project-automation github-project-automation bot moved this from Done to New in Roo Code Roadmap Jun 2, 2025
@github-project-automation github-project-automation bot moved this from Done to Triage in Roo Code Roadmap Jun 2, 2025
@hannesrudolph (Collaborator)

@canrobins13 I think we should still be able to set the max context window at the API provider level, since the context-condensing % threshold is not an API-provider-level setting. Trying to cap Gemini at 200k (it gets more expensive after 200k) by setting the threshold to 80% will negatively impact the user when they switch to a model with a lower context window, since they won't want to condense at that threshold.
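To make the mismatch concrete, a quick back-of-the-envelope (illustrative numbers, not Roo Code's defaults): whatever percentage is chosen to cap one model's window lands at a very different absolute token count on another model's window.

```typescript
// Illustrative only: a single global percentage maps to very different
// absolute token counts across models with different context windows.
const threshold = 0.2; // picked so a ~1M-token Gemini window condenses near 200k

Math.round(1_048_576 * threshold); // 209715 -- roughly the intended 200k cap
Math.round(128_000 * threshold);   //  25600 -- a 128k model condenses far earlier than wanted
```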

@hannesrudolph hannesrudolph self-assigned this Jun 2, 2025
@hannesrudolph hannesrudolph moved this from Triage to Issue [In Progress] in Roo Code Roadmap Jun 2, 2025
HahaBill commented Jun 2, 2025

@hannesrudolph Hi Hannes, happy to work on this issue! :)

@canrobins13

@hannesrudolph Another option is a cross-provider maximum token count at which to condense, where we condense at whichever is lower: the percentage-derived count or the absolute token count. I worry a bit that people won't bother to keep track of provider-specific settings, or will get confused by them.
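A rough sketch of that "whichever is less" rule, with hypothetical names:

```typescript
// Condense at the lower of the percentage-derived count and the absolute cap.
function condenseTriggerTokens(
	contextWindow: number,
	thresholdPct: number, // e.g. 0.8 for 80%
	absoluteCap?: number, // optional cross-provider max-token setting
): number {
	const fromPercent = contextWindow * thresholdPct;
	return absoluteCap !== undefined ? Math.min(fromPercent, absoluteCap) : fromPercent;
}

condenseTriggerTokens(1_048_576, 0.8, 250_000); // 250000 -- the absolute cap wins for Gemini
condenseTriggerTokens(128_000, 0.8, 250_000);   // 102400 -- the percentage wins for a small model
```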

avtc commented Jun 2, 2025

It would be nice to have the ability to set/store/restore the limit per model (especially valuable for local models) or per provider/model.

@hannesrudolph (Collaborator)

> @hannesrudolph Hi Hannes, happy to work on this issue! :)

It's all yours!

@daniel-lxs daniel-lxs added Issue - In Progress Someone is actively working on this. Should link to a PR soon. and removed Issue - Unassigned / Actionable Clear and approved. Available for contributors to pick up. labels Jun 3, 2025
@HahaBill HahaBill linked a pull request Jun 5, 2025 that will close this issue
mrubens (Collaborator) commented Jun 9, 2025

Is this duplicative of the work we're doing to set provider-specific condensing thresholds?

If we do need this, I'm not sure it should be Gemini-only. We should probably discuss.

HahaBill commented Jun 9, 2025

> Is this duplicative of the work we're doing to set provider-specific condensing thresholds?
>
> If we do need this, I'm not sure it should be Gemini-only. We should probably discuss.

Hi @mrubens @hannesrudolph! I looked into this a bit, and if you are referring to PR #4456, then yes, it seems to be duplicate work, though there are subtle differences.

In my opinion, it seems clearer for users to set a context limit rather than a threshold when they have a specific token count they want to cap at. With thresholding, users currently must look up the model's limit and calculate the percentage themselves; if they already know their token budget, setting it via a context limit is easier and more direct.

For example, anojndr has a 250,000 TPM limit in mind: if he sets the context limit to 250,000, the job is done. If he has to do it via thresholding, he first needs to calculate the percentage and set it. I think the latter is worse in terms of user experience.
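Spelled out, the calculation a threshold-only setting pushes onto the user (assuming Gemini's advertised 1,048,576-token window):

```typescript
// The manual arithmetic a threshold-only UI requires for a 250k token budget.
const tokenBudget = 250_000;     // the TPM limit the user actually cares about
const contextWindow = 1_048_576; // must be looked up from the model's documentation

const requiredThresholdPct = (tokenBudget / contextWindow) * 100;
requiredThresholdPct.toFixed(1); // "23.8" -- vs. simply entering 250000 once
```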

However, I think this (strategically setting a context limit) can coexist with #4456 and would be helpful in the future for cost calculation and budgeting features, e.g., intelligent cost optimization and monitoring systems. It would give users finer-grained control and management over cost.

Conclusion

I think profile-specific thresholding and a profile-specific context limit will coexist well with each other and give users finer-grained control over cost. If we go this route with that in mind, then this should not be Gemini-specific.
