Gemini’s new token limits are just as bad as Claude’s, and maybe even a little dumber

Summary

Gemini subscription tiers have been adjusted; some get more value, while others lose out.
New compute-based quotas throttle heavy users.
Widespread reports suggest Gemini 3.5 Flash feels less reliable than 3.1 Pro.

After announcing sweeping changes to Gemini and AI-powered Search at Google I/O this week, Google announced overhauled Gemini AI usage limits and subscription tiers, sparking immediate backlash from users throughout communities and social media sites like Reddit who report restrictive throttling and a noticeable drop in output quality. The latest 3.5 Flash model fails to live up to Google’s claims, early testers say, and new compute-based usage limits heavily restrict access to older, yet seemingly more reliable models like 3.1 Pro.

Google Search is getting a complete AI overhaul — here’s what’s actually changing

AI… AI everywhere

New Gemini Plans are here

It’s a better deal for some, and a little worse for others

Google’s update introduces a revised subscription hierarchy that alters the value proposition for early adopters. The new structure includes Gemini 3.5 Flash, Omni, Flow, Daily Brief, and Gmail AI features for all plans.

AI Plus: Starts at $7.99 per month, but doesn’t include Pro 3.1
AI Pro: The standard tier remains $19.99 per month, and now includes YouTube Premium Lite, a new Google Pics image creation experience, and voice capabilities in Gmail, Docs, and Google Keep later this summer.
AI Ultra: Google lowered the price of its highest-tier plan from $250 to $200 per month and introduced a new $100-per-month option for advanced access with lower limits. It will include full YouTube Premium access and will be the only plan that includes Gemini Spark. Project Genie, which lets you create interactive 3D worlds, will only be available on the $200 per month Ultra plan.

Users on older promotional plans, such as the $100-per-year 5TB Pro plan offered during Black Friday, anticipate that those deals won’t return.

The new usage limits are a lot more prohibitive

Most users should be fine, but power users are burning through allowances

Google shifted from a daily prompt allowance to measuring total compute used. Quotas now refresh every five hours, with an overarching weekly limit. The system dynamically grades the complexity of a user’s request, the active features involved (generating Omni videos would use a lot more than figuring out an Instant Pot chili recipe), and the historical length of the chat thread.

This change seems to target intensive workflows, and many feel Gemini’s new usage limits are now as restrictive as Claude’s. When testing their workflows on Gemini 3.1 Pro, some users report hitting the limit after just 40 minutes to an hour, or eating up as much as 60 percent after a few messages. resulting in a forced four-hour wait. The revised structure effectively limits the paid platform for prolonged, continuous work sessions.

Those working with abnormally large context windows, such as heavy researchers and coders, seem to be burning through their allocations the fastest. Some suggest that starting new chats for each session can help curb usage, rather than continuing in a single long thread, as Gemini processes all previous context each time it receives new instructions.

Is Gemini’s intelligence actually regressing?

AI is supposed to improve over time, but this update feels worse

google-gemini-io — A computer screen displaying the Google Gemini AI interface in dark mode with the prompt “Where should we start?”.

Google claims the reasoning capabilities of the new 3.5 Flash model match the older 3.1 Pro standard while delivering faster speeds. The Gemini community at large begs to differ, and I concur with that consensus based on my own direct testing.

Users broadly report 3.5 Flash hallucinates data, sometimes more than it did on 3.1 Flash in some cases. It can still fail the basics, such as extracting accurate information from documents, and often struggles with basic logic. The default experience forces users onto the lighter 3.5 Flash model once quotas are reached.

While Gemini 3.1 Pro (especially Extended mode) remains capable for deep research and vibe coding, the new rate limits can make it prohibitive for some. Paying customers are already exploring local LLM hosting options or migrating to competitors like OpenAI’s ChatGPT and Anthropic’s Claude. Those are fine reprieves for now, but something tells me we should brace ourselves for further belt-tightening from major AI players as the monetary and logistical costs of deploying and supporting generative AI platforms continue to climb.

Source link

Gemini’s new token limits are just as bad as Claude’s, and maybe even a little dumber

Summary

Google Search is getting a complete AI overhaul — here’s what’s actually changing

New Gemini Plans are here

It’s a better deal for some, and a little worse for others

The new usage limits are a lot more prohibitive

Most users should be fine, but power users are burning through allowances

Is Gemini’s intelligence actually regressing?

AI is supposed to improve over time, but this update feels worse

Like this:

Related

Gemini’s new token limits are just as bad as Claude’s, and maybe even a little dumber

Summary

Google Search is getting a complete AI overhaul — here’s what’s actually changing

New Gemini Plans are here

It’s a better deal for some, and a little worse for others

The new usage limits are a lot more prohibitive

Most users should be fine, but power users are burning through allowances

Is Gemini’s intelligence actually regressing?

AI is supposed to improve over time, but this update feels worse

Share this:

Like this:

Related