Qwen3.5 vs Gemma 4: Benchmarks vs real world use?
Just tested Gemma 4 2B locally on an old RTX 2060 with 6 GB VRAM, and I've used Qwen3.5 in all sizes intensively in customer projects before. First impression of Gemma 4 2B: it's better, faster, and uses less memory than Qwen3.5 2B. More agentic, better Mermaid charts, better chat output, better structured output. It seems like either the Qwen3.5 models are benchmaxed (although they really were much better than the competition) or Google is playing it down. Gemma 4 2B "seems" / "feels" more like Qwen3.5 9B to me.

submitted by /u/AppealSame4367
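The memory claim above can be sanity-checked with back-of-envelope arithmetic: a quantized model's weights take roughly (parameter count × bits per weight ÷ 8) bytes, before KV cache and runtime overhead. A minimal sketch (the 4-bit figure is a rule-of-thumb assumption for common quantizations, not a measured value for either model):

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB for a quantized model."""
    return n_params * bits_per_weight / 8 / 1e9

# 2B parameters at ~4-bit quantization: about 1 GB of weights,
# leaving plenty of headroom on a 6 GB card for KV cache and context.
print(round(weight_memory_gb(2e9, 4), 2))  # -> 1.0

# A 9B model at ~4-bit is about 4.5 GB — close to the limit of 6 GB VRAM,
# which is why the 2B size matters for older GPUs like the RTX 2060.
print(round(weight_memory_gb(9e9, 4), 2))  # -> 4.5
```

This is only a lower bound: activations, KV cache (which grows with context length), and framework overhead add to the total, so real VRAM use will be somewhat higher.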
Read on Reddit r/LocalLLaMA → https://www.reddit.com/r/LocalLLaMA/comments/1sbec70/qwen35_vs_gemma_4_benchmarks_vs_real_world_use/
