[P] Gemma 4 running on NVIDIA B200 and AMD MI355X from the same inference stack, 15% throughput gain over vLLM on Blackwell
Google DeepMind dropped Gemma 4 today: Gemma 4 31B: dense, 256K context, redesigned architecture targeting efficiency and long-context quality Gemma 4 26B A4B: MoE, 26B total / 4B active per forward pass, 256K context Both are natively multimodal (text, image, video, dynamic resolution). We got both running on MAX on launch day across NVIDIA B200 and AMD MI355X from the same stack. On B200 we're seeing 15% higher output throughput vs. vLLM (happy to share more on methodology if useful). Free playground if you want to test without spinning anything up: https://www.modular.com/#playground submitted by /u/carolinedfrasca [link] [comments]
Could not retrieve the full article text.
Read on Reddit r/MachineLearning →Reddit r/MachineLearning
https://www.reddit.com/r/MachineLearning/comments/1saot07/p_gemma_4_running_on_nvidia_b200_and_amd_mi355x/Sign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
launchmultimodalCan We Secure AI With Formal Methods? January-March 2026
In the month or so around the previous new years, as 2024 became 2025, we were saying “2025: year of the agent”. MCP was taking off, the inspect-ai and pydantic-ai python packages were becoming the standards, products were branching out from chatbots to heavy and autonomous use of toolcalls. While much of the product engineering scene may have underdelivered (in the sense that “planning a vacation” isn’t entirely something most people do with agents yet), the field of FMxAI I think was right on target. Feels like there’s an agentic component to everything I read these days. What is 2026 the year of? Besides “year of investors pressure all the math companies to pivot to program synthesis”? I’m declaring it now The number of blogposts relating to secure program synthesis went exponential sin

New multimodal dataset will help in the development of ethical AI systems
By Shaina Raza and Deval Pandya The Vector Institute’s AI Engineering team has developed Newsmediabias-plus (NMB+), a new multimodal dataset. It includes full-text articles alongside comprehensive publication details. It also [ ] The post New multimodal dataset will help in the development of ethical AI systems appeared first on Vector Institute for Artificial Intelligence .
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.

![[P] Gemma 4 running on NVIDIA B200 and AMD MI355X from the same inference stack, 15% throughput gain over vLLM on Blackwell](https://d2xsxph8kpxj0f.cloudfront.net/310419663032563854/konzwo8nGf8Z4uZsMefwMr/default-img-neural-network-P6fqXULWLNUwjuxqUZnB3T.webp)


Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!