Gemma 4 - 31b abliterated quants
I got inspired to try to crack this egg without using heretic. FP16, Q8_0, and Q4_K_M quants, plus the abliteration script for modification and reuse, are here: https://huggingface.co/paperscarecrow/Gemma-4-31B-it-abliterated-gguf. It is based on mlabonne's Orthogonalized Representation Intervention method, because I loved his ablits of gemma3 so much. Edit: I overestimated my internet speed; the models are still uploading. submitted by /u/Polymorphic-X
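For readers new to the idea, here is a minimal sketch of the weight-orthogonalization step that abliteration approaches are generally built around. It is not the script from the repository above: the function names, the toy matrix, and the choice of direction are all hypothetical, and it assumes a "refusal direction" has already been estimated elsewhere. It only shows how a single direction can be projected out of a matrix that writes into the residual stream, so that the layer's output has no component along that direction.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in vector]


def project_out(weight, direction):
    """Return W' = W - r (r^T W), where r is the unit-length direction.

    `weight` is a d_out x d_in matrix (list of rows) whose outputs live in
    the same space as `direction` (length d_out). Both are illustrative
    stand-ins, not real model tensors.
    """
    r = normalize(direction)
    rows, cols = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction.
    coeffs = [sum(r[i] * weight[i][j] for i in range(rows)) for j in range(cols)]
    # Subtract that component from every column.
    return [[weight[i][j] - r[i] * coeffs[j] for j in range(cols)]
            for i in range(rows)]


def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


if __name__ == "__main__":
    toy_weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # hypothetical 3x2 matrix
    toy_direction = [0.0, 0.0, 2.0]                    # hypothetical direction
    ablated = project_out(toy_weight, toy_direction)
    output = matvec(ablated, [0.5, -1.0])
    unit = normalize(toy_direction)
    # The output's component along the direction should be (numerically) zero.
    print(sum(u * o for u, o in zip(unit, output)))
```

In a real model the same projection would be applied to each matrix that writes into the residual stream, with tensors from the actual checkpoint rather than nested lists.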
Read on Reddit r/LocalLLaMA: https://www.reddit.com/r/LocalLLaMA/comments/1sawcyr/gemma_4_31b_abliterated_quants/

quarkus-chat-ui: A Web Front-End for LLMs, and a Real-World Case for POJO-actor
Note: This article was originally published on SciVicsLab. quarkus-chat-ui is a web UI for LLMs where multiple instances can talk to each other, built as a real-world use case for POJO-actor. Each quarkus-chat-ui instance exposes an HTTP MCP server at /mcp, so Instance A can call tools on Instance B, and Instance B can reply by calling tools back on A. The LLM backend (Claude Code CLI, Codex, or a local model via claw-code-local) acts as an MCP client that can reach these endpoints. The question was how to wire that up over HTTP, and how to handle the fact that LLM responses take tens of seconds and arrive as a stream. quarkus-chat-ui is the bridge that makes this work. Each instance wraps one LLM backend

I'm under 18, broke, and I just designed an open-source AI chip. Here's the full story.
I don't have a team. I don't have funding. I don't have a lab. I have a laptop, an internet connection, and an obsession with chips. This is the story of T1C — Tier 1 Chip — and why I built it. It started with a frustration. Every time I read about AI hardware, it was the same story. NVIDIA charges $30,000 for an H100. TSMC charges millions for a custom fab run. Apple Silicon is beautiful but completely closed. Intel, Qualcomm, AMD — all of them — locked behind NDAs, closed architectures, and billion-dollar relationships. I kept thinking: why does no one make an open-source AI chip that a real person can actually fabricate? Not a toy. Not a demo. A real architecture with real specs, real physics, and a real path to silicon. So I built one. T1C uses Digital In-Memory Computing — D-IMC. Inst

Basic PSA. PocketPal got updated, so runs Gemma 4.
Just because I've seen a couple of "I want this on Android" questions: PocketPal got updated a few hours ago and runs Gemma 4 2B and 4B fine, at least on my hardware (a crappy little moto g84 workhorse phone). Love an app that gets regular updates. I'm going to try to squeeze a 26B a4 iq2 quantization into 12 GB of RAM on a fresh boot, but I'm almost certain it can't be done due to Android bloat. But yeah, 2B and 4B work fine and run quickly under PocketPal. Hopefully their next one is 7-8B (not 9B), because the new Qwen 3.5 models just skip over memory caps, but the old ones didn't. Super numbers are great, but running them with OS overhead and context size needs something a bit smaller to be functional on a 12 GB RAM phone. Bring on the GemmaSutra 4 4B though, as another gold standard of thinking's and qu

Gemma 4 vs Qwen3.5 on SVG style
Some quick tests using Gemma4-31B and Qwen3.5-27B, both Q4 quants from unsloth. I was already expecting Gemma 4 to be excellent at creative writing and better at translations for more obscure languages, but I didn't expect it to be that good at function calling and general coding tasks, and even at creating SVGs! Did you find any areas where Qwen3.5 beats Gemma4? submitted by /u/iChrist
