Live
Black Hat USADark ReadingBlack Hat AsiaAI BusinessAI on CanvasHacker News AI TopDFRobot Showcases AI Maker Projects at Robot Hokoten in Akihabara - Thailand Business NewsGoogle News - AI ThailandAGI vs artificial intelligence: What’s the real difference - WIONGNews AI AGIAI agents promise to 'run the business,' but who is liable if things go wrong?Hacker News AI TopJapan Turns Labor Crisis Into Physical AI Testing Ground - The Tech BuzzGNews AI jobsBuy Facebook Reviews | Boost Brand Trust & VisibilityDev.to AIMy AI Pendant Turned Voice Memos Into Two Shipped ProjectsMedium AIWhy Your Website Is Invisible to AI Search Engines (And How to Fix It)Dev.to AI85% of Companies Claim Skills-Based Hiring. Only 0.14% of Hires Are Actually Affected.Medium AII Tried the Tea Checker App as a Developer — Here’s My Honest ReviewDev.to AIBeyond Simple OCR: Building an Autonomous VLM Auditor for E-Commerce ScaleDev.to AIHow to Build the 1% AI System — A Step-by-Step Implementation That Teams Actually UseMedium AIBlack Hat USADark ReadingBlack Hat AsiaAI BusinessAI on CanvasHacker News AI TopDFRobot Showcases AI Maker Projects at Robot Hokoten in Akihabara - Thailand Business NewsGoogle News - AI ThailandAGI vs artificial intelligence: What’s the real difference - WIONGNews AI AGIAI agents promise to 'run the business,' but who is liable if things go wrong?Hacker News AI TopJapan Turns Labor Crisis Into Physical AI Testing Ground - The Tech BuzzGNews AI jobsBuy Facebook Reviews | Boost Brand Trust & VisibilityDev.to AIMy AI Pendant Turned Voice Memos Into Two Shipped ProjectsMedium AIWhy Your Website Is Invisible to AI Search Engines (And How to Fix It)Dev.to AI85% of Companies Claim Skills-Based Hiring. Only 0.14% of Hires Are Actually Affected.Medium AII Tried the Tea Checker App as a Developer — Here’s My Honest ReviewDev.to AIBeyond Simple OCR: Building an Autonomous VLM Auditor for E-Commerce ScaleDev.to AIHow to Build the 1% AI System — A Step-by-Step Implementation That Teams Actually UseMedium AI
AI NEWS HUBbyEIGENVECTOREigenvector

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Reddit r/MachineLearningby /u/Embarrassed_Will_120 https://www.reddit.com/user/Embarrassed_Will_120April 4, 20262 min read1 views
Source Quiz

Hi everyone, I am from Australia : ) I just released a new research prototype It’s a lossless BF16 compression format that stores weights in 12 bits by replacing the 8-bit exponent with a 4-bit group code . For 99.97% of weights , decoding is just one integer ADD . Byte-aligned split storage: true 12-bit per weight, no 16-bit padding waste, and zero HBM read amplification. Yes 12 bit not 11 bit !! The main idea was not just “compress weights more”, but to make the format GPU-friendly enough to use directly during inference : sign + mantissa: exactly 1 byte per element group: two nibbles packed into exactly 1 byte too https://preview.redd.it/qbx94xeeo2tg1.png?width=1536 format=png auto=webp s=831da49f6b1729bd0a0e2d1f075786274e5a7398 1.33x smaller than BF16 Fixed-rate 12-bit per weight , no

Could not retrieve the full article text.

Read on Reddit r/MachineLearning →
Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by Eigenvector · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

llamamistralmodel

Knowledge Map

Knowledge Map
TopicsEntitiesSource
[P] GPU fri…llamamistralmodelreleasereviewresearchReddit r/Ma…

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 162 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Models