AIOS — First Ground Truth Baseline (CPU DRAM Measurement)
Following up on my earlier post introducing AIOS (a CPU-native LLM inference architecture), we now have the first validated baseline measurement, taken with hardware memory controller counters.

Setup
Model: Falcon 7B (GGUF Q4_K_M)
CPU: Intel Core Ultra 7 265K (20 cores)
OS: Arch Linux (kernel 6.19.10-zen1-1-zen)
Method: perf uncore IMC counters (uncore_imc_free_running_0/data_read/)

Results (5 runs × 200 tokens)
MB/token: 2340 ± 4
Coefficient of variation: 0.17%
Tokens/sec: 11.43 ± 0.05

Key Takeaways
The measurement is highly stable (CV < 1%), confirming that DRAM reads can be treated as a reliable physical metric.
~456–459 GB of DRAM read for 200 tokens highlights the memory bandwidth wall in CPU inference.
This establishes a ground truth
Could not retrieve the full article text.
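As a rough cross-check of the figures in the excerpt, the arithmetic linking MB/token, run length, and throughput can be sketched in a few lines of Python. This is a minimal sketch that uses only the aggregate numbers reported above. The names `Baseline` and `summarize` are hypothetical, and the unit convention is an assumption: the excerpt does not say whether MB and GB are decimal or binary, so both conversions are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    """Aggregate figures as reported in the excerpt (not raw per-run data)."""
    mb_per_token_mean: float
    mb_per_token_std: float
    tokens_per_sec: float
    tokens_per_run: int


def summarize(b: Baseline) -> dict:
    """Derive the secondary figures implied by the reported aggregates."""
    total_mb = b.mb_per_token_mean * b.tokens_per_run
    return {
        # Coefficient of variation: standard deviation relative to the mean.
        "cv_pct": 100.0 * b.mb_per_token_std / b.mb_per_token_mean,
        # DRAM read volume for one run of `tokens_per_run` tokens.
        "total_mb_per_run": total_mb,
        # Assumption: unit convention is unstated, so show both conversions.
        "total_gb_div_1000": total_mb / 1000.0,
        "total_gb_div_1024": total_mb / 1024.0,
        # Implied sustained DRAM read rate while generating tokens.
        "read_rate_mb_per_sec": b.mb_per_token_mean * b.tokens_per_sec,
    }


# Values copied from the "Results" section of the excerpt.
REPORTED = Baseline(
    mb_per_token_mean=2340.0,
    mb_per_token_std=4.0,
    tokens_per_sec=11.43,
    tokens_per_run=200,
)

summary = summarize(REPORTED)

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value:.2f}")
```

Under these assumptions, the reported standard deviation and mean reproduce the 0.17% coefficient of variation, and dividing the per-run total by 1024 gives about 457, which falls inside the ~456–459 GB range quoted in the takeaways. The implied read rate, roughly 26.7 thousand MB per second, is a derived figure and not one stated in the post.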
Read on discuss.huggingface.co: https://discuss.huggingface.co/t/aios-first-ground-truth-baseline-cpu-dram-measurement/174769
