B70: Quick and Early Benchmarks & Backend Comparison

Reddit r/LocalLLaMA, by /u/abotsis (https://www.reddit.com/user/abotsis), April 3, 2026

llama.cpp: f1f793ad0 (8657)

This is a quick attempt to just get it up and running. Lots of the oneAPI runtime is still using "stable" from Intel's repo. Kernel 6.19.8+deb13-amd64 with an updated xe firmware built. Vulkan is from Debian, but with the latest Mesa compiled from source. OpenVINO is 2026.0. Feels like everything is "barely on the brink of working" (which is to be expected).

sycl:

$ build/bin/llama-bench -hf unsloth/Qwen3.5-27B-GGUF:UD-Q4_K_XL -p 512,16384 -n 128,512

| model                          |       size |     params | backend    | ngl |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| qwen35 27B Q4_K - Medium       |  16.40 GiB |    26.90 B | SYCL       |  99 |           pp512 |        798.07 ± 2.72 |
| qwen35 27B Q4_K - Medium       |  16.40 GiB |    26.90 B | SYCL       |  99 |         pp16384

Could not retrieve the full article text.
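
For anyone wanting to line up backends side by side, a minimal Python sketch for pulling the numbers out of a llama-bench markdown table like the SYCL one above. The seven-column layout is taken from the table in the post; the function name `parse_llama_bench_table` and the dict keys are made up for illustration and are not part of llama.cpp. Incomplete rows (such as the cut-off pp16384 row) are skipped rather than guessed at.

```python
def parse_llama_bench_table(text):
    """Return one dict per complete data row of a llama-bench markdown table.

    Assumes the column order shown in the post:
    model | size | params | backend | ngl | test | t/s
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Only keep lines that are fully delimited table rows.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 7:
            continue
        # Skip the header row and the |---|---:| separator row.
        if cells[0] == "model" or set(cells[0]) <= set("-: "):
            continue
        mean, _, std = cells[6].partition("±")
        try:
            rows.append({
                "model": cells[0],
                "backend": cells[3],
                "ngl": int(cells[4]),
                "test": cells[5],
                "tps": float(mean),
                "tps_std": float(std) if std.strip() else 0.0,
            })
        except ValueError:
            continue  # non-numeric cell, leave the row out
    return rows


# Rows copied from the post; the last one is truncated in the source.
sample = """\
| model | size | params | backend | ngl | test | t/s |
| --- | ---: | ---: | --- | --: | ---: | ---: |
| qwen35 27B Q4_K - Medium | 16.40 GiB | 26.90 B | SYCL | 99 | pp512 | 798.07 ± 2.72 |
| qwen35 27B Q4_K - Medium | 16.40 GiB | 26.90 B | SYCL | 99 | pp16384
"""

for row in parse_llama_bench_table(sample):
    print(row["backend"], row["test"], row["tps"], row["tps_std"])
```

Running the same function over the output of each backend's run and grouping by the `test` key would give a per-test comparison without retyping numbers.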

Original source

Reddit r/LocalLLaMA

https://www.reddit.com/r/LocalLLaMA/comments/1sbt1em/b70_quick_and_early_benchmarks_backend_comparison/
