b8640
tests : add unit test coverage for llama_tensor_get_type (#20112)

- Add unit test coverage for llama_tensor_get_type
- Fix merge conflicts, add more schemas
- clang formatter changes
- Trailing whitespace
- Update name
- Start rebase
- Updating files with upstream changes prior to rebase
- Changes needed from rebase
- Update attn_qkv schema, change throw behaviour
- Fix merge conflicts
- White space
- Update with latest changes to state counters
- Revert accidental personal CLAUDE.md changes
- Change quotation mark
- Reuse metadata.name since we have it
- Move test-only stuff out of llama-quant.cpp
- Hide the regex functionality back in llama-quant.cpp, use a unique pointer to a new struct 'compiled_tensor_type_patterns' which contains the patterns
- cont : initial deslop guidelines
- Cleanup based on review comments
- Continue cleanup
- Small cleanup
- Manually set proper ordering of tensors, mostly applies to gemma
- Formatting
- Update tests/test-quant-type-selection.cpp
- Fix merge conflicts

Co-authored-by: Sigbjørn Skjæret [email protected]
Co-authored-by: Georgi Gerganov [email protected]
macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu arm64 (CPU)
- Ubuntu s390x (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu arm64 (Vulkan)
- Ubuntu x64 (ROCm 7.2)
- Ubuntu x64 (OpenVINO)

Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)

openEuler:
- openEuler x86 (310p)
- openEuler x86 (910b, ACL Graph)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b, ACL Graph)