b8661
llama: add custom newline split for Gemma 4 (#21406)

Release binaries:
macOS/iOS: macOS Apple Silicon (arm64); macOS Intel (x64); iOS XCFramework
Linux: Ubuntu x64 (CPU); Ubuntu arm64 (CPU); Ubuntu s390x (CPU); Ubuntu x64 (Vulkan); Ubuntu arm64 (Vulkan); Ubuntu x64 (ROCm 7.2); Ubuntu x64 (OpenVINO)
Windows: Windows x64 (CPU); Windows arm64 (CPU); Windows x64 (CUDA 12) - CUDA 12.4 DLLs; Windows x64 (CUDA 13) - CUDA 13.1 DLLs; Windows x64 (Vulkan); Windows x64 (SYCL); Windows x64 (HIP)
openEuler: openEuler x86 (310p); openEuler x86 (910b, ACL Graph); openEuler aarch64 (310p); openEuler aarch64 (910b, ACL Graph)
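For anyone grabbing one of these prebuilt bundles, the GitHub CLI can fetch release assets by tag. A hedged sketch: the repo and tag come from the release above, but the asset filename pattern is an assumption, so list the real asset names first and narrow the pattern to match.

```shell
# Download a b8661 prebuilt bundle with the GitHub CLI (gh).
# The --pattern value is a guess at the asset naming; run `gh release view`
# first to see the actual filenames, then adjust the pattern.
if command -v gh >/dev/null 2>&1; then
  gh release view b8661 --repo ggml-org/llama.cpp          # list the assets
  gh release download b8661 --repo ggml-org/llama.cpp \
    --pattern '*win*vulkan*' --dir llama-b8661              # fetch one bundle
fi
```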

More about llama
This is how Spain is already working to advance autonomous transport
Reaching the campus of the University of Vigo (Uvigo) requires a car or public transport. Only a few of its centers sit in the city center: most are in the university city that was built in the 90s on what until then were hills. Dependence on means of transport is therefore beyond question. Perhaps that is also why this is a space with the potential to test new vehicle models. Since the start of the year, the Vigo campus has been testing what life is like with an autonomous bus. The vehicle moves students (and anyone who wants to try it) along a line that runs through the campus, a university city that "brings together the ideal conditions: a complex environment, with diverse users and real transport needs of last

Best model for 4090 as AI Coding Agent
Good day. I am looking for the best local model for a coding agent. I might've missed something, or some model that is not that widely used, so I came here for help. Currently these are the models I have found useful in agentic coding, via Google's turbo quant applied on llama.cpp: GLM 4.7 Flash Q4_K_M -> 30B; Nemotron 3 Q4_K_M -> 30B; Qwen3 Coder Next Q4_K_M -> 80B. I really tried to get Qwen3 Coder Next to a decent t/s for input and output, as I thought it would be a killer, but to my surprise it sometimes makes such silly mistakes that I have to do lots of babysitting in the agentic flow. GLM 4.7 and Nemotron are the ones I really can't decide between: both have decent t/s for agentic coding, and I use both at a maxed-out context window. The thing is that I feel there might be some model that ju

Running a local LLM on Android with Termux and llama.cpp
What I used: a Samsung S21 Ultra, Termux, llama-cpp-cli, llama-cpp-server, and Qwen3.5-0.8B with Q5_K_M quantization from Hugging Face. (I also tried Bonsai-8B-GGUF-1bit from Hugging Face; it is a newer model and required a different setup, which I might write about at a later time, but it produced 2-3 TPS and I did not find that to be usable.)
Installation: I downloaded the "Termux" app from the Google Play store and installed the needed tools in Termux: pkg update, pkg upgrade -y, pkg install llama-cpp -y.
Downloading a model: I downloaded Qwen3.5-0.8B-Q5_K_M.gguf in my phone browser and saved it to my device. Then I opened the download folder shortcut in the browser, selected the GGUF file, and chose "open with: Termux". Now the file is accessible in Termux.
Running it in the terminal: After that, I loaded the
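The steps above condense into a short Termux session. A hedged sketch: the llama-cpp package name and pkg commands come from the article, but the model path and the llama-cli flags are assumptions to adjust for your own setup.

```shell
# Sketch of the Termux workflow described above. Guarded so the
# Termux-specific commands only run when `pkg` is actually present.
MODEL="$HOME/downloads/Qwen3.5-0.8B-Q5_K_M.gguf"   # assumed download location

if command -v pkg >/dev/null 2>&1; then
  pkg update && pkg upgrade -y
  pkg install llama-cpp -y          # provides llama-cli and llama-server
fi

if command -v llama-cli >/dev/null 2>&1; then
  # -m: model file, -p: prompt, -n: max tokens to generate
  llama-cli -m "$MODEL" -p "Hello from Termux" -n 64
fi
```

On a phone, expect the first load to take a while; the Q5_K_M file is small enough to fit comfortably in RAM, which is why the article leans on it rather than the 8B model.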
More in Models

MCP maintainers from Anthropic, AWS, Microsoft, and OpenAI lay out enterprise security roadmap at Dev Summit
In a roundtable panel at the MCP Dev Summit last week in New York, Model Context Protocol (MCP) maintainers from Anthropic, AWS, Microsoft, and OpenAI laid out an enterprise security roadmap. (The New Stack)

Why harness engineering is becoming the new AI moat
The recent leak of Anthropic's Claude Code reveals a hard truth: as LLMs become commoditized, the sophisticated engineering harness built around them is becoming the real moat. (TechTalks)

