
v4.3

text-gen-webui Releases | by oobabooga | April 3, 2026 | 3 min read

Changes

ik_llama.cpp support: Add ik_llama.cpp as a new backend, with new textgen-portable-ik portable builds and a new --ik flag for full installs. ik_llama.cpp is a fork by the author of the imatrix quants. It includes support for new quant types, significantly more accurate KV cache quantization (via Hadamard KV cache rotation, enabled by default), and optimizations for MoE models and CPU inference.

API: Add echo + logprobs for /v1/completions. The completions endpoint now supports the echo and logprobs parameters, returning token-level log probabilities for both prompt and generated tokens. Token IDs are also included in the output via a new top_logprobs_ids field.

Further optimize my custom gradio fork, saving up to 50 ms per UI event (button click, etc.).

Transformers: Autodetect torch_dtype fr
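For the /v1/completions change listed above, here is a minimal sketch of how a client might request prompt and generation log probabilities. Several things are assumptions rather than facts from the release note: the local address in API_URL, the use of an integer for logprobs (as in the OpenAI-style completions format), and the response layout with tokens and token_logprobs under choices[0].logprobs. The helper names build_payload, send, and token_logprobs are hypothetical. The shape of the new top_logprobs_ids field is not described in the note, so the sketch does not parse it.

```python
import json
import urllib.request

# Assumption: address of a locally running OpenAI-compatible API; adjust to your setup.
API_URL = "http://127.0.0.1:5000/v1/completions"


def build_payload(prompt, max_tokens=16, top_n=5):
    """Build a completions request asking for prompt and generated logprobs."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "echo": True,       # also return the prompt tokens
        "logprobs": top_n,  # assumed OpenAI-style integer: top alternatives per token
    }


def send(payload, url=API_URL):
    """POST the payload as JSON and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def token_logprobs(response):
    """Pair each token with its log probability (assumed OpenAI-style layout)."""
    lp = response["choices"][0]["logprobs"]
    return list(zip(lp["tokens"], lp["token_logprobs"]))
```

With a server running, something like token_logprobs(send(build_payload("The capital of France is"))) would give one (token, logprob) pair per prompt and generated token, provided the response follows the assumed layout.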

Original source

text-gen-webui Releases

https://github.com/oobabooga/text-generation-webui/releases/tag/v4.3



