1.13.0
Hey there, little explorer! Guess what? Our robot friends, the ones that help us learn and play, just got a super cool update!
Imagine your favorite toy car. Sometimes, it gets new wheels or a faster engine, right? That's what happened here!
Our robot friends learned new tricks! They can now talk better, remember things more easily, and even see pictures better, like when you look at a colorful book! Some little boo-boos, like a wobbly wheel, got fixed too.
So, now our robot pals are even smarter and happier, ready to help us more! Yay! 🎉
What's Changed

Features

- Add RuntimeState RootModel for unified state serialization
- Enhance event listener with new telemetry spans for skill and memory events
- Add A2UI extension with v0.8/v0.9 support, schemas, and docs
- Emit token usage data in LLMCallCompletedEvent
- Auto-update deployment test repo during release
- Improve enterprise release resilience and UX
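The RuntimeState item refers to Pydantic's `RootModel`, which wraps a single value so heterogeneous runtime state can be serialized and restored through one entry point. A minimal sketch of the pattern; the class name and example fields here are illustrative, not CrewAI's actual definition:

```python
from typing import Any

from pydantic import RootModel

# Illustrative only: a RootModel wrapping a plain dict gives every piece
# of runtime state one uniform (de)serialization path.
class RuntimeState(RootModel[dict[str, Any]]):
    pass

state = RuntimeState({"step": 3, "inputs": {"topic": "release notes"}})
payload = state.model_dump_json()  # one JSON string for the whole state
restored = RuntimeState.model_validate_json(payload)
assert restored.root == state.root
```

The benefit of `RootModel` over an ad-hoc dict is that validation and JSON round-tripping come from the same model machinery the rest of the codebase already uses.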
Bug Fixes

- Add tool repository credentials to crewai install
- Add tool repository credentials to uv build in tool publish
- Pass fingerprint metadata via config instead of tool args
- Handle GPT-5.x models not supporting the stop API parameter
- Add GPT-5 and o-series to multimodal vision prefixes
- Bust uv cache for freshly published packages in enterprise release
- Cap lancedb below 0.30.1 for Windows compatibility
- Fix RBAC permission levels to match actual UI options
- Fix inaccuracies in agent-capabilities across all languages
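The stop-parameter fix follows a common pattern: strip completion parameters that a model family does not accept before building the API request. A hedged sketch of that pattern; the prefix list and function name are assumptions, not CrewAI's actual code:

```python
# Hypothetical: model-name prefixes assumed not to accept `stop`.
UNSUPPORTED_STOP_PREFIXES = ("gpt-5", "o1", "o3", "o4")

def prepare_completion_params(model: str, params: dict) -> dict:
    """Drop the `stop` parameter for models that reject it."""
    if model.startswith(UNSUPPORTED_STOP_PREFIXES):
        params = {k: v for k, v in params.items() if k != "stop"}
    return params

safe = prepare_completion_params("gpt-5.1", {"temperature": 0.2, "stop": ["\n"]})
# `safe` no longer carries `stop`, so the request is not rejected upstream.
```

Filtering at the request-building layer keeps callers unchanged: they can always pass `stop`, and the guard decides per model whether it survives.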
Documentation

- Add coding agent skills demo video to getting started pages
- Add comprehensive SSO configuration guide
- Add comprehensive RBAC permissions matrix and deployment guide
- Update changelog and version for v1.13.0
Performance

- Reduce framework overhead with lazy event bus, skip tracing when disabled
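The performance item describes two common patterns: construct the event bus only on first use, and short-circuit tracing entirely when it is disabled. A hedged sketch of both; all names here are illustrative, not CrewAI's internals:

```python
from functools import lru_cache

class EventBus:
    def __init__(self) -> None:
        self.handlers = []  # imagine costly setup happening here

    def emit(self, event) -> None:
        for handler in self.handlers:
            handler(event)

@lru_cache(maxsize=1)
def get_event_bus() -> EventBus:
    # Lazy: the bus is built on first emit, not at import time.
    return EventBus()

TRACING_ENABLED = False

def trace(name: str) -> None:
    if not TRACING_ENABLED:
        return  # skip span creation entirely when tracing is off
    get_event_bus().emit(("span", name))
```

The early return in `trace` means disabled tracing costs one boolean check per call, and `lru_cache` guarantees a single shared bus instance.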
Refactoring

- Convert Flow to Pydantic BaseModel
- Convert LLM classes to Pydantic BaseModel
- Replace InstanceOf[T] with plain type annotations
- Remove unused third_party LLM directory
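The `InstanceOf[T]` item refers to Pydantic's `InstanceOf` marker, which validates a field with a bare `isinstance` check instead of a full schema. Replacing it with a plain annotation typically requires `arbitrary_types_allowed` for non-Pydantic types. A minimal sketch of the before/after; the `Engine` class is illustrative:

```python
from pydantic import BaseModel, ConfigDict, InstanceOf

class Engine:
    pass

# Before: InstanceOf[Engine] validates via isinstance, no schema needed.
class Before(BaseModel):
    engine: InstanceOf[Engine]

# After: a plain annotation; arbitrary_types_allowed lets Pydantic accept
# the non-Pydantic type, still checked with isinstance under the hood.
class After(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    engine: Engine
```

Both forms accept an `Engine` instance unchanged; the plain annotation simply reads as ordinary typed Python and plays better with static type checkers.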
Contributors

@alex-clawd, @dependabot[bot], @greysonlalonde, @iris-clawd, @joaomdmoura, @lorenzejay, @lucasgomide, @thiagomoretto