LLMs as Idiomatic Decompilers: Recovering High-Level Code from x86-64 Assembly for Dart

arXiv cs.SE [Submitted on 2 Apr 2026]

Abstract: Translating machine code into human-readable high-level languages is an open research problem in reverse engineering. Despite recent advances in LLM-based decompilation to C, modern languages like Dart and Swift remain unexplored. In this paper, we study the use of small specialized LLMs as idiomatic decompilers for such languages. Additionally, we investigate augmenting the training data with synthetic same-language examples, and compare this against adding human-written examples from a related language (Swift -> Dart). We apply CODEBLEU to evaluate the readability of the decompiled code and compile@k to measure its syntactic correctness. On a 73-function Dart test dataset (representing diverse complexity levels), our 4B specialized model achieves 71.3 CODEBLEU (95% CI 65.5-77.1), approximately comparable to a ~480B code model (73.1; 67.4-78.8). On a subset of 34 natural Dart functions, it reaches compile@k5 = 79.4% (Wilson 95% CI 63.2-89.7), vs. 64.7% (47.9-78.5) for the base model; the difference is suggestive but not statistically significant at the 0.05 level. Our results indicate that adding Swift training data helps at 8B but not at 4B, suggesting a capacity threshold for effective cross-lingual transfer. Overall, the results show that small specialized models can generate readable, idiomatic Dart with meaningful identifiers while using minimal compute.
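The Wilson intervals quoted in the abstract can be checked with a few lines of Python. This is a minimal sketch, not the authors' evaluation code. It assumes the reported percentages correspond to 27 of 34 and 22 of 34 functions compiling; those counts are inferred from 79.4% and 64.7% and are not stated in the abstract. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Assumed counts, inferred from the reported percentages (not given in the abstract).
    for label, successes in [("specialized 4B", 27), ("base model", 22)]:
        low, high = wilson_interval(successes, 34)
        print(f"{label}: {successes / 34:.1%} (Wilson 95% CI {low:.1%} to {high:.1%})")
```

Under those assumed counts, the computed bounds agree with the reported intervals to within rounding. The two intervals overlap substantially, which is in line with the abstract's caution about significance, although interval overlap alone is not a formal test.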

Comments: 5 pages, 1 figure, 3 tables. Accepted at SANER 2026 ERA Track

Subjects: Software Engineering (cs.SE)

ACM classes: D.3.4; I.2.7; D.2.7

Cite as: arXiv:2604.02278 [cs.SE] (or arXiv:2604.02278v1 [cs.SE] for this version)

https://doi.org/10.48550/arXiv.2604.02278

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Raafat Abualazm
[v1] Thu, 2 Apr 2026 17:12:36 UTC (27 KB)
