LLMs as Idiomatic Decompilers: Recovering High-Level Code from x86-64 Assembly for Dart
Abstract: Translating machine code into human-readable high-level languages is an open research problem in reverse engineering. Despite recent advances in LLM-based decompilation to C, modern languages such as Dart and Swift remain unexplored. In this paper, we study the use of small specialized LLMs as idiomatic decompilers for such languages. Additionally, we investigate augmenting the training data with synthetic same-language examples, and compare this against adding human-written examples from a related language (Swift -> Dart). We use CodeBLEU to evaluate the readability of the decompiled code and compile@k to measure syntactic correctness. On a 73-function Dart test dataset (representing diverse complexity levels), our 4B specialized model achieves 71.3 CodeBLEU (95% CI 65.5-77.1), comparable to a ~480B code model (73.1; 67.4-78.8). On a subset of 34 natural Dart functions, it reaches compile@5 = 79.4% (Wilson 95% CI 63.2-89.7), vs. 64.7% (47.9-78.5) for the base model; the difference is suggestive but not statistically significant at the 0.05 level. Our results indicate that adding Swift training data helps at 8B but not at 4B, suggesting a capacity threshold for effective cross-lingual transfer. Overall, our experiments show that small specialized models can generate readable, idiomatic Dart with meaningful identifiers while using minimal compute.
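The two statistics in the abstract can be reproduced with a short sketch. The compile@k estimator below follows the standard unbiased pass@k-style formula (an assumption; the paper does not spell out its estimator), and the Wilson score interval matches the reported confidence bounds. The count 27/34 is inferred from the reported 79.4% rate, not stated in the abstract.

```python
from math import comb, sqrt

def compile_at_k(n: int, c: int, k: int) -> float:
    """Pass@k-style estimator: probability that at least one of k samples,
    drawn without replacement from n generations of which c compile,
    compiles. Equals 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def wilson_ci(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (95% for z = 1.96)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return center - half, center + half

# 79.4% of 34 natural Dart functions corresponds to 27 successes (inferred):
lo, hi = wilson_ci(27, 34)
print(f"{27/34:.1%} (Wilson 95% CI {lo:.1%}-{hi:.1%})")  # 79.4% (63.2%-89.7%)
```

With k equal to the number of generations per function, compile@k reduces to "did any sample compile"; the estimator only matters when k is smaller than the sample budget n.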
Comments: 5 pages, 1 figure, 3 tables. Accepted at SANER 2026 ERA Track
Subjects:
Software Engineering (cs.SE)
ACM classes: D.3.4; I.2.7; D.2.7
Cite as: arXiv:2604.02278 [cs.SE]
(or arXiv:2604.02278v1 [cs.SE] for this version)
https://doi.org/10.48550/arXiv.2604.02278
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Raafat Abualazm [v1] Thu, 2 Apr 2026 17:12:36 UTC (27 KB)