Detecting Complex Money Laundering Patterns with Incremental and Distributed Graph Modeling
Authors: Haseeb Tariq, Alen Kaja, Marwan Hassani
Abstract: Money launderers exploit limitations in existing detection approaches by concealing their financial footprints. They do so by replicating transaction patterns that monitoring systems cannot easily distinguish. As a result, criminally gained assets are pushed into legitimate financial channels without drawing attention. Algorithms developed to monitor money flows often struggle with scale and complexity. Detection is made harder still by the persistent inability of current solutions to control the excessive number of false positives produced by rigid, risk-based rule systems. We propose a framework called ReDiRect (REduce, DIstribute, and RECTify), specifically designed to overcome these challenges. The primary contribution of our work is a novel framing of this problem in an unsupervised setting, where a large transaction graph is fuzzily partitioned into smaller, manageable components to enable fast processing in a distributed manner. In addition, we define a refined evaluation metric that better captures the effectiveness of exposed money laundering patterns. Through comprehensive experimentation, we demonstrate that our framework outperforms existing and state-of-the-art techniques, particularly in terms of efficiency and real-world applicability. For validation, we used the real (open source) Libra dataset and the recently released synthetic datasets by IBM Watson. Our code and datasets are available at this https URL.
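The abstract does not say how the fuzzy partitioning is carried out, so the following is only an illustrative sketch of the general idea and not the ReDiRect algorithm. It assumes a "fuzzy" partition means that components may overlap, and it swaps in a simple k-hop neighbourhood around chosen seed accounts as the partitioning rule. All function names, the seed choice, and the placeholder per-component score are hypothetical.

```python
from collections import defaultdict, deque
from concurrent.futures import ThreadPoolExecutor


def build_adjacency(edges):
    """Undirected adjacency sets from (sender, receiver, amount) transactions."""
    adj = defaultdict(set)
    for src, dst, _amount in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj


def k_hop_component(adj, seed, k):
    """Accounts reachable from `seed` within k hops (breadth-first search)."""
    seen = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen


def fuzzy_partition(edges, seeds, k=1):
    """Assumed partitioning rule: one k-hop component per seed; components may overlap."""
    adj = build_adjacency(edges)
    return {seed: k_hop_component(adj, seed, k) for seed in seeds}


def total_flow(edges, accounts):
    """Placeholder score: total amount moved between accounts inside one component."""
    return sum(amt for src, dst, amt in edges if src in accounts and dst in accounts)


# Toy chain of transactions A -> B -> C -> D -> E.
edges = [("A", "B", 100.0), ("B", "C", 95.0), ("C", "D", 90.0), ("D", "E", 85.0)]
parts = fuzzy_partition(edges, seeds=["B", "D"], k=1)

# Each component is scored independently, so the work can be spread across workers.
with ThreadPoolExecutor() as pool:
    scores = dict(zip(parts, pool.map(lambda accts: total_flow(edges, accts), parts.values())))
```

In this toy example account C falls into both components, which is the sense of "fuzzy" assumed here. A thread pool stands in for a distributed backend; a real deployment would hand each component to a separate worker process or cluster node.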
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2604.01315 [cs.LG]
(or arXiv:2604.01315v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2604.01315
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Haseeb Tariq
[v1] Wed, 1 Apr 2026 18:40:58 UTC (1,322 KB)