Beyond Static RAG: Using 1958 Biochemistry to Beat Multi-Hop Retrieval by 14%
Standard Retrieval-Augmented Generation (RAG) often falls short on complex, multi-hop questions because it relies on static "lock and key" query matching. If the information needed to answer a query is semantically distant from the original text, standard vector search simply won't find it.
We've developed Induced-Fit Retrieval (IFR), a dynamic graph traversal approach that mutates the query vector at every step to discover semantically distant but logically connected information.
The Core Results

We ran our prototype through a rigorous test suite of 30 queries across multiple graph sizes, up to 5.2 million atoms.
14.3% higher nDCG@10 compared to a competitive RAG-rerank baseline.
15% Multi-hop Hit@20 in scenarios where traditional RAG methods scored 0%.
O(1) Latency Scaling: Latency remains near 10ms whether searching 100 atoms or 5.2 million.
Why Biochemistry?

The system is inspired by Daniel Koshland's 1958 "induced fit" model. In biology, enzymes change shape upon encountering a substrate to improve binding.
IFR applies this to Information Retrieval: instead of a static query vector, the vector mutates at each hop based on the visited node's embedding. This allows the query to follow the "curved manifolds" of high-dimensional embedding space that a fixed vector cannot reach.
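The post doesn't spell out the exact update rule, but one plausible reading of "the vector mutates at each hop based on the visited node's embedding" is a normalized convex blend of the current query and the node just visited. A minimal sketch under that assumption (the function name and `mutation_rate` parameter are hypothetical, not from the IFR codebase):

```python
import numpy as np

def induced_fit_step(query: np.ndarray, node_emb: np.ndarray,
                     mutation_rate: float = 0.3) -> np.ndarray:
    """One hop of an induced-fit-style traversal (illustrative sketch).

    The query vector "changes shape" toward the embedding of the node
    just visited, analogous to an enzyme deforming around its substrate.
    `mutation_rate` controls how aggressively the query bends.
    """
    mutated = (1.0 - mutation_rate) * query + mutation_rate * node_emb
    # Renormalize so cosine-similarity search stays well-behaved.
    return mutated / np.linalg.norm(mutated)

# Example: a query orthogonal to a node embedding bends toward it.
q = np.array([1.0, 0.0])
node = np.array([0.0, 1.0])
q_next = induced_fit_step(q, node)
```

Iterating this step over a multi-hop path is what lets the effective query drift toward content the original vector had near-zero similarity with.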
Lessons from the Data

Transparency is key to research, so we are also sharing our failures:
Catastrophic Drift: 67% of our failures occurred because the query mutated too aggressively, losing its original intent.
The Solution: v2 will implement an "Alpha Floor" to preserve at least 50% of the original query signal at all times.
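The post gives only the 50% figure, but an "Alpha Floor" can be read as a clamp on the blend weight assigned to the original query at every hop, so drift can never fully erase the starting intent. A hedged sketch under that assumption (`apply_alpha_floor` and its parameters are hypothetical names):

```python
import numpy as np

def apply_alpha_floor(original_query: np.ndarray,
                      drifted_query: np.ndarray,
                      alpha: float,
                      alpha_floor: float = 0.5) -> np.ndarray:
    """Blend the drifted query back toward the original, clamping the
    original's weight at `alpha_floor` to bound catastrophic drift."""
    alpha = max(alpha, alpha_floor)  # never below 50% original signal
    blended = alpha * original_query + (1.0 - alpha) * drifted_query
    return blended / np.linalg.norm(blended)
```

With the default floor, even a traversal that would otherwise weight the original query at 10% is pulled back to an even split, trading some reach for intent preservation.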
We have open-sourced the prototype, our 18 raw JSON result logs, ablation studies, and full technical reports.
Check out the repo on GitHub: https://github.com/emil-celestix/celestix-ifr