[Side A] Completely Defending Python from OOM Kills: The BytesIO Trap and D-MemFS 'Hard Quota' Design Philosophy
From the Author: Recently, I introduced D-MemFS on Reddit. The response was overwhelming, confirming that memory management and file I/O performance are truly universal challenges for developers everywhere. This series is my response to that global interest.
🧭 About this Series: The Two Sides of Development
To provide a complete picture of this project, I’ve split each update into two perspectives:
- Side A (Practical / from Qiita): Implementation details, benchmarks, and technical solutions.
- Side B (Philosophy / from Zenn): The development war stories, AI-collaboration, and design decisions.
Introduction
If you write in-memory processing in Python, you will eventually encounter this kind of failure:
Killed
Or on Windows, the process simply vanishes without a word. This is an OOM (out-of-memory) kill. Both io.BytesIO and dict grow without limit until memory runs out, and the process disappears without telling you "where" or "why" it crashed. This is one of the most troublesome pitfalls of in-memory processing in Python.
In this article, I will dig into how the Hard Quota design of D-MemFS solves this problem, right from its core design philosophy.
The Problem: BytesIO and dict Swell Limitlessly
First, let's clarify the problem.
    from io import BytesIO

    buf = BytesIO()

    # It won't stop no matter how much you write
    # It continues to succeed until physical memory runs out
    for i in range(100_000):
        buf.write(b"x" * 10_000)

    print(buf.tell())  # 1,000,000,000 bytes (about 0.93 GiB)
This write does not fail. It stubbornly continues succeeding until the OS kills the process.
The same applies to dict.
    vfs: dict[str, bytes] = {}
    for i in range(100_000):
        vfs[f"file_{i}.bin"] = b"x" * 10_000
    # No errors while about 1 GB piles up
Soft Quotas Are Not Enough
An approach like "checking the size and warning after writing" is called a soft quota. But this has a fundamental flaw—the data has already been written.
    # Pseudo-implementation of a soft quota (a bad example)
    from io import BytesIO

    MAX_BYTES = 100 * 1024 * 1024  # 100 MiB
    total = 0

    def soft_write(buf: BytesIO, data: bytes) -> None:
        global total
        buf.write(data)                          # <- Writes first
        total += len(data)
        if total > MAX_BYTES:                    # <- Notices after writing
            raise MemoryError("quota exceeded")  # Too late
The moment the threshold is exceeded, the memory has already been consumed. Furthermore, rolling back the written data after throwing an exception is not easy.
D-MemFS's Hard Quota Design
The D-MemFS quota operates on a Central Bank model. Before a write is executed, it checks the remaining quota balance and immediately rejects the write if there isn't enough.
    write(data) is called
      ↓
    Requests a reservation of len(data) bytes from the Quota Manager
      ↓
    Is the balance sufficient?
      YES -> Decreases balance and executes write
      NO  -> raises MFSQuotaExceededError (the write never happens)
Data is never written. The file is not polluted, and you can catch the exception and continue processing.
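The reserve-then-write flow above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not D-MemFS's actual implementation: `QuotaLedger`, `QuotaExceeded`, and `guarded_write` are hypothetical names made up for this example.

```python
from io import BytesIO


class QuotaExceeded(Exception):
    """Hypothetical stand-in for MFSQuotaExceededError."""


class QuotaLedger:
    """Tracks a byte budget; a reservation must succeed before any write."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used = 0

    def reserve(self, size: int) -> None:
        # Reject up front, so the caller never touches the buffer on failure.
        if self.used + size > self.max_bytes:
            remaining = self.max_bytes - self.used
            raise QuotaExceeded(f"requested {size} bytes, only {remaining} left")
        self.used += size


def guarded_write(ledger: QuotaLedger, buf: BytesIO, data: bytes) -> int:
    ledger.reserve(len(data))  # 1. reserve (may raise)
    return buf.write(data)     # 2. write only after the reservation succeeded


ledger = QuotaLedger(max_bytes=1024)
buf = BytesIO()
guarded_write(ledger, buf, b"x" * 1000)     # fits in the budget
try:
    guarded_write(ledger, buf, b"y" * 100)  # would push usage past 1024 bytes
except QuotaExceeded as exc:
    print(f"rejected: {exc}")
print(buf.tell(), ledger.used)  # the buffer still holds only the first write
```

Because the rejection happens before `buf.write`, there is nothing to roll back, which is the property the soft quota lacks.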
Code Example: Actually Using the Quota
Basic Quota Settings and Exception Handling
    from dmemfs import MemoryFileSystem, MFSQuotaExceededError

    # 10 MiB Hard Quota
    mfs = MemoryFileSystem(max_quota=10 * 1024 * 1024)
    mfs.mkdir("/data")

    def safe_write(mfs: MemoryFileSystem, path: str, data: bytes) -> bool:
        """Can continue processing even if writing fails"""
        try:
            with mfs.open(path, "wb") as f:
                f.write(data)
            return True
        except MFSQuotaExceededError as e:
            print(f"[Warning] Skipped writing to {path} due to quota excess: {e}")
            return False

    # Success Case
    safe_write(mfs, "/data/small.bin", b"x" * (1 * 1024 * 1024))  # 1 MiB → OK

    # Failure Case (Exceeds quota)
    safe_write(mfs, "/data/big.bin", b"x" * (20 * 1024 * 1024))  # 20 MiB → Skipped

    # The file is not polluted (opening 'wb' leaves an empty file, but no data was written)
    st = mfs.stat("/data/big.bin")
    print(st["size"])  # 0
Processing While Checking the Remaining Quota
    from dmemfs import MemoryFileSystem, MFSQuotaExceededError

    QUOTA = 64 * 1024 * 1024  # 64 MiB
    mfs = MemoryFileSystem(max_quota=QUOTA)
    mfs.mkdir("/chunks")

    def process_stream(stream, chunk_size: int = 4 * 1024 * 1024):
        """Reads a stream into memory by chunks"""
        chunk_index = 0
        written_paths = []

        for chunk in iter(lambda: stream.read(chunk_size), b""):
            path = f"/chunks/chunk_{chunk_index:04d}.bin"
            try:
                with mfs.open(path, "wb") as f:
                    f.write(chunk)
                written_paths.append(path)
                chunk_index += 1
            except MFSQuotaExceededError:
                print(f"Quota reached: Kept up to {chunk_index} chunks")
                break

        return written_paths
Node Count Limit: MFSNodeLimitExceededError
You can also set a limit on the number of files (nodes). This helps to quickly detect bugs that cause the file count to explode.
    from dmemfs import MemoryFileSystem, MFSNodeLimitExceededError

    # Max 100 files
    mfs = MemoryFileSystem(max_nodes=100)
    mfs.mkdir("/logs")

    for i in range(200):
        try:
            with mfs.open(f"/logs/entry_{i:04d}.log", "xb") as f:
                f.write(f"log entry {i}\n".encode())
        except MFSNodeLimitExceededError:
            print(f"Node limit reached: Stopped at {i} files")
            break
Storage Backends: SequentialMemoryFile and RandomAccessMemoryFile
D-MemFS has two types of storage backends.
SequentialMemoryFile (Sequential)
Implemented internally as a chain of byte sequences (list[bytes]).
- Fast appending and reading from the beginning
- Slow random access (needs to traverse chunks)
- High memory efficiency (fewer allocations)
RandomAccessMemoryFile (Random Access)
Implemented internally as a bytearray.
- Fast seek + read/write
- Because it pre-allocates buffers during writing, doing only sequential writing might result in wasted memory.
auto-promotion (Automatic Promotion)
When default_storage="auto" (the default), it observes the file access pattern and automatically switches backends.
    File created -> Starts as SequentialMemoryFile
      ↓
    Random access (seek) is detected
      ↓
    Automatically promoted to RandomAccessMemoryFile
    # You can also explicitly pin the backend
    from dmemfs import MemoryFileSystem

    mfs_seq = MemoryFileSystem(default_storage="sequential")      # Always sequential
    mfs_ra = MemoryFileSystem(default_storage="random_access")    # Always random access
    mfs_auto = MemoryFileSystem(default_storage="auto")           # Auto (default)
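To make the promotion idea concrete, here is a toy sketch of a file that starts as a list of chunks and switches to a bytearray the first time a write lands at an arbitrary offset. `ToyFile` and its methods are hypothetical names for illustration only; how D-MemFS actually detects random access and performs the switch may differ.

```python
class ToyFile:
    """Toy file: starts as a list of chunks, promotes to a bytearray on random access."""

    def __init__(self) -> None:
        self.chunks = []   # sequential representation (list of bytes objects)
        self.flat = None   # random-access representation (bytearray after promotion)

    @property
    def backend(self) -> str:
        return "sequential" if self.flat is None else "random_access"

    def append(self, data: bytes) -> None:
        if self.flat is None:
            self.chunks.append(data)  # cheap: existing data is not copied
        else:
            self.flat.extend(data)

    def _promote(self) -> None:
        # One-time copy: the chunk list and the new bytearray briefly coexist.
        self.flat = bytearray(b"".join(self.chunks))
        self.chunks = []

    def write_at(self, offset: int, data: bytes) -> None:
        if self.flat is None:
            self._promote()  # random access detected
        self.flat[offset:offset + len(data)] = data

    def getvalue(self) -> bytes:
        if self.flat is None:
            return b"".join(self.chunks)
        return bytes(self.flat)


f = ToyFile()
f.append(b"hello ")
f.append(b"world")
print(f.backend)     # sequential
f.write_at(0, b"J")
print(f.backend)     # random_access
print(f.getvalue())  # b'Jello world'
```

The copy inside `_promote` is the step that temporarily needs memory for both representations at once.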
promotion_hard_limit: Suppressing Promotion of Giant Files
Auto-promotion entails copying data into a bytearray upon random access. If this happens with an extremely large file, memory usage temporarily doubles.
By setting promotion_hard_limit, files exceeding this size will not automatically promote.
    from dmemfs import MemoryFileSystem

    # Files 64 MiB or larger will not automatically promote
    mfs = MemoryFileSystem(
        max_quota=512 * 1024 * 1024,
        promotion_hard_limit=64 * 1024 * 1024,
    )
In pipelines handling massive data, such as enterprise batch processing, this parameter acts as a safety net against memory spikes. Being able to strictly control the upper limit of memory usage pairs well with K8s memory limits and CI resource constraints, and ties directly into operational stability.
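The arithmetic behind this is simple enough to write down. The sketch below uses hypothetical helper names (`promotion_peak`, `may_promote`) and assumes the boundary rule "files exceeding the limit are not promoted" as stated above; the exact comparison D-MemFS uses at the boundary is not confirmed here.

```python
from typing import Optional

MiB = 1024 * 1024


def promotion_peak(file_size: int, other_usage: int = 0) -> int:
    """Peak bytes during promotion: the old chunks and the new bytearray coexist."""
    return other_usage + 2 * file_size


def may_promote(file_size: int, hard_limit: Optional[int]) -> bool:
    """Assumed rule: promote only if no limit is set or the file does not exceed it."""
    return hard_limit is None or file_size <= hard_limit


limit = 64 * MiB
for size in (16 * MiB, 256 * MiB):
    if may_promote(size, limit):
        print(f"{size // MiB} MiB: promote, peak about {promotion_peak(size) // MiB} MiB")
    else:
        print(f"{size // MiB} MiB: stay sequential, no copy")
```

With a 64 MiB limit, the 256 MiB file never triggers the copy that would otherwise briefly require about 512 MiB.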
Memory Accounting: What is Included in the Quota
The quota tracks more than just pure data bytes.
Quota consumption = Bytes of Actual Data + Chunk Overhead
Since SequentialMemoryFile retains data in chunks, a small amount of chunk header information is counted as overhead. Because of this, when configuring "Quota = 10 MiB", the actual memory usage reliably stays under 10 MiB (the actual data that fits will be slightly less due to the overhead).
This design prioritizes the guarantee that "the quota is absolutely never exceeded".
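As a worked example of this accounting, the sketch below charges a fixed overhead per chunk on top of the payload. The 64-byte figure is purely an assumption for illustration (the real per-chunk overhead in D-MemFS is not stated here), and `quota_cost` is a hypothetical helper.

```python
ASSUMED_CHUNK_OVERHEAD = 64  # bytes per chunk; illustrative value only


def quota_cost(chunk_sizes: list, overhead: int = ASSUMED_CHUNK_OVERHEAD) -> int:
    """Quota consumption = bytes of actual data + per-chunk overhead."""
    return sum(chunk_sizes) + overhead * len(chunk_sizes)


quota = 10 * 1024 * 1024  # 10 MiB
chunk = 64 * 1024         # 64 KiB per chunk

print(quota_cost([chunk] * 159) <= quota)  # True: 159 chunks plus overhead still fit
print(quota_cost([chunk] * 160) <= quota)  # False: the data alone is exactly 10 MiB
```

Under this assumption, 160 chunks of raw data would fill the quota exactly, so once overhead is counted the last chunk is refused and the payload ends up slightly below 10 MiB.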
Thread-Safe Atomic Operations
Quota updates are handled atomically under locks.
    from dmemfs import MemoryFileSystem, MFSQuotaExceededError
    import threading

    mfs = MemoryFileSystem(max_quota=10 * 1024 * 1024)
    mfs.mkdir("/concurrent")

    errors = []

    def writer(thread_id: int):
        for i in range(50):
            try:
                path = f"/concurrent/t{thread_id}f{i}.bin"
                with mfs.open(path, "xb") as f:
                    f.write(b"x" * (100 * 1024))  # 100 KiB each
            except Exception as e:
                errors.append(e)

    threads = [threading.Thread(target=writer, args=(i,)) for i in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Excess requests over the quota yield exceptions, but the file system isn't broken
    quota_errors = [e for e in errors if isinstance(e, MFSQuotaExceededError)]
    print(f"Quota exceeded: {len(quota_errors)} times (Normal behavior)")
    print("FS Corruption: None")
If the two steps of "checking the quota and writing" are separated, a race condition could occur where another thread cuts in between the check and the write to exhaust the quota. In D-MemFS, this verification and reservation are executed under a single RW lock, completely eliminating this conflict.
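The sketch below shows the check and the reservation sharing one critical section. It uses a plain `threading.Lock` instead of an RW lock to keep the example short, and `LockedLedger` is a hypothetical class, not part of D-MemFS.

```python
import threading


class LockedLedger:
    """Byte budget whose check and update happen under a single lock."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used = 0
        self._lock = threading.Lock()

    def try_reserve(self, size: int) -> bool:
        # No other thread can spend the remaining balance between
        # the comparison and the update, because both hold the lock.
        with self._lock:
            if self.used + size > self.max_bytes:
                return False
            self.used += size
            return True


ledger = LockedLedger(max_bytes=1_000_000)
granted = []  # one entry per successful reservation


def worker() -> None:
    for _ in range(500):
        if ledger.try_reserve(1_000):
            granted.append(1_000)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 8 threads request 4,000,000 bytes in total, but only the budget is granted.
print(ledger.used, sum(granted))
```

If the comparison and the increment were done without the lock, two threads could both pass the check against the same remaining balance and overshoot the limit.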
A World With Hard Quotas vs. Without
| | No Quota (BytesIO / dict) | D-MemFS Hard Quota |
|---|---|---|
| Behavior on memory exceedance | Process is OOM killed | MFSQuotaExceededError is raised |
| Detection timing | Unnoticed until OS kills it | Detected instantly before write |
| Catching the exception | Impossible (SIGKILL) | Recoverable with try/except |
| Rollback | Impossible | Unnecessary since write hasn't happened |
| File integrity | May be corrupted | Guaranteed |
| Logging / Monitoring | Often lost | Can be logged as an exception |
Quota Configuration Guidelines
Here are practical guidelines regarding what values to set.
    import psutil

    def recommended_quota() -> int:
        """
        Example of using a certain percentage of available memory as a quota.
        In production, a fixed value is more predictable.
        """
        available = psutil.virtual_memory().available
        return int(available * 0.25)  # 25% of available memory

    # Rule of thumb for actual use cases
    QUOTAS = {
        "unit_test": 32 * 1024 * 1024,                # 32 MiB: for testing
        "ci_pipeline": 256 * 1024 * 1024,             # 256 MiB: CI pipeline
        "batch_processing": 2 * 1024 * 1024 * 1024,   # 2 GiB: batch processing
    }
Basic Principle: Estimate the worst-case input size and set the quota to 1.5 - 2 times that amount. If that exceeds the total memory budget of the process, reconsider the design.
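This rule of thumb can be turned into a small helper. `suggest_quota` is a hypothetical function written for this article, and the choice to raise `ValueError` when the result does not fit the process budget is just one way to express "reconsider the design".

```python
MiB = 1024 * 1024


def suggest_quota(worst_case_input: int, process_budget: int, factor: float = 1.5) -> int:
    """Return worst_case_input * factor, or raise if that exceeds the process budget."""
    if not 1.5 <= factor <= 2.0:
        raise ValueError("factor should be between 1.5 and 2.0")
    quota = int(worst_case_input * factor)
    if quota > process_budget:
        raise ValueError(
            f"quota of {quota} bytes exceeds the process budget of "
            f"{process_budget} bytes; reconsider the design"
        )
    return quota


print(suggest_quota(200 * MiB, process_budget=1024 * MiB) // MiB)              # 300
print(suggest_quota(200 * MiB, process_budget=1024 * MiB, factor=2.0) // MiB)  # 400
try:
    suggest_quota(800 * MiB, process_budget=1024 * MiB)
except ValueError as exc:
    print(f"does not fit: {exc}")
```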
Memory Guard: Detecting Physical Memory Depletion in Advance
While a Hard Quota manages the "budget within the virtual FS," there remains another problem—when the set quota exceeds the physical memory of the host machine.
For example, even if you set max_quota=4GiB, if the machine only has 2 GiB of free memory, the OS will execute an OOM kill before reaching the quota. Hard quotas alone cannot prevent this.
The Memory Guard introduced in v0.3.0 addresses these "OOMs occurring outside the quota."
3 Modes
| Mode | Behavior |
|---|---|
| "none" | No checks (Default, backward compatible) |
| "init" | Checks if max_quota exceeds available memory at FS initialization |
| "per_write" | Checks physical memory balance on every write (interval specifiable) |
    from dmemfs import MemoryFileSystem

    # Detect insufficient memory upon initialization (Recommended)
    mfs = MemoryFileSystem(
        max_quota=4 * 1024 * 1024 * 1024,  # 4 GiB
        memory_guard="init",
        memory_guard_action="raise",  # If "warn", yields ResourceWarning
    )

    # Check per write (For stricter use cases)
    mfs = MemoryFileSystem(
        max_quota=4 * 1024 * 1024 * 1024,
        memory_guard="per_write",
        memory_guard_action="warn",
        memory_guard_interval=1.0,  # Check interval in seconds
    )
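Conceptually, the "init" check compares the configured quota with the memory that is available right now. The sketch below is a guess at that logic, not the library's code: `check_quota_at_init` is a hypothetical function, the available memory is passed in as a plain number, and `MemoryError` is an assumed exception type for the "raise" action (only the `ResourceWarning` for "warn" comes from the description above).

```python
import warnings

GiB = 1024 ** 3


def check_quota_at_init(max_quota: int, available: int, action: str = "raise") -> bool:
    """Return True if max_quota fits in available memory; otherwise raise or warn."""
    if max_quota <= available:
        return True
    message = f"max_quota ({max_quota} bytes) exceeds available memory ({available} bytes)"
    if action == "raise":
        raise MemoryError(message)  # assumed exception type
    warnings.warn(message, ResourceWarning)
    return False


print(check_quota_at_init(1 * GiB, available=2 * GiB))  # True
try:
    check_quota_at_init(4 * GiB, available=2 * GiB)
except MemoryError as exc:
    print(f"refused at startup: {exc}")
```

Failing at construction time moves the problem from an unexplained kill hours into a batch job to an explicit error on the first line of the run.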
The Relationship Between Hard Quotas and Memory Guard
It might be easier to understand with an analogy of a house.
- Hard Quota = The area of a room. A limit on how much baggage you can place.
- Memory Guard = The building's load-bearing limit check. Confirming whether the building can withstand that weight in the first place.
Only when both are present is the safety of in-memory processing truly complete.
Design Ingenuity of "per_write" Mode
Since the "per_write" mode queries the OS for physical memory balance every time, there are concerns about performance impact. To address this, the memory_guard_interval parameter can control the check interval. The default is 1 second—if 1 second hasn't passed since the last check, it uses the cached value.
    # Secures safety while maintaining performance even with high-frequency writes
    mfs = MemoryFileSystem(
        max_quota=1 * 1024 * 1024 * 1024,
        memory_guard="per_write",
        memory_guard_action="raise",
        memory_guard_interval=2.0,  # Checks every 2 seconds
    )
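The interval-based caching described above can be sketched as follows. `CachedMemoryProbe` is a hypothetical class for illustration; the probe and the clock are injected so the example runs without touching the OS. In real use the probe could be something like `lambda: psutil.virtual_memory().available`.

```python
import time
from typing import Callable, Optional


class CachedMemoryProbe:
    """Re-queries available memory at most once per `interval` seconds."""

    def __init__(
        self,
        probe: Callable[[], int],
        interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._interval = interval
        self._clock = clock
        self._last_checked: Optional[float] = None
        self._cached = 0
        self.probe_calls = 0  # counts how often the expensive query really ran

    def available(self) -> int:
        now = self._clock()
        if self._last_checked is None or now - self._last_checked >= self._interval:
            self._cached = self._probe()  # expensive OS query
            self._last_checked = now
            self.probe_calls += 1
        return self._cached               # otherwise reuse the cached value


fake_now = [0.0]  # manually advanced clock for the demonstration
guard = CachedMemoryProbe(
    probe=lambda: 2 * 1024 ** 3,
    interval=1.0,
    clock=lambda: fake_now[0],
)

for _ in range(1000):    # 1000 "writes" within the same instant
    guard.available()
print(guard.probe_calls)  # 1

fake_now[0] = 1.5         # more than one interval later
guard.available()
print(guard.probe_calls)  # 2
```

The trade-off is staleness: within one interval the guard acts on a value that may be up to `interval` seconds old, which is why a shorter interval is stricter and a longer one is cheaper.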
The Hard Quota guarantees that usage "absolutely never exceeds the quota", and the Memory Guard guarantees that the process "won't keep running while physical memory is lacking". This dual defense is the complete picture of D-MemFS's OOM countermeasures.
Behavior in free-threaded Python (GIL=0)
In the free-threaded build (python3.13t) available since Python 3.13, there is no GIL, so thread races surface more readily. D-MemFS has been tested in GIL=0 environments (369 tests × 3 OS × 3 Python versions), and quota atomicity is guaranteed by explicit locks regardless of the GIL.
    # Testing in free-threaded Python
    python3.13t -c "
    from dmemfs import MemoryFileSystem
    import threading

    mfs = MemoryFileSystem(max_quota=5 * 1024 * 1024)
    mfs.mkdir('/test')

    def worker(n):
        for i in range(100):
            try:
                # The underscore keeps names unique across threads (w1 + 10 vs w11 + 0)
                with mfs.open(f'/test/w{n}_{i}.bin', 'xb') as f:
                    f.write(b'x' * 10240)
            except Exception:
                pass

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(20)]
    for t in threads: t.start()
    for t in threads: t.join()
    print('Completed (No crashes)')
    "
Conclusion
OOM is a failure that is immensely difficult to debug. Stack traces are rarely left behind, and it's hard to identify which code is the cause. By "proactively rejecting writes that don't fit in the budget," Hard Quotas convert this problem into a catchable exception.
D-MemFS's quota design is based on the philosophy of "No Surprises." Memory usage will never exceed the configured limit, exceptions can be handled with try/except, and the integrity of the file system is always maintained.
If you have ever experienced an OOM failure in in-memory processing, please do give it a try.
pip install D-MemFS
🔗 Links & Resources
- Original Japanese Article: PythonのOOMキルを完全防御する:BytesIOの罠とD-MemFS「ハードクォータ」の設計思想
If you find this project interesting, a ⭐ on GitHub would be the best way to support my work!