The Context Window Problem
Every developer working with large language models has hit this wall: your document is too long. Whether it's a codebase, a legal contract, or a collection of research papers, the context window becomes a hard ceiling that forces awkward workarounds like chunking, summarization, or simply giving up on certain use cases.
We built RLM to solve this problem.
What is RLM?
RLM (Recursive Language Models) is an open-source Node.js/TypeScript implementation based on groundbreaking research from MIT by Alex L. Zhang, Tim Kraska, and Omar Khattab. The core insight is elegant: instead of trying to fit an entire document into a context window, treat it as an external environment that the LLM can explore programmatically.
The model doesn't read your entire document at once. It writes JavaScript code to navigate, search, and extract the information it needs, recursively calling itself to handle sub-problems when necessary.
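To make that concrete, here is the kind of exploration code the model might write. The variable name `context` is an assumption about how the sandbox exposes the document; the real binding may differ:

```ts
// Hypothetical model-generated exploration code.
// Assumption: the document is bound to a string variable `context`.
declare const context: string;

// Cheap structural scan instead of reading everything.
const lines = context.split('\n');
const hits = lines
  .map((line, i) => ({ line, i }))
  .filter(({ line }) => /vulnerab/i.test(line))
  .slice(0, 20); // surface only the first few matches

// Pull a small excerpt around the first hit for closer reading.
const first = hits[0]?.i ?? 0;
const excerpt = lines.slice(Math.max(0, first - 2), first + 5).join('\n');
```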
How It Works
RLM implements a REPL-style execution loop:
- Initialize: Your document becomes a variable accessible in a sandboxed JavaScript environment
- Query: The LLM analyzes the task and generates code to explore the document
- Execute: The code runs in an isolated VM, returning results to the model
- Iterate: Based on results, the model can write more code or spawn recursive sub-queries
- Complete: When the model has gathered enough information, it synthesizes the final answer
This approach means a 500-page PDF or a massive codebase isn't fundamentally different from a single paragraph. The model simply explores more.
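In sketch form, the loop looks something like this. It is a simplified illustration rather than RLM's actual source: `callModel` stands in for the provider call, and the step budget is arbitrary:

```ts
import vm from 'node:vm';

// Stand-in for the provider call: returns either code to execute
// or a final answer. Not part of RLM's public API.
declare function callModel(
  task: string,
  observation: string
): Promise<{ code?: string; answer?: string }>;

async function rlmLoop(task: string, document: string): Promise<string> {
  // Initialize: the document is just a variable in the sandbox.
  const sandbox = vm.createContext({ context: document });
  let observation = '';

  for (let step = 0; step < 10; step++) {
    // Query: the model decides what to do next.
    const reply = await callModel(task, observation);
    if (reply.answer !== undefined) return reply.answer; // Complete

    // Execute: run the generated code against the document.
    observation = String(vm.runInContext(reply.code ?? '', sandbox));
    // Iterate: the result becomes the model's next observation.
  }
  throw new Error('Step budget exhausted');
}
```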
Key Features
Multi-Provider Support
RLM works with the models you're already using across three major providers:
- OpenAI: GPT-5 series, GPT-4.1 series, GPT-4o, o3 and o1 reasoning models
- Anthropic: Claude 4.5 series, Claude 4 series, Claude 3.5 models
- Google: Gemini 3, Gemini 2.5, and Gemini 2.0 series
Security First
All generated code runs in an isolated VM sandbox. The model can explore your documents but cannot access your filesystem, network, or system resources.
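The idea in miniature: a fresh VM context contains only what you inject, so Node's powerful globals simply are not there. This sketch uses Node's built-in `node:vm` module for brevity; on its own that module is not a hardened security boundary, and RLM's actual sandboxing may be stricter:

```ts
import vm from 'node:vm';

// Only the document is injected; nothing else exists in the context.
const sandbox = vm.createContext({ context: 'the document text…' });

console.log(vm.runInContext('typeof require', sandbox)); // 'undefined' — no filesystem
console.log(vm.runInContext('typeof process', sandbox)); // 'undefined' — no system access
console.log(vm.runInContext('typeof fetch', sandbox));   // 'undefined' — no network
```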
Recursive Sub-Queries
This is where the research shines. When facing a complex task, the model can spawn nested calls to handle chunks of the problem independently, then synthesize the results. It's hierarchical problem-solving at the LLM level.
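From inside the sandbox, a recursive sub-query might look like the snippet below. The injected helper, called `llm` here, is a hypothetical name; the actual helper RLM exposes may be named and shaped differently:

```ts
// Hypothetical shape of a recursive sub-query written by the model.
declare const context: string;
declare function llm(prompt: string, subContext: string): Promise<string>;

// Split the document and fan out one sub-query per section.
const sections = context.split(/^## /m);
const findings = await Promise.all(
  sections.map((s) => llm('List any security issues in this section', s))
);
// The parent call then synthesizes `findings` into a single answer.
```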
Production Ready
- Streaming support for real-time progress updates
- Comprehensive tracing with full execution history
- Token and cost tracking built in
- Retry and rate limiting for API resilience
- Vercel deployment support out of the box
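Wiring a few of these together might look like the following. Apart from `model`, `completion`, and the usage fields shown in the Quick Start below, the option names here are assumptions rather than the confirmed configuration surface:

```ts
import { RLM } from 'rlm';

declare const longDocument: string;

const rlm = new RLM({
  model: 'gpt-4o-mini',
  maxRetries: 3,                                    // hypothetical retry knob
  onToken: (t: string) => process.stdout.write(t),  // hypothetical streaming hook
});

const result = await rlm.completion('Summarize the key risks', longDocument);
console.log(result.usage.estimatedCost); // built-in cost tracking
```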
Quick Start
Getting started takes minutes:
```bash
git clone https://github.com/hampton-io/RLM.git
cd RLM
npm install
npm run build
```

Then use it in your code:
```ts
import { RLM } from 'rlm';

const rlm = new RLM({ model: 'gpt-4o-mini' });

const result = await rlm.completion(
  "Find all security vulnerabilities mentioned in this audit report",
  veryLongAuditDocument
);

console.log(result.response);
console.log(`Cost: $${result.usage.estimatedCost.toFixed(4)}`);
```

Or use the CLI for quick tasks:
```bash
npx tsx src/cli.ts "Summarize the main arguments" -f research-paper.txt -m claude-3-5-sonnet-latest --stream
```

Real-World Use Cases
Codebase Analysis
Point RLM at an entire repository and ask questions like "Where is authentication handled?" or "Find all API endpoints that don't validate input." The model explores the code structure, reads relevant files, and synthesizes comprehensive answers.
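One way to package a repository for a query like that is to flatten the source files into a single string yourself; whether RLM offers a more direct repo loader is not shown here, so this packaging step is an assumption:

```ts
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';
import { RLM } from 'rlm';

// Flatten TypeScript sources into one string with path markers.
const root = './my-project';
const files = (readdirSync(root, { recursive: true }) as string[])
  .filter((f) => f.endsWith('.ts'));

const repoAsText = files
  .map((f) => `// FILE: ${f}\n${readFileSync(join(root, f), 'utf8')}`)
  .join('\n\n');

const rlm = new RLM({ model: 'gpt-4o-mini' });
const answer = await rlm.completion('Where is authentication handled?', repoAsText);
console.log(answer.response);
```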
Document Processing
Legal contracts, research papers, financial reports: any document that exceeds context limits becomes tractable. Extract specific clauses, compare sections, or generate summaries without losing information to truncation.
Data Exploration
CSV files with thousands of rows? Log files spanning months? RLM can write queries to filter, aggregate, and analyze data that would never fit in a context window.
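For example, given a large CSV bound to `context`, the model might write something like this (the `amount` column is invented for illustration):

```ts
// Hypothetical model-generated code for a large CSV in the sandbox.
declare const context: string;

const [header, ...rows] = context.trim().split('\n');
const amountIdx = header.split(',').indexOf('amount');

// Filter and aggregate inside the sandbox; only this small result
// travels back to the model, never the full file.
const total = rows
  .map((r) => Number(r.split(',')[amountIdx]))
  .filter((n) => !Number.isNaN(n))
  .reduce((a, b) => a + b, 0);
```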
The Research Behind It
RLM implements concepts from the MIT paper "Recursive Language Models," which demonstrates that allowing models to call themselves recursively substantially improves performance on complex, long-horizon tasks. The key findings:
- Recursive decomposition enables models to break complex tasks into manageable sub-problems
- Efficient context management through hierarchical processing proves critical for scalability
- The approach outperforms non-recursive baselines on challenging benchmarks
Open Source
RLM is MIT licensed and open for contributions. Whether you want to add support for new models, improve the sandboxing, or build integrations, we welcome pull requests.
Check out the GitHub repository to get started, report issues, or contribute to the project.
What's Next
We're actively developing RLM with plans for:
- Enhanced tracing and debugging tools
- Pre-built recipes for common use cases
- Performance optimizations for very large documents
- Support for additional providers and local models
The context window has been a fundamental constraint in LLM applications. With recursive approaches like RLM, that constraint becomes a speed bump rather than a wall.
Ready to process unlimited context? Try RLM today.