The Context Window Problem
Every developer working with large language models has hit this wall: your document is too long. Whether it's a codebase, a legal contract, or a collection of research papers, the context window becomes a hard ceiling that forces awkward workarounds like chunking, summarization, or simply giving up on certain use cases.
We built RLM to solve this problem.
What is RLM?
RLM (Recursive Language Models) is an open-source Node.js/TypeScript implementation based on groundbreaking research from MIT by Alex L. Zhang, Tim Kraska, and Omar Khattab. The core insight is elegant: instead of trying to fit an entire document into a context window, treat it as an external environment that the LLM can explore programmatically.
The model doesn't read your entire document at once. It writes JavaScript code to navigate, search, and extract the information it needs, recursively calling itself to handle sub-problems when necessary.
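To make that concrete, here is the kind of exploration code the model might write. The variable name `context` is an assumption about how the sandbox exposes the document; the real binding may differ:

```ts
// Hypothetical model-generated exploration code.
// Assumption: the document is bound to a string variable `context`.
declare const context: string;

// Cheap structural scan instead of reading everything.
const lines = context.split('\n');
const hits = lines
  .map((line, i) => ({ line, i }))
  .filter(({ line }) => /vulnerab/i.test(line))
  .slice(0, 20); // surface only the first few matches

// Pull a small excerpt around the first hit for closer reading.
const first = hits[0]?.i ?? 0;
const excerpt = lines.slice(Math.max(0, first - 2), first + 5).join('\n');
```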
How It Works
RLM implements a REPL-style execution loop:
- Initialize: Your document becomes a variable accessible in a sandboxed JavaScript environment
- Query: The LLM analyzes the task and generates code to explore the document
- Execute: The code runs in an isolated VM, returning results to the model
- Iterate: Based on results, the model can write more code or spawn recursive sub-queries
- Complete: When the model has gathered enough information, it synthesizes the final answer
This approach means a 500-page PDF or a massive codebase isn't fundamentally different from a single paragraph. The model simply explores more.
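In sketch form, the loop looks something like this. It is a simplified illustration rather than RLM's actual source: `callModel` stands in for the provider call, and the step budget is arbitrary:

```ts
import vm from 'node:vm';

// Stand-in for the provider call: returns either code to execute
// or a final answer. Not part of RLM's public API.
declare function callModel(
  task: string,
  observation: string
): Promise<{ code?: string; answer?: string }>;

async function rlmLoop(task: string, document: string): Promise<string> {
  // Initialize: the document is just a variable in the sandbox.
  const sandbox = vm.createContext({ context: document });
  let observation = '';

  for (let step = 0; step < 10; step++) {
    // Query: the model decides what to do next.
    const reply = await callModel(task, observation);
    if (reply.answer !== undefined) return reply.answer; // Complete

    // Execute: run the generated code against the document.
    observation = String(vm.runInContext(reply.code ?? '', sandbox));
    // Iterate: the result becomes the model's next observation.
  }
  throw new Error('Step budget exhausted');
}
```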
Key Features
Multi-Provider Support
RLM works with the models you're already using across three major providers:
- OpenAI: GPT-5 series, GPT-4.1 series, GPT-4o, o3 and o1 reasoning models
- Anthropic: Claude 4.5 series, Claude 4 series, Claude 3.5 models
- Google: Gemini 3, Gemini 2.5, and Gemini 2.0 series
Security First
All generated code runs in an isolated VM sandbox. The model can explore your documents but cannot access your filesystem, network, or system resources.
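The idea in miniature: a fresh VM context contains only what you inject, so Node's powerful globals simply are not there. This sketch uses Node's built-in `node:vm` module for brevity; on its own that module is not a hardened security boundary, and RLM's actual sandboxing may be stricter:

```ts
import vm from 'node:vm';

// Only the document is injected; nothing else exists in the context.
const sandbox = vm.createContext({ context: 'the document text…' });

console.log(vm.runInContext('typeof require', sandbox)); // 'undefined' — no filesystem
console.log(vm.runInContext('typeof process', sandbox)); // 'undefined' — no system access
console.log(vm.runInContext('typeof fetch', sandbox));   // 'undefined' — no network
```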
Recursive Sub-Queries
This is where the research shines. When facing a complex task, the model can spawn nested calls to handle chunks of the problem independently, then synthesize the results. It's hierarchical problem-solving at the LLM level.
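From inside the sandbox, a recursive sub-query might look like the snippet below. The injected helper, called `llm` here, is a hypothetical name; the actual helper RLM exposes may be named and shaped differently:

```ts
// Hypothetical shape of a recursive sub-query written by the model.
declare const context: string;
declare function llm(prompt: string, subContext: string): Promise<string>;

// Split the document and fan out one sub-query per section.
const sections = context.split(/^## /m);
const findings = await Promise.all(
  sections.map((s) => llm('List any security issues in this section', s))
);
// The parent call then synthesizes `findings` into a single answer.
```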
Production Ready
- Streaming support for real-time progress updates
- Comprehensive tracing with full execution history
- Token and cost tracking built in
- Retry and rate limiting for API resilience
- Vercel deployment support out of the box
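Wiring a few of these together might look like the following. Apart from `model`, `completion`, and the usage fields shown in the Quick Start below, the option names here are assumptions rather than the confirmed configuration surface:

```ts
import { RLM } from 'rlm';

declare const longDocument: string;

const rlm = new RLM({
  model: 'gpt-4o-mini',
  maxRetries: 3,                                    // hypothetical retry knob
  onToken: (t: string) => process.stdout.write(t),  // hypothetical streaming hook
});

const result = await rlm.completion('Summarize the key risks', longDocument);
console.log(result.usage.estimatedCost); // built-in cost tracking
```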
Quick Start
Getting started takes minutes:
```bash
git clone https://github.com/hampton-io/RLM.git
cd RLM
npm install
npm run build
```

Then use it in your code:
```ts
import { RLM } from 'rlm';

const rlm = new RLM({ model: 'gpt-4o-mini' });

const result = await rlm.completion(
  "Find all security vulnerabilities mentioned in this audit report",
  veryLongAuditDocument
);

console.log(result.response);
console.log(`Cost: $${result.usage.estimatedCost.toFixed(4)}`);
```

Or use the CLI for quick tasks:
```bash
npx tsx src/cli.ts "Summarize the main arguments" -f research-paper.txt -m claude-3-5-sonnet-latest --stream
```

Real-World Use Cases
Codebase Analysis
Point RLM at an entire repository and ask questions like "Where is authentication handled?" or "Find all API endpoints that don't validate input." The model explores the code structure, reads relevant files, and synthesizes comprehensive answers.
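One way to package a repository for a query like that is to flatten the source files into a single string yourself; whether RLM offers a more direct repo loader is not shown here, so this packaging step is an assumption:

```ts
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';
import { RLM } from 'rlm';

// Flatten TypeScript sources into one string with path markers.
const root = './my-project';
const files = (readdirSync(root, { recursive: true }) as string[])
  .filter((f) => f.endsWith('.ts'));

const repoAsText = files
  .map((f) => `// FILE: ${f}\n${readFileSync(join(root, f), 'utf8')}`)
  .join('\n\n');

const rlm = new RLM({ model: 'gpt-4o-mini' });
const answer = await rlm.completion('Where is authentication handled?', repoAsText);
console.log(answer.response);
```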
Document Processing
Legal contracts, research papers, financial reports: any document that exceeds context limits becomes tractable. Extract specific clauses, compare sections, or generate summaries without losing information to truncation.
Data Exploration
CSV files with thousands of rows? Log files spanning months? RLM can write queries to filter, aggregate, and analyze data that would never fit in a context window.
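For example, given a large CSV bound to `context`, the model might write something like this (the `amount` column is invented for illustration):

```ts
// Hypothetical model-generated code for a large CSV in the sandbox.
declare const context: string;

const [header, ...rows] = context.trim().split('\n');
const amountIdx = header.split(',').indexOf('amount');

// Filter and aggregate inside the sandbox; only this small result
// travels back to the model, never the full file.
const total = rows
  .map((r) => Number(r.split(',')[amountIdx]))
  .filter((n) => !Number.isNaN(n))
  .reduce((a, b) => a + b, 0);
```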
The Research Behind It
RLM implements concepts from the MIT paper "Recursive Language Models," which demonstrates that allowing models to call themselves recursively substantially improves performance on complex, long-horizon tasks. The key findings:
- Recursive decomposition enables models to break complex tasks into manageable sub-problems
- Efficient context management through hierarchical processing proves critical for scalability
- The approach outperforms non-recursive baselines on challenging benchmarks
Open Source
RLM is MIT licensed and open for contributions. Whether you want to add support for new models, improve the sandboxing, or build integrations, we welcome pull requests.
Check out the GitHub repository to get started, report issues, or contribute to the project.
What's Next
We're actively developing RLM with plans for:
- Enhanced tracing and debugging tools
- Pre-built recipes for common use cases
- Performance optimizations for very large documents
- Support for additional providers and local models
The context window has been a fundamental constraint in LLM applications. With recursive approaches like RLM, that constraint becomes a speed bump rather than a wall.
Ready to process unlimited context? Try RLM today.