Context Beyond the Window — COLM 2026 Workshop

§01 About

Modern language models operate within finite context windows, yet many real-world tasks require models to absorb, retain, and act on information that far exceeds any single prompt.

This workshop addresses the full spectrum of context management: fitting more into the window, maintaining state across interactions, and transferring knowledge into parameters. We frame this around the trade-off between context-time memory (information supplied at inference) and weight-time memory (information absorbed into parameters).

Our goal is to build a shared vocabulary across subcommunities that rarely meet in one venue: long-context modeling, retrieval-augmented systems, continual learning, knowledge distillation, and LLM agents.

§02 Important Dates

Submission deadline: via OpenReview · papers due 23:59 AoE

June 23, 2026

Notification of acceptance: decisions communicated to authors

July 24, 2026

Camera-ready deadline: final manuscript on OpenReview

September 25, 2026

Workshop day: Hilton Union Square, San Francisco

October 9, 2026

All deadlines are 23:59 Anywhere on Earth (AoE). Submit via OpenReview.

§03 Invited Speakers

Yoshua Bengio

Mila · LawZero · UdeM

Turing Award laureate, professor at Université de Montréal, founder and scientific advisor of Mila, and president of LawZero. Recent work focuses on AI safety — managing the risks of advanced AI systems and how language models can safely acquire, store, and act on knowledge.

Aakanksha Chowdhery

Reflection AI · Stanford

Researcher at Reflection AI and adjunct professor at Stanford; Program Chair for MLSys 2026. Technical lead on PaLM (540B) and lead contributor to Gemini pre-training at Google; now building open-weight agentic and autonomous coding models.

Omar Khattab

MIT EECS · CSAIL

Assistant professor at MIT EECS / CSAIL; PhD from Stanford. Creator of ColBERT (late-interaction neural retrieval) and DSPy (a programming framework for LM pipelines). His research directly addresses how external knowledge is retrieved, compressed, and fed into the language-model context.

Albert Gu

CMU · Cartesia AI

Assistant professor of ML at CMU and co-founder / chief scientist of Cartesia AI; PhD from Stanford. Lead author of the S4 and Mamba state-space-model papers, foundational to how context can be compressed and managed beyond the transformer paradigm. TIME100 AI (2024).

Rishabh Agarwal

Periodic Labs · McGill

Founding member at Periodic Labs and adjunct professor at McGill; previously a staff research scientist at Google DeepMind, with a PhD from Mila. Core contributor to the Gemma and Gemini models and creator of on-policy distillation for LLMs; NeurIPS outstanding paper award for rigorous evaluation in reinforcement learning. His work on distillation and RL post-training bears directly on how knowledge is internalized into model weights rather than retrieved at inference.

§04 Schedule

9:00 – 9:10

Organizers

Welcome remarks

Session 1: Context and Architectures

9:10 – 9:40

Yoshua Bengio

Invited talk

9:40 – 10:10

Albert Gu

Invited talk

10:10 – 10:40

Contributed speakers

Contributed talks (2 × 15 min)

10:40 – 11:40

Poster session 1 + coffee

Session 2: Retrieval and In-Context Memory

11:40 – 12:10

Omar Khattab

Invited talk

12:10 – 1:10

Lunch

1:10 – 1:40

Student contributed speakers

Contributed talks (2 × 15 min)

Session 3: Knowledge Internalization

1:40 – 2:10

Aakanksha Chowdhery

Invited talk

2:10 – 3:10

Poster session 2 + coffee

Session 4: Synthesis

3:10 – 3:40

TBA

Invited talk

3:40 – 4:25

All speakers

Panel: The Memory Bottleneck

4:25 – 4:40

Organizers

Closing remarks & best-paper award

§05 Call for Papers

Submission format

Short papers: up to 4 pages (excluding references and appendix)
Long papers: up to 8 pages (excluding references and appendix)
Use the official COLM 2026 LaTeX template
All submissions are non-archival
Submit via OpenReview

What we accept

New research results
Position papers
System descriptions
Benchmark papers
Negative or synthetic findings that clarify trade-offs

Topics we accept

We welcome submissions across the full spectrum of context management in language models, including but not limited to:

Context compression and summarization
Long-context and infinite-context architectures
Memory-augmented and recurrent models
KV cache optimization
Retrieval-augmented generation
Multi-turn and multi-session context management
Agentic memory and orchestration
Knowledge distillation and internalization
Continual and lifelong learning
State space models
Test-time training and memorization
Benchmarks and evaluation

Review process

Each submission will receive at least two reviews. The process is double-blind: author identities and affiliations must be removed from the manuscript, and citations to the authors' own prior work should be anonymized where they would otherwise reveal identity. Organizers will not review papers for which they have a conflict of interest. Accepted work will be presented as contributed talks or posters, with oral slots selected to ensure strong representation of junior researchers.

Reviewer commitment

We rely on submitting authors to share the reviewing load. By submitting, authors agree that at least one author per paper will serve as a reviewer for the workshop if asked by the Program Chairs. The OpenReview submission form asks for the profile of the committing author. Failure to fulfill this commitment may result in desk rejection of the submission.

Call for reviewers

We are recruiting reviewers from the broader community. If you work on long-context modeling, retrieval, efficient architectures, continual or test-time training, agentic memory, or evaluation, we would be grateful for your help. Reviewing runs between the submission deadline and notifications, and no reviewer is assigned more than five papers. To volunteer, fill out the reviewer sign-up form.

Dual submission

Concurrent submission to other venues is allowed. Because this workshop is non-archival, accepting a paper here does not preclude its later publication elsewhere, and authors are not required to withdraw the paper from other concurrent review processes.

Code of conduct

All authors, reviewers, and attendees are expected to adhere to the COLM Code of Conduct.

The OpenReview submission portal is open until June 23, 2026.


§06 Organizers

Junior organizers

Siddarth Venkatraman

Mistral · Mila

Research intern at Mistral; PhD student at Mila / UdeM co-advised by Glen Berseth and Nikolay Malkin. Works on RL, probabilistic inference, and generative models, with current focus on LLM post-training and inference scaling.

Dane Malenfant

McGill · Mila

MSc student at McGill / Mila supervised by Blake Richards in the LiNC Lab; citizen of the Métis Nation–Saskatchewan. Researches cooperative multi-agent systems, credit assignment, and neuro-inspired algorithms.

Emiliano Penaloza

Microsoft · Mila

Research intern at Microsoft; PhD student at UdeM / Mila supervised by Laurent Charlin. Recent work on RL post-training for long-horizon agentic tasks.

Sharut Gupta

MIT CSAIL

PhD candidate at MIT CSAIL advised by Phillip Isola and Stefanie Jegelka. Researches self-supervised and contrastive representation learning, with focus on representations that adapt across distribution shifts; previously at Google DeepMind and Meta FAIR.

Thomas Jiralerspong

Anthropic (Astra Fellow) · UdeM · Mila

Astra Fellow at Anthropic and PhD student at UdeM / Mila. Research focuses on language model agents, long-context reasoning, and post-training.

Benjamin Therien

Mila · UdeM

PhD student at UdeM / Mila co-advised by Irina Rish and Eugene Belilovsky. Researches distributed optimization, hyperparameter transfer, and continual pre-training; previously at Meta FAIR and UWaterloo.

Alicia Sun

Reflection AI

Researcher at Reflection AI, working on the systems and training infrastructure behind long-context language models.

Senior organizers

Guillaume Lajoie

Mila · UdeM · Google Research

Associate professor at UdeM and core member of Mila; Canada CIFAR AI Chair and Canada Research Chair in Neural Computation. Works on mechanisms of intelligence common to biological and artificial systems via dynamical systems and information theory.

Martin Klissarov

Google DeepMind · McGill · Mila

Research scientist at Google DeepMind finishing his PhD at McGill / Mila under Doina Precup and Marlos Machado. Works on RL and LLM agents — intrinsic motivation, meta-learning, and self-directed learning drives.

Danqi Chen

Princeton CS · Thinking Machines

Associate professor at Princeton (on sabbatical at Thinking Machines Lab) co-leading the Princeton NLP Group. Research spans the full LM life cycle — pre-training, alignment, retrieval, and efficient deployment.

§07 Contact

Email the organizers with submission questions, sponsorship enquiries, accessibility requests, or anything else.

colm-context-beyond-window@googlegroups.com