With OLMoTrace, we aim to increase transparency and trust in language models.



Meet OLMoTrace

OLMoTrace: Connecting a language model's response back to its training data

For years it’s been an open question: how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting?

Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed light on this question.

OLMoTrace connects phrases or even whole sentences in the language model’s output back to verbatim matches in its training data. It does this by searching billions of documents and trillions of tokens in real time and highlighting where it finds compelling matches.
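The core idea can be sketched in a few lines. The production system answers these lookups in real time with an index over trillions of tokens; the toy function below (all names hypothetical, not OLMoTrace's actual implementation) instead scans a small in-memory corpus and greedily reports the longest word-level spans, of at least `min_len` words, that appear verbatim in some document.

```python
# Toy sketch of verbatim span matching, the idea behind OLMoTrace's
# highlights. The real system indexes trillions of tokens for real-time
# lookup and also filters for span uniqueness; this naive word-level
# scan over a tiny corpus is for illustration only.

def find_verbatim_spans(response, corpus, min_len=4):
    """Return maximal word spans of `response` (>= min_len words)
    that appear verbatim in at least one corpus document."""
    words = response.split()
    hits = []
    i = 0
    while i < len(words):
        match = None
        # Try the longest candidate span starting at word i first.
        for j in range(len(words), i + min_len - 1, -1):
            span = " ".join(words[i:j])
            if any(span in doc for doc in corpus):
                match = span
                break
        if match:
            hits.append(match)
            i += len(match.split())  # skip past the matched span
        else:
            i += 1
    return hits

corpus = [
    'Celine Dion recorded "My Heart Will Go On" for the film Titanic.',
    "She is a Canadian singer known for her powerful vocals.",
]
response = 'She sang My Heart Will Go On for the film Titanic in 1997.'
print(find_verbatim_spans(response, corpus))
# → ['My Heart Will Go On', 'for the film Titanic']
```

The greedy longest-match-first search mirrors the behavior described above: highlights land on relatively long spans rather than every short phrase that happens to recur.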


To activate OLMoTrace, click on the “Show OLMoTrace” button underneath one of OLMo’s responses in the Ai2 Playground. After a few seconds, several text spans of the model response will be highlighted, and an "OLMoTrace" panel will show up on the right side with matching documents from the training data. The highlights indicate relatively long and unique spans that appear verbatim at least once in the training data of this model. 

[Screenshot: OLMoTrace in the Ai2 Playground. The prompt asks, "Who is Celine Dion?" The model outputs four paragraphs with several text spans highlighted; on the right, the OLMoTrace panel displays documents from the training data that match the highlighted spans.]

If you click on a highlight in the model response, the side panel will show only documents containing the selected span. Similarly, if you click “Locate span” on a document in the side panel, the span highlights will narrow down to those that appear in the selected document.

[Screenshot: the OLMoTrace panel shows a highlighted document with a bolded text span, '"My Heart Will Go On" from the movie "Titanic"', which exactly matches the highlighted span in the model output.]

Through OLMoTrace, you can gain insights into why the model generates certain sequences of words. In the screenshot below, our 13B model claims that its knowledge cutoff date is August 2023. This isn’t true, since its pre-training data has a cutoff of December 2022. Without OLMoTrace, we would have no way to tell why the model returned this false information, but now we can see that the model may have learned to say August 2023 from the post-training examples listed on the right. This finding led us to remove content mentioning knowledge cutoff dates from the 32B model’s post-training data.

[Screenshot: in the model response, the highlighted text span reads, "As of my last update in August 2023, I cannot provide real-time updates". The OLMoTrace panel shows a post-training document from tulu-3-sft-olmo-2-mixture that contains the exact text.]

We developed OLMoTrace to enable researchers, developers, and the general public to inspect where and how language models may have learned to generate certain word sequences. By releasing OLMoTrace and opening up large pretraining and post-training datasets, we hope to advance scientific research in AI and public understanding of AI systems.


OLMoTrace is available today with OLMo 2 32B Instruct, OLMo 2 13B Instruct, and OLMoE 1B-7B Instruct. For a behind-the-scenes look, read the blog.

Try OLMoTrace

Work with us

Ai2 newsletter archive

Ai2, 2157 N Northlake Way, Suite 110, Seattle, WA 98103, USA