Academic Publication

Simulating 500 million years of evolution with a language model

Citations: 550
Published Date: February 21, 2025

Research Abstract & Technology Focus

More than 3 billion years of evolution have produced an image of biology encoded into the space of natural proteins. Here, we show that language models trained at scale on evolutionary data can generate functional proteins that are far away from known proteins. We present ESM3, a frontier multimodal generative language model that reasons over the sequence, structure, and function of proteins. ESM3 can follow complex prompts combining its modalities and is highly responsive to alignment to improve its fidelity. We have prompted ESM3 to generate fluorescent proteins. Among the generations that we synthesized, we found a bright fluorescent protein at a far distance (58% sequence identity) from known fluorescent proteins, which we estimate is equivalent to simulating 500 million years of evolution.
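The 58% sequence identity figure above can be read as the share of aligned positions at which two proteins carry the same residue. The following is a minimal sketch of that arithmetic, not the paper's procedure. It assumes the two sequences are already aligned to equal length with `-` marking gaps, and that gapped columns are excluded from the denominator (other conventions exist). The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns where both residues match.

    Assumption: inputs are pre-aligned and of equal length; columns with a
    gap in either sequence are skipped rather than counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not taken from the paper).
    seq_a = "MSKGEELFTG"
    seq_b = "MSKGAELF-G"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Under this convention, a value of 58% would mean that a little under six in ten compared positions are unchanged relative to the nearest known protein.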

AI Semantic Synergy Context

Connecting this academic literature to real-world market discussions and products.

crossref.org › academic paper
0%

AI models collapse when trained on recursively generated data

Abstract Stable diffusion revolutionized image creation from descriptive text. GPT-2 (ref. 1), GPT-3(.5) (ref. 2) and GPT-4 (ref. 3) demonstrated high performance across a variety of lang...

crossref.org › academic paper
0%

A survey on multimodal large language models

ABSTRACT Recently, the multimodal large language model (MLLM) represented by GPT-4V has been a new rising research hotspot, which uses powerful large language models (LLMs) as a brai...

crossref.org › academic paper
0%

When large language models meet personalization: perspectives of challenges and opportunities

Abstract The advent of large language models marks a revolutionary breakthrough in artificial intelligence. With the unprecedented scale of training and model parameters, the capability of large lan...

news.ycombinator.com › comment
0%

Show HN: I built a tiny LLM to demystify how language models work

Cool project. I'm working on something where multiple LLM agents share a world and interact with each other autonomously. One thing that surprised me is how much the "world" matters — same model, s...

Frequently Asked Questions (FAQ)

Curated market intelligence mapped to this research.

What is the core focus of the research titled 'Simulating 500 million years of evolution with a language model'?

This literature focuses on: More than 3 billion years of evolution have produced an image of biology encoded into the space of natural proteins. Here, we show that language models trained at scale on evolutionary data can generate functional proteins that are far away from k...

Are there open-source GitHub repositories related to Simulating 500 million years of evolution with a language model?

Yes, open-source projects like Tencent-Hunyuan/HY-World-2.0 (HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds) are actively building upon these concepts.

What other academic literature is closely related to 'Simulating 500 million years of evolution with a language model'?

Highly correlated activity was mapped. An entry titled 'Simulating 500 million years of evolution with a language model' discusses this: More than 3 billion years of evolution have produced an image of biology encoded into the space of natural proteins. Here, we show that language mo...

How is the concept of 'Simulating 500 million years of evolution with a language model' being discussed by engineers on Hacker News?

Highly correlated activity was mapped. An entry titled 'Show HN: I built a tiny LLM to demystify how language models work' discusses this: Cool project. I'm working on something where multiple LLM agents share a world and interact with each other autonomously. One thing that surprised ...
