Academic Publication

The Pfam protein families database: embracing AI/ML

Citations: 250
Published Date: January 6, 2025

Research Abstract & Technology Focus

Abstract
The Pfam protein families database is a comprehensive collection of protein domains and families used for genome annotation and protein structure and function analysis (https://www.ebi.ac.uk/interpro/). This update describes major developments in Pfam since 2020, including decommissioning the Pfam website and integration with InterPro, harmonization with the ECOD structural classification, and expanded curation of metagenomic, microprotein and repeat-containing families. We highlight how AlphaFold structure predictions are being leveraged to refine domain boundaries and identify new domains. New families discovered through large-scale sequence similarity analysis of AlphaFold models are described. We also detail the development of Pfam-N, which uses deep learning to expand family coverage, achieving an 8.8% increase in UniProtKB coverage compared to standard Pfam. We discuss plans for more frequent Pfam releases integrated with InterPro and the potential for artificial intelligence to further assist curation. Despite recent advances, many protein families remain to be classified, and Pfam continues working toward comprehensive coverage of the protein universe.

Correlated Market Trend: Database

Bridging academia to market: 60-day public search velocity for the core technology of this paper. The dashed line represents the 7-day moving average.
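For readers unfamiliar with the smoothing used for the dashed line, the following is a minimal sketch of a 7-day moving average. It assumes a trailing window (each point averages the current day and up to six preceding days); the page does not say whether its chart uses a trailing or centered window, and the function name `trailing_moving_average` and the sample values are hypothetical.

```python
from collections import deque


def trailing_moving_average(values, window=7):
    """Return the trailing moving average of a sequence of daily values.

    Assumption: a trailing window. For the first `window - 1` points,
    the average covers only the values seen so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds at most `window` latest values
    running_sum = 0.0
    averages = []
    for value in values:
        if len(recent) == window:
            # The oldest value is about to be evicted by append().
            running_sum -= recent[0]
        recent.append(value)
        running_sum += value
        averages.append(running_sum / len(recent))
    return averages


# Synthetic daily counts, for illustration only (not data from the page).
daily_counts = [10, 12, 9, 14, 11, 13, 15, 20, 18, 16]
smoothed = trailing_moving_average(daily_counts, window=7)
```

A trailing window is used here because it needs no future values, so the last point of a 60-day series can still be plotted.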

AI Semantic Synergy Context

Connecting this academic literature to real-world market discussions and products.

crossref.org › academic paper
0%

Foundation models in bioinformatics

ABSTRACT With the adoption of foundation models (FMs), artificial intelligence (AI) has become increasingly significant in bioinformatics and has successfully addressed many historic...

roipad.com › trend story
0%

AlphaFold hits ‘next level’: the AI tool now includes protein pairing

The database of 200 million protein-structure predictions now includes homodimers, adding new biological relevance.

crossref.org › academic paper
0%

Bilingual language model for protein sequence and structure

Abstract Adapting language models to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein struct...

crossref.org › academic paper
0%

InterPro: the protein sequence classification resource in 2025

Abstract InterPro (https://www.ebi.ac.uk/interpro) is a freely accessible resource for the classification of protein sequences into families. It integrates predictive models, known a...

crossref.org › academic paper
0%

AI-driven multi-omics integration for multi-scale predictive modeling of genotype-environment-phenotype relationships

No description provided.

Frequently Asked Questions (FAQ)

Curated market intelligence mapped to this research.

What is the core focus of the research titled 'The Pfam protein families database: embracing AI/ML'?

This literature focuses on: The Pfam protein families database is a comprehensive collection of protein domains and families used for genome annotation and protein structure and function analysis (https://www.ebi.ac.uk/interpro/). This update describe...

Are there open-source GitHub repositories related to The Pfam protein families database: embracing AI/ML?

Yes, open-source projects like t8y2/dbx (15MB, lightweight, cross-platform database client. Supports MySQL, PostgreSQL, SQLite, Redis, MongoDB, DuckDB, ClickHouse, SQL Server and more.) are actively building upon these concepts.

Which startups are commercializing the technology behind The Pfam protein families database: embracing AI/ML?

Products like HelixDB are bringing this to market. Their focus is: An open-source OLTP graph-vector database built in Rust.

What other academic literature is closely related to 'The Pfam protein families database: embracing AI/ML'?

Highly correlated activity was mapped. An entry titled 'Foundation models in bioinformatics' discusses this: ABSTRACT With the adoption of foundation models (FMs), artificial intelligence (AI) has become increasingly significant in bioinform...

Are there commercial applications of 'The Pfam protein families database: embracing AI/ML' in market news publications?

Yes, highly correlated activity was mapped. An entry titled 'AlphaFold hits ‘next level’: the AI tool now includes protein pairing' discusses this: The database of 200 million protein-structure predictions now includes homodimers, adding new biological relevance.


Commercial Realization

Startups and Open Source tools heavily associated with the concepts explored in this paper.

  • GitHub
    t8y2/dbx
    15MB, lightweight, cross-platform database client. Supports MySQL, ...
  • GitHub
    mark9-droid/TomodachiPC
    Tomodachi Life Living The Dream PC: Ultimate Mii manager and life s...
  • Product Hunt
    HelixDB
    An open-source OLTP graph-vector database built in Rust.
  • Product Hunt
    Actian VectorAI DB
    The portable vector database for AI agents beyond the cloud

Associated Media Narrative