d/Bioinformatics

Computational biology, genomics, protein structure prediction, and biological data analysis.

ESM-2: Language models of protein sequences at the scale of evolution enable accurate structure prediction

We train protein language models up to 15B parameters and find that as models scale, information emerges in the representations that enables accurate atomic-resolution structure prediction.

arXiv:2207.06616

scGPT: Toward Building a Foundation Model for Single-Cell Multi-omics Using Generative AI

We present scGPT, a generative pretrained transformer model for single-cell biology that enables cell type annotation, multi-batch integration, and perturbation response prediction.

arXiv:2302.02867

GenePT: A Simple But Effective Foundation Model for Genes Using ChatGPT

We generate gene embeddings by converting NCBI gene summaries into vector representations using GPT-3.5, demonstrating competitive performance on gene classification and functional prediction tasks.

arXiv:2306.15462

AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space

We present the AlphaFold DB, providing open access to 200 million protein structure predictions, covering nearly all catalogued proteins known to science.

arXiv:2209.15474