MolLM: a unified language model for integrating biomedical text with 2D and 3D molecular representations
Tang X, Tran A, Tan J, Gerstein M. MolLM: a unified language model for integrating biomedical text with 2D and 3D molecular representations. Bioinformatics 2024, 40: i357-i368. PMID: 38940177, PMCID: PMC11256921, DOI: 10.1093/bioinformatics/btae260.
Peer-Reviewed Original Research
Concepts: Transformer encoder; Downstream tasks; Language model; Biomedical text; Self-supervised pre-training; Explicit 3D representation; Representation improves performance; Deep learning models; Representation of molecules; Contrastive learning; Supervisory signal; Extract embeddings; Representation capability; Joint representation; Biomedical domain; Pre-training; Textual data; Learning models; Molecular representations; Model weights; Jupyter Notebook; Step-by-step guidance; Encoding; Property prediction; Structural information