Featured Publications
To weight or not to weight? The effect of selection bias in 3 large electronic health record-linked biobanks and recommendations for practice
Salvatore M, Kundu R, Shi X, Friese C, Lee S, Fritsche L, Mondul A, Hanauer D, Pearce C, Mukherjee B. To weight or not to weight? The effect of selection bias in 3 large electronic health record-linked biobanks and recommendations for practice. Journal Of The American Medical Informatics Association 2024, 31: 1479-1492. PMID: 38742457, PMCID: PMC11187425, DOI: 10.1093/jamia/ocae098.
Peer-Reviewed Original Research
Concepts: EHR-linked biobanks; National Health Interview Survey data; Health Interview Survey data; Phenome-wide association study; Michigan Genomics Initiative; Electronic health record-linked biobank; Target population; Interview Survey data; Colorectal cancer; US adult population; Selection bias; UK Biobank; Association estimates; Biobank data; Recruitment strategies; Effect of selection bias; ICD codes; Log odds ratio; UKB; Selection weights; Effect size; Association studies; Adult population; Biobank; Impact prevalence
2024
Improving prediction models of amyotrophic lateral sclerosis (ALS) using polygenic, pre-existing conditions, and survey-based risk scores in the UK Biobank
Jin W, Boss J, Bakulski K, Goutman S, Feldman E, Fritsche L, Mukherjee B. Improving prediction models of amyotrophic lateral sclerosis (ALS) using polygenic, pre-existing conditions, and survey-based risk scores in the UK Biobank. Journal Of Neurology 2024, 271: 6923-6934. PMID: 39249108, DOI: 10.1007/s00415-024-12644-2.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Risk score; Pre-existing conditions; Phenome-wide association study; Controls of European descent; Phenotype risk score; UK Biobank data; Amyotrophic lateral sclerosis risk; Risk score distribution; Increased ALS risk; Influence of environmental exposures; Exposure-related factors; Combined risk score; UK Biobank; Amyotrophic lateral sclerosis; Baseline demographic covariates; Biobank data; PRS-CS; ALS risk; Amyotrophic lateral sclerosis diagnosis; Diagnosis 1; Demographic covariates; Association studies; European descent; MethodsUtilizing data
2023
Uncovering associations between pre-existing conditions and COVID-19 Severity: A polygenic risk score approach across three large biobanks
Fritsche L, Nam K, Du J, Kundu R, Salvatore M, Shi X, Lee S, Burgess S, Mukherjee B. Uncovering associations between pre-existing conditions and COVID-19 Severity: A polygenic risk score approach across three large biobanks. PLOS Genetics 2023, 19: e1010907. PMID: 38113267, PMCID: PMC10763941, DOI: 10.1371/journal.pgen.1010907.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Michigan Genomics Initiative; UK Biobank; Pre-existing conditions; Phenome-wide association study; Association studies; Cohort-specific analyses; Polygenic risk score approach; UK Biobank cohort; Meta-analysis; Increased risk of hospitalization; Genome-wide association studies; Body mass index; Risk of hospitalization; Identified novel associations; Risk score approach; COVID-19 outcome data; COVID-19 hospitalization; COVID-19; Mass index; Risk score; Biobank; Cardiovascular conditions; COVID-19 severity; Increased risk
2022
The construction of cross-population polygenic risk scores using transfer learning
Zhao Z, Fritsche L, Smith J, Mukherjee B, Lee S. The construction of cross-population polygenic risk scores using transfer learning. American Journal Of Human Genetics 2022, 109: 1998-2008. PMID: 36240765, PMCID: PMC9674947, DOI: 10.1016/j.ajhg.2022.09.010.
Peer-Reviewed Original Research
Concepts: Genome-wide association studies; Polygenic risk scores; Ancestry groups; Transferability of PRS; PRS-CS; Polygenic risk score methods; European ancestry cohorts; Individuals of African ancestry; Individuals of South Asian ancestry; Non-European ancestry groups; Non-European ancestry; South Asian ancestry; Association studies; Dichotomous traits; South Asian sample; European ancestry; Genetic research; PRS model; Ancestry; Asian ancestry; African ancestry; African samples; UK Biobank; Risk score; Asian samples
ExPRSweb: An online repository with polygenic risk scores for common health-related exposures
Ma Y, Patil S, Zhou X, Mukherjee B, Fritsche L. ExPRSweb: An online repository with polygenic risk scores for common health-related exposures. American Journal Of Human Genetics 2022, 109: 1742-1760. PMID: 36152628, PMCID: PMC9606385, DOI: 10.1016/j.ajhg.2022.09.001.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Chronic conditions; Phenome-wide association study; Michigan Genomics Initiative; Risk score; Association studies; Health-related exposures; Genome-wide association studies; UK Biobank; Genetic risk factors; PRS methods; Follow-up study; Risk factors; Complex traits; Genome Initiative; Genetic modifiers; Biobank; Influence of exposure; Environmental variables; Scores; Lipid levels; ExpR; Lifestyle; Smoking; Online repository
2021
On cross-ancestry cancer polygenic risk scores
Fritsche L, Ma Y, Zhang D, Salvatore M, Lee S, Zhou X, Mukherjee B. On cross-ancestry cancer polygenic risk scores. PLOS Genetics 2021, 17: e1009670. PMID: 34529658, PMCID: PMC8445431, DOI: 10.1371/journal.pgen.1009670.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Genome-wide association studies; Prostate cancer polygenic risk scores; Polygenic risk score distribution; Recruitment of diverse participants; Ancestry groups; Polygenic risk score methods; Risk score; Non-genetic risk factors; Electronic health records; Breast cancer cases; Health records; UK Biobank; GWAS efforts; Disease risk assessment; Cancer cases; Association studies; Genetic data; European ancestry; Personalized risk stratification; Summary statistics; Risk factors; Ancestry; Diverse participants; Field of cancer
Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes
Bi W, Zhou W, Dey R, Mukherjee B, Sampson J, Lee S. Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes. American Journal Of Human Genetics 2021, 108: 825-839. PMID: 33836139, PMCID: PMC8206161, DOI: 10.1016/j.ajhg.2021.03.019.
Peer-Reviewed Original Research
Concepts: Ordinal categorical phenotypes; Genome-wide association studies; Categorical phenotypes; Genome-wide significant variants; Rare variants; Phenotype distribution; Controlled type I error rates; Type I error rate; Mixed model approach; Array genotyping; Association studies; Common variants; Quantitative traits; Significant variants; Logistic mixed models; Lack of analysis tools; UK Biobank; Linear mixed model approach; Phenotype; Association Test; Variants; Mixed models; Significance level; MAF; Traits
2020
Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks
Fritsche L, Patil S, Beesley L, VandeHaar P, Salvatore M, Ma Y, Peng R, Taliun D, Zhou X, Mukherjee B. Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks. American Journal Of Human Genetics 2020, 107: 815-836. PMID: 32991828, PMCID: PMC7675001, DOI: 10.1016/j.ajhg.2020.08.025.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Genome-wide association studies; Michigan Genomics Initiative; UK Biobank; Population-based UK Biobank; Polygenic risk score construction; Published genome-wide association studies; Longitudinal biorepository effort; Association studies; Predictive polygenic risk scores; Risk score; NHGRI-EBI GWAS Catalog; Cancer traits; Independent biobank; Michigan Medicine; GWAS Catalog; Genome Initiative; Biobank; Scores; Traits; Cancer research; Online repository; Michigan; Medicine; Evaluation
A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank
Bi W, Fritsche L, Mukherjee B, Kim S, Lee S. A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank. American Journal Of Human Genetics 2020, 107: 222-233. PMID: 32589924, PMCID: PMC7413891, DOI: 10.1016/j.ajhg.2020.06.003.
Peer-Reviewed Original Research
Concepts: Controlled type I error rates; Time-to-event data analysis; Type I error rate; Genetic studies of human diseases; Genome-wide significance level; Time-to-event phenotypes; Saddlepoint approximation; Genome-wide analysis; European ancestry samples; Minor allele frequency; Study of human disease; Electronic health records; Cox PH regression model; Regression models; Standard Wald test; Proportional hazards; Binary phenotypes; Data analysis; Ancestry samples; Genetic studies; Health records; UK Biobank; Allele frequencies; Inpatient data; Cox proportional hazards
2019
The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities
Beesley L, Salvatore M, Fritsche L, Pandit A, Rao A, Brummett C, Willer C, Lisabeth L, Mukherjee B. The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities. Statistics In Medicine 2019, 39: 773-800. PMID: 31859414, PMCID: PMC7983809, DOI: 10.1002/sim.8445.
Peer-Reviewed Original Research
Concepts: Electronic health records; Health records; Michigan Genomics Initiative; Biobank-based studies; Health-related research; UK Biobank; Health research; Disease-gene associations; Study design; Agnostic search; Biobank; Disease-treatment; Informatics infrastructure; Hypothesis-generating study; Phenotypic identification; Genome Initiative; Missing data; Resource catalog; Exploratory questions; Current body; Biobank research; Data types; Medical research; Recruitment mechanisms; Practical guidance
A Fast and Accurate Method for Genome-wide Scale Phenome-wide G × E Analysis and Its Application to UK Biobank
Bi W, Zhao Z, Dey R, Fritsche L, Mukherjee B, Lee S. A Fast and Accurate Method for Genome-wide Scale Phenome-wide G × E Analysis and Its Application to UK Biobank. American Journal Of Human Genetics 2019, 105: 1182-1192. PMID: 31735295, PMCID: PMC6904814, DOI: 10.1016/j.ajhg.2019.10.008.
Peer-Reviewed Original Research
Concepts: Case-control ratio; Genome-wide significance level; Measures of environmental exposure; Genome-wide analysis; European ancestry samples; Genetic association studies; Saddlepoint approximation; Case-control imbalance; Analysis of phenotypes; Gene-environment interactions; Population-based biobanks; Controlled type I error rates; Association studies; G x E effects; UK Biobank; Type I error rate; Genetic variants; E analysis; SPAGE; Complex diseases; Environmental exposures; Test statistics; E study; Simulation study; Wald test
Exploring various polygenic risk scores for skin cancer in the phenomes of the Michigan genomics initiative and the UK Biobank with a visual catalog: PRSWeb
Fritsche L, Beesley L, VandeHaar P, Peng R, Salvatore M, Zawistowski M, Taliun S, Das S, LeFaive J, Kaleba E, Klumpner T, Moser S, Blanc V, Brummett C, Kheterpal S, Abecasis G, Gruber S, Mukherjee B. Exploring various polygenic risk scores for skin cancer in the phenomes of the Michigan genomics initiative and the UK Biobank with a visual catalog: PRSWeb. PLOS Genetics 2019, 15: e1008202. PMID: 31194742, PMCID: PMC6592565, DOI: 10.1371/journal.pgen.1008202.
Peer-Reviewed Original Research
Concepts: Michigan Genomics Initiative; Electronic health records; Polygenic risk scores; Skin cancer subtypes; PheWAS results; UK Biobank; Electronic health record data; Longitudinal biorepository effort; Phenome-wide association study; Risk score; Health record data; UK Biobank data; Prediction of disease risk; Publicly-available sources; Health records; Genetic architecture; Biobank data; Michigan Medicine; Record data; Secondary phenotypes; Disease risk; Visual catalog; Association studies; Genome Initiative; PheWAS