Featured Publications
To weight or not to weight? The effect of selection bias in 3 large electronic health record-linked biobanks and recommendations for practice
Salvatore M, Kundu R, Shi X, Friese C, Lee S, Fritsche L, Mondul A, Hanauer D, Pearce C, Mukherjee B. To weight or not to weight? The effect of selection bias in 3 large electronic health record-linked biobanks and recommendations for practice. Journal Of The American Medical Informatics Association 2024, 31: 1479-1492. PMID: 38742457, PMCID: PMC11187425, DOI: 10.1093/jamia/ocae098.
Peer-Reviewed Original Research
Concepts: EHR-linked biobanks; National Health Interview Survey data; Health Interview Survey data; Phenome-wide association study; Michigan Genomics Initiative; Electronic health record-linked biobank; Target population; Interview Survey data; Colorectal cancer; US adult population; Selection bias; UK Biobank; Association estimates; Biobank data; Recruitment strategies; Effect of selection bias; ICD codes; Log odds ratio; UKB; Selection weights; Effect size; Association studies; Adult population; Biobank; Impact prevalence
Statistical Inference for Association Studies Using Electronic Health Records: Handling Both Selection Bias and Outcome Misclassification
Beesley L, Mukherjee B. Statistical Inference for Association Studies Using Electronic Health Records: Handling Both Selection Bias and Outcome Misclassification. Biometrics 2020, 78: 214-226. PMID: 33179768, DOI: 10.1111/biom.13400.
Peer-Reviewed Original Research
Concepts: Electronic health records; Health records; Electronic health record data analysis; Electronic health record settings; Selection bias; Michigan Genomics Initiative; Association studies; EHR-linked; Health research; Inverse probability weighting method; Study sample; Effect estimates; Probability weighting method; Lack of representativeness; Type I error; Survey sampling literature; Standard error estimates; Gold standard labels; Disease status; Error estimates; Statistical inference; Misclassification; Inference strategy; Sampling literature; Standard labels
Association of Polygenic Risk Scores for Multiple Cancers in a Phenome-wide Study: Results from The Michigan Genomics Initiative
Fritsche L, Gruber S, Wu Z, Schmidt E, Zawistowski M, Moser S, Blanc V, Brummett C, Kheterpal S, Abecasis G, Mukherjee B. Association of Polygenic Risk Scores for Multiple Cancers in a Phenome-wide Study: Results from The Michigan Genomics Initiative. American Journal Of Human Genetics 2018, 102: 1048-1061. PMID: 29779563, PMCID: PMC5992124, DOI: 10.1016/j.ajhg.2018.04.001.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Electronic health records; Associations of polygenic risk scores; Phenome-wide significant associations; Polygenic risk score associations; Longitudinal biorepository effort; Non-cancer diagnoses; Patients' electronic health records; Phenome-wide association study; Analysis of temporal order; Michigan Genomics Initiative; Risk score; Associated with multiple phenotypes; Female breast cancer; NHGRI-EBI Catalog; Risk profile; Genetic risk profiles; Measures of genomic variation; Cancer traits; Case-control study; PheWAS analysis; Health records; Health system; Michigan Medicine; Cancer diagnosis
2023
Uncovering associations between pre-existing conditions and COVID-19 Severity: A polygenic risk score approach across three large biobanks
Fritsche L, Nam K, Du J, Kundu R, Salvatore M, Shi X, Lee S, Burgess S, Mukherjee B. Uncovering associations between pre-existing conditions and COVID-19 Severity: A polygenic risk score approach across three large biobanks. PLOS Genetics 2023, 19: e1010907. PMID: 38113267, PMCID: PMC10763941, DOI: 10.1371/journal.pgen.1010907.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Michigan Genomics Initiative; UK Biobank; Pre-existing conditions; Phenome-wide association study; Association studies; Cohort-specific analyses; Polygenic risk score approach; UK Biobank cohort; Meta-analysis; Increased risk of hospitalization; Genome-wide association studies; Body mass index; Risk of hospitalization; Identified novel associations; Risk score approach; COVID-19 outcome data; COVID-19 hospitalization; COVID-19; Mass index; Risk score; Biobank; Cardiovascular conditions; COVID-19 severity; Increased risk
2022
ExPRSweb: An online repository with polygenic risk scores for common health-related exposures
Ma Y, Patil S, Zhou X, Mukherjee B, Fritsche L. ExPRSweb: An online repository with polygenic risk scores for common health-related exposures. American Journal Of Human Genetics 2022, 109: 1742-1760. PMID: 36152628, PMCID: PMC9606385, DOI: 10.1016/j.ajhg.2022.09.001.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Chronic conditions; Phenome-wide association study; Michigan Genomics Initiative; Risk score; Association studies; Health-related exposures; Genome-wide association studies; UK Biobank; Genetic risk factors; PRS methods; Follow-up study; Risk factors; Complex traits; Genome Initiative; Genetic modifiers; Biobank; Influence of exposure; Environmental variables; Scores; Lipid levels; ExpR; Lifestyle; Smoking; Online repository
Polygenic Liability to Depression Is Associated With Multiple Medical Conditions in the Electronic Health Record: Phenome-wide Association Study of 46,782 Individuals
Fang Y, Fritsche L, Mukherjee B, Sen S, Richmond-Rakerd L. Polygenic Liability to Depression Is Associated With Multiple Medical Conditions in the Electronic Health Record: Phenome-wide Association Study of 46,782 Individuals. Biological Psychiatry 2022, 92: 923-931. PMID: 35965108, PMCID: PMC10712651, DOI: 10.1016/j.biopsych.2022.06.004.
Peer-Reviewed Original Research
Concepts: Phenome-wide association study; Polygenic risk scores; MDD PRS; Health records; Risk score; Association studies; Genome-wide polygenic risk score; Associated with multiple medical conditions; Measures of genetic risk; Michigan Genomics Initiative; Psychiatric traits; Electronic health records; European ancestry participants; Major depressive disorder; Associated with tobacco use disorder; Tests of association; Multiple medical conditions; Genitourinary conditions; Tobacco use disorder; Disease-associated disability; Molecular genetic tools; Molecular genetic discoveries; Psychiatric disease categories; Health outcomes; Substance-related disorders
2020
Phenotype risk scores (PheRS) for pancreatic cancer using time-stamped electronic health record data: Discovery and validation in two large biobanks
Salvatore M, Beesley L, Fritsche L, Hanauer D, Shi X, Mondul A, Pearce C, Mukherjee B. Phenotype risk scores (PheRS) for pancreatic cancer using time-stamped electronic health record data: Discovery and validation in two large biobanks. Journal Of Biomedical Informatics 2020, 113: 103652. PMID: 33279681, PMCID: PMC7855433, DOI: 10.1016/j.jbi.2020.103652.
Peer-Reviewed Original Research
Concepts: Electronic health records; Polygenic risk scores; Electronic health record data; Michigan Genomics Initiative; Phenotype risk score; High-risk individuals; Pancreatic cancer diagnosis; Body mass index; Risk score; Cancer diagnosis; Medical phenome; UK Biobank (UKB; Health record data; Source of patient information; Risk prediction; Hypothesis-generating associations; Disease risk prediction; Health records; Unadjusted associations; Drinking status; Smoking status; Epidemiological covariates; UKB; Patient information; Multivariate associations
Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks
Fritsche L, Patil S, Beesley L, VandeHaar P, Salvatore M, Ma Y, Peng R, Taliun D, Zhou X, Mukherjee B. Cancer PRSweb: An Online Repository with Polygenic Risk Scores for Major Cancer Traits and Their Evaluation in Two Independent Biobanks. American Journal Of Human Genetics 2020, 107: 815-836. PMID: 32991828, PMCID: PMC7675001, DOI: 10.1016/j.ajhg.2020.08.025.
Peer-Reviewed Original Research
Concepts: Polygenic risk scores; Genome-wide association studies; Michigan Genomics Initiative; UK Biobank; Population-based UK Biobank; Polygenic risk score construction; Published genome-wide association studies; Longitudinal biorepository effort; Association studies; Predictive polygenic risk scores; Risk score; NHGRI-EBI GWAS Catalog; Cancer traits; Independent biobank; Michigan Medicine; GWAS Catalog; Genome Initiative; Biobank; Scores; Traits; Cancer research; Online repository; Michigan; Medicine; Evaluation
An analytic framework for exploring sampling and observation process biases in genome and phenome‐wide association studies using electronic health records
Beesley L, Fritsche L, Mukherjee B. An analytic framework for exploring sampling and observation process biases in genome and phenome‐wide association studies using electronic health records. Statistics In Medicine 2020, 39: 1965-1979. PMID: 32198773, DOI: 10.1002/sim.8524.
Peer-Reviewed Original Research
Concepts: Electronic health records; Health records; Association studies; Observational health care databases; Electronic health record data; Longitudinal biorepository effort; Phenome-wide association study; Michigan Genomics Initiative; Health record data; Health care databases; Disease-gene association studies; Michigan Health System; Care database; Health system; Phenotype misclassification; Study bias; Record data; Nonprobability sampling; Association analysis; Data sources; Genome Initiative; Misclassification; Analysis approach; Records; Sensitivity analysis
Interaction analysis under misspecification of main effects: Some common mistakes and simple solutions
Zhang M, Yu Y, Wang S, Salvatore M, Fritsche L, He Z, Mukherjee B. Interaction analysis under misspecification of main effects: Some common mistakes and simple solutions. Statistics In Medicine 2020, 39: 1675-1694. PMID: 32101638, DOI: 10.1002/sim.8505.
Peer-Reviewed Original Research
Concepts: Type I error rate; Type I error inflation; Independence assumption; Wald and score tests; Correct type I error rates; Sandwich variance estimator; Sandwich estimator; Score test; Variance estimation; Simulation study; Misspecification; Michigan Genomics Initiative; Statistical practice; Binary outcomes; Tested interactions; Empirical facts; Flexible model; Data model; Test of interaction; Biobank study; Inflation; Assumptions; Continuous outcomes; Epidemiological literature; Linear regression models
2019
The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities
Beesley L, Salvatore M, Fritsche L, Pandit A, Rao A, Brummett C, Willer C, Lisabeth L, Mukherjee B. The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities. Statistics In Medicine 2019, 39: 773-800. PMID: 31859414, PMCID: PMC7983809, DOI: 10.1002/sim.8445.
Peer-Reviewed Original Research
Concepts: Electronic health records; Health records; Michigan Genomics Initiative; Biobank-based studies; Health-related research; UK Biobank; Health research; Disease-gene associations; Study design; Agnostic search; Biobank; Disease-treatment; Informatics infrastructure; Hypothesis-generating study; Phenotypic identification; Genome Initiative; Missing data; Resource catalog; Exploratory questions; Current body; Biobank research; Data types; Medical research; Recruitment mechanisms; Practical guidance
Exploring various polygenic risk scores for skin cancer in the phenomes of the Michigan genomics initiative and the UK Biobank with a visual catalog: PRSWeb
Fritsche L, Beesley L, VandeHaar P, Peng R, Salvatore M, Zawistowski M, Taliun S, Das S, LeFaive J, Kaleba E, Klumpner T, Moser S, Blanc V, Brummett C, Kheterpal S, Abecasis G, Gruber S, Mukherjee B. Exploring various polygenic risk scores for skin cancer in the phenomes of the Michigan genomics initiative and the UK Biobank with a visual catalog: PRSWeb. PLOS Genetics 2019, 15: e1008202. PMID: 31194742, PMCID: PMC6592565, DOI: 10.1371/journal.pgen.1008202.
Peer-Reviewed Original Research
Concepts: Michigan Genomics Initiative; Electronic health records; Polygenic risk scores; Skin cancer subtypes; PheWAS results; UK Biobank; Electronic health record data; Longitudinal biorepository effort; Phenome-wide association study; Risk score; Health record data; UK Biobank data; Prediction of disease risk; Publicly-available sources; Health records; Genetic architecture; Biobank data; Michigan Medicine; Record data; Secondary phenotypes; Disease risk; Visual catalog; Association studies; Genome Initiative; PheWAS