Featured Publications
Statistical Inference for Association Studies Using Electronic Health Records: Handling Both Selection Bias and Outcome Misclassification
Beesley L, Mukherjee B. Statistical Inference for Association Studies Using Electronic Health Records: Handling Both Selection Bias and Outcome Misclassification. Biometrics 2020, 78: 214-226. PMID: 33179768, DOI: 10.1111/biom.13400.
Peer-Reviewed Original Research
Concepts: Electronic health records; Health records; Electronic health record data analysis; Electronic health record settings; Selection bias; Michigan Genomics Initiative; Association studies; EHR-linked; Health research; Inverse probability weighting method; Study sample; Effect estimates; Probability weighting method; Lack of representativeness; Type I error; Survey sampling literature; Standard error estimates; Gold standard labels; Disease status; Error estimates; Statistical inference; Misclassification; Inference strategy; Sampling literature; Standard labels
2023
Cohort profile: Epidemiologic Questionnaire (EPI-Q) – a scalable, app-based health survey linked to electronic health record and genotype data
Salvatore M, Clark-Boucher D, Fritsche L, Ortlieb J, Houghtby J, Driscoll A, Caldwell-Larkins B, Smith J, Brummett C, Kheterpal S, Lisabeth L, Mukherjee B. Cohort profile: Epidemiologic Questionnaire (EPI-Q) – a scalable, app-based health survey linked to electronic health record and genotype data. Epidemiology And Health 2023, 45: e2023074. PMID: 37591787, PMCID: PMC10867525, DOI: 10.4178/epih.e2023074.
Peer-Reviewed Original Research
MeSH Keywords: Electronic Health Records; Female; Genotype; Health Surveys; Humans; Male; Middle Aged; Mobile Applications; Retrospective Studies; Surveys and Questionnaires
Concepts: Electronic health records; Health records; Self-reported health data; Family health history; Epidemiological questionnaire; Cancer screening; Health cohort; Health Survey; Health history; Financial toxicity; Baseline survey; EHR data; Health data; Cohort data; EPI-Q; Average age; Occupational exposure; Genotype data; Participants; Genotype information; Institutional review board approval; Response rate; Cohort; Life meaning; Questionnaire
2022
Case studies in bias reduction and inference for electronic health record data with selection bias and phenotype misclassification
Beesley L, Mukherjee B. Case studies in bias reduction and inference for electronic health record data with selection bias and phenotype misclassification. Statistics In Medicine 2022, 41: 5501-5516. PMID: 36131394, PMCID: PMC9826451, DOI: 10.1002/sim.9579.
Peer-Reviewed Original Research
Concepts: Electronic health records; Electronic health record data analysis; Electronic health record settings; Leverages external data sources; Electronic health record data; Population-based data sources; EHR-based research; Longitudinal health information; University of Michigan Health System; Health record data; Selection bias; Population-based research; Michigan Health System; Multiple sources of bias; Factors related to selection; Patient-level data; Health records; Health system; Health information; Phenotype misclassification; Summary estimates; Phenotyping errors; Cancer diagnosis; Sources of bias; Record data
Assessing the added value of linking electronic health records to improve the prediction of self-reported COVID-19 testing and diagnosis
Clark-Boucher D, Boss J, Salvatore M, Smith J, Fritsche L, Mukherjee B. Assessing the added value of linking electronic health records to improve the prediction of self-reported COVID-19 testing and diagnosis. PLOS ONE 2022, 17: e0269017. PMID: 35877617, PMCID: PMC9312965, DOI: 10.1371/journal.pone.0269017.
Peer-Reviewed Original Research
MeSH Keywords: COVID-19; COVID-19 Testing; Electronic Health Records; Humans; Self Report; Surveys and Questionnaires
Concepts: Electronic health records; Health records; COVID-19-related outcomes; COVID-19 testing; Survey respondents; Self-reported outcomes; Self-reported data; COVID-19 outcomes; Electronic records; Survey data; COVID-19; Prediction model; Model context; Survey; COVID-19 diagnosis; Outcomes; Predictor variables; Digital survey; Data sources; Coronavirus disease 2019; Respondents; Predictors; COVID-19 cases; Diagnosis; Records
Polygenic Liability to Depression Is Associated With Multiple Medical Conditions in the Electronic Health Record: Phenome-wide Association Study of 46,782 Individuals
Fang Y, Fritsche L, Mukherjee B, Sen S, Richmond-Rakerd L. Polygenic Liability to Depression Is Associated With Multiple Medical Conditions in the Electronic Health Record: Phenome-wide Association Study of 46,782 Individuals. Biological Psychiatry 2022, 92: 923-931. PMID: 35965108, PMCID: PMC10712651, DOI: 10.1016/j.biopsych.2022.06.004.
Peer-Reviewed Original Research
MeSH Keywords: Depression; Depressive Disorder, Major; Electronic Health Records; Genome-Wide Association Study; Humans; Multifactorial Inheritance
Concepts: Phenome-wide association study; Polygenic risk scores; MDD PRS; Health records; Risk score; Association studies; Genome-wide polygenic risk score; Associated with multiple medical conditions; Measures of genetic risk; Michigan Genomics Initiative; Psychiatric traits; Electronic health records; European ancestry participants; Major depressive disorder; Associated with tobacco use disorder; Tests of association; Multiple medical conditions; Genitourinary conditions; Tobacco use disorder; Disease-associated disability; Molecular genetic tools; Molecular genetic discoveries; Psychiatric disease categories; Health outcomes; Substance-related disorders
2020
Phenotype risk scores (PheRS) for pancreatic cancer using time-stamped electronic health record data: Discovery and validation in two large biobanks
Salvatore M, Beesley L, Fritsche L, Hanauer D, Shi X, Mondul A, Pearce C, Mukherjee B. Phenotype risk scores (PheRS) for pancreatic cancer using time-stamped electronic health record data: Discovery and validation in two large biobanks. Journal Of Biomedical Informatics 2020, 113: 103652. PMID: 33279681, PMCID: PMC7855433, DOI: 10.1016/j.jbi.2020.103652.
Peer-Reviewed Original Research
MeSH Keywords: Biological Specimen Banks; Electronic Health Records; Genome-Wide Association Study; Humans; Michigan; Pancreatic Neoplasms; Phenotype; Risk Factors
Concepts: Electronic health records; Polygenic risk scores; Electronic health record data; Michigan Genomics Initiative; Phenotype risk score; High-risk individuals; Pancreatic cancer diagnosis; Body mass index; Risk score; Cancer diagnosis; Medical phenome; UK Biobank (UKB); Health record data; Source of patient information; Risk prediction; Hypothesis-generating associations; Disease risk prediction; Health records; Unadjusted associations; Drinking status; Smoking status; Epidemiological covariates; UKB; Patient information; Multivariate associations
An analytic framework for exploring sampling and observation process biases in genome and phenome‐wide association studies using electronic health records
Beesley L, Fritsche L, Mukherjee B. An analytic framework for exploring sampling and observation process biases in genome and phenome‐wide association studies using electronic health records. Statistics In Medicine 2020, 39: 1965-1979. PMID: 32198773, DOI: 10.1002/sim.8524.
Peer-Reviewed Original Research
MeSH Keywords: Bias; Electronic Health Records; Genome-Wide Association Study; Michigan; Phenotype; Polymorphism, Single Nucleotide
Concepts: Electronic health records; Health records; Association studies; Observational health care databases; Electronic health record data; Longitudinal biorepository effort; Phenome-wide association study; Michigan Genomics Initiative; Health record data; Health care databases; Disease-gene association studies; Michigan Health System; Care database; Health system; Phenotype misclassification; Study bias; Record data; Nonprobability sampling; Association analysis; Data sources; Genome Initiative; Misclassification; Analysis approach; Records; Sensitivity analysis
2019
The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities
Beesley L, Salvatore M, Fritsche L, Pandit A, Rao A, Brummett C, Willer C, Lisabeth L, Mukherjee B. The emerging landscape of health research based on biobanks linked to electronic health records: Existing resources, statistical challenges, and potential opportunities. Statistics In Medicine 2019, 39: 773-800. PMID: 31859414, PMCID: PMC7983809, DOI: 10.1002/sim.8445.
Peer-Reviewed Original Research
Concepts: Electronic health records; Health records; Michigan Genomics Initiative; Biobank-based studies; Health-related research; UK Biobank; Health research; Disease-gene associations; Study design; Agnostic search; Biobank; Disease-treatment; Informatics infrastructure; Hypothesis-generating study; Phenotypic identification; Genome Initiative; Missing data; Resource catalog; Exploratory questions; Current body; Biobank research; Data types; Medical research; Recruitment mechanisms; Practical guidance
Exploring various polygenic risk scores for skin cancer in the phenomes of the Michigan genomics initiative and the UK Biobank with a visual catalog: PRSWeb
Fritsche L, Beesley L, VandeHaar P, Peng R, Salvatore M, Zawistowski M, Taliun S, Das S, LeFaive J, Kaleba E, Klumpner T, Moser S, Blanc V, Brummett C, Kheterpal S, Abecasis G, Gruber S, Mukherjee B. Exploring various polygenic risk scores for skin cancer in the phenomes of the Michigan genomics initiative and the UK Biobank with a visual catalog: PRSWeb. PLOS Genetics 2019, 15: e1008202. PMID: 31194742, PMCID: PMC6592565, DOI: 10.1371/journal.pgen.1008202.
Peer-Reviewed Original Research
Concepts: Michigan Genomics Initiative; Electronic health records; Polygenic risk scores; Skin cancer subtypes; PheWAS results; UK Biobank; Electronic health record data; Longitudinal biorepository effort; Phenome-wide association study; Risk score; Health record data; UK Biobank data; Prediction of disease risk; Publicly-available sources; Health records; Genetic architecture; Biobank data; Michigan Medicine; Record data; Secondary phenotypes; Disease risk; Visual catalog; Association studies; Genome Initiative; PheWAS