2023
A Synthetic Data Integration Framework to Leverage External Summary-Level Information from Heterogeneous Populations
Gu T, Taylor J, Mukherjee B. A Synthetic Data Integration Framework to Leverage External Summary-Level Information from Heterogeneous Populations. Biometrics 2023, 79: 3831-3845. PMID: 36876883, PMCID: PMC10480346, DOI: 10.1111/biom.13852.
Peer-Reviewed Original Research
Concepts: Covariate effects; Statistical inference; Heterogeneity of covariate effects; Regression coefficient estimates; Summary-level information; Improve statistical inference; International studies; Outcome Y; Covariate information; Data integration framework; Statistical efficiency; Coefficient estimates; Partial information; External population; General framework; Individual-level data; Risk prediction model; External model; Prediction problem; International study population; Multiple imputation
2022
Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods
Du J, Boss J, Han P, Beesley L, Kleinsasser M, Goutman S, Batterman S, Feldman E, Mukherjee B. Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods. Journal of Computational and Graphical Statistics 2022, 31: 1063-1075. PMID: 36644406, PMCID: PMC9838615, DOI: 10.1080/10618600.2022.2035739.
Peer-Reviewed Original Research
Concepts: Variable selection; Simultaneous coefficient estimation; Penalized regression methods; Binary outcome data; Objective function; R-package; Shrinkage penalty; General class; Cyclic coordinate descent; Variable selection algorithm; Coefficient estimates; Supplementary materials; Method to data; Coordinate descent; Multiple imputation; ALS risk; Multiply-imputed; Outcome data; Function formulation; Selectivity properties; Selection algorithm; Estimation; Optimization algorithm; Missingness; Biomedical applications
2019
Estimating Outcome-Exposure Associations when Exposure Biomarker Detection Limits vary Across Batches.
Boss J, Mukherjee B, Ferguson K, Aker A, Alshawabkeh A, Cordero J, Meeker J, Kim S. Estimating Outcome-Exposure Associations when Exposure Biomarker Detection Limits vary Across Batches. Epidemiology 2019, 30: 746-755. PMID: 31299670, PMCID: PMC6677587, DOI: 10.1097/ede.0000000000001052.
Peer-Reviewed Original Research
Concepts: Binary outcome data; Likelihood-based methods; Complete-case analysis; Distributional assumptions; Assignment of samples; Superior estimation properties; Simulation study; Complete-case; Multiple imputation strategy; Exposure data; Multiple batches; Batch assignment; Estimated properties; Limit-variables; Single imputation; Multiple imputation; Cohort study
2018
Foetal ultrasound measurement imputations based on growth curves versus multiple imputation chained equation (MICE)
Ferguson K, Yu Y, Cantonwine D, McElrath T, Meeker J, Mukherjee B. Foetal ultrasound measurement imputations based on growth curves versus multiple imputation chained equation (MICE). Paediatric and Perinatal Epidemiology 2018, 32: 469-473. PMID: 30016545, PMCID: PMC6939297, DOI: 10.1111/ppe.12486.
Peer-Reviewed Original Research
Concepts: Linear mixed models; Complete-case analysis; Multiple imputation; Epidemiological studies of risk factors; Imputed datasets; Complete-case; Demographic factors; Study of risk factors; LIFECODES birth cohort; Ultrasound measurements; Calculate associations; Birth cohort; Cross-section; Epidemiological studies; Risk factors; Study visits; Longitudinal analysis; Parametric linear mixed model; Imputation; Missing data; Mixed models; Longitudinal measurements; Sample size; Covariate data; Growth restriction