2022
Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods
Du J, Boss J, Han P, Beesley L, Kleinsasser M, Goutman S, Batterman S, Feldman E, Mukherjee B. Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods. Journal of Computational and Graphical Statistics 2022, 31: 1063-1075. PMID: 36644406, PMCID: PMC9838615, DOI: 10.1080/10618600.2022.2035739.
Peer-Reviewed Original Research
Concepts: Variable selection, Simultaneous coefficient estimation, Penalized regression methods, Binary outcome data, Objective function, R-package, Shrinkage penalty, General class, Cyclic coordinate descent, Variable selection algorithm, Coefficient estimates, Supplementary materials, Method to data, Coordinate descent, Multiple imputation, ALS risk, Multiply-imputed, Outcome data, Function formulation, Selectivity properties, Selection algorithm, Estimation, Optimization algorithm, Missingness, Biomedical applications
2021
A hierarchical integrative group least absolute shrinkage and selection operator for analyzing environmental mixtures
Boss J, Rix A, Chen Y, Narisetty N, Wu Z, Ferguson K, McElrath T, Meeker J, Mukherjee B. A hierarchical integrative group least absolute shrinkage and selection operator for analyzing environmental mixtures. Environmetrics 2021, 32. PMID: 34899005, PMCID: PMC8664243, DOI: 10.1002/env.2698.
Peer-Reviewed Original Research
Concepts: Group least absolute shrinkage, Environmental health studies, Health outcomes, Health Study, LIFECODES birth cohort, Birth cohort, Exposure interactions, Penalized regression methods, Dose-response relationship, Exposure mixtures, Comprehensive R Archive Network, Interaction effects, Induce sparsity, Adaptive weights, Group lasso, Selection operator, Heredity constraint, Least Absolute Shrinkage, Selection framework, Nonlinear interaction effects, Sample size, Variable selection, Joint effects, Coefficient estimates, Group structure
2013
Statistical strategies for constructing health risk models with multiple pollutants and their interactions: possible choices and comparisons
Sun Z, Tao Y, Li S, Ferguson K, Meeker J, Park S, Batterman S, Mukherjee B. Statistical strategies for constructing health risk models with multiple pollutants and their interactions: possible choices and comparisons. Environmental Health 2013, 12: 85. PMID: 24093917, PMCID: PMC3857674, DOI: 10.1186/1476-069x-12-85.
Peer-Reviewed Original Research
Concepts: Multipollutant models, Health impacts of environmental factors, Effect estimates, Exposure-response associations, Exposure to multiple pollutants, Time series design, Consequence of environmental exposure, Sample size, Health impacts, Environmental exposures, Presence of multicollinearity, Risk prediction, Potential interactive effects, Initial screening, Pollutant mixtures, Impact of environmental factors, Supervised principal component analysis, Model dimensions, Statistical literature, Data examples, Tree-based methods, Multiple pollutants, Variable selection, Simulation study, Reduce model dimension