Formulating a Research Question
February 06, 2023
In this first video, we discuss the PICO criteria to guide you in framing a comparative research question.
Information
- ID
- 9449
Transcript
- 00:02My name is Maria Ciarleglio
- 00:04and I'm a faculty member
- 00:05in the Department of Biostatistics
- 00:08at the Yale School of Public Health.
- 00:11In this video series I will introduce the clinical research
- 00:14process to prepare you to collaborate with a statistician.
- 00:20In this first video we'll discuss what is often
- 00:24the first step of the research process,
- 00:26formulating a research question.
- 00:31The first step in the research process
- 00:33is to convert the need for information
- 00:36into an answerable question or hypothesis.
- 00:40A well formulated research question is specific and precise.
- 00:45The research question guides the study design
- 00:49and other design-related study characteristics,
- 00:52the data that are collected, the data analysis
- 00:56and ultimately determines what you can conclude
- 00:59at the end of the study.
- 01:02The PICO criteria can be used to guide you
- 01:05in framing a comparative research question.
- 01:09The PICO framework begins by specifying the population
- 01:12of interest, then the intervention being studied,
- 01:16the control or comparator group,
- 01:19and the outcomes of interest.
- 01:23Begin by specifying the population of interest.
- 01:26For example,
- 01:27patients with non-alcoholic fatty liver disease.
- 01:31The target population is the group of patients
- 01:34to which you would like to generalize your study findings.
- 01:38The study population is the group
- 01:40of patients to which you have access.
- 01:44The study population may be a subset
- 01:47of the target population.
- 01:50For example, your goal may be to generalize
- 01:53to all adult Americans
- 01:55with non-alcoholic fatty liver disease.
- 01:58However, you may be limited to a patient population
- 02:02from a certain state or medical center.
- 02:06In our case, we may only have access to patients
- 02:09with non-alcoholic fatty liver disease
- 02:12followed in the liver clinic from 2015 to 2020.
- 02:18In this case, you could either collect data
- 02:20from all individuals in the available study population
- 02:24if it's feasible to do that.
- 02:26Otherwise, if the study population is too large
- 02:29you could select a random sample
- 02:32from that available study population.
- 02:35If you choose a representative random sample
- 02:39your results are generalizable to that study population.
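The sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient identifiers, the population size of 5,000, the sample size of 200, and the helper name `simple_random_sample` are all hypothetical and do not come from the video.

```python
import random


def simple_random_sample(patient_ids, n, seed=None):
    """Draw n distinct patients uniformly at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(list(patient_ids), n)


# Hypothetical study population: clinic patients numbered 1 to 5000.
study_population = range(1, 5001)

# Hypothetical sample size of 200 patients.
sample = simple_random_sample(study_population, n=200, seed=2023)
print(len(sample), "patients selected, all distinct:", len(set(sample)) == len(sample))
```

Because every patient in the available population has the same chance of selection, results from the sample can reasonably be generalized back to that study population, though not automatically to the wider target population.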
- 02:45Next, specify the main intervention,
- 02:49which is the exposure, test, treatment,
- 02:52or the main prognostic factor
- 02:54that you are interested in studying.
- 02:57For example, lifestyle modification to achieve weight loss
- 03:02or if studying liver cancer,
- 03:04your intervention of interest could be sorafenib
- 03:07to prolong survival.
- 03:12If you're interested in performing a comparison,
- 03:15the next step is to specify a control
- 03:18or comparison intervention or exposure.
- 03:23This can be, for example,
- 03:24a placebo control or the current standard of care.
- 03:30Finally, we must specify the clinical outcome
- 03:33or primary endpoint of your study.
- 03:37This includes the element of time, if that's appropriate,
- 03:40and this would apply if you're looking
- 03:42at a fixed follow-up time period post-intervention.
- 03:47Say three month survival following surgery
- 03:51or NAFLD resolution one year following
- 03:54a certain percentage reduction in total body weight.
- 04:01Let's run through an example of the type of study
- 04:03we often perform using medical record data.
- 04:07The research question asks
- 04:10among Hepatitis B infected persons,
- 04:12what factors or tests best identify individuals
- 04:17at highest risk of progression,
- 04:19as well as those at low risk of progression?
- 04:23The population studied is Hepatitis B infected persons
- 04:28treated at the Yale Liver Center between 2011 and 2021.
- 04:35The interventions of interest
- 04:37are different patient characteristics.
- 04:39Specifically, the study will look at different permutations
- 04:43of key baseline exposures or risk factors
- 04:47identified in previous studies of Hepatitis B prognosis.
- 04:52Here, the investigators will look at age,
- 04:55presence of fibrosis, presence of cirrhosis,
- 04:58elevated ALT, and detectable viral load.
- 05:05The comparator group for each of these factors
- 05:08is absence of the baseline factor.
- 05:12The outcomes of interest are liver-related morbidity,
- 05:16progression of liver disease
- 05:17and mortality during up to 10 years of follow-up.
- 05:22Now, this is more of an exploratory study
- 05:25looking for signals of association, but even still,
- 05:29it has a clearly defined population,
- 05:31intervention or exposures of interest,
- 05:34control or reference levels of the exposures
- 05:37and outcomes of interest.
- 05:40Sitting down and thinking through the PICO criteria
- 05:43forces you to make decisions
- 05:45and pre-specify important aspects of your study.
- 05:51As we saw in the last example,
- 05:53there are often multiple clinical endpoints of interest.
- 05:57Endpoints are classified as clinical or nonclinical.
- 06:03Clinical endpoints describe outcomes
- 06:05involving how a patient feels, functions or survives.
- 06:10They may be assessed by a clinician
- 06:12and involve clinical judgment,
- 06:14such as the occurrence of stroke or MI.
- 06:17They may also be measured by a standard performance measure
- 06:21such as a pulmonary function test
- 06:23or they can be patient-reported,
- 06:25such as self-reported symptoms or quality of life.
- 06:30Nonclinical endpoints include biomarkers
- 06:33that may not directly relate to how a patient feels,
- 06:37however they're thought to be important indicators
- 06:39of the disease process.
- 06:41These endpoints can include blood tests, imaging
- 06:45or other physiological measures such as blood pressure.
- 06:49A good primary outcome should directly align
- 06:52with the primary aim of the study.
- 06:55The endpoint should be accurate
- 06:57and precise, quantifiable, validated, and reproducible.
- 07:02We generally include a single primary endpoint.
- 07:06The goal should be to choose a primary endpoint
- 07:09that will influence decision making in practice.
- 07:13The most significant and impactful endpoint that addresses
- 07:17the research question is chosen as the primary endpoint
- 07:21and additional important endpoints may be designated
- 07:25as secondary or tertiary.
- 07:28Secondary endpoints may not be considered sufficient
- 07:32to influence decision making alone,
- 07:35but may help support the claim of efficacy.
- 07:38Tertiary endpoints
- 07:39are sometimes called exploratory endpoints.
- 07:43If included, they are generally used
- 07:45to test exploratory hypotheses.
- 07:50Again, we generally use a single primary outcome.
- 07:54Using multiple primary endpoints may lead
- 07:57to an unfocused research question and can present problems
- 08:01with interpretation if the treatment effect is observed
- 08:04to differ across the multiple outcomes.
- 08:08However, multiple endpoints may be needed
- 08:11when a clinical benefit depends
- 08:13on more than one aspect of the disease.
- 08:16For example, in Alzheimer's, we may require an effect
- 08:20on both cognition and function,
- 08:23so there may be situations where multiple endpoints
- 08:26are necessary for demonstrating efficacy.
- 08:30The statistical issue with multiple endpoints
- 08:32is what we call multiplicity.
- 08:36When we conduct statistical analysis
- 08:38and perform hypothesis tests,
- 08:40there's a chance that we conclude
- 08:42a significant difference exists between the arms tested
- 08:47when in truth, there is no difference.
- 08:49This is due to random variation in the data
- 08:52that we can observe, but this is a mistake, an error,
- 08:56and we refer to this type of error as a type one error
- 09:02or the alpha level of the test.
- 09:05We like to keep this type of error low,
- 09:08so we typically set the type one error of our tests at 5%.
- 09:13So when you're testing a single endpoint,
- 09:16you can maintain a type one error of 5%.
- 09:20However, suppose we're testing two primary endpoints
- 09:23and success on either endpoint would lead
- 09:26to a conclusion of a treatment difference.
- 09:30The type one error rate on each endpoint compounds
- 09:34and there's an inflation of the overall type one error
- 09:36probability above 5%.
- 09:40This increases the chance of false conclusions
- 09:43regarding the efficacy of the intervention.
- 09:46Special statistical testing procedures
- 09:49need to be used to control the type one error rate
- 09:52for the study with multiple endpoints.
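The inflation described above can be made concrete with a small calculation. The sketch below assumes the endpoints are statistically independent and that there is truly no treatment effect on any of them; real endpoints are usually correlated, so the numbers are illustrative only. The Bonferroni correction is shown as one simple, well-known adjustment, not necessarily the procedure a given study would use, and the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_endpoints):
    """Chance of at least one false positive across independent tests,
    each run at level alpha, when no true effect exists."""
    return 1 - (1 - alpha) ** n_endpoints


def bonferroni_alpha(alpha, n_endpoints):
    """Per-endpoint significance level under the Bonferroni correction."""
    return alpha / n_endpoints


for k in (1, 2, 3, 5):
    unadjusted = familywise_error_rate(0.05, k)
    adjusted = familywise_error_rate(bonferroni_alpha(0.05, k), k)
    print(f"{k} endpoint(s): unadjusted {unadjusted:.4f}, Bonferroni-adjusted {adjusted:.4f}")
```

With two independent endpoints each tested at 5%, the chance of at least one false positive is 1 - 0.95^2 = 0.0975, nearly double the nominal level. Testing each endpoint at 2.5% instead brings the overall rate back just under 5%.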
- 09:56Multiple primary endpoints occur in three ways.
- 10:00The first is when there are multiple endpoints
- 10:03and each endpoint could be sufficient
- 10:05on its own to establish the efficacy
- 10:07of the intervention being tested.
- 10:10These multiple endpoints correspond
- 10:11to multiple chances of success,
- 10:14so failure to adjust for multiplicity
- 10:17can lead to type one error rate inflation
- 10:20and a false conclusion of effectiveness.
- 10:23The second option is when the determination of effectiveness
- 10:27depends on success on all primary endpoints
- 10:31when there are two or more primary endpoints.
- 10:34In this setting, there are no multiplicity issues related
- 10:37to the primary endpoints
- 10:40as there is only one path that leads
- 10:42to a successful outcome for the trial and therefore,
- 10:46no concern with type one error rate inflation.
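A short argument shows why requiring success on all primary endpoints does not inflate the type one error. The notation is an assumption for illustration: two co-primary endpoints with null hypotheses H_1 and H_2, each tested at level alpha. A false claim of effectiveness can only happen if at least one null hypothesis is true, say H_1, and in that case:

```latex
\[
P(\text{reject } H_1 \text{ and reject } H_2)
\;\le\; P(\text{reject } H_1)
\;\le\; \alpha .
\]
```

The same bound holds if H_2 is the true null, so the overall type one error stays at or below alpha without any adjustment. The cost is a loss of statistical power, since the trial must succeed on every endpoint.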
- 10:50The third option combines several aspects
- 10:52of effectiveness into a single primary composite endpoint.
- 10:57This avoids multiple endpoint related multiplicity issues.
- 11:01In many cardiovascular studies
- 11:04it's common to combine several endpoints.
- 11:06For example, cardiovascular death, heart attack and stroke
- 11:11into a single composite primary endpoint.
- 11:15In this case, death is considered on its own
- 11:18as a secondary endpoint.
- 11:19If any one of the elements
- 11:21of the composite outcome is observed,
- 11:23then the endpoint has occurred for that patient.
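The rule for deriving a composite endpoint can be sketched as follows. The record layout, field names, and the four example patients are hypothetical and invented purely for illustration; they are not data from any study.

```python
from dataclasses import dataclass


@dataclass
class PatientOutcome:
    """Hypothetical per-patient record of the three component events."""
    cv_death: bool
    heart_attack: bool
    stroke: bool


def composite_event(p: PatientOutcome) -> bool:
    """The composite endpoint occurs if any one component is observed."""
    return p.cv_death or p.heart_attack or p.stroke


# Invented example patients.
patients = [
    PatientOutcome(cv_death=False, heart_attack=False, stroke=False),
    PatientOutcome(cv_death=False, heart_attack=True, stroke=False),
    PatientOutcome(cv_death=False, heart_attack=True, stroke=True),
    PatientOutcome(cv_death=True, heart_attack=False, stroke=False),
]

n_composite = sum(composite_event(p) for p in patients)
n_stroke = sum(p.stroke for p in patients)
print(f"composite events: {n_composite}, stroke events alone: {n_stroke}")
```

A patient with more than one component event still counts once toward the composite, which is why the composite count is at least as large as the count for any single component.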
- 11:27It's important that the endpoints included
- 11:29in the composite endpoint
- 11:31are of similar clinical importance.
- 11:34Using a composite endpoint is helpful
- 11:37when the components are individually rare
- 11:41so choosing a composite endpoint allows you to
- 11:43observe more events.
- 11:45A limitation of using a composite endpoint is that
- 11:49given the sample size of the study,
- 11:51there may not be adequate statistical power
- 11:55to test each component of the endpoint separately.
- 11:59We'll discuss statistical power in a future video
- 12:02on elements of sample size calculations.
- 12:05We'll also discuss endpoints and variables in general,
- 12:09from a data collection perspective in a future video.
- 12:15In this video, we discussed important things
- 12:18to consider when formulating your research question.
- 12:22From the research question will flow
- 12:24the specific statistical hypotheses to be tested,
- 12:28the design of the study, including the sample size,
- 12:31the data necessary to answer the research question,
- 12:35the statistical analysis that will be performed
- 12:38and the conclusions that can be made.
- 12:41The next video, which is the second video in this series,
- 12:45will give you an overview
- 12:46of study designs commonly used in clinical research.
- 12:51In video three, we will discuss the data collection process
- 12:55and formally define different variable types.
- 12:58This video will prepare us
- 13:00for video four on sample size determination.