Don’t let that detailed image throw you. Unlike, say, an X-ray, an image produced by functional magnetic resonance imaging (fMRI) isn’t a snapshot. It’s a statistical map, the colorful end product of massive calculations.

During an fMRI experiment, researchers typically scan the brain at rest and then during a series of tasks, such as recalling a string of numbers. This technique yields hundreds of images, each containing information about changes in metabolism and blood flow. These changes are associated with neuronal activity and are measured in voxels by the tens of thousands. (A voxel is a cube-shaped data unit analogous to the 2-D pixel.) Long after the person being scanned has gone home, the researchers must contend with gigantic amounts of raw data.

First, in a step called preprocessing, researchers correct the data. There is slice-timing correction, since not all “slices” of the brain are imaged simultaneously. There is motion correction, since subjects tend to move during scans. Low-resolution fMRI data are superimposed onto a higher-resolution anatomical image obtained by regular MRI (coregistration), and the result is then warped to fit a standard “template brain,” because not everyone’s brain neatly matches the template (normalization).
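
To make slice-timing correction concrete, here is a minimal sketch in Python. It assumes a single slice whose samples were taken a fixed delay after the start of each volume, and it uses plain linear interpolation to estimate what the values would have been at the start. The function name, the 2-second repetition time, and the sine-wave "signal" are all invented for illustration; real analysis packages typically use more sophisticated interpolation.

```python
import numpy as np

def slice_timing_correct(series, slice_offset, tr):
    """Resample one slice's time series to the start of each volume.

    series: 1-D array, one value per volume, actually sampled at
            t = k * tr + slice_offset.
    Returns estimates at t = k * tr using linear interpolation
    (an assumption made here for simplicity).
    """
    n = len(series)
    acquired_at = np.arange(n) * tr + slice_offset
    reference = np.arange(n) * tr
    return np.interp(reference, acquired_at, series)

# Hypothetical setup: a slow oscillation, sampled 1 second late in each volume.
tr = 2.0        # seconds between volumes (assumed)
offset = 1.0    # how late this slice is acquired (assumed)
t_ref = np.arange(50) * tr

def signal(t):
    return np.sin(2 * np.pi * t / 40.0)

late = signal(t_ref + offset)                 # what the scanner recorded
corrected = slice_timing_correct(late, offset, tr)
truth = signal(t_ref)                         # what we wanted to know

# Skip the first volume: nothing was sampled before it, so it cannot be interpolated.
err_before = np.abs(late - truth)[1:].mean()
err_after = np.abs(corrected - truth)[1:].mean()
print(err_before, err_after)
```

In this toy case the corrected series sits much closer to the true signal than the raw one does, which is the whole point of the step.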

Then comes data analysis, in which researchers try to isolate those areas and networks that sent a stronger signal during a specific mental task. A common approach is subtraction: the signal obtained at rest is “subtracted” from the one obtained during the task. To further localize the mental process being studied, images captured during a control task may be subtracted from images obtained during the study task. Whatever signal remains may show the area of brain involved in that task, though this deduction contains many pitfalls: correlation does not imply causation, and seemingly simple cognitive tasks may comprise multiple simpler processes. Subtraction is needed because the signal from a single task can be extremely faint amid the brain’s busy baseline activity; without it, the two scans might look nearly identical.
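
The subtraction idea can be sketched with made-up numbers. The Python example below is a toy simulation, not a real analysis: the voxel count, noise levels, and the size of the task effect are all assumptions chosen so that the averaged rest and task images look almost the same until one is subtracted from the other.

```python
import numpy as np

rng = np.random.default_rng(0)

n_voxels = 1000
n_scans = 100              # scans per condition (assumed)
active = slice(0, 50)      # voxels that respond to the task, by construction

# Each voxel has its own baseline level, which dominates the raw images.
baseline = rng.normal(100.0, 10.0, size=n_voxels)
rest = baseline + rng.normal(0.0, 1.0, size=(n_scans, n_voxels))
task = baseline + rng.normal(0.0, 1.0, size=(n_scans, n_voxels))
task[:, active] += 0.5     # a faint task-related increase (assumed size)

# Average over scans, then subtract rest from task.
mean_rest = rest.mean(axis=0)
mean_task = task.mean(axis=0)
difference = mean_task - mean_rest

# The two averaged images are almost perfectly correlated...
similarity = np.corrcoef(mean_rest, mean_task)[0, 1]
# ...but the difference image separates responsive voxels from the others.
print(similarity, difference[active].mean(), difference[50:].mean())
```

Here the baseline cancels out in the difference image, leaving only the small task-related change plus noise.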

Combining data from multiple research subjects is crucial, but it, too, can be dicey. The statistics may not take into account the anomalies that often dog complex experiments, such as missing data or a subject with a truly unusual brain signal. And individual subjects’ anatomical differences are often blurred during normalization, which means losing potentially important information.

Because of an fMR image’s colors, we say that the brain “lights up” in response to some mental task. But in reality nothing “lights up”; these colors are simply a code for the relative strength of the signal within each voxel. To judge whether that signal is random or real, researchers must also choose a threshold for the p-value, the probability of seeing a signal at least that strong through chance alone if there were no real effect. A common threshold in statistics is 0.05, or 5 percent, meaning that a result is accepted only if chance alone would produce one at least as strong no more than 5 percent of the time. But some fMRI researchers call for thresholds as strict as 0.005, reasoning that they’ll find fewer false positives that way. False positives are a real danger, as one group famously demonstrated by “finding” areas of brain activation during an fMRI analysis of a dead salmon. With tens of thousands of voxels tested at once, even a small false-positive rate per voxel can add up to many spurious “active” voxels. Unfortunately, stricter thresholds might eliminate important data from consideration.
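
A quick simulation shows why the choice of threshold matters so much. The Python sketch below builds a "brain" of pure noise, in which no voxel has any real effect, and counts how many voxels pass each threshold anyway. The voxel and scan counts are assumptions, and the test used (a simple z-test with known noise) is a simplification of what real studies do. The last threshold divides 0.05 by the number of voxels, a conservative adjustment known as the Bonferroni correction, which the discussion above does not mention and which is included only for comparison.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

n_voxels = 50_000   # assumed number of voxels
n_scans = 30        # assumed number of scans

# Pure noise: by construction, nothing here is a real signal.
data = rng.normal(0.0, 1.0, size=(n_scans, n_voxels))

# z statistic per voxel (the noise standard deviation is known to be 1 here).
z = data.mean(axis=0) * math.sqrt(n_scans)

# Two-sided p-value under the standard normal distribution.
p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])

bonferroni = 0.05 / n_voxels
false_positives = {
    alpha: int((p < alpha).sum())
    for alpha in (0.05, 0.005, bonferroni)
}
for alpha, count in false_positives.items():
    print(f"threshold {alpha:g}: {count} voxels pass")
```

Roughly 5 percent of the noise voxels pass the 0.05 threshold and roughly 0.5 percent pass at 0.005, which in a dataset this size still means hundreds of voxels that appear active for no reason.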

Statisticians are working on ways to refine all these analyses in hopes of ensuring that what “lights up” during a cognitive task reflects real, significant, and specific brain activity. But in the meantime, it’s best to bear in mind that an fMR image is more like a graph than a photo. Like any image born of statistics, it can both enlighten and mislead.