Source: http://mindhive.mit.edu/node/112
1. What is smoothing?
"Smoothing" in neuroimaging generally means spatial smoothing, which is a nice euphemism for "blurring." Spatial smoothing consists of applying a small blurring kernel across your image, to average part of the intensities from neighboring voxels together. The effect is to blur the image somewhat and make it smoother - softening the hard edges, lowering the overall spatial frequency, and hopefully improving your signal-to-noise ratio.
2. What's the point of smoothing?
Improving your signal-to-noise ratio (SNR). That's it, in a nutshell. This happens on a couple of levels: the single-subject and the group.
At the single-subject level: fMRI data has a lot of noise in it, but studies have shown that most of the spatial noise is roughly Gaussian - it's essentially random, essentially independent from voxel to voxel, and roughly centered around zero. If that's true, then if we average our intensity across several voxels, our noise will tend to average to zero, whereas our signal (which is some non-zero number) will tend to average to something non-zero, and presto! We've decreased our noise while not decreasing our signal, and our SNR is better. (Desmond & Glover (DesignPapers) demonstrate this effect with real data.)
Matthew Brett has a nice discussion and several illustrations of this on the Cambridge Imagers page: http://www.mrc-cbu.cam.ac.uk/Imaging/smoothing.html
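The averaging argument is easy to check on simulated numbers. The sketch below is a toy simulation, not real data: it assumes a constant signal of 1.0, independent unit-variance Gaussian noise on every voxel, and a plain average over non-overlapping groups of 8 voxels instead of a Gaussian kernel. All of those values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

signal = 1.0                      # true (constant) activation, arbitrary units
n_voxels, group = 100_000, 8      # average non-overlapping groups of 8 voxels

# Independent, zero-mean Gaussian noise on every voxel (the assumption above).
data = signal + rng.normal(0.0, 1.0, size=n_voxels)
averaged = data.reshape(-1, group).mean(axis=1)

snr_raw = signal / data.std()
snr_avg = signal / averaged.std()
print(f"SNR before: {snr_raw:.2f}, after: {snr_avg:.2f}, gain: {snr_avg / snr_raw:.2f}")
```

With independent noise the gain should come out close to the square root of the number of voxels averaged (about 2.8 for 8 voxels), while the mean signal stays at 1.0. To the extent that real noise is correlated between neighboring voxels, the gain will be smaller than this.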
At the group level: Anatomy is highly variable between individuals, and so is exact functional placement within that anatomy. Even with normalized data, there'll be a good chunk of variability between subjects as to where a given functional cluster might be. Smoothing will blur those clusters and thus maximize the overlap between subjects for a given cluster, which increases our odds of detecting that functional cluster at the group level and so increases our sensitivity.
Finally, a slight technical note for SPM: Gaussian field theory, by which SPM does p-corrections, is based on how smooth your data is - the more spatial correlation in the data, the better your corrected p-values will look, because there are fewer degrees of freedom in the data. So in SPM, smoothing will give you a direct bump in p-values - but this is not a "real" increase in sensitivity as such.
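One rough way to see the "fewer degrees of freedom" point is to count resolution elements ("resels"): the search volume divided by the volume of one smoothness-sized block. The sketch below uses that simple approximation only; the voxel count, voxel size and smoothness values are hypothetical, and the smoothness that matters is that of the data, which is generally not identical to the kernel you applied.

```python
def resel_count(n_voxels, voxel_mm, fwhm_mm):
    """Approximate number of resolution elements in a search volume."""
    voxel_volume = voxel_mm[0] * voxel_mm[1] * voxel_mm[2]
    resel_volume = fwhm_mm[0] * fwhm_mm[1] * fwhm_mm[2]
    return n_voxels * voxel_volume / resel_volume

n_voxels = 50_000           # hypothetical in-brain voxel count
voxel_mm = (3.0, 3.0, 3.0)  # hypothetical voxel size

for fwhm in (4.0, 8.0, 12.0):
    r = resel_count(n_voxels, voxel_mm, (fwhm, fwhm, fwhm))
    print(f"smoothness {fwhm:4.1f} mm FWHM -> about {r:8.1f} resels")
```

Doubling the smoothness in all three directions cuts the count by a factor of eight, so the correction has far fewer effectively independent tests to account for. This is only the simplest piece of the picture, not SPM's actual calculation.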
3. When should you smooth? When should you not?
Smoothing is a good idea if:
- You're not particularly concerned with voxel-by-voxel resolution.
- You're not particularly concerned with finding small (less than a handful of voxels) clusters.
- You want (or need) to improve your signal-to-noise ratio.
- You're averaging results over a group, in a brain region where functional anatomy and organization isn't precisely known.
- You're using SPM, and you want to use p-values corrected with Gaussian field theory (as opposed to FDR).
Smoothing is not a good idea if:
- You need voxel-by-voxel resolution.
- You believe your activations of interest will only be a few voxels large.
- You're confident your task will generate large amounts of signal relative to noise.
- You're working primarily with single-subject results.
- You're mainly interested in getting region-of-interest data from very specific structures that you've drawn with high resolution on single subjects.
4. At what point in your analysis stream should you smooth?
The obvious point at which to smooth is as the last spatial preprocessing step for your raw images; smoothing before then will only reduce the accuracy of the earlier preprocessing (normalization, realignment, etc.) - those programs that need smooth images do their own smoothing in memory as part of the calculation, and don't save the smoothed versions. One could also avoid smoothing the raw images entirely and instead smooth the beta and/or contrast images. In terms of efficiency, there's not much difference - smoothing even hundreds of raw images is a very fast process. So the question is one of performance - which is better for your sensitivity?
Skudlarski et al. (SmoothingPapers) evaluated this for single-subject data and found almost no difference between the two methods. They did find that multifiltering (see below) had greater benefits when the smoothing was done on the raw images as opposed to the statistical maps. Certainly if you want to use p-values corrected with Gaussian field theory (a la SPM), you need to smooth before estimating your results. It's a bit of a toss-up, though...
5. How do you determine the size of your kernel? Based on your resolution? Or structure size?
A little of both, it seems. The matched filter theorem, from the signal processing field, tells us that if we're trying to recover a signal (like an activation) in noisy data (like fMRI), we can best do it by smoothing our data with a kernel that's about the same size as our activation.
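Here is a small numerical sketch of the matched filter idea, under simplifying assumptions of my own: a 1-D grid, a Gaussian-shaped activation with a standard deviation of 4 voxels, and independent unit-variance noise. For that kind of noise, the noise left after smoothing has a standard deviation equal to the square root of the sum of the squared kernel weights, so no random numbers are needed.

```python
import numpy as np

x = np.arange(-60, 61)          # 1-D voxel grid, for simplicity

def gaussian(sigma):
    return np.exp(-0.5 * (x / sigma) ** 2)

def peak_snr(signal_sigma, kernel_sigma):
    """Peak signal over noise s.d. after smoothing, for unit-variance white noise."""
    signal = gaussian(signal_sigma)              # activation with peak height 1
    kernel = gaussian(kernel_sigma)
    kernel /= kernel.sum()                       # smoothing kernel sums to 1
    peak = np.sum(kernel * signal)               # smoothed signal at the center
    noise_sd = np.sqrt(np.sum(kernel ** 2))      # s.d. of smoothed white noise
    return peak / noise_sd

signal_sigma = 4.0                               # hypothetical activation width
kernel_sigmas = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0]
snrs = [peak_snr(signal_sigma, k) for k in kernel_sigmas]
best = kernel_sigmas[int(np.argmax(snrs))]

for k, s in zip(kernel_sigmas, snrs):
    print(f"kernel sigma {k:3.1f} voxels -> peak SNR {s:.3f}")
print("best kernel sigma:", best)
```

Among the widths tried, the peak SNR is highest when the kernel is the same width as the activation, and it falls off on either side. The curve is fairly flat near the optimum, so being off by a voxel or so costs little in this toy setup.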
Trouble is, though, most of us don't know how big our activations are going to be before we run our experiment. Even if you have a particular structure of interest (say, the hippocampus), you may not get activation over the whole region - only a part.
Given that ambiguity, Skudlarski et al. introduce a method called multifiltering, in which you calculate results once from smoothed images, and then a second set of results from unsmoothed images. Finally, you average together the beta/con images from both sets of results to create a final set of results. The idea is that the smoothed set of results preferentially highlights larger activations, while the unsmoothed set of results preserves small activations, and the final set has some of the advantages of both. Their evaluations showed multifiltering didn't detect larger activations (clusters with radii of 3-4 voxels or greater) as well as purely smoothed results (as you might predict) but that over several cluster sizes, multifiltering outperformed traditional smoothing techniques. Its use in your experiment depends on how important you consider detecting activations of small size (a radius of less than about 3 voxels).
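The averaging step itself is simple. The sketch below is my reading of "average together the beta/con images": an equal-weight, voxel-wise mean of two images on the same grid. The function name and the toy 1-D "contrast images" are made up for illustration, and the paper's exact procedure may differ in its details.

```python
import numpy as np

def multifilter(con_unsmoothed, con_smoothed):
    """Voxel-wise average of contrast images from the two analyses."""
    a = np.asarray(con_unsmoothed, dtype=float)
    b = np.asarray(con_smoothed, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both images must be on the same voxel grid")
    return 0.5 * (a + b)

# Toy 1-D "contrast images": a one-voxel activation and a broad one.
con_unsmoothed = np.array([0.0, 0.0, 3.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0])
con_smoothed   = np.array([0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.5, 1.5, 1.5, 1.0])

combined = multifilter(con_unsmoothed, con_smoothed)
print(combined)
```

In the toy numbers, the one-voxel activation keeps more of its height in the combined image than it does in the smoothed image alone, while the broad activation keeps part of the boost it got from smoothing.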
Overall, Skudlarski et al. found that over several cluster sizes, a kernel size of 1-2 voxels (3-6 mm, in their case) was most sensitive in general.
A good rule of thumb is to avoid using a kernel that's significantly larger than any structure you have a particular a priori interest in, and carefully consider what your particular comfort level is with smaller activations. A 2-voxel-radius cluster is around 30 voxels and change (and multifiltering would be more sensitive to that size); a 3-voxel-radius cluster is 110 voxels or so (if I'm doing my math right). 6mm is a good place to start. If you're particularly interested in smaller activations, 2-4mm might be better. If you know you won't care about small activations and really will only look at large clusters, 8-10mm is a good range.
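The cluster-size arithmetic above can be checked in a few lines. This sketch assumes a spherical cluster and computes both the continuous sphere volume (4/3 x pi x r^3, in voxel units) and the number of whole voxels whose centers fall within that radius of a central voxel; the function names are mine.

```python
import math
from itertools import product

def sphere_volume(radius_vox):
    """Volume, in voxels, of a continuous sphere of the given radius."""
    return 4.0 / 3.0 * math.pi * radius_vox ** 3

def voxels_within(radius_vox):
    """Number of voxel centers lying within `radius_vox` of a central voxel."""
    r = int(math.floor(radius_vox))
    offsets = range(-r, r + 1)
    return sum(1 for i, j, k in product(offsets, offsets, offsets)
               if i * i + j * j + k * k <= radius_vox ** 2)

for radius in (2, 3):
    print(radius, round(sphere_volume(radius), 1), voxels_within(radius))
# prints:
# 2 33.5 33
# 3 113.1 123
```

So the continuous volumes are about 33.5 and 113 voxels, in line with the rough figures above; counting whole voxels on the grid gives 33 and 123.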
6. Should you use a different kernel for different parts of the brain?
It's an interesting question. Hopfinger et al. find that a 6mm kernel works best for the data they examine in the cortex, but a larger kernel (10mm) works best in subcortical regions. This might be counterintuitive, considering the subcortical structures they examine are smaller in general than large cortical activations - you might think a smaller kernel would be better in subcortical regions, given the smaller size of the structures. Unfortunately, they don't include information about the size of their activation clusters, so the results are difficult to interpret.
Trouble is, figuring out exactly which parts of the brain to use a different size of kernel on presupposes a lot of information - about activation size, about shape of HRF in one region vs. another - that pretty much doesn't exist for most experimental set-ups or subjects. I would tend to suggest that varying the size of the kernel for different regions is probably more trouble than it's worth at this point, but that may change as more studies come out about HRFs in different regions and individualized effects of smoothing. See Kiebel and Friston (SmoothingPapers), though, for some advanced work on changing the shape of the kernel in different regions...
7. What does it actually do to your activation data?
About what you'd expect - preferentially brings out larger activations. Check out White et al. (SmoothingPapers) for some detailed illustrations. We hope to have some empirical results and maybe some pictures up here in the next few weeks...
8. What does it do to ROI data?
Great question, and not one I've got a good answer for at the moment. One big part of the answer will depend on the ratio of your smoothing kernel size to your ROI size. Presumably, assuming your kernel size is smaller than your ROI, it may help improve SNR in your ROI, but if the kernel and ROI are similar sizes, smoothing may also blur the signal such that your structure contains less activation. With any luck, we can do a little empirical testing on this question and have some results up here in the future...
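In the meantime, a toy calculation illustrates the dilution half of that trade-off. This is not an empirical result: it assumes noise-free 1-D data with a signal of 1.0 inside a 10-voxel ROI and nothing outside it, and the helper name and kernel widths are arbitrary choices.

```python
import numpy as np

def smooth_1d(data, sigma_vox, truncate=4.0):
    """Convolve 1-D data with a normalized, truncated Gaussian kernel."""
    radius = int(np.ceil(truncate * sigma_vox))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma_vox) ** 2)
    return np.convolve(data, k / k.sum(), mode="same")

# Noise-free toy data: signal of 1.0 inside a 10-voxel ROI, 0 elsewhere.
data = np.zeros(200)
roi = slice(95, 105)
data[roi] = 1.0

for sigma in (1.0, 3.0, 5.0, 10.0):
    roi_mean = smooth_1d(data, sigma)[roi].mean()
    print(f"kernel sigma {sigma:4.1f} voxels -> mean signal in ROI {roi_mean:.2f}")
```

The mean signal inside the ROI falls steadily as the kernel grows relative to the ROI, because more of the signal leaks out across the ROI border. Whether that loss is outweighed by the noise reduction inside the ROI is exactly the open question, and it will depend on your data.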