Notes, assignments, and code for NEUROBIO 735 (Spring 2024).
1/11 – 4/17:
Tuesday, Thursday
3:00 – 4:30
301 Bryan Research Building
This homework focuses on extending and speeding up our code for detecting tuned cells in calcium imaging data. As part of making our analysis more realistic, we'll walk through a lite version of the method used in Ohki et al.
In class, we used a statistical approach that simply averaged all baseline frames and all frames for each moving grating stimulus together. We also discussed some of the potential drawbacks of this method. For this homework, we’ll use a different approach: calculating an activation minus baseline difference image for each trial. This will allow us to compute a low-variance effect for each trial while calculating a more honest variance measure across trials.
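The per-trial approach can be sketched as follows. This is a minimal illustration, assuming a hypothetical array `movie` of shape `(n_trials, n_frames, ny, nx)` in which the first `n_baseline` frames of each trial are baseline and the rest are stimulus frames; the array and function names are illustrative, not from the course code.

```python
import numpy as np

def trial_difference_images(movie, n_baseline):
    """Return one activation-minus-baseline image per trial."""
    baseline = movie[:, :n_baseline].mean(axis=1)    # (n_trials, ny, nx)
    activation = movie[:, n_baseline:].mean(axis=1)  # (n_trials, ny, nx)
    return activation - baseline

# Toy data: 10 trials, 20 frames each, 4x4 pixels
rng = np.random.default_rng(0)
movie = rng.normal(size=(10, 20, 4, 4))

diffs = trial_difference_images(movie, n_baseline=5)

# Averaging the per-trial difference images gives the effect, and the
# across-trial variance gives an honest error estimate for each pixel.
effect = diffs.mean(axis=0)
variance = diffs.var(axis=0, ddof=1)
```

Because each trial contributes one difference image, the variance is computed across independent trials rather than across correlated frames within a trial.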
In class, we remarked that much better methods were available for detecting whether or not a particular pixel is tuned. One such method is detailed in Ohki et al., which boils down to the following three steps:
In many cases of interest, the order in which we loop over arrays can impact performance. Let’s see if it makes a difference in our case.
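As a toy illustration of loop-order effects, compare summing a C-ordered (row-major) array with the inner loop along rows versus down columns. The magnitude of the difference is machine-dependent, and for pure-Python loops it is smaller than in compiled code, but the access-pattern principle is the same:

```python
import time
import numpy as np

# A C-ordered (row-major) array: elements within a row are contiguous in memory
a = np.arange(400 * 400, dtype=float).reshape(400, 400)

def sum_row_major(a):
    """Inner loop walks along rows: contiguous memory access."""
    total = 0.0
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            total += a[i, j]
    return total

def sum_col_major(a):
    """Inner loop walks down columns: strided, cache-unfriendly access."""
    total = 0.0
    for j in range(a.shape[1]):
        for i in range(a.shape[0]):
            total += a[i, j]
    return total

t0 = time.perf_counter()
s1 = sum_row_major(a)
t1 = time.perf_counter()
s2 = sum_col_major(a)
t2 = time.perf_counter()
print(f"row-major: {t1 - t0:.3f}s, col-major: {t2 - t1:.3f}s")
```

Both loops compute the same sum; only the memory access pattern differs.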
Clearly, the pixel-by-pixel calculation of tuning is the most time-intensive step in our procedure. To get a better sense of where our program is spending its time, we’ll use profiling to take a look:
Profile your code (either the tuning image generation itself or the entire homework).
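One way to do this outside a notebook is the standard library's `cProfile` (inside a notebook, the `%prun` magic wraps the same machinery). The `tuning_image` function below is only a stand-in for the real per-pixel calculation:

```python
import cProfile
import io
import pstats

import numpy as np

def tuning_image(stack):
    """Stand-in for the per-pixel tuning calculation (illustrative only)."""
    return stack.std(axis=0) / (stack.mean(axis=0) + 1e-9)

# Toy stack: 200 frames of 64x64 pixels
stack = np.random.default_rng(2).random((200, 64, 64))

pr = cProfile.Profile()
pr.enable()
tuning_image(stack)
pr.disable()

# Print the five functions with the largest cumulative time
out = io.StringIO()
pstats.Stats(pr, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

Sorting by cumulative time surfaces the call (including everything it calls in turn) where the program spends most of its time, which is where optimization effort pays off.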
In cases where the bottleneck in our code is one of NumPy's own functions, and we're not free to change our approach to the problem (e.g., the algorithm or approximation we're using), we can still gain some traction by using parallel computing. The simplest method for doing this is ipyparallel, which executes code in parallel processes on your laptop (or on a cluster you're connected to). In our case, because the calculation at each pixel is independent of every other, the problem is embarrassingly parallel, and parallelizing is a simple matter of calling map_sync.
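The key structural point is that the per-pixel calculation is a pure function mapped over a list of pixel coordinates. The toy sketch below illustrates that pattern with the standard library's `concurrent.futures` (used as a stand-in here, since ipyparallel needs a running `ipcluster`); ipyparallel's `view.map_sync(func, inputs)` has the same call shape but runs on the cluster engines. The tuning statistic is a made-up placeholder, not the course's actual metric.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Toy stack: 50 frames of 8x8 pixels
rng = np.random.default_rng(1)
data = rng.random((50, 8, 8))

def pixel_tuning(idx, data=data):
    """Compute a stand-in tuning statistic for one pixel."""
    i, j = idx
    trace = data[:, i, j]
    return trace.std() / trace.mean()

pixels = [(i, j) for i in range(8) for j in range(8)]

# Each pixel is independent, so a parallel map is all we need.
# With ipyparallel this line would read: results = dview.map_sync(pixel_tuning, pixels)
with ThreadPoolExecutor(max_workers=4) as ex:
    parallel = list(ex.map(pixel_tuning, pixels))

# Sanity check: the parallel map agrees with a plain serial loop
serial = [pixel_tuning(p) for p in pixels]
```

Because no pixel's result depends on any other, the map can be split across workers in any order without changing the answer.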
Parallelize your pixel tuning calculation. To do this, you will first need to open a terminal in JupyterLab and type ipcluster start. The terminal will spit back some lines about Starting ipcluster, and then you're good to open your notebook. For a sample demonstration to make sure this is working, see the tutorial here.
How big a speedup did you get? (Make sure not to time the setup steps.) Naively, we would expect the computation time to shrink in proportion to the number of workers, e.g., 4x smaller with four workers. Why might your measured speedup differ from this?