Notes, assignments, and code for NEUROBIO 735 (Spring 2024).
1/11 – 4/17:
Tuesday, Thursday
3:00 – 4:30
301 Bryan Research Building
This homework focuses on extending and speeding up our code for detecting tuned cells in calcium imaging data. As part of making our analysis more realistic, we'll walk through a lite version of the method used in Ohki et al.
In class, we used a statistical approach that simply averaged all baseline frames and all frames for each moving grating stimulus together. We also discussed some of the potential drawbacks of this method. For this homework, we’ll use a different approach: calculating an activation minus baseline difference image for each trial. This will allow us to compute a low-variance effect for each trial while calculating a more honest variance measure across trials.
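The per-trial approach can be sketched as follows. This is a minimal illustration, assuming a hypothetical array `movie` of shape `(n_trials, n_frames, ny, nx)` in which the first `n_baseline` frames of each trial are baseline and the rest are stimulus frames; the array and function names are illustrative, not from the course code.

```python
import numpy as np

def trial_difference_images(movie, n_baseline):
    """Return one activation-minus-baseline image per trial."""
    baseline = movie[:, :n_baseline].mean(axis=1)    # (n_trials, ny, nx)
    activation = movie[:, n_baseline:].mean(axis=1)  # (n_trials, ny, nx)
    return activation - baseline

# Toy data: 10 trials, 20 frames each, 4x4 pixels
rng = np.random.default_rng(0)
movie = rng.normal(size=(10, 20, 4, 4))

diffs = trial_difference_images(movie, n_baseline=5)

# Averaging the per-trial difference images gives the effect, and the
# across-trial variance gives an honest error estimate for each pixel.
effect = diffs.mean(axis=0)
variance = diffs.var(axis=0, ddof=1)
```

Because each trial contributes one difference image, the variance is computed across independent trials rather than across correlated frames within a trial.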
In class, we remarked that much better methods were available for detecting whether or not a particular pixel is tuned. One such method is detailed in Ohki et al., which boils down to the following three steps:
In many cases of interest, the order in which we loop over arrays can impact performance. Let’s see if it makes a difference in our case.
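As a toy illustration of loop-order effects, compare summing a C-ordered (row-major) array with the inner loop along rows versus down columns. The magnitude of the difference is machine-dependent, and for pure-Python loops it is smaller than in compiled code, but the access-pattern principle is the same:

```python
import time
import numpy as np

# A C-ordered (row-major) array: elements within a row are contiguous in memory
a = np.arange(400 * 400, dtype=float).reshape(400, 400)

def sum_row_major(a):
    """Inner loop walks along rows: contiguous memory access."""
    total = 0.0
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            total += a[i, j]
    return total

def sum_col_major(a):
    """Inner loop walks down columns: strided, cache-unfriendly access."""
    total = 0.0
    for j in range(a.shape[1]):
        for i in range(a.shape[0]):
            total += a[i, j]
    return total

t0 = time.perf_counter()
s1 = sum_row_major(a)
t1 = time.perf_counter()
s2 = sum_col_major(a)
t2 = time.perf_counter()
print(f"row-major: {t1 - t0:.3f}s, col-major: {t2 - t1:.3f}s")
```

Both loops compute the same sum; only the memory access pattern differs.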
Clearly, the pixel-by-pixel calculation of tuning is the most time-intensive step in our procedure. To get a better sense of where our program is spending its time, we’ll use profiling to take a look:
Profile your code (either the tuning image generation itself or the entire homework).
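One way to do this outside a notebook is the standard library's `cProfile` (inside a notebook, the `%prun` magic wraps the same machinery). The `tuning_image` function below is only a stand-in for the real per-pixel calculation:

```python
import cProfile
import io
import pstats

import numpy as np

def tuning_image(stack):
    """Stand-in for the per-pixel tuning calculation (illustrative only)."""
    return stack.std(axis=0) / (stack.mean(axis=0) + 1e-9)

# Toy stack: 200 frames of 64x64 pixels
stack = np.random.default_rng(2).random((200, 64, 64))

pr = cProfile.Profile()
pr.enable()
tuning_image(stack)
pr.disable()

# Print the five functions with the largest cumulative time
out = io.StringIO()
pstats.Stats(pr, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

Sorting by cumulative time surfaces the call (including everything it calls in turn) where the program spends most of its time, which is where optimization effort pays off.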
In cases where the bottleneck in our code is one of NumPy's own functions, and we're not free to change our approach to the problem (e.g., the algorithm or approximation we're using), we can still gain some traction by using parallel computing. The simplest method for doing this is ipyparallel, which executes code in parallel processes on your laptop (or on a cluster you're connected to). In our case, because the calculation at each pixel is independent of every other, the problem is embarrassingly parallel, and parallelizing is a simple matter of calling map_sync.
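The key structural point is that the per-pixel calculation is a pure function mapped over a list of pixel coordinates. The toy sketch below illustrates that pattern with the standard library's `concurrent.futures` (used as a stand-in here, since ipyparallel needs a running `ipcluster`); ipyparallel's `view.map_sync(func, inputs)` has the same call shape but runs on the cluster engines. The tuning statistic is a made-up placeholder, not the course's actual metric.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Toy stack: 50 frames of 8x8 pixels
rng = np.random.default_rng(1)
data = rng.random((50, 8, 8))

def pixel_tuning(idx, data=data):
    """Compute a stand-in tuning statistic for one pixel."""
    i, j = idx
    trace = data[:, i, j]
    return trace.std() / trace.mean()

pixels = [(i, j) for i in range(8) for j in range(8)]

# Each pixel is independent, so a parallel map is all we need.
# With ipyparallel this line would read: results = dview.map_sync(pixel_tuning, pixels)
with ThreadPoolExecutor(max_workers=4) as ex:
    parallel = list(ex.map(pixel_tuning, pixels))

# Sanity check: the parallel map agrees with a plain serial loop
serial = [pixel_tuning(p) for p in pixels]
```

Because no pixel's result depends on any other, the map can be split across workers in any order without changing the answer.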
Parallelize your pixel tuning calculation. To do this, you will first need to open a terminal in JupyterLab and type ipcluster start. The terminal will spit back some lines about Starting ipcluster, and then you're good to open your notebook. For a sample demonstration to make sure this is working, see the tutorial here.
How big a speedup did you get? (Make sure not to time the setup steps.) Naively, we would expect the computation time to shrink in proportion to the number of workers, e.g., 4x smaller with four workers. Why might your measured speedup differ from this?