Quantitative Neurobiology

Notes, assignments, and code for NEUROBIO 735 (Spring 2025).

Class details:

1/9 – 4/15:
Tuesday, Thursday
3:00 – 4:30
301 Bryan Research Building

Syllabus

Exercises

Solutions

Book

Week 3: Imaging data

For week three, we’ll be working with imaging data. Two-photon imaging, to be precise, though other types of imaging present similar challenges. Unlike point process (i.e., spike) data, which are just collections of events — temporal data — imaging data are spatiotemporal: we must deal not only with time, but space as well. In practice, this means not only working with time series, but with images: time series of images.
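To make the "time series of images" idea concrete, here is a minimal sketch (with made-up data, not from the course materials) of how a two-photon movie is typically represented in NumPy: a 3D array with axes (time, row, column), so that collapsing over space gives an ordinary time series and collapsing over time gives an image.

```python
import numpy as np

# Hypothetical two-photon movie: 100 frames of 64 x 64 pixels,
# stored as a 3D array with axes (time, row, column).
rng = np.random.default_rng(0)
movie = rng.poisson(lam=5.0, size=(100, 64, 64)).astype(float)

# Collapsing over space gives back an ordinary time series:
# the mean fluorescence in each frame.
mean_trace = movie.mean(axis=(1, 2))
print(mean_trace.shape)  # (100,)

# Collapsing over time gives a single image: the average frame.
mean_image = movie.mean(axis=0)
print(mean_image.shape)  # (64, 64)
```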

On the coding side, we’ll devote some time this week to what makes code run fast: why some programming languages are faster than others, the strengths and weaknesses of Python when we need lots of computation, and the tools Python provides for finding and removing speed bottlenecks in our code.
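As a small taste of the kind of measurement we have in mind (a sketch with hypothetical data, not part of the course materials), Python’s built-in `timeit` module can compare a pure-Python loop against its vectorized NumPy equivalent:

```python
import timeit
import numpy as np

def python_sum(values):
    # A deliberately naive pure-Python accumulation loop.
    total = 0.0
    for v in values:
        total += v
    return total

data = list(range(100_000))
arr = np.array(data, dtype=float)

# Time each version over 20 repetitions.
t_loop = timeit.timeit(lambda: python_sum(data), number=20)
t_numpy = timeit.timeit(lambda: arr.sum(), number=20)
print(f"pure Python: {t_loop:.4f}s, NumPy: {t_numpy:.4f}s")
```

On typical hardware the NumPy version is dramatically faster, which is exactly the kind of gap we’ll be explaining this week.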

Much of what we will learn can be summarized in the classic quote by computer scientist Donald Knuth:

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.”

So before our second class:

Week 2: Tabular data

This week in class, we’ll be considering one of the most common data formats in all of science: tabular data. Data tables are the organizing principle behind spreadsheets, databases, and even advanced data visualization. Tabular data are typically organized with observations in rows and measured variables in columns, with the freedom to mix numbers, text, and dates in the same data structure.

In Python, these data are supported by the Pandas package through its Series and DataFrame classes. These data structures combine the efficiency and speed of NumPy arrays with special row and column index information that can be used to more easily group, subset, and reorganize heterogeneous data. The typical Pandas use case is for mixtures of categorical, string, and numerical data, as is often the case for behavioral and clinical applications.
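A minimal sketch of that use case (the trial data here are invented for illustration): a `DataFrame` happily mixes strings and numbers in one table, and rows can be subset by label or condition.

```python
import pandas as pd

# Hypothetical behavioral data: each row is one trial (an observation),
# each column a measured variable.
trials = pd.DataFrame({
    "subject": ["s1", "s1", "s2", "s2"],
    "condition": ["control", "drug", "control", "drug"],
    "reaction_time": [0.31, 0.45, 0.28, 0.52],
})

# Boolean indexing subsets the heterogeneous table by a condition.
drug_trials = trials[trials["condition"] == "drug"]
print(drug_trials["reaction_time"].mean())  # 0.485
```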

So before our second meeting:

  • Read all the course material on tabular data.
  • Read Hadley Wickham’s Tidy Data paper. Don’t worry about the R syntax, but do internalize the concepts. How you organize your data can make a huge difference in how easy it is to analyze later.
  • Read about the split-apply-combine method. Again, don’t worry about the R syntax; focus on the analysis pattern, which generalizes across programming languages. In our second session, we’ll cover Pandas’s syntax for performing many of the same operations.
  • Watch this video about generalized linear models:
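The split-apply-combine pattern mentioned above can be sketched in Pandas with `groupby` (the data here are made up for illustration): split the rows by a grouping variable, apply a summary to each group, and combine the results into a new table.

```python
import pandas as pd

# Hypothetical scores from two experimental conditions.
df = pd.DataFrame({
    "condition": ["A", "A", "B", "B", "B"],
    "score": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# Split by condition, apply the mean to each group, combine into a Series.
means = df.groupby("condition")["score"].mean()
print(means)
# condition
# A    2.0
# B    4.0
```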

Week 1: Spike data

We’re going to start the analysis section of the course by working with a ubiquitous data type in neurobiology: point process or event data. Typically, these events are action potentials, but they could also be EPSPs, vesicle releases, or even communication in social networks. We’ll focus on common exploratory data approaches, including the peri-stimulus/peri-event time histogram and estimation of firing rates.
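As a preview (a sketch using invented spike and event times, not a course dataset), the basic PSTH computation is just: align spike times to each event, histogram the relative times, and normalize counts to a firing rate.

```python
import numpy as np

# Hypothetical spike times (s) and stimulus onset times (s).
spikes = np.array([0.12, 0.35, 1.10, 1.18, 1.40, 2.15, 2.22])
events = np.array([0.0, 1.0, 2.0])

window = (0.0, 0.5)   # look 0-500 ms after each event
bin_width = 0.1
bins = np.arange(window[0], window[1] + bin_width, bin_width)

# Align spikes to each event and accumulate a histogram of relative times.
counts = np.zeros(len(bins) - 1)
for t in events:
    counts += np.histogram(spikes - t, bins=bins)[0]

# Normalize counts to spikes/s, averaged over trials (events).
rate = counts / (len(events) * bin_width)
print(rate)
```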

We’ll also get practice refactoring, the process of iteratively rewriting code to make it cleaner, more useful, and more maintainable. As a result, much of this week’s in-class work will have you writing and rewriting the same analyses, each time making them more generic and reusable. Along the way, we’ll touch on related issues such as code smells, testing, and the red-green-refactor method.
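A toy illustration of the red-green-refactor cycle (the function and test here are hypothetical, not from the assignments): first write a failing test, then the simplest code that passes it, then clean up while rerunning the test.

```python
# Step 1 (red): write a test for the behavior we want, before the code exists.
def test_firing_rate():
    assert firing_rate(n_spikes=10, duration=2.0) == 5.0

# Step 2 (green): write the simplest code that makes the test pass.
def firing_rate(n_spikes, duration):
    return n_spikes / duration

# Step 3 (refactor): improve the code without changing its behavior,
# rerunning the test afterward to confirm nothing broke.
def firing_rate(n_spikes: int, duration: float) -> float:
    """Mean firing rate in spikes per second."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return n_spikes / duration

test_firing_rate()  # still passes after the refactor
```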

So before our first class:

and before our second class:

Optional reading:

Here's the plan

Before we get going on our first week of class, let’s talk about how we’re going to do this:

Overview

The goal of the course is to give you a sampler of techniques and ideas in quantitative neurobiology, which we consider to encompass computation, data analysis, modeling, and theory. The course is divided into three main sections (with a final week for project presentations):

  1. Introduction to programming (1/9 – 1/30): There are several good options here, but we’ll be using Python. More details below, but we will not assume prior programming experience.

  2. Analyzing neural data (systems) (2/4 – 3/6): The goal here is to get you comfortable using programming to explore, visualize, analyze, and model several types of data generated by neuroscience experiments.

  3. Analyzing neural data (cellular/molecular) (3/18 – 4/15): Here, we’ll use R to analyze data from cellular and molecular neuroscience experiments.

Logistics

Getting together

Class will be in person in 301 Bryan.

Computing

This year, we will encourage you to use Google Colab for your assignments. This has the advantage of standardizing our Python environment and allowing you to easily share assignments for grading. If you prefer to use a local machine, that’s fine; please just discuss with us in advance, since we can’t necessarily provide support to debug your setup. Colab uses a variant of Jupyter Notebooks, which are covered in the Python Data Science Handbook, but we will also cover the basics of navigating this in our first class.

Assignments

We will have both in-class work and assignments. You are allowed to work collaboratively on homework, but your write-ups must be done independently. Please also note everyone you worked with when turning in your assignments.

Solutions should be submitted as saved Jupyter Notebooks or R Markdown. We’ll tell you where to put these and how to name them.

Solutions are due before class on Tuesdays (3:00 pm EST). If something unexpected comes up, we can work with you, but we need to know in advance; please help us help you. We can’t release solutions until everyone’s assignments have been turned in.

Phase I: Beginning Python

For the first several weeks of class, we’ll offer a crash course in basic Python, covering A Whirlwind Tour of Python and transitioning to the Python Data Science Handbook. This portion is aimed at students with limited programming background, and attendance is purely optional: those of you who are already comfortable with programming do not need to attend, though you will still be responsible for the material. I will also be working with the TAs to set up additional help during this period for those who would like it.

Phase II: Data Analysis

This second phase of the course will cover five weeks and will focus on analyzing real neuroscience data sets.

I will not be lecturing. At least, not much. Most of what we’ll cover isn’t really learned effectively that way, so we’ll use our class time to complete programming and data analysis exercises that build on the basic Python knowledge you gained by reading A Whirlwind Tour of Python.

Each week, we’ll do two sessions of in-class assignments, for which you’ll be encouraged to work with a partner. The weeks are organized around both data and programming themes, and the in-class assignments often build on one another. After class is done for the day, we’ll post links to solutions. Typically, we’ll be walking you through an example analysis, with the goal of setting you up for the homework.

Outside of class

You are responsible for reading through the Python Data Science Handbook. I will try to have assignments roughly keep pace with the material in the book, but the correspondence will be loose.

You are also responsible for checking this website. All class materials will be posted here, as well as changes and corrections to homework assignments.

Getting help

Please make use of the TAs and their office hours. I am also glad to help. If something is confusing with the assignments, the fault is probably mine, and you’re probably not alone. If you alert me early, we can probably fix it.

Rough schedule for Part I

This will change as we go along, but in order to help you get started, here’s our tentative plan:

| Date | Topic | Exercises | Reading |
|------|-------|-----------|---------|
| 1/9  | Housekeeping, accessing computing, advanced Googling | | WWTP Ch. 1 – 6 |
| 1/14 | What can Python do? | NMA tutorial 1, NMA tutorial 2 | WWTP Ch. 7 – 8 |
| 1/16 | Data structures, iteration | notebook | WWTP Ch. 9 – 12 |
| 1/21 | Patterns, functions, duck typing | notebook | PDSH Ch. 1 |
| 1/23 | NumPy and arrays | notebook | PDSH Ch. 2 |
| 1/28 | Data frames | notebook | PDSH Ch. 3 |
| 1/30 | Plotting | Seaborn tutorials, Matplotlib tutorials | PDSH Ch. 4 |