{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Exploring Image data\n", "Image data in neurobiology can be broadly divided into two types:\n", "\n", "- **Static images**, including histology, gene expression maps, and other 2- and 3-dimensional images. These represent data at a single time point.\n", "- **Dynamic images**, including calcium and fMRI imaging. These represent data as a sequence of images, one image (or image volume) per time point.\n", "\n", "## Loading the data\n", "This week's data are available for download here (right click and \"save as\"), or can be downloaded from Google Drive via\n", "```\n", "!gdown 10mbjgJIkyaczKs47_f4es0M3AI7gqTy3\n", "```\n", "on Colab or in the notebook. The data are recordings of calcium imaging experiments in mouse V1 that come to us courtesy of Ashley Wilson in Lindsey Glickfeld's lab. The data are once again in `.mat` format. This week, they are large by the standards of what we've been working with (~200MB), though far smaller than the datasets produced by real experiments. In the actual experiments, images are taken at 30 Hz (30 images per second), whereas the sample data are downsampled to 3 Hz.\n", "\n", "In the experiment that generated these data, the mice were exposed to a series of drifting grating stimuli, and the recorded responses were images reflecting calcium fluorescence levels. The stimuli consisted of gratings drifting in a variety of directions, probing the sensitivity of cells in V1 to both orientation and motion direction.\n", "\n", "The variables in the data file are:\n", "\n", "- `data`: This week, it will be up to you to determine what each dimension of this array represents.\n", "- `dirTuningExp`: A structure containing metadata about the experiment. In particular:\n", "    - `nTrials`: Number of trials in the data.\n", "    - `stimOffFrames`: Number of frames during which no stimulus was presented.\n", "    - `stimOnFrames`: Number of frames during which each stimulus was presented.\n", "    - `tGratingDirectionDeg`: Direction of drift of the moving grating on each trial.\n", "\n", "Each trial (and thus the dataset) *began* with the stimulus off and then switched it on.\n", "\n", "```{admonition} Exercise\n", "1. Load the data. What is its shape?\n", "1. Based on principles of memory layout we've discussed, which dimension of the array *should* be time, if we're mostly interested in performing analysis on the individual images?\n", "1. Plot sections of the data (as images) to determine which dimension of the `data` array *is* time.\n", "```\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "remove-cell" ] }, "outputs": [], "source": [ "!gdown 10mbjgJIkyaczKs47_f4es0M3AI7gqTy3" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Solution:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# Load data\n", "import scipy.io as sci\n", "mat_contents = sci.loadmat('DirectionTuning_V1_dec.mat')  # avoid shadowing the builtin `vars`\n", "\n", "data = mat_contents['data']\n", "dirTuningExp = mat_contents['dirTuningExp']\n", "\n", "## shape of the data\n", "Nvert, Nhorz, Nframes = data.shape\n", "print(f\"Data are {Nvert} x {Nhorz} x {Nframes}\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "```{toggle}\n", "If we're using C memory layout (\"row-major order,\" with consecutive data in rows), then index 0 should be time, since that is the index that advances slowest. In Matlab or Julia (with FORTRAN or column-major order), time would need to be the _last_ index, since that's the one that changes slowest. We want the slowest-changing index to be time because all pixels from the same time point are then consecutive in memory, and grabbing them for further manipulation and calculation will be faster.\n", "```" ] },
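 { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "We can also check the layout directly. The cell below is a quick sanity check, not part of the required solution: it assumes only `data` and `numpy`, and relies on the fact that `scipy.io.loadmat` typically returns MATLAB's column-major arrays in Fortran order, so the axis with the largest stride is the slowest-varying one." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "import numpy as np\n", "\n", "# scipy.io.loadmat typically hands back MATLAB arrays in Fortran (column-major)\n", "# order, so we expect the last axis (frames) to vary slowest\n", "print(data.flags['C_CONTIGUOUS'], data.flags['F_CONTIGUOUS'])\n", "print(data.strides)  # bytes per step along each axis; largest stride = slowest axis\n", "\n", "# if time really is the slowest axis, a single frame is one contiguous block\n", "print(data[:, :, 0].flags['F_CONTIGUOUS'])" ] },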
{ "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%config InlineBackend.figure_formats = ['svg']\n", "\n", "# slices along the last axis look like sensible images, so the last axis is time\n", "plt.imshow(data[:, :, 1], aspect='auto')\n", "plt.colorbar()\n", "plt.show()\n", "\n", "plt.imshow(data[:, :, 5], aspect='auto')\n", "plt.colorbar()\n", "plt.show()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Converting from images to time series\n", "In addition to a sequence of images, we can also think of these data as a collection of time series, one per pixel.\n", "\n", "```{admonition} Exercise\n", "1. Extract the calcium time series for a few representative pixels. Plot them. Be sure your x-axis reflects the actual time between images/samples.\n", "```\n", "\n", "### Solution:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# let's look at a couple of time series\n", "import numpy as np\n", "\n", "## take pixels (55, 125) and (75, 180) as examples\n", "dt = 1/3  # 3 Hz sampling\n", "taxis = np.arange(0, Nframes) * dt\n", "\n", "plt.plot(taxis, data[55, 125, :])\n", "plt.plot(taxis, data[75, 180, :])\n", "plt.xlabel('Time [s]')\n", "plt.ylabel('Pixel Intensity')\n", "plt.show()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Converting from arrays to movies\n", "For spatiotemporal data, one of the best ways to gain qualitative insight is by using your eye as a pattern detector. For a sequence of images, Python, like many languages, will allow us to play them as a movie. There's a simple example [here](https://matplotlib.org/stable/gallery/animation/dynamic_image.html) (Colab-specific [here](https://colab.research.google.com/github/jckantor/CBE30338/blob/master/docs/A.03-Animation-in-Jupyter-Notebooks.ipynb)) and more complex tutorials [here](https://jakevdp.github.io/blog/2012/08/18/matplotlib-animation-tutorial/) and [here](https://towardsdatascience.com/how-to-create-animated-graphs-in-python-bb619cc2dec1). I cannot overemphasize the power of creating animations and movies of your data.\n", "\n", "```{admonition} Exercise\n", "1. Using the examples above, make a movie of the data.\n", "1. Make sure the colors in your plot are appropriately normalized. Different image functions have different expectations about the range of values in the data you feed them, but it might help, for example, to make sure that the values in each pixel are between 0 and 1 across all images. More specifically, make sure you are normalizing **across** images, not just **within** images.\n", "```\n", "```{warning}\n", "Making a movie using all the data can take a long time. I suggest using a smaller number of frames (50-100) for a more reasonable runtime.\n", "```\n", "\n", "````{tip}\n", "If you want to convert the animation object to HTML5 video, as in the Colab solution, but you aren't working in Colab, you may need to install `ffmpeg`. I suggest\n", "```\n", "!conda install -c conda-forge ffmpeg\n", "```\n", "if you are using Anaconda.\n", "````\n", "\n", "### Solution:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# show a movie\n", "import matplotlib.animation as anim\n", "from IPython.display import HTML\n", "\n", "## normalize to [0, 1] across *all* frames\n", "pmax = np.max(data)\n", "pmin = np.min(data)\n", "\n", "fig = plt.figure()\n", "frames = []\n", "for i in range(50):\n", "    frame = plt.imshow((data[:, :, i] - pmin) / (pmax - pmin), animated = True, aspect = 'auto')\n", "    frames.append([frame])\n", "\n", "plt.close()  # so we don't display the plot itself\n", "\n", "vid = anim.ArtistAnimation(fig, frames, interval = dt * 1000)\n", "\n", "HTML(vid.to_html5_video())\n", "\n", "## to save the video, uncomment this:\n", "# vid.save('basic_animation.mp4', fps=int(1/dt))" ] },
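 { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The solution above rescales by the *global* minimum and maximum, which satisfies the \"normalize across images\" requirement. The exercise also suggests scaling each pixel to [0, 1] across all images; the cell below is a minimal sketch of that variant (only `data` and `numpy` are assumed, and the epsilon guard is our own addition). Note the design tradeoff: per-pixel scaling highlights each pixel's own dynamics but amplifies noise in pixels with little signal." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# per-pixel normalization: rescale each pixel's trace to [0, 1]\n", "# using its own min and max over time (axis 2 indexes frames)\n", "pix_min = data.min(axis=2, keepdims=True)\n", "pix_max = data.max(axis=2, keepdims=True)\n", "data_norm = (data - pix_min) / (pix_max - pix_min + 1e-12)  # epsilon avoids 0/0 in flat pixels\n", "\n", "# data_norm can be fed to the same animation loop in place of the global scaling\n", "print(data_norm.min(), data_norm.max())" ] },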
{ "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Tuning curves: a first pass\n", "For each recorded neuron in the movie, we might like to assess its sensitivity to both orientation and motion direction. To do this, we would first need to find the locations of cells within the image (and they might move), then appropriately average the calcium fluorescence time series, and finally assess whether and to what degree each cell is tuned.\n", "\n", "But for programming, we should start simple, with the most straightforward version we can think of: let's try to assess the tuning of each *pixel* and do so with a back-of-the-envelope sort of calculation that we can refine as we go.\n", "\n", "```{admonition} Exercise\n", "1. Let's start with a fixed point in the image (e.g., index (45, 199)). Plot the calcium fluorescence time series for that point.\n", "1. For the stimulus-off baseline and each grating direction, find the mean calcium activation. There are lots of ways to do this. Plot the tuning curve as a function of motion direction. Make sure to label the x-axis appropriately and indicate the baseline activation level.\n", "1. Find the direction for which activation is maximal. Do this programmatically, since we'll want to automate this for each pixel later (see the sketch after the solution).\n", "```\n", "\n", "### Solution:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# let's plot the time series for one pixel\n", "pt_in_img = [45, 199]\n", "tseries = data[pt_in_img[0], pt_in_img[1], :]\n", "plt.plot(np.arange(Nframes) * dt, tseries)\n", "plt.xlabel('time (s)');" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# ...and figure out its tuning\n", "# each trial: stimulus off for Noff frames (12 here), then on for Non frames (6 here)\n", "# strategy: label each frame with its stimulus condition, then take a group mean per condition\n", "\n", "## first, build the labels (-1 marks stimulus-off frames)\n", "labels = -1 * np.ones([Nframes, 1])\n", "codes = dirTuningExp['tGratingDirectionDeg'][0][0][0]\n", "Ntrials = dirTuningExp['nTrials'][0][0][0][0]\n", "Noff = dirTuningExp['stimOffFrames'][0][0][0][0]\n", "Non = dirTuningExp['stimOnFrames'][0][0][0][0]\n", "\n", "offset = 0\n", "for i in range(Ntrials):\n", "    offset = offset + Noff\n", "    labels[offset + np.array(range(Non))] = codes[i]\n", "    offset = offset + Non\n", "\n", "## then put the data into a table\n", "import pandas as pd\n", "\n", "dat = {'orientation': labels.ravel(), 'trace': tseries}\n", "dat = pd.DataFrame(data = dat)\n", "\n", "## convert orientation to categorical type\n", "dat['orientation'] = dat['orientation'].astype('category')\n", "\n", "## compute the mean activation per condition\n", "tuning = dat.groupby(by = 'orientation', observed = True).mean()\n", "print(tuning)\n", "\n", "# plot the tuning curve (the first row is the -1 = stimulus-off baseline)\n", "tuning_toplot = tuning.iloc[1:]\n", "tuning_toplot.plot()\n", "xlim = plt.xlim()\n", "plt.plot(xlim, [tuning['trace'].iloc[0], tuning['trace'].iloc[0]], 'r')\n", "plt.legend(('trace', 'baseline'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# find the maximally activating direction\n", "# a simple (if brittle) criterion for tuning: the condition with the largest mean response\n", "\n", "# where does the tuning curve equal its max? (note: the max is taken over\n", "# stimulus-on conditions only, and this can return more than one row!)\n", "temp = tuning_toplot.loc[tuning_toplot['trace'] == np.max(tuning_toplot['trace']), :]\n", "\n", "# report the first match\n", "preferred = temp.index.values[0]\n", "print(f\"Maximal tuning at {preferred} degrees.\")" ] },
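 { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Since the exercise anticipates automating this for every pixel, here is a sketch of how the same computation can be vectorized over the whole image. It reuses `labels`, `codes`, and `data` from the cells above; the names `tuning_maps` and `preferred_map` are our own, and this is one possible approach rather than the canonical one." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-cell" ] }, "outputs": [], "source": [ "# vectorized tuning: for each direction, average every pixel's trace over the\n", "# frames on which that direction was shown (labels == -1 marks stimulus-off frames)\n", "directions = np.unique(codes)\n", "tuning_maps = np.stack(\n", "    [data[:, :, labels.ravel() == d].mean(axis=2) for d in directions], axis=2\n", ")\n", "\n", "# preferred direction per pixel = direction with the largest mean response\n", "preferred_map = directions[np.argmax(tuning_maps, axis=2)]\n", "\n", "plt.imshow(preferred_map, aspect='auto')\n", "plt.colorbar(label='preferred direction (deg)')\n", "plt.show()" ] },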
{ "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "remove-cell" ] }, "outputs": [], "source": [ "import os\n", "os.remove('DirectionTuning_V1_dec.mat')" ] } ], "metadata": { "kernelspec": { "display_name": "book", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "a97c0be00662db3bed8fe12bace3dd68dcba4f9aa04406cb661d7504ba5e85ce" } } }, "nbformat": 4, "nbformat_minor": 2 }