Neural decoding

Neural decoding is a neuroscience field concerned with the hypothetical reconstruction of sensory and other stimuli from information that has already been encoded and represented in the brain by networks of neurons.^[1] Reconstruction refers to the ability of the researcher to predict what sensory stimuli the subject is receiving based purely on neuron action potentials. Therefore, the main goal of neural decoding is to characterize how the electrical activity of neurons elicit activity and responses in the brain.^[2]

This article specifically refers to neural decoding as it pertains to the mammalian neocortex.

Overview

When looking at a picture, people's brains are constantly making decisions about what object they are looking at, where they need to move their eyes next, and what they find to be the most salient aspects of the input stimulus. As these images hit the back of the retina, these stimuli are converted from varying wavelengths to a series of neural spikes called action potentials. These patterns of action potentials are different for different objects and different colors; we therefore say that the neurons are encoding objects and colors by varying their spike rates or temporal patterns. Now, if someone were to probe the brain by placing electrodes in the primary visual cortex, they may find what appears to be random electrical activity. These neurons are actually firing in response to the lower level features of visual input, possibly the edges of a picture frame. This highlights the crux of the neural decoding hypothesis: that it is possible to reconstruct a stimulus from the response of the ensemble of neurons that represent it. In other words, it is possible to look at spike train data and say that the person or animal being recorded is looking at a red ball.

With the recent breakthrough in large-scale neural recording and decoding technologies, researchers have begun to crack the neural code and already provided the first glimpse into the real-time neural code of memory traces as memory is formed and recalled in the hippocampus, a brain region known to be central for memory formation.^[3]^[4] Neuroscientists have initiated a large-scale brain activity mapping or brain decoding project^[5] to construct brain-wide neural codes.

Encoding to decoding

Implicit about the decoding hypothesis is the assumption that neural spiking in the brain somehow represents stimuli in the external world. The decoding of neural data would be impossible if the neurons were firing randomly: nothing would be represented. This process of decoding neural data forms a loop with neural encoding. First, the organism must be able to perceive a set of stimuli in the world – say a picture of a hat. Seeing the stimuli must result in some internal learning: the encoding stage. After varying the range of stimuli that is presented to the observer, we expect the neurons to adapt to the statistical properties of the signals, encoding those that occur most frequently:^[6] the efficient-coding hypothesis. Now neural decoding is the process of taking these statistical consistencies, a statistical model of the world, and reproducing the stimuli. This may map to the process of thinking and acting, which in turn guide what stimuli we receive, and thus, completing the loop.

In order to build a model of neural spike data, one must both understand how information is originally stored in the brain and how this information is used at a later point in time. This neural coding and decoding loop is a symbiotic relationship and the crux of the brain's learning algorithm. Furthermore, the processes that underlie neural decoding and encoding are very tightly coupled and may lead to varying levels of representative ability.^[7]^[8]

Spatial resolutions

Much of the neural decoding problem depends on the spatial resolution of the data being collected. The number of neurons needed to reconstruct the stimulus with reasonable accuracy depends on the means by which data is collected and the area which is being recorded. For example, rods and cones (which respond to colors of small visual areas) in the retina may require more recordings than simple cells (which respond to orientation of lines) in the primary visual cortex.

Previous recording methods relied on stimulating single neurons over a repeated series of tests in order to generalize this neuron's behavior.^[9] New techniques such as high-density multi-electrode array recordings and multi-photon calcium imaging techniques now make it possible to record from upwards of a few hundred neurons. Even with better recording techniques, the focus of these recordings must be on an area of the brain that is both manageable and qualitatively understood. Many studies look at spike train data gathered from the ganglion cells in the retina, since this area has the benefits of being strictly feedforward, retinotopic, and amenable to current recording granularities. The duration, intensity, and location of the stimulus can be controlled to sample, for example, a particular subset of ganglion cells within a structure of the visual system.^[10] Other studies use spike trains to evaluate the discriminatory ability of non-visual senses such as rat facial whiskers^[11] and the olfactory coding of moth pheromone receptor neurons.^[12]

Even with ever-improving recording techniques, one will always run into the limited sampling problem: given a limited number of recording trials, it is impossible to completely account for the error associated with noisy data obtained from stochastically functioning neurons. (For example, a neuron's electric potential fluctuates around its resting potential due to a constant influx and efflux of sodium and potassium ions.) Therefore, it is not possible to perfectly reconstruct a stimulus from spike data. Luckily, even with noisy data, the stimulus can still be reconstructed within acceptable error bounds.^[13]

Temporal resolutions

Timescales and frequencies of stimuli being presented to the observer are also of importance to decoding the neural code. Quicker timescales and higher frequencies demand faster and more precise responses in neural spike data. In humans, millisecond precision has been observed throughout the visual cortex, the retina,^[14] and the lateral geniculate nucleus. So one would suspect this to be the appropriate measuring frequency. This has been confirmed in studies that quantify the responses of neurons in the lateral geniculate nucleus to white-noise and naturalistic movie stimuli.^[15] At the cellular level, spike-timing-dependent plasticity operates at millisecond timescales.^[16] Therefore models seeking biological relevance should be able to perform at these temporal scales.

Probabilistic decoding

When decoding neural data, arrival times of each spike $t_{1},{\text{ }}t_{2},{\text{ }}...,{\text{ }}t_{n}{\text{ }}={\text{ }}\{t_{i}\}$ , and the probability of seeing a certain stimulus, $P[s(t)]$ may be the extent of the available data. The prior distribution $P[s(t)]$ defines an ensemble of signals, and represents the likelihood of seeing a stimulus in the world based on previous experience. The spike times may also be drawn from a distribution $P[\{t_{i}\}]$ ; however, what we want to know is the probability distribution over a set of stimuli given a series of spike trains $P[s(t)|\{t_{i}\}]$ , which is called the response-conditional ensemble. What remains is the characterization of the neural code by translating stimuli into spikes, $P[\{t_{i}\}|s(t)]$ ; the traditional approach to calculating this probability distribution has been to fix the stimulus and examine the responses of the neuron. Combining everything using Bayes' Rule results in the simplified probabilistic characterization of neural decoding: $P[s(t)|\{t_{i}\}]=P[\{t_{i}\}|s(t)]*(P[s(t)]/P[\{t_{i}\}])$ . An area of active research consists of finding better ways of representing and determining $P[\{t_{i}\}|s(t)]$ .^[17] The following are some such examples.

Spike train number

The simplest coding strategy is the spike train number coding. This method assumes that the spike number is the most important quantification of spike train data. In spike train number coding, each stimulus is represented by a unique firing rate across the sampled neurons. The color red may be signified by 5 total spikes across the entire set of neurons, while the color green may be 10 spikes; each spike is pooled together into an overall count. This is represented by:

$P(r|s)=\prod _{}P(n_{ij}|s)$

where $r=n=$ the number of spikes, $n_{ij}$ is the number of spikes of neuron $i$ at stimulus presentation time $j$ , and s is the stimulus.

Instantaneous rate code

Adding a small temporal component results in the spike timing coding strategy. Here, the main quantity measured is the number of spikes that occur within a predefined window of time T. This method adds another dimension to the previous. This timing code is given by:

$P(r|s)=\prod _{l}\left[\prod _{i,j}v_{i}(t_{ijl}|s)dt\right]exp\left[-\sum _{i}\int _{0}^{T}dtv_{i}(t|s)\right]$

where $t_{ijl}$ is the jth spike on the lth presentation of neuron i, $v_{i}(t|s)$ is the firing rate of neuron i at time t, and 0 to T is the start to stop times of each trial.

Temporal correlation

Temporal correlation code, as the name states, adds correlations between individual spikes. This means that the time between a spike $t_{i}$ and its preceding spike $t_{i-1}$ is included. This is given by:

$P(r|s)=\prod _{l}\left[\prod _{i,j}v_{i}(t_{ijl},\tau (t_{ijl})|s)dt\right]exp\left[-\sum _{i}\int _{0}^{T}dtv_{i}(t,\tau (t)|s)\right]$

where $\tau (t)$ is the time interval between a neurons spike and the one preceding it.

Ising decoder

Another description of neural spike train data uses the Ising model borrowed from the physics of magnetic spins. Because neural spike trains are effectively binarized (either on or off) at small time scales (10 to 20 ms), the Ising model is able to effectively capture the present pairwise correlations,^[18] and is given by:

$P(r|s)={\frac {1}{\mathrm {Z} (s)}}exp\left(\sum _{i}h_{i}(s)r_{i}+{\frac {1}{2}}\sum _{i\neq j}J_{ij}(s)r_{i}r_{j}\right)$

where $r=(r_{1},r_{2},...,r_{n})^{T}$ is the set of binary responses of neuron i, $h_{i}$ is the external fields function, $J_{ij}$ is the pairwise couplings function, and $\mathrm {Z} (s)$ is the partition function

Agent-based decoding

In addition to the probabilistic approach, agent-based models exist that capture the spatial dynamics of the neural system under scrutiny. One such model is hierarchical temporal memory, which is a machine learning framework that organizes the visual perception problem into a hierarchy of interacting nodes (neurons). The connections between nodes on the same level and lower levels are termed synapses, and their interactions are subsequently learning. Synapse strengths modulate learning and are altered based on the temporal and spatial firing of nodes in response to input patterns.^[19]^[20]

While it is possible to transform the firing rates of these modeled neurons into the probabilistic and mathematical frameworks described above, agent-based models provide the ability to observe the behavior of the entire population of modeled neurons. Researchers can circumvent the limitations implicit with lab-based recording techniques. Because this approach does rely on modeling biological systems, error arises in the assumptions made by the researcher and in the data used in parameter estimation.

Applicability

The advancement in our understanding of neural decoding benefits the development of brain-machine interfaces, prosthetics^[21] and the understanding of neurological disorders such as epilepsy.^[22]