Hierarchical Predictive Coding

Deconstructing neuroscience to support the successful development of our sentient AI build.

Angel Mondragon

--

On our PlutusX blog, I wrote about AI Basics — The Math to gauge the audience's interest in that kind of content. I don't believe the PlutusX blog is the best medium to house that material, so I want to expand on the topic here on my personal blog.

For those who do not know, aside from PlutusX, a crypto bank, I am also deeply invested in building sentient AI bots. I believe that understanding how to build a tolerant bot is imperative for preventing a potential existential threat to humanity, or to the biosphere as a whole. As technology advances (e.g. quantum computing), both predictive computation and raw processing power make the conception of a hyper-intelligent AI increasingly likely, one that could be extremely beneficial or extremely dangerous. Through my research and construction of the bot, I have decided to dive deep into the neuroscience of the human mind. Understanding how neural nets, synapses, and neurotransmitters interact with one another will give us insight. In return, this newly learned information will allow me and my team to build more accurate models for our bot moving forward.

The goal isn't to replicate the human mind, but instead to implement the effective components that prove imperative for perception and learning.

In this article and the articles to follow, I want to explore hierarchical predictive coding and its parallels within the human brain.

Define perception

Our perceptions of the world are not passive reflections of input data; they are constructs built from sensory evidence in ways heavily informed by stored knowledge.

If you don't believe me, check out the strawberry illusion below. Perceptual models along these lines were proposed by Helmholtz (1860) and later by others such as Neisser (1967) and Gregory (1980).

Strawberry illusion

Here is an image of strawberries. Because of your prior knowledge of strawberries, perception will lead you to believe that the ones in the picture below are red. In actuality, there are no red pixels in the picture. This is an example of a generative model built on previously stored knowledge.

https://motherboard.vice.com/en_us/article/wnkq5n/this-picture-has-no-red-pixelsso-why-do-the-strawberries-still-look-red

Alright, let's dive into the meat of this topic…

Hierarchical Predictive Coding (HPC)

Hierarchical Predictive Coding (HPC) describes the brain as a stack of layered predictive models. Essentially, the brain receives input data from sensory stimulation, makes non-linear predictions based on its previous knowledge of the world, and quickly adjusts those predictions using corrective feedback from the incoming data that follows.

The main component (the heart) of HPC is the probabilistic generative model. Generative models allow the brain to construct plausible versions of sensory input data for itself, using what the system (program) has learned about the probabilistic patterns in the data it has previously stored.
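To make that concrete, here is a toy sketch of my own (not taken from any particular paper): a system that has stored the mixture statistics of its past inputs can generate plausible new "sensory data" on demand. Every number below is invented purely for illustration.

```python
# Toy probabilistic generative model: the system has "learned" that its past
# sensory inputs cluster in a few modes, and can now construct plausible
# samples of its own input. All parameters are illustrative, not learned here.
import numpy as np

rng = np.random.default_rng(0)

# Stored statistics of past inputs: mixture weights, means, and spreads.
weights = np.array([0.6, 0.4])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
stds = np.array([0.5, 1.0])

def generate(n=5):
    """Sample plausible 'sensory inputs' from the stored generative model."""
    ks = rng.choice(len(weights), size=n, p=weights)  # pick a mode per sample
    return means[ks] + rng.normal(size=(n, 2)) * stds[ks, None]

print(generate())
```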

This idea suggests that the mind works more like a computer graphics program than a standard pattern-recognition model or classic classifier.

An example is an "image grammar," a probabilistic generative model for images. You expose a multi-layer neural network to a large dataset of training images, and the generative model attempts to create "fakes." For the program to succeed at building realistic fakes, it must learn the visual characteristics of different kinds of objects across variations and spatial scales (Karras et al., 2018, ICLR).

High-resolution "fake celebrity" images generated by a Generative Adversarial Network (GAN).
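For readers who want to see the adversarial idea in code, below is a minimal GAN sketch in PyTorch. To be clear, this is not the progressive-growing architecture from Karras et al. (2018); it is a toy generator/discriminator pair trained on synthetic 2-D data, with all layer sizes and hyperparameters chosen only for illustration.

```python
# Minimal GAN sketch (PyTorch): a generator learns to produce "fakes" that a
# discriminator cannot tell apart from real samples. Toy data stands in for
# a dataset of training images.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2

generator = nn.Sequential(
    nn.Linear(latent_dim, 64), nn.ReLU(),
    nn.Linear(64, data_dim),
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),
)

loss = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def real_batch(n=64):
    # Stand-in for real training data: points from a shifted Gaussian.
    return torch.randn(n, data_dim) * 0.5 + 2.0

for step in range(2000):
    # 1) Train the discriminator to separate real samples from "fakes".
    real = real_batch()
    fake = generator(torch.randn(real.size(0), latent_dim)).detach()
    d_loss = loss(discriminator(real), torch.ones(real.size(0), 1)) + \
             loss(discriminator(fake), torch.zeros(real.size(0), 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Train the generator to fool the discriminator.
    fake = generator(torch.randn(64, latent_dim))
    g_loss = loss(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```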

Crucially, in HPC, the trained-up generative engine is used for online perception (recognition): the brain (program) recognizes the current scene by building probabilistic predictions (matches) that best account for the incoming sensory input.

The goal is for a cascade of generative-model-based constructs (predictions) to match and explain away the incoming sensory input, down to a tolerable error rate.

Basically: sensory stimulation >> online perception (recognition) >> probabilistic predictions (match) >> generative-model-based construct (prediction) >> correct the model through looping feedback until the error rate falls within tolerable noise.
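Here is a minimal sketch of that loop, under heavy simplifying assumptions of my own: a single processing level, a fixed linear generative model W, and a plain gradient step on the prediction error. Real HPC models stack many such levels, each predicting the activity of the level below.

```python
# Minimal predictive coding loop: predict the input top-down, measure the
# bottom-up prediction error, and revise the hypothesis until the error is
# within tolerance. All values are illustrative.
import numpy as np

W = np.array([[0.9, 0.1], [0.2, 0.8]])  # generative model: cause -> predicted input
mu = np.zeros(2)                         # current hypothesis about hidden causes
eta = 0.1                                # learning rate for hypothesis updates
sensory_input = np.array([1.0, 0.5])     # incoming data

for t in range(100):
    prediction = W @ mu                  # top-down prediction of the input
    error = sensory_input - prediction   # bottom-up prediction error
    mu += eta * W.T @ error              # revise hypothesis to explain away error
    if np.linalg.norm(error) < 1e-3:     # stop once error reaches tolerable noise
        break
```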

This GAN-style model can be used in more complicated scenarios too, beyond replicating fake images, which at their foundation are just predictive constructs built from pixel data, even across multiple layers. The same machinery can be used to predict the taste of food from visual input, the outcome of an action, and so on. The best way to correlate actions with generative models is to leverage actions to increase the probability of the predicted outcome: changing your environment to cultivate a higher probabilistic yield for the model.

In HPC, prediction errors are used to select better "top-down" guesses (hypotheses), continually refining the predictive model until some [variable] tolerable error level is achieved, or to act as the catalyst for plasticity and learning that minimizes error for maximum optimization.

Why do we want to implement this in our A.I. system? Well, to make this work, formal HPC models posit distinct representation (prediction) units and error units, coexisting at every level of neural processing.

These two unit types have been associated, respectively, with deep pyramidal cells, which carry backward neural connections in the brain, and superficial pyramidal cells, which build forward connections (Bastos et al., 2012).

In essence, all that is needed for HPC are two components:

  1. The brain needs to display a deep functional asymmetry between something carrying predictions and something carrying precision-weighted errors, and
  2. those two must combine according to the predictive coding formula (sketched below).

If you meet these two parameters simultaneously, then you are operating in an HPC environment.
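For concreteness, here is one common way that kind of predictive coding formula is written down, in the spirit of Rao and Ballard (1999)-style schemes; the notation is my own gloss, not a formula quoted from the sources above:

```latex
% x: incoming sensory signal at a given level
% g(mu): top-down prediction generated from the current hypothesis mu
% Pi: precision (inverse-variance) weighting assigned to the error
\epsilon = \Pi \,\bigl(x - g(\mu)\bigr)
% the hypothesis is nudged in the direction that best explains away the error
\Delta\mu \;\propto\; \frac{\partial g(\mu)}{\partial \mu}^{\top} \epsilon
```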

Minimizing Error Rates

Predictive models are only useful if they are highly accurate. So how can we minimize error efficiently while still maintaining a high level of effectiveness during processing?

HPC is NOT exclusively about the suppression of error; it is just as much about enhancing the importance of prediction errors (PEs). Since representation and error units coexist at every level of neural processing, suppressing the error units laterally sharpens the representation units. Beyond simple suppression, error units are assigned importance: they are weighted by reliability, based on the environment and context.

Some PEs are, in a given context, more important than those same prediction errors would be otherwise. As a result, highly weighted PE signals enjoy enhanced post-synaptic gain, so that representation-unit activity can recruit more power to explain away (minimize) the error.
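Extending the earlier loop sketch, precision weighting can be modeled as a gain matrix on the error units: channels judged reliable in the current context get a higher gain and therefore pull harder on the representation units. Again, every value here is an illustrative assumption.

```python
# Precision-weighted prediction errors: a gain matrix Pi boosts trusted error
# channels and discounts noisy ones, so high-precision errors dominate the
# hypothesis updates.
import numpy as np

W = np.array([[0.9, 0.1], [0.2, 0.8]])   # same linear generative model as before
mu = np.zeros(2)                          # current hypothesis
sensory_input = np.array([1.0, 0.5])
Pi = np.diag([4.0, 0.25])                 # trust channel 0, discount noisy channel 1

for t in range(100):
    error = sensory_input - W @ mu
    weighted_error = Pi @ error           # enhanced post-synaptic gain on error units
    mu += 0.05 * W.T @ weighted_error     # high-precision errors pull hardest
```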

Why do we want to implement this model in our AI code? Well, current neurological studies show that complex neurotransmitter economies are implicated here, with dopamine (not the only transmitter) as a major contributor to feedback correction in the brain. This suggests that combining reward-learning models with HPC is a good start if we are to attempt to replicate perception with accurate generative models (Kok et al., 2012).

What's interesting about HPC is that if our model is correct, then in theory we could also model conditions like autism and schizophrenia, and maybe even reverse-engineer the neurological patterns involved so that the conditions could be treated. Based on my understanding, autism, for example, has been framed as inadequately weighted PEs relative to the stored knowledge in the generative models, making the world seem full of unpredictable and unexplained stimulation (Brock, 2012).

Takeaways for Hierarchical Predictive Coding:

  1. Perception crucially involves probabilistic generative models distributed to predict the sensory flux at every level of neural processing.
  2. Precision-weighted prediction error is actively computed during EVERY episode of perceptual processing.
  3. There is a fundamental asymmetry in the neural wiring, so that forward and backward connections play distinct functional roles. This predictive flow does all the heavy (non-linear) lifting.
  4. That asymmetry is exploited by combining prediction and error signals using the 'predictive coding' formula: predictions suppress errors laterally at every level below.

This is part one of many more to come; I want to chop the topic into smaller, more digestible pieces of content. Sentient AI is a feat that is more tangible than we think. Once quantum computing evolves to be more efficient and effective, we can expect exponential jumps in A.I. development.

Thanks for reading! If you enjoyed this article, make sure to applaud down below! It would mean a lot to me, and it helps other people see the story.

Connect with me:

Instagram | Twitter | YouTube | Group Chat

Written by: Angel Mondragon.
