Category: Free Energy

A “First Principles” Approach to General AI

What We Need to Take the Next Tiny, Incremental Little Step: The “next big thing” is likely to be the next small thing – a tiny step, an incremental shift in perspective. However, a perspective shift is all that we need in order to make some real advances towards general artificial intelligence (GAI). In the second chapter of the ongoing book, I share the following figure (and sorry, the chapter itself is not released yet): Now, we’ve actually been…

A “Hidden Layer” Guiding Principle – What We Minimally Need

Putting It Into Practice: If we’re going to move our neural network-type architectures into a new, more powerful realm of AI capability, we need to bust out of the “sausage-making” mentality that has governed them thus far, as we discussed last week. To do this, we need to give our hidden layer(s) something to do besides respond to input stimulus. It’s very realistic that this “something” should be free energy minimization, because that’s one of the strongest principles in the…

How Getting to a Free Energy Bottom Helps Us Get to the Top

Free Energy Minimization Gives an AI Engine Something Useful to Do: Cutting to the chase: we need free energy minimization in a computational engine, or AI system, because it gives the system something to do besides being a sausage-making machine, as I described in yesterday’s blog on “What’s Next for AI.” Right now, deep learning systems are constrained to be simple input/output devices. We force-feed them with stimulus at one end, and they poop out (excuse me, “pop out”)…

Machine Learning: Multistage Boost Process

Three Stages to Orbital Altitude in Machine Learning: Several years ago, Regina Dugan (then Director of DARPA) gave a talk in which she showed a clip of epic NASA launch fails. Not just one, but many fails. The theme was that we had to risk failure in order to succeed with innovation. This YouTube vid of rocket launch failures isn’t the exact clip that she showed (the “action” doesn’t kick in for about a minute), but it’s pretty close. For…

Neg-Log-Sum-Exponent-Neg-Energy – That’s the Easy Part!

The Surprising (Hidden) “Gotcha” in This Energy Equation: A couple of days ago, I was doing one of my regular weekly online “Synch” sessions with my Deep Learning students. In a sort of “Beware, here there be dragons!” moment, I showed them this energy equation from the Hinton et al. (2012) Nature review paper on acoustic speech modeling: One of my students pointed out, “That equation looks kind of simple.” Well, he’s right. And I kind of bungled the answer,…
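
The equation itself did not survive in this excerpt, so what follows is only a minimal sketch of the quantity that the post title spells out: the negative log of the sum of exponentiated negative energies. It assumes a small, explicitly enumerated set of states and unit temperature; the function name `free_energy` and the max-shift used for numerical stability are illustrative choices, not taken from the post or from the cited paper.

```python
import math


def free_energy(energies):
    """Return -log(sum_i exp(-E_i)) for a finite list of state energies.

    Assumptions (not from the original post): unit temperature and a
    small, explicitly enumerated state space.
    """
    neg = [-e for e in energies]
    # Shift by the largest exponent so that exp() cannot overflow.
    m = max(neg)
    return -(m + math.log(sum(math.exp(x - m) for x in neg)))


# A single state: the free energy is just that state's energy.
single = free_energy([3.0])

# Two equal-energy states: the energy minus log(2), the entropy contribution.
pair = free_energy([0.0, 0.0])
```

Writing the formula down really is the easy part; for a realistic model the sum runs over exponentially many configurations, and this brute-force enumeration stops being feasible.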

Seven Essential Machine Learning Equations: A Cribsheet (Really, the Précis)

Making Machine Learning As Simple As Possible: Albert Einstein is credited with saying, “Everything should be made as simple as possible, but not simpler.” Machine learning is not simple. In fact, once you get beyond the simple “building blocks” approach of stacking things higher and deeper (sometimes made all too easy with advanced deep learning packages), you are in the midst of some complex stuff. However, it does not need to be more complex than it has to be. …

A Tale of Two Probabilities

Probabilities: Statistical Mechanics and Bayesian: Machine learning fuses several different lines of thought, including statistical mechanics, Bayesian probability theory, and neural networks. There are two different ways of thinking about probability in machine learning; one comes from statistical mechanics, and the other from Bayesian logic. Both are important. They are also very different. While these two different ways of thinking about probability are usually very separate, they come together in some of the more advanced machine learning topics, such…

Seven Statistical Mechanics / Bayesian Equations That You Need to Know

Essential Statistical Mechanics for Deep Learning: If you’re self-studying machine learning, and feel that statistical mechanics is suddenly showing up more than it used to, you’re not alone. Within the past couple of years, statistical mechanics (statistical thermodynamics) has become a more integral topic, along with the Kullback-Leibler divergence measure and several inference methods for machine learning, including the expectation maximization (EM) algorithm along with variational Bayes. Statistical mechanics has always played a strong role in machine…

How to Read Karl Friston (in the Original Greek)

Karl Friston, whom we all admire, has written some lovely papers that are both enticing and obscure. Cutting to the chase, what we really want to understand is this equation: In a Research Digest article, Peter Freed writes: … And today, Karl Friston is not explaining [the free energy principle] in a way that makes it usable to your average psychiatrist/psychotherapist on the street – which is frustrating. I am not alone in my confusion, and if you read the…

Approximate Bayesian Inference

Variational Free Energy: I spent some time trying to figure out the derivation for the variational free energy, as expressed in some of Friston’s papers (see citations below). While I made an intuitive justification, I just found this derivation (Kokkinos; see the reference and link below): Other discussions about variational free energy: Whereas maximum a posteriori methods optimize a point estimate of the parameters, in ensemble learning an ensemble is optimized, so that it approximates the entire posterior probability distribution…
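
The derivation referred to above is not reproduced in this excerpt. For orientation only, here is the standard textbook decomposition of the variational free energy; it is not necessarily the form or notation used by Kokkinos or Friston. The symbols are assumptions: x is the observed data, z the hidden variables (or parameters), p(x, z) the model, and q(z) the approximating ensemble.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(z)}\big[\log q(z) - \log p(x, z)\big] \\
     &= \mathbb{E}_{q(z)}\big[\log q(z) - \log p(z \mid x)\big] - \log p(x) \\
     &= D_{\mathrm{KL}}\big(q(z)\,\|\,p(z \mid x)\big) - \log p(x)
      \;\ge\; -\log p(x)
\end{aligned}
```

Because the KL term is non-negative and vanishes only when q(z) equals the true posterior, minimizing F over q pulls the whole ensemble toward p(z | x), rather than toward a single point estimate as in maximum a posteriori methods.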