Category: A Resource – Article

Selecting a Neural Network Transfer Function: Classic vs. Current

Neural Network Transfer Functions: Sigmoid, Tanh, and ReLU. Making it or breaking it with neural networks: how to make smart choices. Why We Weren’t Getting Convergence: This last week, while working with a very simple and straightforward XOR neural network, a lot of my students were having convergence problems. The most likely reason? My choice of transfer function. I had given them a very simple network. (Lots of them are still…
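The excerpt is cut off here, but the three transfer functions named in the title are standard. As a minimal sketch (not the network from the post, whose details do not survive the excerpt), here are sigmoid, tanh, and ReLU along with their derivatives, since the derivative magnitudes are what drive, or stall, convergence under backpropagation:

```python
import numpy as np

# Classic and current transfer (activation) functions, with derivatives.
# Sigmoid's derivative tops out at 0.25, tanh's at 1.0, and ReLU's is
# exactly 1 for positive inputs, which is why the choice matters for
# how quickly gradients shrink as they propagate back through layers.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh(x):
    return np.tanh(x)

def tanh_deriv(x):
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    return np.maximum(0.0, x)

def relu_deriv(x):
    return (x > 0).astype(float)

if __name__ == "__main__":
    x = np.linspace(-4, 4, 9)
    print("sigmoid'(x):", np.round(sigmoid_deriv(x), 3))
    print("tanh'(x):   ", np.round(tanh_deriv(x), 3))
    print("relu'(x):   ", np.round(relu_deriv(x), 3))
```

On a tiny XOR network, a sigmoid hidden layer with poorly scaled initial weights can leave the gradients near zero for many epochs, which is one common reason such a network fails to converge.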

Read More

Notational Frenzy

When the Subtle Art of Mathematical Notation Defeats You (and How to Fight Back)   A couple of years ago, I was teaching Time Series and Forecasting for the first time. I didn’t know the subject – at all – but that didn’t bother me. Hey, it was mathematics, right? Anything that’s mathematical, I can eat for lunch, and then want some dessert-equations afterwards. First week, introducing the subject. That went fine. Second week, Simple Exponential Smoothing (SES). That’s simple….
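The excerpt stops mid-thought, but Simple Exponential Smoothing itself is compact enough to state. As a minimal sketch (generic SES, not necessarily the notation used in the course materials), the smoothed value is a weighted average of the current observation and the previous smoothed value:

```python
# Minimal Simple Exponential Smoothing (SES) sketch.
# s_t = alpha * x_t + (1 - alpha) * s_{t-1}, with 0 < alpha <= 1.
# A common convention is to seed the recursion with the first observation.

def ses(series, alpha, s0=None):
    """Return the smoothed series for a list of observations."""
    if not series:
        return []
    s = s0 if s0 is not None else series[0]
    smoothed = []
    for x in series:
        s = alpha * x + (1.0 - alpha) * s
        smoothed.append(s)
    return smoothed

if __name__ == "__main__":
    demand = [10, 12, 11, 13, 12, 14]   # illustrative data, not from the post
    print(ses(demand, alpha=0.3))
```

The one-step-ahead forecast is simply the last smoothed value, which is most of what the notation in SES texts is dressing up.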

Read More

Labor Day Reading and Academic Year Kick-Off

Deep Learning / Machine Learning Reading and Study Guide: Several of you have been asking for guided reading lists. This makes sense. Your Starting Point for Neural Networks, Deep Learning, and Machine Learning: Your study program (reading and code) depends on where you are. Starting out (High-grass country; St. Louis to Alcove Springs): Basic neural networks and deep learning; architecture for common networks, such as CNNs (convolutional neural networks); learning rules and architecture design. Well on the…

Read More

The Statistical Mechanics Underpinnings of Machine Learning

Machine Learning Is Different Now: Actually, machine learning is a continuation of what it always has been, which is deeply rooted in statistical physics (statistical mechanics). It’s just that there’s a culmination of insights that are now a very substantive body of work, with more theoretical rigor behind them than most of us know. A Lesson from Mom: It takes a lot of time to learn a new discipline. This is something that I learned from my…

Read More

2025 and Beyond

Artificial Intelligence and Jobs by the Year 2025: One of my biggest takeaways from the recent (May 2017) NVIDIA GTC (GPU Technology Conference) was less about the technology and more about the near-term jobs impact of artificial intelligence (AI) and robotics. Making smart education and career decisions is crucial, as the emerging combination of AI and robotics will have a huge impact on jobs. Those of you studying artificial intelligence, deep learning, and neural networks will have a stronger career…

Read More

How to Read Karl Friston (in the Original Greek)

Karl Friston, whom we all admire, has written some lovely papers that are both enticing and obscure. Cutting to the chase, what we really want to understand is this equation: In a Research Digest article, Peter Freed writes: … And today, Karl Friston is not explaining [the free energy principle] in a way that makes it usable to your average psychiatrist/psychotherapist on the street – which is frustrating. I am not alone in my confusion, and if you read the…
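The equation the post points to does not carry over into this listing. For orientation, one common statement of the variational free energy at the heart of the free energy principle (the full post may show a different but equivalent form) is:

```latex
% One common statement of Friston's variational free energy; the exact
% form shown in the post may differ.
F = \mathbb{E}_{q(\vartheta)}\!\left[-\ln p(y,\vartheta)\right] - \mathbb{H}\!\left[q(\vartheta)\right]
  = -\ln p(y) + D_{\mathrm{KL}}\!\left[\,q(\vartheta)\,\|\,p(\vartheta \mid y)\,\right]
  \;\ge\; -\ln p(y).
```

In words: the free energy is an upper bound on surprise (negative log evidence), and it is tight exactly when the approximating density q matches the true posterior.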

Read More

Approximate Bayesian Inference

Variational Free Energy I spent some time trying to figure out the derivation for the variational free energy, as expressed in some of Friston’s papers (see citations below). While I made an intuitive justification, I just found this derivation (Kokkinos; see the reference and link below): Other discussions about variational free energy: Whereas maximum a posteriori methods optimize a point estimate of the parameters, in ensemble learning an ensemble is optimized, so that it approximates the entire posterior probability distribution…
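The derivation itself does not survive in this excerpt, so here is the short version of the step that connects the definition to the "ensemble approximates the posterior" remark quoted above (standard variational-Bayes algebra, not necessarily the exact route Kokkinos takes):

```latex
% Variational free energy, decomposed.
F[q] = \mathbb{E}_{q(\vartheta)}\!\left[\ln q(\vartheta) - \ln p(y,\vartheta)\right]
     = \mathbb{E}_{q(\vartheta)}\!\left[\ln q(\vartheta) - \ln p(\vartheta \mid y)\right] - \ln p(y)
     = D_{\mathrm{KL}}\!\left[\,q(\vartheta)\,\|\,p(\vartheta \mid y)\,\right] - \ln p(y).
% Since ln p(y) does not depend on q, minimizing F over the whole
% ensemble q drives q toward the full posterior p(theta | y), rather
% than toward a single MAP point estimate.
```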

Read More

Brain Networks and the Cluster Variation Method: Testing a Scale-Free Model

Surprising Result: Modeling a Simple Scale-Free Brain Network Using the Cluster Variation Method. One of the primary research thrusts that I suggested in my recent paper, The Cluster Variation Method: A Primer for Neuroscientists, was that we could use the 2-D Cluster Variation Method (CVM) to model the distribution of configuration variables in different brain network topologies. Specifically, I was expecting that the h-value (which measures the interaction enthalpy strength between nodes in a 2-D CVM grid) would change in a…
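The excerpt breaks off before the result, but the quantity that h modulates is concrete: the fractions of local configurations (single nodes, nearest-neighbor pairs, triplets) across a grid of on/off nodes. As a rough, illustrative sketch only (the grid geometry and the full set of configuration variables in the CVM papers are more specific than this), here is how one might count like versus unlike nearest-neighbor pairs in a binary grid, which is the kind of statistic an interaction-enthalpy parameter shifts away from its random-mixing value:

```python
import numpy as np

# Rough illustration only: count nearest-neighbor pair configurations
# (like-like vs. unlike) in a binary 2-D grid. The CVM proper uses a
# specific grid geometry and additional configuration variables; this
# just shows the kind of local-pattern statistic being modeled.

def pair_fractions(grid):
    """Return fractions of (0,0), unlike, and (1,1) nearest-neighbor pairs."""
    grid = np.asarray(grid)
    pairs = [
        np.stack([grid[:, :-1], grid[:, 1:]], axis=-1).reshape(-1, 2),  # horizontal
        np.stack([grid[:-1, :], grid[1:, :]], axis=-1).reshape(-1, 2),  # vertical
    ]
    sums = np.concatenate(pairs, axis=0).sum(axis=1)  # 0 -> (0,0), 1 -> unlike, 2 -> (1,1)
    n = len(sums)
    return {k: float(np.sum(sums == k)) / n for k in (0, 1, 2)}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    random_grid = rng.integers(0, 2, size=(16, 16))
    print(pair_fractions(random_grid))  # near 0.25 / 0.50 / 0.25 for a random 50/50 grid
```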

Read More

The Cluster Variation Method: A Primer for Neuroscientists

Single-Parameter Analytic Solution for Modeling Local Pattern Distributions. The cluster variation method (CVM) offers a means for the characterization of both 1-D and 2-D local pattern distributions. The paper referenced at the end of this post provides neuroscientists and BCI researchers with a CVM tutorial that will help them to understand how the CVM statistical thermodynamics formulation can model 1-D and 2-D pattern distributions expressing structural and functional dynamics in the brain. The equilibrium distribution of local patterns, or configuration…
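As a generic statement of the underlying idea (the primer's reduced, single-parameter formulation is a specific instance, and the symbol x_i for the configuration variables is my notation, not necessarily the paper's), the equilibrium distribution of local patterns is the set of configuration-variable values that minimizes a free energy of the familiar statistical-thermodynamic form:

```latex
% Generic form only; in the CVM the entropy term is written over local
% cluster (configuration) variables rather than over single units.
\bar{F} = \bar{E} - \bar{S},
\qquad
\frac{\partial \bar{F}}{\partial x_i} = 0
\ \text{at equilibrium, for each configuration variable } x_i,
\ \text{subject to normalization.}
```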

Read More

Brain-Computer Interfaces, Kullback-Leibler, and Mutual Information: Case Study #1

In the previous blog post, I introduced the Kullback-Leibler divergence as an essential information-theoretic tool for researchers, designers, and practitioners interested not just in Brain-Computer Interfaces (BCIs), but specifically in Brain-Computer Information Interfaces (BCIIs). The notion of Mutual Information (MI) is also fundamental to information theory, and it can be expressed in terms of the Kullback-Leibler divergence. Mutual Information Notation: I(x,y) is the mutual information of two…
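The expression itself does not carry over into this excerpt; the standard way of writing mutual information as a Kullback-Leibler divergence between the joint distribution and the product of the marginals (presumably what the full post shows) is:

```latex
% Mutual information as a KL divergence.
I(X;Y) = D_{\mathrm{KL}}\!\left[\,p(x,y)\,\|\,p(x)\,p(y)\,\right]
       = \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}.
```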

Read More