A Boltzmann Machine is a stochastic recurrent neural network invented by Geoffrey Hinton and Terry Sejnowski in 1985. It consists of binary units, or neurons, with weighted connections between them. These networks learn probability distributions over their inputs. The architecture includes visible units, which represent the data we observe, and hidden units, which capture dependencies between the visible units. Every unit is connected to every other unit, making the network fully connected.
Boltzmann Machines are energy-based models: each joint state of the network has an associated energy value. The energy is the negative sum, over all connected pairs of units, of the product of their states and the connection weight, minus the sum of each unit's bias multiplied by its state. The network tends to evolve toward states with lower energy. This energy landscape has multiple local minima, which represent stable patterns the network can learn. The probability of the network being in a particular state is given by the Boltzmann distribution, which assigns higher probability to lower-energy states.
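Written out, with binary states s_i, symmetric weights w_ij, and biases b_i, the energy and the resulting Boltzmann distribution take the standard form:

```latex
E(\mathbf{s}) = -\sum_{i<j} w_{ij}\, s_i s_j \;-\; \sum_i b_i s_i,
\qquad
P(\mathbf{s}) = \frac{e^{-E(\mathbf{s})}}{\sum_{\mathbf{s}'} e^{-E(\mathbf{s}')}}
```

The denominator (the partition function) sums over every possible state of the network, which is what makes exact probabilities intractable for large networks.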
Boltzmann Machines learn by adjusting connection weights to maximize the likelihood of the training data. The learning rule compares correlations between units in two phases: the positive (data) phase, where visible units are clamped to training examples, and the negative (model) phase, where the network runs freely. The weight update is proportional to the difference between these correlations. However, training Boltzmann Machines presents significant challenges. The sampling required for the model phase is computationally expensive, because it means running Markov Chain Monte Carlo methods until the network reaches thermal equilibrium. This leads to slow convergence and makes these models difficult to train in practice.
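In symbols, the update for the weight between units i and j is:

```latex
\Delta w_{ij} = \varepsilon \left( \langle s_i s_j \rangle_{\text{data}} - \langle s_i s_j \rangle_{\text{model}} \right)
```

where \varepsilon is the learning rate and the angle brackets denote the average correlation of the two units in the clamped and free-running phases, respectively.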
A Restricted Boltzmann Machine, or RBM, is a simplified variant of the Boltzmann Machine with a bipartite structure. Unlike the fully connected Boltzmann Machine, RBMs have no connections between units in the same layer. Connections only exist between visible and hidden units. This restriction makes RBMs much more efficient to train using contrastive divergence, a method developed by Geoffrey Hinton. RBMs can be stacked to form Deep Belief Networks, which were among the first successful deep learning architectures. RBMs have found applications in dimensionality reduction, feature learning, collaborative filtering for recommendation systems, and as pre-training components for deep neural networks.
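To make the training procedure concrete, here is a minimal sketch of a single contrastive divergence (CD-1) update for a binary RBM, written with NumPy. The function name cd1_update and the variable names (W, b_vis, b_hid, v0) are illustrative choices for this sketch, not part of any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_vis, b_hid, v0, lr=0.1):
    """One CD-1 step for a binary RBM on a batch of visible vectors v0."""
    # Positive phase: hidden activations driven by the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)

    # Negative phase: one step of Gibbs sampling (reconstruct v, then h).
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_hid)

    # Contrastive divergence: data correlations minus reconstruction correlations.
    batch = v0.shape[0]
    W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / batch
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_vis, b_hid

# Toy usage: 6 visible units, 3 hidden units, a batch of 4 random binary vectors.
n_vis, n_hid = 6, 3
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)
v0 = (rng.random((4, n_vis)) < 0.5).astype(float)
W, b_vis, b_hid = cd1_update(W, b_vis, b_hid, v0)
```

Using the hidden probabilities p_h1 rather than sampled binary states in the negative statistics is a common choice that reduces sampling noise in the gradient estimate.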
To summarize what we've learned about Boltzmann Machines: They are stochastic recurrent neural networks invented by Geoffrey Hinton and Terry Sejnowski that learn probability distributions over their inputs. As energy-based models, they associate each network state with an energy value, with lower energy states having higher probability according to the Boltzmann distribution. Learning in these networks involves comparing correlations between units during data and model phases, though this process is computationally expensive. Restricted Boltzmann Machines simplify the architecture by eliminating connections within the same layer, making them more practical to train. RBMs played a crucial role in the early development of deep learning as building blocks for Deep Belief Networks and remain relevant in various applications including generative models, feature learning, and dimensionality reduction.