Cross-entropy is a fundamental concept in machine learning that measures the difference between predicted and true probability distributions. When a model makes accurate predictions, the cross-entropy loss is low. When predictions are poor, the loss grows sharply, following the curve of the negative log function.
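To make that curve concrete, here is a small sketch that evaluates the loss for a true label of 1 at a few predicted probabilities; the specific probability values are illustrative choices:

```python
import math

# For a true label of 1, binary cross-entropy reduces to -log(p),
# where p is the predicted probability of the positive class.
# The loss is near zero for confident correct predictions and
# grows sharply as p approaches 0.
for p in [0.99, 0.9, 0.5, 0.1, 0.01]:
    print(f"p = {p:.2f}  loss = {-math.log(p):.2f}")
```

Note how the loss roughly doubles going from p = 0.1 to p = 0.01: each factor-of-ten drop in the predicted probability adds a constant amount of loss.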
The binary cross-entropy formula, loss = −[y log(p) + (1 − y) log(1 − p)], calculates the loss for each prediction. When the true label is 1 and we predict 0.9, the loss is low at about 0.11. But when we predict 0.1 for a true label of 1, the loss jumps to about 2.30. This steep penalty encourages the model to make confident, correct predictions.
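The two worked values above can be reproduced with a minimal sketch of the binary formula; the function name and the epsilon clipping are illustrative choices, not part of the original:

```python
import math

def binary_cross_entropy(y_true: float, y_pred: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy for a single prediction.

    y_true is 0 or 1; y_pred is the predicted probability of class 1.
    eps clips predictions away from 0 and 1 to avoid log(0).
    """
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))

print(f"{binary_cross_entropy(1, 0.9):.2f}")  # 0.11
print(f"{binary_cross_entropy(1, 0.1):.2f}")  # 2.30
```

Clipping with a small epsilon is a common practical safeguard: without it, a prediction of exactly 0 or 1 on the wrong class would produce an infinite loss.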
Categorical cross-entropy extends the concept to multiple classes. For a three-class problem, if the true class is A and our model predicts probabilities of 0.7 for A, 0.2 for B, and 0.1 for C, the loss is the negative log of 0.7, which is about 0.36. The model is penalized based only on how much probability it assigns to the correct class.
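The three-class example works out as follows; this is a sketch of the general formula −Σᵢ yᵢ log(pᵢ), and the function name is an illustrative choice:

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Categorical cross-entropy: -sum_i y_i * log(p_i).

    y_true is a one-hot vector; y_pred holds predicted probabilities.
    With a one-hot target, this reduces to -log of the probability
    assigned to the correct class.
    """
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# True class is A; predictions are [0.7, 0.2, 0.1] for classes [A, B, C].
loss = categorical_cross_entropy([1, 0, 0], [0.7, 0.2, 0.1])
print(f"{loss:.2f}")  # 0.36
```

Because the target is one-hot, the probabilities assigned to B and C (0.2 and 0.1) drop out of the sum; they only matter indirectly, since probability given to the wrong classes is probability taken away from the right one.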
Cross-entropy serves as the guiding signal during training. The model makes predictions, cross-entropy calculates the loss, and backpropagation adjusts the weights to minimize this loss. As training progresses, the loss typically decreases, showing the model is learning to make better predictions.
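The predict-compute-adjust loop can be sketched end to end with logistic regression trained by gradient descent; the toy dataset, learning rate, and epoch count below are illustrative assumptions, not from the original:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy binary dataset: (feature, label) pairs.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w, b = 0.0, 0.0   # model parameters
lr = 0.5          # learning rate

for epoch in range(100):
    total_loss = 0.0
    dw = db = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)  # forward pass: predicted probability
        total_loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        dw += (p - y) * x       # gradient of the loss w.r.t. w
        db += (p - y)           # gradient of the loss w.r.t. b
    # Gradient descent step: move weights to reduce the loss.
    w -= lr * dw / len(data)
    b -= lr * db / len(data)

print(f"final mean loss: {total_loss / len(data):.4f}")
```

At initialization the mean loss is log(2) ≈ 0.69 (the model predicts 0.5 everywhere); after training it falls well below that, which is the decreasing loss curve the paragraph describes.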
To summarize: Cross-entropy is a powerful loss function that measures how well predicted probabilities match true labels. It comes in binary and categorical forms, heavily penalizes confident wrong predictions, and provides the mathematical foundation for training classification models through gradient descent.