How to build a deep neural network architecture?

Today, deep learning is one of the most promising approaches in machine learning, especially for image recognition and other unstructured data. A deep neural network is basically a neural network with several hidden layers (more than two, by the definition used here). Hidden layers are the layers between the input layer and the output layer; their role is to learn features from the input data. Increasing the number of hidden layers helps the network learn more complex features.

Building a state-of-the-art deep neural network depends first on the type of problem to be solved. An image classification problem doesn't require the same architecture as an anomaly detection or forecasting problem. In image classification, the most commonly used layers are convolutional layers, as they are well suited to image input. In anomaly detection, an encoder-decoder architecture is often preferable: the network learns to compress and reconstruct its input, and any input that it reconstructs poorly, because it doesn't follow the general pattern, is flagged as an anomaly.

One of the most frequently asked questions about building a deep neural network is how deep it should be at the beginning of training. The ideal approach is to start with the smallest architecture possible, meaning the fewest layers, and then increase the number of layers until performance on validation data stops improving.

Another frequently asked question is how to select activation functions. An activation function plays a crucial role in a deep learning model because it introduces non-linearity, which is what allows the network to approximate non-linear relationships in the data. To keep it simple, the most popular activation function for hidden layers is ReLU, which gives good results in most cases. For the output layer, the choice depends on the type of problem being solved. For example, for a multi-class classification problem, the Softmax activation function should be used, as it converts the network's raw outputs into probabilities. It is also worth trying other activation functions from time to time (Tanh, Leaky ReLU, etc.) to see whether they improve performance and accuracy.

Also note that every type of deep neural network architecture has its specific use case. For example, a generative adversarial network (GAN) is a type of architecture that can be used to generate data. It can also be used for anomaly detection, but it is not designed for image classification and is rarely the right choice for it. So, think of all the possible neural network architectures as a toolbox and select the most appropriate one for the problem you are trying to solve.
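
To make the advice on depth and activation functions more concrete, here is a minimal sketch in plain Python. It is not a training recipe: the weights are random and untrained, and the function names (init_network, forward) and the layer sizes are assumptions made up for this illustration. It only shows the shape of the idea: ReLU after each hidden layer, Softmax at the output, and depth as a parameter that you can grow from a small starting point.

```python
import math
import random


def relu(values):
    """ReLU for hidden layers: negative values become 0."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_network(n_inputs, hidden_sizes, n_classes, seed=0):
    """Create random (untrained) weights: input -> hidden layers -> output."""
    rng = random.Random(seed)
    sizes = [n_inputs] + list(hidden_sizes) + [n_classes]
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Apply ReLU after every hidden layer and Softmax after the last one."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        scores = [sum(w * a for w, a in zip(row, activations)) + b
                  for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = softmax(scores) if is_output else relu(scores)
    return activations


# Hypothetical sizes: 4 input features, 3 classes.
sample = [0.2, -1.0, 0.5, 0.8]

# Start with the smallest architecture (one hidden layer) ...
small = init_network(n_inputs=4, hidden_sizes=[8], n_classes=3)
# ... and only add hidden layers if the small one is not good enough.
deeper = init_network(n_inputs=4, hidden_sizes=[8, 8, 8], n_classes=3)

probs_small = forward(small, sample)
probs_deeper = forward(deeper, sample)
print(probs_small, sum(probs_small))
```

In practice you would use a deep learning framework and train each candidate, comparing them on held-out data before deciding whether the extra layers are worth it. The point here is only that the Softmax output can be read as one probability per class, whatever the depth.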
