
2. LeNet5 Model

1. Introduction to LeNet5

LeNet-5 is a pioneering convolutional neural network (CNN) architecture designed by Yann LeCun and his colleagues, building on earlier LeNet variants from the late 1980s and published in its final form in 1998. It was developed specifically for handwritten digit recognition tasks, such as recognizing digits in the MNIST dataset. LeNet-5 played a crucial role in demonstrating the effectiveness of CNNs for image recognition and laid the groundwork for the deep learning architectures used in computer vision today.

2. Components of LeNet-5

  1. Architecture Overview

    LeNet-5 consists of seven layers, including convolutional layers, subsampling (pooling) layers, and fully connected layers. Here’s a detailed breakdown of its components:

    (Figure: the LeNet-5 architecture diagram)

    • Input Layer: Accepts grayscale images of size 32 × 32 pixels.

    • Convolutional Layers: LeNet-5 has two convolutional layers:

      • First Convolutional Layer: Applies six convolutional filters of size 5 × 5.
      • Second Convolutional Layer: Applies sixteen convolutional filters of size 5 × 5.
      • Mathematical Operation: The convolution operation computes the dot product between the input image and the convolutional filters, followed by an activation function (commonly a sigmoid or tanh in LeNet-5).
    • Subsampling (Pooling) Layers: Following each convolutional layer, LeNet-5 uses average pooling layers:

      • First Pooling Layer: Performs average pooling with a 2 × 2 window and a stride of 2.
      • Second Pooling Layer: Similar average pooling with a 2 × 2 window and a stride of 2.
      • Mathematical Operation: Average pooling reduces the spatial dimensions of the feature maps by taking the average value within each pooling window.
    • Fully Connected Layers: LeNet-5 includes three fully connected layers:

      • First Fully Connected Layer: 120 neurons with a sigmoid activation function.
      • Second Fully Connected Layer: 84 neurons with a sigmoid activation function.
      • Output Layer: Typically consists of 10 neurons (corresponding to 10 classes in MNIST), with a softmax activation function to output class probabilities.
  2. Mathematical Operations in LeNet-5

    • Convolution Operation: Convolutional layers apply filters to the input images to extract spatial hierarchies of features: $S(i, j) = \sum_{m} \sum_{n} I(i + m,\, j + n)\, K(m, n)$, where $I$ is the input image, $K$ is the filter (kernel), and $(i, j)$ are spatial coordinates.

    • Activation Function: LeNet-5 uses sigmoid or tanh activation functions after convolutional and fully connected layers to introduce non-linearity.

    • Average Pooling: Average pooling reduces the spatial dimensions of feature maps by computing the average value within each pooling window: $P(i, j) = \frac{1}{k^2} \sum_{m=0}^{k-1} \sum_{n=0}^{k-1} A(ki + m,\, kj + n)$, where $A$ is the input feature map and $k$ is the pooling window size (here $k = 2$).

    • Fully Connected Layers: Each neuron in the fully connected layers computes a weighted sum of its inputs, followed by a bias term and activation function: $\mathbf{y} = f(W\mathbf{x} + \mathbf{b})$, where $W$ is the weight matrix, $\mathbf{x}$ is the input vector, and $\mathbf{b}$ is the bias vector.
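The layer-by-layer structure described above can be sketched in PyTorch. This is a minimal sketch, not the original implementation: layer sizes follow the classic LeNet-5 description (32 × 32 input, 5 × 5 convolutions, 2 × 2 average pooling, fully connected layers of 120, 84, and 10 units), and tanh is used as the activation throughout.

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    """Sketch of LeNet-5: two conv + avg-pool stages, then three fully connected layers."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),         # 1x32x32 -> 6x28x28
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2, stride=2),  # -> 6x14x14
            nn.Conv2d(6, 16, kernel_size=5),        # -> 16x10x10
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2, stride=2),  # -> 16x5x5
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120),
            nn.Tanh(),
            nn.Linear(120, 84),
            nn.Tanh(),
            nn.Linear(84, num_classes),  # logits; softmax is applied by the loss
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.flatten(x, 1)  # 16x5x5 -> 400-dim vector per sample
        return self.classifier(x)

model = LeNet5()
out = model(torch.randn(1, 1, 32, 32))  # one grayscale 32x32 image
print(out.shape)  # torch.Size([1, 10])
```

Note that, as in most modern reimplementations, the final softmax is folded into the loss function (e.g. `nn.CrossEntropyLoss`), so the network itself outputs raw logits.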
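The three mathematical operations above can be demonstrated directly in NumPy. This is an illustrative sketch on a small random input; the function names (`conv2d`, `avg_pool`, `dense`) and the tiny array sizes are chosen for clarity, not taken from any library.

```python
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid cross-correlation: S(i, j) = sum_m sum_n I(i+m, j+n) * K(m, n)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def avg_pool(x: np.ndarray, k: int = 2) -> np.ndarray:
    """Average pooling with a k x k window and stride k."""
    h, w = x.shape[0] // k, x.shape[1] // k
    return x[:h * k, :w * k].reshape(h, k, w, k).mean(axis=(1, 3))

def dense(x: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Fully connected layer: y = tanh(Wx + b)."""
    return np.tanh(W @ x + b)

rng = np.random.default_rng(0)
img = rng.standard_normal((6, 6))
feat = conv2d(img, rng.standard_normal((3, 3)))          # 6x6 -> 4x4 feature map
pooled = avg_pool(feat)                                  # 4x4 -> 2x2
y = dense(pooled.ravel(), rng.standard_normal((3, 4)), np.zeros(3))
print(feat.shape, pooled.shape, y.shape)  # (4, 4) (2, 2) (3,)
```

Chaining the three steps on a 6 × 6 input mirrors, in miniature, how LeNet-5 shrinks a 32 × 32 image down to a class-score vector.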

3. Architectural Innovations

  1. Early Use of Convolutional Layers: LeNet-5 demonstrated the power of convolutional layers in learning hierarchical features directly from pixel values, which is essential for pattern recognition in images.

  2. Pooling Layers: The use of average pooling layers helped in reducing the spatial dimensions of feature maps, improving translation invariance and computational efficiency.

  3. Training Methodology: LeNet-5 was trained using gradient-based optimization methods like stochastic gradient descent (SGD), which were computationally feasible even with the hardware available at the time.

4. Impact and Legacy

  1. Performance: LeNet-5 achieved state-of-the-art results in handwritten digit recognition tasks, laying the foundation for subsequent advancements in CNN architectures for image classification.

  2. Influence on Deep Learning: The architectural principles of LeNet-5, such as convolutional layers, pooling layers, and fully connected layers, became fundamental in the development of modern CNNs used in diverse applications ranging from image recognition to natural language processing.

  3. Benchmark Dataset: MNIST, the dataset used to train and evaluate LeNet-5, became a standard benchmark dataset for evaluating new machine learning techniques, further cementing LeNet-5’s legacy in the field.