
1.3 MLP

1. MLP (Multi-Layer Perceptron)

An MLP (Multilayer Perceptron) is a type of artificial neural network composed of layers of neurons. It's one of the simplest and most foundational neural network architectures.

1.1 🧱 Structure of an MLP

At a high level, an MLP has:

  1. Input layer – takes the input features.
  2. One or more hidden layers – where computation happens using weights and activation functions.
  3. Output layer – produces the final prediction (regression value or classification label).

Each layer is fully connected (i.e., each neuron in one layer connects to every neuron in the next). Hence MLPs are often called fully connected networks or dense networks.
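
As a quick illustration in PyTorch, a dense layer is just a weight matrix of shape (out_features, in_features) plus a bias vector, one weight per connection (the layer sizes here are arbitrary):

import torch.nn as nn

layer = nn.Linear(3, 5)      # 3 inputs fully connected to 5 neurons
print(layer.weight.shape)    # torch.Size([5, 3]): one weight per input/neuron pair
print(layer.bias.shape)      # torch.Size([5]): one bias per neuron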

1.2 🧮 How does it work?

Each neuron in a layer performs:

  y = f(w · x + b)

Where:

  • x: input vector
  • w: weight vector
  • b: bias
  • f: activation function (e.g., ReLU, sigmoid)
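
A minimal sketch of this computation for a single neuron (the input, weight, and bias values below are arbitrary placeholders):

import torch

x = torch.tensor([0.1, 0.2, 0.3])    # input vector
w = torch.tensor([0.5, -0.3, 0.8])   # weight vector
b = torch.tensor(0.1)                # bias
y = torch.relu(w @ x + b)            # activation applied to the weighted sum
print(y)                             # a single scalar output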

1.3 🔄 Example Flow

Input → [Dense Layer + Activation] → [Dense Layer + Activation] → Output

E.g., for a 3-layer MLP:

x (input)
↓
Layer 1: W1·x + b1 → ReLU
↓
Layer 2: W2·h1 + b2 → ReLU
↓
Output Layer: W3·h2 + b3 → Output
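
A minimal PyTorch sketch of this flow with made-up sizes (3 inputs, two hidden layers of 4 units, 1 output); the weights are random placeholders:

import torch
import torch.nn.functional as F

x = torch.randn(1, 3)                       # one input sample with 3 features

W1, b1 = torch.randn(4, 3), torch.randn(4)  # layer 1 parameters
W2, b2 = torch.randn(4, 4), torch.randn(4)  # layer 2 parameters
W3, b3 = torch.randn(1, 4), torch.randn(1)  # output layer parameters

h1 = F.relu(x @ W1.T + b1)   # Layer 1: W1·x + b1 → ReLU
h2 = F.relu(h1 @ W2.T + b2)  # Layer 2: W2·h1 + b2 → ReLU
out = h2 @ W3.T + b3         # Output layer: W3·h2 + b3 (no activation)
print(out)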

1.4 🧠 What can MLPs do?

Given enough hidden units, MLPs can approximate any continuous function on a compact domain (the Universal Approximation Theorem), and are used for:

  • Regression
  • Classification
  • Function approximation
  • Time-series prediction (when used with context)

1.5 🧠 MLP Diagram

Input Layer        Hidden Layer(s)        Output Layer

 [x₁] ──┐              o     o
 [x₂] ──┼──────►       o ... o   ──────►      [ŷ]
 [x₃] ──┘              o     o
                 (e.g. ReLU activation)

Each circle is a neuron. Each layer is fully connected to the next. Hidden layers apply a nonlinear function like ReLU.

1.6 🔧 Code Example

import torch
import torch.nn as nn
import torch.nn.functional as F

class MLP(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)   # input → hidden
        self.fc2 = nn.Linear(hidden_size, output_size)  # hidden → output

    def forward(self, x):
        x = F.relu(self.fc1(x))  # activation after first layer
        x = self.fc2(x)          # no activation if doing regression
        return x

# Example usage
model = MLP(input_size=3, hidden_size=5, output_size=1)  # 3 inputs → 5 hidden → 1 output
input_data = torch.tensor([[0.1, 0.2, 0.3]])
output = model(input_data.float())
print(output)

You can tweak:

  • output_size = 1 for regression
  • output_size = 2 or more with softmax for classification
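
For instance, continuing from the code above, a classification variant might look like this (3 classes is an arbitrary choice here; note that softmax is usually applied only when you want probabilities, since nn.CrossEntropyLoss works on raw logits):

clf = MLP(input_size=3, hidden_size=5, output_size=3)  # one output neuron per class
logits = clf(input_data.float())                       # raw class scores (logits)
probs = F.softmax(logits, dim=1)                       # probabilities summing to 1
print(probs)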

2. MLP for Regression vs Classification

Adapting Multilayer Perceptrons (MLPs) for regression vs. classification tasks mainly involves changes in:

  1. Output layer architecture
  2. Activation functions
  3. Loss functions

2.1 ๐Ÿ” Shared parts

Regardless of the task, MLPs usually have:

  • Input layer (based on feature size)
  • One or more hidden layers
  • Non-linear activations (e.g., ReLU, tanh) in hidden layers

2.2 🔵 For Regression Tasks

1. Output layer:

  • Usually 1 neuron (or more if multi-output regression).
  • No activation function (i.e., linear output): ŷ = W·h + b

2. Loss function:

  • Mean Squared Error (MSE) or Mean Absolute Error (MAE), e.g. MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²

3. Interpretation:

  • Output is a continuous value, modeling things like temperature, price, etc.
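
A minimal sketch of a regression training step, reusing the MLP class from section 1.6 (the data below is random dummy data):

import torch
import torch.nn as nn

model = MLP(input_size=3, hidden_size=5, output_size=1)   # single linear output neuron
criterion = nn.MSELoss()                                   # mean squared error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(8, 3)    # 8 samples, 3 features
y = torch.randn(8, 1)    # continuous targets

pred = model(x)          # linear output, no final activation
loss = criterion(pred, y)
loss.backward()
optimizer.step()
print(loss.item())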

2.3 🔴 For Classification Tasks

1. Binary Classification:

Output layer
  • 1 neuron
  • Sigmoid activation to squash the output into [0, 1]: σ(z) = 1 / (1 + e⁻ᶻ)
Loss function
  • Binary Cross-Entropy: L = −[y·log(ŷ) + (1 − y)·log(1 − ŷ)]

2. Multi-class Classification:

Output layer
  • One neuron per class (i.e., size = number of classes)
  • Softmax activation to get probabilities that sum to 1: softmax(zᵢ) = e^(zᵢ) / Σⱼ e^(zⱼ)
Loss function
  • Categorical Cross-Entropy: L = −Σᵢ yᵢ·log(ŷᵢ)
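
A minimal sketch of both setups, again reusing the MLP class from section 1.6 and random dummy data. In PyTorch the sigmoid/softmax is typically folded into the loss: nn.BCEWithLogitsLoss and nn.CrossEntropyLoss both expect raw logits:

import torch
import torch.nn as nn

x = torch.randn(8, 3)                                  # 8 samples, 3 features

# Binary classification: 1 output neuron, sigmoid folded into the loss
binary_model = MLP(input_size=3, hidden_size=5, output_size=1)
y_bin = torch.randint(0, 2, (8, 1)).float()            # labels in {0, 1}
print(nn.BCEWithLogitsLoss()(binary_model(x), y_bin))

# Multi-class classification: one neuron per class, softmax folded into the loss
multi_model = MLP(input_size=3, hidden_size=5, output_size=4)  # e.g. 4 classes
y_cls = torch.randint(0, 4, (8,))                      # integer class labels
print(nn.CrossEntropyLoss()(multi_model(x), y_cls))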

2.4 🧠 Summary

Task            Output Neurons   Output Activation   Loss Function
Regression      1 (or more)      None (linear)       MSE / MAE
Binary class.   1                Sigmoid             Binary Cross-Entropy
Multi-class     # of classes     Softmax             Categorical Cross-Entropy