Module 2 — Training & Learning Process

Published: June 24, 2026

Deep Learning Tutorial Series

This module explains how neural networks actually learn from data. The focus is on intuition: how predictions become better over time through feedback.

2.1 Learning Goals

By the end of this module, you should understand:

What “training” means in machine learning
What a loss function is
Why models make errors
How gradient descent improves models
The idea of backpropagation (conceptual)
What an epoch, batch, and iteration are

2.2 What is Training?

Training is the process of improving a model by adjusting its parameters (weights and biases) using data.

Basic idea:

Make a prediction
Compare it with the correct answer
Measure how wrong it is
Update the model to reduce error

This loop repeats many times.

2.3 Prediction vs Reality

A model produces a prediction, which usually does not match the true value exactly:

prediction ≠ true value

The difference between the two is called the error.

Example:

True value: 100
Prediction: 80
Error: 100 - 80 = 20

2.4 What is a Loss Function?

A loss function measures how wrong a model is.

It converts error into a single number.

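Example (squared error for one prediction):

loss = (prediction - true_value)^2

For the example above: (80 - 100)^2 = 400. Averaging this value over many examples gives the Mean Squared Error (MSE).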

Why we need it:

Gives a clear training signal: one number that can be compared between updates
Smaller loss = better fit to the data the loss is measured on
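
The loss calculation can be sketched in a few lines of plain Python. This is a minimal illustration rather than a library implementation; the function names squared_error and mean_squared_error are chosen here for the example and do not come from any framework.

```python
def squared_error(prediction, true_value):
    """Loss for a single example: the squared difference."""
    return (prediction - true_value) ** 2


def mean_squared_error(predictions, true_values):
    """Average the per-example squared errors over a set of examples."""
    if len(predictions) != len(true_values):
        raise ValueError("predictions and true_values must have the same length")
    total = sum(squared_error(p, t) for p, t in zip(predictions, true_values))
    return total / len(predictions)


# The example from section 2.3: true value 100, prediction 80.
single_loss = squared_error(80, 100)                     # (80 - 100)^2 = 400

# Two predictions (80 and 95) against the same true value of 100.
batch_loss = mean_squared_error([80, 95], [100, 100])    # (400 + 25) / 2 = 212.5
```

Squaring makes every error positive and penalizes large mistakes more heavily than small ones.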

2.5 Goal of Training

The goal is simple: minimize loss.

We want to find parameter values for which the model's predictions give the lowest possible loss.

2.6 Intuition: Learning as Feedback

Think of learning like this:

You try something
You see the mistake
You adjust next time

Neural networks do exactly this, but mathematically.

2.7 What are We Updating?

A neural network learns by adjusting:

Weights (how strongly each input influences the output)
Biases (a baseline offset added regardless of the inputs)

These parameters control the behavior of the model.
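
To make this concrete, here is a small sketch of a single artificial neuron without an activation function. The function name neuron and the numbers used are made up for illustration; the point is only that changing a weight changes the output for the same input.

```python
def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias (no activation function)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


inputs = [2.0, 3.0]

# Same inputs, different parameters, different behavior.
out_a = neuron(inputs, weights=[0.5, -1.0], bias=1.0)   # 1.0 - 3.0 + 1.0 = -1.0
out_b = neuron(inputs, weights=[0.5, 1.0], bias=1.0)    # 1.0 + 3.0 + 1.0 = 5.0
```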

2.8 Gradient Descent (Core Idea)

Gradient descent is the main algorithm used to reduce loss.

Intuition:

If loss is high → the parameters need to change
The gradient points in the direction that increases loss, so move the parameters in the opposite direction

Think of it as walking downhill:

Loss = height
Goal = reach lowest point
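
The downhill walk can be sketched with a toy problem. The loss below, (w - 3)^2, is an assumption chosen because its lowest point (w = 3) is known in advance; a real network has many parameters and a far more complicated loss. The step size of 0.1 is also an arbitrary choice for the example.

```python
def loss(w):
    """Toy loss with a single parameter; its minimum is at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0                 # starting point on the "hill"
step_size = 0.1         # how far to move on each step
history = [loss(w)]     # loss recorded before and after every step

for _ in range(50):
    # Move against the gradient, i.e. downhill.
    w = w - step_size * gradient(w)
    history.append(loss(w))
```

After these steps w ends up very close to 3, and the recorded loss values shrink from one step to the next.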

2.9 Learning Rate

The learning rate controls how big each update is.

Too large → updates overshoot the minimum; training becomes unstable or diverges
Too small → each update barely changes the model; learning is slow

It is one of the most important hyperparameters.
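
The effect of the learning rate can be seen on a toy loss, (w - 3)^2, picked here purely for illustration. The three learning rates below are also illustrative; suitable values depend heavily on the model and data.

```python
def run_gradient_descent(learning_rate, steps=20, start=0.0):
    """Run gradient descent on the toy loss (w - 3)^2 and return the final loss."""
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)          # derivative of (w - 3)^2
        w = w - learning_rate * grad
    return (w - 3.0) ** 2


initial_loss = (0.0 - 3.0) ** 2                    # 9.0 before any update

loss_too_large = run_gradient_descent(1.1)         # overshoots more on every step
loss_too_small = run_gradient_descent(0.001)       # barely moves in 20 steps
loss_reasonable = run_gradient_descent(0.1)        # gets close to the minimum
```

With the large rate the loss ends up higher than where it started, with the small rate it has hardly dropped, and with the middle value it is close to zero.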

Reading materials:

How to Configure the Learning Rate When Training Deep Learning Neural Networks

2.10 Backpropagation (Conceptual View)

Backpropagation is how the model figures out:

how much each weight contributed to the error

It works by:

Computing the loss from the model's prediction
Sending the error backward through the network, layer by layer
Calculating the contribution of each weight (its gradient)
Passing these gradients to gradient descent, which updates the weights

You don’t need the full math yet, just the idea that errors flow backward.
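
For readers who want a glimpse of the mechanics, here is a hand-written sketch for a deliberately tiny network: one input, one hidden value and one output, with no activation functions. The structure and the numbers are assumptions made for illustration; real frameworks compute these gradients automatically.

```python
def forward(x, w1, w2):
    """Forward pass: input -> hidden value -> output."""
    h = w1 * x
    y = w2 * h
    return h, y


def backward(x, target, w1, w2):
    """Backward pass: apply the chain rule from the loss back to each weight."""
    h, y = forward(x, w1, w2)
    dloss_dy = 2.0 * (y - target)   # loss = (y - target)^2
    dloss_dw2 = dloss_dy * h        # y = w2 * h, so dy/dw2 = h
    dloss_dh = dloss_dy * w2        # error passed back to the hidden value
    dloss_dw1 = dloss_dh * x        # h = w1 * x, so dh/dw1 = x
    return dloss_dw1, dloss_dw2


x, target, w1, w2 = 2.0, 10.0, 0.5, 3.0
grad_w1, grad_w2 = backward(x, target, w1, w2)   # -84.0 and -14.0
```

Both gradients are negative, which means that increasing either weight would lower the loss for this example.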

2.11 Epochs, Batches, Iterations

Epoch: One full pass over the entire dataset
Batch: A small subset of data used at once
Iteration: One update step using one batch

Example:

Dataset: 10,000 samples
Batch size: 100
Iterations per epoch: 10,000 / 100 = 100
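
The same arithmetic in Python, as a small sketch. The helper names are hypothetical, and rounding up for a final partial batch is an assumption; some tools drop an incomplete last batch instead.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Number of update steps needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


def make_batches(samples, batch_size):
    """Split a list of samples into consecutive batches."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


dataset = list(range(10_000))           # stand-in for 10,000 samples
batches = make_batches(dataset, 100)    # 100 batches of 100 samples each
steps = iterations_per_epoch(len(dataset), 100)
```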

2.12 Training Loop (Conceptual)

A typical training loop looks like:

Take a batch of data
Make predictions
Compute loss
Compute gradients
Update weights
Repeat with the next batch

This loop is the core of how most deep learning models are trained.
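
The loop can be written out for the simplest possible model, a line y = w * x + b, using only the Python standard library. Everything specific here is an assumption for the sake of a runnable sketch: the synthetic data (generated from y = 2x + 1), the batch size, the learning rate and the number of epochs. Deep learning frameworks follow the same steps but compute the gradients automatically.

```python
import random


def train_linear_model(xs, ys, batch_size=10, learning_rate=0.1, epochs=200, seed=0):
    """Mini-batch gradient descent for y = w * x + b with mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    loss = None
    for _ in range(epochs):
        rng.shuffle(indices)                      # new batch order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]         # take a batch
            errors = [(w * xs[i] + b) - ys[i] for i in batch]  # predict, compare
            loss = sum(e * e for e in errors) / len(batch)     # compute loss
            # Compute gradients of the batch loss with respect to w and b.
            grad_w = sum(2.0 * e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            grad_b = sum(2.0 * e for e in errors) / len(batch)
            # Update the parameters.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b, loss


# Synthetic data that follows y = 2x + 1 exactly.
xs = [i / 100 for i in range(100)]
ys = [2.0 * x + 1.0 for x in xs]

w, b, final_loss = train_linear_model(xs, ys)
```

After training, w and b should be close to the values 2 and 1 used to generate the data.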

2.13 Why Training Works

Training works because:

Errors provide feedback
Gradients give the direction of improvement
Repeated small updates gradually improve performance

Over time, the model learns patterns in data.

2.14 Common Problems in Training

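Overfitting: The model memorizes the training data instead of learning patterns that carry over to new data
Underfitting: The model is too simple to capture the patterns in the data
Vanishing gradients: Gradients shrink as they flow backward through many layers, so the early layers of a deep network learn extremely slowly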
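
A rough feel for vanishing gradients can be had with a simplified sketch. It assumes every layer uses a sigmoid activation and it ignores the weights entirely, so it only shows how repeatedly multiplying by a sigmoid's slope shrinks a signal. The slope of a sigmoid is never larger than 0.25.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    """Slope of the sigmoid; its largest value is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def gradient_scale(num_layers, z=0.0):
    """Factor by which a gradient is scaled after passing back through
    num_layers sigmoid layers (weights ignored for simplicity)."""
    return sigmoid_derivative(z) ** num_layers


scale_shallow = gradient_scale(2)    # 0.25 ** 2 = 0.0625
scale_deep = gradient_scale(20)      # 0.25 ** 20, roughly 9.1e-13
```

Even in this best case for the sigmoid, twenty layers reduce the gradient to almost nothing, which is why the earliest layers receive so little learning signal.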

2.15 Key Takeaways

Training = iterative improvement
Loss function measures error
Gradient descent reduces loss
Backpropagation assigns responsibility for errors
Learning happens through repeated updates

Acknowledgement

Parts of this content were generated by ChatGPT.

Junqing Zhang