Introduction to Artificial Neural Networks with Keras
Artificial Neural Networks (ANNs) are the foundation of deep learning. In this tutorial, you’ll learn how to build and train a Multi-Layer Perceptron (MLP) using Keras, one of the most user-friendly deep learning frameworks.
⚙️ What is Keras?
Keras is a high-level API that lets you easily build, train, and evaluate neural networks—especially deep learning models like MLPs.
Keras ships with TensorFlow as tf.keras, so in TensorFlow 2.x you typically access it through the TensorFlow import:
import tensorflow as tf
# Use tf.keras for all Keras-related operations
Loading a Dataset
You’ll usually need a dataset to train your neural network.
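In this example, we'll use Fashion MNIST, a dataset of 70,000 grayscale images (28×28 pixels) of clothing items in 10 classes. Keras returns it already split into 60,000 training images and 10,000 test images; the validation set is held out from the training set by hand.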
Code example:
import tensorflow as tf
fashion_mnist = tf.keras.datasets.fashion_mnist.load_data()
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist
# Hold out 5,000 images for validation
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]
Pixel values range from 0 to 255, so divide by 255 to scale them to the 0-1 range, which generally helps training:
X_train, X_valid, X_test = X_train / 255.0, X_valid / 255.0, X_test / 255.0
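To see what the split and scaling steps do without downloading anything, here is a minimal sketch on synthetic data. The array sizes (100 fake images, 20 held out) and the variable names are assumptions chosen for illustration, not part of the original example.

```python
import numpy as np

# Synthetic stand-in for Fashion MNIST: 100 fake 28x28 uint8 images (assumption).
rng = np.random.default_rng(0)
X_full = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
y_full = rng.integers(0, 10, size=100)

# Hold out the last 20 samples for validation, mirroring the [:-5000] / [-5000:] split.
X_tr, y_tr = X_full[:-20], y_full[:-20]
X_va, y_va = X_full[-20:], y_full[-20:]

# Dividing by 255.0 converts uint8 pixels to floats in the range [0, 1].
X_tr_scaled = X_tr / 255.0
X_va_scaled = X_va / 255.0

print(X_tr_scaled.shape, X_va_scaled.shape)
print(X_tr_scaled.dtype, X_tr_scaled.min(), X_tr_scaled.max())
```

The same slicing pattern applied to the real 60,000-image training set leaves 55,000 images for training and 5,000 for validation.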
Building a Model (Sequential API)
The Sequential API is the simplest way to stack layers one after another.
A typical MLP includes:
- Input layer: defines the input shape (e.g., 28×28)
- Flatten layer: converts 2D images to 1D vectors
- Hidden layers: e.g., Dense(300, ReLU), Dense(100, ReLU)
- Output layer: Dense(10, softmax) for 10 classes
Example code:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
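It can help to see what these layers compute under the hood. The following is a rough NumPy sketch of one forward pass through the same stack, using randomly initialized parameters. It is not how Keras implements the layers internally, and the initialization scale (0.05) is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(42)

# One fake 28x28 "image" with pixel values already scaled to [0, 1] (assumption).
image = rng.random((28, 28))

# Randomly initialized parameters with the same shapes as the three Dense layers.
W1, b1 = rng.normal(0, 0.05, (784, 300)), np.zeros(300)
W2, b2 = rng.normal(0, 0.05, (300, 100)), np.zeros(100)
W3, b3 = rng.normal(0, 0.05, (100, 10)), np.zeros(10)

def relu(z):
    # ReLU keeps positive values and zeroes out negative ones.
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the max for numerical stability; the result sums to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

x = image.reshape(-1)            # Flatten: (28, 28) -> (784,)
h1 = relu(x @ W1 + b1)           # Dense(300, relu)
h2 = relu(h1 @ W2 + b2)          # Dense(100, relu)
probs = softmax(h2 @ W3 + b3)    # Dense(10, softmax)

print(x.shape, h1.shape, h2.shape, probs.shape)
print(probs.sum())
```

The softmax output is a vector of 10 non-negative values that sum to 1, which is why it can be read as one probability per class.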
Model Summary and Inspecting Layers
You can print a summary of your model:
model.summary()  # Prints the table itself; wrapping it in print() would also print "None"
Each layer manages its own weights and biases:
weights, biases = model.layers[1].get_weights()
print(weights.shape) # e.g., (784, 300)
print(biases.shape) # e.g., (300,)
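These shapes also explain the parameter counts that the summary reports. As a quick check, the arithmetic can be done by hand; the helper name dense_params below is made up for this illustration.

```python
# Layer sizes taken from the model above: 784 inputs -> 300 -> 100 -> 10.
layer_sizes = [28 * 28, 300, 100, 10]

def dense_params(n_inputs, n_units):
    # One weight per input-unit pair, plus one bias per unit.
    return n_inputs * n_units + n_units

counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)

print(counts)  # [235500, 30100, 1010]
print(total)   # 266610
```

Most of the parameters sit in the first Dense layer, because it connects all 784 pixels to all 300 units.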
Layers can be named automatically or manually. Layer names help when accessing them later.
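As a small sketch of manual naming, you can pass name= when creating each layer and later fetch the layer with get_layer(). The names used here ("hidden1", "output", and so on) and the variable named_model are arbitrary choices for the example.

```python
import tensorflow as tf

# Layer names below are illustrative choices, not required by Keras.
named_model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(name="flatten"),
    tf.keras.layers.Dense(300, activation="relu", name="hidden1"),
    tf.keras.layers.Dense(100, activation="relu", name="hidden2"),
    tf.keras.layers.Dense(10, activation="softmax", name="output"),
])

# Look a layer up by name instead of by position in model.layers.
hidden1 = named_model.get_layer("hidden1")
layer_names = [layer.name for layer in named_model.layers]
print(layer_names)
print(hidden1.get_weights()[0].shape)
```

Looking layers up by name tends to be more robust than indexing model.layers, since the index changes whenever a layer is added or removed.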
Compiling the Model
Compiling sets up the model’s training process by specifying:
- Loss function: measures prediction error
- Optimizer: updates model weights
- Metrics: values to track during training
For multiclass classification with integer labels, as here:
- Loss: "sparse_categorical_crossentropy"
- Optimizer: "sgd" or "adam"
- Metric: "accuracy"
Example:
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="sgd",
    metrics=["accuracy"]
)
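To make the loss less of a black box, here is a hand-rolled NumPy sketch of what sparse categorical cross-entropy and accuracy measure. The probabilities and labels are made up, and this illustrates the formula rather than Keras's actual implementation.

```python
import numpy as np

# Made-up predicted probabilities for 3 samples over 4 classes (each row sums to 1).
probs = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.50, 0.20, 0.10],
    [0.25, 0.25, 0.25, 0.25],
])
# "Sparse" labels are plain integer class indices, not one-hot vectors.
labels = np.array([0, 1, 3])

# Pick the probability the model assigned to the true class of each sample.
true_class_probs = probs[np.arange(len(labels)), labels]

# Cross-entropy is the mean negative log of those probabilities.
loss = -np.log(true_class_probs).mean()

# Accuracy is the fraction of samples whose most likely class matches the label.
accuracy = (probs.argmax(axis=1) == labels).mean()

print(loss, accuracy)
```

A confident correct prediction (0.70) contributes little to the loss, while an unsure one (0.25) contributes much more.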
Training the Model
Train using the fit() method:
history = model.fit(
    X_train, y_train,
    epochs=30,
    validation_data=(X_valid, y_valid)
)
During training, Keras reports the loss and accuracy on the training and validation sets after each epoch, and fit() returns a History object holding those values.
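The per-epoch values are kept in the history attribute of the object that fit() returns. Here is a minimal sketch on a tiny random dataset; the data, the 16-unit model, and the demo_ names are assumptions made so it runs in seconds, and the numbers it prints are meaningless.

```python
import numpy as np
import tensorflow as tf

# Tiny random dataset standing in for Fashion MNIST (assumption, for speed only).
rng = np.random.default_rng(0)
X_demo = rng.random((64, 28, 28)).astype("float32")
y_demo = rng.integers(0, 10, size=64)

demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
demo_model.compile(loss="sparse_categorical_crossentropy",
                   optimizer="sgd", metrics=["accuracy"])

demo_history = demo_model.fit(X_demo[:48], y_demo[:48], epochs=3,
                              validation_data=(X_demo[48:], y_demo[48:]),
                              verbose=0)

# history.history maps each metric name to a list with one value per epoch.
for name, values in demo_history.history.items():
    print(name, [round(float(v), 3) for v in values])
```

The validation entries use the same names with a val_ prefix, which makes it easy to plot training and validation curves side by side.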
Evaluating and Making Predictions
After training, measure performance using:
test_loss, test_acc = model.evaluate(X_test, y_test)
To make predictions on new data:
X_new = X_test[:3]               # Stand-in for new, unseen images
Y_proba = model.predict(X_new)   # Probabilities for each class
Y_pred = Y_proba.argmax(axis=1) # Most likely class
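Class indices are easier to read once mapped to the Fashion MNIST class names. In this sketch the probability rows are invented to stand in for the output of model.predict(), and the _demo variable names are placeholders.

```python
import numpy as np

# Fashion MNIST class names, indexed by label 0-9.
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Made-up probabilities for two images, standing in for model.predict() output.
Y_proba_demo = np.array([
    [0.01, 0.00, 0.02, 0.00, 0.01, 0.03, 0.01, 0.10, 0.02, 0.80],
    [0.05, 0.00, 0.85, 0.00, 0.06, 0.00, 0.04, 0.00, 0.00, 0.00],
])

# argmax along axis 1 picks the most likely class index for each row.
Y_pred_demo = Y_proba_demo.argmax(axis=1)
predicted_names = [class_names[i] for i in Y_pred_demo]

print(predicted_names)  # ['Ankle boot', 'Pullover']
```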
Summary of Steps
- Import libraries
- Load and prepare data
- Build the model
- Compile with loss, optimizer, metrics
- Train with fit()
- Evaluate and predict
In short:
You can build a deep neural network in just a few lines using Keras and its Sequential API.
Remember to scale your data, stack your layers, and choose sensible hyperparameters.
Understanding Model Performance and Improving It
Monitoring training helps you detect underfitting or overfitting early.
Interpreting Training and Validation Curves
- Accuracy: both training and validation should increase.
- Loss: both should decrease.
If validation accuracy plateaus or drops while training accuracy keeps rising, the model is overfitting.
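The same check can be written against the loss curves. Here is a rough sketch that looks for the epoch where validation loss bottomed out; the numbers are invented, and both helper functions are hypothetical, not part of Keras.

```python
# Made-up per-epoch losses, shaped like the dict in history.history.
fake_history = {
    "loss":     [0.70, 0.50, 0.40, 0.33, 0.28, 0.24],
    "val_loss": [0.72, 0.55, 0.47, 0.45, 0.48, 0.52],
}

def best_epoch(val_losses):
    # Epoch (1-based) with the lowest validation loss.
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

def overfitting_started(history):
    # True if validation loss rose after its minimum while training loss kept falling.
    best = best_epoch(history["val_loss"])
    val_rose = history["val_loss"][-1] > history["val_loss"][best - 1]
    train_fell = history["loss"][-1] < history["loss"][best - 1]
    return val_rose and train_fell

print(best_epoch(fake_history["val_loss"]))  # 4
print(overfitting_started(fake_history))     # True
```

In this invented run, training past epoch 4 keeps lowering the training loss but makes the validation loss worse.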
⚠️ What to Do If Performance Isn’t Great
1. Adjust the Learning Rate
The learning rate controls how large each weight update is.
- Too high → training becomes unstable or diverges
- Too low → training is slow
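A toy example makes both failure modes visible. This sketch runs plain gradient descent on f(w) = w², which is a stand-in chosen for simplicity, not a neural network; the three learning rates are arbitrary.

```python
def gradient_descent(learning_rate, steps=20, w=1.0):
    # Minimize f(w) = w**2, whose gradient is 2*w and whose minimum is at w = 0.
    for _ in range(steps):
        w = w - learning_rate * 2 * w
    return w

# Compare a rate that is too high, a reasonable one, and one that is too low.
for lr in (1.1, 0.1, 0.001):
    print(lr, gradient_descent(lr))
```

With 1.1 the value of w grows in magnitude at every step, with 0.1 it approaches 0 quickly, and with 0.001 it has barely moved after 20 steps.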
2. Try a Different Optimizer
Try SGD, Adam, RMSprop, etc.
Always retune the learning rate afterward, since a value that works well for one optimizer is often a poor fit for another.
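Passing an optimizer object instead of a string lets you set the learning rate explicitly. In this minimal sketch, the learning rates are common starting points rather than tuned values, and compile_with is a hypothetical helper.

```python
import tensorflow as tf

# Learning rates below are common starting points, not tuned values.
candidates = {
    "sgd": tf.keras.optimizers.SGD(learning_rate=0.01),
    "adam": tf.keras.optimizers.Adam(learning_rate=0.001),
    "rmsprop": tf.keras.optimizers.RMSprop(learning_rate=0.001),
}

def compile_with(model, optimizer):
    # Recompiling swaps the optimizer; the layer weights are left untouched.
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer=optimizer, metrics=["accuracy"])
    return model
```

For example, compile_with(model, candidates["adam"]) switches an existing model to Adam. To compare optimizers fairly, rebuild the model before each run so every optimizer starts from fresh weights.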
3. Adjust Model Architecture
Modify:
- Number of layers
- Neurons per layer
- Activation functions (ReLU, tanh, etc.)
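One way to make these choices easy to vary is to wrap model construction in a function. The build_mlp helper below is a hypothetical sketch that assumes every hidden layer has the same width, which is a simplification.

```python
import tensorflow as tf

def build_mlp(n_hidden=2, n_neurons=100, activation="relu"):
    # Hypothetical helper: stacks n_hidden Dense layers of equal width.
    model = tf.keras.Sequential([tf.keras.Input(shape=(28, 28)),
                                 tf.keras.layers.Flatten()])
    for _ in range(n_hidden):
        model.add(tf.keras.layers.Dense(n_neurons, activation=activation))
    # Output layer: one unit per Fashion MNIST class.
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    return model

small = build_mlp(n_hidden=1, n_neurons=50)
deep_tanh = build_mlp(n_hidden=3, n_neurons=100, activation="tanh")
print(small.count_params(), deep_tanh.count_params())
```

Each variant still needs to be compiled and trained before it can be compared on the validation set.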
4. Tune Training Parameters
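- Batch size: fit() uses 32 by default; change it with the batch_size argument
- Larger batches usually make each epoch faster on parallel hardware but may generalize less well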
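Batch size also determines how many weight updates happen per epoch. This is a quick arithmetic sketch, assuming the 55,000 training images left after the validation split; the three batch sizes are arbitrary examples.

```python
import math

n_train = 55_000  # training images left after holding out 5,000 for validation

steps_per_epoch = {}
for batch_size in (32, 128, 512):
    # Each epoch performs one weight update per batch; the last batch may be smaller.
    steps_per_epoch[batch_size] = math.ceil(n_train / batch_size)
    print(batch_size, steps_per_epoch[batch_size])
```

Fewer, larger updates per epoch is one reason the learning rate often needs retuning when the batch size changes.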
✅ Final Evaluation Before Deployment
Once validation accuracy stabilizes:
- Use model.evaluate() on the test data for the final accuracy.
- If validation loss is still decreasing, continue training with fit(); Keras resumes from the current weights rather than starting over.
Summary
- Tune hyperparameters methodically: learning rate → optimizer → architecture.
- Re-evaluate on the validation set after each change.
- Use the test set only for the final evaluation before deployment.