🚀 TensorFlow & Keras: Modern Deep Learning Framework

Introduction

TensorFlow is Google's open-source deep learning framework, and Keras is its high-level API for building and training neural networks quickly. Together they support work ranging from research prototypes to production-scale applications. This lesson covers the fundamentals: building models with the Sequential and Functional APIs, writing custom layers, losses, and metrics, controlling training with callbacks, saving and loading models, and best practices for deep learning development.

Core Concepts and Setup

# Installation (if needed):
# pip install tensorflow

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, optimizers, callbacks
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import make_classification, load_iris, fetch_california_housing
import warnings
warnings.filterwarnings('ignore')

# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

# Check TensorFlow version
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"GPU Available: {tf.config.list_physical_devices('GPU')}")

print("\n" + "="*60)
print("TENSORFLOW & KERAS FUNDAMENTALS")
print("="*60)

# Core concepts
tf_keras_concepts = """
TENSORFLOW & KERAS KEY CONCEPTS:

1. TENSORFLOW BASICS:
   • Tensors: Multi-dimensional arrays
   • Eager execution: Operations execute immediately
   • Graphs: Computational graphs for optimization
   • Automatic differentiation: GradientTape

2. KERAS API LEVELS:
   • Sequential: Linear stack of layers
   • Functional: Complex architectures
   • Subclassing: Full customization

3. LAYERS:
   • Dense: Fully connected
   • Conv2D: Convolutional
   • LSTM/GRU: Recurrent
   • Dropout: Regularization
   • BatchNormalization: Normalization

4. OPTIMIZERS:
   • SGD: Stochastic gradient descent
   • Adam: Adaptive moment estimation
   • RMSprop: Root mean square propagation
   • AdamW: Adam with weight decay

5. LOSS FUNCTIONS:
   • Binary crossentropy
   • Categorical crossentropy
   • MSE, MAE, Huber
   • Custom losses

6. CALLBACKS:
   • ModelCheckpoint: Save best model
   • EarlyStopping: Stop when no improvement
   • ReduceLROnPlateau: Adjust learning rate
   • TensorBoard: Visualization

7. MODEL WORKFLOW:
   1. Define architecture
   2. Compile (optimizer, loss, metrics)
   3. Fit (train)
   4. Evaluate
   5. Predict
"""

print(tf_keras_concepts)
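
The concept list mentions eager execution and automatic differentiation with GradientTape. As a minimal sketch (the function y = x^2 + 2x and the variable names are illustrative choices, not part of the lesson), the tape records eagerly executed operations on a variable and then returns the derivative:

```python
import tensorflow as tf

# Illustrative example: differentiate y = x^2 + 2x at x = 3
x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    # Operations run eagerly and are recorded on the tape
    y = x ** 2 + 2.0 * x

# Analytically, dy/dx = 2x + 2, which is 8 at x = 3
grad = tape.gradient(y, x)
print(f"y = {y.numpy():.1f}, dy/dx = {grad.numpy():.1f}")
```

The same mechanism is what model.fit() relies on internally to compute gradients of the loss with respect to the trainable weights.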

Building Models with Sequential API

class SequentialModelExamples:
    """Examples of building models with Sequential API"""
    
    def __init__(self):
        self.models = {}
        self.histories = {}
        
    def build_classification_model(self, input_dim, n_classes):
        """Build a simple classification model"""
        
        model = keras.Sequential([
            # Input layer with input shape
            layers.Dense(128, activation='relu', input_shape=(input_dim,)),
            layers.Dropout(0.3),
            
            # Hidden layers
            layers.Dense(64, activation='relu'),
            layers.BatchNormalization(),
            layers.Dropout(0.3),
            
            layers.Dense(32, activation='relu'),
            layers.Dropout(0.2),
            
            # Output layer: one sigmoid unit for binary tasks, softmax otherwise
            layers.Dense(n_classes, activation='softmax') if n_classes > 2
            else layers.Dense(1, activation='sigmoid')
        ])
        
        return model
    
    def build_regression_model(self, input_dim):
        """Build a regression model"""
        
        model = keras.Sequential()
        
        # Add layers one by one
        model.add(layers.Dense(64, activation='relu', input_shape=(input_dim,)))
        model.add(layers.BatchNormalization())
        model.add(layers.Dense(32, activation='relu'))
        model.add(layers.Dropout(0.2))
        model.add(layers.Dense(16, activation='relu'))
        model.add(layers.Dense(1))  # No activation for regression
        
        return model
    
    def demonstrate_sequential_api(self):
        """Complete example with Sequential API"""
        
        # Generate classification dataset
        X, y = make_classification(n_samples=1000, n_features=20, 
                                  n_informative=15, n_redundant=5,
                                  n_classes=3, random_state=42)
        
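        # Convert labels to one-hot vectors
        y_cat = keras.utils.to_categorical(y, num_classes=3)
        
        # Split first so the test set stays unseen during preprocessing
        X_train, X_test, y_train, y_test = train_test_split(
            X, y_cat, test_size=0.2, random_state=42
        )
        
        # Fit the scaler on the training split only to avoid data leakage
        scaler = StandardScaler()
        X_train = scaler.fit_transform(X_train)
        X_test = scaler.transform(X_test)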
        
        # Build model
        model = self.build_classification_model(input_dim=20, n_classes=3)
        
        # Compile model
        model.compile(
            optimizer=keras.optimizers.Adam(learning_rate=0.001),
            loss='categorical_crossentropy',
            metrics=['accuracy']
        )
        
        # Model summary
        print("\nModel Summary:")
        print("-" * 60)
        model.summary()
        
        # Define callbacks
        early_stopping = callbacks.EarlyStopping(
            monitor='val_loss',
            patience=10,
            restore_best_weights=True
        )
        
        reduce_lr = callbacks.ReduceLROnPlateau(
            monitor='val_loss',
            factor=0.5,
            patience=5,
            min_lr=0.00001
        )
        
        # Train model
        history = model.fit(
            X_train, y_train,
            validation_split=0.2,
            epochs=50,
            batch_size=32,
            callbacks=[early_stopping, reduce_lr],
            verbose=0
        )
        
        # Evaluate
        test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
        print(f"\nTest Loss: {test_loss:.4f}")
        print(f"Test Accuracy: {test_acc:.4f}")
        
        # Visualize training history
        self.plot_training_history(history)
        
        return model, history
    
    def plot_training_history(self, history):
        """Plot training and validation metrics"""
        
        fig, axes = plt.subplots(1, 2, figsize=(12, 4))
        
        # Loss plot
        axes[0].plot(history.history['loss'], label='Training Loss')
        axes[0].plot(history.history['val_loss'], label='Validation Loss')
        axes[0].set_xlabel('Epoch')
        axes[0].set_ylabel('Loss')
        axes[0].set_title('Model Loss')
        axes[0].legend()
        axes[0].grid(True, alpha=0.3)
        
        # Accuracy plot
        if 'accuracy' in history.history:
            axes[1].plot(history.history['accuracy'], label='Training Accuracy')
            axes[1].plot(history.history['val_accuracy'], label='Validation Accuracy')
            axes[1].set_xlabel('Epoch')
            axes[1].set_ylabel('Accuracy')
            axes[1].set_title('Model Accuracy')
            axes[1].legend()
            axes[1].grid(True, alpha=0.3)
        
        plt.tight_layout()
        plt.show()

# Sequential API examples
sequential_examples = SequentialModelExamples()

print("\n" + "="*60)
print("SEQUENTIAL API DEMONSTRATION")
print("="*60)

model_seq, history_seq = sequential_examples.demonstrate_sequential_api()
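
The demonstration above one-hot encodes the labels with to_categorical and trains with categorical_crossentropy. When labels are plain integer class ids, sparse_categorical_crossentropy is a common alternative that skips the encoding step. The sketch below uses made-up probabilities to suggest that both losses give the same per-sample values; the arrays are illustrative only.

```python
import numpy as np
from tensorflow import keras

# Made-up predictions for 4 samples over 3 classes (each row sums to 1)
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.3, 0.5],
                   [0.3, 0.4, 0.3]], dtype="float32")
y_int = np.array([0, 1, 2, 1])                               # integer labels
y_onehot = keras.utils.to_categorical(y_int, num_classes=3)  # one-hot labels

# Both calls compute -log(probability assigned to the true class)
loss_onehot = keras.losses.categorical_crossentropy(y_onehot, y_pred)
loss_sparse = keras.losses.sparse_categorical_crossentropy(y_int, y_pred)

print(np.asarray(loss_onehot))
print(np.asarray(loss_sparse))
```

With integer labels, the model would be compiled with loss='sparse_categorical_crossentropy' and the to_categorical step can be dropped.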

Functional API for Complex Architectures

class FunctionalAPIExamples:
    """Examples of building complex models with Functional API"""
    
    def __init__(self):
        self.models = {}
        
    def build_multi_input_model(self):
        """Model with multiple inputs"""
        
        # Define two inputs
        input_numeric = layers.Input(shape=(10,), name='numeric_input')
        input_categorical = layers.Input(shape=(5,), name='categorical_input')
        
        # Process numeric input
        x1 = layers.Dense(32, activation='relu')(input_numeric)
        x1 = layers.Dropout(0.2)(x1)
        x1 = layers.Dense(16, activation='relu')(x1)
        
        # Process categorical input
        x2 = layers.Dense(16, activation='relu')(input_categorical)
        x2 = layers.Dropout(0.2)(x2)
        
        # Concatenate branches
        concatenated = layers.Concatenate()([x1, x2])
        
        # Final layers
        x = layers.Dense(32, activation='relu')(concatenated)
        x = layers.Dropout(0.3)(x)
        output = layers.Dense(1, activation='sigmoid', name='output')(x)
        
        # Create model
        model = keras.Model(inputs=[input_numeric, input_categorical], 
                          outputs=output)
        
        return model
    
    def build_multi_output_model(self):
        """Model with multiple outputs"""
        
        # Single input
        input_layer = layers.Input(shape=(20,), name='input')
        
        # Shared layers
        shared = layers.Dense(64, activation='relu')(input_layer)
        shared = layers.BatchNormalization()(shared)
        shared = layers.Dropout(0.3)(shared)
        shared = layers.Dense(32, activation='relu')(shared)
        
        # Branch for classification output
        class_branch = layers.Dense(16, activation='relu')(shared)
        class_output = layers.Dense(3, activation='softmax', name='classification')(class_branch)
        
        # Branch for regression output
        reg_branch = layers.Dense(16, activation='relu')(shared)
        reg_output = layers.Dense(1, name='regression')(reg_branch)
        
        # Create model
        model = keras.Model(inputs=input_layer, 
                          outputs=[class_output, reg_output])
        
        return model
    
    def build_residual_block_model(self):
        """Model with residual connections (skip connections)"""
        
        def residual_block(x, units):
            """Create a dense residual block; `units` must match the width of x"""
            # Save input for skip connection
            shortcut = x
            
            # Main path
            x = layers.Dense(units, activation='relu')(x)
            x = layers.BatchNormalization()(x)
            x = layers.Dense(units)(x)
            x = layers.BatchNormalization()(x)
            
            # Add skip connection (both tensors must have identical shapes)
            x = layers.Add()([x, shortcut])
            x = layers.Activation('relu')(x)
            
            return x
        
        # Build model with residual blocks
        input_layer = layers.Input(shape=(20,))
        
        # Initial transformation
        x = layers.Dense(64, activation='relu')(input_layer)
        x = layers.BatchNormalization()(x)
        
        # Residual blocks
        x = residual_block(x, 64)
        x = residual_block(x, 64)
        x = layers.Dropout(0.3)(x)
        
        # Output
        output = layers.Dense(1, activation='sigmoid')(x)
        
        model = keras.Model(inputs=input_layer, outputs=output)
        
        return model
    
    def demonstrate_functional_api(self):
        """Complete example with Functional API"""
        
        # Build different architectures
        multi_input_model = self.build_multi_input_model()
        multi_output_model = self.build_multi_output_model()
        residual_model = self.build_residual_block_model()
        
        # Visualize architectures
        fig, axes = plt.subplots(1, 3, figsize=(18, 8))
        
        # Plot model architectures
        for idx, (model, title) in enumerate([
            (multi_input_model, 'Multi-Input Model'),
            (multi_output_model, 'Multi-Output Model'),
            (residual_model, 'Residual Model')
        ]):
            # Create a text representation
            axes[idx].text(0.1, 0.9, f"{title}\n" + "="*30, 
                          fontsize=12, weight='bold')
            
            # Model details
            total_params = model.count_params()
            n_layers = len(model.layers)
            
            info_text = f"""
            Total Parameters: {total_params:,}
            Number of Layers: {n_layers}
            
            Input Shape(s): {model.input_shape}
            Output Shape(s): {model.output_shape}
            """
            
            axes[idx].text(0.1, 0.3, info_text, fontsize=10, family='monospace')
            axes[idx].set_xlim(0, 1)
            axes[idx].set_ylim(0, 1)
            axes[idx].axis('off')
        
        plt.suptitle('Functional API Model Architectures', fontsize=14, y=1.02)
        plt.tight_layout()
        plt.show()
        
        # Compile and show multi-output model summary
        print("\nMulti-Output Model Summary:")
        print("-" * 60)
        multi_output_model.compile(
            optimizer='adam',
            loss={'classification': 'categorical_crossentropy',
                  'regression': 'mse'},
            loss_weights={'classification': 1.0, 'regression': 0.5},
            metrics={'classification': 'accuracy', 'regression': 'mae'}
        )
        multi_output_model.summary()
        
        return multi_input_model, multi_output_model, residual_model

# Functional API examples
functional_examples = FunctionalAPIExamples()

print("\n" + "="*60)
print("FUNCTIONAL API DEMONSTRATION")
print("="*60)

multi_input, multi_output, residual = functional_examples.demonstrate_functional_api()
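
The multi-output model is compiled with per-output losses but never trained in this lesson. As a hedged sketch of how training could look, the example below builds a smaller stand-in with the same output names ('classification' and 'regression') and passes the targets as a dictionary keyed by those names. The data is synthetic and the layer sizes are assumptions chosen for brevity.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Compact stand-in for the multi-output architecture (smaller for brevity)
inputs = layers.Input(shape=(20,), name='input')
shared = layers.Dense(32, activation='relu')(inputs)
class_output = layers.Dense(3, activation='softmax', name='classification')(shared)
reg_output = layers.Dense(1, name='regression')(shared)
model = keras.Model(inputs=inputs, outputs=[class_output, reg_output])

model.compile(
    optimizer='adam',
    loss={'classification': 'categorical_crossentropy', 'regression': 'mse'},
    loss_weights={'classification': 1.0, 'regression': 0.5},
)

# Synthetic data, made up purely for illustration
rng = np.random.default_rng(42)
X = rng.normal(size=(256, 20)).astype('float32')
y_class = keras.utils.to_categorical(rng.integers(0, 3, size=256), num_classes=3)
y_reg = X[:, :1]  # toy regression target: the first feature, shape (256, 1)

# Targets are matched to the outputs by layer name
history = model.fit(
    X, {'classification': y_class, 'regression': y_reg},
    epochs=2, batch_size=32, verbose=0
)

# predict() returns one array per output, in the order given to keras.Model
class_pred, reg_pred = model.predict(X[:5], verbose=0)
print(class_pred.shape, reg_pred.shape)
```

The total loss reported in history.history['loss'] is the weighted sum of the two output losses, using the weights given in loss_weights.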

Custom Layers and Models

class CustomComponents:
    """Custom layers, losses, and metrics"""
    
    def __init__(self):
        self.custom_objects = {}
        
    class CustomDenseLayer(layers.Layer):
        """Custom dense layer with L2 regularization"""
        
        def __init__(self, units, activation=None, l2_reg=0.01, **kwargs):
            super().__init__(**kwargs)
            self.units = units
            self.activation = keras.activations.get(activation)
            self.l2_reg = l2_reg
            
        def build(self, input_shape):
            # Create weights
            self.w = self.add_weight(
                shape=(input_shape[-1], self.units),
                initializer='glorot_uniform',
                trainable=True,
                name='kernel',
                regularizer=keras.regularizers.l2(self.l2_reg)
            )
            self.b = self.add_weight(
                shape=(self.units,),
                initializer='zeros',
                trainable=True,
                name='bias'
            )
            
        def call(self, inputs):
            output = tf.matmul(inputs, self.w) + self.b
            if self.activation:
                output = self.activation(output)
            return output
        
        def get_config(self):
            config = super().get_config()
            config.update({
                'units': self.units,
                'activation': keras.activations.serialize(self.activation),
                'l2_reg': self.l2_reg
            })
            return config
    
    class CustomModel(keras.Model):
        """Custom model using subclassing"""
        
        def __init__(self, num_classes=10):
            super().__init__()
            self.num_classes = num_classes
            
            # Define layers
            self.dense1 = layers.Dense(128, activation='relu')
            self.dropout1 = layers.Dropout(0.3)
            self.dense2 = layers.Dense(64, activation='relu')
            self.dropout2 = layers.Dropout(0.3)
            self.classifier = layers.Dense(num_classes, activation='softmax')
            
        def call(self, inputs, training=False):
            x = self.dense1(inputs)
            x = self.dropout1(x, training=training)
            x = self.dense2(x)
            x = self.dropout2(x, training=training)
            return self.classifier(x)
    
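    def custom_loss_function(self, y_true=None, y_pred=None):
        """Return a focal loss function for imbalanced binary classification.
        
        This method is a factory: its arguments are ignored, and the returned
        focal_loss(y_true, y_pred) is what gets passed to model.compile().
        """
        
        def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25):
            # Flatten and cast so (batch,) labels align with (batch, 1) predictions
            y_pred = tf.reshape(y_pred, [-1])
            y_true = tf.reshape(tf.cast(y_true, y_pred.dtype), [-1])
            
            # Clip predictions to prevent log(0)
            epsilon = tf.keras.backend.epsilon()
            y_pred = tf.clip_by_value(y_pred, epsilon, 1. - epsilon)
            
            # p_t is the probability the model assigns to the true class
            p_t = tf.where(tf.equal(y_true, 1), y_pred, 1 - y_pred)
            alpha_factor = tf.ones_like(y_true) * alpha
            alpha_t = tf.where(tf.equal(y_true, 1), alpha_factor, 1 - alpha_factor)
            cross_entropy = -tf.math.log(p_t)
            
            # Down-weight easy examples (p_t close to 1) by (1 - p_t)^gamma
            weight = alpha_t * tf.pow((1 - p_t), gamma)
            loss = weight * cross_entropy
            
            return tf.reduce_mean(loss)
        
        return focal_loss
    
    def custom_metric_f1_score(self, y_true=None, y_pred=None):
        """Return an F1 score metric function for binary classification.
        
        Like custom_loss_function, this is a factory whose arguments are ignored.
        Keras averages the metric over batches, so the reported value only
        approximates the F1 score of the full epoch.
        """
        
        def f1_score(y_true, y_pred):
            # Flatten, then threshold predictions at 0.5
            y_pred = tf.round(tf.reshape(y_pred, [-1]))
            y_true = tf.reshape(tf.cast(y_true, y_pred.dtype), [-1])
            
            # Calculate TP, FP, FN
            tp = tf.reduce_sum(y_true * y_pred)
            fp = tf.reduce_sum((1 - y_true) * y_pred)
            fn = tf.reduce_sum(y_true * (1 - y_pred))
            
            # Calculate precision and recall
            precision = tp / (tp + fp + tf.keras.backend.epsilon())
            recall = tp / (tp + fn + tf.keras.backend.epsilon())
            
            # Calculate F1
            f1 = 2 * precision * recall / (precision + recall + tf.keras.backend.epsilon())
            
            return f1
        
        return f1_score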
    
    def demonstrate_custom_components(self):
        """Demonstrate custom layers and losses"""
        
        # Create model with custom layer
        model = keras.Sequential([
            layers.Input(shape=(20,)),
            self.CustomDenseLayer(64, activation='relu', l2_reg=0.01),
            layers.Dropout(0.3),
            self.CustomDenseLayer(32, activation='relu', l2_reg=0.01),
            layers.Dense(1, activation='sigmoid')
        ])
        
        # Compile with custom loss and metric
        model.compile(
            optimizer='adam',
            loss=self.custom_loss_function(None, None),
            metrics=['accuracy', self.custom_metric_f1_score(None, None)]
        )
        
        print("\nModel with Custom Components:")
        print("-" * 60)
        model.summary()
        
        # Generate sample data
        X = np.random.randn(1000, 20)
        y = (X[:, 0] + X[:, 1] > 0).astype(float)
        
        # Train briefly
        history = model.fit(X, y, validation_split=0.2, epochs=5, verbose=0)
        
        print("\nTraining with custom components completed.")
        print(f"Final loss: {history.history['loss'][-1]:.4f}")
        print(f"Final accuracy: {history.history['accuracy'][-1]:.4f}")
        
        return model

# Custom components
custom_components = CustomComponents()

print("\n" + "="*60)
print("CUSTOM LAYERS AND MODELS")
print("="*60)

custom_model = custom_components.demonstrate_custom_components()
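
A quick way to gain confidence in a custom loss is to compare it with a known special case. The sketch below restates the lesson's binary focal loss as a standalone function (the name and the sample values are illustrative) and checks that with gamma=0 and alpha=0.5 it reduces to half of the ordinary binary cross-entropy, since the weight becomes a constant 0.5.

```python
import tensorflow as tf

def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25):
    """Standalone restatement of the lesson's binary focal loss."""
    y_true = tf.cast(y_true, y_pred.dtype)
    eps = 1e-7
    y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
    p_t = tf.where(tf.equal(y_true, 1), y_pred, 1 - y_pred)
    alpha_factor = tf.ones_like(y_true) * alpha
    alpha_t = tf.where(tf.equal(y_true, 1), alpha_factor, 1 - alpha_factor)
    weight = alpha_t * tf.pow(1 - p_t, gamma)
    return tf.reduce_mean(weight * -tf.math.log(p_t))

# Made-up labels and predicted probabilities
y_true = tf.constant([1.0, 0.0, 1.0, 0.0])
y_pred = tf.constant([0.9, 0.2, 0.3, 0.6])

# Plain binary cross-entropy computed by hand for comparison
p_t = tf.where(tf.equal(y_true, 1), y_pred, 1 - y_pred)
bce = tf.reduce_mean(-tf.math.log(p_t))

fl_reduced = focal_loss(y_true, y_pred, gamma=0.0, alpha=0.5)
fl_default = focal_loss(y_true, y_pred)

print(f"BCE:                        {float(bce):.4f}")
print(f"Focal (gamma=0, alpha=0.5): {float(fl_reduced):.4f}")
print(f"Focal (default settings):   {float(fl_default):.4f}")
```

With the default settings the focal loss is smaller than the cross-entropy because every example is scaled by a factor below one, and well-classified examples are scaled down the most.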

Callbacks and Training Control

class CallbacksAndTraining:
    """Advanced training techniques with callbacks"""
    
    def __init__(self):
        self.callbacks = []
        
    def create_callbacks(self, model_name='model'):
        """Create a comprehensive set of callbacks"""
        
        # Model checkpoint - save best model
        checkpoint = callbacks.ModelCheckpoint(
            filepath=f'{model_name}_best.h5',
            monitor='val_loss',
            save_best_only=True,
            save_weights_only=False,
            mode='min',
            verbose=1
        )
        
        # Early stopping
        early_stop = callbacks.EarlyStopping(
            monitor='val_loss',
            patience=15,
            restore_best_weights=True,
            verbose=1
        )
        
        # Reduce learning rate on plateau
        reduce_lr = callbacks.ReduceLROnPlateau(
            monitor='val_loss',
            factor=0.5,
            patience=5,
            min_lr=1e-6,
            verbose=1
        )
        
        # Custom callback for learning rate scheduling
        class CustomLRScheduler(callbacks.Callback):
            def __init__(self, initial_lr=0.001):
                super().__init__()
                self.initial_lr = initial_lr
                
            def on_epoch_begin(self, epoch, logs=None):
                if epoch > 0:
                    # Exponential decay
                    new_lr = self.initial_lr * np.exp(-0.1 * epoch)
                    # Assign directly to the optimizer's learning_rate variable
                    self.model.optimizer.learning_rate.assign(new_lr)
                    print(f'\nEpoch {epoch}: Learning rate = {new_lr:.6f}')
        
        # Custom callback for logging
        class CustomLogger(callbacks.Callback):
            def __init__(self):
                super().__init__()
                self.losses = []
                self.accuracies = []
                
            def on_epoch_end(self, epoch, logs=None):
                logs = logs or {}  # Guard against a missing logs dict
                self.losses.append(logs.get('loss'))
                self.accuracies.append(logs.get('accuracy'))
                
                # Print custom message every 10 epochs
                if epoch % 10 == 0 and epoch > 0:
                    print(f'\n[Custom Logger] Epoch {epoch}:')
                    print(f'  Average loss (last 10): {np.mean(self.losses[-10:]):.4f}')
                    if self.accuracies[-1] is not None:
                        print(f'  Average accuracy (last 10): {np.mean(self.accuracies[-10:]):.4f}')
        
        # TensorBoard callback
        tensorboard = callbacks.TensorBoard(
            log_dir='./logs',
            histogram_freq=1,
            write_graph=True,
            update_freq='epoch'
        )
        
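        # Note: ReduceLROnPlateau and CustomLRScheduler both set the learning rate.
        # The scheduler overwrites any plateau reduction at the start of the next
        # epoch, so keep only one of them in a real training run.
        return [checkpoint, early_stop, reduce_lr, CustomLRScheduler(), 
                CustomLogger(), tensorboard]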
    
    def demonstrate_training_strategies(self):
        """Demonstrate different training strategies"""
        
        # Generate dataset
        X, y = make_classification(n_samples=5000, n_features=20,
                                  n_informative=15, n_classes=2,
                                  random_state=42)
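        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=42
        )
        
        # Fit the scaler on the training split only to avoid data leakage
        scaler = StandardScaler()
        X_train = scaler.fit_transform(X_train)
        X_test = scaler.transform(X_test)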
        
        # Build model
        model = keras.Sequential([
            layers.Dense(64, activation='relu', input_shape=(20,)),
            layers.BatchNormalization(),
            layers.Dropout(0.3),
            layers.Dense(32, activation='relu'),
            layers.BatchNormalization(),
            layers.Dropout(0.2),
            layers.Dense(16, activation='relu'),
            layers.Dense(1, activation='sigmoid')
        ])
        
        # Different training strategies
        strategies = {
            'Standard': {
                'optimizer': 'adam',
                'loss': 'binary_crossentropy',
                'batch_size': 32
            },
            'Large Batch': {
                'optimizer': keras.optimizers.Adam(learning_rate=0.01),
                'loss': 'binary_crossentropy',
                'batch_size': 256
            },
            # SGD with momentum on small batches
            'Small Batch (SGD + Momentum)': {
                'optimizer': keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
                'loss': 'binary_crossentropy',
                'batch_size': 8
            }
        }
        
        results = {}
        
        fig, axes = plt.subplots(1, 3, figsize=(15, 5))
        
        for idx, (name, config) in enumerate(strategies.items()):
            # Clone model for fair comparison
            model_clone = keras.models.clone_model(model)
            
            # Compile
            model_clone.compile(
                optimizer=config['optimizer'],
                loss=config['loss'],
                metrics=['accuracy']
            )
            
            # Train
            history = model_clone.fit(
                X_train, y_train,
                validation_split=0.2,
                epochs=30,
                batch_size=config['batch_size'],
                verbose=0
            )
            
            # Evaluate
            test_loss, test_acc = model_clone.evaluate(X_test, y_test, verbose=0)
            results[name] = {'loss': test_loss, 'accuracy': test_acc}
            
            # Plot learning curves
            axes[idx].plot(history.history['loss'], label='Train Loss', alpha=0.7)
            axes[idx].plot(history.history['val_loss'], label='Val Loss', alpha=0.7)
            axes[idx].set_xlabel('Epoch')
            axes[idx].set_ylabel('Loss')
            axes[idx].set_title(f'{name}\nBatch Size: {config["batch_size"]}\n'
                               f'Test Acc: {test_acc:.3f}')
            axes[idx].legend()
            axes[idx].grid(True, alpha=0.3)
        
        plt.suptitle('Training Strategy Comparison', fontsize=14, y=1.02)
        plt.tight_layout()
        plt.show()
        
        return results

# Callbacks and training
training_control = CallbacksAndTraining()

print("\n" + "="*60)
print("CALLBACKS AND TRAINING STRATEGIES")
print("="*60)

# Create callbacks
callback_list = training_control.create_callbacks('my_model')
print(f"\nCreated {len(callback_list)} callbacks")

# Demonstrate training strategies
print("\nComparing training strategies...")
training_results = training_control.demonstrate_training_strategies()

print("\nTraining Results:")
for name, metrics in training_results.items():
    print(f"  {name}: Loss={metrics['loss']:.4f}, Accuracy={metrics['accuracy']:.4f}")
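
The exponential decay implemented by the custom scheduler callback can also be expressed with the built-in LearningRateScheduler callback, which calls a schedule function at the start of each epoch. This is a minimal sketch: the constant INITIAL_LR, the function name, and the tiny model and data are assumptions made for illustration.

```python
import math
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

INITIAL_LR = 0.001  # assumed starting rate, same as the custom callback's default

def exponential_decay(epoch, lr):
    """Return INITIAL_LR * exp(-0.1 * epoch); the current lr argument is unused."""
    return float(INITIAL_LR * math.exp(-0.1 * epoch))

lr_callback = keras.callbacks.LearningRateScheduler(exponential_decay)

# Tiny model and synthetic data, only to show the callback being attached
model = keras.Sequential([
    layers.Input(shape=(4,)),
    layers.Dense(8, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=INITIAL_LR),
              loss='binary_crossentropy')

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4)).astype('float32')
y = (X[:, 0] > 0).astype('float32')

history = model.fit(X, y, epochs=3, batch_size=16,
                    callbacks=[lr_callback], verbose=0)

for epoch in range(3):
    print(f"epoch {epoch}: scheduled lr = {exponential_decay(epoch, None):.6f}")
```

Using the built-in callback avoids touching optimizer internals directly, which tends to make the code less sensitive to Keras version changes.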

Model Saving and Loading

class ModelPersistence:
    """Saving and loading models"""
    
    def demonstrate_model_saving(self):
        """Different ways to save and load models"""
        
        # Create a simple model
        model = keras.Sequential([
            layers.Dense(64, activation='relu', input_shape=(10,)),
            layers.Dense(32, activation='relu'),
            layers.Dense(1, activation='sigmoid')
        ])
        
        model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
        
        # Generate dummy data
        X = np.random.randn(100, 10)
        y = (X[:, 0] > 0).astype(float)
        
        # Train briefly
        model.fit(X, y, epochs=2, verbose=0)
        
        print("\nModel Saving Methods:")
        print("-" * 60)
        
        # Method 1: Save entire model (architecture + weights + training config)
        model.save('complete_model.h5')
        print("✓ Saved complete model to 'complete_model.h5'")
        
        # Method 2: Save only weights
        # (Keras 3 requires the '.weights.h5' suffix; earlier versions accept it too)
        model.save_weights('model.weights.h5')
        print("✓ Saved weights to 'model.weights.h5'")
        
        # Method 3: Save as SavedModel format (TensorFlow native)
        # Note: this call works with the Keras 2 API (TensorFlow 2.15 and earlier).
        # Keras 3 expects a '.keras' or '.h5' extension in model.save() and
        # writes SavedModel directories through model.export() instead.
        model.save('saved_model_dir')
        print("✓ Saved in SavedModel format to 'saved_model_dir/'")
        
        # Method 4: Save architecture only (as JSON)
        model_json = model.to_json()
        with open('model_architecture.json', 'w') as f:
            f.write(model_json)
        print("✓ Saved architecture to 'model_architecture.json'")
        
        print("\nModel Loading Methods:")
        print("-" * 60)
        
        # Load complete model
        loaded_model = keras.models.load_model('complete_model.h5')
        print("✓ Loaded complete model")
        
        # Load weights only (need to recreate architecture first)
        new_model = keras.Sequential([
            layers.Dense(64, activation='relu', input_shape=(10,)),
            layers.Dense(32, activation='relu'),
            layers.Dense(1, activation='sigmoid')
        ])
        new_model.load_weights('model.weights.h5')
        print("✓ Loaded weights into new model")
        
        # Load SavedModel (Keras 2 API; Keras 3 load_model() does not read
        # SavedModel directories)
        saved_model = keras.models.load_model('saved_model_dir')
        print("✓ Loaded SavedModel")
        
        # Verify loaded model works
        test_prediction = loaded_model.predict(X[:5], verbose=0)
        print(f"\nTest prediction shape: {test_prediction.shape}")
        
        return model, loaded_model

# Model persistence
persistence = ModelPersistence()

print("\n" + "="*60)
print("MODEL SAVING AND LOADING")
print("="*60)

original_model, loaded_model = persistence.demonstrate_model_saving()
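
After saving and reloading, it is worth confirming that the restored model reproduces the original predictions. The sketch below does this round trip in a temporary directory using the native '.keras' format, which assumes a reasonably recent TensorFlow release; the file name and the tiny model are hypothetical.

```python
import os
import tempfile
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(10,)),
    layers.Dense(16, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

# Made-up inputs used only to compare predictions
X = np.random.default_rng(0).normal(size=(8, 10)).astype('float32')
before = model.predict(X, verbose=0)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'roundtrip_model.keras')  # hypothetical file name
    model.save(path)
    restored = keras.models.load_model(path)
    after = restored.predict(X, verbose=0)

print("Predictions match:", np.allclose(before, after, atol=1e-6))
```

The same check applies to the other formats: load the model, predict on a fixed batch, and compare against predictions made before saving.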

Best Practices and Tips

print("\n" + "="*60)
print("TENSORFLOW/KERAS BEST PRACTICES")
print("="*60)

best_practices = """
KEY GUIDELINES:

1. DATA PREPARATION:
   • Normalize/standardize inputs
   • Handle class imbalance (class_weight, oversampling)
   • Use data augmentation for small datasets
   • Shuffle training data
   • Use proper train/val/test splits

2. MODEL ARCHITECTURE:
   • Start simple, increase complexity gradually
   • Use BatchNormalization for deep networks
   • Add Dropout for regularization (0.2-0.5)
   • Consider skip connections for very deep models
   • Match output activation to problem type

3. COMPILATION:
   • Binary classification: sigmoid + binary_crossentropy
   • Multi-class: softmax + categorical_crossentropy (sparse_categorical_crossentropy for integer labels)
   • Regression: no activation + mse/mae
   • Use appropriate metrics for evaluation

4. TRAINING:
   • Start with Adam optimizer (good default)
   • Use callbacks (EarlyStopping, ReduceLROnPlateau)
   • Monitor both training and validation metrics
   • Save best model during training
   • Use appropriate batch size (32-512)

5. REGULARIZATION:
   • L1/L2 weight regularization
   • Dropout layers (not in output layer)
   • Data augmentation
   • Early stopping
   • Batch normalization

6. DEBUGGING TIPS:
   • Start with small subset of data
   • Verify model can overfit single batch
   • Check for NaN/Inf in gradients
   • Monitor layer outputs
   • Use TensorBoard for visualization

7. PERFORMANCE OPTIMIZATION:
   • Use tf.data for data pipelines
   • Enable mixed precision training
   • Use GPU when available
   • Consider model pruning/quantization
   • Batch inference for predictions

8. COMMON PITFALLS:
   ✗ Forgetting to normalize data
   ✗ Wrong loss function for task
   ✗ Learning rate too high/low
   ✗ Not using validation set
   ✗ Overfitting to validation set
   ✗ Data leakage
"""

print(best_practices)
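
One of the debugging tips above is to verify that a model can overfit a single batch. A rough sketch of that check follows, assuming a small dense model and 16 samples with random labels (all sizes here are illustrative). If the training loss does not fall substantially on such a tiny batch, the model, loss, or data wiring likely has a problem.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

tf.random.set_seed(0)
rng = np.random.default_rng(0)

# A single small batch with random labels: the model has to memorize it
X_batch = rng.normal(size=(16, 20)).astype('float32')
y_batch = rng.integers(0, 2, size=(16, 1)).astype('float32')

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01),
              loss='binary_crossentropy')

# Train repeatedly on the same batch (one gradient step per epoch)
history = model.fit(X_batch, y_batch, epochs=100, batch_size=16, verbose=0)

initial_loss = history.history['loss'][0]
final_loss = history.history['loss'][-1]
print(f"Loss on the single batch: {initial_loss:.4f} -> {final_loss:.4f}")
```

Regularization such as Dropout is left out on purpose here, since it works against memorization and can hide whether the basic training path is sound.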

# GPU optimization tips
gpu_tips = """
GPU OPTIMIZATION:

# Check GPU availability
print(tf.config.list_physical_devices('GPU'))

# Set memory growth (must run before any GPU has been initialized)
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)

# Mixed precision training
policy = tf.keras.mixed_precision.Policy('mixed_float16')
tf.keras.mixed_precision.set_global_policy(policy)

# Prefetch and cache data
dataset = dataset.cache().prefetch(tf.data.AUTOTUNE)
"""

print("\nGPU Optimization:")
print(gpu_tips)
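
The tips above refer to a `dataset` object without showing how it is built. As a minimal sketch, a tf.data pipeline over in-memory NumPy arrays could look like the following; the synthetic data, buffer size, and batch size are assumptions for illustration.

```python
import numpy as np
import tensorflow as tf

# Synthetic features and binary labels, made up for illustration
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20)).astype('float32')
y = (X[:, 0] + X[:, 1] > 0).astype('float32')

dataset = (
    tf.data.Dataset.from_tensor_slices((X, y))
    .cache()                      # keep elements in memory after the first pass
    .shuffle(buffer_size=1000)    # reshuffle on every epoch
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)   # overlap data preparation with training
)

features, labels = next(iter(dataset))
print(features.shape, labels.shape)
```

A dataset built this way can be passed directly to model.fit(dataset, epochs=...), in which case batch_size is not given to fit() because batching already happens in the pipeline.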

Practice Exercises

Exercise 1: Build a CNN for Image Classification

Create a convolutional neural network:

Use Conv2D, MaxPooling2D layers
Implement data augmentation
Use transfer learning with pre-trained model
Fine-tune specific layers
Evaluate on test dataset

Exercise 2: Build an LSTM for Time Series

Create a recurrent neural network:

Preprocess sequence data
Build LSTM/GRU architecture
Handle variable-length sequences
Implement attention mechanism
Forecast future values

Exercise 3: Custom Training Loop

Implement custom training with GradientTape:

Define custom training step
Implement gradient accumulation
Add custom metrics
Create custom learning rate schedule
Log with TensorBoard
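
As a possible starting point for this exercise, the sketch below shows one way to combine a GradientTape training step with gradient accumulation: gradients from several small batches are summed and applied in a single optimizer update. The constant ACCUM_STEPS, the model, and the synthetic data are assumptions; custom metrics, learning rate schedules, and TensorBoard logging are left for the exercise.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

ACCUM_STEPS = 4  # hypothetical: 4 micro-batches of 8 act like one batch of 32

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
optimizer = keras.optimizers.Adam(learning_rate=0.001)
loss_fn = keras.losses.BinaryCrossentropy()

# Synthetic data, made up for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 20)).astype('float32')
y = (X[:, 0] + X[:, 1] > 0).astype('float32').reshape(-1, 1)
dataset = tf.data.Dataset.from_tensor_slices((X, y)).batch(8)

accumulated = None   # running sum of gradients across micro-batches
updates_applied = 0

for step, (x_batch, y_batch) in enumerate(dataset, start=1):
    with tf.GradientTape() as tape:
        preds = model(x_batch, training=True)
        # Divide so the summed gradients equal the mean over the larger batch
        loss = loss_fn(y_batch, preds) / ACCUM_STEPS
    grads = tape.gradient(loss, model.trainable_variables)

    if accumulated is None:
        accumulated = grads
    else:
        accumulated = [acc + g for acc, g in zip(accumulated, grads)]

    # Apply one optimizer update every ACCUM_STEPS micro-batches
    if step % ACCUM_STEPS == 0:
        optimizer.apply_gradients(zip(accumulated, model.trainable_variables))
        accumulated = None
        updates_applied += 1

print(f"Optimizer updates applied: {updates_applied}")
```

With 256 samples in micro-batches of 8, the loop sees 32 micro-batches and therefore applies 8 updates per pass over the data.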

Summary and Key Takeaways

🎯 Key Points to Remember

Sequential API: Perfect for linear stack of layers
Functional API: Flexible for complex architectures
Model Subclassing: Full control and customization
Callbacks: Essential for training control and monitoring
Compilation: Match optimizer, loss, and metrics to your task
Regularization: Dropout, BatchNorm, L1/L2 prevent overfitting
Save Models: Multiple formats for different use cases
Start Simple: Begin with basic architecture, iterate and improve