05: Advanced Techniques & Applications
"The best way to predict the future is to create it." - Peter Drucker
Welcome to the advanced techniques and practical applications of time series forecasting! This notebook covers ensemble methods, uncertainty quantification, and real-world deployment considerations.
Learning Objectives
By the end of this notebook, you'll be able to:
Implement ensemble forecasting methods
Quantify and visualize prediction uncertainty
Handle multivariate time series forecasting
Deploy models for production use
Monitor and maintain forecasting systems
Apply time series methods to real-world problems
Ensemble Methods
Ensemble forecasting combines multiple models to improve accuracy and robustness.
Why Ensembles Work:
Bias-Variance Tradeoff: Different models have different strengths
Error Diversity: Models make different mistakes
Robustness: Less sensitive to individual model failures
Uncertainty: Better uncertainty quantification
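The variance-reduction argument above can be seen in a tiny simulation: five hypothetical forecasters that each add independent noise to the same signal. Averaging them cuts squared error roughly by the number of models. This is an idealized sketch; real models have correlated errors, so the gains are smaller in practice.

```python
import numpy as np

rng = np.random.default_rng(42)
truth = np.sin(np.linspace(0, 10, 200))  # the signal we are forecasting

# Five "models": truth plus independent noise (i.e. diverse error patterns)
forecasts = [truth + rng.normal(0, 0.5, truth.shape) for _ in range(5)]

# Average MSE of the individual models vs. MSE of their simple average
individual_mse = np.mean([np.mean((f - truth) ** 2) for f in forecasts])
ensemble = np.mean(forecasts, axis=0)
ensemble_mse = np.mean((ensemble - truth) ** 2)

print(f"avg individual MSE: {individual_mse:.3f}")
print(f"ensemble MSE:       {ensemble_mse:.3f}")  # roughly 1/5 of the individual MSE
```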
Ensemble Types:
Simple Averaging: Equal weights for all models
Weighted Averaging: Performance-based weights
Stacking: Meta-model learns optimal combination
Bagging: Bootstrap aggregation
Boosting: Sequential model improvement
Time Series Specific:
Temporal validation: Respect time ordering
Rolling ensembles: Update with new data
Online learning: Continuous model adaptation
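As a concrete instance of the weighted-averaging idea above, weights can be set inversely proportional to each model's validation error so that better models contribute more. A minimal sketch; the MAE numbers below are made-up placeholders, not outputs of any model in this notebook:

```python
import numpy as np

# Hypothetical validation MAEs for three models (placeholders for real scores)
val_mae = {'arima': 2.0, 'prophet': 1.0, 'rf': 4.0}

# Inverse-error weights, normalized to sum to 1
inv = {name: 1.0 / mae for name, mae in val_mae.items()}
total = sum(inv.values())
weights = {name: w / total for name, w in inv.items()}

# Combine per-model forecasts with those weights
preds = {
    'arima':   np.array([10.0, 11.0]),
    'prophet': np.array([9.0, 10.0]),
    'rf':      np.array([12.0, 13.0]),
}
weighted = sum(weights[name] * preds[name] for name in preds)
print(weights)   # prophet (lowest MAE) gets the largest weight
print(weighted)
```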
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.tsa.arima.model import ARIMA
from prophet import Prophet
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error
import torch
import torch.nn as nn
import warnings
warnings.filterwarnings('ignore')
# Set random seeds
np.random.seed(42)
torch.manual_seed(42)
# Set up plotting
plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['font.size'] = 12
print("Ensemble Forecasting Libraries Loaded!")
def generate_ensemble_data(n_days=500, noise_level=0.1):
"""Generate time series data for ensemble testing"""
dates = pd.date_range('2020-01-01', periods=n_days, freq='D')
t = np.arange(n_days)
# Complex trend with changes
trend = 0.002 * t + 0.000005 * t**2
trend_changes = np.zeros(n_days)
trend_changes[150:250] += 5 # Positive shock
trend_changes[350:400] -= 8 # Negative shock
# Multiple seasonal patterns
weekly = 3 * np.sin(2 * np.pi * t / 7)
monthly = 4 * np.sin(2 * np.pi * t / 30.44)
quarterly = 2 * np.sin(2 * np.pi * t / 91.25)
# External factors
external = np.random.choice([-3, -1, 0, 1, 3], n_days, p=[0.1, 0.2, 0.4, 0.2, 0.1])
external_smooth = np.convolve(external, np.ones(14)/14, mode='same')
# Combine components
y = (trend + trend_changes + weekly + monthly + quarterly + external_smooth)
# Add noise
noise = np.random.normal(0, noise_level * np.std(y), n_days)
y += noise
# Create DataFrame
df = pd.DataFrame({
'ds': dates,
'y': y,
'trend': trend,
'weekly': weekly,
'monthly': monthly,
'quarterly': quarterly,
'external': external_smooth,
'noise': noise
})
return df
# Generate ensemble data
ensemble_data = generate_ensemble_data()
print(f"Generated {len(ensemble_data)} days of ensemble test data")
print(f"Date range: {ensemble_data['ds'].min()} to {ensemble_data['ds'].max()}")
print(f"Value range: {ensemble_data['y'].min():.2f} to {ensemble_data['y'].max():.2f}")
# Plot the data
plt.figure(figsize=(15, 8))
plt.plot(ensemble_data['ds'], ensemble_data['y'], 'b-', linewidth=1.5, alpha=0.8)
plt.title('Time Series Data for Ensemble Testing')
plt.xlabel('Date')
plt.ylabel('Value')
plt.grid(True, alpha=0.3)
plt.show()
# Show data structure
print("\nData Structure:")
print(ensemble_data.head())
print("\nData Info:")
print(ensemble_data.info())
class TimeSeriesEnsemble:
"""Ensemble forecasting class combining multiple models"""
def __init__(self, models=None):
if models is None:
self.models = {
'arima': None,
'prophet': None,
'rf': None,
'linear': None,
'lstm': None
}
else:
self.models = models
self.scaler = MinMaxScaler()
self.is_trained = False
def _prepare_features(self, data, lookback=30):
"""Prepare features for ML models"""
df = data.copy()
# Lag features
for i in range(1, lookback + 1):
df[f'lag_{i}'] = df['y'].shift(i)
# Rolling statistics
df['rolling_mean_7'] = df['y'].rolling(7).mean()
df['rolling_std_7'] = df['y'].rolling(7).std()
df['rolling_mean_30'] = df['y'].rolling(30).mean()
df['rolling_std_30'] = df['y'].rolling(30).std()
# Time features
df['day_of_week'] = df['ds'].dt.dayofweek
df['month'] = df['ds'].dt.month
df['quarter'] = df['ds'].dt.quarter
# Drop NaN values
df = df.dropna()
return df
def fit(self, train_data):
"""Fit all models in the ensemble"""
print("Training ensemble models...")
# Prepare features for ML models
train_features = self._prepare_features(train_data)
X = train_features.drop(['ds', 'y'], axis=1)
y = train_features['y']
# Scale features
X_scaled = self.scaler.fit_transform(X)
# Train ARIMA
print(" Training ARIMA...")
try:
arima_model = ARIMA(train_data['y'], order=(5, 1, 2))
self.models['arima'] = arima_model.fit()
except Exception:
print(" ARIMA training failed, skipping")
self.models['arima'] = None
# Train Prophet
print(" Training Prophet...")
try:
prophet_df = train_data[['ds', 'y']].copy()
prophet_model = Prophet(
changepoint_prior_scale=0.05,
seasonality_mode='additive',
daily_seasonality=False
)
prophet_model.fit(prophet_df)
self.models['prophet'] = prophet_model
except Exception:
print(" Prophet training failed, skipping")
self.models['prophet'] = None
# Train Random Forest
print(" Training Random Forest...")
try:
rf_model = RandomForestRegressor(
n_estimators=100,
max_depth=10,
random_state=42
)
rf_model.fit(X_scaled, y)
self.models['rf'] = rf_model
except Exception:
print(" Random Forest training failed, skipping")
self.models['rf'] = None
# Train Linear Regression
print(" Training Linear Regression...")
try:
linear_model = LinearRegression()
linear_model.fit(X_scaled, y)
self.models['linear'] = linear_model
except Exception:
print(" Linear Regression training failed, skipping")
self.models['linear'] = None
# Placeholder for LSTM (would need proper implementation)
self.models['lstm'] = None
self.is_trained = True
print("Ensemble training completed!")
def predict(self, test_data, forecast_horizon=1):
"""Generate ensemble predictions"""
if not self.is_trained:
raise ValueError("Ensemble must be trained before prediction")
predictions = {}
# ARIMA predictions
if self.models['arima'] is not None:
try:
# Convert to a plain array so positions align with the other models
arima_pred = np.asarray(self.models['arima'].forecast(steps=len(test_data)))
predictions['arima'] = arima_pred
except Exception:
predictions['arima'] = np.full(len(test_data), np.nan)
# Prophet predictions
if self.models['prophet'] is not None:
try:
future = self.models['prophet'].make_future_dataframe(periods=len(test_data), freq='D')
prophet_forecast = self.models['prophet'].predict(future)
prophet_pred = prophet_forecast['yhat'].tail(len(test_data)).values
predictions['prophet'] = prophet_pred
except Exception:
predictions['prophet'] = np.full(len(test_data), np.nan)
# ML model predictions (lag/rolling features drop the first rows, so pad
# the front with NaN to keep every prediction the same length as test_data)
test_features = self._prepare_features(test_data)
n_missing = len(test_data) - len(test_features)
if len(test_features) > 0:
X_test = test_features.drop(['ds', 'y'], axis=1)
X_test_scaled = self.scaler.transform(X_test)
# Random Forest
if self.models['rf'] is not None:
try:
rf_pred = self.models['rf'].predict(X_test_scaled)
predictions['rf'] = np.concatenate([np.full(n_missing, np.nan), rf_pred])
except Exception:
predictions['rf'] = np.full(len(test_data), np.nan)
# Linear Regression
if self.models['linear'] is not None:
try:
linear_pred = self.models['linear'].predict(X_test_scaled)
predictions['linear'] = np.concatenate([np.full(n_missing, np.nan), linear_pred])
except Exception:
predictions['linear'] = np.full(len(test_data), np.nan)
# LSTM placeholder
predictions['lstm'] = np.full(len(test_data), np.nan)
return predictions
def ensemble_predict(self, predictions, method='simple_average'):
"""Combine predictions using ensemble method"""
# Stack predictions; NaN entries (failed models or padded rows) are
# skipped row by row instead of dropping entire model columns
pred_df = pd.DataFrame(predictions)
if method == 'simple_average':
# Simple average of the predictions available for each row
ensemble_pred = pred_df.mean(axis=1, skipna=True)
elif method == 'median':
# Median of the predictions available for each row
ensemble_pred = pred_df.median(axis=1, skipna=True)
else:
raise ValueError(f"Unknown ensemble method: {method}")
return ensemble_pred.values
# Split data for ensemble testing
train_size = int(len(ensemble_data) * 0.7)
train_data = ensemble_data[:train_size]
test_data = ensemble_data[train_size:]
print(f"Train set: {len(train_data)} samples")
print(f"Test set: {len(test_data)} samples")
# Create and train ensemble
ensemble = TimeSeriesEnsemble()
ensemble.fit(train_data)
# Generate predictions
individual_predictions = ensemble.predict(test_data)
ensemble_pred = ensemble.ensemble_predict(individual_predictions, method='simple_average')
# Calculate accuracies
accuracies = {}
for model_name, preds in individual_predictions.items():
if not np.isnan(preds).all():
valid_idx = ~np.isnan(preds)
if np.sum(valid_idx) > 0:
mae = mean_absolute_error(test_data['y'].values[valid_idx], preds[valid_idx])
accuracies[model_name] = mae
# Ensemble accuracy
ensemble_mae = mean_absolute_error(test_data['y'], ensemble_pred)
accuracies['ensemble'] = ensemble_mae
print("\nModel Accuracies (MAE):")
for model, mae in accuracies.items():
print(f" {model}: {mae:.4f}")
# Plot ensemble results
plt.figure(figsize=(15, 8))
# Plot actual data
plt.plot(test_data['ds'], test_data['y'], 'k-', linewidth=2, label='Actual', alpha=0.8)
# Plot individual predictions
colors = ['tab:blue', 'tab:green', 'tab:orange', 'tab:purple', 'tab:brown']  # reserve red for the ensemble line
for i, (model_name, preds) in enumerate(individual_predictions.items()):
if not np.isnan(preds).all():
plt.plot(test_data['ds'], preds, color=colors[i % len(colors)],
linewidth=1.5, alpha=0.7, label=f'{model_name.upper()}')
# Plot ensemble prediction
plt.plot(test_data['ds'], ensemble_pred, 'r-', linewidth=3, label='Ensemble')
plt.title('Ensemble Forecasting Results')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Uncertainty Quantification
Uncertainty quantification is crucial for reliable forecasting and decision making.
Types of Uncertainty:
Aleatoric: Inherent randomness in data (irreducible)
Epistemic: Model uncertainty (reducible with more data/knowledge)
Prediction intervals: Range of likely future values
Confidence intervals: Uncertainty in parameter estimates
Quantification Methods:
Bootstrap: Resampling-based uncertainty
Bayesian: Probabilistic parameter estimation
Ensemble: Model disagreement as uncertainty
Quantile regression: Direct quantile prediction
Conformal prediction: Distribution-free uncertainty
Applications:
Risk assessment: Decision making under uncertainty
Inventory management: Safety stock calculations
Financial planning: Portfolio optimization
Supply chain: Demand planning with uncertainty
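Of the methods listed above, conformal prediction is the simplest to sketch: calibrate an interval half-width from held-out residuals so that the interval covers the target fraction of future points. Note that the standard split-conformal guarantee assumes exchangeable data, which time series violate; in practice the calibration residuals should come from a temporally held-out window. A toy version on synthetic Gaussian data:

```python
import numpy as np

rng = np.random.default_rng(1)

# A "forecaster" that always predicts 0, calibrated against N(0, 1) actuals
y_cal = rng.normal(0, 1, 500)
resid = np.abs(y_cal - 0.0)

# Split conformal: half-width is the adjusted (1 - alpha) quantile of residuals
alpha = 0.05
n = len(resid)
q = np.quantile(resid, np.ceil((n + 1) * (1 - alpha)) / n)

# Check empirical coverage on fresh data from the same distribution
y_new = rng.normal(0, 1, 2000)
covered = np.mean(np.abs(y_new) <= q)
print(f"half-width: {q:.2f}")        # near the N(0,1) 97.5th percentile (~1.96)
print(f"coverage:   {covered:.3f}")  # near the 0.95 target
```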
class UncertaintyQuantifier:
"""Class for quantifying prediction uncertainty"""
def __init__(self, n_bootstraps=100, confidence_level=0.95):
self.n_bootstraps = n_bootstraps
self.confidence_level = confidence_level
self.lower_percentile = (1 - confidence_level) / 2 * 100
self.upper_percentile = (1 + confidence_level) / 2 * 100
def bootstrap_uncertainty(self, model_func, data, n_forecasts):
"""Calculate uncertainty using bootstrapping"""
predictions = []
for i in range(self.n_bootstraps):
# Bootstrap sample (note: i.i.d. row resampling ignores temporal order;
# a block or residual bootstrap is preferable for real time series)
bootstrap_idx = np.random.choice(len(data), len(data), replace=True)
bootstrap_sample = data.iloc[bootstrap_idx]
# Fit model and predict
try:
pred = model_func(bootstrap_sample, n_forecasts)
predictions.append(pred)
except Exception:
continue
if len(predictions) == 0:
return None, None, None
predictions = np.array(predictions)
# Calculate statistics
mean_pred = np.mean(predictions, axis=0)
lower_bound = np.percentile(predictions, self.lower_percentile, axis=0)
upper_bound = np.percentile(predictions, self.upper_percentile, axis=0)
return mean_pred, lower_bound, upper_bound
def ensemble_uncertainty(self, predictions_dict):
"""Calculate uncertainty from ensemble predictions"""
# Stack predictions, skipping models that produced no output at all
pred_list = [np.asarray(pred) for pred in predictions_dict.values()
if not np.isnan(pred).all()]
if len(pred_list) == 0:
return None, None, None
pred_array = np.column_stack(pred_list)
# NaN-aware statistics, since some models may be missing early rows
mean_pred = np.nanmean(pred_array, axis=1)
std_pred = np.nanstd(pred_array, axis=1)
# Use a normal approximation for the uncertainty bounds
lower_bound = mean_pred - 1.96 * std_pred
upper_bound = mean_pred + 1.96 * std_pred
return mean_pred, lower_bound, upper_bound
def prediction_interval_coverage(self, actuals, lower_bounds, upper_bounds):
"""Calculate prediction interval coverage"""
coverage = np.mean((actuals >= lower_bounds) & (actuals <= upper_bounds))
return coverage
def interval_width(self, lower_bounds, upper_bounds):
"""Calculate average prediction interval width"""
width = np.mean(upper_bounds - lower_bounds)
return width
# Initialize uncertainty quantifier
uq = UncertaintyQuantifier(n_bootstraps=50, confidence_level=0.95)
# Function for ARIMA bootstrap
def arima_bootstrap_func(data, n_forecasts):
try:
model = ARIMA(data['y'], order=(5, 1, 2))
fitted = model.fit()
forecast = fitted.forecast(steps=n_forecasts)
return forecast
except Exception:
return np.full(n_forecasts, np.nan)
# Calculate bootstrap uncertainty for ARIMA
print("Calculating bootstrap uncertainty for ARIMA...")
arima_mean, arima_lower, arima_upper = uq.bootstrap_uncertainty(
arima_bootstrap_func, train_data, len(test_data)
)
# Calculate ensemble uncertainty
print("Calculating ensemble uncertainty...")
ensemble_mean, ensemble_lower, ensemble_upper = uq.ensemble_uncertainty(individual_predictions)
# Calculate coverage and width
if arima_mean is not None:
arima_coverage = uq.prediction_interval_coverage(
test_data['y'].values, arima_lower, arima_upper
)
arima_width = uq.interval_width(arima_lower, arima_upper)
print(f"ARIMA Bootstrap - Coverage: {arima_coverage:.3f}, Width: {arima_width:.4f}")
if ensemble_mean is not None:
ensemble_coverage = uq.prediction_interval_coverage(
test_data['y'].values, ensemble_lower, ensemble_upper
)
ensemble_width = uq.interval_width(ensemble_lower, ensemble_upper)
print(f"Ensemble - Coverage: {ensemble_coverage:.3f}, Width: {ensemble_width:.4f}")
# Plot uncertainty comparison
plt.figure(figsize=(15, 10))
# Plot 1: Bootstrap uncertainty
plt.subplot(2, 1, 1)
plt.plot(test_data['ds'], test_data['y'], 'k-', linewidth=2, label='Actual')
if arima_mean is not None:
plt.plot(test_data['ds'], arima_mean, 'b-', linewidth=2, label='ARIMA Mean')
plt.fill_between(test_data['ds'], arima_lower, arima_upper,
color='blue', alpha=0.3, label='95% Bootstrap CI')
plt.title('Bootstrap Uncertainty Quantification (ARIMA)')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.grid(True, alpha=0.3)
# Plot 2: Ensemble uncertainty
plt.subplot(2, 1, 2)
plt.plot(test_data['ds'], test_data['y'], 'k-', linewidth=2, label='Actual')
if ensemble_mean is not None:
plt.plot(test_data['ds'], ensemble_mean, 'r-', linewidth=2, label='Ensemble Mean')
plt.fill_between(test_data['ds'], ensemble_lower, ensemble_upper,
color='red', alpha=0.3, label='95% Ensemble CI')
plt.title('Ensemble Uncertainty Quantification')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Uncertainty analysis
print("\n=== Uncertainty Analysis ===")
if arima_mean is not None:
print(f"ARIMA Bootstrap Intervals:")
print(f" Average width: {arima_width:.4f}")
print(f" Coverage: {arima_coverage:.1%}")
print(f" Target coverage: {uq.confidence_level:.1%}")
if ensemble_mean is not None:
print(f"\nEnsemble Intervals:")
print(f" Average width: {ensemble_width:.4f}")
print(f" Coverage: {ensemble_coverage:.1%}")
print(f" Target coverage: {uq.confidence_level:.1%}")
# Reliability diagram
if ensemble_mean is not None:
plt.figure(figsize=(8, 6))
# Calculate coverage by bins
errors = np.abs(test_data['y'].values - ensemble_mean)
uncertainty = (ensemble_upper - ensemble_lower) / 2
# Bin by uncertainty
bins = np.linspace(uncertainty.min(), uncertainty.max(), 10)
bin_centers = (bins[:-1] + bins[1:]) / 2
coverage_by_bin = []
for i in range(len(bins) - 1):
mask = (uncertainty >= bins[i]) & (uncertainty < bins[i+1])
if np.sum(mask) > 0:
coverage = np.mean(errors[mask] <= uncertainty[mask])
coverage_by_bin.append(coverage)
else:
coverage_by_bin.append(0)
plt.plot(bin_centers, coverage_by_bin, 'bo-', linewidth=2, markersize=8)
plt.plot([bins.min(), bins.max()], [0.95, 0.95], 'r--', label='Target Coverage (95%)')
plt.xlabel('Uncertainty (Interval Half-Width)')
plt.ylabel('Observed Coverage')
plt.title('Uncertainty Calibration (Reliability Diagram)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
Production Deployment
Deploying forecasting models to production requires careful consideration of scalability, reliability, and maintenance.
Deployment Architecture:
Batch processing: Periodic forecast generation
Real-time scoring: On-demand predictions
Model serving: REST APIs, microservices
Containerization: Docker, Kubernetes
Cloud platforms: AWS SageMaker, Azure ML, GCP AI Platform
Key Considerations:
Scalability: Handle varying loads
Latency: Response time requirements
Reliability: Fault tolerance and monitoring
Security: Data protection and access control
Cost: Computational resource optimization
MLOps Pipeline:
Data ingestion: Automated data pipelines
Model training: Scheduled retraining
Model validation: Automated testing
Model deployment: CI/CD pipelines
Monitoring: Performance tracking and alerting
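The monitoring step above can start very small: compare a rolling window of live forecast errors against the error observed at validation time and raise an alert when it degrades. A minimal sketch; the baseline MAE and the 2x threshold below are illustrative assumptions, not standards:

```python
from collections import deque

class DriftMonitor:
    """Alert when rolling forecast error exceeds a multiple of a baseline MAE."""

    def __init__(self, baseline_mae, window=30, threshold=2.0):
        self.baseline_mae = baseline_mae
        self.threshold = threshold
        self.errors = deque(maxlen=window)  # keep only the most recent errors

    def update(self, actual, predicted):
        self.errors.append(abs(actual - predicted))
        rolling_mae = sum(self.errors) / len(self.errors)
        return {
            'rolling_mae': rolling_mae,
            'alert': rolling_mae > self.threshold * self.baseline_mae,
        }

monitor = DriftMonitor(baseline_mae=1.0)
print(monitor.update(10.0, 9.5)['alert'])  # False: error in line with baseline
print(monitor.update(10.0, 4.0)['alert'])  # True: rolling MAE jumps past 2x baseline
```

In a real pipeline the alert flag would feed an alerting system and possibly trigger the scheduled retraining mentioned above.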
class ForecastingService:
"""Production-ready forecasting service"""
def __init__(self, model, scaler=None):
self.model = model
self.scaler = scaler
self.performance_history = []
self.last_training_date = None
def predict(self, input_data, forecast_horizon=1):
"""Generate predictions with error handling"""
try:
# Validate input
if not self._validate_input(input_data):
raise ValueError("Invalid input data")
# Generate prediction
if hasattr(self.model, 'predict'):
prediction = self.model.predict(input_data)
elif hasattr(self.model, 'forecast'):
prediction = self.model.forecast(steps=forecast_horizon)
else:
raise ValueError("Model does not have prediction method")
# Log prediction
self._log_prediction(prediction, input_data)
return {
'status': 'success',
'prediction': prediction,
'timestamp': pd.Timestamp.now(),
'model_version': self._get_model_version()
}
except Exception as e:
return {
'status': 'error',
'error': str(e),
'timestamp': pd.Timestamp.now()
}
def _validate_input(self, input_data):
"""Validate input data format and quality"""
# Check if data is not empty
if input_data is None or len(input_data) == 0:
return False
# Check for required columns (if DataFrame)
if isinstance(input_data, pd.DataFrame):
required_cols = ['ds', 'y']
if not all(col in input_data.columns for col in required_cols):
return False
# Check for missing values
if input_data.isnull().any().any():
return False
# Check date column
if not pd.api.types.is_datetime64_any_dtype(input_data['ds']):
return False
return True
def _log_prediction(self, prediction, input_data):
"""Log prediction for monitoring"""
log_entry = {
'timestamp': pd.Timestamp.now(),
'input_shape': input_data.shape if hasattr(input_data, 'shape') else len(input_data),
'prediction_shape': prediction.shape if hasattr(prediction, 'shape') else len(prediction),
'model_type': type(self.model).__name__
}
self.performance_history.append(log_entry)
# Keep only last 1000 entries
if len(self.performance_history) > 1000:
self.performance_history = self.performance_history[-1000:]
def _get_model_version(self):
"""Get model version information"""
return f"v1.0_{pd.Timestamp.now().strftime('%Y%m%d')}"
def update_model(self, new_model, new_scaler=None):
"""Update model with new version"""
self.model = new_model
if new_scaler is not None:
self.scaler = new_scaler
self.last_training_date = pd.Timestamp.now()
print(f"Model updated at {self.last_training_date}")
def get_performance_stats(self):
"""Get service performance statistics"""
if not self.performance_history:
return {}
df = pd.DataFrame(self.performance_history)
stats = {
'total_predictions': len(df),
# Average spacing between logged predictions (not per-call latency)
'avg_prediction_time': (df['timestamp'].max() - df['timestamp'].min()).total_seconds() / len(df) if len(df) > 1 else 0,
'last_prediction': df['timestamp'].max(),
'model_type': df['model_type'].iloc[0] if len(df) > 0 else 'Unknown'
}
return stats
def health_check(self):
"""Perform health check"""
health_status = {
'status': 'healthy',
'model_loaded': self.model is not None,
'last_training': self.last_training_date,
'performance_stats': self.get_performance_stats()
}
# Check if model is recent (within 30 days)
if self.last_training_date:
days_since_training = (pd.Timestamp.now() - self.last_training_date).days
if days_since_training > 30:
health_status['status'] = 'warning'
health_status['message'] = f'Model is {days_since_training} days old'
return health_status
# Create forecasting service
service = ForecastingService(ensemble)
# Test service with sample data
test_input = test_data[['ds', 'y']].head(10)
result = service.predict(test_input, forecast_horizon=5)
print("Service Prediction Result:")
print(f"Status: {result['status']}")
if result['status'] == 'success':
# The ensemble's predict() returns a dict of per-model prediction arrays
print(f"Models in prediction: {list(result['prediction'].keys())}")
print(f"Timestamp: {result['timestamp']}")
print(f"Model version: {result['model_version']}")
# Test health check
health = service.health_check()
print(f"\nHealth Status: {health['status']}")
print(f"Model loaded: {health['model_loaded']}")
print(f"Performance stats: {health['performance_stats']}")
# Simulate multiple predictions for monitoring
print("\nSimulating service usage...")
for i in range(10):
sample_input = test_data[['ds', 'y']].sample(5, random_state=i)
service.predict(sample_input)
# Get updated performance stats
stats = service.get_performance_stats()
print(f"\nService Statistics:")
print(f"Total predictions: {stats['total_predictions']}")
print(f"Average prediction time: {stats['avg_prediction_time']:.4f} seconds")
print(f"Last prediction: {stats['last_prediction']}")
print(f"Model type: {stats['model_type']}")
Real-World Applications
Time series forecasting powers many critical business applications.
Business Applications:
Demand forecasting: Retail sales, inventory management
Financial forecasting: Stock prices, economic indicators
Supply chain: Procurement, logistics optimization
Energy: Load forecasting, renewable energy prediction
Healthcare: Patient volume, epidemic forecasting
Transportation: Traffic prediction, ride-sharing demand
Industry Examples:
Retail: Walmart uses forecasting for inventory optimization
Finance: Hedge funds use high-frequency trading forecasts
Manufacturing: Predictive maintenance scheduling
E-commerce: Amazon's recommendation and inventory systems
Weather: Meteorological forecasting for agriculture
Success Metrics:
Accuracy improvement: Percentage reduction in forecast error
Business impact: Cost savings, revenue increase
Operational efficiency: Reduced stockouts, better resource utilization
Customer satisfaction: Better service levels
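Accuracy improvement in the first bullet is usually reported as the percentage reduction in error against a naive baseline (for example a seasonal-naive forecast). A small helper, with made-up example numbers:

```python
def error_reduction_pct(baseline_mae, model_mae):
    """Percentage reduction in forecast error relative to a baseline model."""
    return (baseline_mae - model_mae) / baseline_mae * 100

# Hypothetical: seasonal-naive MAE of 8.0 units vs. ensemble MAE of 6.0 units
print(f"{error_reduction_pct(8.0, 6.0):.0f}% error reduction")  # 25% error reduction
```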
def retail_demand_forecasting():
"""Example: Retail demand forecasting application"""
print("=== Retail Demand Forecasting Example ===")
# Generate retail sales data
dates = pd.date_range('2023-01-01', periods=365, freq='D')
# Base demand with seasonality
t = np.arange(365)
base_demand = 100 + 20 * np.sin(2 * np.pi * t / 365) # Yearly seasonality
weekly_pattern = 10 * np.sin(2 * np.pi * t / 7) # Weekly pattern
# Add promotional effects
promotions = np.zeros(365)
promo_days = [30, 91, 152, 213, 274, 335] # Monthly promotions
for day in promo_days:
if day < 365:
promotions[day-2:day+3] = 50 # 5-day promotion effect
# Holiday effects
holidays = np.zeros(365)
holiday_dates = [0, 359, 360, 361] # New Year period
holidays[holiday_dates] = 80
# Combine and add noise
sales = base_demand + weekly_pattern + promotions + holidays
sales += np.random.normal(0, 5, 365) # Random variation
# Create DataFrame
retail_data = pd.DataFrame({
'ds': dates,
'y': sales,
'promotion': promotions,
'holiday': holidays
})
# Split data
train_data = retail_data[:-30]
test_data = retail_data[-30:]
# Build the holiday/promotion calendar first, since Prophet expects
# holidays to be supplied through its constructor
holidays_df = pd.concat([
pd.DataFrame({
'holiday': 'promotion',
'ds': dates[promotions > 0],
'lower_window': -2,
'upper_window': 2
}),
pd.DataFrame({
'holiday': 'holiday',
'ds': dates[holidays > 0],
'lower_window': -1,
'upper_window': 1
})
])
# Fit Prophet model with holidays and promotions
model = Prophet(
holidays=holidays_df,
seasonality_mode='multiplicative',
yearly_seasonality=True,
weekly_seasonality=True,
daily_seasonality=False
)
# Add custom seasonalities
model.add_seasonality(name='monthly', period=30.44, fourier_order=5)
model.fit(train_data[['ds', 'y']])
# Generate forecast
future = model.make_future_dataframe(periods=30, freq='D')
forecast = model.predict(future)
# Calculate accuracy
test_forecast = forecast[forecast['ds'].isin(test_data['ds'])]
mae = mean_absolute_error(test_data['y'], test_forecast['yhat'])
mape = np.mean(np.abs((test_data['y'] - test_forecast['yhat']) / test_data['y'])) * 100
print(f"Retail Demand Forecasting Results:")
print(f"MAE: {mae:.2f} units")
print(f"MAPE: {mape:.2f}%")
print(f"Average daily sales: {retail_data['y'].mean():.2f} units")
# Plot results
plt.figure(figsize=(15, 8))
# Plot sales data
plt.plot(retail_data['ds'], retail_data['y'], 'b-', alpha=0.7, linewidth=1, label='Actual Sales')
# Plot forecast
plt.plot(forecast['ds'], forecast['yhat'], 'r-', linewidth=2, label='Forecast')
plt.fill_between(forecast['ds'], forecast['yhat_lower'], forecast['yhat_upper'],
color='red', alpha=0.2, label='95% Confidence Interval')
# Highlight promotions and holidays
promo_dates = retail_data[retail_data['promotion'] > 0]['ds']
holiday_dates = retail_data[retail_data['holiday'] > 0]['ds']
for date in promo_dates:
plt.axvline(x=date, color='green', alpha=0.5, linewidth=1)
for date in holiday_dates:
plt.axvline(x=date, color='orange', alpha=0.7, linewidth=2)
plt.axvline(x=train_data['ds'].iloc[-1], color='k', linestyle='--', alpha=0.7, label='Train/Test Split')
plt.title('Retail Demand Forecasting with Promotions and Holidays')
plt.xlabel('Date')
plt.ylabel('Sales Units')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
# Business impact analysis
print(f"\nBusiness Impact Analysis:")
print(f"Total forecasted sales (30 days): {test_forecast['yhat'].sum():.0f} units")
print(f"Total actual sales (30 days): {test_data['y'].sum():.0f} units")
print(f"Forecast accuracy: {(1 - mae/test_data['y'].mean()):.1%}")
# Inventory recommendations
safety_stock = test_forecast['yhat_upper'] - test_forecast['yhat']
print(f"Recommended safety stock (average): {safety_stock.mean():.1f} units")
print(f"Stockout risk: {(test_data['y'] > test_forecast['yhat_upper']).mean():.1%}")
return model, forecast
# Run retail forecasting example
retail_model, retail_forecast = retail_demand_forecasting()
Key Takeaways
Ensemble methods improve accuracy by combining multiple models
Uncertainty quantification is crucial for decision making
Production deployment requires robust engineering practices
Business applications drive forecasting model selection
Monitoring and maintenance ensure long-term performance
When to Use Advanced Techniques
Good For:
High-stakes decisions: Financial forecasting, inventory management
Complex patterns: Multiple seasonalities, external factors
Large datasets: Sufficient data for ensemble methods
Production systems: Reliable, scalable deployment needed
Business impact: Direct revenue/cost implications
Less Ideal For:
Simple problems: Basic ARIMA may suffice
Limited data: Deep learning needs substantial training data
Real-time constraints: Some methods are computationally expensive
Interpretability required: Ensemble methods can be black-box
Pro Tips
Start simple: Use traditional methods first, add complexity as needed
Ensemble wisely: Combine diverse models for best results
Quantify uncertainty: Always provide confidence intervals
Monitor performance: Set up automated monitoring and alerts
Domain knowledge: Incorporate business rules and expert knowledge
Data quality: Clean, consistent data is more important than fancy models
Iterate quickly: Test, measure, improve in short cycles
Scale gradually: Start small, expand as confidence grows
Next Steps
Now that you understand advanced techniques, you're ready for:
Bayesian Forecasting: Probabilistic approaches with uncertainty
Multivariate Methods: Forecasting multiple related time series
Deep Time Series: Advanced neural architectures
Reinforcement Learning: Sequential decision making
MLOps for Time Series: Production deployment and monitoring
Further Reading
"Forecasting: Principles and Practice": Comprehensive forecasting textbook
"Time Series Analysis and Its Applications": Statistical foundations
MLOps Zoomcamp: Production ML engineering
"Practical Time Series Forecasting": Business applications
Congratulations! You've completed the comprehensive time series forecasting curriculum!
From basic concepts to advanced production systems, you now have the knowledge to tackle real-world forecasting challenges. Remember: the best model is the one that solves your business problem reliably and efficiently.