05: Advanced Techniques & Applications

"The best way to predict the future is to create it." - Peter Drucker

Welcome to the advanced techniques and practical applications of time series forecasting! This notebook covers ensemble methods, uncertainty quantification, and real-world deployment considerations.

🎯 Learning Objectives

By the end of this notebook, you'll be able to:

  • Implement ensemble forecasting methods

  • Quantify and visualize prediction uncertainty

  • Handle multivariate time series forecasting

  • Deploy models for production use

  • Monitor and maintain forecasting systems

  • Apply time series methods to real-world problems

๐Ÿค Ensemble Methodsยถ

Ensemble forecasting combines multiple models to improve accuracy and robustness.

Why Ensembles Work:

  • Bias-Variance Tradeoff: Different models have different strengths

  • Error Diversity: Models make different mistakes

  • Robustness: Less sensitive to individual model failures

  • Uncertainty: Better uncertainty quantification
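The variance-reduction intuition behind these points can be shown in a few lines of NumPy. This is a toy sketch with synthetic, independent model errors (not the notebook's dataset): averaging K unbiased models with uncorrelated errors shrinks RMSE by roughly 1/√K.

```python
import numpy as np

rng = np.random.default_rng(42)
truth = np.zeros(10_000)

# Five unbiased "models" whose errors are independent (error diversity)
model_preds = [truth + rng.normal(0.0, 1.0, truth.shape) for _ in range(5)]

individual_rmse = [np.sqrt(np.mean((p - truth) ** 2)) for p in model_preds]
ensemble_pred = np.mean(model_preds, axis=0)
ensemble_rmse = np.sqrt(np.mean((ensemble_pred - truth) ** 2))

print(f"Mean individual RMSE: {np.mean(individual_rmse):.3f}")  # ~1.0
print(f"Ensemble RMSE:        {ensemble_rmse:.3f}")             # ~1/sqrt(5), about 0.45
```

In practice model errors are correlated, so the gain is smaller than 1/√K; that is exactly why diversity among base models matters.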

Ensemble Types:

  • Simple Averaging: Equal weights for all models

  • Weighted Averaging: Performance-based weights

  • Stacking: Meta-model learns optimal combination

  • Bagging: Bootstrap aggregation

  • Boosting: Sequential model improvement

Time Series Specific:

  • Temporal validation: Respect time ordering

  • Rolling ensembles: Update with new data

  • Online learning: Continuous model adaptation
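To make the "Weighted Averaging" variant above concrete, here is a minimal sketch using inverse-validation-error weights; the validation MAEs and one-step forecasts are made-up numbers for illustration:

```python
import numpy as np

def inverse_error_weights(val_errors):
    """Performance-based weights: lower validation MAE -> larger weight."""
    inv = 1.0 / np.asarray(val_errors, dtype=float)
    return inv / inv.sum()

def weighted_average_forecast(preds, weights):
    """Combine an (n_models, horizon) array of forecasts with the given weights."""
    return np.average(np.asarray(preds, dtype=float), axis=0, weights=weights)

# Made-up validation MAEs and one-step forecasts from three base models
val_mae = [2.0, 1.0, 4.0]
preds = [[10.0], [12.0], [8.0]]

w = inverse_error_weights(val_mae)
print(np.round(w, 3))                                   # the best model gets the largest weight
print(f"{weighted_average_forecast(preds, w)[0]:.3f}")  # -> 10.857
```

Stacking replaces these fixed weights with a meta-model fit on out-of-sample base forecasts; the weighting idea is the same.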

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.tsa.arima.model import ARIMA
from prophet import Prophet
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error
import torch
import torch.nn as nn
import warnings
warnings.filterwarnings('ignore')

# Set random seeds
np.random.seed(42)
torch.manual_seed(42)

# Set up plotting
plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['font.size'] = 12

print("Ensemble Forecasting Libraries Loaded!")
def generate_ensemble_data(n_days=500, noise_level=0.1):
    """Generate time series data for ensemble testing"""
    
    dates = pd.date_range('2020-01-01', periods=n_days, freq='D')
    t = np.arange(n_days)
    
    # Complex trend with changes
    trend = 0.002 * t + 0.000005 * t**2
    trend_changes = np.zeros(n_days)
    trend_changes[150:250] += 5  # Positive shock
    trend_changes[350:400] -= 8  # Negative shock
    
    # Multiple seasonal patterns
    weekly = 3 * np.sin(2 * np.pi * t / 7)
    monthly = 4 * np.sin(2 * np.pi * t / 30.44)
    quarterly = 2 * np.sin(2 * np.pi * t / 91.25)
    
    # External factors
    external = np.random.choice([-3, -1, 0, 1, 3], n_days, p=[0.1, 0.2, 0.4, 0.2, 0.1])
    external_smooth = np.convolve(external, np.ones(14)/14, mode='same')
    
    # Combine components
    y = (trend + trend_changes + weekly + monthly + quarterly + external_smooth)
    
    # Add noise
    noise = np.random.normal(0, noise_level * np.std(y), n_days)
    y += noise
    
    # Create DataFrame
    df = pd.DataFrame({
        'ds': dates,
        'y': y,
        'trend': trend,
        'weekly': weekly,
        'monthly': monthly,
        'quarterly': quarterly,
        'external': external_smooth,
        'noise': noise
    })
    
    return df

# Generate ensemble data
ensemble_data = generate_ensemble_data()

print(f"Generated {len(ensemble_data)} days of ensemble test data")
print(f"Date range: {ensemble_data['ds'].min()} to {ensemble_data['ds'].max()}")
print(f"Value range: {ensemble_data['y'].min():.2f} to {ensemble_data['y'].max():.2f}")

# Plot the data
plt.figure(figsize=(15, 8))
plt.plot(ensemble_data['ds'], ensemble_data['y'], 'b-', linewidth=1.5, alpha=0.8)
plt.title('Time Series Data for Ensemble Testing')
plt.xlabel('Date')
plt.ylabel('Value')
plt.grid(True, alpha=0.3)
plt.show()

# Show data structure
print("\nData Structure:")
print(ensemble_data.head())
print("\nData Info:")
print(ensemble_data.info())
class TimeSeriesEnsemble:
    """Ensemble forecasting class combining multiple models"""
    
    def __init__(self, models=None):
        if models is None:
            self.models = {
                'arima': None,
                'prophet': None,
                'rf': None,
                'linear': None,
                'lstm': None
            }
        else:
            self.models = models
        
        self.scaler = MinMaxScaler()
        self.is_trained = False
    
    def _prepare_features(self, data, lookback=30):
        """Prepare features for ML models"""
        df = data.copy()
        
        # Lag features
        for i in range(1, lookback + 1):
            df[f'lag_{i}'] = df['y'].shift(i)
        
        # Rolling statistics
        df['rolling_mean_7'] = df['y'].rolling(7).mean()
        df['rolling_std_7'] = df['y'].rolling(7).std()
        df['rolling_mean_30'] = df['y'].rolling(30).mean()
        df['rolling_std_30'] = df['y'].rolling(30).std()
        
        # Time features
        df['day_of_week'] = df['ds'].dt.dayofweek
        df['month'] = df['ds'].dt.month
        df['quarter'] = df['ds'].dt.quarter
        
        # Drop NaN values
        df = df.dropna()
        
        return df
    
    def fit(self, train_data):
        """Fit all models in the ensemble"""
        print("Training ensemble models...")
        
        # Prepare features for ML models
        train_features = self._prepare_features(train_data)
        X = train_features.drop(['ds', 'y'], axis=1)
        y = train_features['y']
        
        # Scale features
        X_scaled = self.scaler.fit_transform(X)
        
        # Train ARIMA
        print("  Training ARIMA...")
        try:
            arima_model = ARIMA(train_data['y'], order=(5, 1, 2))
            self.models['arima'] = arima_model.fit()
        except Exception:
            print("    ARIMA training failed, skipping")
            self.models['arima'] = None
        
        # Train Prophet
        print("  Training Prophet...")
        try:
            prophet_df = train_data[['ds', 'y']].copy()
            prophet_model = Prophet(
                changepoint_prior_scale=0.05,
                seasonality_mode='additive',
                daily_seasonality=False
            )
            prophet_model.fit(prophet_df)
            self.models['prophet'] = prophet_model
        except Exception:
            print("    Prophet training failed, skipping")
            self.models['prophet'] = None
        
        # Train Random Forest
        print("  Training Random Forest...")
        try:
            rf_model = RandomForestRegressor(
                n_estimators=100,
                max_depth=10,
                random_state=42
            )
            rf_model.fit(X_scaled, y)
            self.models['rf'] = rf_model
        except Exception:
            print("    Random Forest training failed, skipping")
            self.models['rf'] = None
        
        # Train Linear Regression
        print("  Training Linear Regression...")
        try:
            linear_model = LinearRegression()
            linear_model.fit(X_scaled, y)
            self.models['linear'] = linear_model
        except Exception:
            print("    Linear Regression training failed, skipping")
            self.models['linear'] = None
        
        # Placeholder for LSTM (would need proper implementation)
        self.models['lstm'] = None
        
        self.is_trained = True
        print("Ensemble training completed!")
    
    def predict(self, test_data, forecast_horizon=1):
        """Generate ensemble predictions"""
        if not self.is_trained:
            raise ValueError("Ensemble must be trained before prediction")
        
        predictions = {}
        
        # ARIMA predictions
        if self.models['arima'] is not None:
            try:
                arima_pred = self.models['arima'].forecast(steps=len(test_data))
                predictions['arima'] = arima_pred
            except Exception:
                predictions['arima'] = np.full(len(test_data), np.nan)
        
        # Prophet predictions
        if self.models['prophet'] is not None:
            try:
                future = self.models['prophet'].make_future_dataframe(periods=len(test_data), freq='D')
                prophet_forecast = self.models['prophet'].predict(future)
                prophet_pred = prophet_forecast['yhat'].tail(len(test_data)).values
                predictions['prophet'] = prophet_pred
            except Exception:
                predictions['prophet'] = np.full(len(test_data), np.nan)
        
        # ML model predictions (lag features drop the first `lookback` rows,
        # so pad the front with NaN to keep every array at len(test_data));
        # note these use observed lags, i.e. a one-step-ahead style evaluation
        test_features = self._prepare_features(test_data)
        n_pad = len(test_data) - len(test_features)
        if len(test_features) > 0:
            X_test = test_features.drop(['ds', 'y'], axis=1)
            X_test_scaled = self.scaler.transform(X_test)
            
            # Random Forest
            if self.models['rf'] is not None:
                try:
                    rf_pred = self.models['rf'].predict(X_test_scaled)
                    predictions['rf'] = np.concatenate([np.full(n_pad, np.nan), rf_pred])
                except Exception:
                    predictions['rf'] = np.full(len(test_data), np.nan)
            
            # Linear Regression
            if self.models['linear'] is not None:
                try:
                    linear_pred = self.models['linear'].predict(X_test_scaled)
                    predictions['linear'] = np.concatenate([np.full(n_pad, np.nan), linear_pred])
                except Exception:
                    predictions['linear'] = np.full(len(test_data), np.nan)
        
        # LSTM placeholder
        predictions['lstm'] = np.full(len(test_data), np.nan)
        
        return predictions
    
    def ensemble_predict(self, predictions, method='simple_average'):
        """Combine predictions using ensemble method"""
        
        # Stack predictions; drop models that produced nothing (all-NaN columns)
        pred_df = pd.DataFrame(predictions)
        valid_preds = pred_df.dropna(axis=1, how='all')
        
        if len(valid_preds.columns) == 0:
            return np.full(len(pred_df), np.nan)
        
        if method == 'simple_average':
            # Row-wise mean over whichever models are available (skips NaN per row)
            ensemble_pred = valid_preds.mean(axis=1)
        elif method == 'median':
            # Row-wise median over available models
            ensemble_pred = valid_preds.median(axis=1)
        else:
            raise ValueError(f"Unknown ensemble method: {method}")
        
        return ensemble_pred.values

# Split data for ensemble testing
train_size = int(len(ensemble_data) * 0.7)
train_data = ensemble_data[:train_size]
test_data = ensemble_data[train_size:]

print(f"Train set: {len(train_data)} samples")
print(f"Test set: {len(test_data)} samples")

# Create and train ensemble
ensemble = TimeSeriesEnsemble()
ensemble.fit(train_data)

# Generate predictions
individual_predictions = ensemble.predict(test_data)
ensemble_pred = ensemble.ensemble_predict(individual_predictions, method='simple_average')

# Calculate accuracies
accuracies = {}
for model_name, preds in individual_predictions.items():
    if not np.isnan(preds).all():
        valid_idx = ~np.isnan(preds)
        if np.sum(valid_idx) > 0:
            mae = mean_absolute_error(test_data['y'].values[valid_idx], preds[valid_idx])
            accuracies[model_name] = mae

# Ensemble accuracy
ensemble_mae = mean_absolute_error(test_data['y'], ensemble_pred)
accuracies['ensemble'] = ensemble_mae

print("\nModel Accuracies (MAE):")
for model, mae in accuracies.items():
    print(f"  {model}: {mae:.4f}")

# Plot ensemble results
plt.figure(figsize=(15, 8))

# Plot actual data
plt.plot(test_data['ds'], test_data['y'], 'k-', linewidth=2, label='Actual', alpha=0.8)

# Plot individual predictions
colors = ['red', 'blue', 'green', 'orange', 'purple']
for i, (model_name, preds) in enumerate(individual_predictions.items()):
    if not np.isnan(preds).all():
        plt.plot(test_data['ds'], preds, color=colors[i % len(colors)], 
                linewidth=1.5, alpha=0.7, label=f'{model_name.upper()}')

# Plot ensemble prediction
plt.plot(test_data['ds'], ensemble_pred, 'r-', linewidth=3, label='Ensemble')

plt.title('Ensemble Forecasting Results')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

📊 Uncertainty Quantification

Uncertainty quantification is crucial for reliable forecasting and decision making.

Types of Uncertainty:

  • Aleatoric: Inherent randomness in data (irreducible)

  • Epistemic: Model uncertainty (reducible with more data/knowledge)

  • Prediction intervals: Range of likely future values

  • Confidence intervals: Uncertainty in parameter estimates

Quantification Methods:

  • Bootstrap: Resampling-based uncertainty

  • Bayesian: Probabilistic parameter estimation

  • Ensemble: Model disagreement as uncertainty

  • Quantile regression: Direct quantile prediction

  • Conformal prediction: Distribution-free uncertainty
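Of these, conformal prediction is easy to sketch: calibrate an interval half-width from held-out residuals (split conformal). Note that the distribution-free guarantee assumes exchangeable data, so for time series a block-wise or adaptive variant is usually preferred; this is a minimal illustration on synthetic data:

```python
import numpy as np

def split_conformal_interval(cal_actuals, cal_preds, new_preds, alpha=0.05):
    """Split conformal prediction: calibrate an interval half-width from
    absolute residuals on a held-out calibration set."""
    residuals = np.abs(np.asarray(cal_actuals) - np.asarray(cal_preds))
    n = len(residuals)
    # Finite-sample-corrected quantile level, capped at 1
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(residuals, q_level)
    new_preds = np.asarray(new_preds)
    return new_preds - q, new_preds + q

# Synthetic check: constant point forecasts of 0 for N(0, 1) data
rng = np.random.default_rng(0)
cal_actuals = rng.normal(0, 1, 1000)
lower, upper = split_conformal_interval(cal_actuals, np.zeros(1000), np.zeros(500))
fresh = rng.normal(0, 1, 500)
print(f"Empirical coverage: {np.mean((fresh >= lower) & (fresh <= upper)):.3f}")
```

The empirical coverage on fresh data lands near the nominal 95%, with no distributional assumptions beyond exchangeability.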

Applications:

  • Risk assessment: Decision making under uncertainty

  • Inventory management: Safety stock calculations

  • Financial planning: Portfolio optimization

  • Supply chain: Demand planning with uncertainty
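For the inventory bullet, the textbook safety-stock calculation turns forecast uncertainty into a stock buffer. A stdlib-only sketch, assuming i.i.d. normally distributed daily demand (which real demand rarely satisfies exactly):

```python
import math
from statistics import NormalDist

def safety_stock(daily_demand_std, lead_time_days, service_level=0.95):
    """Safety stock = z * sigma_d * sqrt(L): the buffer needed to hit the
    target service level over the replenishment lead time."""
    z = NormalDist().inv_cdf(service_level)
    return z * daily_demand_std * math.sqrt(lead_time_days)

# e.g. daily demand std of 5 units, 4-day lead time, 95% service level
print(f"Safety stock: {safety_stock(5.0, 4, 0.95):.1f} units")  # -> 16.4
```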

class UncertaintyQuantifier:
    """Class for quantifying prediction uncertainty"""
    
    def __init__(self, n_bootstraps=100, confidence_level=0.95):
        self.n_bootstraps = n_bootstraps
        self.confidence_level = confidence_level
        self.lower_percentile = (1 - confidence_level) / 2 * 100
        self.upper_percentile = (1 + confidence_level) / 2 * 100
    
    def bootstrap_uncertainty(self, model_func, data, n_forecasts):
        """Calculate uncertainty by refitting on bootstrap resamples.
        
        Note: i.i.d. row resampling ignores temporal ordering; for serious
        use, prefer a block bootstrap or resampling of model residuals.
        """
        
        predictions = []
        
        for i in range(self.n_bootstraps):
            # Bootstrap sample (rows drawn with replacement)
            bootstrap_idx = np.random.choice(len(data), len(data), replace=True)
            bootstrap_sample = data.iloc[bootstrap_idx]
            
            # Fit model on the resample and forecast
            try:
                pred = model_func(bootstrap_sample, n_forecasts)
                predictions.append(pred)
            except Exception:
                continue
        
        if len(predictions) == 0:
            return None, None, None
        
        predictions = np.array(predictions)
        
        # Calculate statistics
        mean_pred = np.mean(predictions, axis=0)
        lower_bound = np.percentile(predictions, self.lower_percentile, axis=0)
        upper_bound = np.percentile(predictions, self.upper_percentile, axis=0)
        
        return mean_pred, lower_bound, upper_bound
    
    def ensemble_uncertainty(self, predictions_dict):
        """Calculate uncertainty from ensemble predictions"""
        
        # Stack predictions, ignoring models that produced nothing
        valid = [np.asarray(pred, dtype=float) for pred in predictions_dict.values()
                 if not np.isnan(pred).all()]
        
        if len(valid) == 0:
            return None, None, None
        
        pred_array = np.column_stack(valid)
        
        # NaN-aware statistics so partially missing models still contribute
        mean_pred = np.nanmean(pred_array, axis=1)
        std_pred = np.nanstd(pred_array, axis=1)
        
        # Normal-approximation bounds from model disagreement
        lower_bound = mean_pred - 1.96 * std_pred
        upper_bound = mean_pred + 1.96 * std_pred
        
        return mean_pred, lower_bound, upper_bound
    
    def prediction_interval_coverage(self, actuals, lower_bounds, upper_bounds):
        """Calculate prediction interval coverage"""
        
        coverage = np.mean((actuals >= lower_bounds) & (actuals <= upper_bounds))
        return coverage
    
    def interval_width(self, lower_bounds, upper_bounds):
        """Calculate average prediction interval width"""
        
        width = np.mean(upper_bounds - lower_bounds)
        return width

# Initialize uncertainty quantifier
uq = UncertaintyQuantifier(n_bootstraps=50, confidence_level=0.95)

# Function for ARIMA bootstrap
def arima_bootstrap_func(data, n_forecasts):
    try:
        model = ARIMA(data['y'], order=(5, 1, 2))
        fitted = model.fit()
        forecast = fitted.forecast(steps=n_forecasts)
        return forecast
    except Exception:
        return np.full(n_forecasts, np.nan)

# Calculate bootstrap uncertainty for ARIMA
print("Calculating bootstrap uncertainty for ARIMA...")
arima_mean, arima_lower, arima_upper = uq.bootstrap_uncertainty(
    arima_bootstrap_func, train_data, len(test_data)
)

# Calculate ensemble uncertainty
print("Calculating ensemble uncertainty...")
ensemble_mean, ensemble_lower, ensemble_upper = uq.ensemble_uncertainty(individual_predictions)

# Calculate coverage and width
if arima_mean is not None:
    arima_coverage = uq.prediction_interval_coverage(
        test_data['y'].values, arima_lower, arima_upper
    )
    arima_width = uq.interval_width(arima_lower, arima_upper)
    print(f"ARIMA Bootstrap - Coverage: {arima_coverage:.3f}, Width: {arima_width:.4f}")

if ensemble_mean is not None:
    ensemble_coverage = uq.prediction_interval_coverage(
        test_data['y'].values, ensemble_lower, ensemble_upper
    )
    ensemble_width = uq.interval_width(ensemble_lower, ensemble_upper)
    print(f"Ensemble - Coverage: {ensemble_coverage:.3f}, Width: {ensemble_width:.4f}")

# Plot uncertainty comparison
plt.figure(figsize=(15, 10))

# Plot 1: Bootstrap uncertainty
plt.subplot(2, 1, 1)
plt.plot(test_data['ds'], test_data['y'], 'k-', linewidth=2, label='Actual')
if arima_mean is not None:
    plt.plot(test_data['ds'], arima_mean, 'b-', linewidth=2, label='ARIMA Mean')
    plt.fill_between(test_data['ds'], arima_lower, arima_upper, 
                    color='blue', alpha=0.3, label='95% Bootstrap CI')
plt.title('Bootstrap Uncertainty Quantification (ARIMA)')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot 2: Ensemble uncertainty
plt.subplot(2, 1, 2)
plt.plot(test_data['ds'], test_data['y'], 'k-', linewidth=2, label='Actual')
if ensemble_mean is not None:
    plt.plot(test_data['ds'], ensemble_mean, 'r-', linewidth=2, label='Ensemble Mean')
    plt.fill_between(test_data['ds'], ensemble_lower, ensemble_upper, 
                    color='red', alpha=0.3, label='95% Ensemble CI')
plt.title('Ensemble Uncertainty Quantification')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Uncertainty analysis
print("\n=== Uncertainty Analysis ===")
if arima_mean is not None:
    print(f"ARIMA Bootstrap Intervals:")
    print(f"  Average width: {arima_width:.4f}")
    print(f"  Coverage: {arima_coverage:.1%}")
    print(f"  Target coverage: {uq.confidence_level:.1%}")

if ensemble_mean is not None:
    print(f"\nEnsemble Intervals:")
    print(f"  Average width: {ensemble_width:.4f}")
    print(f"  Coverage: {ensemble_coverage:.1%}")
    print(f"  Target coverage: {uq.confidence_level:.1%}")

# Reliability diagram
if ensemble_mean is not None:
    plt.figure(figsize=(8, 6))
    
    # Calculate coverage by bins
    errors = np.abs(test_data['y'].values - ensemble_mean)
    uncertainty = (ensemble_upper - ensemble_lower) / 2
    
    # Bin by uncertainty
    bins = np.linspace(uncertainty.min(), uncertainty.max(), 10)
    bin_centers = (bins[:-1] + bins[1:]) / 2
    
    coverage_by_bin = []
    for i in range(len(bins) - 1):
        mask = (uncertainty >= bins[i]) & (uncertainty < bins[i+1])
        if np.sum(mask) > 0:
            coverage = np.mean(errors[mask] <= uncertainty[mask])
            coverage_by_bin.append(coverage)
        else:
            coverage_by_bin.append(0)
    
    plt.plot(bin_centers, coverage_by_bin, 'bo-', linewidth=2, markersize=8)
    plt.plot([bins.min(), bins.max()], [0.95, 0.95], 'r--', label='Target Coverage (95%)')
    plt.xlabel('Uncertainty (Interval Half-Width)')
    plt.ylabel('Observed Coverage')
    plt.title('Uncertainty Calibration (Reliability Diagram)')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()

🔄 Production Deployment

Deploying forecasting models to production requires careful consideration of scalability, reliability, and maintenance.

Deployment Architecture:

  • Batch processing: Periodic forecast generation

  • Real-time scoring: On-demand predictions

  • Model serving: REST APIs, microservices

  • Containerization: Docker, Kubernetes

  • Cloud platforms: AWS SageMaker, Azure ML, GCP AI Platform
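As a minimal illustration of "model serving", the sketch below wraps a placeholder forecaster in a JSON-over-HTTP endpoint using only the standard library. A real deployment would sit behind a proper framework (e.g. FastAPI) or a managed platform; the endpoint path and payload shape here are invented for this example:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

def naive_forecast(history, horizon):
    """Placeholder model: repeat the last observed value."""
    return [history[-1]] * horizon

class ForecastHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        history = payload.get("history", [])
        horizon = int(payload.get("horizon", 1))
        if not history:
            body, status = {"status": "error", "error": "empty history"}, 400
        else:
            body, status = {"status": "success",
                            "forecast": naive_forecast(history, horizon)}, 200
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):  # silence per-request logging
        pass

# Serve on an ephemeral port and call the endpoint once
server = HTTPServer(("127.0.0.1", 0), ForecastHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

req = Request(f"http://127.0.0.1:{server.server_port}/forecast",
              data=json.dumps({"history": [1.0, 2.0, 3.0], "horizon": 2}).encode(),
              headers={"Content-Type": "application/json"})
response = json.loads(urlopen(req).read())
print(response)  # -> {'status': 'success', 'forecast': [3.0, 3.0]}
server.shutdown()
```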

Key Considerations:

  • Scalability: Handle varying loads

  • Latency: Response time requirements

  • Reliability: Fault tolerance and monitoring

  • Security: Data protection and access control

  • Cost: Computational resource optimization

MLOps Pipeline:

  • Data ingestion: Automated data pipelines

  • Model training: Scheduled retraining

  • Model validation: Automated testing

  • Model deployment: CI/CD pipelines

  • Monitoring: Performance tracking and alerting
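The monitoring stage above often reduces to a simple rule: compare a rolling error window against the error measured at deployment time, and trigger retraining when it drifts. The threshold and the synthetic error windows below are illustrative:

```python
import numpy as np

def needs_retraining(recent_errors, baseline_mae, tolerance=1.25, min_obs=30):
    """Flag a model for retraining when the rolling MAE on recent forecasts
    drifts more than `tolerance` times above the deployment-time MAE."""
    if len(recent_errors) < min_obs:
        return False  # not enough evidence yet
    rolling_mae = float(np.mean(np.abs(recent_errors)))
    return rolling_mae > tolerance * baseline_mae

# Hypothetical monitoring windows of forecast errors
baseline = 2.0
stable_window = np.random.default_rng(0).normal(0.0, 2.0, 60)   # MAE roughly 1.6
drifted_window = np.random.default_rng(0).normal(4.0, 2.0, 60)  # MAE roughly 4

print(needs_retraining(stable_window, baseline))   # False
print(needs_retraining(drifted_window, baseline))  # True
```

Production systems typically add hysteresis and alerting on top of a rule like this so a single noisy window does not trigger a retrain.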

class ForecastingService:
    """Production-ready forecasting service"""
    
    def __init__(self, model, scaler=None):
        self.model = model
        self.scaler = scaler
        self.performance_history = []
        self.last_training_date = None
        
    def predict(self, input_data, forecast_horizon=1):
        """Generate predictions with error handling"""
        
        try:
            # Validate input
            if not self._validate_input(input_data):
                raise ValueError("Invalid input data")
            
            # Generate prediction
            if hasattr(self.model, 'predict'):
                prediction = self.model.predict(input_data)
            elif hasattr(self.model, 'forecast'):
                prediction = self.model.forecast(steps=forecast_horizon)
            else:
                raise ValueError("Model does not have prediction method")
            
            # Log prediction
            self._log_prediction(prediction, input_data)
            
            return {
                'status': 'success',
                'prediction': prediction,
                'timestamp': pd.Timestamp.now(),
                'model_version': self._get_model_version()
            }
            
        except Exception as e:
            return {
                'status': 'error',
                'error': str(e),
                'timestamp': pd.Timestamp.now()
            }
    
    def _validate_input(self, input_data):
        """Validate input data format and quality"""
        
        # Check if data is not empty
        if input_data is None or len(input_data) == 0:
            return False
        
        # Check for required columns (if DataFrame)
        if isinstance(input_data, pd.DataFrame):
            required_cols = ['ds', 'y']
            if not all(col in input_data.columns for col in required_cols):
                return False
            
            # Check for missing values
            if input_data.isnull().any().any():
                return False
            
            # Check date column
            if not pd.api.types.is_datetime64_any_dtype(input_data['ds']):
                return False
        
        return True
    
    def _log_prediction(self, prediction, input_data):
        """Log prediction for monitoring"""
        
        log_entry = {
            'timestamp': pd.Timestamp.now(),
            'input_shape': input_data.shape if hasattr(input_data, 'shape') else len(input_data),
            'prediction_shape': prediction.shape if hasattr(prediction, 'shape') else len(prediction),
            'model_type': type(self.model).__name__
        }
        
        self.performance_history.append(log_entry)
        
        # Keep only last 1000 entries
        if len(self.performance_history) > 1000:
            self.performance_history = self.performance_history[-1000:]
    
    def _get_model_version(self):
        """Get model version information"""
        
        return f"v1.0_{pd.Timestamp.now().strftime('%Y%m%d')}"
    
    def update_model(self, new_model, new_scaler=None):
        """Update model with new version"""
        
        self.model = new_model
        if new_scaler is not None:
            self.scaler = new_scaler
        
        self.last_training_date = pd.Timestamp.now()
        
        print(f"Model updated at {self.last_training_date}")
    
    def get_performance_stats(self):
        """Get service performance statistics"""
        
        if not self.performance_history:
            return {}
        
        df = pd.DataFrame(self.performance_history)
        
        stats = {
            'total_predictions': len(df),
            'avg_prediction_time': (df['timestamp'].max() - df['timestamp'].min()).total_seconds() / len(df) if len(df) > 1 else 0,
            'last_prediction': df['timestamp'].max(),
            'model_type': df['model_type'].iloc[0] if len(df) > 0 else 'Unknown'
        }
        
        return stats
    
    def health_check(self):
        """Perform health check"""
        
        health_status = {
            'status': 'healthy',
            'model_loaded': self.model is not None,
            'last_training': self.last_training_date,
            'performance_stats': self.get_performance_stats()
        }
        
        # Check if model is recent (within 30 days)
        if self.last_training_date:
            days_since_training = (pd.Timestamp.now() - self.last_training_date).days
            if days_since_training > 30:
                health_status['status'] = 'warning'
                health_status['message'] = f'Model is {days_since_training} days old'
        
        return health_status

# Create forecasting service
service = ForecastingService(ensemble)

# Test service with sample data
test_input = test_data[['ds', 'y']].head(10)
result = service.predict(test_input, forecast_horizon=5)

print("Service Prediction Result:")
print(f"Status: {result['status']}")
if result['status'] == 'success':
    # ensemble.predict returns a dict of per-model forecast arrays
    print(f"Prediction models: {list(result['prediction'].keys())}")
    print(f"Timestamp: {result['timestamp']}")
    print(f"Model version: {result['model_version']}")
else:
    print(f"Error: {result['error']}")

# Test health check
health = service.health_check()
print(f"\nHealth Status: {health['status']}")
print(f"Model loaded: {health['model_loaded']}")
print(f"Performance stats: {health['performance_stats']}")

# Simulate multiple predictions for monitoring
print("\nSimulating service usage...")
for i in range(10):
    sample_input = test_data[['ds', 'y']].sample(5, random_state=i)
    service.predict(sample_input)

# Get updated performance stats
stats = service.get_performance_stats()
print(f"\nService Statistics:")
print(f"Total predictions: {stats['total_predictions']}")
print(f"Average prediction time: {stats['avg_prediction_time']:.4f} seconds")
print(f"Last prediction: {stats['last_prediction']}")
print(f"Model type: {stats['model_type']}")

📈 Real-World Applications

Time series forecasting powers many critical business applications.

Business Applications:

  • Demand forecasting: Retail sales, inventory management

  • Financial forecasting: Stock prices, economic indicators

  • Supply chain: Procurement, logistics optimization

  • Energy: Load forecasting, renewable energy prediction

  • Healthcare: Patient volume, epidemic forecasting

  • Transportation: Traffic prediction, ride-sharing demand

Industry Examples:

  • Retail: Walmart uses forecasting for inventory optimization

  • Finance: Hedge funds use high-frequency trading forecasts

  • Manufacturing: Predictive maintenance scheduling

  • E-commerce: Amazon's recommendation and inventory systems

  • Weather: Meteorological forecasting for agriculture

Success Metrics:

  • Accuracy improvement: Percentage reduction in forecast error

  • Business impact: Cost savings, revenue increase

  • Operational efficiency: Reduced stockouts, better resource utilization

  • Customer satisfaction: Better service levels
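The first metric above is usually reported as the percent reduction in forecast error versus a naive baseline; the MAE figures here are made up for illustration:

```python
def error_reduction_pct(baseline_mae, model_mae):
    """Accuracy improvement: percent reduction in forecast error vs. a baseline."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae

# Hypothetical: seasonal-naive baseline MAE 12.5 vs. deployed ensemble MAE 9.0
print(f"{error_reduction_pct(12.5, 9.0):.1f}% error reduction")  # -> 28.0%
```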

def retail_demand_forecasting():
    """Example: Retail demand forecasting application"""
    
    print("=== Retail Demand Forecasting Example ===")
    
    # Generate retail sales data
    dates = pd.date_range('2023-01-01', periods=365, freq='D')
    
    # Base demand with seasonality
    t = np.arange(365)
    base_demand = 100 + 20 * np.sin(2 * np.pi * t / 365)  # Yearly seasonality
    weekly_pattern = 10 * np.sin(2 * np.pi * t / 7)       # Weekly pattern
    
    # Add promotional effects
    promotions = np.zeros(365)
    promo_days = [30, 91, 152, 213, 274, 335]  # Monthly promotions
    for day in promo_days:
        if day < 365:
            promotions[day-2:day+3] = 50  # 5-day promotion effect
    
    # Holiday effects
    holidays = np.zeros(365)
    holiday_dates = [0, 359, 360, 361]  # New Year period
    holidays[holiday_dates] = 80
    
    # Combine and add noise
    sales = base_demand + weekly_pattern + promotions + holidays
    sales += np.random.normal(0, 5, 365)  # Random variation
    
    # Create DataFrame
    retail_data = pd.DataFrame({
        'ds': dates,
        'y': sales,
        'promotion': promotions,
        'holiday': holidays
    })
    
    # Split data
    train_data = retail_data[:-30]
    test_data = retail_data[-30:]
    
    # Build holiday/promotion events first so they can be passed to Prophet;
    # Prophet expects holidays at construction time, not assigned after init
    holidays_df = pd.concat([
        pd.DataFrame({
            'holiday': 'promotion',
            'ds': dates[promotions > 0],
            'lower_window': -2,
            'upper_window': 2
        }),
        pd.DataFrame({
            'holiday': 'holiday',
            'ds': dates[holidays > 0],
            'lower_window': -1,
            'upper_window': 1
        })
    ])
    
    # Fit Prophet model with holidays and promotions
    model = Prophet(
        holidays=holidays_df,
        seasonality_mode='multiplicative',
        yearly_seasonality=True,
        weekly_seasonality=True,
        daily_seasonality=False
    )
    
    # Add custom seasonalities
    model.add_seasonality(name='monthly', period=30.44, fourier_order=5)
    
    model.fit(train_data[['ds', 'y']])
    
    # Generate forecast
    future = model.make_future_dataframe(periods=30, freq='D')
    forecast = model.predict(future)
    
    # Calculate accuracy
    test_forecast = forecast[forecast['ds'].isin(test_data['ds'])]
    mae = mean_absolute_error(test_data['y'], test_forecast['yhat'])
    mape = np.mean(np.abs((test_data['y'] - test_forecast['yhat']) / test_data['y'])) * 100
    
    print(f"Retail Demand Forecasting Results:")
    print(f"MAE: {mae:.2f} units")
    print(f"MAPE: {mape:.2f}%")
    print(f"Average daily sales: {retail_data['y'].mean():.2f} units")
    
    # Plot results
    plt.figure(figsize=(15, 8))
    
    # Plot sales data
    plt.plot(retail_data['ds'], retail_data['y'], 'b-', alpha=0.7, linewidth=1, label='Actual Sales')
    
    # Plot forecast
    plt.plot(forecast['ds'], forecast['yhat'], 'r-', linewidth=2, label='Forecast')
    plt.fill_between(forecast['ds'], forecast['yhat_lower'], forecast['yhat_upper'], 
                    color='red', alpha=0.2, label='95% Confidence Interval')
    
    # Highlight promotions and holidays
    promo_dates = retail_data[retail_data['promotion'] > 0]['ds']
    holiday_dates = retail_data[retail_data['holiday'] > 0]['ds']
    
    for date in promo_dates:
        plt.axvline(x=date, color='green', alpha=0.5, linewidth=1)
    for date in holiday_dates:
        plt.axvline(x=date, color='orange', alpha=0.7, linewidth=2)
    
    plt.axvline(x=train_data['ds'].iloc[-1], color='k', linestyle='--', alpha=0.7, label='Train/Test Split')
    
    plt.title('Retail Demand Forecasting with Promotions and Holidays')
    plt.xlabel('Date')
    plt.ylabel('Sales Units')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()
    
    # Business impact analysis
    print(f"\nBusiness Impact Analysis:")
    print(f"Total forecasted sales (30 days): {test_forecast['yhat'].sum():.0f} units")
    print(f"Total actual sales (30 days): {test_data['y'].sum():.0f} units")
    print(f"Forecast accuracy: {(1 - mae/test_data['y'].mean()):.1%}")
    
    # Inventory recommendations
    safety_stock = test_forecast['yhat_upper'] - test_forecast['yhat']
    print(f"Recommended safety stock (average): {safety_stock.mean():.1f} units")
    print(f"Stockout risk: {(test_data['y'] > test_forecast['yhat_upper']).mean():.1%}")
    
    return model, forecast

# Run retail forecasting example
retail_model, retail_forecast = retail_demand_forecasting()

🎯 Key Takeaways

  1. Ensemble methods improve accuracy by combining multiple models

  2. Uncertainty quantification is crucial for decision making

  3. Production deployment requires robust engineering practices

  4. Business applications drive forecasting model selection

  5. Monitoring and maintenance ensure long-term performance

๐Ÿ” When to Use Advanced Techniquesยถ

✅ Good For:

  • High-stakes decisions: Financial forecasting, inventory management

  • Complex patterns: Multiple seasonalities, external factors

  • Large datasets: Sufficient data for ensemble methods

  • Production systems: Reliable, scalable deployment needed

  • Business impact: Direct revenue/cost implications

โŒ Less Ideal For:ยถ

  • Simple problems: Basic ARIMA may suffice

  • Limited data: Deep learning needs substantial training data

  • Real-time constraints: Some methods are computationally expensive

  • Interpretability required: Ensemble methods can be black-box

💡 Pro Tips

  1. Start simple: Use traditional methods first, add complexity as needed

  2. Ensemble wisely: Combine diverse models for best results

  3. Quantify uncertainty: Always provide confidence intervals

  4. Monitor performance: Set up automated monitoring and alerts

  5. Domain knowledge: Incorporate business rules and expert knowledge

  6. Data quality: Clean, consistent data is more important than fancy models

  7. Iterate quickly: Test, measure, improve in short cycles

  8. Scale gradually: Start small, expand as confidence grows

🚀 Next Steps

Now that you understand advanced techniques, you're ready for:

  • Bayesian Forecasting: Probabilistic approaches with uncertainty

  • Multivariate Methods: Forecasting multiple related time series

  • Deep Time Series: Advanced neural architectures

  • Reinforcement Learning: Sequential decision making

  • MLOps for Time Series: Production deployment and monitoring

📚 Further Reading

  • "Forecasting: Principles and Practice": Comprehensive forecasting textbook

  • "Time Series Analysis and Its Applications": Statistical foundations

  • MLOps Zoomcamp: Production ML engineering

  • "Practical Time Series Forecasting": Business applications

Congratulations! You've completed the comprehensive time series forecasting curriculum! 🎉📈

From basic concepts to advanced production systems, you now have the knowledge to tackle real-world forecasting challenges. Remember: the best model is the one that solves your business problem reliably and efficiently.