Chapter 12: What Makes \(e^x\) So Special?
The One Function That Is Its Own Derivative
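Among all exponential functions, \(e^x\) holds a unique position: it is the only function, up to constant multiples, that equals its own derivative, \(\frac{d}{dx}e^x = e^x\). Its Taylor series reveals why:
\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \]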
Differentiating term by term turns each \(\frac{x^n}{n!}\) into \(\frac{x^{n-1}}{(n-1)!}\), so every term shifts down by one index and the series reproduces itself exactly. The number \(e \approx 2.71828\) is not an arbitrary constant: it is the unique base for which the derivative of \(a^x\) equals \(a^x\) itself, because \(\ln(e) = 1\).
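The term-by-term argument can be checked numerically. The sketch below is illustrative only: it truncates the series at \(N = 15\) terms (an arbitrary choice made here, not one taken from the text) and uses NumPy's `Polynomial` class to differentiate the truncated series.

```python
import math
import numpy as np
from numpy.polynomial import Polynomial

# Truncated Taylor series of e^x: coefficients 1/n! for n = 0..N.
# N = 15 is an arbitrary cutoff chosen for this illustration.
N = 15
coeffs = np.array([1.0 / math.factorial(n) for n in range(N + 1)])
taylor = Polynomial(coeffs)

# Differentiate term by term: the coefficient n * (1/n!) becomes 1/(n-1)!.
derivative = taylor.deriv()

# The derivative's coefficients equal the original ones, minus the last term
# (which is lost only because the series was truncated).
coeffs_match = np.allclose(derivative.coef, coeffs[:-1])

x = 1.0
print("coefficients reproduced:", coeffs_match)
print("series:", taylor(x), "derivative:", derivative(x), "exp:", math.exp(x))
```

With the full infinite series there is no "last term" to lose, which is why the derivative matches the function exactly rather than approximately.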
Why \(e^x\) is central to ML/AI: The exponential function is the backbone of the softmax activation (\(\sigma(z_i) = e^{z_i} / \sum_j e^{z_j}\)), the sigmoid function (\(\sigma(x) = 1/(1 + e^{-x})\)), probability distributions (Gaussian, Boltzmann), and the cross-entropy loss. Its self-derivative property keeps gradients in closed form; for instance, the sigmoid's derivative is simply \(\sigma(x)(1 - \sigma(x))\). In differential equations, \(e^x\) governs exponential growth and decay, which can model learning rate warmup schedules, weight decay (L2 regularization as exponential shrinkage), and the dynamics of recurrent neural networks.
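As a small illustration of how the self-derivative property keeps gradients simple, the sketch below implements the sigmoid and softmax formulas given above and compares the closed-form sigmoid derivative \(\sigma(x)(1 - \sigma(x))\) with a central finite difference. The helper names, the step size `h`, and the sample inputs are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid: 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    """Softmax: e^{z_i} / sum_j e^{z_j}."""
    # Subtracting the max does not change the result but avoids overflow.
    exp_z = np.exp(z - np.max(z))
    return exp_z / exp_z.sum()

# Compare the closed-form derivative with a central finite difference.
x = np.linspace(-4.0, 4.0, 9)
h = 1e-5  # illustrative step size
numeric_grad = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
analytic_grad = sigmoid(x) * (1.0 - sigmoid(x))
max_grad_error = np.max(np.abs(numeric_grad - analytic_grad))
print("max |numeric - analytic|:", max_grad_error)

# Softmax turns arbitrary scores into a probability vector.
probs = softmax(np.array([2.0, 1.0, 0.1]))
print("softmax:", probs, "sum:", probs.sum())
```

The closed form needs only the forward-pass value \(\sigma(x)\), so no additional exponentials are evaluated when computing the gradient.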
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (14, 10)
Visualizing What Makes \(e^x\) Unique Among Exponentials
For a general exponential \(a^x\), the derivative is \(\frac{d}{dx}a^x = \ln(a) \cdot a^x\): the function times a proportionality constant \(\ln(a)\). When \(a = 2\), this constant is \(\ln(2) \approx 0.693\) (the derivative is smaller than the function). When \(a = 3\), it is \(\ln(3) \approx 1.099\) (the derivative is larger). The number \(e\) is the “Goldilocks” base where \(\ln(e) = 1\), making the derivative exactly equal to the function.
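Before plotting anything, the proportionality constant can be estimated numerically. The following is a minimal sketch, assuming a central-difference approximation with a hypothetical helper `derivative_ratio` and a few arbitrary sample points; it is a quick check, not a rigorous computation.

```python
import numpy as np

def derivative_ratio(a, x, h=1e-6):
    """Central-difference estimate of f'(x) / f(x) for f(x) = a**x."""
    f = lambda t: a ** t
    return (f(x + h) - f(x - h)) / (2 * h) / f(x)

xs = np.array([-1.0, 0.0, 2.0])  # arbitrary sample points
bases = {"2": 2.0, "e": np.e, "3": 3.0}

# For each base, the ratio should be close to ln(a) at every x.
ratios = {name: derivative_ratio(a, xs) for name, a in bases.items()}
for name, a in bases.items():
    print(f"a = {name}: ratio = {ratios[name]}, ln(a) = {np.log(a):.6f}")
```

The estimated ratio for each base should agree with \(\ln(a)\) at all three sample points, and only for \(a = e\) should it be approximately 1.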
The code below plots various exponential functions alongside their derivatives to reveal this relationship visually. It also plots the ratio \(f'(x)/f(x)\) for each base, showing that this ratio is constant (equal to \(\ln(a)\)) regardless of \(x\), a hallmark property of exponential functions that distinguishes them from polynomials and trigonometric functions.