# Install required packages
!pip install -q gradio transformers pillow torch torchvision
import gradio as gr
import numpy as np
import pandas as pd
from PIL import Image
import matplotlib.pyplot as plt

print(f"Gradio version: {gr.__version__}")

Gradio Basics: From Python Function to Web Interface in Three Lines

Gradio transforms any Python function into a shareable web application by wrapping it with gr.Interface(fn, inputs, outputs). Behind the scenes, Gradio launches a local FastAPI server, serializes inputs/outputs between the browser and Python, and renders an interactive UI with input validation, loading indicators, and error handling – all without writing a single line of HTML, CSS, or JavaScript.

Why this matters for ML practitioners: the gap between a working model in a notebook and a demo that stakeholders can interact with is traditionally weeks of web development. Gradio collapses this to minutes. Each gr.Interface call defines a contract: what goes in (text, images, numbers), what comes out (labels, plots, text), and what function connects them. The launch() method starts the server, and in notebooks it renders the interface inline. For sharing, launch(share=True) creates a public URL tunneled through Gradio's servers, valid for 72 hours.

# Example 1: Simple text function
def greet(name):
    return f"Hello {name}!"

# Create interface
demo = gr.Interface(
    fn=greet,
    inputs=gr.Textbox(label="Your Name"),
    outputs=gr.Textbox(label="Greeting")
)

# Launch (in notebooks, it will display inline)
demo.launch()

# Example 2: Mathematical function
def calculate(x, y, operation):
    operations = {
        "Add": x + y,
        "Subtract": x - y,
        "Multiply": x * y,
        "Divide": x / y if y != 0 else "Error: Division by zero"
    }
    return operations[operation]

demo = gr.Interface(
    fn=calculate,
    inputs=[
        gr.Number(label="First Number"),
        gr.Number(label="Second Number"),
        gr.Radio(["Add", "Subtract", "Multiply", "Divide"], label="Operation")
    ],
    outputs=gr.Textbox(label="Result"),
    title="Simple Calculator",
    description="Perform basic arithmetic operations"
)

demo.launch()

Key Gradio Components

Input Components:

  • Textbox - Text input

  • Number - Numeric input

  • Slider - Range selection

  • Checkbox - Boolean

  • Radio - Single choice

  • Dropdown - Dropdown menu

  • Image - Image upload

  • Audio - Audio upload

  • File - File upload

Output Components:

  • Textbox - Text display

  • Label - Classification results

  • Image - Image display

  • Plot - Matplotlib plots

  • Audio - Audio playback

  • JSON - JSON data

  • HTML - Custom HTML

Image Classification Interface

Building an image classification demo showcases Gradio's ability to handle complex data types seamlessly. The gr.Image(type="pil") component handles file upload, format conversion, and resizing, delivering a PIL Image object to your function. The gr.Label output component renders prediction probabilities as a ranked bar chart – exactly the visualization stakeholders expect when evaluating a classifier.

Integration with Hugging Face Transformers: the pipeline("image-classification") API loads a pre-trained Vision Transformer (ViT) model with a single line, handling preprocessing, inference, and post-processing internally. By connecting this pipeline to Gradio, you get a production-quality image classification demo that non-technical users can test with their own images, making it invaluable for model validation workshops, client demos, and portfolio projects.

from transformers import pipeline

# Load pre-trained image classifier
print("Loading model...")
classifier = pipeline("image-classification", model="google/vit-base-patch16-224")
print("Model loaded!")

def classify_image(img):
    """
    Classify an image and return top predictions.
    
    Args:
        img: PIL Image or numpy array
    
    Returns:
        dict: {label: confidence} for top predictions
    """
    if img is None:
        return {"Error": "No image provided"}
    
    # Get predictions
    predictions = classifier(img)
    
    # Format as dictionary
    return {pred['label']: pred['score'] for pred in predictions[:5]}

# Create Gradio interface
demo = gr.Interface(
    fn=classify_image,
    inputs=gr.Image(type="pil", label="Upload Image"),
    outputs=gr.Label(num_top_classes=5, label="Predictions"),
    title="🖼️ Image Classifier",
    description="Upload an image to classify it using Vision Transformer",
    examples=[
        # Add example image paths here if available
    ],
    allow_flagging="never"
)

demo.launch()

Text Generation Interface: Controlling LLM Output

Text generation interfaces expose the key parameters that control language model behavior: max_length bounds the output size, temperature controls randomness (low values near 0.1 produce deterministic, repetitive text; high values near 2.0 produce creative but potentially incoherent text), and num_return_sequences generates multiple variations for comparison. Gradio's gr.Slider components make these parameters intuitively adjustable by non-technical users, turning abstract hyperparameters into tangible controls.
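To build intuition for what temperature does, here is a small self-contained sketch (plain Python with toy logits I made up) of the underlying mechanism: logits are divided by the temperature before the softmax, so low temperatures sharpen the distribution toward the top token while high temperatures flatten it.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max before exp for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy next-token scores

cold = softmax_with_temperature(logits, 0.1)  # near-deterministic sampling
warm = softmax_with_temperature(logits, 2.0)  # flatter, more random sampling

# Low temperature concentrates probability mass on the top token;
# high temperature spreads it across alternatives.
assert cold[0] > warm[0]
```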

The examples parameter is critical for good UX: pre-filled examples let users immediately see what the interface does without formulating their own input. For text generation specifically, examples also demonstrate the expected prompt format, reducing user confusion about what kind of input the model expects.

# Load text generation model (using a smaller model for demo)
text_generator = pipeline("text-generation", model="distilgpt2")

def generate_text(prompt, max_length, temperature, num_return):
    """
    Generate text based on a prompt.
    
    Args:
        prompt: Starting text
        max_length: Maximum length of generated text
        temperature: Randomness (0.1-2.0)
        num_return: Number of variations to generate
    
    Returns:
        str: Generated text
    """
    if not prompt.strip():
        return "Please provide a prompt!"
    
    # Generate
    outputs = text_generator(
        prompt,
        max_length=max_length,
        temperature=temperature,
        num_return_sequences=num_return,
        do_sample=True
    )
    
    # Format output
    results = []
    for i, output in enumerate(outputs, 1):
        results.append(f"**Generation {i}:**\n{output['generated_text']}\n")
    
    return "\n".join(results)

# Create interface with advanced controls
demo = gr.Interface(
    fn=generate_text,
    inputs=[
        gr.Textbox(
            lines=3,
            placeholder="Enter your prompt here...",
            label="Prompt"
        ),
        gr.Slider(minimum=10, maximum=200, value=50, step=10, label="Max Length"),
        gr.Slider(minimum=0.1, maximum=2.0, value=0.7, step=0.1, label="Temperature"),
        gr.Slider(minimum=1, maximum=3, value=1, step=1, label="Number of Generations")
    ],
    outputs=gr.Markdown(label="Generated Text"),
    title="โœ๏ธ Text Generator",
    description="Generate creative text continuations",
    examples=[
        ["Once upon a time", 50, 0.8, 1],
        ["The future of AI is", 75, 0.7, 2],
        ["In a world where", 100, 0.9, 1]
    ]
)

demo.launch()

Advanced Layouts with Blocks: Beyond Simple Input-Output

gr.Blocks() is Gradio's lower-level API for building complex, multi-component interfaces that go beyond the single-function gr.Interface pattern. With Blocks, you control layout using gr.Row() and gr.Column() containers, wire multiple buttons to different functions, and create reactive UIs where one component's output feeds into another. The .click() method connects a button to a function with explicit input/output mappings, enabling dashboards where multiple analyses run independently.

When to choose Blocks over Interface: use gr.Interface for single-function demos (classify an image, analyze sentiment). Switch to gr.Blocks when you need multiple functions on one page, custom layouts with side-by-side panels, stateful interactions (chat interfaces, multi-step workflows), or branded styling with gr.themes. The theme=gr.themes.Soft() parameter applies a cohesive visual design without custom CSS.

# Create a sentiment analysis function
sentiment_analyzer = pipeline("sentiment-analysis")

def analyze_sentiment(text):
    """Analyze sentiment of text."""
    if not text.strip():
        return "Please enter text!", None
    
    result = sentiment_analyzer(text)[0]
    label = result['label']
    score = result['score']
    
    # Format output
    emoji = "😊" if label == "POSITIVE" else "😞"
    message = f"{emoji} {label} (confidence: {score:.2%})"
    
    # Create visualization
    fig, ax = plt.subplots(figsize=(6, 2))
    ax.barh(['Sentiment'], [score], color='green' if label == 'POSITIVE' else 'red')
    ax.set_xlim(0, 1)
    ax.set_xlabel('Confidence')
    ax.set_title(f'{label} Sentiment')
    plt.tight_layout()
    
    return message, fig

# Create custom layout with Blocks
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("# 🎭 Sentiment Analysis Dashboard")
    gr.Markdown("Analyze the sentiment of any text in real-time!")
    
    with gr.Row():
        with gr.Column(scale=2):
            text_input = gr.Textbox(
                lines=5,
                placeholder="Enter text to analyze...",
                label="Input Text"
            )
            analyze_btn = gr.Button("Analyze Sentiment", variant="primary")
            
            gr.Examples(
                examples=[
                    "I love this product! It's amazing!",
                    "This is the worst experience ever.",
                    "It's okay, nothing special.",
                    "Absolutely fantastic work!"
                ],
                inputs=text_input
            )
        
        with gr.Column(scale=3):
            result_text = gr.Textbox(label="Result", interactive=False)
            result_plot = gr.Plot(label="Confidence Score")
    
    # Connect components
    analyze_btn.click(
        fn=analyze_sentiment,
        inputs=text_input,
        outputs=[result_text, result_plot]
    )

demo.launch()

Multi-Modal Interface: Combining Data Types

Multi-modal interfaces accept different input types simultaneously – images, text, and numbers in a single submission. This pattern mirrors real-world ML applications where predictions depend on heterogeneous data sources: a medical diagnosis system might combine an X-ray image, patient symptoms (text), and lab values (numbers). Gradio handles the serialization of each input type independently, passing PIL Images, strings, and floats to your function as native Python objects.

Design consideration: when building multi-modal interfaces, make inputs optional (check for None) so users can submit partial data. The function should gracefully handle missing modalities rather than crashing, providing results for whichever inputs are available. This progressive disclosure pattern makes the interface more approachable – users can start with one input type and gradually explore more complex combinations.

def process_multimodal(image, text, number):
    """
    Process multiple types of inputs.
    
    Returns:
        tuple: (classification, text_analysis, visualization)
    """
    results = []
    
    # Process image
    if image is not None:
        img_result = classifier(image)[:3]
        img_output = "\n".join([f"- {r['label']}: {r['score']:.2%}" for r in img_result])
        results.append(f"**Image Classification:**\n{img_output}")
    
    # Process text
    if text and text.strip():
        sent_result = sentiment_analyzer(text)[0]
        results.append(f"**Text Sentiment:** {sent_result['label']} ({sent_result['score']:.2%})")
    
    # Process number
    if number is not None:
        results.append(f"**Number Analysis:** Square = {number**2}, Cube = {number**3}")
    
    # Create visualization
    fig, ax = plt.subplots(figsize=(8, 4))
    categories = ['Image', 'Text', 'Number']
    values = [
        1 if image is not None else 0,
        1 if text and text.strip() else 0,
        1 if number is not None else 0
    ]
    ax.bar(categories, values, color=['blue', 'green', 'orange'])
    ax.set_ylabel('Provided')
    ax.set_title('Input Types Provided')
    ax.set_ylim(0, 1.5)
    plt.tight_layout()
    
    summary = "\n\n".join(results) if results else "No inputs provided!"
    
    return summary, fig

# Create multi-modal interface
with gr.Blocks() as demo:
    gr.Markdown("# 🔮 Multi-Modal AI Processor")
    gr.Markdown("Upload and analyze different types of data simultaneously!")
    
    with gr.Row():
        with gr.Column():
            gr.Markdown("### Inputs")
            image_in = gr.Image(type="pil", label="Image")
            text_in = gr.Textbox(lines=3, label="Text")
            number_in = gr.Number(label="Number")
            submit_btn = gr.Button("Process All", variant="primary")
        
        with gr.Column():
            gr.Markdown("### Outputs")
            text_out = gr.Markdown(label="Analysis Results")
            plot_out = gr.Plot(label="Input Summary")
    
    submit_btn.click(
        fn=process_multimodal,
        inputs=[image_in, text_in, number_in],
        outputs=[text_out, plot_out]
    )

demo.launch()

Sharing and Deployment: From Local to Global

Gradio provides a graduated deployment path: start with a local server for development, create a temporary public link for quick sharing, and deploy permanently to Hugging Face Spaces or your own infrastructure. The share=True parameter creates a reverse tunnel through Gradio's servers, generating a public *.gradio.live URL that routes traffic to your local machine – useful for demos but not suitable for production due to the 72-hour expiration and single-machine dependency.

For production deployment, the standard path is Hugging Face Spaces (free, permanent, auto-scaling) or containerization with Docker for self-hosted environments. The server_name="0.0.0.0" setting binds to all network interfaces (necessary for Docker containers and cloud VMs), while server_port controls which port the FastAPI server listens on. In all cases, the underlying application is a standard ASGI web server that integrates with reverse proxies (nginx, Caddy) and load balancers.

# Method 1: Public sharing link (temporary)
# This creates a public link that expires in 72 hours

def simple_function(text):
    return f"You entered: {text}"

demo = gr.Interface(
    fn=simple_function,
    inputs="text",
    outputs="text"
)

# Launch with sharing enabled
# demo.launch(share=True)  # Uncomment to create public link
print("Share=True creates a public link that anyone can access")

# Method 2: Custom server settings
demo = gr.Interface(
    fn=simple_function,
    inputs="text",
    outputs="text"
)

# Launch with custom settings
# demo.launch(
#     server_name="0.0.0.0",  # Listen on all network interfaces
#     server_port=7860,       # Custom port
#     share=False,            # No public link
#     debug=True              # Enable debug mode
# )

print("Custom server settings allow fine-grained control")

Deployment Options

  1. Temporary Public Link (share=True)

    • ✅ Instant sharing

    • ✅ No setup required

    • ❌ Expires in 72 hours

    • ❌ Not for production

  2. Hugging Face Spaces (Covered in Notebook 3)

    • ✅ Free permanent hosting

    • ✅ Easy deployment

    • ✅ Community visibility

    • ❌ Public by default

  3. Self-Hosted

    • ✅ Full control

    • ✅ Custom domain

    • ✅ Private hosting

    • ❌ Requires server management

  4. Cloud Platforms (AWS, GCP, Azure)

    • ✅ Scalable

    • ✅ Professional deployment

    • ❌ More complex

    • ❌ May have costs

🎯 Key Takeaways

  1. Gradio makes ML demos incredibly easy - Just a few lines of code!

  2. Interface vs Blocks - Use Interface for simple apps, Blocks for complex layouts

  3. Rich component library - Text, image, audio, video, and more

  4. Easy sharing - share=True for instant public links

  5. Examples improve UX - Always provide example inputs

  6. Themes and styling - Customize appearance with built-in themes

๐Ÿ“ Practice Exercisesยถ

  1. Build a Translation Interface

    • Use a translation model from Hugging Face

    • Add language selection dropdown

    • Include example sentences

  2. Create an Image Filter App

    • Upload image

    • Apply filters (blur, sharpen, edge detection)

    • Show before/after comparison

  3. Build a Data Analyzer

    • Upload CSV file

    • Display statistics

    • Create visualizations

    • Allow column selection
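As a starting point for exercise 2, here is a minimal sketch using Pillow's built-in ImageFilter module; the filter names are illustrative choices that could populate a gr.Radio, and the function returns a (before, after) pair suited to two gr.Image outputs.

```python
from PIL import Image, ImageFilter

# Map user-facing names to Pillow's built-in filters.
# (Illustrative choices; ImageFilter offers many more.)
FILTERS = {
    "Blur": ImageFilter.GaussianBlur(radius=4),
    "Sharpen": ImageFilter.SHARPEN,
    "Edge Detection": ImageFilter.FIND_EDGES,
}

def apply_filter(img, filter_name):
    """Return (original, filtered) so a UI can show a before/after pair."""
    if img is None:
        return None, None
    return img, img.filter(FILTERS[filter_name])
```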

🔗 Resources

Next: Notebook 2 - Building with Streamlit