# Install required packages
!pip install -q huggingface-hub gradio transformers
from huggingface_hub import HfApi, create_repo, upload_file
import gradio as gr
import os

Part 1: Introduction to Hugging Face Spaces

What are Spaces?

Hugging Face Spaces is a free platform for hosting ML demos and applications.

Key Features:

  • ✅ Free hosting for public Spaces

  • ✅ Support for Gradio, Streamlit, and Docker

  • ✅ Automatic builds from Git repository

  • ✅ Easy integration with Hugging Face Hub

  • ✅ Community sharing and discovery

Space SDKs:

  1. Gradio - Quick ML demos

  2. Streamlit - Data apps and dashboards

  3. Static - HTML/CSS/JS only

  4. Docker - Full custom control

Creating a Gradio Space: Structure and Configuration

A Hugging Face Space is a Git repository with a specific structure: app.py (the main application), requirements.txt (Python dependencies), and a README.md with YAML frontmatter that configures the Space’s SDK, title, and metadata. The build system reads the frontmatter, installs dependencies, and launches app.py automatically on every git push. Below, we create these files locally using %%writefile to prepare a complete, deployable Gradio Space.

The README.md frontmatter is the configuration layer: sdk: gradio tells the platform which runtime to use, app_file: app.py specifies the entry point, and pinned: true pins the Space to the top of your profile page. The sentiment analysis app below uses the Transformers pipeline API, which downloads the model on first launch and caches it on the Space's disk. Because the free tier's disk is ephemeral, a full restart re-downloads the weights; attaching persistent storage (or committing the weights to the repo) keeps subsequent cold starts down to seconds instead of minutes.

# Create a directory for our Space
!mkdir -p gradio_space_demo
%%writefile gradio_space_demo/app.py
import gradio as gr
from transformers import pipeline

# Load sentiment analysis model
classifier = pipeline("sentiment-analysis")

def analyze_sentiment(text):
    """
    Analyze sentiment of input text.
    
    Args:
        text: Input text string
    
    Returns:
        str: Sentiment label and confidence
    """
    if not text.strip():
        return "Please enter some text!"
    
    result = classifier(text)[0]
    label = result['label']
    score = result['score']
    
    emoji = "😊" if label == "POSITIVE" else "😞"
    return f"{emoji} {label} (Confidence: {score:.2%})"

# Create Gradio interface
demo = gr.Interface(
    fn=analyze_sentiment,
    inputs=gr.Textbox(lines=3, placeholder="Enter text here...", label="Input Text"),
    outputs=gr.Textbox(label="Sentiment Analysis"),
    title="🎭 Sentiment Analyzer",
    description="Analyze the sentiment of any text using state-of-the-art NLP!",
    examples=[
        "I love this product! It's amazing!",
        "This is terrible, worst experience ever.",
        "It's okay, nothing special.",
        "Absolutely fantastic work, exceeded expectations!"
    ],
    theme=gr.themes.Soft()
)

if __name__ == "__main__":
    demo.launch()
%%writefile gradio_space_demo/requirements.txt
gradio
transformers
torch
%%writefile gradio_space_demo/README.md
---
title: Sentiment Analyzer
emoji: 🎭
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
---

# Sentiment Analyzer

Analyze the sentiment of any text using state-of-the-art NLP models.

## Features
- Real-time sentiment analysis
- Confidence scores
- Pre-loaded examples

## Usage
Simply enter your text and click submit!

Deploying to Hugging Face Spaces

Method 1: Web Interface (Easiest)

  1. Go to huggingface.co/new-space

  2. Fill in Space details:

    • Space name

    • Choose SDK: Gradio

    • Visibility: Public or Private

  3. Click "Create Space"

  4. Upload files:

    • app.py

    • requirements.txt

    • README.md

  5. Space will automatically build and deploy!

Method 2: Git (Advanced)

# Clone your Space repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME

# Add files (copy in the files created earlier, which live one level up)
cp ../gradio_space_demo/app.py ../gradio_space_demo/requirements.txt ../gradio_space_demo/README.md .

# Commit and push
git add .
git commit -m "Initial commit"
git push

Method 3: Python API

# Example of programmatic Space creation (requires HF token)
# Uncomment and use with your token

# from huggingface_hub import HfApi

# api = HfApi()

# # Create Space
# api.create_repo(
#     repo_id="your-username/sentiment-analyzer",
#     repo_type="space",
#     space_sdk="gradio"
# )

# # Upload files
# api.upload_file(
#     path_or_fileobj="gradio_space_demo/app.py",
#     path_in_repo="app.py",
#     repo_id="your-username/sentiment-analyzer",
#     repo_type="space"
# )

print("See code above for programmatic deployment")

Creating a Streamlit Space: Same Platform, Different SDK

Streamlit Spaces follow the same repository structure as Gradio Spaces but with sdk: streamlit in the README frontmatter. The key difference is in how the app is served: Gradio launches a FastAPI server, while Streamlit runs its own Tornado-based server. Streamlit Spaces are particularly well-suited for data exploration dashboards and multi-page applications where users interact with filters, charts, and tables simultaneously.

Choosing between Gradio and Streamlit on Spaces: Gradio Spaces load faster (smaller runtime overhead) and work better for single-purpose ML demos with clear input/output contracts. Streamlit Spaces shine when the application requires complex layouts, multiple interactive widgets affecting a shared visualization, or multi-page navigation. Both are free for public Spaces and support the same hardware tiers (CPU Basic, GPU T4/A10G/A100) for compute-intensive models.

!mkdir -p streamlit_space_demo
%%writefile streamlit_space_demo/app.py
import streamlit as st
import pandas as pd
import plotly.express as px
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Page config
st.set_page_config(
    page_title="Iris Classifier",
    page_icon="🌸",
    layout="wide"
)

# Title
st.title("🌸 Iris Species Classifier")
st.markdown("Predict iris species based on flower measurements")

# Load data
@st.cache_data
def load_data():
    iris = load_iris()
    df = pd.DataFrame(iris.data, columns=iris.feature_names)
    df['species'] = iris.target
    df['species'] = df['species'].map({0: 'setosa', 1: 'versicolor', 2: 'virginica'})
    return df, iris.target_names

df, target_names = load_data()

# Train model
@st.cache_resource
def train_model():
    iris = load_iris()
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(iris.data, iris.target)
    return model

model = train_model()

# Sidebar inputs
st.sidebar.header("Input Features")
sepal_length = st.sidebar.slider('Sepal Length (cm)', 4.0, 8.0, 5.5)
sepal_width = st.sidebar.slider('Sepal Width (cm)', 2.0, 4.5, 3.0)
petal_length = st.sidebar.slider('Petal Length (cm)', 1.0, 7.0, 4.0)
petal_width = st.sidebar.slider('Petal Width (cm)', 0.1, 2.5, 1.3)

# Make prediction
input_data = [[sepal_length, sepal_width, petal_length, petal_width]]
prediction = model.predict(input_data)
prediction_proba = model.predict_proba(input_data)

species_names = ['Setosa', 'Versicolor', 'Virginica']
predicted_species = species_names[prediction[0]]

# Display prediction
col1, col2 = st.columns([1, 2])

with col1:
    st.subheader("Prediction")
    st.success(f"**{predicted_species}**")
    
    st.subheader("Confidence")
    for i, species in enumerate(species_names):
        st.write(f"{species}: {prediction_proba[0][i]:.2%}")

with col2:
    st.subheader("Probability Distribution")
    proba_df = pd.DataFrame({
        'Species': species_names,
        'Probability': prediction_proba[0]
    })
    fig = px.bar(proba_df, x='Species', y='Probability', color='Species')
    st.plotly_chart(fig, use_container_width=True)

# Dataset info
with st.expander("📊 View Dataset"):
    st.dataframe(df)
    
st.markdown("---")
st.caption("Built with Streamlit • Deployed on Hugging Face Spaces")
%%writefile streamlit_space_demo/requirements.txt
streamlit
pandas
plotly
scikit-learn
%%writefile streamlit_space_demo/README.md
---
title: Iris Classifier
emoji: 🌸
colorFrom: green
colorTo: blue
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
---

# Iris Species Classifier

Interactive classifier for iris flowers using Random Forest.

## Features
- Real-time predictions
- Interactive sliders
- Probability visualization
- Dataset explorer

Using Pre-trained Models from Hugging Face Hub

The Hugging Face Hub hosts over 500,000 pre-trained models across NLP, computer vision, audio, and multimodal tasks. The transformers.pipeline() API provides a unified interface: specify the task name ("sentiment-analysis", "summarization", "ner") and optionally a model ID, and the library handles downloading weights, loading the tokenizer, and configuring inference parameters. This abstraction lets you build multi-task NLP applications without understanding the architecture details of each model.

The multi-task pattern shown below loads three separate pipelines (sentiment, summarization, NER) and routes user input to the selected task. In a Space, all three models load into memory on startup, so the first request is slow (downloading weights) but subsequent requests are fast. For production Spaces with memory constraints, consider loading models lazily (only when first requested) or using smaller distilled variants: distilbert-base-uncased, for example, has about 40% fewer parameters and runs roughly 60% faster than bert-base-uncased while retaining about 97% of its language-understanding performance.
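The lazy-loading idea can be sketched as a small cache of pipeline factories, where each model is constructed only on the first request for its task. The LazyPipelines class and the task names below are illustrative, not part of any library:

```python
class LazyPipelines:
    """Build each pipeline on first use instead of at Space startup."""

    def __init__(self, factories):
        self._factories = factories  # task name -> zero-arg constructor
        self._cache = {}

    def get(self, task):
        # Construct the pipeline once, then reuse it for every request.
        if task not in self._cache:
            self._cache[task] = self._factories[task]()
        return self._cache[task]

# In a real Space the factories would wrap transformers.pipeline, e.g.:
# pipelines = LazyPipelines({
#     "Sentiment Analysis": lambda: pipeline("sentiment-analysis"),
#     "Summarization": lambda: pipeline("summarization", model="facebook/bart-large-cnn"),
#     "Named Entity Recognition": lambda: pipeline("ner", aggregation_strategy="simple"),
# })
```

The trade-off: startup memory drops, but the first user of each task waits for the model download, so pair this with a loading indicator in the UI.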

!mkdir -p hub_model_demo
%%writefile hub_model_demo/app.py
import gradio as gr
from transformers import pipeline

# Load multiple models from Hub
sentiment_model = pipeline("sentiment-analysis")
summarization_model = pipeline("summarization", model="facebook/bart-large-cnn")
ner_model = pipeline("ner", aggregation_strategy="simple")

def analyze_text(text, task):
    """
    Perform different NLP tasks on text.
    
    Args:
        text: Input text
        task: NLP task to perform
    
    Returns:
        str: Result based on selected task
    """
    if not text.strip():
        return "Please enter some text!"
    
    try:
        if task == "Sentiment Analysis":
            result = sentiment_model(text)[0]
            return f"**{result['label']}** (Confidence: {result['score']:.2%})"
        
        elif task == "Summarization":
            if len(text.split()) < 50:
                return "Text too short for summarization (need at least 50 words)"
            result = summarization_model(text, max_length=130, min_length=30, do_sample=False)
            return result[0]['summary_text']
        
        elif task == "Named Entity Recognition":
            entities = ner_model(text)
            if not entities:
                return "No entities found!"
            
            result = "**Entities Found:**\n\n"
            for ent in entities:
                result += f"- **{ent['word']}** ({ent['entity_group']}): {ent['score']:.2%}\n"
            return result
    
    except Exception as e:
        return f"Error: {str(e)}"

# Create interface
demo = gr.Interface(
    fn=analyze_text,
    inputs=[
        gr.Textbox(lines=5, placeholder="Enter your text here...", label="Input Text"),
        gr.Radio(
            ["Sentiment Analysis", "Summarization", "Named Entity Recognition"],
            label="Select Task",
            value="Sentiment Analysis"
        )
    ],
    outputs=gr.Markdown(label="Result"),
    title="🤗 Multi-Task NLP with Hugging Face",
    description="Perform various NLP tasks using pre-trained models from Hugging Face Hub",
    examples=[
        ["I absolutely love this product! Best purchase ever!", "Sentiment Analysis"],
        ["The Eiffel Tower is located in Paris, France. It was built by Gustave Eiffel.", "Named Entity Recognition"],
        ["Artificial intelligence is transforming industries worldwide. Machine learning algorithms can now process vast amounts of data, identify patterns, and make predictions with remarkable accuracy. Companies are using AI for everything from customer service chatbots to autonomous vehicles. However, concerns about job displacement and ethical considerations remain important topics of discussion.", "Summarization"]
    ],
    theme=gr.themes.Monochrome()
)

if __name__ == "__main__":
    demo.launch()
%%writefile hub_model_demo/requirements.txt
gradio
transformers
torch

Managing Secrets and Environment Variables

ML applications frequently need API keys (OpenAI, Anthropic, custom endpoints) or database credentials that must never appear in source code. Hugging Face Spaces provides a Repository Secrets feature: key-value pairs stored encrypted server-side and injected as environment variables at runtime, accessible via os.environ.get("KEY_NAME"). Secrets are never visible in the Space’s Git history, logs, or to other users – even collaborators with write access cannot read secret values, only overwrite them.

Security best practices for Spaces: never hardcode API keys in app.py, use .gitignore to exclude .env files from version control, and validate that required secrets are set at startup (failing fast with a clear error message rather than crashing mid-inference). For local development, use a .env file loaded with python-dotenv, and for the deployed Space, configure the same variables through the Settings > Repository Secrets panel.
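A minimal fail-fast check along those lines might look like this; the secret names are illustrative:

```python
import os

def validate_secrets(required, environ=None):
    """Return the names of required secrets that are missing or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]

# At Space startup, before loading models or serving requests:
# missing = validate_secrets(["OPENAI_API_KEY", "DB_PASSWORD"])
# if missing:
#     raise RuntimeError(
#         f"Missing required secrets: {', '.join(missing)}. "
#         "Set them under Settings > Repository secrets."
#     )
```

Raising at import time means the Space's build log shows the exact misconfiguration, rather than users hitting opaque errors mid-inference.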

%%writefile secrets_demo.py
import gradio as gr
import os

# Access secrets (these would be set in Space settings)
API_KEY = os.environ.get("API_KEY", "not-set")
SECRET_TOKEN = os.environ.get("SECRET_TOKEN", "not-set")

def check_secrets():
    """Show status of environment variables."""
    return f"""
    **Environment Variables:**
    
    - API_KEY: {'✅ Set' if API_KEY != 'not-set' else '❌ Not set'}
    - SECRET_TOKEN: {'✅ Set' if SECRET_TOKEN != 'not-set' else '❌ Not set'}
    
    **How to set secrets in Spaces:**
    1. Go to your Space settings
    2. Click on "Repository secrets"
    3. Add your secrets as key-value pairs
    4. Restart the Space
    
    **Never commit secrets to Git!**
    """

demo = gr.Interface(
    fn=check_secrets,
    inputs=None,
    outputs=gr.Markdown(),
    title="Secrets Management Demo",
    description="Check environment variable status"
)

if __name__ == "__main__":
    demo.launch()

print("""\nTo set secrets in Hugging Face Spaces:
1. Go to Space Settings > Repository secrets
2. Add secrets (e.g., OPENAI_API_KEY)
3. Access in code: os.environ.get('OPENAI_API_KEY')
""")

Part 6: Advanced Space Configuration

README.md Frontmatter Options

The README.md file contains metadata about your Space:

---
title: My Awesome Space           # Space title
emoji: 🚀                         # Icon
colorFrom: blue                   # Gradient start color
colorTo: purple                   # Gradient end color
sdk: gradio                       # gradio, streamlit, static, or docker
sdk_version: 4.0.0               # SDK version
app_file: app.py                 # Main app file
pinned: false                    # Pin to your profile
license: apache-2.0              # License type
duplicated_from: username/space  # If duplicated from another
---

Custom Python Packages

requirements.txt - Python dependencies

gradio==4.0.0
transformers>=4.30.0
torch

packages.txt - System-level dependencies

ffmpeg
libsm6
libxext6

Hardware Options

Available hardware tiers:

  • CPU Basic (Free) - 2 vCPU, 16GB RAM

  • CPU Upgrade (Paid) - 8 vCPU, 32GB RAM

  • GPU (Paid) - NVIDIA T4, A10G, or A100

Set in Space settings under "Hardware".

Space Examples and Best Practices

Building a production-quality Space goes beyond getting the model to work – it requires thoughtful error handling, clear documentation, and performance optimization. The directory structure below represents the standard layout: app.py for the main application, utils/ for reusable helper modules, and examples/ for sample inputs that appear in the interface. Pinning exact dependency versions in requirements.txt (e.g., gradio==4.0.0 rather than gradio>=4.0.0) prevents build failures when upstream packages release breaking changes.

Common failure modes and fixes: out-of-memory errors occur when loading large models on the free CPU tier (2 vCPU, 16GB RAM) – switch to a smaller model or upgrade hardware. Slow first-load times happen because model weights download from the Hub on every Space restart – use huggingface_hub.snapshot_download() with a persistent cache directory. Build failures usually trace to missing system-level dependencies, which go in packages.txt (not requirements.txt) for apt-get packages like ffmpeg or libgl1.
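A cache-aware loader along the lines described above might look like this. It assumes persistent storage mounted at /data (a paid Spaces add-on); the MODEL_CACHE variable and the helper name are illustrative:

```python
import os
from pathlib import Path

CACHE_DIR = os.environ.get("MODEL_CACHE", "/data/models")

def local_model_path(model_id, cache_dir=None):
    """Download a model snapshot once into a persistent directory.

    Later restarts find the directory already populated and skip the
    download entirely, loading weights straight from disk.
    """
    base = Path(cache_dir if cache_dir is not None else CACHE_DIR)
    target = base / model_id.replace("/", "--")
    if not target.exists():
        from huggingface_hub import snapshot_download
        snapshot_download(repo_id=model_id, local_dir=str(target))
    return str(target)

# Then point the pipeline at the local copy instead of the Hub ID:
# summarizer = pipeline("summarization",
#                       model=local_model_path("facebook/bart-large-cnn"))
```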

# Example: Image generation Space
print("""
Example Space Structure:

my-image-generator/
├── app.py                 # Main application
├── requirements.txt       # Python dependencies
├── README.md              # Space metadata and description
├── examples/              # Example images/data
│   ├── example1.jpg
│   └── example2.jpg
└── utils/                 # Helper modules
    └── image_utils.py

Best Practices:

1. ✅ Add clear README with usage instructions
2. ✅ Include examples for users to try
3. ✅ Pin exact dependency versions
4. ✅ Handle errors gracefully
5. ✅ Add loading indicators for slow operations
6. ✅ Use caching to improve performance
7. ✅ Test locally before deploying
8. ✅ Monitor Space logs for errors
9. ✅ Keep model sizes reasonable (<1GB if possible)
10. ✅ Add appropriate license information

Common Issues:

❌ Out of memory - Use smaller models or upgrade hardware
❌ Slow loading - Cache model loading
❌ Build failures - Check requirements.txt syntax
❌ Missing dependencies - Add to packages.txt
""")

🎯 Key Takeaways

  1. Hugging Face Spaces = Free ML Hosting - Perfect for demos and portfolios

  2. Three SDKs available - Gradio (ML demos), Streamlit (data apps), Docker (custom)

  3. Easy deployment - Just upload files or push to Git

  4. Pre-trained models - Leverage thousands of models from Hub

  5. Secrets management - Safely use API keys

  6. Community sharing - Showcase your work

πŸ“ Practice ExercisesΒΆ

  1. Deploy a Gradio Space

    • Create sentiment analysis app

    • Add custom examples

    • Deploy to Spaces

    • Share the link!

  2. Build a Streamlit Dashboard Space

    • Create data visualization app

    • Use caching effectively

    • Deploy and monitor

  3. Use Hub Models

    • Try different models from Hub

    • Create multi-model interface

    • Compare results

  4. Manage Secrets

    • Create app needing API key

    • Set up environment variables

    • Test secret access

🔗 Resources

Next: Notebook 4 - AutoML Platforms