Computer Vision Specialization

🖼️ Overview

Master computer vision from classification to generative models!

Time: 2-3 months | 150-200 hours
Prerequisites: Phases 1-8 complete
Outcome: Build production CV applications

📚 What You’ll Learn

  • Image classification (ResNet, Vision Transformers)

  • Object detection (YOLO, DETR)

  • Image embeddings (CLIP, DINO)

  • Semantic segmentation

  • Generative models (Stable Diffusion, DALL-E)

  • Multimodal AI (text + vision)

  • Video understanding

  • OCR and document AI

🗂️ Module Structure

computer-vision/
├── 00_START_HERE.ipynb
├── 01_image_classification.ipynb
├── 02_object_detection.ipynb
├── 03_clip_embeddings.ipynb
├── 04_stable_diffusion.ipynb
├── 05_multimodal_rag.ipynb
├── projects/
│   ├── visual_search/
│   ├── image_qa/
│   └── content_moderation/
└── README.md

🎯 Key Projects

  1. Visual Search Engine - Find similar images using CLIP

  2. Image Q&A System - Chat with images

  3. Content Moderation - Classify safe/unsafe images

  4. AI Art Generator - Creative tool with Stable Diffusion
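To give a feel for the Visual Search Engine project, here is a minimal sketch of the retrieval step: ranking stored images by cosine similarity to a query embedding. It assumes the embeddings have already been produced by a model such as CLIP. The function names (`cosine_similarity`, `top_k_similar`), the file names, and the 3-dimensional toy vectors are hypothetical and are not taken from the course notebooks; real image embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, index, k=3):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings standing in for precomputed image embeddings.
toy_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "kitten.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Hypothetical query embedding (e.g. from a query image or a text prompt).
query_embedding = [1.0, 0.0, 0.0]

results = top_k_similar(query_embedding, toy_index, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A linear scan like this is fine for a few thousand images. Larger collections typically need an approximate nearest-neighbor index instead.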

Start here: 00_START_HERE.ipynb