Projects

  1. Flare Removal (Inspired by Google Research)
    Re-implemented and extended a deep learning model for lens flare removal, trained on semi-synthetic pairs of clean and flare-corrupted images with flare generated from empirical captures and wave-optics simulation. Improved generalization to real-world scenes, with a ~3 dB PSNR gain over baselines.
    PDF     Code

  2. Vision Transformer (ViT)
    Implemented the original Vision Transformer architecture for image classification, built from patch embedding and transformer encoder blocks. Trained on standard datasets and explored fine-tuning techniques to improve performance.
    PDF     Code

  3. Speech-Driven Personal Note Taker
    Developed a voice-enabled system using OpenAI Whisper and spaCy to transcribe speech and extract key information in real time. Added optional TTS feedback and structured note formatting.
        Code

  4. Face Detection
    Implemented a real-time face detection pipeline using OpenCV and pre-trained models. Explored Haar cascades and OpenCV's DNN module, and evaluated the detectors on a custom dataset.
        Code
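
For context on the ~3 dB PSNR figure quoted for the flare removal project, here is a minimal sketch of how PSNR is commonly computed and what a 3 dB gain means. The function name `psnr`, the flat pixel lists, and the sample values are illustrative assumptions, not taken from the project code; the project's own evaluation may differ (for example in color handling or value range).

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical pixel values: the second estimate has half the MSE of the first.
clean = [0.0, 0.0, 0.0, 0.0]
baseline = [10.0, 10.0, 10.0, 10.0]   # MSE = 100
improved = [math.sqrt(50.0)] * 4      # MSE = 50
gain_db = psnr(clean, improved) - psnr(clean, baseline)
```

Because PSNR is logarithmic in the mean squared error, halving the error adds 10·log10(2) ≈ 3.01 dB, so a gain of about 3 dB corresponds to roughly half the mean squared error of the baseline.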