Projects
-
Flare Removal (Google Research Inspired)
Re-implemented and enhanced a deep learning model for lens flare removal, trained on semi-synthetic flare-corrupted image pairs generated via empirical and wave-optics simulation. Improved generalization across real-world scenarios, achieving a ~3 dB PSNR gain over baselines.
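For context on the PSNR figure, here is a minimal sketch of how PSNR is computed and what a gain of about 3 dB means in terms of mean squared error. It is not the project's evaluation code: the `psnr` helper and the flat pixel lists are hypothetical, and 8-bit pixel values (maximum 255) are assumed.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical toy data: a flat reference and two restorations of it.
reference = [100.0] * 4
baseline = [110.0] * 4                    # error of 10 per pixel, MSE = 100
improved = [100.0 + math.sqrt(50.0)] * 4  # MSE = 50, half the baseline's

gain = psnr(reference, improved) - psnr(reference, baseline)
print(f"PSNR gain: {gain:.2f} dB")
```

Because PSNR is logarithmic in the error, halving the MSE adds 10·log10(2) ≈ 3.01 dB, so a ~3 dB gain corresponds to roughly half the mean squared error of the baseline.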
PDF Code -
Vision Transformer (ViT)
Implemented the original Vision Transformer architecture for image classification, built from patch embeddings and transformer encoder blocks. Trained it on standard datasets and explored fine-tuning techniques to improve performance.
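The patch step that precedes the embedding can be sketched in plain Python. This is an illustration only, not the project's code: the `patchify` helper and the nested-list image layout (`image[row][col]` holding a list of channel values) are assumptions made for the example. In a ViT, each flattened patch would then be linearly projected to the model dimension, with a class token and position embeddings added before the encoder blocks.

```python
def patchify(image, patch_size):
    """Split an H x W x C nested-list image into flattened, non-overlapping patches.

    Patches are returned in row-major order; each has patch_size * patch_size * C values.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    patch.extend(pixel)  # append this pixel's channel values
            patches.append(patch)
    return patches


# Hypothetical 4 x 4 single-channel image whose pixel value is its raster index.
image = [[[4 * r + c] for c in range(4)] for r in range(4)]
patches = patchify(image, 2)
print(len(patches), patches[0])
```

With the 224 x 224 RGB input and 16 x 16 patches used in the original ViT paper, the same procedure yields (224 / 16)² = 196 patches of 16 · 16 · 3 = 768 values each.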
PDF Code -
Speech-Driven Personal Note Taker
Developed a voice-enabled system using OpenAI Whisper and spaCy NLP to transcribe speech and extract key information in real time. Integrated optional TTS feedback and structured note formatting.
Code -
Face Detection
Implemented a real-time face detection pipeline using OpenCV and pre-trained models. Explored Haar cascades and OpenCV's DNN module, and evaluated them on a custom dataset.
Code