DIWA-Net: A Parameter-Efficient Multi-Modal Architecture for Deepfake Detection in the Open-Set Paradigm
DIWA-Net is a research-grade multimodal deepfake detector that fuses DINOv2 visual features with Wav2Vec 2.0 audio representations through cross-modal attention and a balanced gated fusion head. The system is adapted with LoRA on the MAVOS-DD benchmark and supports multiple languages and open-set evaluation against manipulation methods unseen during training.
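To make the fusion idea concrete, here is a minimal, dependency-free sketch of cross-modal attention followed by a gated fusion step. It is an illustration under stated assumptions, not the project's implementation: the function names (`cross_attend`, `gated_fusion`), the toy two-dimensional vectors, and the zero-initialized gate are all hypothetical, and the learned query/key/value projections a real attention layer would have are omitted. What "balanced" means in DIWA-Net's fusion head is not specified here; the sketch simply starts the gate at 0.5 so both modalities contribute equally.

```python
import math


def dot(u: list[float], v: list[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def softmax(scores: list[float]) -> list[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query: list[float], other: list[list[float]]) -> list[float]:
    """Scaled dot-product attention: one vector from modality A attends
    over a sequence of vectors from modality B (no learned projections)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in other])
    return [sum(w * vec[i] for w, vec in zip(weights, other))
            for i in range(len(query))]


def gated_fusion(visual: list[float], audio: list[float],
                 gate_w: list[list[float]], gate_b: list[float]) -> list[float]:
    """Per-dimension sigmoid gate g computed from [visual; audio];
    output is g * visual + (1 - g) * audio."""
    concat = visual + audio
    fused = []
    for i in range(len(visual)):
        g = 1.0 / (1.0 + math.exp(-(dot(gate_w[i], concat) + gate_b[i])))
        fused.append(g * visual[i] + (1.0 - g) * audio[i])
    return fused


# Toy example (hypothetical values): a zero-initialized gate gives g = 0.5,
# so the fused vector is the plain average of the two modalities.
visual_vec = [1.0, 0.0]
audio_vec = [0.0, 1.0]
zero_w = [[0.0] * 4 for _ in range(2)]
zero_b = [0.0, 0.0]
print(gated_fusion(visual_vec, audio_vec, zero_w, zero_b))  # [0.5, 0.5]
```

In a trained model the gate weights would be learned, letting the network lean on whichever modality is more reliable for a given input.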
| Field | Details |
| --- | --- |
| Program | Final Year Project (FYP) |
| Academic year | 2026 |
| Department | Department of Computer Science |
| University | University of Engineering and Technology, Taxila |
| Model | DIWA-Net Phase 3 (LoRA) |
| Benchmark | MAVOS-DD · 12K training set |
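As a rough illustration of why LoRA counts as parameter-efficient, the sketch below compares the trainable parameters of a full weight update with those of a rank-`r` LoRA update, where a `d_out × d_in` matrix is adapted through two small factors of shapes `d_out × r` and `r × d_in`. The numbers are illustrative assumptions only: 768 is the hidden width of the base-size variants of both DINOv2 and Wav2Vec 2.0, and rank 8 is a common default, but the backbone sizes and LoRA rank actually used in DIWA-Net are not stated here. The helper name `lora_param_counts` is hypothetical.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    full: every entry of the d_out x d_in matrix is trained.
    lora: only the low-rank factors B (d_out x rank) and A (rank x d_in) are trained.
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Illustrative values, not the project's configuration.
full, lora = lora_param_counts(d_in=768, d_out=768, rank=8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")
# full: 589824, LoRA: 12288, ratio: 0.0208
```

Under these assumed values, the LoRA update trains about 2% of the parameters a full fine-tune of the same matrix would.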
Student team
DIWA-Net was developed by a two-member student team over a full academic year at UET Taxila, under the supervision and co-authorship of Dr. Rabbia Mahum.
Supervisor & co-author
Dr. Rabbia Mahum of the Department of Computer Science, University of Engineering and Technology, Taxila, supervised and co-authored this project, guiding problem formulation, model architecture, training strategy, experimental design, and thesis preparation.
Acknowledgments
- Dr. Rabbia Mahum for supervising and co-authoring the project, and for guiding architecture, experiments, and thesis preparation.
- MAVOS-DD benchmark dataset contributors and the broader deepfake detection research community.
- Meta AI (DINOv2) and Meta/Facebook (Wav2Vec 2.0) for open foundation model backbones.
- Faculty and staff at UET Taxila for guidance throughout the Final Year Project.
Contact: 22-CS-86@students.uettaxila.edu.pk