# EMOLIPS: Emotion-Driven Facial Animation for Realistic Lip-Sync Synthesis
## Overview
EMOLIPS is an emotion-controllable lip-sync framework that modulates 3DMM expression coefficients with FiLM layers conditioned on a target emotion.
## Architecture
- Backbone: SadTalker (3DMM-based talking-face generation)
- Novel module: Emotion-Conditioned Fusion Module (ECFM)
- Conditioning: FiLM layers plus a Lip Consistency Gate (see the sketch after this list)
- Supported emotions: Neutral, Happy, Sad, Angry, Fear, Surprise, Disgust
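As a rough illustration of the conditioning described above, the sketch below shows how FiLM-style modulation of 3DMM expression coefficients by an emotion label could look. The module name (`EmotionFiLM`), the coefficient and embedding sizes, and the tensor shapes are assumptions for illustration only, not the actual EMOLIPS implementation.

```python
import torch
import torch.nn as nn

class EmotionFiLM(nn.Module):
    """Hypothetical FiLM block: scales and shifts 3DMM expression
    coefficients using an embedding of the target emotion."""

    def __init__(self, num_emotions: int = 7, coeff_dim: int = 64, emb_dim: int = 32):
        super().__init__()
        self.emo_embed = nn.Embedding(num_emotions, emb_dim)
        # Predict per-coefficient scale (gamma) and shift (beta) from the emotion embedding.
        self.to_gamma_beta = nn.Linear(emb_dim, 2 * coeff_dim)

    def forward(self, exp_coeffs: torch.Tensor, emotion_id: torch.Tensor) -> torch.Tensor:
        # exp_coeffs: (batch, frames, coeff_dim) expression coefficients from the backbone
        # emotion_id: (batch,) integer label, e.g. 0 = Neutral ... 6 = Disgust
        emb = self.emo_embed(emotion_id)                         # (batch, emb_dim)
        gamma, beta = self.to_gamma_beta(emb).chunk(2, dim=-1)   # each (batch, coeff_dim)
        # FiLM: feature-wise affine modulation, broadcast over the time axis.
        return exp_coeffs * (1.0 + gamma.unsqueeze(1)) + beta.unsqueeze(1)

# Minimal usage example with random inputs.
film = EmotionFiLM()
coeffs = torch.randn(2, 100, 64)   # two clips, 100 frames each
emotions = torch.tensor([1, 3])    # Happy, Angry
modulated = film(coeffs, emotions)
print(modulated.shape)             # torch.Size([2, 100, 64])
```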
## Results
See `EMOLIPS_Submission.ipynb` for the full methodology, architecture diagrams, qualitative outputs, and quantitative metrics.
## Output Samples
| Emotion | Video |
|---|---|
| Neutral | emolips_neutral.mp4 |
| Happy | emolips_happy.mp4 |
| Sad | emolips_sad.mp4 |
| Angry | emolips_angry.mp4 |
| Fear | emolips_fear.mp4 |
| Surprise | emolips_surprise.mp4 |
| Disgust | emolips_disgust.mp4 |
## Quick Start
```bash
git clone https://huggingface.co/primal-sage/emolips
cd emolips/code
bash setup.sh
python inference.py --audio ../samples/input_audio.wav --image ../samples/input_face.jpg --all-emotions
```
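If you prefer to drive the same run from Python, a minimal wrapper around the CLI above might look like the sketch below. It reuses only the flags shown in Quick Start; the `results/` output directory is an assumption and should be adjusted to wherever `inference.py` actually writes its videos.

```python
import subprocess
from pathlib import Path

# The seven emotion labels listed in the Output Samples table above.
EMOTIONS = ["neutral", "happy", "sad", "angry", "fear", "surprise", "disgust"]

# Run the same command as in Quick Start; assumes the working directory is emolips/code.
subprocess.run(
    [
        "python", "inference.py",
        "--audio", "../samples/input_audio.wav",
        "--image", "../samples/input_face.jpg",
        "--all-emotions",
    ],
    check=True,
)

# Check for per-emotion videos named as in the Output Samples table.
# The "results" directory is an assumption; adjust it to the script's real output path.
out_dir = Path("results")
for emo in EMOTIONS:
    clip = out_dir / f"emolips_{emo}.mp4"
    print(f"{clip}: {'found' if clip.exists() else 'missing'}")
```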
## Citation
```bibtex
@article{emolips2026,
  title={EMOLIPS: Emotion-Driven Facial Animation for Realistic Lip-Sync Synthesis},
  year={2026}
}
```