arxiv:2605.17640
Debashish C PRO
d3bach
AI & ML interests
omni-modal inference and training. GPUs
Recent Activity
upvoted a paper 3 days ago
VL-JEPA: Joint Embedding Predictive Architecture for Vision-language
upvoted a paper 4 days ago
Scaling Audio-Text Retrieval with Multimodal Large Language Models
authored a paper 9 days ago
MARQUIS: A Three-Stage Pipeline for Video Retrieval-Augmented Generation