Mwangi PRO

Benson

118 292

AI & ML interests

None yet

Recent Activity

upvoted a paper about 14 hours ago

Scaling Language-Centric Omnimodal Representation Learning

liked a model about 15 hours ago

LCO-Embedding/LCO-Embedding-Omni-7B

liked a Space about 15 hours ago

mteb/leaderboard

Organizations

None yet

upvoted a paper about 14 hours ago

Scaling Language-Centric Omnimodal Representation Learning

Paper • 2510.11693 • Published Oct 13, 2025 • 109

upvoted a paper about 15 hours ago

WordVoice: Explicit and Decoupled Multi-Dimensional Word-Level Control for LLM-Based TTS

Paper • 2607.06461 • Published 18 days ago • 1

upvoted 3 papers about 1 month ago

SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

Paper • 2606.10804 • Published Jun 9 • 54

Preference Learning Unlocks LLMs' Psycho-Counseling Skills

Paper • 2502.19731 • Published Feb 27, 2025 • 8

OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains

Paper • 2606.14702 • Published Jun 12 • 31

upvoted 5 papers about 2 months ago

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV

Paper • 2605.26244 • Published May 25 • 38

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Paper • 2605.27365 • Published May 26 • 146

upvoted 4 papers 2 months ago

Lance: Unified Multimodal Modeling by Multi-Task Synergy

Paper • 2605.18678 • Published May 18 • 79

APRES: An Agentic Paper Revision and Evaluation System

Paper • 2603.03142 • Published Mar 3 • 3

EgoMemReason: A Memory-Driven Reasoning Benchmark for Long-Horizon Egocentric Video Understanding

Paper • 2605.09874 • Published May 11 • 2

jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition

Paper • 2605.08384 • Published May 8 • 11

upvoted a collection 2 months ago

jina-embeddings-v5-omni

Collection

Multimodal (text + image + video + audio) embedding models aligned with jina-embeddings-v5-text-*. Two sizes, four task variants each. • 27 items • Updated May 12 • 36

upvoted 2 papers 2 months ago

CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models

Paper • 2605.08735 • Published May 9 • 71

SkillOS: Learning Skill Curation for Self-Evolving Agents

Paper • 2605.06614 • Published May 7 • 47

upvoted an article 3 months ago

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

Article • nvidia • Apr 28 • 62

upvoted 2 papers 3 months ago

Qwen3.5-Omni Technical Report

Paper • 2604.15804 • Published Apr 17 • 60

VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph

Paper • 2602.12735 • Published Feb 13 • 8
