RefControl FLUX.2 Klein - Reference + Lineart
Keep identity from reference, follow lineart structure
Music understanding model for caption and analysis
Text/speech to spoken response + 3D talking-avatar video
Multi-image instruction-guided image editing
Word-level timestamp alignment from audio + transcript
Subject-driven text-to-video from reference images (Wan2.2)
Image matting with diverse prompts via SAM2Matting
Multi-modal generation with diffusion transformers
Anima depth-conditioned image generation via VACE ControlNet
Separate audio into vocals and instruments with BS-Roformer
Polish speech recognition with fine-tuned Whisper Small
Real-time zero-shot stereo disparity estimation
Phone-use GUI agent - screenshot + task to next action
Video verification & temporal grounding with VideoSearch-R1
GUI grounding with VISTA-9B - predict click coordinates
Multi-view visual reasoning VLM based on Qwen3-VL 4B
2x latent super-resolution with FlowUpscaler in Flux.2 space
Object and Material Selection VLM
Document-parsing VLM (1.2B) by KoreaDeep
Vietnamese text-to-speech with Kokoro TTS
Interleaved text and image generation with SenseNova-U1
Parallel region captioning with multimodal diffusion LLM
Unified AR model for image understanding & generation