deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient per vision tokens/performance ratio
> covers 100 languages
Mar 20 Releases
GAIR/daVinci-MagiHuman • Updated about 8 hours ago • 21 • 16
datalab-to/chandra-ocr-2 • Image-Text-to-Text • 5B • Updated 5 days ago • 6.3k • 77
meituan-longcat/LongCat-Flash-Prover • Text Generation • 561B • Updated about 2 hours ago • 15 • 17
baidu/Qianfan-OCR • Image-Text-to-Text • 5B • Updated 4 days ago • 6.24k • 305
Jan 26 Releases
robbyant/lingbot-world-base-cam • Image-to-Video • Updated Feb 2 • 328
nvidia/C-RADIOv4-H • Feature Extraction • Updated Jan 30 • 4.85k • 61
deepseek-ai/DeepSeek-OCR-2 • Image-Text-to-Text • 3B • Updated Feb 3 • 1.25M • 876
arcee-ai/Trinity-Large-Base • Text Generation • 399B • Updated Jan 27 • 89 • 52
Running on CPU Upgrade 18 Daggr Image To 3d 👀 Convert images into 3D assets with background removal and enhancement
Running on Zero Featured 112 SAM3 Video Segmentation 🐠 Track and label objects in videos using text prompts or clicks