
3

Sam

samsam55


AI & ML interests

None yet

Organizations

None yet

samsam55's collections (13)

Reinforcement Learning Etc..

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5, 2025 • 137

VISTA: A Test-Time Self-Improving Video Generation Agent

Paper • 2510.15831 • Published Oct 17, 2025 • 22
Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation

Paper • 2510.15624 • Published Oct 17, 2025 • 15

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published Oct 16, 2025 • 106

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Paper • 2504.00906 • Published Apr 1, 2025 • 27
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7, 2025 • 141
microsoft/Fara-7B

Image-Text-to-Text • Updated Dec 11, 2025 • 57.5k • 477

Visual Multi Modal LLM

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Paper • 2510.08565 • Published Oct 9, 2025 • 21
Detect Anything via Next Point Prediction

Paper • 2510.12798 • Published Oct 14, 2025 • 50
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16, 2025 • 120
DeepEyesV2: Toward Agentic Multimodal Model

Paper • 2511.05271 • Published Nov 7, 2025 • 45

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

Paper • 2510.03663 • Published Oct 4, 2025 • 16
LLM-guided Hierarchical Retrieval

Paper • 2510.13217 • Published Oct 15, 2025 • 21
AnyUp: Universal Feature Upsampling

Paper • 2510.12764 • Published Oct 14, 2025 • 12
katanemo/Arch-Router-1.5B

Text Generation • Updated Nov 16, 2025 • 1.02k • 248

3D Models & Modeling

Towards Scalable and Consistent 3D Editing

Paper • 2510.02994 • Published Oct 3, 2025 • 6
UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections

Paper • 2509.24817 • Published Sep 29, 2025 • 9
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks

Paper • 2510.15019 • Published Oct 16, 2025 • 64
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

Paper • 2510.15869 • Published Oct 17, 2025 • 50

nick007x/github-code-2025

Viewer • Updated Oct 15, 2025 • 147M • 5.23k • 116
fka/prompts.chat

Viewer • Updated about 16 hours ago • 1.47k • 23.3k • 9.62k

Run on CPU Optimizations

BitNet Distillation

Paper • 2510.13998 • Published Oct 15, 2025 • 59
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

Paper • 2510.25602 • Published Oct 29, 2025 • 78

World View Creation (out painting 3D)

FlashWorld: High-quality 3D Scene Generation within Seconds

Paper • 2510.13678 • Published Oct 15, 2025 • 73

zai-org/GLM-4.6

Text Generation • 357B • Updated Sep 30, 2025 • 63.3k • 1.21k
A Survey of Vibe Coding with Large Language Models

Paper • 2510.12399 • Published Oct 14, 2025 • 50

TTS & Speech to Text

Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction

Paper • 2510.03117 • Published Oct 3, 2025 • 12
ResembleAI/chatterbox

Text-to-Speech • Updated Sep 23, 2025 • 2.23M • 1.51k
thewh1teagle/phonikud

0.3B • Updated Aug 24, 2025 • 188 • 1
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE

Paper • 2510.13344 • Published Oct 15, 2025 • 63

Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks

Paper • 2510.08002 • Published Oct 9, 2025 • 23

