Hugging Face
syed abuthahir
developerabu
2 followers · 18 following
writerabu
abuvanth
AI & ML interests
None yet
Recent Activity
reacted to ginigen-ai's post with 🔥 2 days ago
The RoboCasa Kitchen Leaderboard

What does it take for a robot to handle kitchen chores the way a person does? It has to see (Vision), understand instructions (Language), and actually act (Action), and VLA (Vision-Language-Action) models are emerging as the answer. They're the bridge between large multimodal models and real-world embodied control.

RoboCasa Kitchen is a leading robot-learning benchmark in which a single-arm robot (Franka Panda) performs 24 atomic manipulation tasks (picking up cups and bowls, opening drawers and doors, turning faucets, pressing buttons, and more) inside a photorealistic simulated kitchen. Because the layout and object placement are randomized every episode, it tests genuine generalization rather than memorized motions. The score (success rate, SR) is the average fraction of the 24 tasks completed as instructed, measured over multiple seeds so results aren't down to luck.

The catch: this benchmark has no official leaderboard, and protocols (number of demonstrations, evaluation setup) differ from paper to paper, leaving scores scattered. Lining the numbers up naively quickly turns into an apples-to-oranges comparison. This leaderboard fixes that by collecting published scores with their sources and comparing only what is genuinely comparable.

It's split into three tables:
Kitchen 24-task (matched): head-to-head under identical conditions (per the RLDX-1 Technical Report). This is the core ranking you can actually trust.
Other protocols: self-reported under different setups (e.g. fewer demos). Not directly comparable, so kept separate.
GR1-Tabletop: a different, humanoid-based variant suite, separated to avoid confusion.

Any researcher can submit their own model's score directly, and submissions are reviewed before they appear on the board. Every number links to its source paper, so you can verify it yourself.

https://huggingface.co/spaces/ginigen-ai/robocasa-kitchen-leaderboard
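As a rough illustration of the success-rate definition in the post, here is a minimal Python sketch. It assumes SR is computed by averaging per-task success fractions within each seed and then averaging across seeds. The post does not spell out the exact aggregation order, and the function name, data layout, and numbers below are hypothetical.

```python
from statistics import mean

def success_rate(results):
    """Average success rate over tasks, then over seeds.

    `results` maps a seed (hypothetical layout) to a list of per-task
    success fractions in [0, 1], one entry per task.
    """
    per_seed = [mean(task_scores) for task_scores in results.values()]
    return mean(per_seed)

# Made-up numbers for illustration only: 3 tasks instead of 24, 2 seeds.
example = {
    0: [1.0, 0.5, 0.0],
    1: [1.0, 1.0, 0.5],
}
print(f"SR = {success_rate(example):.3f}")
```

With an equal number of tasks per seed, averaging over seeds first or tasks first gives the same value, so the sketch is not sensitive to that choice.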
reacted to danielhanchen's post with an emoji 7 days ago
1-bit GLM-5.2 GGUF vs. Claude 4.8 Opus vs. GPT-5.5

We gave 3 models the same prompt and compared one-shot outputs. The 1-bit GLM-5.2 GGUF ran locally on a Mac Studio M3 Ultra with 256GB RAM at ~21.6 tok/s. Which output do you like best?

GGUF: https://huggingface.co/unsloth/GLM-5.2-GGUF
liked a model 10 days ago: PatnaikAshish/kokoclone
Organizations
None yet
developerabu's models (13)
developerabu/whisper-tiny-mnn • Updated 17 days ago
developerabu/Hy-MT1.5-1.8B-gguf • 2B • Updated 18 days ago • 239
developerabu/gemma-4-e2b-text-only-litertlm • Updated 23 days ago
developerabu/vits-tts-mnn • Text-to-Speech • Updated May 30 • 10 • 2
developerabu/LFM2-350M-Extract-MNN • Updated Apr 15 • 8
developerabu/Josiefied-Qwen3.5-0.8B-gabliterated-v1-MNN • Text Generation • Updated Apr 14 • 5
developerabu/LFM2.5-350M-MNN • Updated Apr 13 • 4
developerabu/whisper-small-mnn • Updated Apr 13
developerabu/whisper-base-mnn • Updated Mar 18
developerabu/whisper-tiny-en-mnn • Updated Mar 17 • 1
developerabu/multilingual-e5-small-mnn • Updated Mar 7
developerabu/mms-tts-tam-mnn • Updated Feb 15
developerabu/bge-small-en-v1.5-mnn • Updated Jan 11