A comprehensive framework designed to cultivate VLMs with human-like visuospatial abilities.
Ray Yang
rayruiyang
AI & ML interests
None yet
Recent Activity
upvoted a paper 30 minutes ago
Utonia: Toward One Encoder for All Point Clouds
updated a collection about 1 month ago
VST
updated a dataset about 1 month ago
rayruiyang/vst_3d_grounding_benchmark
Organizations
None yet