VIEW2SPACE: Studying Multi-View Visual Reasoning from Sparse Observations

view2space_4b is an ECCV 2026 VIEW2SPACE model built on top of Qwen/Qwen3-VL-4B-Instruct. It is designed for grounded multi-view visual reasoning from sparse observations.

VIEW2SPACE teaser

Quick start

Please see the VIEW2SPACE GitHub repository for evaluation code and usage:

GitHub Repository

Quick links

Overview

VIEW2SPACE studies how vision-language models reason across sparse and heterogeneous viewpoints. Instead of solving a task from a single image, the model must integrate partial observations from multiple views to form a more complete spatial understanding.

This checkpoint is the Qwen3-VL-4B VIEW2SPACE model release and is intended for multi-view visual reasoning under sparse observations.

Model Summary

  • Model name: view2space_4b
  • Base model: Qwen/Qwen3-VL-4B-Instruct
  • Architecture: Qwen3VLForConditionalGeneration
  • Project: VIEW2SPACE
  • Use case: multi-view visual reasoning from sparse observations
  • Venue: ECCV 2026

Highlights

  • Built for grounded multi-view reasoning rather than single-image prediction.
  • Targets sparse observations and heterogeneous viewpoints.
  • Released together with the public VIEW2SPACE testing set and evaluation code.

Resources

  • Public testing release: view2space-v1
  • Official repository: https://github.com/pokerme7777/VIEW2SPACE
  • Public eval pipeline: src/eval in the official repository

Usage Notes

  • Use the official VIEW2SPACE repository for evaluation scripts and prompt formatting.
  • The current public testing release is view2space-v1.
  • If you need another public data format, please open an issue in the GitHub repository.

Framework versions

  • TRL: 0.26.2
  • Transformers: 4.57.0
  • Pytorch: 2.7.1+cu126
  • Datasets: 4.4.2
  • Tokenizers: 0.22.1

Citations

@article{ke2026view2space,
  title={VIEW2SPACE: Studying Multi-View Visual Reasoning from Sparse Observations},
  author={Ke, Fucai and Cai, Zhixi and Li, Boying and Chen, Long and Lin, Beibei and Wang, Weiqing and Haghighi, Pari Delir and Haffari, Gholamreza and Rezatofighi, Hamid},
  journal={arXiv preprint arXiv:2603.16506},
  year={2026}
}
Downloads last month
22
Safetensors
Model size
570k params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Pokerme/view2space_4b

Finetuned
(320)
this model

Collection including Pokerme/view2space_4b

Paper for Pokerme/view2space_4b