GridDepth: Pretrained Checkpoints for Transparent-Object Depth Completion
This repo hosts the pretrained checkpoints that go with the atom525/ProgressiveDepth codebase (idea.md series-joint pipeline: TransDiff Refined1 → LIDF) plus our local RFTrans reproduction baselines.
Recipe: see atom525/ProgressiveDepth README.md and docs/PIPELINE.md.
File layout
GridDepth/
├── progressivedepth/                          # idea.md main pipeline (Module A = ip_basic + Module B = LIDF)
│   ├── ckpts/
│   │   ├── lidf_stage1_epoch059.pth           # 248 MB: LIDF Stage 1 (frozen baseline, CG-only Adam 60 ep)
│   │   ├── C_stage2_epoch029.pth              # 2.2 MB: Stage 2 RefineNet, retrained on Refined1 input (idea.md C run)
│   │   └── C_stage3_epoch029.pth              # 2.2 MB: Stage 3 RefineNet hard-neg, retrained on Refined1 input
│   └── configs/
│       ├── train_progressive_stage2.yaml
│       ├── train_progressive_stage3.yaml
│       └── pipeline_config.yaml               # inference / evaluate config
│
└── rftrans/                                   # RFTrans reproduction artifacts
    ├── ckpts/
    │   ├── rfnet_refractive_flow_epoch500.pth # 467 MB: RFNet (DRN backbone), Adam 500 ep on unity/train
    │   ├── f2net_flow2normal_epoch500.pth     # 356 MB: F2Net (simple_unet), Adam 500 ep on unity/train
    │   ├── mask_adam_epoch195.pth             # 312 MB: mask network (DRN), Adam 200 ep on unity/train, mIoU 0.847
    │   └── outlines_side_adam_epoch195.pth    # 312 MB: boundary network (DRN side-output), Adam 200 ep on unity/train
    └── configs/
        ├── refractive_flow_config.yaml        # RFNet train config (Adam, 500 ep)
        ├── flow2normal_config.yaml            # F2Net train config (Adam, 500 ep)
        ├── mask_adam_config.yaml              # mask train config (Adam, 200 ep)
        ├── outlines_side_adam_config.yaml     # boundary train config (Adam, 200 ep)
        └── exp017_paperfaithful.yaml          # rgb2normal e2e config (paper-faithful: SGD 100 ep, lr=1e-4 mom=0.9 wd=5e-4)
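To confirm that a local download matches the layout above, a small standard-library check can help. This is a minimal sketch: the helper name `missing_files` and the assumption that you pass the local GridDepth/ root are illustrative and not part of the upstream codebase. The relative paths are copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the file layout above.
EXPECTED_FILES = [
    "progressivedepth/ckpts/lidf_stage1_epoch059.pth",
    "progressivedepth/ckpts/C_stage2_epoch029.pth",
    "progressivedepth/ckpts/C_stage3_epoch029.pth",
    "progressivedepth/configs/train_progressive_stage2.yaml",
    "progressivedepth/configs/train_progressive_stage3.yaml",
    "progressivedepth/configs/pipeline_config.yaml",
    "rftrans/ckpts/rfnet_refractive_flow_epoch500.pth",
    "rftrans/ckpts/f2net_flow2normal_epoch500.pth",
    "rftrans/ckpts/mask_adam_epoch195.pth",
    "rftrans/ckpts/outlines_side_adam_epoch195.pth",
    "rftrans/configs/refractive_flow_config.yaml",
    "rftrans/configs/flow2normal_config.yaml",
    "rftrans/configs/mask_adam_config.yaml",
    "rftrans/configs/outlines_side_adam_config.yaml",
    "rftrans/configs/exp017_paperfaithful.yaml",
]


def missing_files(root):
    """Return the expected files that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_files("GridDepth")` returns an empty list when every checkpoint and config is in place.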
ProgressiveDepth (idea.md series-joint pipeline)
Pipeline:
RGB + Noisy Depth
        │
        ▼ Module A: TransDiff Data Preprocessing (ip_basic multi-scale morphological filling)
Refined Depth1
        │
        ▼ Module B: LIDF (Stage 1 frozen + Stage 2 / 3 retrained on Refined1)
Final Depth
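To give a feel for what "multi-scale morphological filling" does in Module A, here is a toy stand-in in pure Python. It is not ip_basic's actual algorithm or the repo's preprocessing code: the function names, the window radii, the use of 0 as the "invalid" marker, and the choice to fill with the smallest valid neighbour are all assumptions made for illustration.

```python
def fill_holes(depth, radius):
    """Fill invalid (zero) pixels with the smallest valid depth in a (2r+1)x(2r+1) window."""
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            if depth[y][x] > 0:
                continue  # valid measurement: keep as-is
            neighbours = [
                depth[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
                if depth[j][i] > 0
            ]
            if neighbours:
                out[y][x] = min(neighbours)  # illustrative choice, not ip_basic's rule
    return out


def multiscale_fill(depth, radii=(1, 2, 3)):
    """Apply the fill with growing windows so small holes are closed before large ones."""
    for r in radii:
        depth = fill_holes(depth, r)
    return depth


# Toy "noisy depth" with a hole (zeros) in the middle of each row.
noisy = [
    [1.0, 0.0, 0.0, 0.0, 2.0],
    [1.0, 0.0, 0.0, 0.0, 2.0],
]
refined1 = multiscale_fill(noisy)
```

In the real pipeline the output of this stage (Refined Depth1) is what gets fed to LIDF; the sketch only shows why a filling step can help on sparse real-sensor depth and add noise on already-clean synthetic depth.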
Final results (paper protocol: 256×144 + per-image avg + corrupt mask)
C_full = lidf_stage1_epoch059.pth + C_stage2_epoch029.pth + C_stage3_epoch029.pth; evaluation uses mode A (feed_to_lidf=refined1):
| Dataset | C_full RMSE ↓ | C_full δ1.05 (%) ↑ | B baseline RMSE ↓ | B baseline δ1.05 (%) ↑ | LIDF paper Table 1 (RMSE / δ1.05) |
|---|---|---|---|---|---|
| real-test (Real-novel) | 0.0403 | 45.28 | 0.0443 | 40.18 | 0.0250 / 76.21 |
| real-val (Real-known) | 0.0351 | 77.22 | 0.0358 | 77.18 | 0.0280 / 82.37 |
| synthetic-test (Syn-novel) | 0.0328 | 62.82 | 0.0305 | 66.12 | 0.0280 / 68.62 |
| synthetic-val (Syn-known) | 0.0129 | 93.72 | 0.0111 | 96.07 | 0.0120 / 94.79 |
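For readers unfamiliar with the two metrics, the sketch below shows one plausible reading of "per-image avg + corrupt mask": RMSE and δ1.05 are computed per image over the masked pixels only, then averaged across images. This is not the repo's evaluation code. The function names, the flat-list image representation, and the treatment of zero ground truth as invalid are assumptions, and the resize to 256×144 is omitted.

```python
import math


def image_metrics(pred, gt, mask, threshold=1.05):
    """RMSE and delta-threshold accuracy (%) over the masked pixels of one image."""
    pairs = [(p, g) for p, g, m in zip(pred, gt, mask) if m and g > 0]
    if not pairs:
        return None  # nothing to evaluate in this image
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))
    # A pixel counts as a hit when max(pred/gt, gt/pred) is below the threshold.
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    return rmse, 100.0 * hits / len(pairs)


def dataset_metrics(images, threshold=1.05):
    """Average the per-image metrics over all images that have masked pixels."""
    per_image = [image_metrics(p, g, m, threshold) for p, g, m in images]
    per_image = [r for r in per_image if r is not None]
    if not per_image:
        raise ValueError("no image contains evaluable pixels")
    n = len(per_image)
    return sum(r[0] for r in per_image) / n, sum(r[1] for r in per_image) / n


# Two tiny flattened "images": (prediction, ground truth, mask).
images = [
    ([1.00, 1.10, 0.50], [1.00, 1.00, 0.50], [1, 1, 0]),
    ([2.00], [2.00], [1]),
]
avg_rmse, avg_delta = dataset_metrics(images)
```

Averaging per image (rather than over all pixels pooled together) gives every image equal weight regardless of how many masked pixels it contains.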
Conclusion: the idea.md series-joint approach helps on real-world data (Real-novel RMSE ↓9%, δ1.05 ↑5 pts vs. baseline B) but regresses on synthetic data, where ip_basic adds noise to already-clean inputs. We attribute the remaining gap to the LIDF paper's Table 1 to the Omniverse Object Dataset being unavailable (link broken since 2025-03, see NVlabs/implicit_depth#3).
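The relative changes quoted in the conclusion can be re-derived from the table. The snippet below is a small arithmetic check: the numbers are copied from the results table, while the dictionary layout and the `summarize` helper are illustrative.

```python
# (C_full, baseline B) values copied from the results table.
results = {
    "real-test": {"rmse": (0.0403, 0.0443), "delta": (45.28, 40.18)},
    "real-val": {"rmse": (0.0351, 0.0358), "delta": (77.22, 77.18)},
    "synthetic-test": {"rmse": (0.0328, 0.0305), "delta": (62.82, 66.12)},
    "synthetic-val": {"rmse": (0.0129, 0.0111), "delta": (93.72, 96.07)},
}


def summarize(results):
    """Return {dataset: (relative RMSE change in %, delta1.05 change in points)}."""
    summary = {}
    for name, row in results.items():
        c_rmse, b_rmse = row["rmse"]
        c_delta, b_delta = row["delta"]
        rmse_change = 100.0 * (c_rmse - b_rmse) / b_rmse  # negative means better
        delta_change = c_delta - b_delta                  # positive means better
        summary[name] = (rmse_change, delta_change)
    return summary


summary = summarize(results)
for name, (rmse_change, delta_change) in summary.items():
    print(f"{name}: RMSE {rmse_change:+.1f}%, delta1.05 {delta_change:+.2f} pts")
```

Negative RMSE change and positive δ1.05 change both indicate that C_full beats baseline B; the two synthetic splits show the opposite signs.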
RFTrans reproduction
Pipeline (per RFTrans paper §III-C):
RGB ──> RFNet ──> refractive flow + mask + boundary
                        │
                        ├──> F2Net ──> surface normal
                        │
                        └──> depth2depth global opt ──> Refined Depth
Caveats
- Architecture deviation: paper §III-C says "RFNet predicts mask, boundary, and refractive flow" (multi-task), but the official repo does not implement this. We trained separate networks (RFNet predicts only flow, F2Net predicts normals from flow, and mask and boundary are independent DeepLab+DRN networks). This matches the actual repo structure but not the paper text.
- Optimizer deviation: paper §IV-A specifies SGD (lr=1e-4, momentum=0.9, weight_decay=5e-4) for 100 epochs. We used Adam for sub-network training because, in our runs, SGD with lr=1e-4 from random initialization did not converge (mask val mIoU ~0.46, i.e. random level, after 100 epochs of SGD vs. 0.85 with 200 epochs of Adam). The provided exp017_paperfaithful.yaml IS paper-faithful (SGD, 100 epochs); it is used for the end-to-end fine-tuning stage, where it warm-starts from the Adam-trained RFNet/F2Net.
- Training data: all networks were trained on data/unity/train/ (5000 RGB + flow + mask + boundary + normal GT, generated with Unity-RefractiveFlowRender). This is the dataset specified by RFTrans paper §IV-A.
How to use these RFTrans ckpts
In your RFTrans/eval_depth_completion/config_*.yaml:
rgb2flow:
  pathWeightsFile: <path_to>/rfnet_refractive_flow_epoch500.pth
flow2normal:
  pathWeightsFile: <path_to>/f2net_flow2normal_epoch500.pth
masks:
  pathWeightsFile: <path_to>/mask_adam_epoch195.pth # OR cleargrasp_orig/.../checkpoint_mask.pth
outlines:
  pathWeightsFile: <path_to>/outlines_side_adam_epoch195.pth # OR cleargrasp_orig/.../checkpoint_outlines.pth
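If you prefer to generate these entries rather than edit them by hand, the sketch below renders the same four sections for a given checkpoint directory. It assumes each pathWeightsFile key is nested under its section name; the helper name `render_weights_fragment` and the example directory `/data/GridDepth/rftrans/ckpts` are hypothetical. The output is meant to be pasted into your own config, not to replace it.

```python
from pathlib import PurePosixPath

# Section name -> checkpoint file name, both taken from the config entries above.
RFTRANS_WEIGHTS = {
    "rgb2flow": "rfnet_refractive_flow_epoch500.pth",
    "flow2normal": "f2net_flow2normal_epoch500.pth",
    "masks": "mask_adam_epoch195.pth",
    "outlines": "outlines_side_adam_epoch195.pth",
}


def render_weights_fragment(ckpt_dir):
    """Return a YAML fragment pointing each section at its checkpoint in `ckpt_dir`."""
    lines = []
    for section, filename in RFTRANS_WEIGHTS.items():
        lines.append(f"{section}:")
        lines.append(f"  pathWeightsFile: {PurePosixPath(ckpt_dir) / filename}")
    return "\n".join(lines)


# Hypothetical location of the downloaded rftrans/ckpts directory.
fragment = render_weights_fragment("/data/GridDepth/rftrans/ckpts")
print(fragment)
```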
Environment / dependencies
- Python 3.8, PyTorch 2.0.0+cu118
- LIDF: see implicit_depth/requirements.txt
- RFTrans: needs the depth2depth C++ binary and libhdf5.so from the conda env
License
- LIDF Stage 1 ckpt and code: NVIDIA Source Code License (Non-Commercial), inherited from NVlabs/implicit_depth
- RFTrans ckpts and code: inherited from LJY-XCX/RFTrans license
- Our extensions (transdiff_preprocess wrapper, train_progressive trainer, retrained checkpoints): same license as the corresponding upstream project
Citation
If you use these checkpoints, please cite the original works:
@inproceedings{zhu2021rgbd,
title={RGB-D Local Implicit Function for Depth Completion of Transparent Objects},
author={Zhu, Luyang and Mousavian, Arsalan and Xiang, Yu and Mazhar, Hammad and van Eenbergen, Jozef and Debnath, Shoubhik and Fox, Dieter},
booktitle={CVPR},
year={2021}
}
@article{tang2024rftrans,
title={RFTrans: Leveraging Refractive Flow of Transparent Objects for Surface Normal Estimation and Manipulation},
author={Tang, Tutian and Liu, Jiyu and Zhang, Jieyi and Fu, Haoyuan and Xu, Wenqiang and Lu, Cewu},
journal={IEEE Robotics and Automation Letters},
year={2024}
}