Graspmax — GeoMatch v2 · GeoMatch++ · GeoMatch v1

Graspmax contains geometry-aware contact prediction models for dexterous robotic grasping, trained on the CMapDataset across 5 robot end-effectors (EZGripper, Barrett, Robotiq 3-Finger, Allegro, ShadowHand).

⚠️ Version notice: GeoMatch v1 and GeoMatch++ were trained with a corrupted robot_keypoints.json (a 2× scale factor and a ShadowHand axis swap applied at the wrong stage). Use GeoMatch v2 for any new work; v1 and GeoMatch++ are kept for reproducibility only.

Models at a Glance

Model	Status	File prefix	Val loss	Val acc
GeoMatch v2	✅ Recommended	`geomatch_v2_*`	1.594	0.695
GeoMatch++	⚠️ Deprecated (built on v1 encoders)	`geomatch_pp_*`	0.350	0.940
GeoMatch v1	⚠️ Deprecated (corrupted keypoints)	`geomatch_final / checkpoint_epoch*`	0.435	0.959

The lower loss and higher accuracy of v1 and GeoMatch++ are an artefact of training on corrupted keypoints: the 2× scale inflated keypoint distances, which made the contact maps geometrically trivial to predict. v2 trains on correct geometry and is the only model that produces valid IK targets during grasp generation.

Architecture

GeoMatch (v1 and v2 share the same architecture)

Dual GCN encoder (object + robot surface) → L2-normalised embeddings → linear projection heads (512→64) × 2 → 5 autoregressive MLP modules → per-keypoint BCE contact map prediction.
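
For reference, the two standard operations named in this pipeline can be written out as follows. This is a generic statement of L2 normalisation and binary cross-entropy, not a transcription of the repository's loss code; the symbols (z for an encoder embedding, p_i for a predicted contact probability, y_i for its binary label, N for the number of object points scored for one keypoint) are assumptions introduced here.

```latex
\hat{z} = \frac{z}{\lVert z \rVert_2},
\qquad
\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N}\sum_{i=1}^{N}\Big[\, y_i \log p_i + (1 - y_i)\log(1 - p_i) \Big]
```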

Based on: Geometry Matching for Multi-Embodiment Grasping (NeurIPS 2024)

GeoMatch++

Extends GeoMatch with a morphology encoder (GCN over the robot kinematic-tree graph, 9D node features, 32 nodes) and a DCP-style cross-attention transformer that fuses object geometry with robot morphology before contact prediction. Pretrained GeoMatch v1 encoders are frozen.
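
As a rough illustration of what a fixed-size kinematic-tree graph could look like, the sketch below builds a zero-padded 32×32 adjacency matrix from a parent-index list. The parent-list input, the self-loops, and the padding scheme are assumptions made for this example; the repository's own preprocessing may represent the graph differently.

```python
MAX_NODES = 32  # node count stated for the morphology graph

def kinematic_adjacency(parents, max_nodes=MAX_NODES):
    """Build a symmetric, zero-padded adjacency matrix from a parent list.

    parents[i] is the index of link i's parent, or -1 for the root link.
    Self-loops are added for real links (a common GCN convention, assumed here).
    """
    n = len(parents)
    if n > max_nodes:
        raise ValueError(f"{n} links exceed the {max_nodes}-node limit")
    adj = [[0.0] * max_nodes for _ in range(max_nodes)]
    for child, parent in enumerate(parents):
        adj[child][child] = 1.0
        if parent >= 0:
            adj[child][parent] = 1.0
            adj[parent][child] = 1.0
    return adj

# Toy gripper (hypothetical): a palm (root) with two 2-link fingers.
toy_parents = [-1, 0, 1, 0, 3]
adj = kinematic_adjacency(toy_parents)
print(len(adj), len(adj[0]), sum(map(sum, adj)))
```

Rows and columns beyond the real link count stay all-zero, so padded nodes exchange no messages with real links.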

Based on: GeoMatch++: Morphology Conditioned Geometry Matching for Multi-Embodiment Grasping (arXiv:2412.18998)

Component Comparison

Component	GeoMatch v1 / v2	GeoMatch++
Object GCN encoder	3 layers × 256 → 512, trainable	Same, frozen (from GeoMatch v1)
Robot surface GCN	3 layers × 256 → 512, trainable	Same, frozen (from GeoMatch v1)
Morphology encoder	—	NEW GCN(9 → 256×3 → 512), trainable
Cross-attention	—	NEW DCP transformer (512-dim, 4 heads, 1 layer)
Projection heads	Linear(512→64) × 2	Same, re-initialised
AR keypoint modules	5× MLP	Same, re-initialised
Total params	~1.9M	~6.4M (~5.8M trainable)

What Changed in v2 (Keypoint Bug Fix)

GeoMatch requires a robot_keypoints.json that defines canonical 3D keypoint positions for each robot in rest-pose world space. The v1 keypoints had two bugs:

Bug 1 — 2× scale factor: The generation script applied world_pos *= 2.0, citing HandModel's hand_scale=2.0 class default. However, every actual call site passes hand_scale=1.0, overriding that default. Because the scale was applied before the inverse-FK projection that HandModel.get_canonical_keypoints() uses (T⁻¹[2p;1] ≠ 2·T⁻¹[p;1]), the distortion was not uniform — it grew with each link's distance from the kinematic root, corrupting both training labels and inference IK targets.
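
The non-uniformity can be checked numerically. For a rigid link transform T = (R, t), T⁻¹p = Rᵀ(p − t), so T⁻¹(2p) = 2·T⁻¹p + Rᵀt: the buggy result differs from a clean 2× scaling by an offset whose length equals |t|, the link's distance from the root. The sketch below is a minimal illustration; the helper names and numeric values are made up for the example and are not taken from HandModel.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def inverse_rigid(R, t, p):
    """Apply T^-1 to world point p, where T = (R, t): returns R^T (p - t)."""
    d = [p[i] - t[i] for i in range(3)]
    return [sum(R[r][c] * d[r] for r in range(3)) for c in range(3)]

def scale_error(R, t, p_world):
    """Distance between T^-1(2p) and the uniformly scaled 2 * T^-1(p)."""
    buggy = inverse_rigid(R, t, [2.0 * x for x in p_world])
    uniform = [2.0 * x for x in inverse_rigid(R, t, p_world)]
    return math.dist(buggy, uniform)

R = rot_z(math.radians(30.0))
p = [0.02, 0.01, 0.15]                           # illustrative world-space keypoint
near_root = scale_error(R, [0.0, 0.0, 0.02], p)  # link frame 2 cm from the root
far_link = scale_error(R, [0.0, 0.0, 0.12], p)   # link frame 12 cm from the root
print(f"error near root: {near_root:.3f} m, far link: {far_link:.3f} m")
```

The error depends only on the link translation, not on the keypoint itself, which is why links further along the kinematic chain are distorted more.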

Bug 2 — ShadowHand axis-swap at wrong stage: The [x, -z, y] axis permutation for ShadowHand was applied to the final world-space world_pos (after FK). The reference implementation (gripper_utils.py) applies it to raw mesh points in link-local space before the visual-origin transform. Rotation and axis permutation do not commute, so the wrong stage produced scrambled keypoint positions for any ShadowHand link with a non-zero visual-origin rotation.
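
A small numerical check makes the order dependence concrete. The sketch below stands in for the visual-origin transform with a single rotation about z, which is a simplification chosen for illustration; only the [x, -z, y] permutation itself comes from the description above.

```python
import math

def swap_axes(p):
    """ShadowHand axis permutation from the text: [x, y, z] -> [x, -z, y]."""
    x, y, z = p
    return [x, -z, y]

def rotate_z(p, theta):
    """Rotate point p about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = p
    return [c * x - s * y, s * x + c * y, z]

def swap_then_rotate(p, theta):
    """Reference order: permute in link-local space, then apply the rotation."""
    return rotate_z(swap_axes(p), theta)

def rotate_then_swap(p, theta):
    """v1 order: apply the rotation first, then permute the result."""
    return swap_axes(rotate_z(p, theta))

p_local = [0.01, 0.02, 0.03]  # illustrative link-local mesh point
theta = math.pi / 2           # a non-zero stand-in for the visual-origin rotation

correct = swap_then_rotate(p_local, theta)
buggy = rotate_then_swap(p_local, theta)
print("reference order:", [round(v, 3) for v in correct])
print("v1 order:       ", [round(v, 3) for v in buggy])
print("gap between the two:", round(math.dist(correct, buggy), 3), "m")
```

With theta set to 0 the two orders coincide, consistent with the observation that only links with a non-zero visual-origin rotation were affected.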

Both bugs were confirmed by comparing generate_keypoints_json.py against gripper_utils.py and verified by observing that v1 ShadowHand tip keypoints had y-values of ~−0.84 m (outside any physical hand envelope) versus the corrected ~0.01 m.
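
The envelope observation suggests a cheap automated sanity check. The sketch below assumes a hypothetical layout in which each robot name maps to a list of [x, y, z] keypoints in metres; the real robot_keypoints.json schema is not described here and may differ, and the 0.5 m bound is an illustrative threshold rather than a measured hand size.

```python
def out_of_envelope(keypoints, max_abs_m=0.5):
    """Return (robot, index, point) for keypoints with any |coordinate| > max_abs_m.

    `keypoints` is assumed to map robot name -> list of [x, y, z] in metres;
    this layout is hypothetical and may differ from robot_keypoints.json.
    """
    flagged = []
    for robot, points in keypoints.items():
        for idx, point in enumerate(points):
            if any(abs(v) > max_abs_m for v in point):
                flagged.append((robot, idx, point))
    return flagged

# Illustrative values only, loosely modelled on the y-values quoted above.
example = {
    "shadowhand_v1": [[0.03, -0.84, 0.21]],
    "shadowhand_v2": [[0.03, 0.01, 0.21]],
}
for robot, idx, point in out_of_envelope(example):
    print(f"{robot} keypoint {idx} looks implausible: {point}")
```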

Training Details

GeoMatch v2 ✅ (Recommended)

Setting	Value
Dataset	CMapDataset (ContactDB + YCB), fixed keypoints
End-effectors	EZGripper, Barrett, Robotiq 3-Finger, Allegro, ShadowHand
Batch size	256
Optimizer	Adam (β₁=0.9, β₂=0.99)
Learning rate	1e-4
Epochs	200
Hardware	AMD Instinct MI300X (192 GB HBM3), ROCm 6.2.4
Training time	8.58 hours
Precision	FP32
Final val loss	1.594
Final val accuracy	0.695

GeoMatch v2 Training Curves

Epoch	Val Loss	Val Accuracy
0	1.935	0.205
25	1.731	0.563
50	1.675	0.580
100	1.649	0.632
150	1.603	0.656
199	1.594	0.695

GeoMatch++ ⚠️ (Deprecated — built on GeoMatch v1 encoders)

Setting	Value
Initialisation	Pretrained GeoMatch v1 encoders (frozen)
Trainable params	~5.8M
Batch size	32 per GPU × 8 GPUs = 256 effective
Optimizer	Adam (β₁=0.9, β₂=0.99)
Learning rate	5e-5
Epochs	150
Hardware	8× AMD Instinct MI300X, ROCm 6.2.4 (DDP)
Training time	~2.8 hours
Precision	FP32
Final val loss	0.350 (artefact of corrupted training data)
Final val accuracy	0.940 (artefact of corrupted training data)

GeoMatch++ Training Curves

Epoch	Val Loss	Val Accuracy
0	0.465	0.999
25	0.370	0.880
89	0.362	0.902
149	0.350	0.940

GeoMatch v1 ⚠️ (Deprecated — corrupted keypoints)

Setting	Value
Dataset	CMapDataset (ContactDB + YCB), corrupted keypoints
Batch size	256
Optimizer	Adam (β₁=0.9, β₂=0.99)
Learning rate	1e-4
Epochs	200
Hardware	AMD Instinct MI300X (192 GB HBM3), ROCm 6.2.4
Training time	22.18 hours
Precision	FP32
Final val loss	0.435 (artefact of corrupted training data)
Final val accuracy	0.959 (artefact of corrupted training data)

Checkpoints

GeoMatch v2 ✅ (Use these)

File	Epoch	Val Loss	Notes
`geomatch_v2_epoch50.pth`	50	1.675	Early convergence
`geomatch_v2_epoch100.pth`	100	1.649	Mid-training
`geomatch_v2_epoch150.pth`	150	1.603	Near-converged
`geomatch_v2_final.pth`	199	1.594	Final model (recommended)

GeoMatch++ ⚠️ (Deprecated)

File	Epoch	Notes
`geomatch_pp_checkpoint_epoch50.pth`	50	Early convergence
`geomatch_pp_checkpoint_epoch100.pth`	100	Mid-training
`geomatch_pp_checkpoint_epoch140.pth`	140	Near-converged
`geomatch_pp_final.pth`	149	Final (deprecated)

GeoMatch v1 ⚠️ (Deprecated)

File	Epoch	Notes
`checkpoint_epoch50.pth`	50	Early convergence
`checkpoint_epoch100.pth`	100	Mid-training
`checkpoint_epoch150.pth`	150	Near-converged
`geomatch_final.pth`	200	Final (deprecated)

Usage

GeoMatch v2 (Recommended)

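import sys

import torch

sys.path.append(".")  # make the repository root importable
import config
from models.geomatch import GeoMatch

# Build the model from config.py and load the recommended v2 weights.
model = GeoMatch(config).cuda()
model.load_state_dict(torch.load("geomatch_v2_final.pth", map_location="cuda"))
model.eval()  # switch to inference mode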

with torch.no_grad():
    contact_map, keypoint_probs = model(
        obj_pc,               # [B, 2048, 3]   object point cloud
        robot_pc,             # [B, 6, 3]      robot surface points (6 keypoints)
        robot_key_point_idx,  # [B, 6]         keypoint indices into robot_pc
        obj_adj,              # [B, 2048, 2048] object adjacency (sparse COO)
        robot_adj,            # [B, 6, 6]      robot adjacency
        xyz_prev,             # [B, 6, 3]      previous keypoint positions
    )
# contact_map:    [B, 2048, 6, 1]  — per-object-point × per-keypoint contact probability
# keypoint_probs: [B, 2048, 5, 1]  — autoregressive keypoint contact probabilities
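
One simple way to read the output is a greedy per-keypoint argmax over object points. This is a minimal sketch under assumptions: it is not necessarily how the repository generates grasps, the function name is made up for illustration, and it works on a plain nested list for a single sample so that it runs without a GPU.

```python
def best_contact_per_keypoint(contact_probs):
    """For one sample, pick the object point with the highest probability per keypoint.

    contact_probs is a nested list shaped [num_points][num_keypoints], i.e. one
    batch element of contact_map with the trailing singleton axis dropped.
    Returns a list of (point_index, probability), one entry per keypoint.
    """
    num_keypoints = len(contact_probs[0])
    best = []
    for k in range(num_keypoints):
        idx = max(range(len(contact_probs)), key=lambda i: contact_probs[i][k])
        best.append((idx, contact_probs[idx][k]))
    return best

# Toy input: 4 object points, 2 keypoints.
toy = [
    [0.10, 0.70],
    [0.80, 0.20],
    [0.30, 0.90],
    [0.05, 0.40],
]
print(best_contact_per_keypoint(toy))
```

On the actual tensor, the same readout would likely be `contact_map.squeeze(-1).argmax(dim=1)`, giving one object-point index per keypoint with shape [B, 6].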

GeoMatch++ (Deprecated — kept for reproducibility)

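import sys

import torch

sys.path.append(".")  # make the repository root importable
import config
from models.geomatch_pp import GeoMatchPP

# Build the model from config.py and load the deprecated GeoMatch++ weights.
model = GeoMatchPP(config).cuda()
model.load_state_dict(torch.load("geomatch_pp_final.pth", map_location="cuda"))
model.eval()  # switch to inference mode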

with torch.no_grad():
    contact_map, keypoint_probs = model(
        obj_pc,               # [B, 2048, 3]
        robot_pc,             # [B, 6, 3]
        robot_key_point_idx,  # [B, 6]
        obj_adj,              # [B, 2048, 2048]
        robot_adj,            # [B, 6, 6]
        xyz_prev,             # [B, 6, 3]
        morph_features,       # [B, 32, 9]     morphology node features
        morph_adj,            # [B, 32, 32]    morphology adjacency
    )

Morphology graphs are pre-built per robot by preprocess_morphology.py, which writes gnn_morphology_new.pt.

Repository Structure

models/
  geomatch.py      # GeoMatch model (shared by v1 and v2)
  geomatch_pp.py   # GeoMatch++ model (+ morphology encoder + DCP transformer)
  gnn.py           # Graph Convolutional Network
  mlp.py           # MLP building block
config.py          # Hyperparameters for all models
generate_keypoints_json.py  # Fixed keypoint generator (used for v2 training data)

Citation

@inproceedings{geomatch2024,
  title     = {Geometry Matching for Multi-Embodiment Grasping},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year      = {2024},
}

@article{geomatch_pp2024,
  title   = {GeoMatch++: Morphology Conditioned Geometry Matching for Multi-Embodiment Grasping},
  journal = {arXiv preprint arXiv:2412.18998},
  year    = {2024},
}

License

Original GeoMatch code © 2023 DeepMind Technologies Limited, licensed under the Apache License 2.0.
GeoMatch++ extension, v2 training, and all checkpoints produced by Dimios45 as part of the Graspmax project.

Paper for Dimios45/Graspmax

GeoMatch++: Morphology Conditioned Geometry Matching for Multi-Embodiment Grasping

Paper • 2412.18998 • Published Dec 25, 2024