Graspmax — GeoMatch v2 · GeoMatch++ · GeoMatch v1

Graspmax contains geometry-aware contact prediction models for dexterous robotic grasping, trained on the CMapDataset across 5 robot end-effectors (EZGripper, Barrett, Robotiq 3-Finger, Allegro, ShadowHand).

⚠️ Version notice: GeoMatch v1 and GeoMatch++ were trained with a corrupted robot_keypoints.json (a spurious 2× scale factor and a ShadowHand axis swap applied at the wrong stage). Use GeoMatch v2 for any new work; v1 and GeoMatch++ are kept for reproducibility only.


Models at a Glance

| Model | Status | File prefix | Val loss | Val acc |
|---|---|---|---|---|
| GeoMatch v2 | Recommended | geomatch_v2_* | 1.594 | 0.695 |
| GeoMatch++ | ⚠️ Deprecated (built on v1 encoders) | geomatch_pp_* | 0.350 | 0.940 |
| GeoMatch v1 | ⚠️ Deprecated (corrupted keypoints) | geomatch_final / checkpoint_epoch* | 0.435 | 0.959 |

The lower loss and higher accuracy of v1 and GeoMatch++ are artefacts of training on corrupted keypoints: the 2× scale inflated keypoint distances, which made the contact maps geometrically trivial to predict. Their metrics are therefore not comparable with those of v2. v2 trains on correct geometry and is the only model that produces valid IK targets during grasp generation.


Architecture

GeoMatch (v1 and v2 share the same architecture)

Dual GCN encoder (object + robot surface) → L2-normalised embeddings → linear projection heads (512→64) × 2 → 5 autoregressive MLP modules → per-keypoint BCE contact map prediction.

Based on: Geometry Matching for Multi-Embodiment Grasping (NeurIPS 2024)
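
As a reading aid for the pipeline above, here is a minimal standard-library sketch of two of the named operations: L2 normalisation of an embedding and binary cross-entropy (BCE) for a single contact probability. The helper names `l2_normalize` and `bce` are illustrative assumptions and are not taken from models/geomatch.py, which presumably uses the batched PyTorch equivalents.

```python
import math

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / max(length, eps) for x in vec]

def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for one predicted probability and one 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp so log() stays finite
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

# Toy 2-D "embedding"; the real embeddings are 512-dimensional.
embedding = l2_normalize([3.0, 4.0])

# The same confident prediction (0.9) scored against both possible labels.
loss_when_contact = bce(0.9, 1)     # label agrees with the prediction
loss_when_no_contact = bce(0.9, 0)  # label disagrees with the prediction
```

The loss is small when the prediction agrees with the label and large when it does not, which is the signal each per-keypoint contact prediction is trained on.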

GeoMatch++

Extends GeoMatch with a morphology encoder (GCN over the robot kinematic-tree graph, 9D node features, 32 nodes) and a DCP-style cross-attention transformer that fuses object geometry with robot morphology before contact prediction. Pretrained GeoMatch v1 encoders are frozen.

Based on: GeoMatch++: Morphology-Aware Grasping via Correspondence Learning

Component Comparison

| Component | GeoMatch v1 / v2 | GeoMatch++ |
|---|---|---|
| Object GCN encoder | 3 layers × 256 → 512, trainable | Same, frozen (from GeoMatch v1) |
| Robot surface GCN | 3 layers × 256 → 512, trainable | Same, frozen (from GeoMatch v1) |
| Morphology encoder | (none) | NEW: GCN(9 → 256×3 → 512), trainable |
| Cross-attention | (none) | NEW: DCP transformer (512-dim, 4 heads, 1 layer) |
| Projection heads | Linear(512→64) × 2 | Same, re-initialised |
| AR keypoint modules | 5× MLP | Same, re-initialised |
| Total params | ~1.9M | 6.4M (5.8M trainable) |

What Changed in v2 (Keypoint Bug Fix)

GeoMatch requires a robot_keypoints.json that defines canonical 3D keypoint positions for each robot in rest-pose world space. The v1 keypoints had two bugs:

Bug 1 — 2× scale factor: The generation script applied world_pos *= 2.0, citing HandModel's hand_scale=2.0 class default. However, every actual call site passes hand_scale=1.0, overriding that default. Because the scale was applied before the inverse-FK projection that HandModel.get_canonical_keypoints() uses (T⁻¹[2p;1] ≠ 2·T⁻¹[p;1]), the distortion was not uniform — it grew with each link's distance from the kinematic root, corrupting both training labels and inference IK targets.
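
The inequality in Bug 1 can be checked numerically. The sketch below assumes T is a rigid link-to-world transform (x_world = R·x_local + t), so T⁻¹[p;1] = Rᵀ(p − t). Under that assumption, scaling before the projection differs from a uniform 2× scaling by exactly Rᵀt, whose length is |t|, the link frame's distance from the origin. The point, angle, and distances are made-up values, and `to_local` and `scale_gap` are hypothetical helpers, not functions from HandModel.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def to_local(R, t, p_world):
    """Apply T^-1 for a rigid transform T: x_world = R @ x_local + t."""
    d = [p_world[i] - t[i] for i in range(3)]
    return [sum(R[k][i] * d[k] for k in range(3)) for i in range(3)]  # R^T @ d

def scale_gap(link_distance, p_world=(0.02, 0.11, 0.05), angle_deg=30.0):
    """Distance between 'scale, then project' and 'project, then scale'."""
    R = rot_z(math.radians(angle_deg))
    t = [link_distance, 0.0, 0.0]  # link frame origin in world space
    scaled_first = to_local(R, t, [2.0 * x for x in p_world])
    scaled_after = [2.0 * x for x in to_local(R, t, p_world)]
    return math.dist(scaled_first, scaled_after)

# The gap equals |t|: zero at the root, growing with link distance.
gaps = {d: scale_gap(d) for d in (0.0, 0.05, 0.10, 0.20)}
```

In this toy setup the gap is zero for a link at the origin and equals the link's offset otherwise, consistent with the non-uniform distortion described above.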

Bug 2 — ShadowHand axis-swap at wrong stage: The [x, -z, y] axis permutation for ShadowHand was applied to the final world-space world_pos (after FK). The reference implementation (gripper_utils.py) applies it to raw mesh points in link-local space before the visual-origin transform. Rotation and axis permutation do not commute, so the wrong stage produced scrambled keypoint positions for any ShadowHand link with a non-zero visual-origin rotation.
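
A small numeric check of the non-commutation claim in Bug 2. As a simplifying assumption, the visual-origin transform is modelled here as a single rotation about the z axis with no translation. The real URDF origin carries a full rotation and offset. `axis_swap`, `rotate_z`, and `stage_gap` are hypothetical helpers, not code from gripper_utils.py.

```python
import math

def axis_swap(p):
    """The [x, -z, y] permutation quoted above."""
    x, y, z = p
    return [x, -z, y]

def rotate_z(p, angle_deg):
    """Rotate a point about the z axis (stand-in for the visual-origin rotation)."""
    a = math.radians(angle_deg)
    x, y, z = p
    return [x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z]

def stage_gap(angle_deg, p_local=(0.01, 0.02, 0.03)):
    """Distance between swapping before the rotation and swapping after it."""
    swap_then_rotate = rotate_z(axis_swap(p_local), angle_deg)  # link-local stage
    rotate_then_swap = axis_swap(rotate_z(p_local, angle_deg))  # post-transform stage
    return math.dist(swap_then_rotate, rotate_then_swap)

gap_identity = stage_gap(0.0)    # no rotation: the order does not matter
gap_rotated = stage_gap(90.0)    # 90 degree rotation: the two orders disagree
```

One caveat of this toy model: the [x, -z, y] swap is itself a 90° rotation about x, so an origin rotation purely about x would commute with it. The size of the error therefore depends on each link's rotation axis.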

Both bugs were confirmed by comparing generate_keypoints_json.py against gripper_utils.py and verified by observing that v1 ShadowHand tip keypoints had y-values of ~−0.84 m (outside any physical hand envelope) versus the corrected ~0.01 m.
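
The envelope observation suggests a cheap sanity check on a keypoint file. The sketch below assumes a nested layout of robot → link → list of [x, y, z] points in metres and a 0.5 m per-axis limit. Both are assumptions for illustration, as the actual robot_keypoints.json schema is not described here. In the sample data only the y-values (−0.84 and 0.01) come from the text above; the robot keys, link name, and x/z values are made up.

```python
def find_outliers(keypoints, limit_m=0.5):
    """Return (robot, link, point) for every keypoint with a coordinate beyond limit_m."""
    outliers = []
    for robot, links in keypoints.items():
        for link, points in links.items():
            for point in points:
                if any(abs(coord) > limit_m for coord in point):
                    outliers.append((robot, link, point))
    return outliers

# Hypothetical sample in the assumed schema.
sample = {
    "shadowhand_v1_like": {"fftip": [[0.03, -0.84, 0.10]]},
    "shadowhand_v2_like": {"fftip": [[0.03, 0.01, 0.10]]},
}

flagged = find_outliers(sample)  # only the v1-like tip exceeds the limit
```

To run this against a real file, load it with `json.load` and adapt the loops to the file's actual structure.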


Training Details

GeoMatch v2 ✅ (Recommended)

| Setting | Value |
|---|---|
| Dataset | CMapDataset (ContactDB + YCB), fixed keypoints |
| End-effectors | EZGripper, Barrett, Robotiq 3-Finger, Allegro, ShadowHand |
| Batch size | 256 |
| Optimizer | Adam (β₁=0.9, β₂=0.99) |
| Learning rate | 1e-4 |
| Epochs | 200 |
| Hardware | AMD Instinct MI300X (192 GB HBM3), ROCm 6.2.4 |
| Training time | 8.58 hours |
| Precision | FP32 |
| Final val loss | 1.594 |
| Final val accuracy | 0.695 |

GeoMatch v2 Training Curves

| Epoch | Val Loss | Val Accuracy |
|---|---|---|
| 0 | 1.935 | 0.205 |
| 25 | 1.731 | 0.563 |
| 50 | 1.675 | 0.580 |
| 100 | 1.649 | 0.632 |
| 150 | 1.603 | 0.656 |
| 199 | 1.594 | 0.695 |

GeoMatch++ ⚠️ (Deprecated — built on GeoMatch v1 encoders)

| Setting | Value |
|---|---|
| Initialisation | Pretrained GeoMatch v1 encoders (frozen) |
| Trainable params | ~5.8M |
| Batch size | 32 per GPU × 8 GPUs = 256 effective |
| Optimizer | Adam (β₁=0.9, β₂=0.99) |
| Learning rate | 5e-5 |
| Epochs | 150 |
| Hardware | 8× AMD Instinct MI300X, ROCm 6.2.4 (DDP) |
| Training time | ~2.8 hours |
| Precision | FP32 |
| Final val loss | 0.350 (artefact of corrupted training data) |
| Final val accuracy | 0.940 (artefact of corrupted training data) |

GeoMatch++ Training Curves

| Epoch | Val Loss | Val Accuracy |
|---|---|---|
| 0 | 0.465 | 0.999 |
| 25 | 0.370 | 0.880 |
| 89 | 0.362 | 0.902 |
| 149 | 0.350 | 0.940 |

GeoMatch v1 ⚠️ (Deprecated — corrupted keypoints)

| Setting | Value |
|---|---|
| Dataset | CMapDataset (ContactDB + YCB), corrupted keypoints |
| Batch size | 256 |
| Optimizer | Adam (β₁=0.9, β₂=0.99) |
| Learning rate | 1e-4 |
| Epochs | 200 |
| Hardware | AMD Instinct MI300X (192 GB HBM3), ROCm 6.2.4 |
| Training time | 22.18 hours |
| Precision | FP32 |
| Final val loss | 0.435 (artefact of corrupted training data) |
| Final val accuracy | 0.959 (artefact of corrupted training data) |

Checkpoints

GeoMatch v2 ✅ (Use these)

| File | Epoch | Val Loss | Notes |
|---|---|---|---|
| geomatch_v2_epoch50.pth | 50 | 1.675 | Early convergence |
| geomatch_v2_epoch100.pth | 100 | 1.649 | Mid-training |
| geomatch_v2_epoch150.pth | 150 | 1.603 | Near-converged |
| geomatch_v2_final.pth | 199 | 1.594 | Final model (recommended) |

GeoMatch++ ⚠️ (Deprecated)

| File | Epoch | Notes |
|---|---|---|
| geomatch_pp_checkpoint_epoch50.pth | 50 | Early convergence |
| geomatch_pp_checkpoint_epoch100.pth | 100 | Mid-training |
| geomatch_pp_checkpoint_epoch140.pth | 140 | Near-converged |
| geomatch_pp_final.pth | 149 | Final (deprecated) |

GeoMatch v1 ⚠️ (Deprecated)

| File | Epoch | Notes |
|---|---|---|
| checkpoint_epoch50.pth | 50 | Early convergence |
| checkpoint_epoch100.pth | 100 | Mid-training |
| checkpoint_epoch150.pth | 150 | Near-converged |
| geomatch_final.pth | 200 | Final (deprecated) |

Usage

GeoMatch v2 (Recommended)

import torch, sys
sys.path.append(".")
import config
from models.geomatch import GeoMatch

model = GeoMatch(config).cuda()
model.load_state_dict(torch.load("geomatch_v2_final.pth", map_location="cuda"))
model.eval()

with torch.no_grad():
    contact_map, keypoint_probs = model(
        obj_pc,               # [B, 2048, 3]   object point cloud
        robot_pc,             # [B, 6, 3]      robot surface points (6 keypoints)
        robot_key_point_idx,  # [B, 6]         keypoint indices into robot_pc
        obj_adj,              # [B, 2048, 2048] object adjacency (sparse COO)
        robot_adj,            # [B, 6, 6]      robot adjacency
        xyz_prev,             # [B, 6, 3]      previous keypoint positions
    )
# contact_map:    [B, 2048, 6, 1]  — per-object-point × per-keypoint contact probability
# keypoint_probs: [B, 2048, 5, 1]  — autoregressive keypoint contact probabilities
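
One plausible way to read the `contact_map` output is to pick, for each keypoint, the object point with the highest contact probability. The sketch below shows that indexing on nested lists shaped [B][N][K][1]. Treating the argmax as a grasp target is an assumption for illustration, not necessarily how the repository's grasp generation uses these outputs, and `best_object_points` is a hypothetical helper.

```python
def best_object_points(contact_map):
    """For nested lists shaped [B][N][K][1], return [B][K] indices of the
    object point with the highest contact probability per keypoint."""
    result = []
    for sample in contact_map:
        num_keypoints = len(sample[0])
        per_keypoint = []
        for k in range(num_keypoints):
            scores = [point[k][0] for point in sample]  # column k over all N points
            per_keypoint.append(max(range(len(scores)), key=scores.__getitem__))
        result.append(per_keypoint)
    return result

# Toy map with B = 1, N = 3 object points, K = 2 keypoints.
toy = [[
    [[0.1], [0.7]],  # object point 0: probabilities for keypoints 0 and 1
    [[0.8], [0.2]],  # object point 1
    [[0.3], [0.9]],  # object point 2
]]

picked = best_object_points(toy)  # [[1, 2]]
```

On the real tensor the same selection can be written as `contact_map.squeeze(-1).argmax(dim=1)`, which yields a `[B, 6]` tensor of object-point indices.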

GeoMatch++ (Deprecated — kept for reproducibility)

import torch, sys
sys.path.append(".")
import config
from models.geomatch_pp import GeoMatchPP

model = GeoMatchPP(config).cuda()
model.load_state_dict(torch.load("geomatch_pp_final.pth", map_location="cuda"))
model.eval()

with torch.no_grad():
    contact_map, keypoint_probs = model(
        obj_pc,               # [B, 2048, 3]
        robot_pc,             # [B, 6, 3]
        robot_key_point_idx,  # [B, 6]
        obj_adj,              # [B, 2048, 2048]
        robot_adj,            # [B, 6, 6]
        xyz_prev,             # [B, 6, 3]
        morph_features,       # [B, 32, 9]     morphology node features
        morph_adj,            # [B, 32, 32]    morphology adjacency
    )

Morphology graphs are pre-built per robot using preprocess_morphology.py → gnn_morphology_new.pt.


Repository Structure

models/
  geomatch.py      # GeoMatch model (shared by v1 and v2)
  geomatch_pp.py   # GeoMatch++ model (+ morphology encoder + DCP transformer)
  gnn.py           # Graph Convolutional Network
  mlp.py           # MLP building block
config.py          # Hyperparameters for all models
generate_keypoints_json.py  # Fixed keypoint generator (used for v2 training data)

Citation

@inproceedings{geomatch2024,
  title     = {Geometry Matching for Multi-Embodiment Grasping},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year      = {2024},
}

@article{geomatch_pp2024,
  title   = {GeoMatch++: Morphology-Aware Grasping via Correspondence Learning},
  journal = {arXiv preprint arXiv:2412.18998},
  year    = {2024},
}

License

Original GeoMatch code © 2023 DeepMind Technologies Limited, licensed under the Apache License 2.0.
GeoMatch++ extension, v2 training, and all checkpoints produced by Dimios45 as part of the Graspmax project.
