Instructions to use aarondevstack/DepthPro-1024x1024-coreml with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Depth Pro
How to use aarondevstack/DepthPro-1024x1024-coreml with Depth Pro:
# Download checkpoint pip install huggingface-hub huggingface-cli download --local-dir checkpoints aarondevstack/DepthPro-1024x1024-coreml
import depth_pro # Load model and preprocessing transform model, transform = depth_pro.create_model_and_transforms() model.eval() # Load and preprocess an image. image, _, f_px = depth_pro.load_rgb("example.png") image = transform(image) # Run inference. prediction = model.infer(image, f_px=f_px) # Results: 1. Depth in meters depth = prediction["depth"] # Results: 2. Focal length in pixels focallength_px = prediction["focallength_px"] - Notebooks
- Google Colab
- Kaggle
| license: apple-ascl | |
| library_name: coreml | |
| tags: | |
| - depth-estimation | |
| - visionos | |
| - apple-silicon | |
| - amlr | |
| - computer-vision | |
| - depth-pro | |
| - 1024x1024 | |
| extra_gated_heading: DepthPro CoreML (High-Resolution 1024px) | |
| extra_gated_button_content: Access Model | |
| # DepthPro CoreML (1024x1024 High-Resolution) | |
| This repository contains the **High-Resolution (1024x1024)** version of the DepthPro model, optimized for CoreML. | |
| DepthPro is a state-of-the-art monocular depth estimation model that provides sharp, metric-scale depth maps. This 1024px version is specifically designed for **High-Quality 3D Exports** where edge precision and fine detail preservation are critical. | |
| ## π Key Features | |
| - **High Fidelity**: Captures thin structures (threads, instruments, hair) with superior accuracy compared to the 512px version. | |
| - **Symmetric 3D Rendering Optimized**: Perfectly suited for symmetric shifting in VR/AR to minimize visual discomfort. | |
| - **VisionOS Ready**: Fully compatible with Apple Vision Pro (optimized for GPU/CPU). | |
| ## π Performance & Requirements | |
| | Metric | Specification | | |
| | :--- | :--- | | |
| | **Input Resolution** | 1024 x 1024 pixels | | |
| | **Compute Units** | GPU + CPU (Recommended for stability) | | |
| | **Average Latency** | ~7.5s per frame (on M2 Ultra/M3 Max) | | |
| | **Target Use Case** | Offline Video Conversion / High-Quality Spatial Video Export | | |
| > [!IMPORTANT] | |
| > To ensure inference stability at this resolution, this model is configured to use the **GPU/CPU path** rather than ANE to avoid memory limits. | |
| ## π¦ Repository Contents | |
| The repository contains the following core components: | |
| 1. `DepthPro_transform.mlpackage`: Image preprocessing. | |
| 2. `DepthPro_encoder.mlpackage`: Feature extraction (ViT-Large). | |
| 3. `DepthPro_decoder.mlpackage`: Multiresolution fusion. | |
| 4. `DepthPro_depth.mlpackage`: Final depth output and high-res feature generation. | |
| ## π Usage with Swift Transformers | |
| You can download and cache this model dynamically using `swift-transformers`: | |
| ```swift | |
| let hub = Hub() | |
| let modelDir = try await hub.snapshot(repoId: "aarondevstack/DepthPro-1024x1024-coreml") | |
| // Load models from the downloaded directory | |