---
language: en
license: gpl-3.0
library_name: transformers
tags:
- vision
- image-classification
- resnet
- pruning
- sparse
base_model: microsoft/resnet-50
pipeline_tag: image-classification
datasets:
- ILSVRC/imagenet-1k
metrics:
- accuracy
---
# ModHiFi Pruned ResNet-50 (Small)

## Model Description

This model is a **structurally pruned** version of the standard [ResNet-50](https://huggingface.co/microsoft/resnet-50) architecture. Developed by the **Machine Learning Lab at the Indian Institute of Science**, it removes **~32% of the parameters** while achieving slightly *higher accuracy* than the base model (76.70% vs. 76.13% Top-1 on ImageNet-1k).

Unlike unstructured pruning, which only zeros out individual weights, **structural pruning** physically removes entire channels and filters. The resulting model is natively **smaller and faster, with fewer FLOPs**, on standard hardware, without needing specialized sparse inference engines.
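
To make the distinction concrete, here is a minimal sketch of how removing whole filters shrinks weight tensors. The layer sizes are hypothetical and chosen only for illustration; they are not taken from this model, and the helper `conv_params` is not part of any library.

```python
def conv_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weight count of a bias-free 2D convolution with a square kernel."""
    return in_channels * out_channels * kernel_size * kernel_size


# Hypothetical pair of consecutive 3x3 convolutions (not taken from this model).
before = conv_params(64, 128, 3) + conv_params(128, 128, 3)

# Structural pruning: drop 32 of the first layer's 128 filters. The second
# layer then receives 96 input channels, so its weight tensor shrinks as well.
kept = 96
after = conv_params(64, kept, 3) + conv_params(kept, 128, 3)

# Unstructured pruning at a similar rate would zero individual weights but keep
# both tensors at their original dense shape, so `before` weights stay stored.
print(before, after, f"{1 - after / before:.1%}")
```

Because the tensors themselves get smaller, an ordinary dense convolution kernel does less work, which is why no sparse inference engine is required.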
- **Developed by:** Machine Learning Lab, Indian Institute of Science
- **Model type:** Convolutional Neural Network (Pruned ResNet)
- **License:** GNU General Public License v3.0
- **Base Model:** Microsoft ResNet-50

## Performance & Efficiency

| Model Variant | Sparsity | Top-1 Acc | Top-5 Acc | Params (M) | FLOPs (G) | Size (MB) |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| **Original ResNet-50** | 0% | 76.13% | 92.86% | 25.56 | 4.12 | ~98 |
| **ModHiFi-Small** | **~32%** | **76.70%** | **93.32%** | **17.4** | **1.9** | **~66** |

On the hardware we tested (detailed in our [paper](https://arxiv.org/abs/2511.19566)), we observe speedups of **1.69x on CPUs** and **1.70x on GPUs**.

> **Note:** "FLOPs" is the number of floating-point operations required for a single inference pass. Lower is better for latency and battery life.
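
As a quick sanity check on the table, the relative reductions can be computed directly from the listed figures. The sketch below also estimates file size under the assumption of 4 bytes per parameter (fp32) reported in MiB; that storage assumption is ours and is not stated in the table.

```python
# Figures copied from the table above.
base = {"params_m": 25.56, "flops_g": 4.12, "size_mb": 98}
small = {"params_m": 17.4, "flops_g": 1.9, "size_mb": 66}


def reduction(before: float, after: float) -> float:
    """Fraction removed when going from `before` to `after`."""
    return 1.0 - after / before


for key in base:
    print(f"{key}: {reduction(base[key], small[key]):.1%} smaller")


def fp32_size_mib(params_millions: float) -> float:
    """Assumed storage cost: 4 bytes per parameter, reported in MiB."""
    return params_millions * 1e6 * 4 / 2**20


print(f"{fp32_size_mib(25.56):.1f} MiB, {fp32_size_mib(17.4):.1f} MiB")
```

Under these assumptions the parameter count drops by roughly 32% while FLOPs drop by roughly 54%, and the fp32 estimate lands close to the listed sizes.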
## ⚠️ Critical Note on Preprocessing & Accuracy

**Please read before evaluating:** This model was trained and evaluated using standard PyTorch `torchvision.transforms`. The Hugging Face `pipeline` uses `PIL` (Pillow) for image resizing by default.

Due to subtle differences in interpolation (bilinear vs. bicubic) and anti-aliasing between PyTorch's C++ kernels and PIL, **you may observe a drop of roughly 0.5% to 1.0% in Top-1 accuracy** if you use the default `preprocessor_config.json`.

To reproduce the numbers listed in the table above, we recommend replacing the pipeline's default image processor with the exact PyTorch transforms used during training:
```python
import torch
from torchvision import transforms
from transformers import pipeline

# 1. Define the exact PyTorch transform
val_transform = transforms.Compose([
    transforms.Resize(256),          # Resize shortest edge to 256
    transforms.CenterCrop(224),      # Center crop to 224x224
    transforms.ToTensor(),           # Convert to tensor with values in [0, 1]
    transforms.Normalize(            # ImageNet normalization
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])


# 2. Define a wrapper that forces the pipeline to use the PyTorch transform
class PyTorchProcessor:
    def __init__(self, transform):
        self.transform = transform
        self.image_processor_type = "custom"

    def __call__(self, images, **kwargs):
        if not isinstance(images, list):
            images = [images]
        # Apply the transform to each image and stack into one batch tensor
        pixel_values = torch.stack([self.transform(img.convert("RGB")) for img in images])
        return {"pixel_values": pixel_values}


# 3. Initialize the pipeline with the custom processor
pipe = pipeline(
    "image-classification",
    model="MLLabIISc/ModHiFi-ResNet50-ImageNet-Small",
    image_processor=PyTorchProcessor(val_transform),  # <--- Fixes the accuracy gap
    trust_remote_code=True,
    device=0 if torch.cuda.is_available() else -1,  # Use GPU if available, else CPU
)
```
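
For readers who want to see what the resize, crop, and normalize steps do geometrically, the following dependency-free sketch reproduces the arithmetic on image dimensions only. The helper names are hypothetical, and the integer rounding used here is an assumption: torchvision's own rounding may differ by a pixel, so treat this as an illustration rather than a drop-in replacement for the transforms.

```python
def resize_shortest_edge(width: int, height: int, target: int = 256) -> tuple[int, int]:
    """Scale so the shorter side equals `target`, keeping the aspect ratio."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def center_crop_box(width: int, height: int, crop: int = 224) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered `crop` x `crop` window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


def normalize(value: float, mean: float, std: float) -> float:
    """Per-channel normalization applied after scaling pixels to [0, 1]."""
    return (value - mean) / std


w, h = resize_shortest_edge(640, 480)  # hypothetical 640x480 input
print((w, h), center_crop_box(w, h))
print(round(normalize(0.485, 0.485, 0.229), 3))
```

The pixel values inside the crop still depend on the interpolation filter used during resizing, which is where the accuracy gap described above comes from.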
## Quick Start

If you do not require bit-perfect reproduction of the reported accuracy and prefer simplicity, you can use the model directly with the standard Hugging Face pipeline.

### Install dependencies

```bash
pip install torch transformers pillow requests
```

### Inference example
```python
import requests
from PIL import Image
from transformers import pipeline

# Load model (ensure trust_remote_code=True for custom architecture)
pipe = pipeline(
    "image-classification",
    model="MLLabIISc/ModHiFi-ResNet50-ImageNet-Small",
    trust_remote_code=True,
)

# Load an image
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Run inference
results = pipe(image)
print(f"Predicted Class: {results[0]['label']}")
print(f"Confidence: {results[0]['score']:.4f}")
```
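
The example above reads only the first entry of `results`, which is a list of dictionaries with `label` and `score` keys. A small helper like the one below could print several predictions at once. The function `format_top_k` and the mock data are hypothetical and serve only to show the shape of the output; the labels and scores are made up and are not results from this model.

```python
def format_top_k(results: list[dict], k: int = 3) -> list[str]:
    """Format the k highest-scoring predictions as 'label: score' strings."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [f"{r['label']}: {r['score']:.4f}" for r in ranked[:k]]


# Hypothetical output with made-up labels and scores, for illustration only.
mock_results = [
    {"label": "tabby", "score": 0.61},
    {"label": "remote control", "score": 0.12},
    {"label": "tiger cat", "score": 0.20},
]
for line in format_top_k(mock_results, k=2):
    print(line)
```

With a real pipeline, the same helper would be called as `format_top_k(pipe(image))`.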
## Citation

If you use this model in your research, please cite the following paper:

```
@inproceedings{kashyap2026modhifi,
  title         = {ModHiFi: Identifying High Fidelity predictive components for Model Modification},
  author        = {Kashyap, Dhruva and Murti, Chaitanya and Nayak, Pranav and Narshana, Tanay and Bhattacharyya, Chiranjib},
  booktitle     = {Advances in Neural Information Processing Systems},
  year          = {2025},
  eprint        = {2511.19566},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  url           = {https://arxiv.org/abs/2511.19566},
}
```