Papers
arxiv:2601.08025

Where to Split? A Pareto-Front Analysis of DNN Partitioning for Edge Inference

Published on Jan 12
Authors:
,
,
,

Abstract

DNN partitioning is reformulated as a multi-objective optimization problem using Pareto front analysis to balance latency and throughput trade-offs in heterogeneous edge computing environments.

AI-generated summary

The deployment of deep neural networks (DNNs) on resource-constrained edge devices is frequently hindered by their significant computational and memory requirements. While partitioning and distributing a DNN across multiple devices is a well-established strategy to mitigate this challenge, prior research has largely focused on single-objective optimization, such as minimizing latency or maximizing throughput. This paper challenges that view by reframing DNN partitioning as a multi-objective optimization problem. We argue that in real-world scenarios, a complex trade-off between latency and throughput exists, which is further complicated by network variability. To address this, we introduce ParetoPipe, an open-source framework that leverages Pareto front analysis to systematically identify optimal partitioning strategies that balance these competing objectives. Our contributions are threefold: we benchmark pipeline partitioned inference on a heterogeneous testbed of Raspberry Pis and a GPU-equipped edge server; we identify Pareto-optimal points to analyze the latency-throughput trade-off under varying network conditions; and we release a flexible, open-source framework to facilitate distributed inference and benchmarking. This toolchain features dual communication backends, PyTorch RPC and a custom lightweight implementation, to minimize overhead and support broad experimentation.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2601.08025 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2601.08025 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2601.08025 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.