Publications

An Efficient Distributed Machine Learning Inference Framework with Byzantine Fault Detection

Abstract

The gap between the complexity of the most advanced machine learning (ML) models, e.g., large language models (LLMs) such as the 540-billion-parameter PaLM, and the hardware resources available at the edge is growing. Two approaches can mitigate this gap: leveraging cloud-based ML servers, which introduces widely studied security, privacy, and reliability risks, and distributed inference, in which several local edge-based devices share the computational burden. Motivated by the fact that the security of distributed inference has received far less attention, this paper proposes a low-cost and versatile scheme that adds redundancy to distributed inference to mitigate compromised devices whose faulty or malicious behavior is modeled as Byzantine faults. We mathematically derive the number of inferences required to detect an attack as a function of computation overhead and also develop …
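The abstract's core idea, detecting a compromised device by adding redundant replicated inferences, can be sketched in a toy form. This is not the paper's scheme; it is a minimal illustration assuming a simple setup in which each inference is replicated on a randomly chosen pair of devices and a disagreement between their outputs flags a Byzantine fault. The device functions and replication factor below are hypothetical.

```python
import random

def replicated_inference(devices, x, replication=2):
    """Run the inference for input x on `replication` randomly chosen
    devices and flag a Byzantine fault when their outputs disagree.
    `devices` is a list of callables; honest devices return the correct
    result, while a compromised one may return anything."""
    chosen = random.sample(range(len(devices)), replication)
    outputs = [devices[i](x) for i in chosen]
    fault_detected = len(set(outputs)) > 1
    return outputs[0], fault_detected, chosen

# Toy setup: honest devices square the input; one Byzantine device lies.
honest = lambda x: x * x
byzantine = lambda x: x * x + 1
devices = [honest, honest, byzantine, honest]

# Each replicated inference catches the attacker only when the faulty
# device happens to be sampled, so detection probability grows with the
# number of inferences -- the trade-off the paper quantifies against
# computation overhead.
detected = any(replicated_inference(devices, x)[1] for x in range(50))
```

With 2-of-4 sampling, each inference includes the compromised device with probability 1/2, so after 50 inferences the attack is detected with overwhelming probability; raising the replication factor raises the per-inference detection chance at the cost of more redundant computation.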

Date
June 30, 2025
Authors
Xuan Zhou, Utkarsh Mohan, Yao Liu, Peter Beerel
Book
Proceedings of the Great Lakes Symposium on VLSI 2025
Pages
56-63