4.6 Article

Real-time semantic segmentation on FPGAs for autonomous vehicles with hls4ml

Related references

Note: Only a subset of the related references is listed.
Article Physics, Particles & Fields

Model compression and simplification pipelines for fast deep neural network inference in FPGAs in HEP

Simone Francescato et al.

Summary: The article introduces a multi-stage compression approach for fast, real-time inference of deep neural networks on the latest generation of hardware accelerators, with applications to high-energy physics use cases, and evaluates the method's effectiveness.

EUROPEAN PHYSICAL JOURNAL C (2021)

Correction Physics, Particles & Fields

Model compression and simplification pipelines for fast deep neural network inference in FPGAs in HEP (vol 81, 969, 2021)

Simone Francescato et al.

EUROPEAN PHYSICAL JOURNAL C (2021)

Article Computer Science, Artificial Intelligence

Fast convolutional neural networks on FPGAs with hls4ml

Thea Aarrestad et al.

Summary: An automated tool is introduced for deploying ultra-low-latency, low-power deep neural networks with convolutional layers on FPGAs. Through model compression techniques, a significant reduction in critical FPGA resource consumption can be achieved with little to no loss in model accuracy.

MACHINE LEARNING-SCIENCE AND TECHNOLOGY (2021)
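The compression referred to above typically includes magnitude-based weight pruning before FPGA deployment. A minimal sketch of that idea, assuming a simple flat weight list (the `prune` helper and the sparsity target are illustrative, not the hls4ml API):

```python
# Magnitude-based pruning sketch: zero out the smallest-magnitude weights
# so the FPGA implementation can skip the corresponding multiplications.

def prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune entries with smallest |w|.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

weights = [0.9, -0.05, 0.4, -0.8, 0.02, 0.3]
pruned = prune(weights, 0.5)  # half of the weights become exact zeros
```

In a real flow, pruning is applied iteratively during training so the network can recover accuracy; the zeroed weights then cost no DSPs or logic in the generated firmware.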

Article Computer Science, Information Systems

Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics

Yutaro Iiyama et al.

Summary: This paper presents distance-weighted graph networks that can be executed with a latency of less than one microsecond on an FPGA for crucial tasks in particle physics. A graph network architecture developed for the specific tasks, combined with additional simplifications such as weight quantization, allows the design to meet the computing constraints of Level-1 trigger systems.

FRONTIERS IN BIG DATA (2021)

Article Computer Science, Artificial Intelligence

Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors

Claudionor N. Coelho et al.

Summary: The paper discusses a quantization method for deep learning models that can reduce energy consumption and model size while maintaining high accuracy, suitable for efficient inference on edge devices.

NATURE MACHINE INTELLIGENCE (2021)
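The heterogeneous quantization described above assigns different fixed-point precisions to different layers. A minimal sketch of the underlying idea, with an illustrative per-layer bit allocation (the `quantize` helper and the bit-width choices are assumptions for illustration, not the paper's API):

```python
# Per-layer fixed-point quantization sketch: each layer gets its own
# (total_bits, int_bits) budget, in the spirit of hls4ml's ap_fixed types.

def quantize(weights, total_bits, int_bits):
    """Round each weight to a signed fixed-point grid with `total_bits`
    bits, `int_bits` of them for the integer part."""
    frac_bits = total_bits - int_bits
    step = 2.0 ** -frac_bits
    lo = -(2.0 ** (int_bits - 1))
    hi = 2.0 ** (int_bits - 1) - step
    return [min(max(round(w / step) * step, lo), hi) for w in weights]

# Hypothetical allocation: an early layer keeps more precision, a later
# layer is squeezed harder to save FPGA resources.
layer_weights = {
    "dense_1": [0.731, -0.205, 0.042],
    "dense_2": [0.510, -0.499, 0.013],
}
bit_config = {"dense_1": (8, 2), "dense_2": (4, 1)}

quantized = {
    name: quantize(w, *bit_config[name]) for name, w in layer_weights.items()
}
```

Automated tools search over such per-layer bit assignments to minimize resources while keeping accuracy within a tolerance.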

Article Computer Science, Artificial Intelligence

Compressing deep neural networks on FPGAs to binary and ternary precision with hls4ml

Jennifer Ngadiuba et al.

Summary: The research presents the implementation of binary and ternary neural networks in the hls4ml library, which automatically converts deep neural network models into FPGA firmware. By reducing the numerical precision of network parameters, the binary and ternary implementations achieve performance similar to higher-precision implementations while using drastically fewer FPGA resources. The study discusses the trade-offs among model accuracy, resource consumption, and latency.

MACHINE LEARNING-SCIENCE AND TECHNOLOGY (2021)
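Ternary precision, as discussed above, collapses every weight to one of three values so that multiplications reduce to sign flips and skips. A minimal sketch of one common ternarization rule (the threshold heuristic, a fraction of the mean absolute weight, is an illustrative assumption):

```python
# Ternary weight quantization sketch: map each weight to -1, 0, or +1.
# Weights whose magnitude falls below a threshold become exact zeros.

def ternarize(weights, threshold_scale=0.7):
    """Map weights to {-1, 0, +1}; small-magnitude weights become 0."""
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    t = threshold_scale * mean_abs  # illustrative threshold heuristic
    return [0 if abs(w) <= t else (1 if w > 0 else -1) for w in weights]

weights = [0.9, -0.05, 0.4, -0.8, 0.02]
ternarized = ternarize(weights)  # -> [1, 0, 1, -1, 0]
```

With weights restricted to {-1, 0, +1}, the FPGA implementation replaces multipliers with additions, subtractions, and omissions, which is the source of the resource savings the paper reports.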

Article Instruments & Instrumentation

Fast inference of deep neural networks in FPGAs for particle physics

J. Duarte et al.

JOURNAL OF INSTRUMENTATION (2018)