Published in: 2022 IEEE Hot Chips 34 Symposium (HCS)

Abstract: DFX: a low-latency multi-FPGA appliance for accelerating transformer-based text generation. DFX is a multi-FPGA appliance that accelerates transformer-based text generation. It adopts model parallelism to process large-scale language models efficiently. The Xilinx Alveo U280 data center accelerator card provides high performance at low cost. FPGA-to-FPGA communication is enabled by QSFP cables at 100 Gb/s.

Authors: Seongmin Hong; Seungjae Moon; Junsoo Kim; Sungjae Lee; Minsub Kim; Dongsoo Lee