DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
Published in: 2022 IEEE Hot Chips 34 Symposium (HCS)
Abstract: DFX is...
