Published in: 2023 IEEE Hot Chips 35 Symposium (HCS)
Authors: Seungjae Moon; Junsoo Kim; Jung-Hoon Kim; Junseo Cha; Gyubin Choi; Seongmin Hong
Introduction
- The fundamental goal of AI is to create human-like intelligence. Generative AI has enabled AI to do what we thought was innate only to humans: show creativity.
- Transformer-based large language models (LLMs) with multi-billion parameters, such as OpenAI GPT and Meta LLaMA, can create original text and visual content.
- Efficient model inference requires latency-oriented, scalable hardware for small-batch, memory-intensive workloads to meet the needs of different users.
- The Latency Processing Unit is the world's first hardware accelerator dedicated to the end-to-end inference of LLMs.