community.arm.com, Aug. 29, 2019 –
Neural Networks are a key component of Machine Learning (ML) applications. Project Trillium, Arm's heterogeneous ML platform, provides a range of technologies in this field, including instructions that accelerate such applications running on CPUs based on the Arm®v8-A architecture.
The next revision of the Armv8-A architecture will introduce Neon and SVE vector instructions designed to accelerate certain computations using the BFloat16 (BF16) floating-point number format. BF16 has recently emerged as a format tailored specifically to high-performance processing of Neural Networks (NNs). BF16 is a truncated form of the IEEE 754 [ieee754-2008] single-precision representation (IEEE-FP32): it keeps the sign bit and the 8 exponent bits but retains only 7 fraction bits instead of 23.
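Because BF16 shares the FP32 sign and exponent fields, converting between the two formats amounts to dropping or zero-filling the low 16 bits of the FP32 encoding. The sketch below illustrates this relationship in Python; the function names are illustrative, and the conversion shown uses simple truncation (round-toward-zero) rather than any particular hardware rounding mode.

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Truncate an IEEE-754 single-precision value to a 16-bit BF16 pattern.

    BF16 keeps the FP32 sign bit and 8 exponent bits but only the top 7
    of the 23 fraction bits, so one simple conversion is to drop the low
    16 bits of the FP32 encoding (illustrative; real hardware may round).
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bf16_bits_to_fp32(b: int) -> float:
    """Widen a BF16 bit pattern back to FP32 by zero-filling the low 16 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x

# 1.0 is exactly representable: FP32 0x3F800000 truncates to BF16 0x3F80.
# A value like 3.14159 survives the round trip only approximately,
# since 7 fraction bits give roughly 2-3 decimal digits of precision.
```

Note that BF16 trades precision for range: it covers the same exponent range as FP32, which is why NN workloads, which tolerate low precision but need wide dynamic range, favour it.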
Vendors across several major CPU and GPU architectures, as well as designers of Neural Network accelerators (NPUs), have announced an intention to support BF16.