Source: Flex Logix – Silicon Valley, CA, Mar. 28, 2019
nnMAX™ Inference Acceleration Architecture
Today at the Autonomous Vehicle Hardware Summit, Cheng Wang presented a detailed update on nnMAX for inference acceleration. nnMAX can achieve throughput exceeding that of a data center inference card at a fraction of the cost and power.
nnMAX packs 1024 inference-optimized MACs into 4 mm² of TSMC 16FFC, grouped in 16 clusters of 64. The MACs process INT8x8 and INT16x8 at the full 1.067 GHz clock, and INT16x16 and BFloat16x16 at half rate. Numeric precisions can be mixed layer by layer to maximize throughput.
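As a back-of-envelope check, peak MAC throughput follows directly from the figures quoted above. The Python sketch below is illustrative only, not vendor-published performance data: it assumes the usual convention of 2 operations per MAC (multiply plus accumulate) and applies the half-rate factor to the 16x16 and BFloat16 modes as stated.

```python
# Illustrative peak-throughput arithmetic from the stated nnMAX figures.
# Assumptions (not from the release): 2 ops per MAC, ideal utilization.

MACS = 1024          # inference-optimized MACs (16 clusters x 64)
CLOCK_GHZ = 1.067    # full-rate clock

def peak_tops(macs: int, clock_ghz: float, rate: float = 1.0) -> float:
    """Peak tera-ops/sec: MACs x clock (GHz) x 2 ops/MAC x rate factor."""
    return macs * clock_ghz * 2.0 * rate / 1000.0

for mode, rate in [("INT8x8", 1.0), ("INT16x8", 1.0),
                   ("INT16x16", 0.5), ("BFloat16x16", 0.5)]:
    print(f"{mode:<12} {peak_tops(MACS, CLOCK_GHZ, rate):5.2f} TOPS peak")
```

Running this gives roughly 2.19 TOPS peak for the full-rate INT8x8/INT16x8 modes and half that for the half-rate modes, which is why mixing precisions layer by layer matters for overall throughput.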
Cheng's slides and a four-page summary are available at www.flex-logix.com.
InferX™ Inference Co-Processor
Come to the Linley Spring Processor Conference on April 10th in Santa Clara, where Cheng will present details of the InferX X1 chip, based on nnMAX inference acceleration, and the nnMAX Compiler, which programs it from TensorFlow Lite and ONNX models.
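Since the release names TensorFlow Lite and ONNX as the compiler's input formats, a developer's starting point is simply a standard model artifact in one of those formats. The sketch below uses only the public TensorFlow API to produce a TFLite flatbuffer; the toy model and file name are placeholders, and the nnMAX-specific compile step is not shown because its interface is not described in this release.

```python
# Minimal sketch: produce a standard TensorFlow Lite model, the kind of
# artifact the nnMAX Compiler is said to accept. The network here is a
# trivial stand-in; any trained Keras model converts the same way.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# Convert the Keras model to a TFLite flatbuffer.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# "model.tflite" is a placeholder path; this file would then be handed
# to the vendor toolchain.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```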