DNN Processor Benchmark for Inference at the Edge: Release of the Second-Round Evaluation Results
2019.6.28, Nanjing
AIIA DNN benchmark v0.5b evaluation results
Content
I. About AIIA DNN Benchmark
II. Introduction of Version 0.5
III. Metrics and scenarios
IV. Acknowledgement
V. v0.5b Results (second round of v0.5)
VI. Interpretation
I AIIA DNN benchmark
About us: We provide a selection reference for application companies and third-party evaluation results for chip companies.
Aims: As chips develop, technical competition based on clear metrics helps companies progress quickly. The AIIA DNN benchmark aims to reflect objectively the current state of AI accelerator capabilities, and all metrics are designed to provide objective dimensions for comparison.
Evaluation method: step by step, with version iterations that are continuously enriched and refined; covers training and inference, terminal (edge) and cloud.
I AIIA DNN benchmark: Work already done
Two sets of evaluation specifications have been drawn up, and two rounds of edge-side evaluation have been completed.
Timeline dates shown on the slide: 2018.10, 2018.12, 2019.3, 2019.4, 2019.5.27, 2019.6.28, …
Milestones:
Release of the edge/inference evaluation method v0.5
Start of the AIIA DNN benchmark v0.5 edge/inference first evaluation
Release of the AIIA DNN benchmark v0.5 edge/inference first evaluation result
Start of the AIIA DNN benchmark v0.5 edge/inference second evaluation
Release of the AIIA DNN benchmark v0.5 edge/inference second evaluation result
Release of the cloud/inference evaluation method v0.5
II Version 0.5: Evaluation methods of DNN processor benchmark for inference at the edge
AIIA DNN benchmark v0.5 tools: the v0.5 tools support Android and Linux.
[Figure: screenshot of the benchmark tool; visible labels: Device, model, start]
Two categories of key evaluation metrics
Six typical application scenarios: Classification, Object recognition, Super-Resolution, Semantic segmentation, Face recognition, Face detection
Eighteen network models
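III Metrics and scenarios
Compared with the first round, v0.5 adds two application scenarios and nine network models.
State: a = model evaluated in the first round; b = model added in the second round. A blank State cell means the state is not legible in the source.
No | Application scenario | Test data | Network | Metrics | Framework | Source | State
1 | Classification | ImageNet | MobileNet_v1 | fps, top1, top5 | TensorFlow | Qualcomm | b
1 | Classification | ImageNet | MobileNet_v2 | fps, top1, top5 | caffe | AIIA | a
1 | Classification | ImageNet | MobileNet_v2 | fps, top1, top5 | TensorFlow | AIIA | b
1 | Classification | ImageNet | MobileNet_v2 | fps, top1, top5 | tflite | Imagination |
1 | Classification | ImageNet | MobileNet_v2 | fps, top1, top5 | caffe | Xilinx |
1 | Classification | ImageNet | Resnet101 | fps, top1, top5 | caffe | AIIA | a
1 | Classification | ImageNet | Resnet101 | fps, top1, top5 | TensorFlow | AIIA | b
1 | Classification | ImageNet | VGG16 | fps, top1, top5 | caffe | AIIA | a
1 | Classification | ImageNet | VGG16 | fps, top1, top5 | TensorFlow | AIIA | b
1 | Classification | ImageNet | VGG16 | fps, top1, top5 | TensorFlow | AIIA | a
1 | Classification | ImageNet | Inception_v3 | fps, top1, top5 | TensorFlow | Qualcomm | b
1 | Classification | ImageNet | Inception_v3 | fps, top1, top5 | caffe | Xilinx | b
2 | Object recognition | VOC2012 | SSD_VGG16 | fps, mAP | caffe | AIIA | a
2 | Object recognition | VOC2012 | SSD_VGG | fps, mAP | caffe | ARM | b
2 | Object recognition | VOC2012 | ssd_mobilenet_v1 | fps, mAP | caffe | AIIA | a
2 | Object recognition | VOC2012 | ssd_mobilenet_v1 | fps, mAP | TensorFlow | Qualcomm | b
2 | Object recognition | VOC2012 | ssd_mobilenet_v2 | fps, mAP | caffe | AIIA | a
2 | Object recognition | VOC2012 | SSD | fps, mAP | TensorFlow | Xilinx | b
3 | Super-Resolution | 2017CVPR | VDSR | fps, PSNR | caffe | AIIA | a
3 | Super-Resolution | 2017CVPR | VDSR | fps, PSNR | TensorFlow | Qualcomm | b
3 | Super-Resolution | 2017CVPR | VGG19 | fps, PSNR | TFlite | Imagination |
4 | Semantic segmentation | VOC2012 | DeepLabv3+ | fps, mIoU | TensorFlow | AIIA | a
4 | Semantic segmentation | VOC2012 | DeepLabv3+ | fps, mIoU | TensorFlow | Qualcomm | b
4 | Semantic segmentation | VOC2012 | FCN | fps, mIoU | caffe | AIIA | a
4 | Semantic segmentation | VOC2012 | FPN | fps, mIoU | caffe | Xilinx | b
5 | Face recognition | LFW | Light CNN | fps, accuracy | caffe | ARM | b
5 | Face recognition | LFW | Inception-ResNet-v1 | fps, accuracy | TensorFlow | AIIA | b
6 | Face detection | Xilinx Data | DenseBox | fps | caffe | Xilinx | b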
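The classification rows above report top1 and top5 accuracy. As a rough illustration of how such a figure is usually computed, the sketch below counts a sample as correct when its true label is among the k highest-scoring classes. The function name `topk_accuracy` and the toy scores are assumptions made for this example; the slides do not show the benchmark's own scoring code.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]],
                  labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k top-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in ranked
    return hits / len(labels)


# Toy data: 3 samples, 4 classes (made-up scores, not benchmark results).
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],  # best class 1
    [0.5, 0.1, 0.3, 0.1],  # best class 0, runner-up class 2
    [0.3, 0.1, 0.1, 0.5],  # best class 3, runner-up class 0
]
toy_labels = [1, 2, 1]

print("top1:", topk_accuracy(toy_scores, toy_labels, k=1))
print("top2:", topk_accuracy(toy_scores, toy_labels, k=2))
```

With k=5 and one score per ImageNet class, the same function would give a top5 figure.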
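IV Acknowledgement: The AIIA DNN Benchmark tools were mainly supported by more than 20 companies and institutions, whom we thank for their strong support during the evaluation.
[Supporter logos; legible text: HK 南京华科广发]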
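V. Version 0.5b Results
Key words: inference tasks; at the edge; integer and floating-point results reported separately (int8, fp16, fp32)
[Figure: industry applications placed along a log-scale power axis, Log(Power) (W), with ticks 0.01, 0.1, 1, 10, 100; labeled applications: security cameras, robots, IoT, mobile phones, autonomous driving]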
DUT1: Mobile phone SoC (UniSoc T710), engineering sample device
Basic chip information
processor UniSoc T710
description mobile phone SOC
process TSMC 12FFC
CPU 4 (Cortex-A75) + 4 (Cortex-A55)
NNA(NPU) Imagination PowerVR AX2185
GPU Imagination PowerVR GM9446
interface PCIE3.0, USB3.0, UFS2.1
system Android Ubuntu
supported mobile framework TensorFlow, Caffe, ONNX
year 2018
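UniSoc T710 tested scenarios (a subset of the Section III scenario table; a = first-round model, b = model added in the second round)
Five models from two scenario categories, each run with two acceleration paths (PowerVR NN / Android NN)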
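Completely standalone hardware accelerator: all key layers, covering today's mainstream layer types, are fully accelerated.
Industry-leading performance: excellent INT8/INT16 edge inference performance.
Top1/Top5 accuracy is well preserved from INT16 to INT8: the vendor's own toolchain delivers high performance while keeping Top1/Top5 accuracy stable.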
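The slides do not describe how the vendor toolchain quantizes a network, so the following is only a generic sketch of symmetric per-tensor INT8 quantization, included to show why accuracy can survive the move to 8 bits: with round-to-nearest, each value is off by at most half a quantization step. The names `quantize_int8` and `dequantize` and the sample weights are assumptions for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: real value ~= scale * int8 code."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to code 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate real values."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.003, 1.0]  # made-up FP32 values, not from the benchmark
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("codes:", codes)
print("scale:", scale)
print("max abs error:", max_error, "(bound: scale / 2 =", scale / 2, ")")
```

Real toolchains typically add calibration data, per-channel scales, or zero points; none of that is claimed here for the device under test.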
[Figure: UniSoc T710, MobileNet v2, IMG NNA Tools vs. Android NN API; bars for Android NN API INT8 and IMG NNA Tools INT8; axis from 0 to 200]
Test case | Network | FPS | Accuracy/PSNR | Input | API
Face recognition | Inception_resnet_v1_quant8 | 46.7 | 88.8% | 160x160 | Android NN
Super_Resolution | VGG19-Quant8 | 10.24 | 58.25 (PSNR) | 192x192 | Android NN
Object_Classification | MobileNetV2_quant8 | 108.28 | 85.50% (Top5) | 224x224 | Android NN
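To put the FPS column in more familiar terms, the sketch below converts each reported frame rate into an average time per frame. It assumes frames are processed one at a time with no batching or pipelining, which the slides do not confirm; under that assumption the per-frame time is simply 1000 / FPS milliseconds. The helper name `latency_ms` is introduced here for illustration.

```python
# FPS values copied from the UniSoc T710 Android NN results table above.
reported_fps = {
    "Inception_resnet_v1_quant8": 46.7,
    "VGG19-Quant8": 10.24,
    "MobileNetV2_quant8": 108.28,
}


def latency_ms(fps: float) -> float:
    """Average milliseconds per frame, assuming one frame is processed at a time."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


for name, fps in reported_fps.items():
    print(f"{name}: {latency_ms(fps):.2f} ms per frame")
```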
Wide network-layer support and good Android NN API support.
Offline tools support network productisation, including conversion from popular frameworks. Using the offline tools can substantially improve performance and speed up productisation.
Artificial Intelligence Industry Alliance (AIIA, 中国人工智能产业发展联盟)
DUT2: Intelligent speech recognition module CI1006A1CSD02
Basic chip information
processor CI1006A1CSD02
description ASIC-based DNN speech recognition chip
process -
CPU ARM M4
NPU BNPU
memory 16M
interface UART, I2C, SPI, PWM, infrared and other peripheral control interfaces
system RTOS
supported mobile framework -
year 2017
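Test conditions for speech chip modules (environment and metrics)
Test set basis: Chengdu Chipintelli Technology Co., Ltd. (成都启英泰伦科技有限公司) standard "Local Voice Module Speech Recognition and Performance Test Standard" (《本地语音模块语音识别及性能测试标准》)
Detailed results for the four categories of metrics (the sound source is placed at multiple angles at every distance):
No | Test item | Reverberation | Distance | Quiet | Stationary noise | Non-stationary noise | Self-noise
1 | False wake-up rate | Normal | 1 m | 0% | 0% | 10 times/50 hours | 0%
1 | False wake-up rate | Normal | 3 m | 0% | 0% | 10 times/50 hours | 0%
1 | False wake-up rate | Normal | 5 m | 0% | 0% | 10 times/50 hours | 0%
1 | False wake-up rate | Large | 1 m | 0% | 0% | 10 times/50 hours | 0%
1 | False wake-up rate | Large | 3 m | 0% | 0% | 10 times/50 hours | 0%
1 | False wake-up rate | Large | 5 m | 0% | 0% | 10 times/50 hours | 0%
2 | Wake-up rate | Normal | 1 m | 99.9% | 99% | 93% | 99.9%
2 | Wake-up rate | Normal | 3 m | 99.9% | 98% | 92% | 99.9%
2 | Wake-up rate | Normal | 5 m | 99% | 97% | 90% | 99%
2 | Wake-up rate | Large | 1 m | 99% | 98% | 92% | 99%
2 | Wake-up rate | Large | 3 m | 97% | 96% | 92% | 97%
2 | Wake-up rate | Large | 5 m | 95% | 94% | 90% | 95%
3 | Recognition accuracy | Normal | 1 m | 99.9% | 98% | 93% | 99.9%
3 | Recognition accuracy | Normal | 3 m | 99.9% | 97% | 92% | 99.9%
3 | Recognition accuracy | Normal | 5 m | 99% | 96% | 90% | 99%
3 | Recognition accuracy | Large | 1 m | 99% | 97% | 92% | 99%
3 | Recognition accuracy | Large | 3 m | 97% | 96% | 90% | 97%
3 | Recognition accuracy | Large | 5 m | 96% | 92% | 88% | 96%
4 | False recognitions | Normal | 1 m | 0% | 2% | 6% | 0%
4 | False recognitions | Normal | 3 m | 0% | 3% | 7% | 0%
4 | False recognitions | Normal | 5 m | 1% | 4% | 8% | 1%
4 | False recognitions | Large | 1 m | 1% | 2% | 6% | 1%
4 | False recognitions | Large | 3 m | 2% | 3% | 8% | 2%
4 | False recognitions | Large | 5 m | 3% | 7% | 10% | 3%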
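The false wake-up figure under non-stationary noise is given as a count over a time window (10 times per 50 hours, 10次/50小时) rather than as a percentage. A small sketch, using a hypothetical helper `false_wakeups_per_day`, shows how that count can be normalized to a per-day rate for comparison across test runs of different lengths. It assumes false wake-ups are spread evenly over the test window, which the slides do not state.

```python
def false_wakeups_per_day(count: int, hours: float) -> float:
    """Normalize a false wake-up count over a test window to events per 24 hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return count / hours * 24.0


# Figure from the table: 10 false wake-ups in 50 hours of non-stationary noise.
per_day = false_wakeups_per_day(10, 50)
print(f"{per_day:.1f} false wake-ups per 24 hours")
```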
DUT3: ZCU104 board
Basic chip information
processor ZU7EV
description Evaluation kit for embedded vision applications
process 16nm
CPU quad-core ARM® Cortex™-A53 applications processor, dual-core Cortex-R5 real-time processor
GPU Mali™-400 MP2
interface USB3, DP, SATA, LPC FMC
system Linux
supported mobile framework N
year 2017
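ZCU104 tested scenarios (a subset of the Section III scenario table; a = first-round model, b = model added in the second round)
Four scenario categories, seven models
INT8 (ZU7EV): performance and accuracy results for the seven models in four scenario categories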
DUT4: Qualcomm QRD855 reference test device
Basic chip information
processor Snapdragon 855 Mobile Platform
description First mobile platform to collectively commercialize 5G, AI, XR
process 7nm
CPU Qualcomm® Kryo™ 485 CPU (Octa-core)
GPU Qualcomm® Adreno™ 640 GPU
interface USB Version 3.1; USB Type-C Support
system Android
supported mobile framework SNPE
year 2018
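QRD855 tested scenarios (a subset of the Section III scenario table; a = first-round model, b = model added in the second round)
Four scenario categories, five models (SNPE: v1.27.0)
INT8: performance and accuracy results for the five models in four scenario categories
Precision | ssd_mobilenetv1 (300x300) | DeepLabv3+ (513x513) | VDSR (256x256)
INT8 | 0.385 | 0.6993 | 25.5544
Original accuracy | n/a | n/a | n/a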
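Section III lists PSNR as the accuracy metric for super-resolution and mIoU for semantic segmentation. The sketch below shows the textbook definitions of both on flat lists of pixel values or class labels. The function names, the 8-bit peak value of 255, and the toy inputs are assumptions for illustration; the slides do not specify details such as ignored labels or per-image versus dataset-level averaging.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_iou(truth, pred, num_classes):
    """Mean intersection-over-union across classes present in truth or pred."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for t, p in zip(truth, pred) if t == c and p == c)
        union = sum(1 for t, p in zip(truth, pred) if t == c or p == c)
        if union:  # skip classes that appear in neither map
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in either label map")
    return sum(ious) / len(ious)


# Toy inputs (made up, not benchmark data).
print("PSNR:", psnr([0, 255, 128, 64], [0, 250, 130, 64]))
print("mIoU:", mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```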
AIIA DNN benchmark v0.5 Top1 leaderboards
Each leaderboard reports performance and accuracy. Updated data are published periodically, and companies are welcome to submit results.
Phone class, INT8: Top1 leaderboard for 12 models in five scenario categories (ImageNet Validation, 1000 images)
Phone class, FP16: Top1 leaderboard for 10 models in five scenario categories (ImageNet Validation, 1000 images)
Board class, INT8: Top1 leaderboard for 10 models in five scenario categories
How to compete on a leaderboard (fixed by the benchmark): model; test dataset; preprocessing method; single-threaded inference task
How to add a test scenario (to be submitted): original FP32 model file; preprocessing; accuracy; dataset; post-processing script
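The leaderboard rules fix the model, dataset, preprocessing, and a single-threaded inference task. As a rough illustration of what a single-threaded throughput measurement looks like, the sketch below times a sequential loop and reports frames per second. `measure_fps` and `dummy_infer` are hypothetical names, the warm-up count is an arbitrary choice, and the dummy function stands in for a real model call; this is not the AIIA tool's implementation.

```python
import time


def measure_fps(infer, inputs, warmup=2):
    """Run infer() on each input sequentially in the calling thread; return frames/s."""
    if not inputs:
        raise ValueError("need at least one input")
    # Warm-up runs are excluded from timing (caches, lazy initialization).
    for sample in inputs[:warmup]:
        infer(sample)
    start = time.perf_counter()
    for sample in inputs:
        infer(sample)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise RuntimeError("elapsed time too small to measure")
    return len(inputs) / elapsed


def dummy_infer(sample):
    """Stand-in for a real model invocation (assumption for this sketch)."""
    return sum(x * x for x in sample)


frames = [[float(i)] * 1000 for i in range(50)]
print(f"{measure_fps(dummy_infer, frames):.1f} frames per second")
```

Timing the whole loop rather than each call keeps timer overhead out of the result, at the cost of hiding per-frame variation.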
AIIA DNN benchmark model details
Work Plan
Application scenarios iteration
Richer evaluation objects: voice interaction, ADAS (autonomous driving), smart (security) cameras
Metrics expansion: power
Benchmark demo update
Release of Version 1.0 guidelines: Guidelines of artificial intelligence chip benchmark, Part 1: Metrics and evaluation methods for terminal-based deep neural network processor benchmark
Iteration of benchmark results: release at the AIIA 2019 Artificial Intelligence Developer Conference…
Release of Benchmark v0.5 evaluation method: DNN processor benchmark for inference at the cloud; start of the first round of cloud inference v0.5 testing
Thanks
Contact:[email protected]