Branch Convolution Quantization for Object Detection
Graphical Abstract
Abstract
Quantization is a key research topic for lightweight convolutional neural networks (CNNs) deployed on edge devices. Typically, the activation and weight bit-widths differ across layers to preserve accuracy, which means dedicated hardware must be designed for specific layers. In this work, we explore a unified quantization method with extremely low-bit quantized weights for all layers. We use thermometer coding to convert the 8-bit RGB input images to the same bit-width as the activations of the middle layers. For quantizing the outputs of the last layer, we propose a branch convolution quantization (BCQ) method. Combined with the extremely low-bit quantization of the weights, this makes hardware deployment simpler than in other works and consistent across all layers, including the first and the last. Taking tiny_yolo_v3 and yolo_v3 on the VOC and COCO datasets as examples, we verify the feasibility of thermometer coding on input images and of branch convolution quantization on output results. Finally, tiny_yolo_v3 is deployed on an FPGA, further demonstrating the high performance of the proposed algorithm in hardware.
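To illustrate the input-encoding idea, the following is a minimal sketch of thermometer coding for a single 8-bit pixel value. The threshold placement (equal-width bins over [0, 256)) is an assumption for illustration; the paper's exact binning scheme may differ.

```python
def thermometer_encode(pixel, n_bits):
    """Encode an 8-bit pixel value (0-255) as an n_bits thermometer code.

    Thresholds split [0, 256) into n_bits + 1 equal bins (an assumed
    placement; the paper does not specify it here); bit i is set when
    the pixel value reaches the i-th threshold, so set bits form a
    contiguous "mercury column" from the left.
    """
    thresholds = [(i * 256) // (n_bits + 1) for i in range(1, n_bits + 1)]
    return [1 if pixel >= t else 0 for t in thresholds]
```

For example, with `n_bits = 3` the thresholds are 64, 128, and 192, so a pixel value of 100 encodes to `[1, 0, 0]`. Applying this to each of the three RGB channels expands the input into `3 * n_bits` binary planes, matching the low-bit activations of the middle layers.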