In Figure 3, we replicate the results of Figure 2 with an entire ResNet-50, rather than individual operations. As shown in the left subplot, factorizing all of the convolutions and linear layers decreases the FLOP count significantly, but increases the runtime due to increased memory bandwidth usage and kernel launch overhead. However, removing the batch normalization ops (as is commonly done during inference) reduces time significantly, despite having almost no impact on FLOP count. A similar pattern holds for parameter count in the middle plot.
However, in Figure 3 (right), we see that measuring the size of input and output operands correctly orders the different ResNet-50 variants, though it is still not a reliable predictor of runtime.
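A back-of-the-envelope calculation illustrates why batch normalization can dominate runtime while barely registering in FLOP counts. The sketch below (with assumed, illustrative tensor shapes, not figures taken from our measurements) compares the FLOPs of a 3x3 convolution against those of a batch-norm op over the same activation tensor, and estimates the memory traffic the batch norm incurs:

```python
# Illustrative sketch (assumed shapes, not measured values): why a batch-norm
# op contributes almost no FLOPs relative to a conv, yet still moves a
# substantial number of bytes through memory.

def conv2d_flops(h, w, c_in, c_out, k):
    # 2 FLOPs (one multiply, one add) per multiply-accumulate
    return 2 * h * w * c_in * c_out * k * k

def batchnorm_flops(h, w, c):
    # at inference, roughly one multiply and one add per element
    return 2 * h * w * c

def elementwise_traffic_bytes(h, w, c, dtype_bytes=4):
    # an elementwise op reads and writes the full h*w*c tensor once each
    return 2 * h * w * c * dtype_bytes

# A hypothetical mid-network stage: 14x14 activations, 256 channels, 3x3 conv
h = w = 14
c = 256
conv = conv2d_flops(h, w, c, c, 3)
bn = batchnorm_flops(h, w, c)
traffic = elementwise_traffic_bytes(h, w, c)

print(f"conv FLOPs:        {conv:,}")
print(f"batch-norm FLOPs:  {bn:,} ({bn / conv:.4%} of the conv)")
print(f"batch-norm memory traffic: {traffic:,} bytes")
```

Under these assumed shapes the batch norm is well below 0.1% of the conv's FLOPs, yet it must stream the entire activation tensor through memory, which is why its removal shows up in wall-clock time but not in FLOP counts, and why operand sizes are the better (if still imperfect) ordering signal.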
