
determines the amount of memory consumed by the model and optimizer state. However, like FLOPs, it does not account for most factors affecting cost and runtime. Further, different models with similar parameter counts may need different amounts of work to achieve a certain performance level; e.g., increasing the input's resolution or sequence length increases the compute requirements but does not change the number of parameters.
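To make this last point concrete, the sketch below (a hypothetical PyTorch toy model, not one from the paper) counts parameters and roughly tallies convolutional multiply-accumulates at two input resolutions: the parameter count is identical, while the compute grows with resolution.

```python
import torch
import torch.nn as nn

# A small, hypothetical CNN used only for illustration.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 10),
)

# Parameter count is a property of the model alone, independent of input size.
n_params = sum(p.numel() for p in model.parameters())

def conv_macs(model, input_shape):
    """Rough multiply-accumulate count for the Conv2d layers at a given input size."""
    macs = 0
    hooks = []

    def hook(module, inputs, output):
        nonlocal macs
        k = module.kernel_size[0] * module.kernel_size[1]
        macs += output.numel() * module.in_channels * k // module.groups

    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            hooks.append(m.register_forward_hook(hook))
    with torch.no_grad():
        model(torch.zeros(1, *input_shape))
    for h in hooks:
        h.remove()
    return macs

print(f"parameters: {n_params}")
print(f"MACs at 64x64 input:   {conv_macs(model, (3, 64, 64)):,}")
print(f"MACs at 128x128 input: {conv_macs(model, (3, 128, 128)):,}")
# The parameter count is unchanged, but compute grows ~4x with the larger input.
```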

Carbon Emission. The carbon footprint of DNN training is another useful metric. However, accurately measuring carbon emissions can be challenging. This metric depends heavily on the local electricity infrastructure and current demand; therefore, one cannot easily compare results when experiments are performed in different locations, or even in the same location at different times [Schwartz et al. 2020].
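To illustrate why location matters so much, the arithmetic below multiplies a fixed, hypothetical energy measurement by a few illustrative grid carbon intensities (all numbers are placeholders, not measured values); the same training run yields very different emission estimates depending on where it ran.

```python
# Minimal sketch: identical measured energy, very different carbon estimates
# depending on the local grid's carbon intensity.
energy_kwh = 1200.0  # hypothetical measured training energy

grid_intensity_g_per_kwh = {
    "low-carbon grid": 30,    # illustrative (e.g., mostly hydro/nuclear)
    "average grid": 400,      # illustrative
    "coal-heavy grid": 800,   # illustrative
}

for grid, intensity in grid_intensity_g_per_kwh.items():
    kg_co2 = energy_kwh * intensity / 1000.0
    print(f"{grid}: ~{kg_co2:.0f} kg CO2e")
```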

Electricity Usage. A time- and location-agnostic way to quantify training efficiency is the electricity used during DNN training. GPUs and CPUs can report their current power usage, which can be used to estimate the total electricity consumed during training. Unfortunately, electricity usage is dependent on the hardware used for model training, which makes it difficult to perform a fair comparison across methods implemented by different researchers. Moreover, even for fixed hardware, it is possible to trade off power consumption and runtime [You et al. 2022].
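As a rough illustration of how reported power can be turned into an energy estimate, the sketch below samples GPU power through the NVML bindings (assuming the pynvml package is installed, e.g. via nvidia-ml-py) and integrates it over time. Coarse sampling only approximates the true energy, and this is a generic monitoring sketch, not a procedure from the paper.

```python
import time
import pynvml

def estimate_gpu_energy_kwh(duration_s, interval_s=1.0, device_index=0):
    """Sample reported GPU power every `interval_s` seconds for `duration_s`
    seconds and integrate to estimate energy. Meant to run alongside training;
    coarse sampling is only an approximation of the true integral."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    energy_j = 0.0
    try:
        elapsed = 0.0
        while elapsed < duration_s:
            power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
            energy_j += power_w * interval_s
            time.sleep(interval_s)
            elapsed += interval_s
    finally:
        pynvml.nvmlShutdown()
    return energy_j / 3.6e6  # joules -> kWh

# e.g., sample for 60 seconds while a training job is running on GPU 0:
# print(estimate_gpu_energy_kwh(duration_s=60))
```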

Operand Sizes. The total number of activations in a model's forward pass is a proxy for memory bandwidth consumption and can be a useful proxy for runtime [Dollár et al. 2021; Radosavovic et al. 2020]. This metric can be defined rigorously and independently of hardware for a given compute graph. It also decreases when operators are fused, which may or may not be desirable.
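A minimal PyTorch sketch of this proxy, assuming we count the output elements of every leaf module in one forward pass as a crude stand-in for the total activation count (the toy model is purely illustrative):

```python
import torch
import torch.nn as nn

def count_activations(model, input_shape):
    """Total number of output elements produced by leaf modules in one forward
    pass: a rough, hardware-independent proxy for activation memory traffic."""
    total = 0
    hooks = []

    def hook(module, inputs, output):
        nonlocal total
        if isinstance(output, torch.Tensor):
            total += output.numel()

    for m in model.modules():
        if len(list(m.children())) == 0:  # leaf modules only
            hooks.append(m.register_forward_hook(hook))
    with torch.no_grad():
        model(torch.zeros(1, *input_shape))
    for h in hooks:
        h.remove()
    return total

# Hypothetical small model for illustration.
model = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 10))
print(count_activations(model, (256,)))
```

Counting at the module granularity also reflects the fusion caveat above: if the ReLU were fused into the preceding linear layer, its intermediate output would drop out of the count.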
