Convolutional Neural Networks (CNN)

Posted Jan 29, 2026

By Zepeng Lin

CNN

A CNN is used to extract features from images. It is built from two operators: the $Convolution\ operator$ and the $Pooling\ operator$.

Document classification $\Rightarrow$ AlexNet (ImageNet image classification) $\Rightarrow$ CNNs dominate all vision tasks $\Rightarrow$ the Transformer era

Text: "Attention Is All You Need", NeurIPS 2017
Images: "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale", ICLR 2021

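A convolution kernel must have the same number of channels as the input image. The kernel slides over the image from left to right and top to bottom, and at each position it takes the inner product with the image patch under it (element-wise multiply, then sum). Each kernel therefore produces exactly one feature map.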
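The sliding inner product described here can be sketched in plain Python. This is only a minimal illustration, assuming stride 1 and no padding; the function name `conv2d_single_kernel` and the all-ones test data are made up for the example and do not come from any library.

```python
def conv2d_single_kernel(image, kernel):
    """Slide one kernel over a C x H x W image (nested lists), stride 1, no padding."""
    c, h, w = len(image), len(image[0]), len(image[0][0])
    kc, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    # The kernel's channel count must match the image's channel count.
    if c != kc:
        raise ValueError("kernel and image must have the same number of channels")

    out_h, out_w = h - kh + 1, w - kw + 1
    feature_map = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):            # top to bottom
        for j in range(out_w):        # left to right
            total = 0.0
            # Inner product of the kernel with the patch at (i, j), over all channels.
            for ch in range(c):
                for di in range(kh):
                    for dj in range(kw):
                        total += image[ch][i + di][j + dj] * kernel[ch][di][dj]
            feature_map[i][j] = total
    return feature_map


# Hypothetical data: a 3x6x6 image of ones and a 3x3x3 kernel of ones.
image = [[[1] * 6 for _ in range(6)] for _ in range(3)]
kernel = [[[1] * 3 for _ in range(3)] for _ in range(3)]
fm = conv2d_single_kernel(image, kernel)
print(len(fm), len(fm[0]), fm[0][0])  # 4 4 27.0
```

One kernel yields one 2D feature map no matter how many input channels there are, because the sum runs over every channel. Using several kernels simply stacks several such maps.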

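Because of this sliding, each output feature map is smaller than its input, so the spatial size keeps shrinking as layers are stacked. The Padding mechanism is introduced to prevent this.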
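To make the shrinking concrete, here is a small sketch of the usual output-size formula, `(W - K + 2P) // S + 1`. The helper name `conv_output_size` and the `stride` parameter are assumptions added for illustration; the post itself only discusses padding.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output size of a convolution along one spatial dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Without padding, a 5x5 kernel removes 4 pixels per layer.
size = 32
for layer in range(3):
    size = conv_output_size(size, kernel_size=5)
    print(f"after layer {layer + 1}: {size}")  # 28, then 24, then 20

# With padding of 2 on each side, a 5x5 kernel keeps the size unchanged.
print(conv_output_size(32, kernel_size=5, padding=2))  # 32
```

For an odd kernel size K and stride 1, a padding of (K - 1) / 2 is what keeps the output the same size as the input.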

Computing the number of learnable parameters:
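The post leaves this calculation blank, so the following is only a sketch under stated assumptions. It reuses the layer implied by the multiply-add example (10 kernels, each 3x5x5) and assumes one bias term per kernel, which is common but not stated in the post. The helper name `conv_param_count` is hypothetical.

```python
def conv_param_count(num_filters, in_channels, kernel_h, kernel_w, bias=True):
    """Learnable parameters in one convolution layer."""
    weights_per_filter = in_channels * kernel_h * kernel_w
    # Assumption: one scalar bias per filter when bias=True.
    params_per_filter = weights_per_filter + (1 if bias else 0)
    return num_filters * params_per_filter


print(conv_param_count(10, 3, 5, 5))              # 760 = 10 * (75 + 1)
print(conv_param_count(10, 3, 5, 5, bias=False))  # 750 = 10 * 75
```

Note that the count does not depend on the image's height or width, because the same kernel weights are reused at every position.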

Computing the number of multiply-add operations:

First count the outputs: 10*32*32 = 10,240 outputs
Then count the operations needed for each output: each output is the inner product of two 3x5x5 tensors (75 elements)
Total = 75*10240 = 768K
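The same arithmetic can be checked with a few lines of Python. The helper name `conv_multiply_adds` is hypothetical. Bias additions are ignored, matching the calculation above.

```python
def conv_multiply_adds(num_filters, out_h, out_w, in_channels, kernel_h, kernel_w):
    """Multiply-add operations for one convolution layer (bias ignored)."""
    num_outputs = num_filters * out_h * out_w             # one value per output position
    ops_per_output = in_channels * kernel_h * kernel_w    # length of each inner product
    return num_outputs * ops_per_output


print(conv_multiply_adds(10, 32, 32, 3, 5, 5))  # 768000
```

Unlike the parameter count, this number grows with the output's spatial size, since the inner product is repeated at every output position.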