In the previous article we covered the basic principles of convolutional neural networks, including the definitions of the basic layers and their computation rules. This article walks through one complete training pass of a CNN, covering forward propagation and backpropagation, and then implements a convolutional neural network by hand. If you are not yet familiar with the basics, read the previous article first: [Deep Learning Series] Convolutional Neural Networks (CNN) Explained (1) - Fundamentals
Forward propagation in a convolutional neural network
Let's start with the simplest possible convolutional neural network:
1. Input layer ----> convolution layer
Using the example from the previous article, the input is a 4*4 image. After convolving it with two 2*2 kernels, it becomes two 3*3 feature maps.
Take kernel filter1 as an example (stride = 1):
Compute the input of the first convolution-layer neuron o11:
The output of neuron o11 (using the ReLU activation function here):
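The worked numbers in the original are shown in a figure that is not reproduced here; written out symbolically (x is the 4*4 input, w the 2*2 kernel, b its bias), the two steps are:

net_{o11} = Σ_{m=0..1} Σ_{n=0..1} w_{m,n} · x_{1+m, 1+n} + b
out_{o11} = ReLU(net_{o11}) = max(0, net_{o11})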
The other neurons are computed in the same way.
2. Convolution layer ----> pooling layer
Compute the input of pooling unit m11 (with a 2 * 2 window); the pooling layer has no activation function.
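Written out for the 2 * 2 window over the top-left corner of the feature map (window stride 2 assumed, so windows do not overlap):

net_{m11} = max(out_{o11}, out_{o12}, out_{o21}, out_{o22})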
3. Pooling layer ----> fully connected layer
The output of the pooling layer goes to a flatten layer, which "flattens" all the elements into a vector, and then into the fully connected layer.
4. Fully connected layer ----> output layer
The fully connected layer is connected to the output layer by ordinary neuron-to-neuron connections. The softmax function turns the output into probabilities for the different classes, and the class with the largest probability is the predicted class of the image.
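As a hedged illustration of this step (the function and the values below are made up for illustration and are not part of the article's cnn.py), the softmax probabilities can be computed as:

import numpy as np

def softmax(z):
    # subtract the max for numerical stability; the result is unchanged
    exp_z = np.exp(z - np.max(z))
    return exp_z / exp_z.sum()

scores = np.array([2.0, 1.0, 0.1])   # hypothetical fully connected outputs
probs = softmax(scores)              # roughly [0.659, 0.242, 0.099]
predicted_class = np.argmax(probs)   # index of the largest probability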
Backpropagation in a convolutional neural network
A traditional neural network is fully connected, so backpropagation only needs to apply the chain rule layer by layer, from each layer back to the previous one, to obtain every layer's error sensitivities, and from those the gradients of the weights and biases used to update the weights. A convolutional neural network, however, has two special layers: the convolution layer and the pooling layer. The output of the pooling layer does not pass through an activation function; it is simply the maximum value inside a sliding window, so its derivative with respect to that value is 1. Because the pooling layer effectively compresses the previous layer's feature map, computing its error sensitivities differs from traditional backpropagation. Likewise, when propagating from a convolved feature map back to the previous layer, the forward pass was a convolution with a kernel, so the backward pass also differs from the traditional one, and the kernel parameters have to be updated. Below we describe how the pooling layer and the convolution layer perform backpropagation.
Before that, let's briefly review traditional backpropagation:
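The recap in the original is a figure that is not reproduced here. In the usual notation, for a fully connected layer l with weights W^l, bias b^l, weighted input net^l and output out^l = f(net^l), the standard results are:

δ^l = ?E/?net^l = ((W^{l+1})^T δ^{l+1}) ⊙ f'(net^l)
?E/?W^l = δ^l (out^{l-1})^T,   ?E/?b^l = δ^l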
Backpropagation through the convolution layer
From the forward pass we have:
We then take the partial derivative with respect to each input element in turn.
Observing the pattern in the expressions above, we can generalize them into the following formula:
The kernel in the figure has been rotated by 180°; convolving it with this layer's error-sensitivity matrix δ_{i,j} after padding it with zeros gives ?E/?i_{11}. In general, ?E/?i_{i,j} = Σ_m Σ_n h_{m,n} · δ_{i+m, j+n}.
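As a hedged, stand-alone sketch of this step (2D case, stride 1, square kernel assumed; this is not the class-based code given later in the article), the previous layer's sensitivity map can be obtained by zero-padding the current sensitivity map and convolving it with the 180°-rotated kernel:

import numpy as np

def prev_layer_sensitivity(delta, kernel):
    # delta: this layer's sensitivity map, kernel: the (square) convolution kernel
    k = kernel.shape[0]
    flipped = np.rot90(kernel, 2)                    # rotate the kernel by 180 degrees
    padded = np.pad(delta, k - 1, mode='constant')   # zero padding for a "full" convolution
    rows = padded.shape[0] - k + 1
    cols = padded.shape[1] - k + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = (padded[i:i + k, j:j + k] * flipped).sum()
    return out

# a 3*3 sensitivity map and a 2*2 kernel give back a 4*4 map,
# matching the 4*4 input of the forward example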
With the first term done, we now compute the second term ?i_{11}/?net_{i11}.
At this point the error-sensitivity matrix is complete, and from it we can compute the gradients of the weights.
Since we already wrote out the expression relating the convolution layer's input net_{o11} to the weights h_{i,j}, we can compute directly:
which gives the gradient of the weights:
and the gradient of the bias:
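In the same notation (h is the kernel, δ this layer's sensitivity map, x the layer's zero-padded input), the standard results, stated here for reference, are:

?E/?h_{m,n} = Σ_i Σ_j δ_{i,j} · x_{i+m, j+n}
?E/?b = Σ_i Σ_j δ_{i,j}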
As can be seen, the partial derivative with respect to the bias equals the sum of all the error sensitivities in this layer. Once we have the gradients of the weights and biases, we can update the weights and biases by gradient descent.
Backpropagation through the pooling layer
Backpropagation through the pooling layer is easier. In the figure below, the left side is the previous layer's output, i.e. the convolution layer's output feature map, and the right side is the pooling layer's output. As before, we first write out the forward-pass expressions to make the calculation easier:
Suppose the maximum value in this sliding window of the previous layer is out_{o11}:
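Since only the maximum element influences the pooling output, the derivative of net_{m11} with respect to that element is 1 and with respect to every other element in the window it is 0, so:

δ_{o11} = ?E/?net_{o11} = δ_{m11}, and δ = 0 for the other three positions in the window.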
This gives the pooling layer's error-sensitivity matrix. In the same way, the sensitivity of every neuron can be obtained and the weights of the layers before it updated.
Implementing a convolutional neural network by hand
1. Define a convolution layer
First we implement a convolution layer with a ConvLayer class and define its hyperparameters:
class ConvLayer(object):
    '''
    Parameters:
    input_width: width of the input image
    input_height: height of the input image
    channel_number: number of channels, 3 for color, 1 for grayscale
    filter_width: width of the convolution kernel
    filter_height: height of the convolution kernel
    filter_number: number of convolution kernels
    zero_padding: amount of zero padding
    stride: stride
    learning_rate: learning rate
    '''
def __init__(self, input_width, input_height,
channel_number, filter_width,
filter_height, filter_number,
zero_padding, stride, activator,
learning_rate):
self.input_width = input_width
self.input_height = input_height
self.channel_number = channel_number
self.filter_width = filter_width
self.filter_height = filter_height
self.filter_number = filter_number
self.zero_padding = zero_padding
self.stride = stride
        self.output_width = ConvLayer.calculate_output_size(
            self.input_width, filter_width, zero_padding,
            stride)
        self.output_height = ConvLayer.calculate_output_size(
            self.input_height, filter_height, zero_padding,
            stride)
self.output_array = np.zeros((self.filter_number,
self.output_height, self.output_width))
self.filters = []
for i in range(filter_number):
self.filters.append(Filter(filter_width,
filter_height, self.channel_number))
self.activator = activator
self.learning_rate = learning_rate
calculate_output_size computes the size of the feature map produced by the convolution:
    @staticmethod
    def calculate_output_size(input_size,
                              filter_size, zero_padding, stride):
        return (input_size - filter_size +
                2 * zero_padding) / stride + 1
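For example, with the 5 * 5 input, 3 * 3 kernel, zero_padding = 1 and stride = 2 used in the test at the end of this article, the output size is (5 - 3 + 2 * 1) / 2 + 1 = 3, which matches the 3 * 3 feature maps printed there.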
2. Define an activation function
Here we use the ReLU activation function, so we define it in activators.py; forward computes the forward pass and backward computes the derivative:
class ReluActivator(object):
    def forward(self, weighted_input):
        return max(0, weighted_input)

    def backward(self, output):
        return 1 if output > 0 else 0
Other common activation functions can also go into activators.py. For example, the sigmoid function can be defined as follows:
class SigmoidActivator(object):
    def forward(self, weighted_input):
        return 1.0 / (1.0 + np.exp(-weighted_input))

    # the derivative of sigmoid
    def backward(self, output):
        return output * (1 - output)
If we want to use other activation functions of our own, we can simply define one more class for each of them in activators.py.
3. Define a class that stores the convolution layer's parameters and gradients
class Filter(object):
def __init__(self, width, height, depth):
        # initial weights
self.weights = np.random.uniform(-1e-4, 1e-4,
(depth, height, width))
        # initial bias
self.bias = 0
self.weights_grad = np.zeros(
self.weights.shape)
self.bias_grad = 0
def __repr__(self):
return 'filter weights: %s bias: %s' % (
repr(self.weights), repr(self.bias))
def get_weights(self):
return self.weights
def get_bias(self):
return self.bias
def update(self, learning_rate):
self.weights -= learning_rate * self.weights_grad
self.bias -= learning_rate * self.bias_grad
4. Forward propagation of the convolution layer
1). Get the convolution region
# get the convolution region
def get_patch(input_array, i, j, filter_width,
filter_height, stride):
    '''
    Get the region of the input array for this convolution step;
    handles both 2D and 3D input automatically
    '''
    start_i = i * stride
    start_j = j * stride
    if input_array.ndim == 2:
        return input_array[
            start_i : start_i + filter_height,
            start_j : start_j + filter_width]
    elif input_array.ndim == 3:
        return input_array[:,
            start_i : start_i + filter_height,
            start_j : start_j + filter_width]
2). Perform the convolution
def conv(input_array,
kernel_array,
output_array,
stride, bias):
    '''
    Compute the convolution; handles both 2D and 3D input automatically
    '''
channel_number = input_array.ndim
output_width = output_array.shape[1]
output_height = output_array.shape[0]
kernel_width = kernel_array.shape[-1]
kernel_height = kernel_array.shape[-2]
for i in range(output_height):
for j in range(output_width):
output_array[i][j] = (
get_patch(input_array, i, j, kernel_width,
kernel_height, stride) * kernel_array
).sum() + bias
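As a hedged usage sketch (the numbers are made up for illustration), conv can be exercised on small 2D numpy arrays together with get_patch:

import numpy as np

input_array = np.array([[0, 1, 1, 0],
                        [2, 2, 2, 2],
                        [1, 0, 0, 2],
                        [0, 1, 1, 0]], dtype=np.float64)
kernel = np.array([[1, 0],
                   [0, 1]], dtype=np.float64)
# output size: (4 - 2 + 2 * 0) / 1 + 1 = 3
output = np.zeros((3, 3))
conv(input_array, kernel, output, 1, 0)
# each output[i][j] is the sum of the element-wise product of a 2*2 patch with the kernel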
3). Add zero padding
# add zero padding
def padding(input_array, zp):
    '''
    Add zero padding around the array; handles both 2D and 3D input automatically
    '''
if zp == 0:
return input_array
else:
if input_array.ndim == 3:
input_width = input_array.shape[2]
input_height = input_array.shape[1]
input_depth = input_array.shape[0]
padded_array = np.zeros((
input_depth,
input_height + 2 * zp,
input_width + 2 * zp))
padded_array[:,
zp : zp + input_height,
zp : zp + input_width] = input_array
return padded_array
elif input_array.ndim == 2:
input_width = input_array.shape[1]
input_height = input_array.shape[0]
padded_array = np.zeros((
input_height + 2 * zp,
input_width + 2 * zp))
padded_array[zp : zp + input_height,
zp : zp + input_width] = input_array
return padded_array
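A quick hedged check of the helper (shapes only, assuming numpy is imported as np):

a = np.ones((3, 4, 4))       # a 3-channel 4*4 input
padded = padding(a, 1)       # padded.shape == (3, 6, 6)
# the border is zeros and the centre 4*4 of every channel is the original data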
4). Forward propagation
def forward(self, input_array):
    '''
    Compute the output of the convolution layer;
    the result is stored in self.output_array
    '''
self.input_array = input_array
self.padded_input_array = padding(input_array,
self.zero_padding)
for f in range(self.filter_number):
filter = self.filters[f]
conv(self.padded_input_array,
filter.get_weights(), self.output_array[f],
self.stride, filter.get_bias())
element_wise_op(self.output_array,
self.activator.forward)
The element_wise_op function applies an operation (here the activation function) to every element of a numpy array in place:
# perform an element-wise operation on a numpy array, modifying it in place
def element_wise_op(array, op):
    for i in np.nditer(array,
                       op_flags=['readwrite']):
        i[...] = op(i)
5. Backpropagation of the convolution layer
1). Propagate the error to the previous layer
    def bp_sensitivity_map(self, sensitivity_array,
                           activator):
        '''
        Compute the sensitivity map passed to the previous layer
        sensitivity_array: this layer's sensitivity map
        activator: the previous layer's activation function
        '''
        # handle the stride: expand the original sensitivity map
        expanded_array = self.expand_sensitivity_map(
            sensitivity_array)
        # full convolution: zero-pad the sensitivity map
        # (the zero-padding cells of the original input also receive a
        # residual, but it does not need to be propagated further,
        # so we simply do not compute it)
        expanded_width = expanded_array.shape[2]
        zp = (self.input_width +
              self.filter_width - 1 - expanded_width) / 2
        padded_array = padding(expanded_array, zp)
        # initialize delta_array, which stores the sensitivity map
        # passed to the previous layer
        self.delta_array = self.create_delta_array()
        # for a convolution layer with multiple filters, the sensitivity
        # map passed to the previous layer is the sum of the
        # sensitivity maps of all the filters
        for f in range(self.filter_number):
            filter = self.filters[f]
            # rotate the filter weights by 180 degrees
            flipped_weights = np.array(map(
                lambda i: np.rot90(i, 2),
                filter.get_weights()))
            # compute the delta_array corresponding to one filter
            delta_array = self.create_delta_array()
            for d in range(delta_array.shape[0]):
                conv(padded_array[f], flipped_weights[d],
                     delta_array[d], 1, 0)
            self.delta_array += delta_array
        # multiply element-wise by the derivative of the activation function
        derivative_array = np.array(self.input_array)
        element_wise_op(derivative_array,
                        activator.backward)
        self.delta_array *= derivative_array
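bp_sensitivity_map (and bp_gradient below) call a helper expand_sensitivity_map that is not listed in this excerpt. A minimal sketch of what it has to do, assuming the same member names as above (the original implementation in cnn.py may differ in detail): restore the sensitivity map to the size it would have had with stride 1, scattering each value to position (i * stride, j * stride).

    def expand_sensitivity_map(self, sensitivity_array):
        depth = sensitivity_array.shape[0]
        # size the sensitivity map would have if the stride had been 1
        expanded_width = (self.input_width - self.filter_width +
                          2 * self.zero_padding + 1)
        expanded_height = (self.input_height - self.filter_height +
                           2 * self.zero_padding + 1)
        expand_array = np.zeros((depth, expanded_height, expanded_width))
        # copy every sensitivity value to its stride-1 position
        for i in range(self.output_height):
            for j in range(self.output_width):
                i_pos = i * self.stride
                j_pos = j * self.stride
                expand_array[:, i_pos, j_pos] = sensitivity_array[:, i, j]
        return expand_array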
2). Create the array that stores the sensitivity map passed to the previous layer
    def create_delta_array(self):
        return np.zeros((self.channel_number,
                         self.input_height, self.input_width))
3). Compute the gradients of the weights
    def bp_gradient(self, sensitivity_array):
        # handle the stride: expand the original sensitivity map
        expanded_array = self.expand_sensitivity_map(
            sensitivity_array)
        for f in range(self.filter_number):
            # compute the gradient of each weight
            filter = self.filters[f]
            for d in range(filter.weights.shape[0]):
                conv(self.padded_input_array[d],
                     expanded_array[f],
                     filter.weights_grad[d], 1, 0)
            # compute the gradient of the bias
            filter.bias_grad = expanded_array[f].sum()
4). Update the parameters with gradient descent
    def update(self):
        '''
        Update the weights with gradient descent
        '''
        for filter in self.filters:
            filter.update(self.learning_rate)
6. Training the MaxPooling layer
1). Define the MaxPooling class
class MaxPoolingLayer(object):
def __init__(self, input_width, input_height,
channel_number, filter_width,
filter_height, stride):
self.input_width = input_width
self.input_height = input_height
self.channel_number = channel_number
self.filter_width = filter_width
self.filter_height = filter_height
self.stride = stride
self.output_width = (input_width -
filter_width) / self.stride + 1
self.output_height = (input_height -
filter_height) / self.stride + 1
self.output_array = np.zeros((self.channel_number,
self.output_height, self.output_width))
2). Forward-pass computation
# forward propagation
def forward(self, input_array):
for d in range(self.channel_number):
for i in range(self.output_height):
for j in range(self.output_width):
self.output_array[d,i,j] = (
get_patch(input_array[d], i, j,
self.filter_width,
self.filter_height,
self.stride).max())
3). Backward-pass computation
# backpropagation
def backward(self, input_array, sensitivity_array):
self.delta_array = np.zeros(input_array.shape)
for d in range(self.channel_number):
for i in range(self.output_height):
for j in range(self.output_width):
patch_array = get_patch(
input_array[d], i, j,
self.filter_width,
self.filter_height,
self.stride)
k, l = get_max_index(patch_array)
                    self.delta_array[d,
                        i * self.stride + k,
                        j * self.stride + l] = \
                        sensitivity_array[d, i, j]
For the complete code, see cnn.py (https://github.com/huxiaoman7/PaddlePaddle_code/blob/master/1.mnist/cnn.py):
#coding:utf-8
'''
Created by huxiaoman 2017.11.22
'''
import numpy as np
from activators import ReluActivator,IdentityActivator
class ConvLayer(object):
    def __init__(self, input_width, input_height,
channel_number,filter_width,
filter_height,filter_number,
zero_padding,stride,activator,
learning_rate):
self.input_width = input_width
self.input_height = input_height
self.channel_number = channel_number
self.filter_width = filter_width
self.filter_height = filter_height
self.filter_number = filter_number
self.zero_padding = zero_padding
        self.stride = stride  # stride_x and stride_y could be added here
self.output_width = ConvLayer.calculate_output_size(
self.input_width,filter_width,zero_padding,
stride)
self.output_height = ConvLayer.calculate_output_size(
self.input_height,filter_height,zero_padding,
stride)
self.output_array = np.zeros((self.filter_number,
self.output_height,self.output_width))
self.filters = []
for i in range(filter_number):
self.filters.append(Filter(filter_width,
filter_height,self.channel_number))
self.activator = activator
self.learning_rate = learning_rate
def forward(self,input_array):
        '''
        Compute the output of the convolution layer;
        the result is stored in self.output_array
        '''
self.input_array = input_array
self.padded_input_array = padding(input_array,
self.zero_padding)
        for f in range(self.filter_number):
filter = self.filters[f]
conv(self.padded_input_array,
filter.get_weights(), self.output_array[f],
self.stride, filter.get_bias())
element_wise_op(self.output_array,
self.activator.forward)
def get_patch(input_array, i, j, filter_width, filter_height, stride):
    '''
    Get the region of the input array for this convolution step;
    handles both 2D and 3D input automatically
    '''
    start_i = i * stride
    start_j = j * stride
    if input_array.ndim == 2:
        return input_array[
            start_i : start_i + filter_height,
            start_j : start_j + filter_width]
    elif input_array.ndim == 3:
        return input_array[:,
            start_i : start_i + filter_height,
            start_j : start_j + filter_width]
# get the index of the maximum value in a 2D region
def get_max_index(array):
max_i = 0
max_j = 0
max_value = array[0,0]
for i in range(array.shape[0]):
for j in range(array.shape[1]):
if array[i,j] > max_value:
max_value = array[i,j]
max_i, max_j = i, j
return max_i, max_j
def conv(input_array, kernel_array,
         output_array, stride, bias):
    '''
    Compute the convolution; handles both 2D and 3D input automatically
    '''
channel_number = input_array.ndim
output_width = output_array.shape[1]
output_height = output_array.shape[0]
kernel_width = kernel_array.shape[-1]
kernel_height = kernel_array.shape[-2]
for i in range(output_height):
for j in range(output_width):
output_array[i][j] = (
get_patch(input_array, i, j, kernel_width,
kernel_height,stride) * kernel_array).sum() +bias
def element_wise_op(array, op):
for i in np.nditer(array,
op_flags = ['readwrite']):
i[...] = op(i)
class ReluActivator(object):
    def forward(self, weighted_input):
        # ReLU: max(0, input)
return max(0, weighted_input)
def backward(self,output):
return 1 if output > 0 else 0
class SigmoidActivator(object):
def forward(self,weighted_input):
        return 1.0 / (1.0 + np.exp(-weighted_input))
def backward(self,output):
return output * (1 - output)
Finally, we use a 3-channel 5 * 5 image to check the output after one forward pass and one backward pass through the network:
def init_test():
a = np.array(
[[[0,1,1,0,2],
[2,2,2,2,1],
[1,0,0,2,0],
[0,1,1,0,0],
[1,2,0,0,2]],
[[1,0,2,2,0],
[0,0,0,2,0],
[1,2,1,2,1],
[1,0,0,0,0],
[1,2,1,1,1]],
[[2,1,2,0,0],
[1,0,0,1,0],
[0,2,1,0,1],
[0,1,2,2,2],
[2,1,0,0,1]]])
b = np.array(
[[[0,1,1],
[2,2,2],
[1,0,0]],
[[1,0,2],
[0,0,0],
[1,2,1]]])
cl = ConvLayer(5,5,3,3,3,2,1,2,IdentityActivator(),0.001)
cl.filters[0].weights = np.array(
[[[-1,1,0],
[0,1,0],
[0,1,1]],
[[-1,-1,0],
[0,0,0],
[0,-1,0]],
[[0,0,-1],
[0,1,0],
[1,-1,-1]]], dtype=np.float64)
cl.filters[0].bias=1
cl.filters[1].weights = np.array(
[[[1,1,-1],
[-1,-1,1],
[0,-1,1]],
[[0,1,0],
[-1,0,-1],
[-1,1,0]],
[[-1,0,0],
[-1,0,1],
[-1,0,0]]], dtype=np.float64)
return a, b, cl
Run it:
def test():
a, b, cl = init_test()
cl.forward(a)
print "前向傳播結(jié)果:", cl.output_array
cl.backward(a, b, IdentityActivator())
cl.update()
print "反向傳播后更新得到的filter1:",cl.filters[0]
print "反向傳播后更新得到的filter2:",cl.filters[1]
if __name__ == "__main__":
test()
The result:
forward pass result: [[[ 6. 7. 5.]
[ 3. -1. -1.]
[ 2. -1. 4.]]
[[ 2. -5. -8.]
[ 1. -4. -4.]
[ 0. -5. -5.]]]
filter1 after the backward update: filter weights:
array([[[-1.008, 0.99 , -0.009],
[-0.005, 0.994, -0.006],
[-0.006, 0.995, 0.996]],
[[-1.004, -1.001, -0.004],
[-0.01 , -0.009, -0.012],
[-0.002, -1.002, -0.002]],
[[-0.002, -0.002, -1.003],
[-0.005, 0.992, -0.005],
[ 0.993, -1.008, -1.007]]])
bias:
0.99099999999999999
filter2 after the backward update: filter weights:
array([[[ 9.98000000e-01, 9.98000000e-01, -1.00100000e+00],
[ -1.00400000e+00, -1.00700000e+00, 9.97000000e-01],
[ -4.00000000e-03, -1.00400000e+00, 9.98000000e-01]],
[[ 0.00000000e+00, 9.99000000e-01, 0.00000000e+00],
[ -1.00900000e+00, -5.00000000e-03, -1.00400000e+00],
[ -1.00400000e+00, 1.00000000e+00, 0.00000000e+00]],
[[ -1.00400000e+00, -6.00000000e-03, -5.00000000e-03],
[ -1.00200000e+00, -5.00000000e-03, 9.98000000e-01],
[ -1.00200000e+00, -1.00000000e-03, 0.00000000e+00]]])
bias:
-0.0070000000000000001
PaddlePaddle convolutional neural network source-code walkthrough
Convolution layer
In the previous article we briefly introduced the functions PaddlePaddle provides for building convolutional neural networks. When designing the CNN for handwritten digit recognition we called simple_img_conv_pool (the link in the previous article is broken because the framework has since moved to fluid; it changes fast). It is used like this:
conv_pool_1 = paddle.networks.simple_img_conv_pool(
input=img,
filter_size=5,
num_filters=20,
num_channel=1,
pool_size=2,
pool_stride=2,
act=paddle.activation.Relu())
This function wraps the convolution layer and the pooling layer together, so a single call does both, which is very convenient. If you only need a convolution layer on its own, you can call img_conv_layer, used as follows:
conv = img_conv_layer(input=data, filter_size=1, filter_size_y=1,
                      num_channels=8, num_filters=16, stride=1,
                      bias_attr=False, act=ReluActivation())
Let's look at the parameters of this function (the docstring explains what each parameter means and how it is used):
def img_conv_layer(input,
filter_size,
num_filters,
name=None,
num_channels=None,
act=None,
groups=1,
stride=1,
padding=0,
dilation=1,
bias_attr=None,
param_attr=None,
shared_biases=True,
layer_attr=None,
filter_size_y=None,
stride_y=None,
padding_y=None,
dilation_y=None,
trans=False,
layer_type=None):
"""
適合圖像的卷積層。Paddle可以支持正方形和長方形兩種圖片尺寸的輸入
也可適用于圖像的反卷積(Convolutional Transpose,即deconv)。
同樣可支持正方形和長方形兩種尺寸輸入。
num_channel:輸入圖片的通道數(shù)??梢允?或者3,或者是上一層的通道數(shù)(卷積核數(shù)目 * 組的數(shù)量)
每一個(gè)組都會(huì)處理圖片的一些通道。舉個(gè)例子,如果一個(gè)輸入如偏的num_channel是256,設(shè)置4個(gè)group,
32個(gè)卷積核,那么會(huì)創(chuàng)建32*4 = 128個(gè)卷積核來處理輸入圖片。通道會(huì)被分成四塊,32個(gè)卷積核會(huì)先
處理64(256/4=64)個(gè)通道。剩下的卷積核組會(huì)處理剩下的通道。
name:層的名字??蛇x,自定義。
type:basestring
input:這個(gè)層的輸入
type:LayerOutPut
filter_size:卷積核的x維,可以理解為width。
如果是正方形,可以直接輸入一個(gè)元祖組表示圖片的尺寸
type:int/ tuple/ list
filter_size_y:卷積核的y維,可以理解為height。
PaddlePaddle支持長方形的圖片尺寸,所以卷積核的尺寸為(filter_size,filter_size_y)
type:int/ None
act: 激活函數(shù)類型。默認(rèn)選Relu
type:BaseActivation
groups:卷積核的組數(shù)量
type:int
stride: 水平方向的滑動(dòng)步長?;蛘呤澜巛斎胍粋€(gè)元祖,代表水平數(shù)值滑動(dòng)步長相同。
type:int/ tuple/ list
stride_y:垂直滑動(dòng)步長。
type:int
padding: 補(bǔ)零的水平維度,也可以直接輸入一個(gè)元祖,水平和垂直方向上補(bǔ)零的維度相同。
type:int/ tuple/ list
padding_y:垂直方向補(bǔ)零的維度
type:int
dilation:水平方向的擴(kuò)展維度。同樣可以輸入一個(gè)元祖表示水平和初值上擴(kuò)展維度相同
:type:int/ tuple/ list
dilation_y:垂直方向的擴(kuò)展維度
type:int
bias_attr:偏置屬性
False:不定義bias True:bias初始化為0
type: ParameterAttribute/ None/ bool/ Any
num_channel:輸入圖片的通道channel。如果設(shè)置為None,自動(dòng)生成為上層輸出的通道數(shù)
type: int
param_attr:卷積參數(shù)屬性。設(shè)置為None表示默認(rèn)屬性
param_attr:ParameterAttribute
shared_bias:設(shè)置偏置項(xiàng)是否會(huì)在卷積核中共享
type:bool
layer_attr: Layer的 Extra Attribute
type:ExtraLayerAttribute
param trans:如果是convTransLayer,設(shè)置為True,如果是convlayer設(shè)置為conv
type:bool
layer_type:明確layer_type,默認(rèn)為None。
如果trans= True,必須是exconvt或者cudnn_convt,否則的話要么是exconv,要么是cudnn_conv
ps:如果是默認(rèn)的話,paddle會(huì)自動(dòng)選擇適合cpu的ExpandConvLayer和適合GPU的CudnnConvLayer
當(dāng)然,我們自己也可以明確選擇哪種類型
type:string
return:LayerOutput object
rtype:LayerOutput
"""
def img_conv_layer(input,
filter_size,
num_filters,
name=None,
num_channels=None,
act=None,
groups=1,
stride=1,
padding=0,
dilation=1,
bias_attr=None,
param_attr=None,
shared_biases=True,
layer_attr=None,
filter_size_y=None,
stride_y=None,
padding_y=None,
dilation_y=None,
trans=False,
layer_type=None):
if num_channels is None:
assert input.num_filters is not None
num_channels = input.num_filters
if filter_size_y is None:
if isinstance(filter_size, collections.Sequence):
assert len(filter_size) == 2
filter_size, filter_size_y = filter_size
else:
filter_size_y = filter_size
if stride_y is None:
if isinstance(stride, collections.Sequence):
assert len(stride) == 2
stride, stride_y = stride
else:
stride_y = stride
if padding_y is None:
if isinstance(padding, collections.Sequence):
assert len(padding) == 2
padding, padding_y = padding
else:
padding_y = padding
if dilation_y is None:
if isinstance(dilation, collections.Sequence):
assert len(dilation) == 2
dilation, dilation_y = dilation
else:
dilation_y = dilation
if param_attr.attr.get('initial_smart'):
# special initial for conv layers.
init_w = (2.0 / (filter_size**2 * num_channels))**0.5
param_attr.attr["initial_mean"] = 0.0
param_attr.attr["initial_std"] = init_w
param_attr.attr["initial_strategy"] = 0
param_attr.attr["initial_smart"] = False
if layer_type:
if dilation > 1 or dilation_y > 1:
assert layer_type in [
"cudnn_conv", "cudnn_convt", "exconv", "exconvt"
]
if trans:
assert layer_type in ["exconvt", "cudnn_convt"]
else:
assert layer_type in ["exconv", "cudnn_conv"]
lt = layer_type
else:
lt = LayerType.CONVTRANS_LAYER if trans else LayerType.CONV_LAYER
l = Layer(
name=name,
inputs=Input(
input.name,
conv=Conv(
filter_size=filter_size,
padding=padding,
dilation=dilation,
stride=stride,
channels=num_channels,
groups=groups,
filter_size_y=filter_size_y,
padding_y=padding_y,
dilation_y=dilation_y,
stride_y=stride_y),
**param_attr.attr),
active_type=act.name,
num_filters=num_filters,
bias=ParamAttr.to_bias(bias_attr),
shared_biases=shared_biases,
type=lt,
**ExtraLayerAttribute.to_kwargs(layer_attr))
return LayerOutput(
name,
lt,
parents=[input],
activation=act,
num_filters=num_filters,
size=l.config.size)
Knowing what these parameters mean, and comparing with the CNN we wrote by hand, we can see several advantages of PaddlePaddle:
It supports rectangular as well as square image sizes.
The stride, zero padding and dilation can each take different values in the horizontal and vertical directions.
The bias term can be shared across kernels.
It automatically picks the convolution implementation suited to the CPU or the GPU.
In the CNN we wrote ourselves, only square images are supported; a rectangular image would raise an error, and the stride and zero padding are also restricted to the same value in the horizontal and vertical directions. Now that we understand the convolution layer's parameters, we can look at how the lower level implements it in ConvBaseLayer.py; those interested can follow that link to see how ConvLayer is written in C++ underneath.
The pooling layer can be analysed in the same way; those interested can follow the same approach all the way down to the low-level implementation, which we will analyse in detail another time. (Placeholder: the TensorFlow source-code walkthrough will be added later.)
Summary
This article covered the details of backpropagation in convolutional neural networks, including how backpropagation through the convolution and pooling layers differs from traditional backpropagation, and implemented a complete CNN. You can modify the code yourself, for example to handle the case where the horizontal and vertical strides differ. Finally, we looked at how the convolution layer is implemented in PaddlePaddle and, comparing it with our own CNN, summarized four advantages; the low level is implemented in C++, and you can dig into it if you are interested. This write-up is fairly rough; if you spot problems, feel free to leave a comment :)
Original title: [Deep Learning Series] Convolutional Neural Networks Explained (2) - Implementing a CNN by Hand
Source: WeChat official account AI_shequ (人工智能愛好者社區(qū))