[Original] Machine Learning from Scratch Series (8): A Unified Framework for Machine Learning
A Unified Framework for Machine Learning
Objective Function
Loss Function
Directly minimizing the 0-1 loss is impractical for several reasons:
Optimizing the 0-1 loss is a combinatorial optimization problem and is NP-hard, so it cannot be solved in polynomial time;
the loss function is non-convex and non-smooth, so many optimization methods do not apply;
a small update to the weights can cause a large jump in the loss, i.e., the loss does not change smoothly;
only a limited form of regularization can be used; other regularization terms have no effect;
even with regularization, the problem remains non-convex and non-smooth and is hard to optimize.
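This is why practical learners replace the 0-1 loss with a convex surrogate. As a minimal numpy sketch (my own illustration, not code from the original post), the hinge loss and the logistic loss, viewed as functions of the margin m = y·f(x), both upper-bound the 0-1 loss:

```python
import numpy as np

def zero_one_loss(margin):
    # 1 if the example is misclassified (margin <= 0), else 0.
    return (margin <= 0).astype(float)

def hinge_loss(margin):
    # max(0, 1 - m): convex upper bound on the 0-1 loss (used by SVMs).
    return np.maximum(0.0, 1.0 - margin)

def logistic_loss(margin):
    # log2(1 + exp(-m)): smooth convex upper bound (used by logistic regression).
    return np.log2(1.0 + np.exp(-margin))

margins = np.linspace(-3, 3, 13)
# Both surrogates dominate the 0-1 loss at every margin.
assert np.all(hinge_loss(margins) >= zero_one_loss(margins))
assert np.all(logistic_loss(margins) >= zero_one_loss(margins))
```

Because the surrogates are convex (and the logistic loss also smooth), gradient-based optimization applies, sidestepping every difficulty listed above.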
Regularization Term
L2 Regularization
L1 Regularization
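The practical difference between the two penalties can be seen in a short numpy sketch (my own illustration, not code from the post): a gradient step on the L2 penalty shrinks every weight multiplicatively toward zero, while the proximal (soft-thresholding) step for the L1 penalty sets small weights exactly to zero, which is why L1 yields sparse solutions:

```python
import numpy as np

def l2_shrink(w, lam, lr=1.0):
    # Gradient step on (lam/2)*||w||^2: multiplicative shrinkage toward 0.
    return w * (1.0 - lr * lam)

def l1_shrink(w, lam, lr=1.0):
    # Proximal (soft-thresholding) step for lam*||w||_1: exact zeros appear.
    return np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)

w = np.array([3.0, 0.2, -0.05, -2.0])
print(l2_shrink(w, lam=0.1))   # every weight scaled by 0.9, none exactly zero
print(l1_shrink(w, lam=0.1))   # weights with |w| <= 0.1 become exactly 0.0
```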
A Geometric Interpretation of Regularization
(figure from Wikipedia)
Dropout Regularization and Data Augmentation
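Dropout can be sketched in a few lines of numpy (my own illustration of the standard "inverted dropout" formulation, not code from the post): during training each activation is zeroed with probability p and the survivors are scaled by 1/(1-p), so the expected activation is unchanged and no rescaling is needed at test time:

```python
import numpy as np

def dropout(a, p, rng, training=True):
    """Inverted dropout: zero each unit with probability p during training,
    scaling survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return a  # identity at test time, no rescaling needed
    mask = (rng.random(a.shape) >= p) / (1.0 - p)
    return a * mask

rng = np.random.default_rng(0)
a = np.ones((4, 5))
print(dropout(a, 0.5, rng))                  # entries are either 0.0 or 2.0
print(dropout(a, 0.5, rng, training=False))  # unchanged at test time
```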
Neural Network Framework
Linear Regression
Logistic Regression
Support Vector Machine
Bootstrap Neural Network
Boosting Neural Network
Recap
Wikipedia defines a neural network as follows:
NN is a network inspired by biological neural networks (the central nervous systems of animals, in particular the brain) which are used to estimate or approximate functions that can depend on a large number of inputs that are generally unknown.(from wikipedia)
The Neuron
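A single neuron computes a weighted sum of its inputs plus a bias, then passes the result through a nonlinear activation. A minimal numpy sketch (the input, weight, and bias values below are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # Weighted sum of inputs plus bias, followed by a sigmoid activation.
    return sigmoid(np.dot(w, x) + b)

x = np.array([1.0, 2.0, 3.0])    # inputs (illustrative values)
w = np.array([0.5, -0.25, 0.1])  # weights
b = 0.3                          # bias
print(neuron(x, w, b))           # sigmoid(0.6) ≈ 0.6457
```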
Common Neural Network Architectures
Feedforward neural networks
Backpropagation neural networks
Recurrent neural networks
Convolutional neural networks
Autoencoders
Google DeepMind's memory neural networks (used in AlphaGo)
A Simple Neural Network Example
import math
import random

import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Activation


def gd(x, m, s):
    """Density of a Gaussian with mean m and standard deviation s."""
    left = 1 / (math.sqrt(2 * math.pi) * s)
    right = math.exp(-math.pow(x - m, 2) / (2 * math.pow(s, 2)))
    return left * right


def pt(x, y1, y2):
    """Plot the true density y1 and the network's fit y2 against x."""
    if len(x) != len(y1) or len(x) != len(y2):
        print('input error.')
        return
    plt.figure(num=1, figsize=(20, 6))
    plt.title('NN fitting Gaussian distribution', size=14)
    plt.xlabel('x', size=14)
    plt.ylabel('y', size=14)
    plt.plot(x, y1, color='b', linestyle='--', label='Gaussian distribution')
    plt.plot(x, y2, color='r', linestyle='-', label='NN fitting')
    plt.legend(loc='upper left')
    plt.savefig('ann.png', format='png')


def ann(train_d, train_l, prd_d):
    """Fit a small fully connected network and predict on prd_d."""
    if len(train_d) == 0 or len(train_d) != len(train_l):
        print('training data error.')
        return
    model = Sequential()
    model.add(Dense(30, input_dim=1))
    model.add(Activation('relu'))
    model.add(Dense(30))
    model.add(Activation('relu'))
    # The standard Gaussian density lies in (0, 0.4], so a sigmoid output fits.
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='rmsprop')
    model.fit(train_d, train_l, batch_size=250, epochs=50, validation_split=0.2)
    return model.predict(prd_d, batch_size=250)


if __name__ == '__main__':
    x = np.linspace(-5, 5, 10000)
    # Sample 900 indices (not values) so x[i] below is valid integer indexing.
    idx = random.sample(range(len(x)), 900)
    train_d = []
    train_l = []
    for i in idx:
        train_d.append(x[i])
        train_l.append(gd(x[i], 0, 1))
    y1 = [gd(i, 0, 1) for i in x]
    y2 = ann(np.array(train_d).reshape(-1, 1), np.array(train_l),
             x.reshape(-1, 1))
    pt(x, y1, y2.tolist())