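ML / RS / CF: User-based CF algorithm: recommending movies to a new user, Jason, from a large dataset of users' movie ratings (given Jason's ratings for the few dozen movies he has watched)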
Output
First, a look at the recommendation results.
Implementation code
from math import sqrt

# Pearson correlation between two users' ratings
def pearson_dis(rating1, rating2):
    sum_xy = 0
    sum_x = 0
    sum_y = 0
    sum_x2 = 0
    sum_y2 = 0
    n = 0
    # accumulate the sums over the items both users have rated
    for key in rating1:
        if key in rating2:
            n += 1
            x = rating1[key]
            y = rating2[key]
            sum_xy += x * y
            sum_x += x
            sum_y += y
            sum_x2 += pow(x, 2)
            sum_y2 += pow(y, 2)
    # no items in common: the correlation is undefined, so return 0
    # (this also avoids dividing by zero below)
    if n == 0:
        return 0
    # now compute denominator
    denominator = sqrt(sum_x2 - pow(sum_x, 2) / n) * sqrt(sum_y2 - pow(sum_y, 2) / n)
    if denominator == 0:
        return 0
    else:
        return (sum_xy - (sum_x * sum_y) / n) / denominator

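# Find the nearest neighbors
def computeNearestNeighbor(username, users):
    """Given username, compute the Pearson correlation between it and every other user, sorted most similar first."""
    distances = []
    for user in users:  # iterate over all users, compute the Pearson correlation for each pair, and append it to the list
        if user != username:
            # distance = manhattan_dis(users[user], users[username])
            distance = pearson_dis(users[user], users[username])
            distances.append((distance, user))
    # Pearson is a similarity (higher means closer), so sort in descending order;
    # an ascending sort would put the least similar user first
    distances.sort(reverse=True)
    return distances
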
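# Make recommendations
def recommend(username, users):
    # take the first entry returned by computeNearestNeighbor as the nearest neighbor
    nearest = computeNearestNeighbor(username, users)[0][1]
    recommendations = []
    neighborRatings = users[nearest]
    userRatings = users[username]
    # collect the movies the neighbor has rated but the user has not
    for artist in neighborRatings:
        if artist not in userRatings:
            recommendations.append((artist, neighborRatings[artist]))
    # sort by the neighbor's rating, highest first
    results = sorted(recommendations, key=lambda artistTuple: artistTuple[1], reverse=True)
    for result in results:
        print(result[0], result[1])

# users maps each username to a {movie: rating} dict; the dataset itself is not shown in this post
recommend('Jason', users)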
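Since the rating dataset is not included in the post, here is a small self-contained sketch for trying the idea out by hand. Everything in it is an assumption made for illustration: the three users in toy_users, the movie names, the ratings, and the helper names pearson, nearest_neighbor and recommend_from_nearest are all hypothetical and are not the article's data or functions. It follows the same approach as the code above: pick the user with the highest Pearson correlation, then list the movies that neighbor rated and the target user has not.

```python
from math import sqrt

# Hypothetical toy data, NOT the dataset used in the article.
toy_users = {
    "Jason": {"Movie A": 5, "Movie B": 3, "Movie C": 1},
    "Ann": {"Movie A": 4, "Movie B": 3, "Movie C": 2, "Movie D": 5, "Movie E": 2},
    "Bob": {"Movie A": 1, "Movie B": 3, "Movie C": 5, "Movie F": 4},
}


def pearson(r1, r2):
    """Pearson correlation over the movies both users rated (0 if undefined)."""
    common = [k for k in r1 if k in r2]
    n = len(common)
    if n == 0:
        return 0
    sum_x = sum(r1[k] for k in common)
    sum_y = sum(r2[k] for k in common)
    sum_xy = sum(r1[k] * r2[k] for k in common)
    sum_x2 = sum(r1[k] ** 2 for k in common)
    sum_y2 = sum(r2[k] ** 2 for k in common)
    # max(0.0, ...) guards against tiny negative values from float rounding
    var_x = max(0.0, sum_x2 - sum_x ** 2 / n)
    var_y = max(0.0, sum_y2 - sum_y ** 2 / n)
    denominator = sqrt(var_x) * sqrt(var_y)
    if denominator == 0:
        return 0
    return (sum_xy - sum_x * sum_y / n) / denominator


def nearest_neighbor(username, users):
    """Return the other user with the highest Pearson correlation."""
    others = [u for u in users if u != username]
    return max(others, key=lambda u: pearson(users[u], users[username]))


def recommend_from_nearest(username, users):
    """Movies the nearest neighbor rated and username has not, best rated first."""
    neighbor = users[nearest_neighbor(username, users)]
    seen = users[username]
    unseen = [(movie, rating) for movie, rating in neighbor.items() if movie not in seen]
    return sorted(unseen, key=lambda pair: pair[1], reverse=True)


print(nearest_neighbor("Jason", toy_users))        # Ann
print(recommend_from_nearest("Jason", toy_users))  # [('Movie D', 5), ('Movie E', 2)]
```

In this toy data Ann's ratings rise and fall together with Jason's (correlation close to 1), while Bob's run in the opposite direction (close to -1), so Ann is chosen as the neighbor and her two movies that Jason has not rated come back in order of her rating.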