Design of a Social Network Data Collection Algorithm (Software Engineering Course Design Report)

Pages: 26

Date: 2018-08-08


Software Engineering Course Design
Design of a Social Network Data Collection Algorithm

Abstract

With the growth of the Internet, people now live in an era of information explosion. Social network data is large in volume and strongly topical, has great value for data mining, and is an important part of Internet big data. Some social platforms, such as Twitter, Sina Weibo, and Renren, allow users to apply for permission to collect platform data and provide API interfaces for that purpose. Social data is then obtained by registering on the platform, applying for API authorization, and calling the API methods. However, the application for collection permission is reviewed strictly, and data collection remains restricted even after the application succeeds. This report therefore uses a web crawler instead: it logs in to the social platform by simulating a login with a social account, visits the platform's web pages, and returns the result promptly once the crawl task has finished. In contrast to the information scarcity of the past, today's volume of data makes the screening and filtering of information an important measure of a system's quality. This report uses a crawler and a collaborative filtering algorithm to collect social network data.

Keywords: software engineering; social network; crawler; collaborative filtering algorithm

Contents

Abstract
Contents
1 Purpose of the research
  1.1 Research background
2 Priority crawl strategy: PageRank
  2.1 Introduction to PageRank
  2.2 PageRank procedure
3 Crawler
  3.1 About crawlers
    3.1.1 Crawler overview
    3.1.2 Workflow
    3.1.3 Crawl strategies
  3.2 Tools
    3.2.1 Eclipse
    3.2.2 Python
    3.2.3 BeautifulSoup
  3.3 Implementation
  3.4 Results
4 Algorithms
  4.1 Three ways of obtaining data…
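The contents list names PageRank as the crawler's priority crawl strategy, but the preview ends before the report describes it. As a rough illustration only, the sketch below shows how PageRank scores could be used to order pages for crawling. It is not the report's implementation: the function names `pagerank` and `crawl_order`, the sample link graph, the damping factor of 0.85, and the fixed iteration count are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    The damping factor and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages without outlinks is spread evenly over all pages.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {
            page: (1.0 - damping) / n + damping * dangling / n for page in pages
        }
        # Each page passes an equal share of its rank to every page it links to.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def crawl_order(links):
    """Return pages sorted by descending PageRank (ties broken by name)."""
    ranks = pagerank(links)
    return sorted(ranks, key=lambda page: (-ranks[page], page))


if __name__ == "__main__":
    # Hypothetical link graph: page -> pages it links to.
    sample_links = {
        "a": ["b", "c"],
        "b": ["c"],
        "c": ["a"],
        "d": ["c"],
    }
    for page in crawl_order(sample_links):
        print(page)
```

In a crawler, the link graph would be built up from the pages fetched so far, and the ordering recomputed from time to time so that pages with many incoming links are visited first.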
