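Course project for "Big Data Mining Technology" at Fudan University. It attempts to find frequent 2-itemsets of keywords with high support in the search records of the Sogou Labs user query log dataset (2008). On the implementation side, I built a small Hadoop cluster of five servers and implemented the three MapReduce passes of the Parallel FP-Growth algorithm in Python.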
Updated Mar 29, 2021 - Python
Using Hadoop to process data from an automobile tracking platform that records the history of important incidents after the initial sale of a new vehicle.
Add MapReduce code in any language to the designated folder, building a repository that helps students learn Big Data.
A simple project on the use of map and reduce in Hadoop.
A MapReduce implementation in Python on a Docker-simulated distributed system
PageRank algorithm using Hadoop Streaming
A Lambda function that starts EMR and runs a MapReduce job
A distributed MapReduce implemented with Python 3 and gRPC
A SQL engine built on Hadoop MapReduce
This repository contains code that extracts meaningful information from a news headline dataset.
A recommender system that suggests items to users, built on item-based CF and XGBRegressor
Performing MapReduce to compute PageRank on the WDC data.
Using MapReduce in Hadoop and Python to score sentiment
Modified from big-data-europe/docker-hadoop
Understand how MapReduce works for parsing text data, with subtasks processed in parallel using multithreading
Big Data major assignment (Hadoop)
A repository containing the source code for the assignments done as part of the Big Data course (UE18CS322) at PES University.