In this talk, I will first introduce the basics of Hadoop and the differences between Hadoop 1 and Hadoop 2 (YARN), then focus mainly on Hadoop 2, covering its terminology and the detailed workflow of a MapReduce job. The second part of the talk covers the Rhipe package in R, i.e., the R and Hadoop Integrated Programming Environment. I will start with a word count example, then turn to the unique properties of Rhipe and the kinds of jobs it makes possible.

Seminar Information

  • Speaker: Xiaosu Tong
  • Date: Thursday, March 10, 2016
  • Time: 4:30 pm - 5:20 pm
  • Location: Math Library Lounge

Slides

Slides can be downloaded here.