In this post, we will write the word count program in Java. We explained the logic of this program in MapReduce Hello World (Part 1). Before writing the program, here are the data type differences between Java and MapReduce: the equivalent of int in MapReduce is IntWritable; the equivalent of String is Text; the equivalent […]
Category: Hadoop
MapReduce Hello World (Part 1)
In this post, we will: 1) understand MapReduce basics, and 2) write a word count program in MapReduce. Word count is also considered the Hello World program of MapReduce programming. What is MapReduce? MapReduce is the ‘heart’ of Hadoop and consists of two parts: ‘map’ and ‘reduce’. Maps […]
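The two parts named in the excerpt above can be sketched in plain Java. This is only an illustration of the idea and does not use the Hadoop API: the class name `WordCountSketch` and its methods are hypothetical, and everything runs in memory in a single process, whereas a real job would group the pairs by key across a cluster between the two phases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory sketch of word count; not Hadoop code.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one line of input.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum the counts emitted for each distinct word.
    // A TreeMap keeps the words sorted, so the output order is predictable.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Run the map phase on every line, then reduce all emitted pairs.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> allPairs = new ArrayList<>();
        for (String line : lines) {
            allPairs.addAll(map(line));
        }
        return reduce(allPairs);
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(wordCount(List.of("hello world", "hello hadoop")));
    }
}
```

The sketch uses Java's own `String` and `Integer` types for the pairs; it makes no attempt to show Hadoop's own data types.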
Big Data and Hadoop Basics
This post provides an introduction to the following concepts: Hadoop basics, what HDFS is, and what YARN is. Let's start with the simplest question first. What is Big Data? Big data is a term coined for huge volumes of data (in terabytes or petabytes) that are difficult to manage using […]
What is Big Data and Why do Enterprises care about Big Data?
In this article, we will discuss what Big Data is and why enterprises care about it. We will learn: what is wrong with our traditional DWH solutions; when RDBMS could not help much; the technical issues we face with RDBMS; how Hadoop is different from RDBMS; and the core features of Hadoop. What is wrong with our […]
NoSQL and Advantages of NoSQL
Why NoSQL? Bigger data handling capability: to handle large volumes of data, it allows data to be spread across commodity hardware. Elastic scaling: scale out the storage. Flexible schema/fixed data model: it does not need an organized schema; the schema can be dynamic or fixed. Integrated caching facility: to increase performance, NoSQL caches data in system […]