Blog

Main Features of Apache Hadoop

Hadoop is an open source framework and a popular data storage system. It is used to store large sets of structured, semi-structured, and unstructured data. Here we discuss the main features of Apache Hadoop.

1. Hadoop is Open Source:

The Hadoop framework is open source, so we can change the project code according to business requirements.

2. Fault Tolerant:

In Hadoop all […]

Top Ten Differences Between Hadoop One and Hadoop Two

S.No
Hadoop 1.X
Hadoop 2.X

1
Hadoop 1.x has only two components for processing data: HDFS and MapReduce.
Hadoop 2.x has three components for processing data: HDFS, MapReduce, and YARN.

2
Hadoop 1.x supports only MapReduce tools and distributed models; it does not support non-MapReduce tools.
It allows MapReduce tools and other types of […]

Top Five Big Data Tools

Big data refers to large amounts of structured, unstructured, and semi-structured data, often stored in open source frameworks such as Hadoop. Big data tools help to extract and analyze this data, which saves time. Many companies use big data tools to access Hadoop data. Here we discuss Top […]

Apache Sqoop Tutorial – Basic Sqoop Import and Export Operations

What is Sqoop?

Sqoop is a tool used to transfer data between an RDBMS and HDFS. It imports data from relational datastores into HDFS and exports it back. It uses MapReduce to transfer the data, which lets it handle large amounts of data. Sqoop works only with relational databases and it is an open source tool […]
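
To make the import and export directions concrete, here is a minimal sketch that only builds the `sqoop import` and `sqoop export` command lines and does not run them. The JDBC URL, user name, table, and HDFS paths are hypothetical placeholders, and the helper function names are our own, not part of Sqoop.

```python
from typing import List


def sqoop_import_args(jdbc_url: str, username: str, table: str,
                      target_dir: str, num_mappers: int = 1) -> List[str]:
    """Build arguments for a basic `sqoop import` (RDBMS table -> HDFS directory)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]


def sqoop_export_args(jdbc_url: str, username: str, table: str,
                      export_dir: str) -> List[str]:
    """Build arguments for a basic `sqoop export` (HDFS directory -> RDBMS table)."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--export-dir", export_dir,
    ]


if __name__ == "__main__":
    # All connection details below are made up for illustration.
    url = "jdbc:mysql://dbhost/sales"
    print(" ".join(sqoop_import_args(url, "etl_user", "orders", "/data/orders")))
    print(" ".join(sqoop_export_args(url, "etl_user", "orders_summary", "/data/summary")))
```

The number of mappers controls how many parallel MapReduce tasks Sqoop uses for the transfer; one mapper is the simplest setting for a first test.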

Apache HDFS Architecture and Components

HDFS stands for Hadoop Distributed File System, and it manages large, high-volume data sets. HDFS stores data in a distributed manner and is Hadoop's primary storage system. HDFS allows files to be read and written, but existing files cannot be updated in place. When we move a file into HDFS, the file is split into small […]
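
The excerpt above is cut off, but assuming it refers to HDFS splitting files into fixed-size blocks, the arithmetic can be sketched as follows. The 128 MB default matches the Hadoop 2.x `dfs.blocksize` setting; the function name and the example file size are our own illustration.

```python
from typing import List

MB = 1024 * 1024


def split_into_blocks(file_size: int, block_size: int = 128 * MB) -> List[int]:
    """Return the size in bytes of each block a file of `file_size` bytes would occupy."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    sizes = []
    remaining = file_size
    while remaining > 0:
        # Every block is full except possibly the last one.
        size = min(block_size, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


if __name__ == "__main__":
    # A hypothetical 300 MB file with the assumed 128 MB block size.
    blocks = split_into_blocks(300 * MB)
    print([size // MB for size in blocks])  # prints [128, 128, 44]
```

Note that the last block holds only the remaining bytes, so a 300 MB file uses two full blocks and one 44 MB block.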

Basic Hadoop Shell Commands to Manage HDFS

Hadoop is a Java-based framework that typically runs on Linux-based systems, so shell commands are important for working with Hadoop. Hadoop commands are used to access all the files and directories in an HDFS cluster. Here we discuss basic and important Hadoop commands to manage HDFS.

Basic Commands:

1. mkdir:

mkdir is used to create […]
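
As a rough illustration of how such commands look, here is a small sketch that assembles `hdfs dfs` command lines from Python. The `hdfs_dfs` and `run` helpers are our own, and the paths and file name are hypothetical; actually running the commands requires a configured Hadoop client on the machine.

```python
import subprocess
from typing import List


def hdfs_dfs(action: str, *args: str) -> List[str]:
    """Build an `hdfs dfs -<action> ...` argument list without running it."""
    return ["hdfs", "dfs", f"-{action}", *args]


def run(argv: List[str]) -> str:
    """Run a command and return its standard output (needs a Hadoop client installed)."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # /user/demo and report.csv are made-up names for illustration.
    commands = [
        hdfs_dfs("mkdir", "-p", "/user/demo"),         # create a directory
        hdfs_dfs("put", "report.csv", "/user/demo/"),  # copy a local file into HDFS
        hdfs_dfs("ls", "/user/demo"),                  # list the directory
    ]
    for argv in commands:
        print(" ".join(argv))
```

The sketch only prints the commands; passing one of the lists to `run` would execute it on a machine where the `hdfs` client is available.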

Types of Nodes in Hadoop

1. NameNode:

The NameNode is the main node of HDFS and is also called the master. It stores metadata in RAM for quick access and tracks files across the Hadoop cluster. If the NameNode fails, the whole of HDFS becomes inaccessible, so the NameNode is critical for HDFS. The NameNode monitors the health of the DataNodes and it accesses […]

Top 20 Hadoop Interview Questions and Answers

1. What is Hadoop?

Hadoop is an open source framework used to store large data sets. Hadoop provides data storage, data access, data processing, and security operations. Many organizations use Hadoop for storage because it can store large amounts of data quickly.

2. What are the Main Components […]

Ten Amazing Big Data Myths

Big Data holds great promise for enterprises of all sizes. It can bring insights that help the business drive revenue and also understand gaps in service and products.

Here are some myths about data:

1. Big data is new

Huge cross-references of every single word used in the Bible, called “Concordances”, were in use by scholar monks for centuries […]

Top Ten Differences Between Apache HBase and Hive

S.NO
Apache Hive
Apache HBase

1
Hive is a data warehousing tool used to process data in Hadoop and HDFS. Hive is similar to SQL because it analyzes and processes data with a query language.
Apache HBase is an open source framework and a NoSQL database.

2
Hive runs on MapReduce, on top of Hadoop.
HBase runs on top of HDFS.

3
Main […]