About Raja Priya

So far Raja Priya has created 51 blog entries.

Apache Sqoop Tutorial – Basic Sqoop Import and Export Operations

What is Sqoop?

Sqoop is a tool used to transfer data between relational databases (RDBMS) and HDFS. It imports data from datastores into HDFS and exports data from HDFS back to them. It uses MapReduce to move the data, which lets it handle large volumes. Sqoop works only with relational databases and is an open source tool […]

Apache HDFS Architecture and Components

HDFS stands for Hadoop Distributed File System, and it manages large, high-volume data sets. HDFS stores data in a distributed manner and is the primary storage system of Hadoop. HDFS allows files to be read and written, but files cannot be updated in place once they are stored. When we move a file into HDFS, the file is split into small […]
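
The splitting described above can be pictured with a small Python sketch. It is only an illustration, not Hadoop code: the function name `split_into_blocks` is hypothetical, and the 128 MB block size is an assumption based on the default in Hadoop 2.x and later (a cluster may configure a different value through `dfs.blocksize`).

```python
# Assumption: 128 MB block size, the default in Hadoop 2.x and later.
# A real cluster may use a different value (property dfs.blocksize).
BLOCK_SIZE = 128 * 1024 * 1024


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes in bytes of the blocks a file would be split into.

    Hypothetical helper for illustration; it is not part of any Hadoop API.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only holds the leftover bytes.
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    mb = 1024 * 1024
    for size in split_into_blocks(300 * mb):
        print(size // mb, "MB")
```

Under these assumptions, a 300 MB file occupies two full 128 MB blocks plus one final 44 MB block.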

Basic Hadoop Shell Commands to Manage HDFS

Hadoop is a Java-based framework that typically runs on Linux-based systems, so shell commands are important for working with it. Hadoop commands are used to access all the files and directories of an HDFS cluster. Here we discuss basic and important Hadoop commands for managing HDFS.

Basic Commands:

1. mkdir:

mkdir is used to create […]

Types of Nodes in Hadoop

1. NameNode:

NameNode is the main, central node of HDFS and is also called the master. It stores the metadata in RAM for quick access and tracks the files across the Hadoop cluster. If the NameNode fails, the whole of HDFS becomes inaccessible, so the NameNode is critical for HDFS. The NameNode tracks the health of the DataNodes and it accesses […]

Top 20 Hadoop Interview Questions and Answers

1. What is Hadoop?

Hadoop is an open source framework used to store and process large data sets. Hadoop provides data storage, data access, data processing, and security operations. Many organizations use Hadoop for storage because it can store large amounts of data quickly.

2. What are the Main Components […]

Ten Amazing Big Data Myths

Big Data holds great promise for enterprises of all sizes. It can bring insights that help the business drive revenue and also understand gaps in service and products.


Here are some myths about data:

1. Big data is new

Huge cross-references of every single word used in the Bible, called “Concordances”, were in use by scholar monks for centuries […]

Top Ten Differences Between Apache HBase and Hive


Apache Hive
Apache HBase

Hive is a data warehousing tool used to process data in Hadoop and HDFS. Hive is similar to SQL because it analyzes and processes data with a query language.
Apache HBase is an open source framework and a NoSQL database.

Hive runs on MapReduce, on top of Hadoop
HBase runs on top of HDFS

Main […]

Apache Hadoop Oozie Tutorial

Oozie is mainly used to manage Hadoop jobs, and it combines multiple jobs in a particular order to accomplish a larger task. It is an open source framework used to chain multiple Hadoop jobs together. Oozie supports MapReduce, Hive, and HDFS jobs. In Oozie, a job workflow is based on a Directed Acyclic […]
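
The idea of running several jobs in a particular order can be sketched in Python. This is not how Oozie itself is configured (Oozie workflows are written as XML definitions); it only illustrates ordering by dependencies, using the standard library's `graphlib` module (Python 3.9+). The job names and the `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "import_data": set(),
    "clean_data": {"import_data"},
    "hive_report": {"clean_data"},
    "archive_raw": {"import_data"},
}


def execution_order(dag):
    """Return one valid run order in which every job follows its dependencies.

    Raises graphlib.CycleError (a ValueError subclass) if the jobs
    depend on each other in a cycle.
    """
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    for job in execution_order(workflow):
        print(job)
```

Because the dependencies contain no cycle, a valid order always exists; jobs that do not depend on each other, such as `clean_data` and `archive_raw` here, may appear in either order.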

Apache Spark Tutorial

What is Apache Spark?

Spark is also an open source framework, mainly used for data analytics. Spark runs faster than Hadoop MapReduce and can run on top of Hadoop. Spark does not have its own file system, so it integrates with an external one such as HDFS. A notable feature of Spark is that it does not require YARN to function.

Spark does not have […]

Apache Hadoop Flume Tutorial

What is Apache Flume?
Apache Flume is a tool used to move data from one place to another. Flume is a distributed system that transports data reliably. Flume is an important part of the Hadoop ecosystem. In Apache Flume, each unit of data is treated as one event. It collects log data from various web servers to […]