
Big Data Tutorial | 7. Introduction to HDFS | Learn Big Data From Scratch

Hello Everyone,
In this tutorial series, we focus on the Big Data ecosystem and its terminology.
In this video we will discuss HDFS, the Hadoop Distributed File System.
What is Big Data?
Big data refers to extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations,
especially relating to human behaviour and interactions.
"Big data” is a frequently heard buzzword in 2012. This year, IBM teamed with the University of Oxford to help organizations
look beyond the big data hype and gain a deeper view into how their peers are defining and tackling big data today to improve business performance.
Currently we are living in the era of huge data generation in which everyday enormous data is generating every seconds. According to survey Data
generate in 2010-2015 is triple as data generated into past of 10 years. So right now We have to think about data handleing and data analyis concepts for our
valuable data .

Introduction to Hadoop :-
Apache Hadoop is an open-source software framework for the storage and large-scale processing of data sets on clusters of commodity hardware.
Hadoop is an Apache top-level project being built and used by a global community of contributors and users. It is licensed under the Apache License 2.0.
Hadoop was created by Doug Cutting and Mike Cafarella in 2005. It was originally developed to support distribution for the Nutch search engine project.
Doug, who was working at Yahoo! at the time and is now Chief Architect of Cloudera, named the project after his son's toy elephant. Cutting's son was
2 years old at the time and just beginning to talk. He called his beloved stuffed yellow elephant "Hadoop" (with the stress on the first syllable).
Now 12, Doug's son often exclaims, "Why don't you say my name, and why don't I get royalties? I deserve to be famous for this!"

Idea Behind HDFS :-
Today's big data is 'too big' to store on one single computer, no matter how powerful it is or how much storage it has. This rules out many
storage systems and databases that were built for single machines. So we instead build a system that runs on multiple networked computers.
To the 'outside' world, this file system looks like a single, unified file system.
HDFS is an Apache Software Foundation project and a subproject of the Apache Hadoop project.
Hadoop is ideal for storing large amounts of data, on the order of terabytes and petabytes, and uses HDFS as its storage system.
HDFS lets you connect nodes (commodity personal computers) contained within clusters over which data files are distributed.
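To make the distribution idea concrete, here is a small Python sketch of how HDFS lays a file out across a cluster. It is a toy model, not the real HDFS implementation: the 128 MB block size and 3x replication used below are the common defaults (the `dfs.blocksize` and `dfs.replication` settings), and real clusters may configure them differently.

```python
import math

def hdfs_block_layout(file_size_bytes, block_size=128 * 1024**2, replication=3):
    """Toy model of how HDFS splits a file into fixed-size blocks.

    128 MB blocks and 3x replication are the common Hadoop defaults
    (dfs.blocksize / dfs.replication); actual clusters may differ.
    """
    # The file is chopped into fixed-size blocks; the last block may be partial.
    num_blocks = math.ceil(file_size_bytes / block_size)
    # Each block is copied to `replication` DataNodes, so the raw storage
    # consumed on the cluster is a multiple of the logical file size.
    raw_storage = file_size_bytes * replication
    return num_blocks, raw_storage

# A 1 GiB file becomes 8 blocks of 128 MB, consuming 3 GiB of raw storage.
blocks, raw = hdfs_block_layout(1 * 1024**3)
print(blocks, raw)  # 8 3221225472
```

This is why HDFS scales: the blocks of one file can live on many different nodes, and replication means the loss of any single commodity machine does not lose data.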

Please go to this link for Big Data tutorial :-    • Learn Big Data | Big Data Tutorial for Beg...…
Subscribe to our YouTube channel at    / @databek  
Also please like my facebook page :- www.facebook.com/learnetix/?ref=bookmarks

Please comment below if you have any issue with the technical content, and please share this with others.

~-~~-~~~-~~-~
Please watch: "Apache Pig Tutorial | 1. Introduction to Apache Pig | Hadoop Pig Tutorial For Beginners"
   • Apache Pig Tutorial | 1. Introduction to A...  
~-~~-~~~-~~-~
