Hadoop - Big Data Solutions
Traditional Approach
In this approach, an enterprise has a single computer to store and process big data. For storage, programmers rely on a database vendor of their choice, such as Oracle or IBM. The user interacts with the application, which in turn handles data storage and analysis.
Limitation
This approach works fine for applications that process modest volumes of data, data that standard database servers can accommodate or that stays within the limits of the processor handling it. But when it comes to dealing with huge amounts of scalable data, funneling everything through a single database turns it into a bottleneck.
Google’s Solution
Google solved this problem using an algorithm called MapReduce. This algorithm divides the task into small parts, assigns them to many computers, and collects the results from them; when integrated, these form the result dataset.
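As a concrete sketch of this divide-assign-collect idea, here is the classic word-count example written against the Hadoop MapReduce Java API. The class and variable names (WordCount, TokenizerMapper, IntSumReducer) are illustrative choices, not something defined earlier in this tutorial.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

    // Map phase: runs on many nodes in parallel, each over one split of the
    // input. For every line of text it emits (word, 1) pairs.
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE); // one partial result per occurrence
            }
        }
    }

    // Reduce phase: the framework collects all values emitted for the same
    // word across the cluster and hands them to one reduce() call, which
    // integrates them into a single total.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private final IntWritable total = new IntWritable();

        @Override
        public void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : counts) {
                sum += count.get();
            }
            total.set(sum);
            context.write(word, total); // (word, total occurrences)
        }
    }
}
```

The map step is the "small parts assigned to many computers"; the shuffle-and-reduce step is the "collecting and integrating of results" described above.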
Hadoop
Using the solution provided by Google, Doug Cutting and his team developed an open-source project called Hadoop.
Hadoop runs applications using the MapReduce algorithm, where the data is processed in parallel across many nodes. In short, Hadoop is used to develop applications that can perform complete statistical analysis on huge amounts of data, as the driver sketch below illustrates.
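To complete the picture, a Hadoop application needs a driver that wires the mapper and reducer together and submits the job to the cluster. This minimal sketch assumes the WordCount classes from the previous example; WordCountDriver and the argument handling are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Driver: configures the job and submits it; Hadoop then distributes the
// map and reduce tasks across the cluster and runs them in parallel.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");

        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCount.TokenizerMapper.class);
        job.setCombinerClass(WordCount.IntSumReducer.class); // optional local pre-aggregation
        job.setReducerClass(WordCount.IntSumReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not already exist

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Packaged into a jar, such a job is typically launched with `hadoop jar wordcount.jar WordCountDriver /input /output`, where both paths live in HDFS and the output directory must not already exist.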