Open Access Article

Big Data Platform-A Review

Sunny Kumar

Section: Review Paper, Product Type: Journal Paper
Volume-3, Issue-10, Page no. 84-87, Oct-2015

Online published on Oct 31, 2015

Copyright © Sunny Kumar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Sunny Kumar, “Big Data Platform-A Review,” International Journal of Computer Sciences and Engineering, Vol.3, Issue.10, pp.84-87, 2015.

MLA Style Citation: Sunny Kumar "Big Data Platform-A Review." International Journal of Computer Sciences and Engineering 3.10 (2015): 84-87.

APA Style Citation: Sunny Kumar, (2015). Big Data Platform-A Review. International Journal of Computer Sciences and Engineering, 3(10), 84-87.

BibTex Style Citation:
@article{Kumar_2015,
author = {Sunny Kumar},
title = {Big Data Platform-A Review},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {10 2015},
volume = {3},
Issue = {10},
month = {10},
year = {2015},
issn = {2347-2693},
pages = {84-87},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=710},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=710
TI - Big Data Platform-A Review
T2 - International Journal of Computer Sciences and Engineering
AU - Sunny Kumar
PY - 2015
DA - 2015/10/31
PB - IJCSE, Indore, INDIA
SP - 84-87
IS - 10
VL - 3
SN - 2347-2693
ER -

Abstract

Hadoop is a popular distributed system for analyzing large amounts of data. It is built on distributed computing and combines HDFS (Hadoop Distributed File System) with the MapReduce programming paradigm. Hadoop is highly fault-tolerant because it replicates data across multiple nodes, and it can be deployed on low-cost hardware. Its file system, HDFS, is written in Java and designed for heterogeneous hardware and software. Hadoop is well suited to high volumes of data and to data in varied formats, such as semi-structured and unstructured data. It also provides high-speed access to application data. The Hadoop architecture is cluster based: a cluster consists of racks, which in turn consist of nodes (data nodes, name node) that are, under ideal circumstances, physically separate from one another. In Hadoop, a MapReduce program is used to collect data according to a query. Because Hadoop handles massive amounts of data, job scheduling and data placement must be efficient to achieve good performance. Owing to these features, traditional systems are being replaced by Hadoop. The research objective is to study and explore the various scheduling techniques used to increase performance in Hadoop. This paper covers how Hadoop works, its internal details, and why it is better than traditional systems.
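As a rough illustration of the MapReduce paradigm mentioned in the abstract, the sketch below runs a word count through map, shuffle, and reduce steps in a single Python process. It is not Hadoop code and does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, run_job) and the sample input are assumptions chosen only for this example.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is simulated with an in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input lines, standing in for blocks of a file in HDFS.
    lines = ["hadoop stores data", "hadoop processes data"]
    print(run_job(lines))
```

In an actual Hadoop job the map and reduce steps would run as separate tasks on different nodes, ideally close to the data blocks they read; the single-process version only shows the shape of the computation.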

Keywords / Index Terms

Hadoop, HDFS, Name node, Data node, MapReduce, Data locality, Job Tracker, Task Tracker
