Open Access Article

Improved Analysis of Unstructured Datasets using Thesaurus Model

Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John

Section: Research Paper, Product Type: Journal Paper
Volume 7, Issue 2, Pages 1033-1037, Feb 2019

CrossRef DOI: https://doi.org/10.26438/ijcse/v7i2.10331037

Online published on Feb 28, 2019

Copyright © Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John, “Improved Analysis of Unstructured Datasets using Thesaurus Model,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.2, pp.1033-1037, 2019.

MLA Style Citation: Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John "Improved Analysis of Unstructured Datasets using Thesaurus Model." International Journal of Computer Sciences and Engineering 7.2 (2019): 1033-1037.

APA Style Citation: Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John, (2019). Improved Analysis of Unstructured Datasets using Thesaurus Model. International Journal of Computer Sciences and Engineering, 7(2), 1033-1037.

BibTex Style Citation:
@article{Kurian_2019,
author = {Simy Mary Kurian and Neema George and Jinu P Sainudeen and Neethu Maria John},
title = {Improved Analysis of Unstructured Datasets using Thesaurus Model},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2019},
volume = {7},
number = {2},
month = {2},
year = {2019},
issn = {2347-2693},
pages = {1033-1037},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=5508},
doi = {10.26438/ijcse/v7i2.10331037},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v7i2.10331037
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=5508
TI  - Improved Analysis of Unstructured Datasets using Thesaurus Model
T2  - International Journal of Computer Sciences and Engineering
AU  - Simy Mary Kurian
AU  - Neema George
AU  - Jinu P Sainudeen
AU  - Neethu Maria John
PY  - 2019
DA  - 2019/02/28
PB  - IJCSE, Indore, INDIA
SP  - 1033
EP  - 1037
IS  - 2
VL  - 7
SN  - 2347-2693
ER  -


Abstract

Humankind has stored more than 295 billion gigabytes (295 exabytes) of information since 1986, according to a report by the University of Southern California. Storing and monitoring this data around the clock in widely distributed environments is an enormous task for global service organizations. These datasets require high processing power that conventional databases cannot provide, because the data are stored in an unstructured format. Although the MapReduce paradigm, implemented in Java-based Hadoop, can address this problem, it does not offer maximum functionality. These drawbacks can be overcome with Hadoop Streaming techniques, which allow users to define non-Java executables for processing such datasets. This paper proposes a THESAURUS model that enables a faster and simpler form of business analysis.
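To illustrate the Hadoop Streaming idea mentioned in the abstract, the following is a minimal sketch of a non-Java mapper and reducer written in Python. It is not the THESAURUS model itself, whose details are not given on this page. It assumes the usual streaming convention in which each step reads text lines and emits tab-separated key/value records, and that the reducer receives records already sorted by key. The function names `map_lines` and `reduce_lines`, the word-count task, and the `map`/`reduce` command-line argument are hypothetical choices made only for illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit one tab-separated "word<TAB>1" record per token."""
    for line in lines:
        for word in line.strip().lower().split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reducer step: sum the counts for each key.

    Assumes the records arrive sorted by key, so equal keys are adjacent.
    """
    pairs = (rec.rstrip("\n").split("\t", 1) for rec in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical command-line convention: pass "map" or "reduce" as the only
# argument so that one script can serve as either streaming executable.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("map", "reduce"):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

The two steps can be checked locally, without a cluster, by sorting the mapper output before passing it to the reducer, for example `list(reduce_lines(sorted(map_lines(["big data", "data"]))))`. On a cluster, a script like this would typically be supplied to the Hadoop Streaming jar through its `-mapper` and `-reducer` options; the exact jar location and input/output paths depend on the installation.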

Keywords / Index Terms

Hadoop, MapReduce, HDFS, NoSQL
