Improved Analysis of Unstructured Datasets using Thesaurus Model

Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John

Section: Research Paper, Product Type: Journal Paper
Volume 7, Issue 2, Pages 1033-1037, Feb 2019

CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i2.10331037

Online published on Feb 28, 2019

Copyright © Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper


IEEE Style Citation: Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John, “Improved Analysis of Unstructured Datasets using Thesaurus Model,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.2, pp.1033-1037, 2019.

MLA Style Citation: Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John. "Improved Analysis of Unstructured Datasets using Thesaurus Model." International Journal of Computer Sciences and Engineering 7.2 (2019): 1033-1037.

APA Style Citation: Simy Mary Kurian, Neema George, Jinu P Sainudeen, Neethu Maria John (2019). Improved Analysis of Unstructured Datasets using Thesaurus Model. International Journal of Computer Sciences and Engineering, 7(2), 1033-1037.

BibTex Style Citation:
@article{Kurian_2019,
author = {Simy Mary Kurian and Neema George and Jinu P Sainudeen and Neethu Maria John},
title = {Improved Analysis of Unstructured Datasets using Thesaurus Model},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2019},
volume = {7},
number = {2},
month = {2},
year = {2019},
issn = {2347-2693},
pages = {1033--1037},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=5508},
doi = {10.26438/ijcse/v7i2.10331037},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v7i2.10331037
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=5508
TI  - Improved Analysis of Unstructured Datasets using Thesaurus Model
T2  - International Journal of Computer Sciences and Engineering
AU  - Simy Mary Kurian
AU  - Neema George
AU  - Jinu P Sainudeen
AU  - Neethu Maria John
PY  - 2019
DA  - 2019/02/28
PB  - IJCSE, Indore, INDIA
SP  - 1033
EP  - 1037
IS  - 2
VL  - 7
SN  - 2347-2693
ER  -

Abstract

Humankind has stored more than 295 billion gigabytes (295 exabytes) of data since 1986, according to a report by the University of Southern California. Storing and monitoring this data around the clock in widely distributed environments is a huge task for global service organizations. These datasets require high processing power, which traditional databases cannot offer because the data are stored in an unstructured format. Although the MapReduce paradigm can address this problem through Java-based Hadoop, it does not offer maximum functionality. These drawbacks can be overcome with Hadoop Streaming techniques, which allow users to define non-Java executables for processing such datasets. This paper proposes a THESAURUS model that allows a faster and easier form of business analysis.
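
As a rough illustration of what a non-Java executable for Hadoop Streaming can look like, the following minimal Python sketch reads records from standard input and writes tab-separated key/value lines to standard output, which is the contract Hadoop Streaming expects. It is not the THESAURUS model described in the paper. The token-counting task, the function names `map_lines` and `reduce_lines`, and the `map`/`reduce` command-line argument are assumptions chosen only for illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'token<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for token in line.strip().lower().split():
            yield f"{token}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts of each key.

    Assumes the input is already sorted by key, which Hadoop's shuffle
    phase provides between the map and reduce steps.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: the first argument selects the role.
    steps = {"map": map_lines, "reduce": reduce_lines}
    for record in steps[sys.argv[1]](sys.stdin):
        print(record)
```

A script of this kind would typically be handed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options. It can also be tried locally by piping a text file through the map step, `sort`, and then the reduce step.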

Keywords / Index Terms

Hadoop, MapReduce, HDFS, NoSQL
