Open Access Article

Fast and Effective System for Name Entity Recognition on Big Data

Jigyasa Nigam, Sandeep Sahu

Section: Research Paper, Product Type: Journal Paper
Volume-3, Issue-2, Page no. 31-35, Feb-2015

Online published on Feb 28, 2015

Copyright © Jigyasa Nigam, Sandeep Sahu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Jigyasa Nigam, Sandeep Sahu, “Fast and Effective System for Name Entity Recognition on Big Data,” International Journal of Computer Sciences and Engineering, Vol.3, Issue.2, pp.31-35, 2015.

MLA Style Citation: Jigyasa Nigam, Sandeep Sahu. "Fast and Effective System for Name Entity Recognition on Big Data." International Journal of Computer Sciences and Engineering 3.2 (2015): 31-35.

APA Style Citation: Jigyasa Nigam, Sandeep Sahu (2015). Fast and Effective System for Name Entity Recognition on Big Data. International Journal of Computer Sciences and Engineering, 3(2), 31-35.

BibTex Style Citation:
@article{Nigam_2015,
author = {Jigyasa Nigam and Sandeep Sahu},
title = {Fast and Effective System for Name Entity Recognition on Big Data},
journal = {International Journal of Computer Sciences and Engineering},
volume = {3},
number = {2},
month = feb,
year = {2015},
issn = {2347-2693},
pages = {31--35},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=398},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=398
TI  - Fast and Effective System for Name Entity Recognition on Big Data
T2  - International Journal of Computer Sciences and Engineering
AU  - Nigam, Jigyasa
AU  - Sahu, Sandeep
PY  - 2015
DA  - 2015/02/28
PB  - IJCSE, Indore, INDIA
SP  - 31
EP  - 35
IS  - 2
VL  - 3
SN  - 2347-2693
ER  - 

Abstract

Today most data is stored in digital form, and its volume is very large. The problem is how to manage this big data and extract information from it quickly and efficiently. Information extraction is a text mining technique that pulls the information a user needs out of unstructured text. It relies on Natural Language Processing (NLP) and Named Entity Recognition (NER). NER systems help machines recognize proper nouns (entities), events, relationships, and so on. Several NER systems exist, such as GATE, CRFClassifier, OpenNLP, and Stanford NLP. These systems work fast on a limited number of documents but slow down on large amounts of data. To overcome this drawback, this paper reports the implementation of an NER system based on MapReduce, a distributed programming model. This improvement helps achieve fast extraction and reduced storage cost with better performance.
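The MapReduce formulation mentioned in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are hypothetical, and a toy capitalized-word pattern stands in for a real tagger such as the Maxent Tagger named in the keywords. It only shows how per-document entity extraction (map) and per-entity aggregation (reduce) could be separated so that documents can be processed independently.

```python
import re
from collections import defaultdict
from itertools import chain

# Toy stand-in for a real NER tagger (an assumption, not the paper's model):
# a candidate entity is a run of capitalized words. Sentence-initial words
# would be false positives with this heuristic.
CANDIDATE = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def map_phase(doc_id, text):
    """Map step: emit (entity, 1) for each candidate entity in one document."""
    for match in CANDIDATE.finditer(text):
        yield match.group(0), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(entity, counts):
    """Reduce step: total the occurrences of one entity across all documents."""
    return entity, sum(counts)


def run_job(documents):
    """Run map, shuffle, and reduce in-process over a dict of {doc_id: text}."""
    mapped = chain.from_iterable(
        map_phase(doc_id, text) for doc_id, text in documents.items()
    )
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each call to `map_phase` depends only on its own document, a distributed framework such as Hadoop could run the map step on many nodes in parallel; the in-process `shuffle` here merely imitates the grouping the framework performs.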

Keywords / Index Terms

Distributed computing, Big textual data, Named Entity Recognition (NER), Natural Language Processing (NLP), MapReduce, Hadoop, Maxent Tagger
