INTRODUCTION TO SELECTED TOOLS FOR NATURAL LANGUAGE PROCESSING AND DATA MINING
TRẦN MAI VŨ
VIETNAMESE NLP TOOLS
JVnTextPro: http://sourceforge.net/projects/jvntextpro/
    Sentence Segmentation, Sentence Tokenization, Word Segmentation, POS Tagging
VnToolkit: http://www.loria.fr/~lehong/softwares.php
    A tool for automatically extracting LTAGs(*) from treebanks
    An automatic tagger for Vietnamese texts
    A tokenizer for automatic word segmentation of Vietnamese texts
    A sentence detector for automatically detecting sentences in Vietnamese texts
VLSP Tools: http://vlsp.vietlp.org:8080/demo/?page=resources
    Vietnamese Chunking
(*) LTAG: Lexicalized Tree Adjoining Grammar
NLP TOOLS
LingPipe: http://alias-i.com/lingpipe/
Gate - General Architecture for Text Engineering: http://gate.ac.uk/
Mallet - Machine Learning for Language Toolkit: http://mallet.cs.umass.edu/
MinorThird: http://sourceforge.net/projects/minorthird/
OpenNLP: http://opennlp.sourceforge.net/
PREPROCESSING TOOLS
TextCat - Java Text Categorizing Library: http://textcat.sourceforge.net/
HTML Parser: http://htmlparser.sourceforge.net/
CyberNeko HTML Parser: http://nekohtml.sourceforge.net/
Crawler4J: http://code.google.com/p/crawler4j/
Lucene: http://lucene.apache.org/
OTHER TOOLS
SVM-Light Support Vector Machine: http://svmlight.joachims.org/
CRF: http://crf.sourceforge.net/
Text Clustering Toolkit: http://mlg.ucd.ie/tct
JGibbLDA - A Java Implementation of Latent Dirichlet Allocation (LDA) using Gibbs Sampling for Parameter Estimation and Inference: http://jgibblda.sourceforge.net/
DATA MINING TOOLS
Weka - Machine Learning Software in Java: http://sourceforge.net/projects/weka/
RapidMiner - Data Mining, ETL, OLAP, BI: http://sourceforge.net/projects/yale/
RSES - Rough Set Exploration System: http://logic.mimuw.edu.pl/~rses/
ONTOLOGY TOOLS
The Protégé Ontology Editor and Knowledge Acquisition System: http://protege.stanford.edu/
Jena Semantic Web Framework: http://jena.sourceforge.net/