M.Tech Thesis Presentation
Language Communicator Tool
Submitted To : Sushil Buriya
Presented By : Megha Jain
Banasthali University 15976
About Organization (IIIT-H)
Anusaaraka (LTRC)
Table Of Contents
(1.) Purpose/Motive
(2.) Language Communicator Tool
(3.) ACE Parser
(4.) Methodology
(5.) Conclusion
(6.) References
1.) Purpose/Motive
● A new approach to converting Paninian Hindi sentences (CHL) into English sentences.
● Working on a Machine Translation module for the Hindi-English pair that performs transfer at the semantic level using a precision grammar.
● Generated data for text translated from Hindi to English using one-to-one semantic relationships.
2.) Language Communicator Tool
2.1) Implementation
(1.) Start typing a Hindi sentence to begin the conversion.
(2.) Use the rule-based parser, which accepts rules defined by humans.
(3.) To assist with identifying proper nouns or demarcating the sentences within a complex sentence, select the corresponding options:
(3.1.) Mark Complex Sentences
(3.2.) Tag Proper Nouns
(4.) Tag Word Chunk
(5.) Controlled Hindi Text (CHL)
2.2) CHL To English Text
3.) ACE Parser
● What is ACE ?
● Why ACE ?
● ACE in Detail.
3.1) Parsing And Generation
Parsing With :
(1.) REPP support
(2.) Built-in part-of-speech tagging and unknown word handling
Generation With :
(1.) Optional pre-generation "fixup" rule phase
(2.) Index accessibility filtering
(3.) Optional post-generation token mapping phase
3.2) Command Line Usage
(1.) Parsing (Input is one sentence per line) :
“ace -g grammar.dat [input file] [-1 | -n count]”
“ace -g grammar.dat -ITf [input file]”
(2.) Generating (Input is one MRS per line) :
“ace -g grammar.dat -e [input file] [-1 | -n count]”
(3.) Compiling a grammar :
“ace -G grammar.dat -g path-to-config.tdl”
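The invocation forms listed on this slide can also be assembled in a script. What follows is a minimal sketch, assuming Python 3 and an `ace` binary on the PATH. The helper names `parse_cmd`, `generate_cmd`, and `compile_cmd` are hypothetical, and only flags that appear on this slide (-g, -G, -e, -1, -n) are used.

```python
def _limit(first_only, count):
    """Return the optional result-limit flags: [-1 | -n count]."""
    if first_only:
        return ["-1"]
    if count is not None:
        return ["-n", str(count)]
    return []


def parse_cmd(grammar, input_file=None, first_only=False, count=None):
    """Parsing: ace -g grammar.dat [input file] [-1 | -n count]."""
    cmd = ["ace", "-g", grammar]
    if input_file:
        cmd.append(input_file)
    return cmd + _limit(first_only, count)


def generate_cmd(grammar, input_file=None, first_only=False, count=None):
    """Generating: ace -g grammar.dat -e [input file] [-1 | -n count]."""
    cmd = ["ace", "-g", grammar, "-e"]
    if input_file:
        cmd.append(input_file)
    return cmd + _limit(first_only, count)


def compile_cmd(output_grammar, config_tdl):
    """Compiling: ace -G grammar.dat -g path-to-config.tdl."""
    return ["ace", "-G", output_grammar, "-g", config_tdl]


if __name__ == "__main__":
    # File names here are placeholders, not files from the thesis.
    print(" ".join(parse_cmd("grammar.dat", "sentences.txt", first_only=True)))
    print(" ".join(generate_cmd("grammar.dat", "mrs.txt", count=5)))
    print(" ".join(compile_cmd("grammar.dat", "path-to-config.tdl")))
```

Each helper returns an argument list, so it can be handed to `subprocess.run` unchanged once ACE and a compiled grammar are available.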
3.3) Processing
[Flow diagram; labels: English Sentence, DMRS, MRS, English Sentence]
MRS/DMRS
MRS
(1.) What is MRS ?
(2.) MRS Structure ?
(3.) Example :
DMRS
(1.) What is DMRS ?
(2.) DMRS Structure ?
(3.) Example :
MRS Illustration By An Example
DMRS Illustration By An Example
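As a small illustration of the DMRS idea (predicate nodes joined by labelled links), the sketch below encodes the sentence "the dog barks" with ERG-style predicate names. The sentence, the node ids, and the names `Node`, `Link`, and `describe` are illustrative assumptions, not material from the slides or from pydelphin.

```python
from collections import namedtuple

# A DMRS is a graph: nodes carry predicates, links carry a role and a
# scopal "post" label (for example ARG1/NEQ or RSTR/H).
Node = namedtuple("Node", ["nodeid", "predicate"])
Link = namedtuple("Link", ["start", "end", "role", "post"])

# Hypothetical encoding of "the dog barks".
nodes = [
    Node(10000, "_the_q"),
    Node(10001, "_dog_n_1"),
    Node(10002, "_bark_v_1"),
]
links = [
    Link(10000, 10001, "RSTR", "H"),   # the quantifier restricts "dog"
    Link(10002, 10001, "ARG1", "NEQ"),  # "dog" is the first argument of "bark"
]


def describe(nodes, links):
    """Render every link with predicate names instead of node ids."""
    names = {n.nodeid: n.predicate for n in nodes}
    return [
        "{} -{}/{}-> {}".format(names[l.start], l.role, l.post, names[l.end])
        for l in links
    ]


if __name__ == "__main__":
    for line in describe(nodes, links):
        print(line)
```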
4.) Methodology
(1.) Preprocessing
● Input File Format (CSV)
● Template
● Dictionary
(2.) Implementation
● Shell File
● Automate CHL to User CSV
● Automate User CSV To Developer CSV
4.1) Preprocessing
(1.) Input File Format :
● What is CSV ?
● Why CSV ?
● Types Of CSV In Our Tool ?
4.1.1) User CSV
4.1.2) Developer CSV
4.1.2) Template
Some Examples Of Handled Sentence Formats :
Adjective Imperative Where
Adverb Imperative Preposition Passive
Cardinal Mass Noun Reflexive Pronoun
Causative Modal verb Verb Nominalization
Compound_noun Negation When
How Noun Conjunction What (and more)
4.1.2) Template Samples
What :
4.1.2) Template Samples
Reflexive Pronoun :
4.1.3) Dictionary
(1.) Chl_rel_prep_mapping
(2.) Concept_dictionary
(3.) Default_prep_dictionary
(4.) Karaka-rel_verb
(5.) Link_list
(6.) Noun_single_sense
(7.) Pronoun-lemma
(8.) Sense_info
(9.) tam_mapping
4.2) Implementation
(1.) Prerequisites:
1. pydelphin needs to be installed in $HOME/
2. ACE parser version 0.9.24 or higher in $HOME/
(2.) Set the pydelphin path in bashrc:
1. export PYDELPHIN=$HOME/pydelphin
2. source ~/.bashrc
(3.) Run:
sh run.sh <chl_input>
Ex: sh run.sh sleep.csv
Generate a sentence by changing the DMRS manually :
1. Run the above shell file.
2. If the sentence is not generated, manually modify <file_name>.csv_new_dmrs.txt, which is present in the output directory.
3. Now run:
sh run_mod_dmrs.sh <file_name>.csv
(Note: If the above modification is done correctly, the sentence is generated.)
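The manual-fix loop can be summarised in a short helper. This is a sketch only: the function names are hypothetical, and the directory name `output` is an assumption, since the slides say only that the file is present in the output directory.

```python
import os


def manual_dmrs_path(csv_name, output_dir="output"):
    """Path of the DMRS file to edit by hand: <file_name>.csv_new_dmrs.txt."""
    # output_dir is an assumed default, not a name given in the slides.
    return os.path.join(output_dir, csv_name + "_new_dmrs.txt")


def rerun_command(csv_name):
    """Command that regenerates the sentence from the edited DMRS."""
    return ["sh", "run_mod_dmrs.sh", csv_name]


if __name__ == "__main__":
    name = "sleep.csv"
    print("Edit:", manual_dmrs_path(name))
    print("Then run:", " ".join(rerun_command(name)))
```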
4.2.1) Shell Script
The shell script runs several code modules, described briefly below :
(1.) get_chl_rels_info_into_facts : Maps CHL relations to facts.
(2.) insert_quant : Inserts a quantifier before a noun.
(3.) get_node_nd_link : Generates the nodes and links of the DMRS.
(4.) separate_node_nd_link : Separates nodes from links.
(5.) replace_sense_info_in_dmrs : Changes sense information in the DMRS according to the noun_single_sense_information dictionary.
(6.) tree_to_dict : Converts DMRS into MRS.
(7.) human_readable_dmrs : Appends English lemmas, along with their IDs as defined in the links, to make the DMRS easier to read.
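Three of these steps (separating nodes from links, inserting a quantifier, and producing a readable listing) can be sketched on a toy graph. This is not the thesis code: the record layout, the function names, the use of `udef_q` as the default quantifier, and the `_n_` test for noun predicates are all assumptions made for illustration.

```python
def separate(records):
    """Split mixed records into a node table and a link list."""
    nodes, links = {}, []
    for rec in records:
        if rec[0] == "node":        # ("node", id, predicate)
            nodes[rec[1]] = rec[2]
        elif rec[0] == "link":      # ("link", start, end, label)
            links.append((rec[1], rec[2], rec[3]))
    return nodes, links


def insert_quant(nodes, links, quant="udef_q"):
    """Add a quantifier node for every noun node that has none."""
    quantified = {end for (_start, end, label) in links if label == "RSTR/H"}
    new_nodes, new_links = dict(nodes), list(links)
    next_id = max(nodes) + 1
    for node_id in sorted(nodes):
        # Assumption: noun predicates contain "_n_", as in "_boy_n_1".
        if "_n_" in nodes[node_id] and node_id not in quantified:
            new_nodes[next_id] = quant
            new_links.append((next_id, node_id, "RSTR/H"))
            next_id += 1
    return new_nodes, new_links


def human_readable(nodes, links):
    """Show each link with ids and predicate names side by side."""
    return [
        "{}:{} --{}--> {}:{}".format(s, nodes[s], label, e, nodes[e])
        for (s, e, label) in links
    ]


if __name__ == "__main__":
    records = [
        ("node", 1, "_boy_n_1"),
        ("node", 2, "_eat_v_1"),
        ("link", 2, 1, "ARG1/NEQ"),
    ]
    nodes, links = insert_quant(*separate(records))
    for line in human_readable(nodes, links):
        print(line)
```

Keeping nodes in a table keyed by id makes the readable listing a simple lookup per link.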
4.2.2) Automate User To Developer CSV
Run:
(1.) sh convert_user_to_dev_csv.sh <chl_input> > output
(2.) sh run.sh output
Ex:
a.) sh convert_user_to_dev_csv.sh boy_can_eat_rice_with_the_spoon.csv > boy_can_eat_rice_with_the_spoon_dev.csv
b.) sh run.sh boy_can_eat_rice_with_the_spoon_dev.csv
Dictionaries Dealt With :
(1.) pronoun_lemma
(2.) link_list
(3.) concept_dictionary
(4.) tam_mapping
(5.) chl_rel_prep_mapping
4.2.3) Automate CHL To User CSV
● Obtain CHL as output from the Language Communicator Tool.
● Design the user interface.
● The CSV is created automatically according to the user's responses.
5.) Conclusion
At a broad scale, the work can be divided into 3 categories :
a.) Manual creation of user CSV and developer CSV files.
b.) Automate CHL to user CSV.
c.) Automate user CSV to developer CSV.
Tools : ACE-0.9.24, HPSG-LOGON, Python 2.7
Future Vision : Along with Hindi, processing of Japanese is also introduced.
6.) References
1.) Ann Copestake, et al., “Minimal Recursion Semantics: An Introduction”, Research on Language and Computation, pp. 281–332, Jan. 2005
2.) Ann Copestake, et al., “Resources for Building Applications with Dependency Minimal Recursion Semantics”, spring, March 2006
3.) www.iiit.ac.in/
4.) anusaaraka.iiit.ac.in/
5.) https://github.com/delph-in/pydelphin/create-hindi-parser
6.) www.oxfordlearnersdictionaries.com/
7.) erg.delph-in.net
8.) http://sweaglesw.org/linguistics/ace/