Dependency parsing and machine learning approach for transfer grammar component in Malayalam – English MT system

Material type: Text
Dissertation note: Master of Philosophy in Computer Science 2014-2015 INT
Summary: Machine translation has become important for communication among people around the world who speak different native languages. It is one of the research areas within computational linguistics, and various methods have been proposed to automate the translation process. Although many automated translation systems exist, they do not adequately address the challenges of Malayalam-English machine translation. Because Malayalam is highly agglutinative and has free word order, translation is a challenging task. The existing Malayalam-English machine translation system has difficulty tokenizing complex sentences, and its sentence parsing suffers from low performance and accuracy. This work addresses these issues by introducing optimization techniques and a statistical machine learning technique for the entire Transfer Grammar for Malayalam-English translation. It proposes ‘Regular Expressions with Pattern Matching’ to optimize the tokenization process in the Malayalam-English MT system. The concept of ‘Dependency Parsing’ is introduced to generate the phrase tree by finding the relation of the verb to the other tokens in a sentence, which avoids the difficulty of handling Malayalam's free word order. This approach is found to be much more efficient than the simple parser in both accuracy and performance. Finally, the work suggests ‘YamCha’, a statistical machine learning tool, for developing Transfer Grammar components for Malayalam-English machine translation.
Holdings
Item type: Project Reports
Current library: Kerala University of Digital Sciences, Innovation and Technology Knowledge Centre
Status: Not for loan
Barcode: R-632

Master of Philosophy in Computer Science 2014-2015 INT Elizabeth Sherly


