Dependency parsing and machine learning approach for transfer grammar component in Malayalam – English MT system
Material type: Text
Dissertation note: Master of Philosophy in Computer Science 2014-2015 INT
Summary:
Machine translation has become important for communication among people around the world who speak different native languages. It is a research area within computational linguistics, and various methods have been proposed to automate the translation process. Although many automated translation systems exist, they do not adequately address the challenges of Malayalam-English machine translation. Because Malayalam is highly agglutinative and has free word order, translation is a challenging task. The existing Malayalam-English machine translation system has difficulty tokenizing complex sentences and shows low performance and accuracy in parsing sentences. This work addresses these issues by introducing optimization techniques and a statistical machine learning technique for the entire Transfer Grammar for Malayalam-English translation. The work proposes ‘Regular Expressions with Pattern Matching’ to optimize the tokenization process in the Malayalam-English MT system. ‘Dependency Parsing’ is introduced to generate the phrase tree by finding the relation of the verb to the other tokens in a sentence, which avoids the difficulty of handling the free word order of Malayalam. This approach is found to be much more efficient than the simple parser in both accuracy and performance. Finally, this work suggests ‘YamCha’, a statistical machine learning tool, for developing Transfer Grammar components for Malayalam-English machine translation.
| Item type | Current library | Call number | Status | Date due | Barcode |
|---|---|---|---|---|---|
| Project Reports | Kerala University of Digital Sciences, Innovation and Technology Knowledge Centre | | Not for loan | | R-632 |
Master of Philosophy in Computer Science 2014-2015 INT Elizabeth Sherly