Posted: 10/08/16
Maoxi Li, an Associate Professor in Computer Science at Jiangxi Normal University, China, gave a leaving talk to ADAPT members titled ‘Extract Domain-specific Paraphrase from Monolingual Corpus for Automatic Evaluation of Machine Translation’.
During his talk, Professor Li highlighted how paraphrases, which restate the meaning of a word, phrase or sentence in other words, play an important role in the automatic evaluation of Machine Translation (MT), where they are used to match words or phrases with the same or similar meaning. Traditional approaches extract general-domain paraphrases from bilingual corpora, which is difficult when the translations being evaluated come from specific domains. The WMT16 metrics task consists of machine translations in three different domains, and little relevant domain-specific bilingual parallel data is available for paraphrase extraction. Professor Li therefore proposes extracting domain-specific paraphrase tables from monolingual corpora to replace the general paraphrase table.
Speaking about the research Professor Li said: “We utilize the M-L approach to filter the large scale general monolingual corpus into a domain-specific sub-corpus, and exploit Markov network model to extract paraphrase tables from the sub-corpus.
“The submission result using the proposed approach for WMT’16 Metrics Task was the best for to-English system-level metric scores with human assessment of relative ranking, and was the best for fi-en and tr-en system-level metric scores with 10K hybrid systems.”
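For readers curious about the filtering step, the following is a minimal sketch only, not Professor Li's implementation. It assumes that "M-L approach" refers to Moore-Lewis cross-entropy difference selection, and it substitutes simple add-one-smoothed unigram models for the n-gram language models such a system would normally use. The function names and toy sentences are hypothetical, and the Markov network paraphrase extraction step is not shown.

```python
import math
from collections import Counter


def unigram_model(sentences):
    """Return a function giving the add-one-smoothed log-probability of a token."""
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen tokens

    def logprob(tok):
        return math.log((counts[tok] + 1) / (total + vocab))

    return logprob


def cross_entropy(sentence, logprob):
    """Per-token cross-entropy of a sentence under a unigram model."""
    toks = sentence.split()
    return -sum(logprob(t) for t in toks) / max(len(toks), 1)


def select_in_domain(general, in_domain_seed, keep):
    """Rank general-corpus sentences by cross-entropy difference
    (in-domain minus general; lower means more domain-like)
    and return the `keep` best-scoring sentences."""
    lp_in = unigram_model(in_domain_seed)
    lp_gen = unigram_model(general)
    scored = sorted(
        (cross_entropy(s, lp_in) - cross_entropy(s, lp_gen), s) for s in general
    )
    return [s for _, s in scored[:keep]]


# Hypothetical toy data: a small in-domain seed and a mixed general corpus.
seed = ["the patient received the drug", "the drug dose was reduced"]
general = [
    "the patient was given a drug dose",
    "the football match ended in a draw",
    "stock markets fell sharply today",
]
print(select_in_domain(general, seed, keep=1))
```

The sentences kept by this ranking would form the domain-specific sub-corpus from which paraphrases are then extracted.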
Maoxi Li received his PhD in Pattern Recognition from the Institute of Automation of the Chinese Academy of Sciences in 2011. His research focuses on machine translation and natural language processing. He has published papers in leading journals such as “ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)”, and at top-level conferences such as ACL.