LinkedIn, the world’s largest professional network, and the ADAPT SFI Research Centre for AI-Driven Digital Content Technology at Dublin City University (DCU) have collaborated on a technology initiative with the potential to increase access to multilingual content for LinkedIn members.
Responding to the need for accessible content, LinkedIn’s engineering team in Ireland, along with ADAPT researchers, developed a machine translation (MT) system that translates specific categories of information into specific target languages. This will improve the experience of LinkedIn members across the globe searching for information on products in different languages.
The need for accessible multilingual content has grown significantly in recent years as LinkedIn’s global membership has expanded to over 900 million members in more than 200 countries and territories worldwide. However, building MT systems for a specific domain is challenging, as it requires a large and accurate parallel corpus, which can be hard to obtain. LinkedIn’s engineering team, working with ADAPT researchers at DCU, investigated whether highly accurate MT systems could enable LinkedIn to translate specific categories of information, such as product descriptions, from English to a target language.
Speaking about the development that was announced in Washington during the Irish Government’s St Patrick’s Day festivities, Declan McKibben, Executive Director of the ADAPT Centre said: “ADAPT is delighted to work with LinkedIn in building this domain-specific MT system. As Ireland’s leading research centre for human-centered AI, ADAPT was the perfect partner for the R&D. Through our collaborative approach we have succeeded in advancing the MT capabilities within LinkedIn, pioneered new technologies to meet the evolving needs of their users, and strengthened the international research network between Ireland and the US.”
Dr Siobhan Roche, Director of Science for the Economy at Science Foundation Ireland (SFI), welcomed the announcement, saying: “As recognised by the Irish Government’s National Strategy for Artificial Intelligence, digitisation is transforming our lives and our economy, presenting opportunities for economic growth and competitive advantage. SFI Research Centres such as ADAPT are a driving force for connecting leading-edge industry with world-class researchers. SFI welcomes this research collaboration with LinkedIn harnessing the power of machine learning and AI, enhancing Ireland’s international reputation for research excellence.”
The domain-specific MT system focuses on the translation of specific categories such as product descriptions. It can help with augmenting existing LinkedIn data, as well as with developing multilingual product classification and recommendation systems at LinkedIn. Typically, LinkedIn has more labelled training data available in English than in other languages.
The new system makes it possible to machine-translate the English labelled data into French and German, creating additional ‘synthetic’ labelled data for the LinkedIn classifier. This has a multiplier effect and allows for the training of larger and more accurate classification models in the given target language.
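The idea of synthetic labelled data can be illustrated with a minimal sketch. It assumes a translation function is available; `toy_translate` below is a hypothetical placeholder that only tags the text with the target language, and the example texts and labels are invented for illustration. None of these names come from LinkedIn or ADAPT.

```python
from typing import Callable, List, Tuple

# A labelled example is a (text, label) pair.
LabelledExample = Tuple[str, str]


def make_synthetic_examples(
    english_examples: List[LabelledExample],
    translate: Callable[[str, str], str],
    target_lang: str,
) -> List[LabelledExample]:
    """Translate each English text and carry its label over unchanged."""
    return [(translate(text, target_lang), label) for text, label in english_examples]


def toy_translate(text: str, target_lang: str) -> str:
    # Hypothetical stand-in for a real MT system: it does not translate,
    # it only marks the text with the requested target language.
    return f"[{target_lang}] {text}"


# Invented English training data for a product classifier.
english_data = [
    ("Cloud accounting software for small teams", "finance"),
    ("Video conferencing platform", "collaboration"),
]

french_data = make_synthetic_examples(english_data, toy_translate, "fr")
german_data = make_synthetic_examples(english_data, toy_translate, "de")
```

In practice `toy_translate` would be replaced by a call to a domain-adapted MT model, so one English labelled set yields one additional labelled set per target language.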
Speaking about the collaboration, Dr Tatiana Habruseva, Staff AI Engineer at LinkedIn, said: “LinkedIn is committed to providing a positive and inclusive experience for everyone, no matter what part of the world they call home. We want our members to be able to access the professional knowledge and content they are looking for in their native language wherever possible, and this collaboration is one step on our journey to achieving this. Between our engineering and AI teams in Ireland and across the globe, we were able to collaborate with the world-class researchers at ADAPT to look at how machine translation can be used to provide greater access to information in multiple languages.”
Professor Andy Way, Deputy Director of the ADAPT Centre and Professor of Computing at DCU said: “Our MT team at DCU are world leaders in the area of AI-assisted translation and have developed a variety of tools powered by artificial intelligence, machine learning, and neural networks for organisations. Building Machine Translation systems for a specific domain requires a sufficiently large and good quality parallel corpus in that domain. However, this is a challenging task because for many domains and language-pairs, there is no parallel data.
“In this collaboration, ADAPT developed English-to-French translation systems for software product descriptions from the LinkedIn website. Moreover, a first-ever parallel test set of product descriptions was created. Several MT systems were built and compared: a baseline system trained on publicly available parallel data from the general domain, and domain-adapted systems trained on specialised data selected using sentence embedding-based corpus filtering and domain-specific sub-corpora extraction. Evaluation results show that the domain-adapted model based on our proposed approaches outperforms the baseline.”
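For readers unfamiliar with embedding-based corpus filtering, the following is a minimal sketch of the general idea, not ADAPT's actual method. It keeps only those sentence pairs from a general-domain parallel corpus whose source side is similar to a small set of in-domain sentences. The embedding here is a toy bag-of-words vector standing in for a real sentence encoder, and the function names, example sentences, and the 0.2 threshold are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(sentence: str) -> Vector:
    # Toy bag-of-words "embedding"; a real system would use a neural
    # sentence encoder here instead.
    return dict(Counter(sentence.lower().split()))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors: List[Vector]) -> Vector:
    """Average a list of sparse vectors into one domain vector."""
    total: Counter = Counter()
    for vec in vectors:
        total.update(vec)  # adds the counts key by key
    return {key: value / len(vectors) for key, value in total.items()}


def filter_parallel_corpus(
    pairs: List[Tuple[str, str]],
    in_domain_sentences: List[str],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Keep (source, target) pairs whose source resembles the domain."""
    domain_vec = centroid([toy_embed(s) for s in in_domain_sentences])
    return [
        (src, tgt)
        for src, tgt in pairs
        if cosine(toy_embed(src), domain_vec) >= threshold
    ]


# Invented in-domain sentences (software product descriptions).
in_domain = [
    "project management software for agile teams",
    "cloud software for invoicing and accounting",
]

# Invented general-domain English-French pairs.
general_pairs = [
    ("accounting software for small business",
     "logiciel de comptabilité pour petites entreprises"),
    ("the parliament adopted the resolution",
     "le parlement a adopté la résolution"),
]

selected = filter_parallel_corpus(general_pairs, in_domain, threshold=0.2)
```

With these toy inputs only the software-related pair is kept, because the parliamentary sentence shares no vocabulary with the in-domain examples. The selected pairs would then serve as training data for a domain-adapted system.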