Dr John P. McCrae

Research Group Leader

I am the leader of the Unit for Linguistic Data at the Data Science Institute of the National University of Ireland Galway. My work has focussed on the intersection of NLP and data science, and I have led the development of the linguistic linked open data cloud, a large-scale integration of many language resources. I am the co-ordinator of the Prêt-à-LLOD project, funded under the European Union's H2020 programme, which aims to make linguistic linked open data ready-to-use. I am also a work package leader in the ELEXIS project on building a new lexicographic infrastructure for Europe. In addition, I am funded by the Irish Research Council under the Laureate program with the Cardamom project, which focuses on the development of comparable deep models for minority and historical languages. Finally, I am a PI in the SFI Insight Centre for Data Analytics and, from 2021, a PI in the SFI ADAPT Centre. I am also a member of the Centre for Applied Linguistics and Multilingualism (CALM) and have active research collaborations with Fidelity Investments and Huawei.

I completed my PhD within three years while still publishing a journal article (with 47 citations) and contributing to the BioCaster system for detecting disease outbreaks by processing texts in East Asian languages. After joining Bielefeld University in 2009, I played a leading role in at least two major scientific breakthroughs. Firstly, the development of the lemon Lexicon Model for Ontologies was a major contribution to the representation of semantics relative to natural language; it is now used by most relevant research groups and was one of the most significant outcomes of the Monnet project, an FP7-funded project. Secondly, building on this work, I have been instrumental in establishing linguistic linked open data as a major research theme, which has been supported by over a dozen workshops and events and was a major theme of the 2016 Language Resources and Evaluation Conference (LREC). This topic led to the Lider project, funded by FP7, which used linguistic linked open data as an enabler for content analytics in enterprise, and for which I played a major role in writing the grant and in implementing the work plan. More recently, my work in linked data has played a pivotal role in obtaining funding for the ELEXIS project (under H2020-INFRAIA), where we will apply linked data technologies to lexicography.

My work has led to over 100 publications. Nearly all of the citations to these publications are for work that did not involve my PhD supervisor, and I have co-authored with over 150 co-authors from institutions around the world.

Research Domains

Artificial Intelligence
Data Analytics
Data Integration and Quality
Data Management
Deep Learning
Digital Humanities
Emotion
FinTech
Knowledge & Data Engineering
Languages
Linguistics
Linked Data / Semantic Web / Knowledge Graphs
Machine Learning
Machine Translation
Natural Language Processing
Standards

Publications by John P. McCrae

MG2P: An Empirical Study Of Multilingual Training for Manx G2P

PUBLICATION:	LDK 2023 - 4th Conference on Language Data and Knowledge
AUTHOR(S):	Shubhanker Banerjee, Bharathi Raja Chakravarthi, John P. McCrae
DATE:	01 September 2023
TYPE:	Conf papers

Documenting the Open Multilingual Wordnet

PUBLICATION:	GWC 2023 - 12th Global WordNet Conference
AUTHOR(S):	Francis Bond, Michael Wayne Goodman, Ewa Rudnicka, Luis Morgado da Costa, Alexandre Rademaker, John P. McCrae
DATE:	01 January 2023
TYPE:	Conf papers

Enriching a terminology for under-resourced languages using knowledge graphs

PUBLICATION:	eLex 2021 - 7th Biennial Conference on Electronic Lexicography
AUTHOR(S):	John P. McCrae, Atul Kumar Ojha, Bharathi Raja Chakravarthi, Ian Kelly, Patricia Buffini, Grace Tang, Eric Paquin, Manuel Locria
DATE:	05 July 2021
TYPE:	Conf papers
