Date:
Part of the Fall 2022 Middle Eastern and North African Studies Colloquium Series
Elsayed Issa, PhD Candidate in Linguistics and MENAS, University of Arizona
Abstract
Dialect Identification (DID) is a special case of Language Identification (LID) that presents specific challenges related to the linguistic similarity between dialects. Even though LID can be considered a well-understood problem, closely related dialects and language varieties still pose significant challenges for automatic recognition. Several workshops (WANLP) and challenges (VarDial, MGB) have contributed to improving identification results by attracting researchers to this topic of study. This talk presents two published papers on Arabic Dialect Identification (ADI). The first paper was published at the WANLP 2021 workshop. It investigates the value of augmenting recurrent neural networks with feature engineering for the Second Nuanced Arabic Dialect Identification (NADI) Subtask 1.2: Country-level DA identification. The performance of a simple word-level LSTM using pretrained embeddings is compared with that of one enhanced with feature embeddings for engineered linguistic features. Results show that the addition of explicit features to the LSTM is detrimental to performance. The second paper presents a full end-to-end pipeline for ADI using intonation patterns and acoustic representations. Results of the experiments show that intonation patterns for Arabic dialects provide sufficient information to achieve state-of-the-art results on the VarDial 17 ADI dataset, outperforming single-feature systems. The authors conjecture on the importance of sufficient information as a criterion for optimality in a deep learning ADI task.
Bio
Elsayed Issa is currently a Ph.D. candidate specializing in Arabic linguistics at the School of Middle Eastern and North African Studies at the University of Arizona. He obtained his M.A. degree in Machine Translation from Alexandria University in Egypt. His thesis involved designing software for translating simple English sentences into their Arabic equivalents. He also obtained an M.S. degree from the Human Language Technology (HLT) program in the Linguistics Department at the University of Arizona. His research interests include phonology, morphology, natural language processing, machine learning, blended learning, and education technology.
This event is hybrid and will be held over Zoom and in person in Marshall 490. Masks are not required but are strongly recommended.
Here is the Zoom meeting link:
To request disability-related accommodations that would ensure your full participation in this event, please contact: jellison@arizona.edu