Universitetet i Agder

Interpretable Architectures and Algorithms for Natural Language Processing

Rohan Kumar Yadav

PhD Candidate

You may follow the disputation online. The registration link for online spectators is at the bottom of this page.

Rohan Kumar Yadav of the Faculty of Engineering and Science at the University of Agder has submitted his thesis entitled «Interpretable Architectures and Algorithms for Natural Language Processing» and will defend the thesis for the PhD degree on Monday 28 November 2022.

He has followed the PhD programme in Engineering and Science at UiA, with specialisation in Information and Communication Technology (ICT).

Summary of the thesis by Rohan Kumar Yadav:

Interpretable Architectures and Algorithms for Natural Language Processing

Natural Language Processing (NLP) is one of the branches of Artificial Intelligence (AI) that teaches computers to understand, process, and generate language.

Some popular NLP tasks include text classification, sentiment analysis, information retrieval, machine translation, and reading comprehension.

Popular – but difficult for humans

To accomplish these tasks, Deep Neural Networks (DNNs) have been the popular choice.

However, the structure of DNNs makes it difficult for humans to interpret and explain these models.

Moreover, the need for explainability in NLP arises when sensitive information has to be evaluated from the model's output. This need is most pressing in the education domain, legal document analysis, and medical diagnosis using Electronic Health Records (EHR).

Hence, we have designed various interpretable architectures and algorithms that are useful for the application of explainable NLP.

In addition, we explore the use of the Tsetlin Machine (TM) in DNNs to improve the explainability of the model while maintaining state-of-the-art performance.

The thesis is divided into two sections:

Interpretable Text Classification using TM and
Interpretable Text Classification using DNNs.

Interpretable Text Classification using TM

Here, we propose several architectures for deriving novel interpretation methods with TM.

To show what the model's interpretation looks like, we study simple text classification, position-dependent text classification, a feature-extended TM, and the robustness of the model.

Interpretable Text Classification using DNNs

Since DNN-based models perform better but are hard to interpret, we simplify complex position-dependent NLP tasks using masking techniques so that it is easier to map the weights directly to the input features.

In addition, we extend this work by encoding the interpretable information from TM into the DNN to fine-tune the weights for better performance and explainability.

In general, we design various interpretable architectures and algorithms using TM and DNNs for various NLP tasks. Our experiments and results demonstrate that each model mentioned above performs on par with or better than the state of the art in interpretable NLP.

Disputation facts:

The trial lecture and the public defence will take place on campus in Auditorium C2 041, Campus Grimstad, and online via the Zoom conferencing app - registration link below.

Associate Professor Morgan Konnestad, Department of Information and Communication Technology, Faculty of Engineering and Science, University of Agder, will chair the disputation.

Trial lecture: Monday 28 November at 10:15

Public defence: Monday 28 November at 12:15

Given topic for trial lecture: «Are AI systems soon ready to replace journalists and authors?»

Thesis Title: «Interpretable Architectures and Algorithms for Natural Language Processing»

Search for the thesis in AURA - Agder University Research Archive, a digital archive of scientific papers, theses and dissertations from the academic staff and students at the University of Agder.

The thesis is available here:

PhD Thesis Rohan Kumar Yadav Interpretable_Architectures_and_Algorithms_for_Natural_Language_Processing_print

The Candidate: Rohan Kumar Yadav (1993, Siraha, Nepal). BE from Anna University, India (2015). ME from Chosun University, South Korea (2019). Present position: Data Scientist at Kobler AS, Oslo, Norway.

Opponents:

First opponent: Senior Assistant Professor Tommaso Caselli, PhD, Faculty of Arts, University of Groningen, the Netherlands

Second opponent: Professor Jim Tørresen, Department of Informatics, University of Oslo, Norway

Associate Professor Nadia Noori, Department of Information and Communication Technology, Faculty of Engineering and Science, University of Agder, is appointed as the administrator for the assessment committee.

Supervisors in the doctoral work were Associate Professor Lei Jiao, University of Agder (main supervisor), and Professor Morten Goodwin, University of Agder (co-supervisor).

What to do as an online audience member:

The disputation is open to the public, but to follow the trial lecture and the public defence online, transmitted via the Zoom conferencing app, you have to register as an audience member on this link:

https://uiano.zoom.us/meeting/register/u5MqcOmsrDosGtO2BIpndc_ejCoveaNGfV74

A Zoom link will be sent to you. (Instructions for using Zoom are available at support.zoom.us if you cannot join by clicking on the link.)

We ask online audience members to join the virtual trial lecture at 10:05 at the earliest and the public defence at 12:05 at the earliest. After these times, you can leave and rejoin the meeting at any time. Further, we ask online audience members to turn off their microphone and camera and keep them turned off throughout the event. You do this at the bottom left of the Zoom window. We recommend you use ‘Speaker view’, which you select at the top right corner of the video window.

Opponent ex auditorio:

The chair invites members of the public to pose questions ex auditorio in the introduction to the public defence. The deadline for announcing questions is the break between the two opponents. The person asking questions should have read the thesis. For the online audience, the contact person's e-mail address is available in the chat function during the public defence, and questions ex auditorio can be submitted to Kristine Evensen Reinfjord at kristine.reinfjord@uia.no