PhD Student, Intelligent Systems Program, University of Pittsburgh
28 Aug 2020 - Arun Balajiee
The general theme of the ISSP 2020 talks was the use of Natural Language Processing (NLP) techniques in biomedical applications.
The first presentation was by Dr. Shandong Wu. Most of the talk covered his work with collaborators on applying AI and computer vision techniques to medical imaging. An interesting point is how large a share of the collected patient data comes from medical imaging: 90% of the data collected, almost 50 petabytes, is contributed by medical imaging devices. Much of Dr. Wu’s collaborative work is with RSNA (the Radiological Society of North America), a prominent radiology organization. Dr. Wu also presented possibilities for applying NLP in medical imaging applications.
From Dr. Wu’s presentation we gather that there are many ways to apply AI in medical imaging. This is a promising field with ample room for new research ideas. It is exciting to look forward to the further work, and the publications, that may come out of his lab.
Dr. Litman’s presentation was more about applying NLP techniques to education technologies, specifically to developing automated assessment systems. Her lab’s work largely involves building education technologies using NLP, using data mining to perform learning and teaching analytics on the systems built with these techniques, laying foundations for further research based on the outcomes of these two steps, and in turn improving the education systems built with NLP. Since my research is also situated in the broader context of her field of expertise, I was excited to understand the implications of her research and of her students’ work. Most of the work in her lab falls into three broad categories: learning the principles of language to build NLP applications, using language to further improve the education systems built with these techniques, and processing language in educational teaching and learning to build support systems.
We think that NLP is extremely important for building robust learning management systems that have the potential to replace existing educational frameworks. Students can benefit from such systems by being assessed automatically as they gain skills in complex fields such as computer science, mathematics and arts. Educational technology is a booming field, and there is no better time than now to work on these applications.
The final talk of the seminar, and by far the most intriguing application of NLP techniques, was in an area where one might least expect it: biology. Dr. Madhavi presented her work on gene analytics using NLP. Gene and protein structures can be represented as strings of characters with their own grammar and syntax. Processing the language of gene structure as string patterns could help in building causal models for diseases such as cancer, and could support medical applications that rely on early diagnosis to mitigate the disease. What enthralled us the most is that gene structures can be analysed with simple n-gram techniques. Something as simple as n-gram analysis of strings, used to develop the semantics and syntax of gene structure, has large practical implications.
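As a rough illustration of the n-gram idea, and not a reconstruction of the method presented in the talk, the sketch below counts overlapping n-grams in a sequence written as a string of one-letter codes. The function name `ngram_counts` and the toy sequence are made up for this example.

```python
from collections import Counter


def ngram_counts(sequence: str, n: int) -> Counter:
    """Count every overlapping n-gram (substring of length n) in a sequence."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Slide a window of width n across the string, one character at a time.
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


# Hypothetical amino-acid string in one-letter codes, invented for illustration.
toy_sequence = "MKTAYIAKQRMKTAYG"

bigrams = ngram_counts(toy_sequence, 2)
trigrams = ngram_counts(toy_sequence, 3)

# Show the most frequent patterns of each length.
print(bigrams.most_common(3))
print(trigrams.most_common(3))
```

Frequency tables like these treat recurring substrings the way word n-grams are treated in text, which is what makes such a simple technique a plausible starting point for finding patterns in sequence data.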
Overall, her research deals with using NLP to predict undesirable outcomes early, annotating different gene structures and building a large database of models for prediction, analyzing the possible structures and patterns in gene protein data, and engaging with medical communities to apply the analysis in systems with practical benefits. Clearly, this field is here to stay and has large implications. We are excited to be in the know about the latest developments from her lab.