Arun Balajiee

PhD Student, Intelligent Systems Program, University of Pittsburgh

Language probes with V-Information Estimators

09 Sep 2020 - Arun Balajiee

Talk Speaker: John Hewitt, PhD Student, Stanford University

Talk Date: 09/09/2020

The idea behind language probes is to examine the individual layers of a model and see what information about the input data each one encodes. This helps in analysing how much information a layer simply carries forward from its input and how much it adds beyond what was already known at the input.
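To make the probing idea concrete, here is a minimal sketch in Python. It is not the speaker's method: the two "layers" are synthetic stand-ins (one made of pure noise, one in which a single dimension linearly encodes the label), and every name in it (`make_layer_data`, `train_linear_probe`, `probe_layer`) is hypothetical. The probe is a small logistic regression trained on each layer's representations; held-out accuracy serves as a rough signal of how much label information a simple model can read out of that layer.

```python
import math
import random


def make_layer_data(n, informative, rng):
    """Synthetic stand-in for one layer's representations of n inputs."""
    examples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = [rng.gauss(0.0, 1.0) for _ in range(3)]
        if informative:
            # Assumption: one dimension linearly encodes the label, plus noise.
            features[0] = (2 * label - 1) + rng.gauss(0.0, 0.3)
        examples.append((features, label))
    return examples


def score(w, b, x):
    """Linear score of the probe for one representation x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_probe(train, epochs=30, lr=0.1):
    """Fit a logistic-regression probe with plain stochastic gradient descent."""
    w = [0.0] * len(train[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in train:
            # Clamp the score so math.exp cannot overflow.
            z = max(-30.0, min(30.0, score(w, b, x)))
            grad = 1.0 / (1.0 + math.exp(-z)) - y
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def probe_accuracy(w, b, test):
    """Fraction of held-out examples the probe classifies correctly."""
    correct = sum(1 for x, y in test if (score(w, b, x) > 0) == (y == 1))
    return correct / len(test)


def probe_layer(informative, rng):
    """Train a probe on one synthetic layer and report held-out accuracy."""
    train = make_layer_data(400, informative, rng)
    test = make_layer_data(200, informative, rng)
    w, b = train_linear_probe(train)
    return probe_accuracy(w, b, test)


rng = random.Random(0)
acc_noise = probe_layer(False, rng)
acc_informative = probe_layer(True, rng)
print(f"probe accuracy, uninformative layer: {acc_noise:.2f}")
print(f"probe accuracy, informative layer:   {acc_informative:.2f}")
```

With real models, the synthetic data would be replaced by activations extracted from each layer, and the gap in probe accuracy between layers would hint at what each layer adds.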

The talk gave sound mathematical proofs that the Shannon information transmitted through V-estimators (a subset of the functions of all possible models that can be used in NLP) increases with every passing layer, and that the information encoded at each layer is always slightly different from that at the other layers.
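For reference, this is how V-information is usually defined in the literature on usable information. The notation below is an assumption and may differ from what the speaker used. Here $\mathcal{V}$ is the restricted family of predictive functions, and $f[x](y)$ is the probability that $f$ assigns to $y$ given side information $x$ (or given nothing, written $\varnothing$).

```latex
\begin{align}
H_{\mathcal{V}}(Y \mid X) &= \inf_{f \in \mathcal{V}} \mathbb{E}_{x, y}\left[ -\log f[x](y) \right] \\
H_{\mathcal{V}}(Y \mid \varnothing) &= \inf_{f \in \mathcal{V}} \mathbb{E}_{y}\left[ -\log f[\varnothing](y) \right] \\
I_{\mathcal{V}}(X \to Y) &= H_{\mathcal{V}}(Y \mid \varnothing) - H_{\mathcal{V}}(Y \mid X)
\end{align}
```

When $\mathcal{V}$ contains all functions, this quantity reduces to Shannon mutual information. With a restricted $\mathcal{V}$, it measures only the information that models in the family can actually use.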

From a mathematical and machine learning perspective this may be an interesting finding, but what is more fascinating is its possible ramifications for interpretability in NLP. When we know what information from the input is encoded while building the model, and which section of the input data affects the model's overall outcome, we can better control those parameters to get more desirable results, along with a better understanding of those results. For example, in a use case such as automated grading tools, if we have a 10-layer neural network, then, to put it simply, language probing would let us notice the different skills or aspects of grading that the system considers in each layer and that differ from the previous layer. When the system outputs a score, we would know more about the student's different skill levels and could classify the student's performance based on this information.

This was definitely among the most interesting talks I have attended. The most engaging aspect was that the speaker broke the ideas down into small bits of information that could be digested by an audience of any knowledge level. While there is a lot of research backing these theories, the audience did not feel burdened by the weight of the topic and could still effortlessly grasp the fundamentals. This way, we as an audience could soon ask the same questions that John asked himself as a PhD student before working on this idea and publishing papers at ACL 2020. This style of explanation and these presentation skills are very commendable, and I got to learn from them live and in practice. I will let them be absorbed into my own style as well. Of course, I need to do the same level of work that he has done and go to the depths of the vast ocean of knowledge to be able to do what he did today! :)