PhD Student, Intelligent Systems Program, University of Pittsburgh
16 Apr 2021 - Arun Balajiee
In this talk, Sanya presented the steps for constructing and using a knowledge graph to understand natural product-drug interactions. Biomedical knowledge graphs are constructed using a hierarchical representation of a domain ontology. The methods involved in the process are Data Integration, Standardization of Terminology, Machine Reading, and Hypothesis Generation and Evaluation. ChEBI (Chemical Entities of Biological Interest) is a knowledge representation used for this. In the current work, Sanya has implemented the pipeline through the machine reading step, up to hypothesis generation using knowledge graphs. The ideas for future work are to include information for over 600 natural products of interest and to improve machine reading for better entity recognition, along with a few other research goals.
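To make the idea of hypothesis generation over a knowledge graph concrete, here is a minimal sketch. It is not the pipeline from the talk: the triples, entity names (`natural_product_A`, `enzyme_X`, `drug_B`) and the helper functions `build_graph` and `find_paths` are all hypothetical placeholders. It assumes a candidate interaction can be read off as a short path of relations connecting a natural product to a drug.

```python
from collections import defaultdict, deque

# Hypothetical (subject, predicate, object) triples. These are placeholder
# names for illustration only, not findings from the talk or from ChEBI.
TRIPLES = [
    ("natural_product_A", "inhibits", "enzyme_X"),
    ("enzyme_X", "metabolizes", "drug_B"),
    ("natural_product_A", "is_a", "plant_extract"),
    ("drug_C", "binds", "receptor_Y"),
]


def build_graph(triples):
    """Index triples by subject so outgoing edges can be looked up quickly."""
    graph = defaultdict(list)
    for subj, pred, obj in triples:
        graph[subj].append((pred, obj))
    return graph


def find_paths(graph, start, goal, max_hops=3):
    """Breadth-first search for simple paths from start to goal.

    Each returned path is a list of (subject, predicate, object) triples
    with at most max_hops edges.
    """
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if node == goal and path:
            paths.append(path)
            continue
        if len(path) >= max_hops:
            continue
        # Nodes already on this path; skipping them avoids cycles.
        visited = {start} | {obj for _, _, obj in path}
        for pred, obj in graph.get(node, []):
            if obj not in visited:
                queue.append((obj, path + [(node, pred, obj)]))
    return paths


graph = build_graph(TRIPLES)
hypotheses = find_paths(graph, "natural_product_A", "drug_B")
for path in hypotheses:
    print(" -> ".join(f"{s} [{p}] {o}" for s, p, o in path))
```

Each path found this way is only a candidate hypothesis. In the process described above it would still have to go through the evaluation step.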
Tushar’s work focuses on training a classifier to identify biomarkers for detecting carcinogenic beryllium traces from data on beryllium exposure. As a preliminary approach to the discovery of biomarkers using machine learning, Logistic Regression proved useful, achieving high accuracy. Tushar further implemented a t-SNE visualization approach for the disease. Using a K-Nearest Neighbour approach, Tushar found that classification using logistic regression outperforms other models over imputed data.
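A comparison of this kind could be set up roughly as follows. This is only a sketch on synthetic data, not the beryllium dataset or the exact protocol from the talk. The sample size, the 10% missingness rate, mean imputation, and 5-fold cross-validation are all assumptions chosen for illustration. It uses scikit-learn's `SimpleImputer`, `LogisticRegression`, and `KNeighborsClassifier`.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 120 samples, 8 features,
# with the label driven mostly by the first two features.
rng = np.random.default_rng(0)
n_samples, n_features = 120, 8
X = rng.normal(size=(n_samples, n_features))
noise = rng.normal(scale=0.5, size=n_samples)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)

# Remove roughly 10% of entries at random to mimic incomplete measurements.
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.10] = np.nan

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, clf in models.items():
    # Imputation and scaling sit inside the pipeline so they are re-fit
    # on each training fold and never see the held-out fold.
    pipe = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler(), clf)
    fold_scores = cross_val_score(pipe, X_missing, y, cv=5, scoring="accuracy")
    scores[name] = fold_scores.mean()
    print(f"{name}: mean accuracy = {scores[name]:.3f}")
```

Which model comes out ahead depends on the data. The sketch shows the evaluation setup only and does not reproduce the result reported in the talk.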
For future explorations, possible steps include applying Random Forest classifiers for small sample sizes, examining the consistency of different feature selection methods across filter and wrapper approaches, and analyzing the decision-making steps involved in Logistic Regression.