Simona Sorrentino interviews Alessandro Moschitti
In 2015 the University of Trento once again won one of the rare IBM Faculty Awards, given each year to internationally selected research groups that are highly qualified and active in sectors that are strategic for the US computing giant.
The prize, in the field of cognitive computing, was awarded on 18 November 2015 to Alessandro Moschitti, associate professor of the Department of Information Engineering and Computer Science (DISI), at the conference on “Deep Natural Language Processing for Cognitive Dialog Systems”. Professor Moschitti’s research also received awards from IBM in 2010, 2011 and 2013. IBM’s interest in the winning project, “Hybrid Knowledge Retrieval for Dialog and Question Answering”, indicates the future potential of automatic processing of natural language, a technology that is being developed by the University of Trento and by other centers in Trentino, such as Fondazione Bruno Kessler and the CNR.
Professor Moschitti, what exactly is Natural Language Processing or NLP?
NLP is the study of techniques, models, theories and algorithms for the processing of natural language. In practice, it covers the automatic processes that extract a variety of linguistic information from a text, such as the morphology of the words, the syntactic structure of a sentence, or the concepts and entities mentioned, right up to the semantic processing of the entire text.
A more modern and relevant definition could be the following: Natural Language Processing deals with the theories, algorithms and automatic systems that improve the management of simple texts, for example by translating from one language to another, or by extracting information about a person or entity from millions of documents, or by searching for information on the basis of the semantics of the question, rather than simply the individual words.
What are the aims of the technology developed by your group, for which you received the IBM award?
The main aim is to develop intelligent search systems that can retrieve information on the basis of complex questions formulated by users in natural language. These are called Question Answering (QA) systems and, typically, they answer questions that are more complex than those entered into search engines. Search engines are rarely able to provide correct answers to this type of question, or they return the answer buried somewhere within very long documents. Obviously, interpreting complex questions can also be difficult for humans, but humans can ask their interlocutor for clarification or for an example, thus starting a dialogue.
My group won the IBM prize for contributing to building state-of-the-art systems that integrate Question Answering systems with dialogue technology.
Human-machine dialogue is in continual evolution. What are the next frontiers?
Human-machine dialogue is a very active research field, which started several decades ago with early rule-based programs such as ELIZA and with the so-called Expert Systems. Dialogue is so complex that for decades no one was able to model it effectively with machine learning techniques, which use statistical analysis to optimize the probability of providing the correct answer. Unfortunately, even the most recent techniques, such as reinforcement learning, still don’t give satisfactory solutions for complex application domains involving dialogue.
In relation to this, one research direction that my group is also pursuing is the use of hybrid approaches based on techniques for unstructured information search, such as Question Answering, and basic dialogue techniques. This seems to be an interesting direction for overcoming one of the most complex challenges in dialogue systems, that is, their application in a wider range of domains, not limited to simple applications such as responding to specific questions on specific products in call centers.
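As an editorial illustration of the hybrid idea described above, here is a minimal sketch that pairs a toy retrieval step with one basic dialogue move: when no passage matches, or when the best candidates are tied, the system asks for clarification instead of answering. It is not the group's actual system. The passage collection, the stopword list, the word-overlap scoring and the function names `tokens`, `rank` and `respond` are all assumptions made for the example.

```python
import re

# Hypothetical toy collection; it stands in for a real document store.
PASSAGES = [
    "The warranty on the laptop lasts two years.",
    "The warranty on the phone lasts one year.",
    "The laptop battery can be replaced at any service center.",
]

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "on", "of", "is", "at", "can", "be",
             "how", "what", "does", "long", "any"}


def tokens(text):
    """Lowercase the text and return its set of non-stopword words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def rank(question, passages):
    """Score each passage by the number of words it shares with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(p)), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def respond(question, passages=PASSAGES):
    """Return ("answer", passage) or ("clarify", prompt for the user)."""
    ranked = rank(question, passages)
    best_score, best = ranked[0]
    if best_score == 0:
        # Nothing matched: fall back to dialogue.
        return ("clarify", "Could you rephrase the question or give an example?")
    if len(ranked) > 1 and ranked[1][0] == best_score:
        # Several passages match equally well: ask instead of guessing.
        return ("clarify", "Could you be more specific?")
    return ("answer", best)


if __name__ == "__main__":
    print(respond("How long does the warranty last?"))
    print(respond("How long does the laptop warranty last?"))
```

In this sketch the first question is ambiguous between two products, so the system asks for more detail, while the second names the product and gets a single passage back. A real system would replace the word-overlap score with a learned ranking model.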
One of the areas of application of Natural Language Processing is medicine. What support can it provide?
One of the main problems in this area is to make information from the whole medical community available to doctors, and to combine that information into a synthesis that helps them make a correct diagnosis. Often a doctor’s work involves recognizing patterns of symptoms, typically described in medical documents. NLP systems, of which IBM Watson is a forerunner, will be able to respond to questions entered into the system by returning the documents containing the desired patterns and highlighting the sections that the doctor doesn’t know, improving the doctor’s ability to make a diagnosis.
Other support, which at the moment is still futuristic, includes the capacity to respond to a doctor’s question by combining information from different sources in order to complete a creative task, that of finding new treatments or conclusions that are not in the databases.
Which other sectors could be improved or even revolutionized in the future?
The semantic analysis of texts can have a decisive impact on any field of application. In the era of information technology those who are able to access information more quickly and more fully have an advantage over their competitors. This gives a distinct advantage in industrial applications.
The award represents a new stage in the collaboration with the IBM group. What further developments are foreseen?
The collaboration with IBM has been going on for about seven years. It started in the IBM labs in New York, in the days of the famous IBM Watson supercomputer, and it has consolidated since then thanks to the long-term vision of the managers and researchers at IBM Italia, who recognized the great industrial opportunities that NLP technology offers. The University of Trento can be a key component in developing these opportunities.