Team Leads: Paul Buitelaar and Suzanne Little
Data is available across a wide range of modalities: visual data in the form of images and video, language data in the form of text and speech, audio data such as music and sounds, and other sensory data such as smell, taste or touch. Data also often crosses several modalities at once. Video material, for example, typically includes at least image and sound (speech, music), as well as text in the form of subtitles, which are moreover likely to be in a language different from that of the corresponding speech.
Other settings also provide data in multiple modalities. In human sensing, for instance, facial expression data in the form of images can be combined with auditory (speech, sound), haptic (touch) or other sensory data.
The research challenge on Multimodal Data Analysis is therefore concerned with the integration and interpretation of data within and across modalities, as well as human interaction with multimodal data and with the knowledge and insights derived from it. The outcomes of our research will enable improved understanding and modelling of rich data sources in domains such as business, health, environment and education.