A MULTIMODAL COMPLEXITY COMPREHENSION-TIME FRAMEWORK FOR AUTOMATED PRESENTATION SYNTHESIS (WedAmPO1)
Author(s) :
Harini Sridharan (Arizona State University, United States of America)
Ankur Mani (Arizona State University, United States of America)
Hari Sundaram (Arizona State University, United States of America)
Abstract : In this paper, we present a joint multimodal (audio, visual and text) framework to map the informational complexity of the media elements to comprehension time. The problem is important for interactive multimodal presentations. We propose the joint comprehension time to be a function of the Kolmogorov complexity of the media. For audio and images, the complexity is estimated using a lossless universal coding scheme. The text complexity is derived by analyzing the sentence structure. For all three channels, we conduct user studies to map media complexity to comprehension time. To estimate the joint comprehension time, we assume channel independence, which yields a conservative comprehension time estimate. The times for the visual channels (text and images) are deemed additive, and the joint time is then the maximum of the visual and the auditory channel comprehension times. The user studies indicate that the model works very well when compared with fixed-time multimodal presentations.
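
The combination rule described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, zlib is swapped in as a stand-in for the lossless universal coding scheme (the abstract does not name one), and the per-channel comprehension times are taken as given inputs because the complexity-to-time mappings come from the user studies and are not stated here.

```python
import zlib


def complexity_estimate(data: bytes) -> float:
    """Compressed size divided by original size.

    A computable proxy for Kolmogorov complexity. zlib is an assumption
    here; the abstract only says "a lossless universal coding scheme".
    """
    if not data:
        return 0.0
    return len(zlib.compress(data, 9)) / len(data)


def joint_comprehension_time(t_text: float, t_image: float, t_audio: float) -> float:
    """Combine per-channel comprehension times (seconds).

    Visual channels (text and image) are additive; the joint time is the
    maximum of the visual total and the auditory time.
    """
    t_visual = t_text + t_image
    return max(t_visual, t_audio)
```

For example, with hypothetical channel times of 2.0 s (text), 3.0 s (image) and 4.0 s (audio), the visual total of 5.0 s dominates and becomes the joint time; if the audio time were 6.0 s instead, the audio channel would set the joint time.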
