A Multimodal Dialogue System for Conversational Image Editing

02/16/2020

In this paper, we present a multimodal dialogue system for Conversational Image Editing. We formulate the system as a Partially Observed Markov Decision Process (POMDP) and train it with a Deep Q-Network (DQN) and a user simulator. Our evaluation shows that the DQN policy outperforms a rule-based baseline policy, achieving a 90% success rate under high error rates. We also conducted a real user study and analyzed real user behavior.
