Is Relevancy Everything? A Deep Learning Approach to Understand the Coupling of Image and Text

Jingcun Cao, Xiaolin Li, Lingling Zhang

Research output: Contribution to journal > Journal

Abstract

Firms increasingly use a combination of image and text when displaying products or engaging consumers. Existing research has examined the roles of text and image in consumer choice separately, without systematically considering the semantic relationship between them. In this research, we examine the effect of image- and text-based product representation and explore how image-text congruence affects consumer preference. We propose a state-of-the-art deep learning model to measure information congruence between a product's cover image and its text description. Our Two-Branch Neural Networks model incorporates Wide-ResNet-50-2 (WRN) and BERT (Bidirectional Encoder Representations from Transformers) to capture the semantic relationship between image and text. Using individual-level consumption data from an online reading platform, we further examine the impact of image-text congruence on consumer behavior and identify a U-shaped relationship: consumers prefer a product when image-text congruence is either high or low, but not when it is at a middle level. We explore the underlying mechanisms using an online study and find that the effect of high congruence is driven by information fluency, while that of low congruence is due to "surprise-evoked" information elaboration. Our study contributes to the literature on consumer information processing and provides important managerial implications for marketing practitioners and policy makers.
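
The two-branch idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the projection layers, training loss, or similarity measure. The sketch assumes that each branch is a single linear projection into a shared space and that congruence is scored by cosine similarity. The feature sizes (2048 for pooled Wide-ResNet-50-2 features, 768 for a BERT-base sentence vector), the shared dimension, and all function names are hypothetical, and random weights and random features stand in for the trained networks.

```python
import math
import random

IMAGE_DIM = 2048  # assumed: pooled feature size of Wide-ResNet-50-2
TEXT_DIM = 768    # assumed: hidden size of BERT-base
SHARED_DIM = 32   # hypothetical size of the shared embedding space


def make_projection(in_dim, out_dim, rng):
    """Random stand-in for a trained linear branch (out_dim x in_dim)."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(weights, features):
    """Apply one linear branch, then L2-normalize the output vector."""
    out = [sum(w * x for w, x in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(v * v for v in out)) or 1.0  # guard against zero
    return [v / norm for v in out]


def congruence(image_features, text_features, w_image, w_text):
    """Cosine similarity of the two branch outputs, in [-1, 1]."""
    img = project(w_image, image_features)
    txt = project(w_text, text_features)
    return sum(a * b for a, b in zip(img, txt))


if __name__ == "__main__":
    rng = random.Random(0)
    w_image = make_projection(IMAGE_DIM, SHARED_DIM, rng)
    w_text = make_projection(TEXT_DIM, SHARED_DIM, rng)
    # Placeholder features; in practice these would come from the
    # image and text encoders applied to a cover image and description.
    cover = [rng.random() for _ in range(IMAGE_DIM)]
    blurb = [rng.random() for _ in range(TEXT_DIM)]
    print("congruence score:", congruence(cover, blurb, w_image, w_text))
```

With untrained weights the score is meaningless. The sketch only shows how two modality-specific branches can map into one space where a single congruence number is computed per product.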
Original language: English
Journal: Management Science
Publication status: Published - 2021

Keywords

  • multi-media formats
  • image-text congruence
  • visual analytics
  • consumer information processing
  • deep learning

Indexed by

  • FT
  • SSCI
  • ABDC-A*

Cite this

Cao, J., Li, X., & Zhang, L. (2021). Is Relevancy Everything? A Deep Learning Approach to Understand the Coupling of Image and Text. Management Science. https://cpb-us-e2.wpmucdn.com/sites.utdallas.edu/dist/8/1090/files/2022/02/cao-j-is-relevancy-everything-a-deep-learning-approach-to-understand-the-coupling-of-image-and-text.pdf