Analysing the Impact of Images and Text for Predicting Human Creativity Through Encoders
Published in Proceedings of the 11th International Conference on Information and Communication Technologies for Ageing Well and e-Health (ICT4AWE), Volume 1, pp. 15-24, 2025
This paper is publicly available here.
Authors:
Amaia Pikatza-Huerga · Pablo Matanzas de Luis · Miguel Fernandez-de-Retana Uribe · Javier Peña Lasa · Unai Zulaika · Aitor Almeida
Keywords:
Machine Learning · Creativity Assessment · Originality Evaluation · Artistic Expression · Text and Image Analysis · EEG
Abstract:
This study explores the application of multimodal machine learning techniques to evaluate the originality and complexity of drawings. Traditional approaches in creativity assessment have primarily focused on visual analysis, often neglecting the potential insights derived from accompanying textual descriptions. The research assesses four target features, all labelled by expert psychologists: the originality, flexibility and elaboration level of the drawings, and the creativity of their titles. It compares different image encoders and text embeddings to examine the effectiveness and impact of individual and combined modalities. The results indicate that incorporating textual information enhances the predictive accuracy for all features, suggesting that text provides valuable contextual insights that images alone may overlook. This work demonstrates the importance of a multimodal approach in creativity assessment, paving the way for more comprehensive and nuanced evaluations of artistic expression.
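As a rough illustration of what comparing individual and combined modalities can look like, the sketch below fits the same regressor on image embeddings alone, text embeddings alone, and their concatenation, then compares held-out error. Everything in it is an assumption made for illustration: the embeddings are random synthetic vectors rather than outputs of the encoders used in the paper, the target score is simulated, and closed-form ridge regression stands in for whatever predictive model the authors actually used. The names `img_emb`, `txt_emb` and `held_out_mse` are hypothetical.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression; a bias column is appended and,
    for simplicity, regularised together with the other weights."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def ridge_predict(X, w):
    """Apply ridge weights (including the bias term) to new samples."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def held_out_mse(X, y, n_train, lam=1.0):
    """Fit on the first n_train rows and report MSE on the remaining rows."""
    w = ridge_fit(X[:n_train], y[:n_train], lam)
    pred = ridge_predict(X[n_train:], w)
    return float(np.mean((pred - y[n_train:]) ** 2))


# Synthetic stand-ins (assumption): one embedding per drawing and per title.
rng = np.random.default_rng(0)
n, d_img, d_txt, n_train = 400, 16, 8, 300
img_emb = rng.normal(size=(n, d_img))
txt_emb = rng.normal(size=(n, d_txt))

# Simulated expert score that depends on both modalities plus a little noise.
score = (
    img_emb @ rng.normal(size=d_img)
    + txt_emb @ rng.normal(size=d_txt)
    + 0.1 * rng.normal(size=n)
)

# Same model, three feature sets: image only, text only, and early fusion
# by concatenating the two embeddings.
results = {
    "image only": held_out_mse(img_emb, score, n_train),
    "text only": held_out_mse(txt_emb, score, n_train),
    "image + text": held_out_mse(np.hstack([img_emb, txt_emb]), score, n_train),
}

for name, mse in results.items():
    print(f"{name:>12}: MSE = {mse:.3f}")
```

Because the simulated score is built from both embeddings, the fused features fit it best by construction; the sketch only shows the shape of the comparison and says nothing about the size of the effect reported in the paper.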