Abstract
Generative machine learning models have advanced rapidly in recent years and appear poised for further gains. By learning data distributions through unsupervised training and leveraging the power of neural networks, these models have driven breakthroughs in a variety of domains. This paper covers several prominent generative model architectures through the bottom-up construction of an interface for generating illustrated storybooks. The interface uses transfer learning on a transformer-based text generator, together with the Vector Quantized Generative Adversarial Network (VQGAN) coupled with Contrastive Language–Image Pre-training (CLIP) for prompt-driven image generation.
Advisor
Chowdhury, Subhadip
Second Advisor
Bhowmik, Khowshik
Department
Computer Science; Mathematics
Recommended Citation
Mustafa, Ussama, "Exploring the Power of Generative Architectures such as GANs, Transformers, and VQGAN+CLIP through the Construction of an Illustrated Storybook Generator" (2023). Senior Independent Study Theses. Paper 10709.
https://openworks.wooster.edu/independentstudy/10709
Disciplines
Art and Design | Computer Engineering
Keywords
Transformers, GANs, VQGAN, text-to-image synthesis, image generation, text generation
Publication Date
2023
Degree Granted
Bachelor of Arts
Document Type
Senior Independent Study Thesis
© Copyright 2023 Ussama Mustafa