Abstract
This project fine-tunes existing machine translation models to translate code-mixed English and Spanish (Spanglish) into monolingual Spanish. First, it summarizes the cultural and linguistic factors that define Spanglish as a form of oral and written communication distinct from its English and Spanish constituents, with particular reference to the "radically bilingual" adjectival structures of the memoir Killer Crónicas by Susana Chávez-Silverman. Second, it provides background on machine learning concepts as a foundation for analyzing the Helsinki and mBART50 machine translation models, highlighting the architectural and training paradigms that enable them to perform machine translation. Third, it outlines the strategies and GPT concepts used to generate a synthetic parallel corpus, which is then used to fine-tune Helsinki and mBART50 to translate Spanglish adjectival phrases from the selected literature. Finally, it analyzes the results of the fine-tuned models, contextualizing their performance from both a linguistic and a machine intelligence standpoint.
Advisor
Nord, Alex
Second Advisor
Balam, Osmer
Department
Computer Science; Spanish
Recommended Citation
Johnson, Walker A., "Augmented Translation of Spanglish Adjectival Phrases through Fine-Tuning of mBART50 & Helsinki Machine Translation Models" (2025). Senior Independent Study Theses. Paper 11281.
https://openworks.wooster.edu/independentstudy/11281
Disciplines
Artificial Intelligence and Robotics | Computer Sciences | Spanish and Portuguese Language and Literature | Spanish Linguistics
Keywords
Machine translation, Spanglish, machine learning, Helsinki, mBART50, synthetic parallel corpus, fine-tuning, adjectival phrases, linguistics, code-mixing
Publication Date
2025
Degree Granted
Bachelor of Arts
Document Type
Senior Independent Study Thesis
External Link
https://github.com/johnsonwa84/Spanglish-Translation-Senior-IS
© Copyright 2025 Walker A. Johnson