Abstract

This study explores how GPU resources can be used more efficiently for AI workloads running on shared clusters. Many existing schedulers leave GPUs idle or force jobs into long queues. To address this, I developed three scheduling approaches—Hybrid Priority, Predictive Backfill, and Smart Batch—and evaluated them in a simulated multi-node environment. Each scheduler targets a specific improvement: Hybrid Priority balances fairness and efficiency, Predictive Backfill fills upcoming resource gaps, and Smart Batch groups similar jobs to reduce overhead. I compared them against standard baselines such as first-in-first-out (FIFO) and shortest-job-first (SJF) using metrics including utilization, throughput, and wait time. The results show that the new schedulers reduce idle GPU time and improve fairness and overall performance. This work provides a practical framework for testing GPU scheduling strategies and helps guide future designs for efficient resource management in AI systems.
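To illustrate the kind of comparison the study performs, here is a minimal sketch (not the thesis code) of how FIFO and SJF orderings change average wait time on a single GPU; the job durations are hypothetical and all jobs are assumed available at time zero:

```python
from typing import List

def avg_wait(durations: List[int]) -> float:
    """Average wait time when jobs run back-to-back in the given order."""
    elapsed, total_wait = 0, 0.0
    for d in durations:
        total_wait += elapsed  # this job waited for every job before it
        elapsed += d
    return total_wait / len(durations)

jobs = [8, 1, 3, 2]            # hypothetical GPU-job durations (time units)
fifo = avg_wait(jobs)          # run in arrival order
sjf = avg_wait(sorted(jobs))   # run shortest job first
print(fifo, sjf)               # SJF yields a lower average wait here
```

Running short jobs first lowers the average wait but can starve long jobs, which is the fairness-versus-efficiency tension the proposed schedulers aim to balance.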

Advisor

Guarnera, Drew

Department

Computer Science

Keywords

GPU, AI

Publication Date

2025

Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis


© Copyright 2025 Akhmadillo Mamirov