Video segmentation is a type of deep learning algorithm that enables autonomous vehicles to perceive and interpret real-world scenes in real-time [19]. Since video footage comprises multiple static frames, a fast image segmentation algorithm can be utilized for video segmentation. Image segmentation is divided into two categories: instance segmentation and semantic segmentation. Instance segmentation is superior to semantic segmentation because it preserves the 3D spatial location of objects. Furthermore, an instance segmentation algorithm is an object detection algorithm that can generate a pixel-wise mask for each object instance [34]. This paper discusses the fundamental components of two model families, R-CNN and YOLO, as well as the evaluation metrics and a comparison between a version of the R-CNN (Mask R-CNN) model and a version of the YOLO (YOLOv5) model.


Kelvey, Rob

Second Advisor

Taylor, Max


Computer Science; Mathematics



Publication Date


Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis


© Copyright 2023 Minh Duc Dao