Bielski, Adam Jakub (2024). Unsupervised Object Segmentation with Generative Models. (Thesis). Universität Bern, Bern
24bielski_aj.pdf - Thesis. Available under License Creative Commons: Attribution-Noncommercial (CC-BY-NC 4.0).
Abstract
Advances in computer vision have transformed how we interact with technology, driven by significant breakthroughs in scalable deep learning and the availability of large datasets. These technologies now play a crucial role in various applications, from improving user experience through applications like organizing digital photo libraries, to advancing medical diagnostics and treatments. Despite these valuable applications, the creation of annotated datasets remains a significant bottleneck. It is not only costly and labor-intensive but also prone to inaccuracies and human biases. Moreover, it often requires specialized knowledge or careful handling of sensitive information. Among the tasks in computer vision, image segmentation particularly highlights these challenges, with its need for precise pixel-level annotations. This context underscores the need for unsupervised approaches in computer vision, which can leverage the large volumes of unlabeled images produced every day. This thesis introduces several novel methods for learning fully unsupervised object segmentation models using only collections of images. Unlike much prior work, our approaches are effective on complex real-world images and do not rely on any form of annotations, including pre-trained supervised networks, bounding boxes, or class labels. We identify and leverage intrinsic properties of objects – most notably, the cohesive movement of object parts – as powerful signals for driving unsupervised object segmentation. Utilizing innovative generative adversarial models, we employ this principle to either generate segmented objects or directly segment them in a manner that allows for realistic movement within scenes. Our work demonstrates how such generated data can train a segmentation model that effectively generalizes to real-world images.
Furthermore, we introduce a method that, in conjunction with recent advances in self-supervised learning, achieves state-of-the-art results in unsupervised object segmentation. Our methods rely on the effectiveness of Generative Adversarial Networks, which are known to be challenging to train and exhibit mode collapse. We propose a new, more principled GAN loss, whose gradients encourage the generator model to explore missing modes in its distribution, addressing these limitations and enhancing the robustness of generative models.
| Item Type: | Thesis |
|---|---|
| Dissertation Type: | Single |
| Date of Defense: | 24 April 2024 |
| Subjects: | 000 Computer science, knowledge & systems; 500 Science > 510 Mathematics |
| Institute / Center: | 08 Faculty of Science > Institute of Computer Science (INF) |
| Depositing User: | Hammer Igor |
| Date Deposited: | 23 Dec 2025 15:16 |
| Last Modified: | 23 Dec 2025 23:25 |
| URI: | https://boristheses.unibe.ch/id/eprint/7004 |
