Pattern Recognition on Reduced Graphs

Gillioz, Anthony (2024). Pattern Recognition on Reduced Graphs. (Thesis). Universität Bern, Bern

Preview

Text
24gillioz_a.pdf - Thesis
Available under License Creative Commons: Attribution (CC-BY 4.0).
Download (20MB) | Preview

Abstract

In both everyday life and business, a huge amount of data is generated, underscoring the need for researching and developing efficient and accurate methods to automatically process these massive amounts of data. The data generated is often inherently complex, making traditional feature vectors not well suited for data representation. Graphs provide a general and versatile data representation that can be used to represent a wide range of complex systems. In addition, a variety of graph-based pattern recognition algorithms have been proposed in recent years, further strengthening this approach. One of the main challenges hindering the widespread use of graph-based pattern recognition, is the expensive computation time required by most of the graph-based algorithms. Although approximation algorithms have been proposed for various tasks, the computation time remains a challenge, in particular when the graph sizes increase. In this thesis, we aim to address the problem of computation time by reducing the input size of the underlying graphs through an approximation of the data. In this particular case, data approximation refers to reducing the size of a graph while maintaining its essential properties. The major hypothesis of this thesis is as follows. As the input problem is reduced, the overall computation cost of graph-based pattern recognition algorithm should be reduced as well. The crucial question is, of course, what happens to the accuracy of the respective methods when they operate on reduced rather than on the original data. In the present thesis, four graph reduction methods are introduced and thoroughly evaluated in the context of graph-based pattern recognition. The first technique is based on sampling the most important nodes within a graph. To this end, centrality measures to compute scores for each node are used. Next, one can remove a given percentage of the nodes with the lowest scores to generate reduced versions of the graphs. The second method is based on a spectral clustering algorithm of the graphs, which identifies communities within a graph. Those communities are aggregated into super-nodes, and the intercommunity edges are aggregated into super-edges to create reduced graphs. The third method uses a modified graph neural network to learn the importance of each node in the graph from the data. Based on these learned scores, the graph reduction method removes the nodes with the least importance. The fourth method uses a compression-based distance metric. In this approach, graphs are reduced with a compression algorithm and a distance is derived out of this compression. To evaluate the benefits and imitations of the four reduction methods, we perform comprehensive empirical evaluations of all reduction methods on different real-world datasets. In this context, we compare both their classification accuracy and their computational efficiency with the results obtained on the original graphs. A wide variety of graph classifiers are used. In general, the evaluation confirms that even highly reduced graphs maintain satisfactory classification accuracies and can significantly speed up graph-based pattern recognition. Furthermore, we utilize the reduced graphs in various novel graph matching frameworks with the general aim to improve the overall classification accuracy, and we succeed in this endeavor in several scenarios.

Item Type:	Thesis
Dissertation Type:	Single
Date of Defense:	27 May 2024
Subjects:	000 Computer science, knowledge & systems 500 Science > 510 Mathematics
Institute / Center:	08 Faculty of Science
Depositing User:	Sarah Stalder
Date Deposited:	02 Jun 2025 11:22
Last Modified:	02 Jun 2025 22:25
URI:	https://boristheses.unibe.ch/id/eprint/6235

Actions (login required)

View Item