Predicting the generalization gap in neural networks using topological data analysis

Rubén Ballester, Xavier Arnal Clemente, Carles Casacuberta, Meysam Madadi, Ciprian A. Corneanu, Sergio Escalera

Understanding how neural networks generalize to unseen data is crucial for designing more robust and reliable models. In this paper, we study the generalization gap of neural networks using methods from topological data analysis. For this purpose, we compute homological persistence diagrams of weighted graphs constructed from neuron activation correlations after a training phase, aiming to capture patterns that are linked to the generalization capacity of the network. We compare the usefulness of different numerical summaries from persistence diagrams and show that a combination of some of them can accurately predict and partially explain the generalization gap without the need for a test set. Evaluation on two computer vision recognition tasks (CIFAR10 and SVHN) shows competitive generalization gap prediction when compared against state-of-the-art methods.
@article{predicting_generalization_using_tda_ballester,
  title    = {Predicting the generalization gap in neural networks using topological data analysis},
  author   = {Rubén Ballester and Xavier Arnal Clemente and Carles Casacuberta and Meysam Madadi and Ciprian A. Corneanu and Sergio Escalera},
  journal  = {Neurocomputing},
  pages    = {127787},
  year     = {2024},
  issn     = {0925-2312},
  doi      = {10.1016/j.neucom.2024.127787},
  url      = {https://www.sciencedirect.com/science/article/pii/S0925231224005587},
  keywords = {Deep learning, Neural network, Topological data analysis, Generalization gap},
  abstract = {Understanding how neural networks generalize to unseen data is crucial for designing more robust and reliable models. In this paper, we study the generalization gap of neural networks using methods from topological data analysis. For this purpose, we compute homological persistence diagrams of weighted graphs constructed from neuron activation correlations after a training phase, aiming to capture patterns that are linked to the generalization capacity of the network. We compare the usefulness of different numerical summaries from persistence diagrams and show that a combination of some of them can accurately predict and partially explain the generalization gap without the need for a test set. Evaluation on two computer vision recognition tasks (CIFAR10 and SVHN) shows competitive generalization gap prediction when compared against state-of-the-art methods.}
}