(See also https://bit.ly/3F1Kv2F )
Centered Kernel Alignment (CKA) is a similarity metric designed to measure the similarity between feature representations in neural networks[^Kornblith2019].
CKA is based on the Hilbert-Schmidt Independence Criterion (HSIC), which is defined using centered kernels over the features being compared[^Gretton2005]. However, HSIC is not invariant to isotropic scaling, a property required of a similarity metric for representations[^Kornblith2019]. CKA is therefore a normalization of HSIC.
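As a sketch of how the normalization works, here is the linear-kernel special case of CKA from Kornblith et al.: linear HSIC reduces to a squared Frobenius norm of the cross-covariance, and dividing by the two self-similarity terms makes the result invariant to isotropic scaling of either representation.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two feature matrices of shape (n_examples, dim)."""
    # Center each feature (column); CKA is defined on centered representations.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Linear HSIC: squared Frobenius norm of the cross-covariance.
    hsic_xy = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    # Normalization terms: self-similarity of each representation.
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return hsic_xy / (norm_x * norm_y)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
# Rescaling one representation leaves CKA unchanged, unlike raw HSIC.
print(linear_cka(X, X), linear_cka(X, 3.0 * X))  # both 1.0
```

By Cauchy-Schwarz the value always lies in [0, 1], with 1 meaning the (centered) representations span the same similarity structure.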
The attached figure illustrates the intuition behind CKA.
CKA has problems too. Seita et al. argue that CKA has mostly been validated by intuitive tests: compute cases that we believe should be similar and check whether the CKA values are consistent with that intuition. Instead, Seita et al. built a quantitative benchmark[^Seita].
[^Kornblith2019]: http://arxiv.org/abs/1905.00414
[^Gretton2005]: https://link.springer.com/chapter/10.1007%2F11564089_7
[^Seita]: https://bair.berkeley.edu/blog/2021/11/05/similarity/