@DSamuelHodge
Last active February 23, 2025 03:19
Understanding of Deep Learning through Heavy-Tailed Self-Regularization Theory by Charles Martin, PhD.

Acronyms and Notation Tables

Table 1: Definitions of acronyms used in HTSR

| Acronym | Description |
| --- | --- |
| DNN | Deep Neural Network |
| ML | Machine Learning |
| SGD | Stochastic Gradient Descent |
| RMT | Random Matrix Theory |
| MP | Marchenko-Pastur |
| ESD | Empirical Spectral Density |
| PL | Power Law |
| HT | Heavy-Tailed |
| TW | Tracy-Widom (Law) |
| SVD | Singular Value Decomposition |
| FC | Fully Connected (Layer) |
| VC | Vapnik-Chervonenkis (Theory) |
| SMTOG | Statistical Mechanics Theory of Generalization |

Table 2: Definitions of notation used in HTSR

| Notation | Description |
| --- | --- |
| $$\mathbf{W}$$ | DNN layer weight matrix of size $$N \times M$$, with $$N \geq M$$ |
| $$\mathbf{W}_l$$ | DNN layer weight matrix for $$l^{th}$$ layer |
| $$\mathbf{W}_l^e$$ | DNN layer weight matrix for $$l^{th}$$ layer at $$e^{th}$$ epoch |
| $$\mathbf{W}^{rand}$$ | random rectangular matrix, elements from truncated Normal distribution |
| $$\mathbf{W}(\mu)$$ | random rectangular matrix, elements from Pareto distribution |
| $$\mathbf{X} = (1/N)\mathbf{W}^T\mathbf{W}$$ | normalized correlation matrix for layer weight matrix $$\mathbf{W}$$ |
| $$Q = N/M \geq 1$$ | aspect ratio of $$\mathbf{W}$$ |
| $$\nu$$ | singular value of $$\mathbf{W}$$ |
| $$\lambda$$ | eigenvalue of $$\mathbf{X}$$ |
| $$\lambda_{max}$$ | maximum eigenvalue in an ESD |
| $$\lambda^+$$ | eigenvalue at edge of MP Bulk |
| $$\lambda_k$$ | eigenvalue lying outside MP Bulk, $$\lambda^+ < \lambda_k \leq \lambda_{max}$$ |
| $$\rho_{emp}(\lambda)$$ | actual ESD, from some $$\mathbf{W}$$ matrix |
| $$\rho(\lambda)$$ | theoretical ESD, infinite limit |
| $$\rho_N(\lambda)$$ | theoretical ESD, finite $$N$$ size |
| $$\rho(\nu)$$ | theoretical density of singular values, infinite limit |
| $$\sigma^2_{mp}$$ | elementwise variance of $$\mathbf{W}$$, used to define MP distribution |
| $$\sigma^2_{shuf}$$ | elementwise variance of $$\mathbf{W}$$, as measured after random shuffling |
| $$\sigma^2_{bulk}$$ | elementwise variance of $$\mathbf{W}$$, after removing/ignoring all spikes $$\lambda_k > \lambda^+$$ |
| $$\sigma^2_{emp}$$ | elementwise variance of $$\mathbf{W}$$, determined empirically |
| $$\mathcal{R}(\mathbf{W})$$ | Hard Rank, number of non-zero singular values, Eqn. (5) |
| $$\mathcal{S}(\mathbf{W})$$ | Matrix Entropy, as defined on $$\mathbf{W}$$, Eqn. (6) |
| $$\mathcal{R}_s(\mathbf{W})$$ | Stable Rank, measures decay of singular values, Eqn. (7) |
| $$\mathcal{R}_{mp}(\mathbf{W})$$ | MP Soft Rank, applied after and depends on MP fit, Eqn. (11) |
| $$\mathcal{S}(\mathbf{v})$$ | Vector Entropy, as defined on vector $$\mathbf{v}$$ |
| $$\mathcal{L}(\mathbf{v})$$ | Localization Ratio, as defined on vector $$\mathbf{v}$$ |
| $$\mathcal{P}(\mathbf{v})$$ | Participation Ratio, as defined on vector $$\mathbf{v}$$ |
| $$p(x) \sim x^{-1-\mu}$$ | Pareto distribution, parameterized by $$\mu$$ |
| $$p(x) \sim x^{-\alpha}$$ | Pareto distribution, parameterized by $$\alpha$$ |
| $$\rho(\lambda) \sim \lambda^{-(\mu/2+1)}$$ | theoretical relation, for ESD of $$\mathbf{W}(\mu)$$, between $$\alpha$$ and $$\mu$$ (for $$0 < \mu < 4$$) |
| $$\rho_N(\lambda) \sim \lambda^{-(a\mu+b)}$$ | empirical relation, for ESD of $$\mathbf{W}(\mu)$$, between $$\alpha$$ and $$\mu$$ (for $$2 < \mu < 4$$) |
| $$\Delta\lambda = \lVert\lambda - \lambda^+\rVert$$ | empirical uncertainty, due to finite-size effects, in theoretical MP bulk edge |
| $$\Delta$$ | model of perturbations and/or strong correlations in $$\mathbf{W}$$ |
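
To make a few of these symbols concrete, here is a minimal NumPy sketch, not the WeightWatcher implementation. It computes the ESD of $$\mathbf{X}$$, the MP bulk edge $$\lambda^+$$, and three of the rank/entropy metrics for a Gaussian matrix and a Pareto matrix. Several things here are assumptions, since the table only points to equation numbers: the function names are hypothetical, the bulk edges use the standard Marchenko-Pastur formula $$\lambda^{\pm} = \sigma^2 (1 \pm 1/\sqrt{Q})^2$$, Stable Rank is taken as $$\lVert\mathbf{W}\rVert_F^2 / \lVert\mathbf{W}\rVert_2^2$$, Matrix Entropy is the normalized entropy of the squared singular values, and $$\mathbf{W}^{rand}$$ is drawn from a plain (not truncated) Normal for brevity.

```python
import numpy as np


def esd(W):
    """Eigenvalues of X = (1/N) W^T W, for W of shape N x M with N >= M."""
    if W.shape[0] < W.shape[1]:
        W = W.T  # enforce the N >= M convention of Table 2
    N = W.shape[0]
    X = (W.T @ W) / N
    return np.linalg.eigvalsh(X)  # real eigenvalues, ascending order


def mp_bulk_edges(Q, sigma2=1.0):
    """Standard Marchenko-Pastur bulk edges (lambda^-, lambda^+) (assumed form)."""
    lam_minus = sigma2 * (1.0 - 1.0 / np.sqrt(Q)) ** 2
    lam_plus = sigma2 * (1.0 + 1.0 / np.sqrt(Q)) ** 2
    return lam_minus, lam_plus


def hard_rank(W):
    """Number of singular values above NumPy's default numerical tolerance."""
    return int(np.linalg.matrix_rank(W))


def stable_rank(W):
    """Assumed definition: ||W||_F^2 / ||W||_2^2 = sum(nu^2) / max(nu)^2."""
    nu = np.linalg.svd(W, compute_uv=False)  # singular values, descending
    return float(np.sum(nu ** 2) / nu[0] ** 2)


def matrix_entropy(W):
    """Assumed definition: entropy of p_i = nu_i^2 / sum(nu^2), divided by log(hard rank)."""
    nu = np.linalg.svd(W, compute_uv=False)
    p = nu ** 2 / np.sum(nu ** 2)
    p = p[p > 0]  # drop exact zeros so log is defined
    R = hard_rank(W)
    if R <= 1:
        return 0.0
    return float(-np.sum(p * np.log(p)) / np.log(R))


rng = np.random.default_rng(0)
N, M = 1000, 500
Q = N / M

# Stand-in for W^rand: i.i.d. Normal entries (the table specifies truncated Normal).
W_rand = rng.normal(size=(N, M))
# Stand-in for W(mu): NumPy's pareto() samples a Lomax (Pareto II) law,
# whose tail decays like x^(-1-mu).
mu = 1.5
W_mu = rng.pareto(mu, size=(N, M))

lam_rand = esd(W_rand)
_, lam_plus = mp_bulk_edges(Q, sigma2=np.var(W_rand))

print("MP bulk edge lambda^+        :", lam_plus)
print("lambda_max, Gaussian W       :", lam_rand.max())
print("lambda_max, Pareto W(mu)     :", esd(W_mu).max())
print("stable rank, Gaussian W      :", stable_rank(W_rand))
print("stable rank, Pareto W(mu)    :", stable_rank(W_mu))
print("matrix entropy, Gaussian W   :", matrix_entropy(W_rand))
```

For the Gaussian matrix the largest eigenvalue should sit close to $$\lambda^+$$, up to finite-size fluctuations of the kind $$\Delta\lambda$$ describes. For the Pareto matrix a few very large eigenvalues dominate, so the stable rank collapses relative to the Gaussian case.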

Martin, C.H., Peng, T. (S.) & Mahoney, M.W. Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data. Nat Commun 12, 4122 (2021). https://doi.org/10.1038/s41467-021-24025-8

Paper: https://www.nature.com/articles/s41467-021-24025-8

Code: https://github.com/CalculatedContent/WeightWatcher
