NTK (Neural Tangent Kernel)

Category: Foundational theory

Definition

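A kernel function that describes the training dynamics of infinitely wide neural networks. In the infinite-width limit, gradient-descent training with squared loss is equivalent to kernel regression in the RKHS induced by the NTK, and the NTK stays fixed throughout training (a constant kernel).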

$\Theta(x, x') = \mathbb{E}_{\theta_0}\!\left[\nabla_\theta f(x;\theta_0) \cdot \nabla_\theta f(x';\theta_0)\right]$
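As a concrete illustration of the expectation above, here is a minimal sketch for one specific case that the note itself does not fix: a two-layer ReLU network $f(x) = \frac{1}{\sqrt{m}} \sum_j a_j\,\mathrm{relu}(w_j^\top x)$ in NTK parameterization with standard Gaussian initialization. The function names `empirical_ntk` and `analytic_ntk` are hypothetical. For this architecture the expectation has a known closed form (the arc-cosine kernels), and the gradient inner product at a single random initialization should approach it as the width $m$ grows.

```python
import numpy as np

def empirical_ntk(X1, X2, W, a):
    """Gradient inner product of f(x) = a @ relu(W @ x) / sqrt(m) at (W, a)."""
    m = W.shape[0]
    Z1, Z2 = X1 @ W.T, X2 @ W.T                       # pre-activations, (n, m)
    # Gradients w.r.t. output weights a_j: relu(z_j) / sqrt(m)
    k_a = np.maximum(Z1, 0) @ np.maximum(Z2, 0).T
    # Gradients w.r.t. hidden weights w_j: a_j * 1[z_j > 0] * x / sqrt(m)
    k_w = (((Z1 > 0) * a**2) @ (Z2 > 0).T.astype(float)) * (X1 @ X2.T)
    return (k_a + k_w) / m

def analytic_ntk(X1, X2):
    """Expectation over W, a ~ N(0, 1) for this two-layer ReLU case only."""
    n1 = np.linalg.norm(X1, axis=1)[:, None]
    n2 = np.linalg.norm(X2, axis=1)[None, :]
    dot = X1 @ X2.T
    theta = np.arccos(np.clip(dot / (n1 * n2), -1.0, 1.0))
    k_a = n1 * n2 * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
    k_w = dot * (np.pi - theta) / (2 * np.pi)
    return k_a + k_w

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)         # unit-norm inputs (assumption)
K_inf = analytic_ntk(X, X)

errors = {}
for m in (100, 10_000, 200_000):
    W, a = rng.normal(size=(m, 3)), rng.normal(size=m)
    errors[m] = np.abs(empirical_ntk(X, X, W, a) - K_inf).max()
    print(f"width {m:7d}: max |empirical - analytic| = {errors[m]:.4f}")
```

The printed gap should shrink roughly like $1/\sqrt{m}$. That is the sense in which the kernel at initialization becomes deterministic in the infinite-width limit.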

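In the infinite-width limit, under gradient flow on the parameters $\theta$ with squared loss, the network output evolves as: $\frac{\partial f(x;\theta_t)}{\partial t} = -\Theta(x, X)\,(f(X;\theta_t) - y)$, with $X$ and $y$ the training inputs and targets.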

Here $\Theta$ does not change with $t$ during training (the lazy training regime).
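Because $\Theta$ is constant, the dynamics on the training inputs form a linear ODE. Writing $\Theta(X,X)$ for the kernel Gram matrix on the training set, the solution is $f_t(X) - y = e^{-\Theta(X,X)\,t}\,(f_0(X) - y)$, so each eigen-direction of the Gram matrix decays at a rate set by its eigenvalue. The sketch below checks this closed form against a direct Euler integration. The matrix `K` is a random positive-definite stand-in for $\Theta(X,X)$, not a kernel computed from a real network, and the function names are hypothetical.

```python
import numpy as np

def residual_closed_form(K, r0, t):
    """Residual f_t(X) - y under df/dt = -K (f - y) with constant PSD K."""
    lam, U = np.linalg.eigh(K)                 # K = U diag(lam) U^T
    return U @ (np.exp(-lam * t) * (U.T @ r0))

def residual_euler(K, r0, t, steps=20000):
    """Same residual via explicit Euler steps, as an independent check."""
    r, dt = r0.copy(), t / steps
    for _ in range(steps):
        r = r - dt * (K @ r)
    return r

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
K = A @ A.T / 6 + 0.1 * np.eye(6)              # stand-in for Theta(X, X) (assumption)
r0 = rng.normal(size=6)                        # initial residual f_0(X) - y
t = 3.0

r_exact = residual_closed_form(K, r0, t)
r_num = residual_euler(K, r0, t)
print("max difference:", np.abs(r_exact - r_num).max())
print("residual norm at 0 and t:", np.linalg.norm(r0), np.linalg.norm(r_exact))
```

Directions with small eigenvalues converge slowly, which is why the spectrum of $\Theta(X,X)$ governs the convergence rate in this regime.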

Introduced by Jacot et al. (2018); the result applies to infinite-width networks.

In finite-width networks the NTK drifts during training (feature learning), which is a fundamental difference from the infinite-width case.
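The drift can be observed numerically. The following is a rough sketch under assumed settings that are not from the note: a two-layer ReLU network $f(x) = \frac{1}{\sqrt{m}} a^\top \mathrm{relu}(Wx)$ in NTK parameterization, synthetic random data, and full-batch gradient descent on squared loss. The helper names `ntk` and `relative_ntk_drift` are hypothetical. It measures the relative change of the kernel Gram matrix between initialization and the end of training for several widths.

```python
import numpy as np

def ntk(X, W, a):
    """Empirical NTK Gram matrix of f(x) = a @ relu(W @ x) / sqrt(m)."""
    m = W.shape[0]
    Z = X @ W.T
    H, D = np.maximum(Z, 0), (Z > 0).astype(float)
    return (H @ H.T + ((D * a**2) @ D.T) * (X @ X.T)) / m

def relative_ntk_drift(m, X, y, lr=0.02, steps=300, seed=0):
    """Train by gradient descent, then return
    ||Theta_T - Theta_0||_F / ||Theta_0||_F on the training inputs."""
    rng = np.random.default_rng(seed)
    W, a = rng.normal(size=(m, X.shape[1])), rng.normal(size=m)
    K0 = ntk(X, W, a)
    for _ in range(steps):
        Z = X @ W.T
        H, D = np.maximum(Z, 0), (Z > 0).astype(float)
        r = H @ a / np.sqrt(m) - y                     # residual f(X) - y
        grad_a = H.T @ r / np.sqrt(m)
        grad_W = ((r[:, None] * D) * a).T @ X / np.sqrt(m)
        a, W = a - lr * grad_a, W - lr * grad_W
    return np.linalg.norm(ntk(X, W, a) - K0) / np.linalg.norm(K0)

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)          # unit-norm inputs (assumption)
y = rng.normal(size=8)                                 # synthetic targets (assumption)

drift = {m: relative_ntk_drift(m, X, y) for m in (16, 256, 4096)}
for m, d in drift.items():
    print(f"width {m:5d}: relative NTK change {d:.4f}")
```

In this setup the narrow network's kernel should move noticeably while the wide network's kernel stays close to its initial value, in line with the lazy-training picture. The exact numbers depend on the seed, learning rate, and number of steps.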

Used to analyze convergence rates, generalization error, and training dynamics.

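Limitations: the infinite-width assumption is far from practical networks, and the theory cannot explain feature learning in finite-width networks.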

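Today's paper, "Gaussian Comparison Theorem", uses the CGMT (Convex Gaussian Min-max Theorem) framework to give a non-asymptotic proof of the validity of the NTK approximation.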

Jacot et al. (2018), Neural Tangent Kernel: Convergence and Generalization in Neural Networks

Gaussian Comparison Theorem (2026-03-11): proves the validity of the NTK approximation under a Gaussian-mixture data assumption.