What is KAN?

Content

Recently, the Multi-Layer Perceptron (MLP), one of the fundamental building blocks of deep learning today, got an alternative inspired by the Kolmogorov-Arnold representation theorem: the Kolmogorov-Arnold Network (KAN).

Although the authors suggest that KANs are not a drop-in replacement for MLPs, and each architecture shows irreplaceable strengths in specific domains, KANs might change the current state of deep learning, which relies heavily on MLPs.

Here is a comparison between MLPs and KANs:

| Model | MLP | KAN |
| --- | --- | --- |
| Theorem | Universal Approximation Theorem | Kolmogorov-Arnold Representation Theorem |
| Formula (Shallow) | $f(x) \approx \sum_{i=1}^{N(\epsilon)} a_i \, \sigma(w_i \cdot x + b_i)$ | $f(x) = \sum_{q=1}^{2n+1} \Phi_q\left[\sum_{p=1}^{n} \phi_{q,p}(x_p)\right]$ |
| Formula (Deep) | $\mathrm{MLP}(x) = (W_3 \circ \sigma_2 \circ W_2 \circ \sigma_1 \circ W_1)(x)$ | $\mathrm{KAN}(x) = (\Phi_3 \circ \Phi_2 \circ \Phi_1)(x)$ |
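
To make the comparison concrete, here is a minimal NumPy sketch (not the authors' implementation): an MLP layer puts learnable weights on the edges and a fixed activation on the nodes, while a KAN layer puts a learnable univariate function on every edge and only sums on the nodes. The cubic-polynomial edge functions below are an assumption for brevity; the paper parameterizes them as B-splines.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_layer(x, W, b):
    # MLP layer: learnable linear weights on the edges,
    # a fixed nonlinearity (here tanh) on the nodes.
    return np.tanh(W @ x + b)

def kan_layer(x, coeffs):
    # KAN layer: a learnable univariate function phi_{q,p} on every edge,
    # plain summation on the nodes. Each phi here is a toy cubic
    # polynomial (an assumption for brevity; the paper uses B-splines).
    # coeffs has shape (out_dim, in_dim, 4): one coefficient vector per edge.
    powers = np.stack([x**k for k in range(4)], axis=-1)  # (in_dim, 4)
    return np.einsum('qpk,pk->q', coeffs, powers)         # sum over edges p

n = 3                                  # input dimension
x = rng.standard_normal(n)

# Shallow KAN in the spirit of f(x) = sum_q Phi_q( sum_p phi_{q,p}(x_p) ):
inner = 0.1 * rng.standard_normal((2 * n + 1, n, 4))   # the phi_{q,p}
outer = 0.1 * rng.standard_normal((1, 2 * n + 1, 4))   # the Phi_q
f_x = kan_layer(kan_layer(x, inner), outer)            # Phi_2(Phi_1(x))

# MLP counterpart with random weights: (W_2 . sigma_1 . W_1)(x)
W1, b1 = rng.standard_normal((7, n)), np.zeros(7)
W2 = rng.standard_normal((1, 7))
g_x = W2 @ mlp_layer(x, W1, b1)

print(f_x, g_x)
```

Stacking further `kan_layer` calls gives the deep form $\mathrm{KAN}(x) = (\Phi_3 \circ \Phi_2 \circ \Phi_1)(x)$ from the table, just as stacking linear maps and activations gives a deep MLP.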

There are many more computational details in Liu's work; this note is just a brief introduction to KAN. I would like to keep tracking its architecture and the proofs of how KANs could beat the curse of dimensionality. I look forward to further development and generalization of KANs.

References

Ziming Liu et al., "KAN: Kolmogorov-Arnold Networks," arXiv:2404.19756, 2024. https://arxiv.org/abs/2404.19756