What is KAN?

Content

Recently, the Multi-Layer Perceptron (MLP), one of the fundamental building blocks of deep learning today, got an alternative inspired by the Kolmogorov-Arnold representation theorem: the Kolmogorov-Arnold Network (KAN).

Although the authors suggest that KANs are not a drop-in replacement for MLPs, and each architecture shows irreplaceable strengths in specific domains, KANs might change the current state of deep learning, which relies heavily on MLPs.

Here is a comparison between MLPs and KANs:

| Model | MLP | KAN |
| --- | --- | --- |
| Theorem | Universal Approximation Theorem | Kolmogorov-Arnold Representation Theorem |
| Formula (Shallow) | $f(x) \approx \sum_{i=1}^{N(\epsilon)} a_i \, \sigma(w_i \cdot x + b_i)$ | $f(x) = \sum_{q=1}^{2n+1} \Phi_q\left[\sum_{p=1}^{n} \phi_{q,p}(x_p)\right]$ |
| Formula (Deep) | $\mathrm{MLP}(x) = (W_3 \circ \sigma_2 \circ W_2 \circ \sigma_1 \circ W_1)(x)$ | $\mathrm{KAN}(x) = (\Phi_3 \circ \Phi_2 \circ \Phi_1)(x)$ |
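
To make the comparison concrete, here is a minimal NumPy sketch (not the authors' implementation): an MLP layer puts learnable weights on the edges and a fixed activation on the nodes, while a KAN layer puts a learnable univariate function on every edge and only sums on the nodes. The cubic-polynomial edge functions below are an assumption for brevity; the paper parameterizes them as B-splines.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_layer(x, W, b):
    # MLP layer: learnable linear weights on the edges,
    # a fixed nonlinearity (here tanh) on the nodes.
    return np.tanh(W @ x + b)

def kan_layer(x, coeffs):
    # KAN layer: a learnable univariate function phi_{q,p} on every edge,
    # plain summation on the nodes. Each phi here is a toy cubic
    # polynomial (an assumption for brevity; the paper uses B-splines).
    # coeffs has shape (out_dim, in_dim, 4): one coefficient vector per edge.
    powers = np.stack([x**k for k in range(4)], axis=-1)  # (in_dim, 4)
    return np.einsum('qpk,pk->q', coeffs, powers)         # sum over edges p

n = 3                                  # input dimension
x = rng.standard_normal(n)

# Shallow KAN in the spirit of f(x) = sum_q Phi_q( sum_p phi_{q,p}(x_p) ):
inner = 0.1 * rng.standard_normal((2 * n + 1, n, 4))   # the phi_{q,p}
outer = 0.1 * rng.standard_normal((1, 2 * n + 1, 4))   # the Phi_q
f_x = kan_layer(kan_layer(x, inner), outer)            # Phi_2(Phi_1(x))

# MLP counterpart with random weights: (W_2 . sigma_1 . W_1)(x)
W1, b1 = rng.standard_normal((7, n)), np.zeros(7)
W2 = rng.standard_normal((1, 7))
g_x = W2 @ mlp_layer(x, W1, b1)

print(f_x, g_x)
```

Stacking further `kan_layer` calls gives the deep form $\mathrm{KAN}(x) = (\Phi_3 \circ \Phi_2 \circ \Phi_1)(x)$ from the table, just as stacking linear maps and activations gives a deep MLP.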

There are many more computational details in Liu's work; this note is just a brief introduction to KAN. I would like to keep tracking its architecture and the proofs of how KANs could beat the curse of dimensionality. I look forward to further development and generalization of KANs.

References

Ziming Liu et al., "KAN: Kolmogorov-Arnold Networks," arXiv:2404.19756, 2024. https://arxiv.org/abs/2404.19756