Initialization Strategies for Gaussian Mixture Layers
Click here for the updated report
This project is based on "Gaussian mixture layers for neural networks" by Professor Sinho Chewi. Instead of a finite number of neurons, the layer has a continuum of neurons whose parameters are distributed according to a Gaussian mixture, and the layer's output is the expected value of a neuron-like operation under that mixture. It is important to note that, empirically, this architecture does not perform especially well. However, it is an attempt to represent infinitely many parameters, which is cool. My idea is not the most novel thing, but it works.
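To make the "expected value over a Gaussian mixture" idea concrete, here is a minimal sketch, not the paper's or this project's implementation. It assumes the neuron-like operation is a * relu(w . x + b), assumes diagonal covariances, and estimates the expectation by Monte Carlo sampling. The function names (sample_mixture, gm_layer) and the (w, b, a) parameter layout are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_mixture(rng, weights, means, stds, n_samples):
    """Draw neuron parameters from a diagonal-covariance Gaussian mixture."""
    # Pick a mixture component for every sample, then perturb its mean.
    comps = rng.choice(len(weights), size=n_samples, p=weights)
    noise = rng.standard_normal((n_samples, means.shape[1]))
    return means[comps] + stds[comps] * noise


def gm_layer(x, weights, means, stds, n_samples=10_000, seed=0):
    """Monte Carlo estimate of E[a * relu(w . x + b)] under the mixture.

    Assumed (hypothetical) layout of each parameter vector: (w, b, a),
    where w has the same dimension as the input x.
    """
    rng = np.random.default_rng(seed)
    theta = sample_mixture(rng, weights, means, stds, n_samples)
    d = x.shape[0]
    w, b, a = theta[:, :d], theta[:, d], theta[:, d + 1]
    # Average the neuron output over the sampled "neurons".
    return float(np.mean(a * np.maximum(w @ x + b, 0.0)))


# Usage: a 2-component mixture over neurons with 2-dimensional inputs.
x = np.array([1.0, -0.5])
weights = np.array([0.3, 0.7])
means = np.array([[1.0, 0.0, 0.0, 1.0],
                  [-1.0, 1.0, 0.5, -0.5]])
stds = np.full((2, 4), 0.1)
print(gm_layer(x, weights, means, stds))
```

When every standard deviation is zero, each sample coincides with a component mean, so the estimate approaches the output of an ordinary finite network with one neuron per component, weighted by the mixture weights. That makes a quick sanity check for the sketch.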
I still need to add the rest of the figures, but I wanted to share what I have been working on.
TODO:
- Show that Kaiming isn’t doing all of the heavy lifting.
- Test on harder classification tasks.
- Compare an infinite-width neural network with the GM network.
- Prove why the method works.