3. Build a multi-layer perceptron for character feature vectors from scratch

I watched Andrej Karpathy’s 3rd video in the YouTube playlist Neural Networks: Zero to Hero, and built an MLP (multi-layer perceptron) from scratch that embeds characters in a feature vector space and predicts the next character from the previous 3 characters. The original paper, “A Neural Probabilistic Language Model” by Bengio et al., was published in 2003, 10 years before the classic Word2Vec paper appeared in 2013. The core idea is similar: representing discrete entities in a continuous feature vector space. My code can be found on GitHub.
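To give a flavor of the architecture, here is a minimal sketch in PyTorch of that kind of character-level MLP: an embedding table, one tanh hidden layer, and a softmax over the next character given the previous 3 characters. The toy word list, layer sizes, learning rate, and step count below are my own illustrative assumptions, not the values from the video or the paper.

```python
# Minimal character-level MLP in the spirit of Bengio et al. (2003).
# Toy dataset and hyperparameters are assumptions for illustration.
import torch
import torch.nn.functional as F

words = ["emma", "olivia", "ava", "isabella", "sophia"]  # assumed toy word list
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0                      # '.' marks the start/end of a word
vocab_size = len(stoi)

block_size = 3                     # predict the next char from the previous 3
X, Y = [], []
for w in words:
    context = [0] * block_size
    for ch in w + ".":
        X.append(context)
        Y.append(stoi[ch])
        context = context[1:] + [stoi[ch]]
X, Y = torch.tensor(X), torch.tensor(Y)

g = torch.Generator().manual_seed(42)
emb_dim, hidden = 2, 64
C  = torch.randn((vocab_size, emb_dim), generator=g, requires_grad=True)  # embedding table
W1 = torch.randn((block_size * emb_dim, hidden), generator=g, requires_grad=True)
b1 = torch.randn(hidden, generator=g, requires_grad=True)
W2 = torch.randn((hidden, vocab_size), generator=g, requires_grad=True)
b2 = torch.randn(vocab_size, generator=g, requires_grad=True)
params = [C, W1, b1, W2, b2]

for step in range(200):
    emb = C[X]                                              # (N, block_size, emb_dim)
    h = torch.tanh(emb.view(-1, block_size * emb_dim) @ W1 + b1)
    logits = h @ W2 + b2
    loss = F.cross_entropy(logits, Y)
    for p in params:
        p.grad = None
    loss.backward()
    for p in params:
        p.data -= 0.1 * p.grad
print(f"final loss: {loss.item():.3f}")
```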

2. Build a bigram language model from scratch

I watched Andrej Karpathy’s 2nd video in the YouTube playlist Neural Networks: Zero to Hero, and built a bigram language model from scratch in Python. Training a neural network with gradient descent produces results that are (surprisingly?) similar to those from simply counting bigram statistics. My code can be found on GitHub.
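Below is a minimal sketch of that comparison: a bigram probability table computed by counting, next to a single linear layer trained by gradient descent on the same character pairs. The toy word list, add-one smoothing, learning rate, and number of steps are assumptions for illustration.

```python
# Count-based bigram model vs. a one-layer network trained by gradient descent.
# Toy dataset and hyperparameters are assumptions for illustration.
import torch
import torch.nn.functional as F

words = ["emma", "olivia", "ava"]          # assumed toy word list
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0                              # '.' marks the start/end of a word
vocab = len(stoi)

# Statistical approach: count bigrams and normalize each row into probabilities.
N = torch.zeros((vocab, vocab))
xs, ys = [], []
for w in words:
    seq = ["."] + list(w) + ["."]
    for a, b in zip(seq, seq[1:]):
        N[stoi[a], stoi[b]] += 1
        xs.append(stoi[a])
        ys.append(stoi[b])
P_counts = (N + 1) / (N + 1).sum(dim=1, keepdim=True)   # add-one smoothing

# Neural approach: one linear layer on one-hot inputs, trained with gradient descent.
xs, ys = torch.tensor(xs), torch.tensor(ys)
W = torch.randn((vocab, vocab), requires_grad=True)
for step in range(300):
    logits = F.one_hot(xs, vocab).float() @ W
    loss = F.cross_entropy(logits, ys)
    W.grad = None
    loss.backward()
    W.data -= 10 * W.grad
P_net = W.exp() / W.exp().sum(dim=1, keepdim=True)       # softmax of each row

# The learned probability table approaches the count-based one (up to smoothing).
print((P_counts - P_net).abs().max())
```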

1. Build backpropagation from scratch

This is the first time I implemented automatic differentiation and the chain rule in Python, building a backpropagation and gradient descent algorithm from scratch. Thank you so much, Andrej Karpathy, for your amazing YouTube playlist: Neural Networks: Zero to Hero. My code can be found on GitHub.
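For a flavor of what this looks like, here is a minimal sketch in the spirit of micrograd (not the actual library): a scalar Value class that records the computation graph and applies the chain rule in reverse topological order. Only +, *, and tanh are implemented here; the example at the end is a hypothetical single neuron.

```python
# Minimal scalar autograd sketch: each operation records how to pass gradients
# back to its inputs, and backward() replays those rules over the graph.
import math

class Value:
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad              # d(a+b)/da = 1
            other.grad += out.grad             # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1 - t ** 2) * out.grad  # d tanh(x)/dx = 1 - tanh(x)^2
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply each node's local chain rule.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()

# Tiny example: a single neuron y = tanh(w*x + b)
x, w, b = Value(2.0), Value(-0.5), Value(0.3)
y = (w * x + b).tanh()
y.backward()
print(x.grad, w.grad, b.grad)
```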