
This blog aims to describe BERT in detail, along with the tricks that make it work so well. We will split the blog into two parts:

LSTM is dead. Long Live Transformers!

This is the eye-catching title of an excellent YouTube video about Transformers.

Motivations

When training an RNN, we can only do the computation entry by entry, because each hidden state depends on the one before it. The time steps therefore cannot run in parallel, so a GPU does little to shorten training time on long sequences. …
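To make the entry-by-entry constraint concrete, here is a minimal sketch in plain Python. It is a toy scalar recurrence, not code from the video or from BERT; the function names `rnn_forward` and `positionwise_forward` and the weights `w_x` and `w_h` are assumptions made up for illustration.

```python
import math


def rnn_forward(inputs, w_x, w_h, h0=0.0):
    """Toy scalar RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1})."""
    h = h0
    states = []
    for x in inputs:
        # This step needs the h produced by the previous step,
        # so the iterations have to run one after another.
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states


def positionwise_forward(inputs, w_x):
    """Same layer without recurrence: each output uses only its own input."""
    # No iteration depends on another, so all positions could be
    # computed at the same time on parallel hardware.
    return [math.tanh(w_x * x) for x in inputs]


if __name__ == "__main__":
    xs = [1.0, 0.5, -0.5]
    print(rnn_forward(xs, w_x=0.8, w_h=0.5))
    print(positionwise_forward(xs, w_x=0.8))
```

In the recurrent version, changing the first input changes every later state, which is exactly why the loop cannot be split across workers. The position-wise version has no such chain, and that independence is what lets non-recurrent architectures make better use of a GPU.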

RoyOnBus
