On Dec. 3 at 5:00 p.m., DeepPavlov will hold an open seminar at which Anna Rogers will give a talk on BERTology, in English.
Topic: “When BERT plays the lottery, all tickets are winning!”
The lottery ticket hypothesis was originally proposed for randomly initialized models, but might it also apply to pre-trained Transformers? If “good” subnetworks exist, can they tell us anything about how BERT achieves its performance?
The original paper is available at https://arxiv.org/pdf/2005.00561