Maties Machine Learning (MML) is a seminar series and discussion forum that aims to bring together people working on machine learning at Stellenbosch University. We meet roughly every second week for short talks on people’s current work or an ML-related topic, or for open discussions. The idea is to get to know what others are working on and to strengthen machine learning research at Stellenbosch.
6 March 2020, 13:00-13:50 in K303 (Knowledge Centre, Engineering)
Jan Buys - Neural Text (De)generation
Despite considerable advances in neural language modelling, it is still an open question how to best apply language models to generate text. In this talk we present two complementary approaches to enable high-quality long-form text generation. First, we investigate what the best decoding strategy is for open-ended text generation. While language models are trained to maximize likelihood, maximization-based decoding methods such as beam search lead to degeneration — producing text that is bland, incoherent, or repetitive. To address this we propose Nucleus Sampling, a simple but effective decoding method that avoids text degeneration by truncating the unreliable tail of the probability distribution, and sampling from the nucleus of tokens containing most of the probability mass. We perform an extensive evaluation of multiple decoding methods, comparing generations from each method to human text along several axes including likelihood, diversity, and repetition. Second, we propose a framework for addressing neural degeneration through a committee of cooperative discriminators that can guide the language model towards more globally coherent generations. Each discriminator specializes in a different linguistic principle of communication, and their scores are combined with the language model in a stochastic beam search decoder. Our results show that Nucleus Sampling is the best currently available decoding strategy for generating long-form text that is both high-quality and diverse, while human evaluation demonstrates that text generated using cooperative discriminators is preferred over baselines by a large margin, enhancing the overall coherence, style, and informativeness of the generations.
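As a rough illustration of the truncation step described in the abstract, the sketch below keeps the smallest set of highest-probability tokens whose cumulative mass reaches a threshold, renormalises over that set, and samples from it. This is a minimal sketch under stated assumptions, not the speaker's implementation: the function names `nucleus_filter` and `nucleus_sample`, the parameter name `top_p`, and its default of 0.9 are illustrative choices, and the input is assumed to be a plain list of next-token probabilities.

```python
import random


def nucleus_filter(probs, top_p=0.9):
    """Return {token_index: renormalised probability} for the nucleus.

    The nucleus is the smallest set of highest-probability tokens whose
    cumulative probability mass reaches top_p (an assumed parameter name).
    """
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, mass = [], 0.0
    for i in ranked:
        nucleus.append(i)
        mass += probs[i]
        if mass >= top_p:
            break  # everything after this point is the truncated tail
    # Renormalise so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in nucleus}


def nucleus_sample(probs, top_p=0.9, rng=random):
    """Draw one token index from the renormalised nucleus."""
    nucleus = nucleus_filter(probs, top_p)
    tokens = list(nucleus)
    weights = [nucleus[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]
```

For example, with `probs = [0.5, 0.3, 0.15, 0.05]` and `top_p=0.75`, the nucleus contains only tokens 0 and 1, so the two least likely tokens can never be sampled. In practice the probabilities would come from a language model's softmax output at each decoding step.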
20 March 2020, 13:00-13:50 in K303 (Knowledge Centre, Engineering)
Noe Fouotsa Manfouo
24 April 2020, 13:00-13:50 in K303 (Knowledge Centre, Engineering)
Daniel B. le Roux
15 May 2020, 13:00-13:50 in K303 (Knowledge Centre, Engineering)
- Rensu Theart
- Kayode Olaleye
22 May 2020, 13:00-13:50 in K303 (Knowledge Centre, Engineering)
Charl van Heerden (SAIgen)