Publications
Improved training methods for language models using data generation and reinforcement learning
Abstract
The disclosed method generates training data for a language model, for example a model that restores punctuation in real-world automatic speech recognition (ASR) text. It applies reinforcement learning to a generative AI model, which produces additional examples for training the language model. Using gradient feedback from the language model, the generative AI model learns from real-world ASR text to generate more effective training examples.
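The abstract does not specify the reward or the models, so the following is only a toy sketch of one possible reading of the loop, not the patented method. It assumes a hypothetical scalar regressor in place of the language model, a categorical policy over three fixed candidate examples in place of the generative AI model, and a reward equal to the product of the gradient on a generated example and the gradient on a small "real" batch (one way to interpret "gradient feedback"). All names and values (`REAL_BATCH`, `CANDIDATES`, `train`, the learning rates) are illustrative assumptions.

```python
import math
import random

# Hypothetical stand-ins: the "task model" is a scalar regressor y = w * x,
# and the "generator" is a categorical policy over three fixed candidates.
REAL_BATCH = [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0)]  # "real" data, y = 2x
CANDIDATES = [(1.0, 2.0), (1.0, -2.0), (0.0, 0.0)]   # helpful, harmful, neutral


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss_grad(w, x, y):
    """Gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x


def train(steps=200, policy_lr=0.1, model_lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(CANDIDATES)  # generator policy parameters
    w = 0.0                           # task model parameter
    for _ in range(steps):
        probs = softmax(logits)
        # The generator samples one synthetic training example.
        action = rng.choices(range(len(CANDIDATES)), weights=probs)[0]
        x, y = CANDIDATES[action]
        # Assumed reward: agreement between the gradient on the synthetic
        # example and the mean gradient on the real batch.
        g_syn = loss_grad(w, x, y)
        g_real = sum(loss_grad(w, rx, ry) for rx, ry in REAL_BATCH) / len(REAL_BATCH)
        reward = g_syn * g_real
        # REINFORCE update (no baseline) for a softmax policy.
        for k in range(len(logits)):
            indicator = 1.0 if k == action else 0.0
            logits[k] += policy_lr * reward * (indicator - probs[k])
        # The task model takes a gradient step on the synthetic example.
        w -= model_lr * g_syn
    return softmax(logits), w


if __name__ == "__main__":
    probs, w = train()
    print("generator probabilities (helpful, harmful, neutral):", probs)
    print("task model weight:", w)
```

In this toy setting, the candidate whose gradient agrees with the real-data gradient earns a non-negative reward and the mislabeled one a non-positive reward, so the policy shifts probability toward the helpful example. A real system would replace both stand-ins with neural models and would likely need a variance-reducing baseline.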
- Date: January 16, 2025
- Authors: VD Lai, T Bui, S Yoon, Q Tran, H Tan, H Deilamsalehy, A Salinas, ...
- Inventors: Viet Dac Lai, Trung Bui, Seunghyun Yoon, Quan Tran, Hao Tan, Hanieh Deilamsalehy, Abel Salinas, Franck Dernoncourt
- Patent office: US
- Application number: 18220910