Too powerful NLP model (GPT-2)

In 2018, OpenAI released its generative pre-training (GPT) model, which achieved state-of-the-art results on many NLP tasks.

Unlike common practice for other models, OpenAI did not publish the full version of the model, only a lightweight version.

As OpenAI put it in the release announcement: "Due to our concerns about malicious applications of the technology, we are not releasing the trained model."

In other words, lowercasing, tokenization, and other pre-processing steps are skipped: the authors believe these pre-processing steps restrict the model's capability, and the model can still be evaluated on any language model benchmark without them.
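
To see what this looks like in practice, here is a minimal sketch (not from the article) using the Hugging Face transformers library: GPT-2's byte-level BPE tokenizer consumes raw text directly, preserving casing and punctuation instead of normalizing them away.

```python
# A minimal sketch, assuming the Hugging Face "transformers" library
# (not mentioned in the original article). It illustrates that GPT-2
# works on raw text: the byte-level BPE tokenizer keeps casing and
# punctuation intact rather than normalizing them away.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Mixed case and punctuation survive tokenization.
text = "Hello WORLD! OpenAI's GPT-2 reads raw text."
tokens = tokenizer.tokenize(text)
print(tokens)                                  # subword pieces, casing preserved
print(tokenizer.convert_tokens_to_ids(tokens)) # vocabulary ids fed to the model
```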

To cater to different scenarios, four models with different numbers of parameters were trained; the configurations are listed below.
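
For reference, the four configurations reported in the GPT-2 paper are:

- 117M parameters: 12 layers, 768-dimensional hidden states
- 345M parameters: 24 layers, 1024-dimensional hidden states
- 762M parameters: 36 layers, 1280-dimensional hidden states
- 1542M parameters: 48 layers, 1600-dimensional hidden states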

Model Training

GPT-2 uses an unsupervised learning approach to train the language model.
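
Concretely, this is the standard autoregressive language-modeling objective from the GPT papers: given an unlabeled corpus of tokens $u_1, \ldots, u_n$, training maximizes

$$L(\theta) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \theta)$$

where $k$ is the size of the context window and the conditional probability is modeled by a Transformer with parameters $\theta$.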

The only released trained model is the smallest one, with 117M parameters.
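
As a usage illustration, the following is a minimal sketch, again assuming the Hugging Face transformers library, which publishes this 117M checkpoint under the name "gpt2", for generating text with the released model:

```python
# A minimal sketch, assuming the Hugging Face "transformers" library,
# where the released 117M GPT-2 checkpoint is published as "gpt2".
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # the 117M model

prompt = "OpenAI released GPT-2 because"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sample a continuation; do_sample=True with top-k filtering gives
# the kind of varied, open-ended text the article describes.
output = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```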
