gpt-3speech.ee.ntu.edu.tw/~tlkagk/courses/dlhlp20/gpt3 (v6).pdf · 2020. 6. 28. · openal elliot...

GPT-3

Hung-yi Lee 李宏毅

Bigger Model

Megatron

Turing NLG

17B

https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/

GPT-3 has 175B parameters! (10 times larger than Turing NLG)

Suppose ELMO's parameter count is a ruler 30 cm long

Then GPT-3 is taller than Taipei 101

GPT-3 has roughly 2000 times as many parameters as ELMO
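
The ruler analogy can be checked with a little arithmetic. The sketch below uses only the 30 cm ruler and the roughly 2000x ratio from the slide. The 508 m height of Taipei 101 is an outside figure added here for comparison, and all names are illustrative.

```python
# Scale check for the ruler analogy (a sketch, not part of the original slides).
ELMO_RULER_CM = 30          # from the slide: ELMO is a 30 cm ruler
GPT3_TO_ELMO_RATIO = 2000   # from the slide: GPT-3 is about 2000x ELMO
TAIPEI_101_HEIGHT_M = 508   # assumption added here: architectural height in metres


def scaled_height_m(ruler_cm: float, ratio: float) -> float:
    """Height in metres of the ruler after scaling it by the parameter ratio."""
    return ruler_cm * ratio / 100.0


gpt3_height_m = scaled_height_m(ELMO_RULER_CM, GPT3_TO_ELMO_RATIO)
print(f"GPT-3 on the same scale: {gpt3_height_m:.0f} m")
print(f"Taller than Taipei 101: {gpt3_height_m > TAIPEI_101_HEIGHT_M}")
```

On this scale the GPT-3 "ruler" comes out at about 600 m, which is consistent with the slide's claim.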

GPT-3 is a model from the Dark Continent

https://www.zhihu.com/question/398114261

https://github.com/openai/gpt-3/issues/1


Pre-train: text without annotation → a model that can read text

Fine-tune: task-specific data with annotation → a task-specific model for each task

The Ambition of the GPT Series

Task description

A few examples

Few-shot Learning

One-shot Learning

Zero-shot Learning

(no gradient descent)

“In-context” Learning
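
In in-context learning, the task description and any examples are simply placed in the model's input text, and no gradient descent takes place. The sketch below shows one way such prompts might be assembled for the zero-, one-, and few-shot settings. The helper name `build_prompt`, the `=>` separator, and the translation data are illustrative assumptions, not an API from the slides.

```python
from typing import List, Tuple


def build_prompt(task_description: str,
                 examples: List[Tuple[str, str]],
                 query: str) -> str:
    """Concatenate the task description, k worked examples, and the query.

    k = 0 gives a zero-shot prompt, k = 1 one-shot, k > 1 few-shot.
    The model's weights are never updated; everything lives in the input text.
    """
    lines = [task_description]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is expected to continue from here
    return "\n".join(lines)


# Illustrative data for an English-to-French translation task.
description = "Translate English to French:"
demos = [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")]

zero_shot = build_prompt(description, [], "cheese")
one_shot = build_prompt(description, demos[:1], "cheese")
few_shot = build_prompt(description, demos, "cheese")

print(few_shot)
```

The three settings differ only in how many examples appear in the prompt; the model itself is the same in each case.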

Average of 42 tasks

Closed Book QA

Turing Advice Challenge

http://rowanzellers.com/advice/

raster order

Source of image: https://openai.com/blog/image-gpt/
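
Raster order means reading pixels row by row, left to right and top to bottom, so that a 2D image becomes a 1D sequence a model can process one element at a time. A minimal sketch, using a made-up 3x3 grayscale image and a hypothetical helper name `to_raster_sequence`:

```python
from typing import List


def to_raster_sequence(image: List[List[int]]) -> List[int]:
    """Flatten a 2D grid of pixel values row by row, left to right."""
    return [pixel for row in image for pixel in row]


# A hypothetical 3x3 grayscale image, used only for illustration.
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

sequence = to_raster_sequence(image)
print(sequence)
```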

My Favorite Ones
