GPT-3 · speech.ee.ntu.edu.tw/~tlkagk/courses/DLHLP20/GPT3 (v6).pdf · 2020. 6. 28. · OpenAI Elliot...

GPT-3 Hung-yi Lee 李宏毅


Page 1: GPT-3 · speech.ee.ntu.edu.tw/~tlkagk/courses/DLHLP20/GPT3 (v6).pdf · 2020. 6. 28. · OpenAI Elliot Turner @eturner303 Reading the OpenAI GPT-3 paper. Impressive performance on many

GPT-3

Hung-yi Lee 李宏毅

Page 2:

GPT-3

Page 3:

Bigger Model

Megatron

TuringNLG

17B

https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/

GPT-3 has 175B parameters! (About 10 times as many as Turing NLG)

Page 4:

Suppose ELMo's parameter count is a ruler 30 cm long

Then GPT-3 is taller than Taipei 101

GPT-3 has roughly 2,000 times as many parameters as ELMo
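
The ruler analogy can be checked with a few lines of arithmetic. This is a minimal sketch: the 30 cm ruler and the 2,000x ratio come from the slide, while the 508 m height for Taipei 101 is an outside figure assumed here, not something the slides state. The last line recomputes the page 3 comparison with Turing NLG.

```python
# Figures taken from the slides
ELMO_RULER_CM = 30          # ELMo's parameter count pictured as a 30 cm ruler
GPT3_TO_ELMO_RATIO = 2000   # GPT-3 has roughly 2,000x ELMo's parameters

# Assumption (not on the slides): Taipei 101's height in metres
TAIPEI_101_M = 508

# Scale the ruler by the parameter ratio and convert cm to m
gpt3_ruler_m = ELMO_RULER_CM * GPT3_TO_ELMO_RATIO / 100

print(f"GPT-3 ruler: {gpt3_ruler_m:.0f} m, Taipei 101: {TAIPEI_101_M} m")
print("Taller than Taipei 101:", gpt3_ruler_m > TAIPEI_101_M)

# Page 3's comparison: 175B (GPT-3) vs 17B (Turing NLG) parameters
print(f"GPT-3 / Turing NLG: {175 / 17:.1f}x")
```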

Page 5:

GPT-3 is a model from the Dark Continent

Page 6:

https://www.zhihu.com/question/398114261

https://github.com/openai/gpt-3/issues/1

Page 7:

Pre-train: Text without annotation → A model that can read text

Fine-tune: Task-specific data with annotation → a separate task-specific model for each task

Page 8:

The ambition of the GPT series

Task description

A few examples

Page 9:

Few-shot Learning

One-shot Learning

Zero-shot Learning

(no gradient descent)

“In-context” Learning
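
The three settings above differ only in how many worked examples are placed in the prompt; the model's weights are never updated. A minimal sketch of that idea follows. The helper `build_prompt`, the `=>` separator, and the translation examples are illustrative assumptions, not an API or a prompt format taken from the slides.

```python
def build_prompt(task_description, examples, query):
    """Assemble an in-context prompt: description, demonstrations, then the query.

    Hypothetical helper for illustration only. No gradient descent happens
    anywhere: the "learning" is just text placed in front of the query.
    """
    lines = [task_description]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # left open for the model to complete
    return "\n".join(lines)


description = "Translate English to French:"
demos = [
    ("sea otter", "loutre de mer"),
    ("peppermint", "menthe poivrée"),
    ("cheese", "fromage"),
]

zero_shot = build_prompt(description, [], "bread")        # no examples
one_shot = build_prompt(description, demos[:1], "bread")  # one example
few_shot = build_prompt(description, demos, "bread")      # a few examples

print(few_shot)
```

Each prompt would then be sent as-is to a language model, which is expected to continue the last line.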

Page 10:

Average of 42 tasks

Page 11:

Closed Book QA

Page 19:

Turing Advice Challenge: http://rowanzellers.com/advice/

Page 21:

Turing Advice Challenge

Page 22:

raster order
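
As a small illustration of raster order, the sketch below flattens a 2-D grid of pixels into a 1-D sequence, assuming the usual meaning of the term: left to right within a row, rows taken top to bottom. The function name and the toy image are made up for the example; a sequence model could then predict each pixel from the ones before it.

```python
def raster_order(image):
    """Flatten a 2-D image (list of rows) into a 1-D sequence, row by row."""
    return [pixel for row in image for pixel in row]


# Toy 2x3 "image"; the values stand in for pixel intensities
image = [
    [1, 2, 3],
    [4, 5, 6],
]

sequence = raster_order(image)
print(sequence)  # [1, 2, 3, 4, 5, 6]

# Next-pixel prediction pairs: (pixels seen so far, pixel to predict)
pairs = [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]
print(pairs[0])  # ([1], 2)
```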

Page 23:

Source of image: https://openai.com/blog/image-gpt/

My Favorite Ones