Day 20 Study Notes

Boostcamp AI Tech 1st cohort [T1209 최보미]/U stage

Day 20 Study Notes - NLP5

B1001101 2021. 2. 19. 23:56

Lecture Review

1. Self-supervised Pre-training Models

1) GPT-1

Uses special tokens such as <S>, <E>, and $ for effective transfer learning

2) BERT

masked language modeling task
large-scale data & large-scale model
Pre-training Tasks
- Masked Language Model (MLM)
  - Mask some percentage of the input tokens at random, and then predict those masked tokens.
  - 15% of the tokens are selected for prediction; of those:
    - 80% of the time, replace with [MASK]
    - 10% of the time, replace with a random word
    - 10% of the time, keep the word unchanged
- Next Sentence Prediction (NSP): Predict whether Sentence B is the actual sentence that follows Sentence A, or a random sentence
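
The 15% selection and the 80/10/10 replacement rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not BERT's actual preprocessing code: the function name `mask_tokens`, the toy vocabulary, and the use of `None` for "no prediction here" are all assumptions made for this example.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, select_prob=0.15, rng=None):
    """Return (corrupted, labels); labels[i] is None where nothing is predicted."""
    if rng is None:
        rng = random.Random()
    corrupted, labels = [], []
    for tok in tokens:
        # Each token is selected for prediction with probability select_prob.
        if rng.random() >= select_prob:
            corrupted.append(tok)
            labels.append(None)
            continue
        labels.append(tok)  # the model must recover the original token
        r = rng.random()
        if r < 0.8:
            corrupted.append(MASK)               # 80%: replace with [MASK]
        elif r < 0.9:
            corrupted.append(rng.choice(vocab))  # 10%: replace with a random word
        else:
            corrupted.append(tok)                # 10%: keep the word unchanged
    return corrupted, labels

if __name__ == "__main__":
    sentence = "the man went to the store to buy milk".split()
    toy_vocab = ["dog", "apple", "run", "blue"]  # hypothetical vocabulary
    print(mask_tokens(sentence, toy_vocab, rng=random.Random(0)))
```

Keeping some selected tokens unchanged or swapping in random words means the model cannot rely on [MASK] appearing at every position it must predict, which matters because [MASK] never appears during fine-tuning.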

3) Other

BERT: SQuAD 2.0 → Use token 0 ([CLS]) to emit a logit for “no answer”
BERT: On SWAG → Run each Premise + Ending pair through BERT and produce a logit for each pair on token 0 ([CLS])

2. Advanced Self-supervised Pre-training Models

1) GPT-2

Performs down-stream tasks in a zero-shot setting
Uses the conversational question answering dataset (CoQA)

2) GPT-3

Language Models are Few-shot Learners
- Prompt: the prefix given to the model
- Zero-shot: Predict the answer given only a natural language description of the task
- One-shot: See a single example of the task in addition to the task description
- Few-shot: See a few examples of the task
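
The three settings differ only in how many worked examples are placed in the prompt. A minimal sketch, assuming a simple `source => target` line format (the helper `build_prompt` and the exact layout are hypothetical, chosen for illustration):

```python
def build_prompt(task_description, examples, query):
    """Build a text prompt: task description, k solved examples, then the query."""
    lines = [task_description]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is expected to continue from here
    return "\n".join(lines)

if __name__ == "__main__":
    task = "Translate English to French:"
    demos = [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")]
    print(build_prompt(task, [], "cheese"))         # zero-shot
    print(build_prompt(task, demos[:1], "cheese"))  # one-shot
    print(build_prompt(task, demos, "cheese"))      # few-shot
```

In all three cases the model's weights are not updated; the examples only condition the model through the prefix.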

3) ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Factorized Embedding Parameterization
- V = Vocabulary size
- H = Hidden-state dimension
- E = Word embedding dimension
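
With these symbols, the factorization replaces a single V × H embedding matrix with a V × E matrix followed by an E × H projection, so the parameter count drops from V·H to V·E + E·H when E is much smaller than H. The sketch below just does this arithmetic; the concrete sizes (V = 30000, H = 768, E = 128) are assumed values for illustration.

```python
def embedding_params(V, H, E=None):
    """Number of embedding parameters, with or without factorization."""
    if E is None:
        return V * H          # one V x H embedding matrix
    return V * E + E * H      # V x E lookup table plus E x H projection

if __name__ == "__main__":
    V, H, E = 30000, 768, 128  # assumed sizes for illustration
    print(embedding_params(V, H))     # 23040000
    print(embedding_params(V, H, E))  # 3938304
```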

Cross-layer Parameter Sharing
- Shared-FFN: Only sharing feed-forward network parameters across layers
- Shared-attention: Only sharing attention parameters across layers
- All-shared: Both of them
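
To get a feel for how much each sharing mode saves, the following rough sketch counts encoder parameters per mode. It makes simplifying assumptions that are not from the lecture: attention is four H × H projections with biases, the feed-forward network has inner size 4H, and LayerNorm and embedding parameters are ignored. The function name `encoder_params` and the mode strings are hypothetical.

```python
def encoder_params(num_layers, H, mode="not-shared"):
    """Approximate parameter count of a Transformer encoder stack."""
    modes = ("not-shared", "shared-ffn", "shared-attention", "all-shared")
    if mode not in modes:
        raise ValueError(f"unknown mode: {mode}")
    # Q, K, V, and output projections, each H x H plus a bias of size H.
    attn = 4 * (H * H + H)
    # Two linear layers: H -> 4H and 4H -> H, each with a bias.
    ffn = (H * 4 * H + 4 * H) + (4 * H * H + H)
    # A shared block is stored once; an unshared block is stored per layer.
    attn_copies = 1 if mode in ("shared-attention", "all-shared") else num_layers
    ffn_copies = 1 if mode in ("shared-ffn", "all-shared") else num_layers
    return attn_copies * attn + ffn_copies * ffn

if __name__ == "__main__":
    for m in ("not-shared", "shared-ffn", "shared-attention", "all-shared"):
        print(m, encoder_params(12, 768, m))
```

Under these assumptions the all-shared stack has the same parameter count as a single layer, regardless of depth.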

4) ELECTRA: Efficiently Learning an Encoder that Classifies Token Replacements Accurately

Discriminator: the main network for pre-training; it classifies whether each input token is original or replaced
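
In ELECTRA a small generator fills in masked positions, and the discriminator is trained to label every token as original or replaced. The sketch below only shows how those per-token training labels could be derived from an original and a corrupted sequence; the helper `replaced_token_labels` is a hypothetical name, and no actual model is involved.

```python
def replaced_token_labels(original, corrupted):
    """Label each position 1 if the token was replaced, else 0."""
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]

if __name__ == "__main__":
    original = "the chef cooked the meal".split()
    corrupted = "the chef ate the meal".split()  # "cooked" replaced by the generator
    print(replaced_token_labels(original, corrupted))  # [0, 0, 1, 0, 0]
```

Because a label exists for every position, not just the masked ones, the discriminator receives a training signal from all input tokens.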

5) Light-weight Models

DistilBERT (NeurIPS 2019 Workshop)
TinyBERT (Findings of EMNLP 2020)

6) Fusing Knowledge Graph into Language Model

ERNIE: Enhanced Language Representation with Informative Entities (ACL 2019)
KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning (EMNLP 2019)

Peer Session & Master Class & Assignment Review

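Today was very busy: there was a town hall meeting first thing in the morning, plus a master class and an assignment review. The town hall meeting covered announcements about attendance and the training allowance. As on Wednesday, Professor 주재걸 led today's master class. In the assignment review, they explained yesterday's assignment and showed us how to find good papers. I had been curious about the regular expressions in the assignment code and hoped they would be explained during the review, but the teaching assistant did not seem to know them well either.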

Comments

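After hectic days of lectures and assignments, week 4 is already over. The natural language processing we covered this week was difficult, but it was also fascinating and interesting. I should review this week's material once more over the weekend. I also tried Kaggle for the first time while doing today's assignment; seeing the participants ranked was fun and brought out my competitive streak. I want to take part in many more Kaggle competitions from now on.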
