Mastering NLP Tasks

Study materials for NLP tasks

Evaluation Settings

GPT-like

A GPT-like model’s prediction takes one of the following forms: the prompt is built from a template, and the autoregressive decoding process outputs the predicted answer. Additional context can be added to the prompt for open-book question answering tasks.

# zero-shot open question
question = "What is the capital of Korea?"
answer = "Seoul"
prefix = "Answer the following question."
template = f"{prefix}\nQ:{question}\nA:"

# zero-shot closed question
question = "What is the capital of Korea?"
candidates = "(a) Seoul, (b) Busan"
prefix = "Answer the following question."
template = f"{prefix}\nQ:{question}\nCandidates:{candidates}\nA:"

# few-shot open question
fs_qs = ["What is the capital of Korea?", "What is the capital of Italy?"]
fs_as = ["Seoul", "Rome"]

question = "What is the capital of France?"
answer = "Paris"
prefix = "Answer the following question."
template = f"{prefix}\n" + "".join([f"Q:{q}\nA:{a}\n" for q, a in zip(fs_qs, fs_as)]) + f"Q:{question}\nA:"

# few-shot closed question type I
fs_qs = ["The capital of Korea is Seoul", "The capital of Korea is Busan"]
fs_as = ["True", "False"]
candidates = "(a) True, (b) False"

question = "The capital of Italy is Rome"
answer = "True"
prefix = f"Answer the following question.\nAnswer candidates: {candidates}\n"
template = f"{prefix}\n" + "".join([f"Q:{q}\nA:{a}\n" for q, a in zip(fs_qs, fs_as)]) + f"Q:{question}\nA:"

T5-like

An encoder-decoder architecture encodes the question (and, for open-book tasks, the context), and the decoder predicts the answer. Rather than relying on few-shot prompting, the decoder predicts the label directly (autoregressively).

question = "What is the capital of Korea?"
answer = "Seoul" 

question = "What is the capital of Korea?"
context = "The capital of Korea is started by S"
answer = "Seoul" 

template=f"QA question:{question} context:{context}" 

In GLUE and SuperGLUE, a task description (prefix) is added to each example, and the training datasets of all tasks are concatenated into a single training set.
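
A minimal sketch of that construction with hypothetical two-task data, following the T5-style "task sentence:" prefix format; the prefixed per-task datasets are simply concatenated:

# hypothetical examples for two GLUE tasks; the task name serves as the prefix
cola = [("The book was written by John.", "acceptable")]
sst2 = [("a charming and often affecting journey", "positive")]

def with_prefix(task, dataset):
    return [(f"{task} sentence: {text}", label) for text, label in dataset]

# concatenate the prefixed per-task datasets into one training set
train_set = with_prefix("cola", cola) + with_prefix("sst2", sst2)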

BERT-like
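
BERT-like (encoder-only) models are typically evaluated by fine-tuning rather than prompting: a classification head on top of the [CLS] representation predicts the label, so there is no autoregressive generation. A minimal sketch, assuming the Hugging Face transformers library and a binary label set; note the head is randomly initialized until fine-tuned:

# fine-tuned classification sketch (assumes: transformers, torch, bert-base-uncased)
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=2)
model.eval()

question = "The capital of Italy is Rome"
inputs = tokenizer(question, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# index into the label set, e.g. {0: False, 1: True}, after fine-tuning
prediction = logits.argmax(dim=-1).item()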


Notes

Todo