What is the difference between encoder-only and encoder-decoder transformer models, and how is each used for question answering?