
Decoding Strategies in Large Language Models

Contents: Background, Greedy Search, Beam Search, Top-k Sampling, Nucleus Sampling, Conclusion

The tokenizer, Byte-Pair Encoding in this case, translates each token in the input text into a corresponding token ID. GPT-2 then takes these token IDs as input and tries to predict the next token.
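As a minimal sketch of this step, the snippet below uses the Hugging Face transformers library to load GPT-2 and its BPE tokenizer, encode a prompt into token IDs, and read off the model's most likely next token. The prompt string is only an illustrative placeholder.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 and its Byte-Pair Encoding tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Illustrative prompt; the tokenizer maps each token in the text to a token ID
text = "I have a dream"
input_ids = tokenizer.encode(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(input_ids)

# The logits at the last position score every vocabulary entry as the next token
next_token_logits = outputs.logits[0, -1, :]
next_token_id = torch.argmax(next_token_logits).item()
print(tokenizer.decode([next_token_id]))
```

Greedy search corresponds to always taking this argmax; the other strategies below differ only in how they pick from these same logits.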
