📝 Vietnamese NLP – POS Tagging Benchmarks & Resources

1. VLSP 2013 POS Tagging

📊 Dataset: train: 27,000+ | dev: 870 | test: 2,120 sentences (from the VLSP 2013 Shared Task)
| Model | Accuracy (%) | Method / Reference | Code |
|---|---|---|---|
| PhoBERT-large | 96.8 | Nguyen et al. ArXiv'20 | Official |
| vELECTRA | 96.77 | Bui et al. ArXiv'20 | Official |
| PhoBERT-base | 96.7 | Nguyen et al. ArXiv'20 | Official |
| VnMarMoT | 95.88 | Nguyen et al. NAACL'18 | Official |
| BiLSTM-CRFs + CNN-char | 95.40 | Ma et al. ACL'16 | Link |
| BiLSTM-CRF + LSTM-char | 95.31 | Lample et al. NAACL'16 | Link |
| BiLSTM-CRF | 95.31 | Huang et al. ArXiv'15 | Link |
| RDRPOSTagger | 95.11 | Nguyen et al. EACL'14 | Official |
| JointWPD | 94.03 | Nguyen et al. '18 | |
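
The accuracy figures above are percentages. As a rough illustration of how such a score is typically computed, here is a minimal sketch of token-level accuracy. It assumes that gold and predicted tag sequences are aligned word for word (i.e. evaluation on gold word segmentation); the cited papers may differ in their exact evaluation setup. The function name `token_accuracy` and the toy tag sequences are hypothetical and only serve the example.

```python
from typing import Sequence


def token_accuracy(
    gold: Sequence[Sequence[str]], pred: Sequence[Sequence[str]]
) -> float:
    """Return the percentage of tokens whose predicted tag matches the gold tag."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences must have equal length")
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct / total


# Toy data (illustrative tags, not taken from the VLSP 2013 corpus).
gold_tags = [["N", "V", "N"], ["P", "V", "A", "CH"]]
pred_tags = [["N", "V", "V"], ["P", "V", "A", "CH"]]

score = token_accuracy(gold_tags, pred_tags)
print(f"{score:.2f}")  # 6 of 7 tokens correct -> 85.71
```

Counting over all tokens (rather than averaging per-sentence scores) keeps long sentences from being under-weighted, which is the usual convention for POS tagging.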

2. VietTreeBank

📁 Paper: VietTreeBank Paper
Dataset: train: 7,268 | dev: 1,038 | test: 2,077 sentences
| Model | Accuracy (%) | Method | Code | Note |
|---|---|---|---|---|
| BiLSTM-CRFs | 93.52 | Nguyen et al. '18 | Official | 10-fold CV |
| VNTagger | 93.40 | Le et al. TALN'10 | Official | 10-fold CV |
| RDRPOSTagger | 91.96 | Pham et al. IJCNLP'17 | Official | 5-fold CV |
| NNVLP | 91.92 | Pham et al. IJCNLP'17 | Official | 5-fold CV |
| vTools | 90.73 | Tran et al. VLSP'13 | Official | |
| Vitk | 88.41 | | Official | |
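
Several results above are reported under 5-fold or 10-fold cross-validation (CV), so they are not directly comparable with scores on a fixed train/dev/test split. The sketch below shows how k-fold CV partitions a corpus into folds. It assumes contiguous, unshuffled folds; the cited papers may have used different (e.g. shuffled) partitions, and the helper name `k_fold_indices` is hypothetical.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_items: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if k < 2 or k > n_items:
        raise ValueError("k must satisfy 2 <= k <= n_items")
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional item each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_items))
        yield train_idx, test_idx
        start = stop


# Toy example: 10 sentences split into 5 folds of 2 sentences each.
folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

In a CV setup, a tagger is trained once per fold on the training indices and scored on the held-out indices; the reported number is usually the mean accuracy over the k folds.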

3. Social Media POS Tagging

4. Miscellaneous Papers & Datasets

5. Tools, Demos & Open Source Code