Shaolei Wang (王少磊)

Ph.D. candidate, Language analysis group in HIT-SCIR | mail: slwang (at) | [Get CV in PDF]

My research interests are in Natural Language Processing, including disfluency detection, grammatical error detection and correction, and joint entity and relation extraction. My supervisor is Wanxiang Che.

Ph.D. candidate, Harbin Institute of Technology, 2015.9 - present
Major: Computer Science

M.S., Harbin Institute of Technology, 2014.9 - 2015.7
Major: Computer Science

B.E., Harbin Institute of Technology, 2010.9 - 2014.7
Major: Computer Science

Shaolei Wang, Baoxin Wang, Jiefu Gong, Zhongyuan Wang, Xiao Hu, Xingyi Duan, Zizhuo Shen, Gang Yue, Ruiji Fu, Dayong Wu, Wanxiang Che, Shijin Wang, Guoping Hu, Ting Liu. Combining ResNet and Transformer for Chinese Grammatical Error Diagnosis. In the 6th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2020).

Shaolei Wang, Zhongyuan Wang, Wanxiang Che and Ting Liu. Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP2020). | [code]

Shaolei Wang, Wanxiang Che, Qi Liu, Pengda Qin, Ting Liu and William Yang Wang. Multi-Task Self-Supervised Learning for Disfluency Detection. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI2020).

Shaolei Wang, Yue Zhang, Wanxiang Che and Ting Liu. Joint Extraction of Entities and Relations Based on a Novel Graph Scheme. In Proceedings of the 27th International Joint Conference on Artificial Intelligence and the 23rd European Conference on Artificial Intelligence (IJCAI2018). | [code] | [slide]

Shaolei Wang, Wanxiang Che, Yue Zhang, Meishan Zhang and Ting Liu. Transition-Based Disfluency Detection using LSTMs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP2017). | [code] | [poster]

Shaolei Wang, Wanxiang Che and Ting Liu. A Neural Attention Model for Disfluency Detection. In Proceedings of the 26th International Conference on Computational Linguistics (COLING2016). | [slide]

Shaolei Wang, Wanxiang Che, Yijia Liu and Ting Liu. Enhancing Neural Disfluency Detection with Hand-Crafted Features. In Proceedings of China National Conference on Chinese Computational Linguistics (CCL2016).

Conference Reviewer/Secondary Reviewer: AAAI2018, ACL2018, TALLIP2018, COLING2018, ACL2019, EMNLP2019, CCL2019, ICPCSEE2019.

Academic/Industrial Experiences
Visiting Ph.D. Student, University of California, Santa Barbara. 2018.09 - 2019.09
Worked with Dr. William Wang on disfluency detection and grammatical error detection.

Visiting Ph.D. Student, Singapore University of Technology and Design. 2017.03 - 2017.12
Worked with Dr. Yue Zhang on disfluency detection and entity relation extraction.

Intern Researcher and Developer, iFLYTEK Inc., Research and Development Institute, Hefei. 2015.6 - 2015.9
Worked on Punctuation and Disfluency Detection.

Intern Researcher and Developer, Baidu Inc., NLP Department. 2014.6 - 2014.9
Worked on Query-based Synonym Mining.

Chinese Grammatical Error Diagnosis System, 2021.3 - present
Supported by Sina·MData. We are building a Chinese grammatical error diagnosis system for

Our system (Flying) achieved two first places (on the most valuable indicators), one second place and one third place
across six tracks, among 43 submitted systems from 17 teams. Chinese Grammatical Error Diagnosis (CGED) aims to diagnose
four types of grammatical errors: missing words (M), redundant words (R), bad word selection (S) and disordered words (W).
An automatic CGED system consists of two parts: error detection and error correction.

End-to-end Entity mention and Relation Extraction, 2017.3 - 2017.12
Supported by bluepool. The goal of end-to-end entity mention and relation extraction is to
discover relational structures of entity mentions from unstructured texts. Previous work mainly focuses on sequence-level
relation extraction.

Disfluency Detection, 2014.10 - 2016.10
Supported by iFLYTEK. The purpose of disfluency detection is to detect infelicities in spoken language transcripts.
We first frame disfluency detection as a sequence-to-sequence problem and propose a neural attention-based model
which achieves the best reported performance.

Query-based Synonym Mining, 2014.6 - 2014.9
Supported by Baidu. Based on the assumption that synonyms should attract similar user clicks when queried,
we used a search engine's query log for synonym mining and obtained about a million synonyms.

National Graduate Scholarship (Ph.D.), 2018.9
National Graduate Scholarship (Master), 2014.9

TA, Python Programming, 2014 fall

Programming Languages: C, C++, Python