Time: 09:00-12:00, 14th October, 2020
Title: Revisiting Pre-Trained Models for Natural Language Processing
Abstract: Pre-Trained Language Models (PLMs) have become fundamental elements of recent natural language processing research. In this tutorial, we will revisit the technical progress of text representations, from one-hot embeddings to the recent PLMs. We will describe several popular PLMs (such as BERT, XLNet, RoBERTa, ALBERT, and ELECTRA), covering their technical details and applications. We will also introduce various efforts on Chinese PLMs. At the end of the talk, we will analyze the shortcomings of recent PLMs and envision directions for future research.
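To make the "one-hot embedding to PLM" progression concrete, here is a minimal sketch of the contrast, assuming a toy four-word vocabulary and random dense vectors (all names and values are illustrative, not from the tutorial):

```python
import numpy as np

# Toy vocabulary; indices and words are illustrative assumptions.
vocab = {"the": 0, "bank": 1, "river": 2, "money": 3}
V, d = len(vocab), 3  # vocabulary size, embedding dimension

def one_hot(word):
    """Sparse one-hot vector: every pair of distinct words is orthogonal,
    so the representation carries no notion of word similarity."""
    v = np.zeros(V)
    v[vocab[word]] = 1.0
    return v

assert one_hot("bank") @ one_hot("river") == 0.0  # unrelated by construction
assert one_hot("bank") @ one_hot("money") == 0.0  # also unrelated

# A learned embedding table maps each word to a dense vector instead,
# so related words can end up with similar vectors (random weights here).
rng = np.random.default_rng(0)
E = rng.normal(size=(V, d))

def embed(word):
    return E[vocab[word]]

# Dense vectors are generically non-orthogonal; PLMs such as BERT go one
# step further and make the vector depend on the surrounding context, so
# "bank" near "river" and "bank" near "money" get different representations.
print(embed("bank") @ embed("river"))
```

The dense lookup table corresponds to static embeddings (word2vec, GloVe); the contextualization step is what the PLMs covered in the tutorial add on top.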
Speaker: Yiming Cui is a research director at iFLYTEK Research. He received his B.S. and M.S. degrees and is currently pursuing a doctoral degree at Harbin Institute of Technology (HIT), majoring in computer science. His main research interests include Machine Reading Comprehension (MRC), Question Answering (QA), and Pre-trained Language Models (PLMs). He participated in several international machine translation competitions, including IWSLT 2012, IWSLT 2014, and NIST OpenMT 15, winning several first prizes. He and his team have also achieved top rankings in several MRC competitions, including SQuAD, CoQA, QuAC, SemEval 2018, and HotpotQA. He organized the first MRC evaluation workshop (CMRC 2017) and its follow-up events. He has published more than 20 papers, including papers at top-tier NLP/AI conferences such as ACL, EMNLP, AAAI, COLING, and NAACL. He also serves as a reviewer for major top-tier NLP/AI conferences and ESI journals.
Time: 14:00-17:00, 14th October, 2020
Title: Machine Reasoning in NLP
Abstract: Machine learning and deep learning models have achieved great success on a wide range of NLP problems. However, the majority of these models are black boxes that lack transparency in their decision-making process. In addition, traditional methods depend heavily on annotated data while neglecting important knowledge from domain experts.
This talk will cover three active knowledge-based machine reasoning pipelines. In the first section, I will talk about first-order logic, from its standard inference algorithms to recent extensions in NLP, including neural theorem proving and regularizing neural models with logical constraints.
In the second section, I will introduce neural-symbolic models, which are equipped with logical forms that can execute against, or interact with, the environment. Applications will include semantic parsers with discrete executors and neural module networks that are guided by logical forms yet learned in an end-to-end fashion. In the third section, I will introduce evidence-based models, which use external evidence through single-turn or multi-turn retrieval.
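As a minimal sketch of what "a logical form with a discrete executor" means, here is a hypothetical executor evaluating nested logical forms against a toy knowledge base (the operators, entities, and numbers are all made-up illustrations, not the talk's actual system):

```python
# Toy environment the logical forms execute against.
KB = {
    "capital": {"France": "Paris", "Japan": "Tokyo"},
    "population": {"Paris": 2_100_000, "Tokyo": 14_000_000},
}

def execute(form):
    """Recursively evaluate a logical form (nested tuples) against KB."""
    if isinstance(form, str):          # a constant, i.e., an entity name
        return form
    op, *args = form
    vals = [execute(a) for a in args]  # evaluate sub-forms bottom-up
    if op in KB:                       # relation lookup, e.g. capital(France)
        return KB[op][vals[0]]
    if op == "argmax":                 # pick the entity with larger population
        return max(vals, key=lambda e: KB["population"][e])
    raise ValueError(f"unknown operator: {op}")

# "Which capital is larger, France's or Japan's?"
form = ("argmax", ("capital", "France"), ("capital", "Japan"))
print(execute(form))  # → Tokyo
```

In a neural-symbolic semantic parser, a neural model would predict `form` from the question; the point of the discrete executor is that the answer is then obtained by symbolic evaluation rather than by the network directly.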
Speaker: Dr. Duyu Tang is a Senior Researcher in the Natural Language Computing group of Microsoft Research Asia, working on natural language processing (NLP). Duyu joined Microsoft Research Asia in 2016, after receiving his Ph.D. and M.S. in Computer Science from Harbin Institute of Technology. Duyu’s research has been advancing the state of the art of robust, explainable, and trustworthy NLP systems, while making direct technical contributions to production. Over the years, Duyu has worked on a wide range of NLP problems, from sentiment analysis, question answering, conversational semantic parsing, knowledge-driven machine reasoning, fact checking, and fake news detection, to AI for software engineering. Duyu has served on the program committees of top NLP/AI conferences: ACL (2016-2020), EMNLP (2015-2020), NeurIPS (2018/2020), TACL (2020-2022), NAACL (2018/2019), and ICLR (2021). He was an area chair for EMNLP 2020. Duyu received the Most Influential Scholar Award Honorable Mention (Rank #65, 2009–2019) from AMiner and is a recipient of the CIPS Best Ph.D. Thesis Award in 2016.
Time: 19:00-22:00, 14th October, 2020
Title: Frontiers in GCN and Network Embedding
Abstract: Nowadays, ever larger and more sophisticated networks are used in an ever wider range of applications. It is well recognized that network data is complex and challenging. To process graph data effectively, the first critical challenge is network data representation: how to represent networks properly so that advanced analytic tasks, such as pattern discovery, analysis, and prediction, can be conducted efficiently in both time and space. In this talk, I will introduce recent trends and the latest progress on network embedding and GCNs, including disentangled GCNs, anti-attack GCNs, and automated machine learning for network embedding.
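For readers new to GCNs, the basic building block the talk's variants extend is a single graph convolution layer. A minimal numpy sketch of the standard propagation rule (Kipf-and-Welling-style symmetric normalization; the graph and weights here are toy assumptions):

```python
import numpy as np

# One GCN layer: H' = ReLU( D^{-1/2} (A + I) D^{-1/2} H W )
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)  # toy 4-node path graph

A_hat = A + np.eye(4)                      # add self-loops
deg = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(deg ** -0.5)
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalization

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 5))                # node features (4 nodes, 5 dims)
W = rng.normal(size=(5, 2))                # layer weights (random, not trained)

H_next = np.maximum(A_norm @ H @ W, 0.0)   # propagate + ReLU
print(H_next.shape)  # (4, 2)
```

Each node's new representation mixes its own features with its neighbors'; disentangled and anti-attack GCNs modify how this aggregation step is structured or defended.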
Speaker: Dr. Peng Cui is a tenured Associate Professor at Tsinghua University. He received his Ph.D. from Tsinghua University in 2010. His research interests include network representation learning, causally regularized machine learning, and social dynamics modeling. He has published more than 100 papers in prestigious data mining and multimedia conferences and journals. His recent research has won the IEEE Multimedia Best Department Paper Award, SIGKDD 2016 Best Paper Finalist, ICDM 2015 Best Student Paper Award, SIGKDD 2014 Best Paper Finalist, IEEE ICME 2014 Best Paper Award, ACM MM12 Grand Challenge Multimodal Award, and MMM13 Best Paper Award. He is an associate editor of IEEE TKDE, IEEE TBD, ACM TIST, and ACM TOMM, and was a program co-chair of ACM CIKM19 and MMM2020. He is a Senior Member of CCF and IEEE.
Time: 09:00-12:00, 15th October, 2020
Title: The Machine Reading Comprehension (MRC) Framework as a Universal Solution to Various NLP Tasks
Abstract: In this presentation, I will show how to formalize various NLP tasks (e.g., NER, coreference resolution, relation extraction, and text classification) under the framework of SQuAD-style machine reading comprehension (MRC). By taking advantage of the prior knowledge provided by the MRC framework, we observe not only significant performance boosts but also improved domain-adaptation and zero-shot learning capabilities.
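To illustrate the recasting the abstract describes, here is a hypothetical sketch of turning NER into SQuAD-style (query, context, answer-span) examples; the entity types, query wordings, and sentence are made-up illustrations, not the talk's actual formulation:

```python
# One natural-language query per entity type; the query text itself
# carries the prior knowledge the MRC framing exploits.
QUERIES = {
    "PER": "Find all person names in the text.",
    "LOC": "Find all locations in the text.",
}

def ner_as_mrc(context, annotations):
    """Turn (context, {entity_type: [char spans]}) into MRC examples."""
    examples = []
    for ent_type, query in QUERIES.items():
        spans = annotations.get(ent_type, [])   # [] means "no answer"
        examples.append({"query": query, "context": context, "answers": spans})
    return examples

context = "Ada Lovelace was born in London."
annotations = {"PER": [(0, 12)], "LOC": [(25, 31)]}
for ex in ner_as_mrc(context, annotations):
    print(ex["query"], "->", [context[s:e] for (s, e) in ex["answers"]])
```

A standard span-extraction MRC model can then be trained on such examples; unseen entity types can be targeted at test time just by writing a new query, which is one route to the zero-shot behavior mentioned above.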
Speaker: Dr. Jiwei Li is a co-founder of ShannonAI. He obtained his Ph.D. from the Computer Science department at Stanford University. His main research focus is natural language processing and deep learning. He was named to MIT Technology Review's global "35 Innovators Under 35" list in 2020.
Time: 14:00-17:00, 15th October, 2020
Title: Semantic Learning and Inference Across Vision and Language
Abstract: Recent years have witnessed advances in research across computer vision and natural language processing, with many exciting downstream applications, including image captioning, visual storytelling, visual question answering, and visual navigation. Although significant progress has been made on these applications in terms of automatic metrics, a huge gap between the two modalities still exists and limits further improvement of current models.
In this tutorial, we will focus on cutting-edge research on semantic learning and inference across vision and language. Topics include visual theme understanding for text generation, pre-training techniques for vision and language, and multi-modal knowledge graph construction and inference.
Speaker: Dr. Zhongyu Wei is an Associate Professor in the School of Data Science at Fudan University, and he serves as the secretary of the Social Media Processing (SMP) committee of the Chinese Information Processing Society of China (CIPS). At Fudan, he is the director of the Data Intelligence and Social Computing Research Lab (Fudan DISC) and a member of a larger NLP group directed by Prof. Xuanjing Huang. Before joining Fudan, he was a postdoctoral researcher in the Human Language Technology Research Institute at the University of Texas at Dallas. He received his Ph.D. from The Chinese University of Hong Kong in 2014. His research focuses on natural language processing and machine learning, with special emphasis on multi-modal information understanding and generation across vision and language, argumentation mining, and some cross-disciplinary topics. He has published more than 60 papers at top-tier conferences in related research fields, including ACL, EMNLP, ICML, ICLR, IJCAI, and AAAI. He served as an area co-chair for multi-modality at EMNLP 2020.
Speaker: Dr. Meng Wang is an Assistant Professor in the Knowledge Graph & AI Research Group, School of Computer Science and Engineering, Southeast University, China. He is also a SEU Zhishan Young Scholar (awarded in 2019). He obtained his doctoral degree from the Department of Computer Science and Technology, Xi’an Jiaotong University in 2018. He was a visiting scholar in the DKE lab at the University of Queensland, Australia, in 2016. His research areas include cross-modal data, knowledge graphs (KG), semantic search, and NLP. He has published more than 30 papers in top-tier conferences such as ISWC, ICSE, AAAI, and IJCAI.