The CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC) is the annual conference of CCF TCCI (Technical Committee of Chinese Information, China Computer Federation). The NLPCC conferences have been successfully held in Beijing (2012), Chongqing (2013), Shenzhen (2014), Nanchang (2015), Kunming (2016), Dalian (2017), Hohhot (2018) and Dunhuang (2019). This year's NLPCC conference will be held in Zhengzhou on October 14-18, 2020.
NLPCC 2020 will follow the NLPCC tradition of holding several shared tasks in natural language processing and Chinese computing. This year's shared tasks cover both classical and newly emerging problems, including Light Pre-Training Chinese Language Model for NLP Task, Multi-Aspect-based Multi-Sentiment Analysis, and Auto Information Extraction.
Participants from both academia and industry are welcome. Each group can participate in one or more tasks, and members of each group can attend the NLPCC conference to present their techniques and results. Participants will be invited to submit papers to the main conference, and accepted papers will appear in the conference proceedings published by Springer LNCS.
The top three participating teams of each task will receive certificates from NLPCC and the CCF Technical Committee on Chinese Information Technology. If a task has multiple sub-tasks, only the top participating team of each sub-task will be certified.
There are three shared tasks in this year's NLPCC conference; a detailed description of each task can be found in the released task guidelines. Here we give only a brief overview of each task.
◇ Task 1 - Light Pre-Training Chinese Language Model for NLP Task
The goal of this task is to train a lightweight language model that is still as powerful as normal-sized models. Each model will be tested on several different downstream NLP tasks. We will take the number of parameters, accuracy, and inference time as the metrics for measuring a model's performance (see the sketch below this task's contact information). To address the scarcity of Chinese corpora, we will provide a large Chinese corpus for this task and will release it to all researchers later.
Organizer: CLUE benchmark
Contact: CLUEbenchmark@163.com
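To make the three metrics concrete, below is a minimal, unofficial sketch (not the official evaluation script) of how parameter count, accuracy, and inference time could be measured for a candidate model using PyTorch and Hugging Face Transformers; the model name, example texts, and labels are placeholders, not part of the task specification.

```python
import time
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-chinese"  # placeholder; participants would load their own light model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

# Metric 1: number of parameters.
num_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {num_params / 1e6:.1f}M")

# Metrics 2 and 3: accuracy and wall-clock inference time on a downstream task.
# (The freshly initialized classification head is random; in practice the model
# would be fine-tuned on each downstream task before measuring accuracy.)
texts = ["这部电影很好看。", "服务太差了。"]  # stand-in evaluation examples
labels = torch.tensor([1, 0])                # stand-in gold labels

batch = tokenizer(texts, padding=True, return_tensors="pt")
start = time.perf_counter()
with torch.no_grad():
    logits = model(**batch).logits
elapsed = time.perf_counter() - start

accuracy = (logits.argmax(dim=-1) == labels).float().mean().item()
print(f"accuracy: {accuracy:.3f}, inference time: {elapsed * 1000:.1f} ms")
```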
◇ Task 2 - Multi-Aspect-based Multi-Sentiment Analysis (MAMS)
In existing aspect-based sentiment analysis (ABSA) datasets, most sentences contain either a single aspect or multiple aspects with the same sentiment polarity, which can make the ABSA task degenerate into sentence-level sentiment analysis. For NLPCC 2020, we manually annotated a large-scale restaurant review corpus for MAMS, in which each sentence contains at least two different aspects with different sentiment polarities. The MAMS task includes two subtasks, illustrated in the sketch below: (1) aspect term sentiment analysis (ATSA), which aims to identify the sentiment polarity towards given aspect terms, and (2) aspect category sentiment analysis (ACSA), which aims to identify the sentiment polarity towards pre-specified aspect categories. We will provide training and development sets for participating teams to build their models.
Organizer: Shenzhen Institutes of Advanced Technology (SIAT), Chinese Academy of Sciences and Harbin Institute of Technology (Shenzhen)
Contact: Min Yang (min.yang@siat.ac.cn) and Ruifeng Xu (xuruifeng@hit.edu.cn)
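As an illustration, the hypothetical example below shows the kind of annotation MAMS provides and how ATSA is often framed as sentence-pair classification; the field names and the example sentence are invented for exposition and are not the official data format.

```python
# Hypothetical MAMS-style example: one sentence, two aspects, two polarities.
mams_example = {
    "sentence": "The food was delicious but the service was painfully slow.",
    # ATSA: sentiment towards aspect *terms* that appear in the sentence.
    "aspect_terms": [
        {"term": "food", "polarity": "positive"},
        {"term": "service", "polarity": "negative"},
    ],
    # ACSA: sentiment towards pre-specified aspect *categories*,
    # which need not appear verbatim in the text.
    "aspect_categories": [
        {"category": "food", "polarity": "positive"},
        {"category": "service", "polarity": "negative"},
    ],
}

# A common ATSA baseline frames the problem as sentence-pair classification:
# the review sentence is paired with each aspect term and fed to a classifier.
for item in mams_example["aspect_terms"]:
    print((mams_example["sentence"], item["term"]), "->", item["polarity"])
```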
◇ Task 3 - Auto Information Extraction (AutoIE)
Entity extraction is a fundamental problem in language technology. Most previous work focuses on the scenario in which labelled data is provided for the entities of interest. However, entity categories can be hierarchical and sometimes cannot be enumerated, so a generic solution cannot rely on the assumption that enough labelled data is available. This task is to build IE systems from noisy and incomplete annotations: given a list of entities of a specific type and an unlabelled corpus containing these entities, the aim is to build an IE system that recognizes and extracts the entities of the given types (a minimal sketch of this setup follows below). The task setting is highly practical, so the proposed solutions may generalize well to real-world applications.
Organizer: Zhuiyi Technology
Contact: Xuefeng Yang (ryan@wezhuiyi.com)
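To make the setup concrete, here is a minimal sketch, with a made-up entity list and corpus, of how matching a given entity list against unlabelled text yields the noisy and incomplete annotations the task refers to: dictionary matches can be spurious, and entities missing from the list go unlabelled.

```python
# Minimal sketch: distant labelling by exact dictionary matching.
# The entity list and sentences are made-up examples, not task data.
entity_list = {"周杰伦": "PER", "上海": "LOC"}  # given entities of specific types
corpus = ["周杰伦昨天在上海举办了演唱会。", "他的新专辑下个月发行。"]

def distant_label(sentence, entities):
    """Produce character-level BIO tags by exact string matching."""
    tags = ["O"] * len(sentence)
    for entity, etype in entities.items():
        start = sentence.find(entity)
        while start != -1:
            tags[start] = f"B-{etype}"
            for i in range(start + 1, start + len(entity)):
                tags[i] = f"I-{etype}"
            start = sentence.find(entity, start + len(entity))
    return tags

for sent in corpus:
    # The second sentence mentions no listed entity and stays all "O",
    # illustrating how incomplete this kind of supervision can be.
    print(list(zip(sent, distant_label(sent, entity_list))))
```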
◇ Task 1 - Light Pre-Training Chinese Language Model for NLP Task
Register online with the following steps:
(1.1) Visit www.CLUEbenchmark.com and click the 【注册】 (Register) button at the top right corner of the page, then log in.
(1.2) Select 【NLPCC测评】 (NLPCC Evaluation) in the top navigation bar, then register for our task under 【比赛注册】 (Competition Registration).
◇ Task 2 - Multi-Aspect-based Multi-Sentiment Analysis (MAMS)
Please fill out the Shared Task 2 Registration Form (Word File) and send it to the following registration email.
Registration Email: lei.chen@siat.ac.cn
◇ Task 3 - Auto Information Extraction (AutoIE)
Please fill out the Shared Task 3 Registration Form (Word File) and send it to the following registration email.
Registration Email: ryan@wezhuiyi.com
2020/03/10: announcement of shared tasks and call for participation;
2020/03/10: registration open;
2020/03/25: release of detailed task guidelines & training data;
2020/05/01: registration deadline;
2020/05/15: release of test data;
2020/05/20: participants' results submission deadline;
2020/05/30: evaluation results release and call for system reports and conference papers;
2020/06/30: conference paper submission deadline (only for shared tasks);
2020/07/30: conference paper accept/reject notification;
2020/08/10: camera-ready paper submission deadline;
Evaluation papers must be written in English. They will appear in the proceedings of the NLPCC 2020 conference, which will be published as a volume in the Springer LNAI series (EI & ISTP indexed). Submissions should follow the LNCS formatting instructions, with a maximum paper length of 12 pages including references; manuscripts must be formatted in accordance with the standard Springer style sheets ([LaTeX][Microsoft Word]). Manuscripts should be submitted electronically, in PDF format, through the submission website (https://www.softconf.com/nlpcc/eval-2020). Email submissions will not be accepted.
Yunbo Cao, Tencent
Junyi Li, CLUE benchmark
Minglei Li, Huawei Cloud
Shoushan Li, Soochow University
Liang Xu, CLUE benchmark
Ruifeng Xu, Harbin Institute of Technology (Shenzhen)
Min Yang, Shenzhen Institutes of Advanced Technology (SIAT), Chinese Academy of Sciences
Xuefeng Yang, ZhuiYi Technology