NLPCC 2017 Events
NLPCC 2017 Session Program Detail
Best Papers | 2017-11-12 09:30-10:40 | 1F Joyous Gathering Palace A (1F 聚和宫A区) | Chair: TBD
09:40-10:10
Long Zhou, Jiajun Zhang and Chengqing Zong ABSTRACT: The attention model has become a standard component in neural machine translation (NMT); it guides the translation process by selectively focusing on parts of the source sentence when predicting each target word. However, we find that the generation of a target word depends not only on the source sentence but also relies heavily on the previously generated target words, especially the distant ones, which are difficult to model with recurrent neural networks. To solve this problem, we propose a novel look-ahead attention mechanism for generation in NMT, which aims at directly capturing the dependency relationships between target words. We further design three patterns to integrate our look-ahead attention into the conventional attention model. Experiments on NIST Chinese-to-English and WMT English-to-German translation tasks show that our proposed look-ahead attention mechanism achieves substantial improvements over state-of-the-art baselines.
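The mechanism described in this abstract can be sketched roughly as follows. This is an illustrative simplification, not the authors' formulation: it uses dot-product scoring and one hypothetical integration pattern (concatenation); all function names are invented for the sketch.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Standard attention: weight each value vector by the softmaxed
    dot-product score between the query and the matching key."""
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def decode_step(state, src_annotations, prev_target_states):
    """One decoder step combining conventional source-side attention
    with a look-ahead attention over previously generated target states."""
    # Conventional attention over the source sentence.
    src_ctx = attend(state, src_annotations, src_annotations)
    # Look-ahead attention: attend over earlier target-side states to
    # capture target-target dependencies directly.
    if prev_target_states:
        tgt_ctx = attend(state, prev_target_states, prev_target_states)
    else:
        tgt_ctx = [0.0] * len(state)
    # One possible integration pattern: concatenate both contexts.
    return state + src_ctx + tgt_ctx
```

The paper integrates the two attentions via three patterns; concatenation here merely stands in for whichever combination a real decoder would use before the output projection.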
10:10-10:40
Tianyu Liu, Bingzhen Wei, Baobao Chang and Zhifang Sui ABSTRACT: Numerous machine learning tasks have achieved substantial advances with the help of large-scale supervised learning corpora over the past decade. However, no large-scale question-answer corpus is available for Chinese question answering over knowledge bases. In this paper, we present a 28M Chinese Q&A corpus based on the Chinese knowledge base provided by the NLPCC 2017 KBQA challenge. We propose a novel neural network architecture which combines a template-based method with seq2seq learning to generate highly fluent and diverse questions. Both automatic and human evaluation results show that our model achieves outstanding performance (76.8 BLEU and 43.1 ROUGE). We also propose a new statistical metric called DIVERSE to measure the linguistic diversity of generated questions and show that our model generates much more diverse questions than other baselines.
Industrial Forum | 2017-11-11 11:40-12:10 | 1F Joyous Gathering Palace A (1F 聚和宫A区) | Chair: TBD
11:40-11:55
NLP Challenges in Knowledge-Driven Spoken Dialogue Systems Dr. Min CHU ABSTRACT: In recent years, the spoken dialogue system has become the most important component in many AI applications such as personal assistants and smart speakers. The goal of most of these dialogue systems is to complete predefined tasks. Yet there are more complicated scenarios, such as intelligent customer service and intelligent business representatives, where knowledge is organized as an aggregation of multiple knowledge representations such as QA pairs, knowledge graphs, and tables. In this presentation, I will talk about the need for building knowledge-driven spoken dialogue systems and the NLP challenges in such systems. SHORT BIO: Dr. Chu recently joined AISpeech as vice president, responsible for building up the AISpeech Beijing R&D center. The new R&D center will focus on developing key technologies in knowledge-driven spoken dialogue systems and exploring new business and application opportunities. Before joining AISpeech, Dr. Chu spent eight years with Alibaba, leading R&D efforts in the speech interaction area and supporting the speech interaction needs within the company (Yun OS, Alipay, Taobao, DingDing, etc.). Before Alibaba, Dr. Chu worked at MSRA for about 10 years. Her main research interests are in ASR, TTS, NLP, machine learning, and big data. She has published 100+ academic papers and filed 30+ patents.
11:55-12:10
A Roadmap of Practical Machine Translation System Dr. Shiqi LI ABSTRACT: Recently, neural machine translation has become the favored technology in machine translation systems. Most research focuses on how to enhance the neural networks, while other work focuses on changing the network structure. This talk describes how to build a practical machine translation system. The roadmap covers system training, optimization, deployment, and logging, and includes the most useful tricks for training on big data. Several methods will be introduced to optimize the model without much additional time cost during decoding. At the deployment stage, a highly available framework for a machine translation system will be described. Finally, a solution will be provided for how to analyze a translation system through a logging system. SHORT BIO: Shiqi Li, Ph.D. in Computer Application Technology, Harbin Institute of Technology; Assistant Dean of the 2020 Cognitive Intelligence Research Institute, GTCOM Co., Ltd. Research interests: machine learning, natural language processing, and cognitive modeling.
NLP Fundamentals I | 2017-11-11 14:00-15:20 | 1F Joyous Gathering Palace A (1F 聚和宫A区) | Chair: TBD
14:00-14:20 |
Fang Kong and Guodong Zhou ABSTRACT: Chinese zero pronoun (ZP) resolution plays a critical role in discourse analysis. Different from traditional mention-to-mention approaches, this paper proposes a chain-to-chain approach to improve the performance of ZP resolution from three aspects. Firstly, consecutive ZPs are clustered into coreferential chains, each working as one independent anaphor as a whole. In this way, those ZPs far away from their overt antecedents can be bridged via other consecutive ZPs in the same coreferential chains and thus better resolved. Secondly, common noun phrases (NPs) are automatically grouped into coreferential chains using traditional approaches, each working as one independent antecedent candidate as a whole. Then, ZP resolution is performed between ZP coreferential chains and common NP coreferential chains. In this way, the performance can be much improved due to the effective reduction of the search space by pruning singletons and negative instances. Finally, additional features from ZP and common NP coreferential chains are employed to better represent anaphors and their antecedent candidates, respectively. Comprehensive experiments on the OntoNotes corpus show that our chain-to-chain approach significantly outperforms the state-of-the-art mention-to-mention approaches. To our knowledge, this is the first work to resolve zero pronouns in a chain-to-chain way.
14:20-14:40 |
Chen Sheng, Fang Kong and Guodong Zhou ABSTRACT: Chinese zero pronoun (ZP) resolution plays an important role in natural language understanding. This paper focuses on improving Chinese ZP resolution from a discourse perspective. In particular, various kinds of discourse information are employed in both stages of ZP resolution. During the ZP detection stage, we first propose an elementary discourse unit (EDU) based method to generate ZP candidates from a discourse perspective and then exploit relevant discourse context to help better identify ZPs. During the ZP resolution stage, we employ a tree-style discourse rhetorical structure to improve the resolution. Evaluation on OntoNotes shows the significant importance of discourse information to the performance of ZP resolution. To the best of our knowledge, this is the first work to improve Chinese ZP resolution from a discourse perspective.
14:40-15:00 |
Xihan Yue, Luoyi Fu and Xinbing Wang ABSTRACT: Without discourse connectives, recognizing implicit discourse relations is a great challenge and a bottleneck for discourse parsing. The key factor lies in properly representing the two discourse arguments as well as modeling their interactions. This paper proposes two novel neural networks, i.e., the externally controllable LSTM (ECLSTM) and the attention-augmented GRU (AAGRU), which can be stacked to incorporate the arguments' interactions into their representation process. The two networks are variants of the Recurrent Neural Network (RNN) but are equipped with externally controllable cells so that their working processes can be dynamically regulated. ECLSTM is relatively conservative and easily comprehensible, while AAGRU works better for small datasets. A multilevel RNN with a smaller hidden state allows critical information to be gradually exploited, and thus enables our model to fit deeper structures with slightly increased complexity. Experiments on the Penn Discourse Treebank (PDTB) benchmark show that our method achieves significant performance gains over vanilla LSTM/CNN models and is competitive with previous state-of-the-art models.
15:00-15:20 |
Shichao Sun and Zhipeng Xie ABSTRACT: Readability assessment plays an important role in selecting proper reading materials for language learners, and is applicable to many NLP tasks such as text simplification and document summarization. In this study, we designed 100 factors to systematically evaluate the impact of four levels of linguistic features (shallow, POS, syntactic, discourse) on predicting text difficulty for L1 Chinese learners. We further selected 22 significant features with regression. Our experimental results show that the 100-feature model and the 22-feature model both achieve the same predictive accuracy as the BOW baseline for the majority of the text difficulty levels, and are significantly better than the baseline for the others. Using 18 of the 22 features, we derived one of the first readability formulas for the contemporary simplified Chinese language.
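Two "shallow" features of the kind this abstract mentions can be computed very simply. This sketch is illustrative only: the feature names and the sentence-splitting heuristic are assumptions, not part of the paper's 100-factor set.

```python
def shallow_features(text):
    """Compute two illustrative shallow readability features for a
    Chinese text: average sentence length in characters, and the
    character type-token ratio (lexical variety of characters)."""
    # Crude sentence splitting on common Chinese end punctuation.
    normalized = text.replace('！', '。').replace('？', '。')
    sentences = [s for s in normalized.split('。') if s]
    chars = [c for s in sentences for c in s]
    avg_sent_len = len(chars) / len(sentences) if sentences else 0.0
    ttr = len(set(chars)) / len(chars) if chars else 0.0
    return {"avg_sentence_length": avg_sent_len, "type_token_ratio": ttr}
```

A real readability model would feed dozens of such factors, across all four linguistic levels, into a regression over graded texts.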
Machine Translation I | 2017-11-11 14:00-15:20 | 2F Meeting Room 5 (2F 5号会议室) | Chair: TBD
14:00-14:20 |
Shuangzhi Wu, Dongdong Zhang, Shujie Liu and Ming Zhou ABSTRACT: Contextual information is very important for selecting the appropriate phrases in statistical machine translation (SMT). The selection of different target phrases is sensitive to different parts of the source context. Previous approaches based on either local contexts or global contexts neglect the impacts of different contexts and are not always effective at disambiguating translation candidates. As a matter of fact, the indicative contexts are expected to play more important roles in disambiguation. In this paper, we propose to leverage the indicative contexts for translation disambiguation. Our model assigns phrase pairs confidence scores based on different source contexts, which are then integrated into the SMT log-linear model to help select translation candidates. Experimental results show that our proposed method significantly improves translation performance on the NIST Chinese-to-English translation tasks compared with the state-of-the-art SMT baseline.
14:20-14:40 |
Li Shaotong, Xu JinAn, Miao Guoyi, Zhang Yujie and Chen Yufeng ABSTRACT: Unknown words in neural machine translation (NMT) not only affect the semantic integrity of the source sentences but also adversely affect the generation of the target sentences. Traditional methods usually replace unknown words according to the similarity of word vectors, but these approaches have difficulty dealing with rare words and polysemous words. Therefore, this paper proposes a new method of unknown-word processing in NMT based on the semantic concepts of the source language. Firstly, we use the semantic concepts of a source-language semantic dictionary to find candidate in-vocabulary words. Secondly, we propose a method that calculates semantic similarity by integrating the source language model and the semantic concept network to obtain the best replacement word. Experiments on an English-to-Chinese translation task demonstrate that our proposed method can achieve more than 2.6 BLEU points over the conventional NMT method. Compared with the traditional method based on word vector similarity, our method also obtains an improvement of nearly 0.8 BLEU points.
14:40-15:00 |
Huanqin Wu, Hongyang Zhang, Jingmei Li, Junguo Zhu and Muyun Yang ABSTRACT: To provide sufficient training data for a neural translation quality estimation model, a pseudo-data construction method for the target dataset is proposed. The model is trained in two stages: pre-training on the pseudo data followed by fine-tuning, and experiments with different pseudo-data scales are carried out. The experimental results show that the machine translation quality estimation model trained on the pseudo data achieves a significantly improved correlation between its scores and human scores.
15:00-15:20 |
Yiming Tan, Mingwen Wang and Maoxi Li ABSTRACT: Automatic post-editing (APE) aims to correct machine translation errors by rule-based or statistical methods; it plays an important role in the application and popularization of machine translation. Through statistical analysis of the training set released by the WMT APE shared task, we found that more than half of the machine translations need only a small number of edit operations. To reduce the over-editing problem, we propose to take advantage of neural post-editing (NPE) to build two special models, one providing minor edit operations and the other providing a single edit operation, and to take advantage of machine translation quality estimation to establish a filtering algorithm that integrates the special models with the regular NPE model into a joint model. Experimental results on the test set of the WMT16 APE shared task show that the proposed approach significantly outperforms the baseline. Further analysis confirms that our approach brings considerable relief from the over-editing problem in APE.
Shared Task Workshop | 2017-11-11 14:00-15:20 | 2F Meeting Room 6 (2F 6号会议室) | Chair: TBD
14:00-14:15 |
Zhongbo Yin, Jintao Tang, Chengsen Ru, Wei Luo, Zhunchen Luo and Xiaolei Ma ABSTRACT: Recently there has been increasing research interest in short text such as news headlines. Due to the inherent sparsity of short text, current text classification methods perform badly when applied to the classification of news headlines. To overcome this problem, a novel method which enhances the semantic representation of headlines is proposed in this paper. Firstly, we add keywords extracted from the most similar news to expand the word features. Secondly, we use a corpus in the news domain to pre-train the word embeddings so as to enhance the word representation. Moreover, the fastText classifier, which uses a linear method to classify text with fast speed and high accuracy, is adopted for news headline classification. The method is evaluated on the Chinese news headline categorization task of NLPCC 2017.
14:15-14:30 |
Lu Zhonglei, Liu Wenfen, Zhou Yanfang, Hu Xuexian and Wang Binyu ABSTRACT: For NLPCC 2017 shared task 2, we propose an efficient approach to Chinese news headline classification based on a multi-representation mixed model with attention and ensemble learning. Firstly, we model the headline semantics at both the character and word level via a Bi-directional Long Short-Term Memory (BiLSTM) network, with the concatenation of the hidden-layer output states as the semantic representation. Meanwhile, we adopt an attention mechanism to highlight the key characters or words related to the classification decision, and we obtain a preliminary test result. Then, for samples with a lower confidence level in the preliminary result, we utilize ensemble learning to determine their final category by sub-model voting. On the NLPCC 2017 official test set, the overall F1 score of our model reached 0.8176, ranking third.
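The confidence-gated fallback to voting described in this abstract can be sketched in a few lines. This is an assumption-laden illustration, not the authors' system: the threshold value, the function names, and modeling sub-models as plain callables are all invented for the sketch.

```python
from collections import Counter

def ensemble_predict(sample, base_model, sub_models, threshold=0.7):
    """If the base model's confidence clears the threshold, keep its
    prediction; otherwise fall back to majority voting among the
    sub-models (the ensemble-learning step for low-confidence cases)."""
    label, confidence = base_model(sample)
    if confidence >= threshold:
        return label
    votes = Counter(model(sample) for model in sub_models)
    return votes.most_common(1)[0][0]
```

The design point is that the expensive ensemble is only consulted where the single model is unsure, which keeps most predictions cheap.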
14:30-14:45 |
Yimeng Zhuang, Wang Xianliang, Han Zhang, Jinghui Xie and Xuan Zhu ABSTRACT: As an important step in human-computer interaction, conversation generation has attracted much attention in recent years. This paper gives a detailed description of an ensemble system for short-text conversation generation. The proposed system consists of four subsystems: a quick response-candidate selection module, an information retrieval system, a generation-based system, and an ensemble module. An advantage of this system is that multiple versions of generated responses are taken into account, resulting in a more reliable output. In the NLPCC 2017 shared task "Emotional Conversation Generation Challenge", the ensemble system generates appropriate responses for Chinese SNS posts and ranks at the top of the participant list.
14:45-15:00 |
Liwei Hou, Po Hu and Chao Bei ABSTRACT: Due to the difficulty of abstractive summarization, the great majority of past work on document summarization has been extractive, while the recent success of the sequence-to-sequence framework has made abstractive summarization viable, and a set of recurrent neural network models based on the attention encoder-decoder has achieved promising performance on short-text summarization tasks. Unfortunately, these attention encoder-decoder models often suffer from the undesirable shortcomings of generating repeated words or phrases and the inability to deal with out-of-vocabulary words appropriately. To address these issues, in this work we propose to add an attention mechanism on the output sequence to avoid repetitive content, and to use the subword method to deal with rare and unknown words. We applied our model to the public dataset provided by NLPCC 2017 shared task 3. The evaluation results show that our system achieved the best ROUGE performance among all the participating teams and is also competitive with some state-of-the-art methods.
15:00-15:15 |
Lingfei Qian, Anran Wang, Yan Wang, Yuhang Huang, Jian Wang and Hongfei Lin ABSTRACT: With the popularity of the mobile Internet, many social networking applications provide users with the ability to share their personal information. It is of high commercial value to leverage users' personal information such as tweets, preferences, and locations for user profiling. The task comprises two subtasks. Subtask one is to predict the Point of Interest (POI) a user will check in at; we adopted a combination of multiple approaches, including user-based collaborative filtering (CF) and social-based CF, to predict the locations. Subtask two is to predict the users' gender; we divided the users into two groups depending on whether the user has posted or not, and treated this subtask as a classification task. Our results achieved first place in both subtasks.
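The user-based CF component mentioned for the POI subtask can be sketched as follows. This is a minimal sketch under stated assumptions: Jaccard overlap of check-in sets as the user-similarity measure and a simple weighted count as the ranking score; neither is claimed to be what this system actually used.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def predict_poi(user, checkins):
    """User-based CF sketch: score each POI the target user has not
    visited by summing the similarities of the users who did visit it,
    then return the highest-scoring candidate (or None)."""
    visited = set(checkins[user])
    scores = {}
    for other, pois in checkins.items():
        if other == user:
            continue
        sim = jaccard(checkins[user], pois)
        for poi in pois:
            if poi not in visited:
                scores[poi] = scores.get(poi, 0.0) + sim
    return max(scores, key=scores.get) if scores else None
```

A social-based variant would replace the similarity term with a weight derived from the social graph (e.g. friendship links) rather than check-in overlap.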
Student Workshop | 2017-11-11 14:00-15:20 | 2F Meeting Room 7 (2F 7号会议室) | Chair: TBD
14:00-14:30 |
AI-assisted Content Creation and Commenting Lei Li ABSTRACT: In the mobile era, we are presented with an exciting opportunity to shape the way people acquire and consume information. We believe that AI will fundamentally change the way people connect with information, and we can use AI to improve the effectiveness and efficiency of the entire process of content creation, moderation, dissemination, consumption, and interaction. In this talk, we will introduce the roles of AI technologies in information consumption platforms. We will share several pieces of recent work at Toutiao AI Lab towards more efficient information creation and interaction. We will introduce a robot writer, Xiaomingbot, which has produced over 6000 articles since August 2016. We will present a deep-learning based model that answers factoid questions with state-of-the-art accuracy. We will also introduce our latest research in the visual understanding of objects and scenes in short videos, and how these technologies assist authors in creating better content. SHORT BIO: Dr. Lei Li is a research scientist and tech director at Toutiao AI Lab. Before Toutiao, he worked at Baidu's Institute of Deep Learning in Silicon Valley as a Principal Research Scientist (少帅学者). Previously, he worked in the EECS department of UC Berkeley as a Post-Doctoral Researcher. He interned at Microsoft Research (Asia and Redmond), Google (Mountain View), and IBM (TJ Watson Research Center). His research interests lie at the intersection of deep learning, statistical inference, natural language understanding, and time series analysis. He served as a co-chair of KDD Cup 2017 and as a PC member of many conferences such as KDD, NIPS, ICML, AAAI, and IJCAI. He has published over 30 papers on probabilistic programming, Bayesian inference, time series prediction, question answering, and semantic parsing. Lei received his B.S. in Computer Science and Engineering from Shanghai Jiao Tong University (ACM class) and his Ph.D. in Computer Science from Carnegie Mellon University. His dissertation work on fast algorithms for mining co-evolving time series was awarded the ACM KDD best dissertation (runner-up).
14:30-15:00 |
A Path out of the Nightmare: How to Survive in Graduate Schools Jiwei Li SHORT BIO: Jiwei Li received his B.S. in Biology from Peking University (2008-2012) and his Ph.D. in Computer Science from Stanford University (2014-2017). He was a winner of the Facebook Fellowship in 2015 and the Baidu Fellowship in 2016. He works on natural language processing, advised by Prof. Dan Jurafsky.
15:00-15:20 |
Tense Analysis in Chinese Dialogue: From Everyday Language Phenomena to NLP Research Tao Ge ABSTRACT: For students learning and doing research in natural language processing, finding a good entry point for research is often a difficult problem. Reading a large number of papers is certainly an effective way to find one, but beyond that, everyday life also hides many interesting NLP problems worth studying. In this talk, drawing on my own research experience with tense in Chinese dialogue, I will discuss how to discover and study NLP problems by observing interesting language phenomena in everyday life. SHORT BIO: Tao Ge is an associate researcher in the Natural Language Computing group at Microsoft Research Asia. He received his Ph.D. in Computer Science from Peking University in 2017. His research interests include information extraction from text streams, knowledge mining, and Chinese information processing. He has published several papers at top computational linguistics conferences such as ACL, EMNLP, and COLING.
NLP Fundamentals II | 2017-11-11 15:40-17:10 | 1F Joyous Gathering Palace A (1F 聚和宫A区) | Chair: TBD
15:40-16:00 |
Qian Yan, Chenlin Shen, Shoushan Li, Fen Xia and Zekai Du ABSTRACT: Previous studies normally formulate Chinese word segmentation as a character sequence labeling task and optimize the solution at the sentence level. In this paper, we address Chinese word segmentation as a document-level optimization problem. First, we apply a state-of-the-art approach, i.e., long short-term memory (LSTM), to perform character classification; then, we propose a global objective function on the basis of the character classification and achieve global optimization via Integer Linear Programming (ILP). Specifically, we propose several kinds of global constraints in ILP to capture various segmentation knowledge, such as segmentation consistency and domain-specific regulations, to achieve document-level optimization, in addition to label-transition knowledge for sentence-level optimization. Empirical studies demonstrate the effectiveness of the proposed approach for domain-specific Chinese word segmentation.
16:00-16:20 |
Zhipeng Xie and Junfeng Hu ABSTRACT: This paper proposes a deep convolutional neural model for character-based Chinese word segmentation. It first constructs position embeddings to encode unigram and bigram features that are directly related to single positions in the input sentence, and then adaptively builds up hierarchical position representations with a deep convolutional net. In addition, a multi-task learning strategy is used to further enhance this deep neural model by treating multiple supervised CWS datasets as different tasks. Experimental results have shown that our neural model outperforms existing neural models, and the model equipped with multi-task learning achieves state-of-the-art F-score performance on standard benchmarks: 0.964 on the PKU dataset and 0.978 on the MSR dataset.
16:20-16:40 |
Zhan Shi, Xinchi Chen, Xipeng Qiu and Xuanjing Huang ABSTRACT: Recently, recurrent neural networks (RNNs) have been increasingly used for Chinese word segmentation to model contextual information without the limit of a context window. In practice, two kinds of gated RNNs, long short-term memory (LSTM) and the gated recurrent unit (GRU), are often used to alleviate the long-dependency problem. In this paper, we propose hyper-gated recurrent neural networks for Chinese word segmentation, which enhance the gates to incorporate the historical information of the gates. Experiments on benchmark datasets show that our model outperforms the baseline models as well as the state-of-the-art methods.
16:40-17:00 |
Zuyi Bao, Si Li, Sheng GAO and Weiran XU ABSTRACT: There is large-scale annotated newswire data for Chinese word segmentation. However, research has shown that segmenter performance decreases significantly when a model trained on newswire is applied to other domains, such as patents and literature. The same character appearing in different words may occupy a different position and carry a different meaning. In this paper, we introduce contextualized character embeddings to neural domain adaptation for Chinese word segmentation. The contextualized character embeddings aim to capture the dimensions of the embedding that are useful for the target domain. The experimental results show that the proposed method achieves competitive performance compared with previous Chinese word segmentation domain adaptation methods.
Summarization/Text Mining | 2017-11-11 15:40-17:10 | 2F Meeting Room 5 (2F 5号会议室) | Chair: TBD
15:40-16:00 |
Junnan Zhu, Long Zhou, Haoran Li, Jiajun Zhang, Yu Zhou and Chengqing Zong ABSTRACT: The neural sequence-to-sequence model has achieved great success in abstractive summarization. However, due to the limit on input length, most previous work can only use the lead sentences as input to generate the abstractive summary, which ignores crucial information in the rest of the document. To alleviate this problem, we propose a novel approach that improves neural sentence summarization by using extractive summarization, aiming to take full advantage of the document information as much as possible. Furthermore, we present both a streamline strategy and a system combination strategy to fuse the content of different views, which can be easily adapted to other domains. Experimental results on the CNN/Daily Mail dataset demonstrate that both proposed strategies can significantly improve the performance of neural sentence summarization.
16:00-16:20 |
Xinyi Lin, Rui Yan and Dongyan Zhao ABSTRACT: Word and sentence units are two granularities for characterizing document information in automatic summarization. Yet few unsupervised studies take both factors into account, due to the difficulty of fusing word-level and sentence-level information from different semantic spaces. In this paper, we propose a hybrid optimization framework that optimizes word-level information while simultaneously incorporating sentence-level information as constraints. The optimization is conducted by iterative unit substitutions. The performance on DUC benchmark datasets demonstrates the effectiveness of our proposed framework in terms of ROUGE evaluation.
16:20-16:40 |
Jian Xu, Hao Yin, Lu Zhang, Shoushan Li and Guodong Zhou ABSTRACT: Review rating is a sentiment analysis task which aims to predict a recommendation score for a review. Basically, classification and regression models are the two major approaches to review rating, and each has its own characteristics and strengths. For instance, the classification model can flexibly utilize well-established machine learning models, while the regression model can capture the connections between different rating scores. In this study, we propose a novel approach to review rating, namely joint LSTM, that exploits the advantages of both the classification and regression models. Specifically, our approach employs an auxiliary Long Short-Term Memory (LSTM) layer to learn an auxiliary representation from the classification setting, and simultaneously feeds the auxiliary representation into the main LSTM layer for the regression setting. In the learning process, the auxiliary classification LSTM model and the main regression LSTM model are jointly learned. Empirical studies demonstrate that our joint learning approach performs significantly better on review rating than either the individual classification or the regression model.
16:40-17:00 |
Lishuang Li, Jia Wan and Degen Huang ABSTRACT: Most word embedding methods are proposed for general purposes; they take a word as the basic unit and learn embeddings from words' external contexts. However, in the field of biomedical text mining, there are many biomedical entities and syntactic chunks which can enrich the semantic meaning of word embeddings. Furthermore, large-scale background texts for training word embeddings are not available in some scenarios. Therefore, we propose a novel biomedical domain-specific word embedding model based on maximum margin (BEMM) that trains word embeddings using a small set of background texts and incorporates biomedical domain information. Experimental results show that our word embeddings overall outperform other general-purpose word embeddings on some biomedical text mining tasks.
Shared Task Workshop | 2017-11-11 15:40-16:40 | 2F Meeting Room 6 (2F 6号会议室) | Chair: TBD
15:40-15:55 |
Yuxuan Lai, Yanyan Jia, Yang Lin, Yansong Feng and Dongyan Zhao ABSTRACT: Aiming at the task of open-domain question answering over a knowledge base in NLPCC 2017, we build a question answering system which can automatically find the promised entities and predicates for single-relation questions. After a feature-based entity linking component and a word-vector-based candidate predicate generation component, deep convolutional neural networks are used to rerank the entity-predicate pairs, and all intermediary scores are used to choose the final predicted answers. Our approach achieved an F1-score of 47.23% on the test data, which obtained first place in the NLPCC 2017 Shared Task 5 contest (KBQA sub-task). Furthermore, we report a series of experiments which can help other developers understand the contribution of every part of our system.
15:55-16:10 |
Zhipeng Xie ABSTRACT: Document-based question answering selects the answer to a given question from a set of candidate sentences. Most existing work focuses on sentence-pair modeling but ignores the peculiarities of question-answer pairs. This paper proposes to model the interaction between question words and POS tags, as a special kind of information that is peculiar to question-answer pairs. Such information is integrated into a neural model for answer selection. Experimental results on the DBQA task have shown that our model achieves better results compared with several state-of-the-art systems. In addition, it achieves the best result on the NLPCC 2017 shared task on DBQA.
16:10-16:25 |
Yunxiao Zhou, Man Lan and Yuanbin Wu ABSTRACT: This paper describes the system we submitted to Task 1, Chinese Word Semantic Relation Classification, in NLPCC 2017. Given a pair of context-free Chinese words, the task is to predict their semantic relationship among four categories: Synonym, Antonym, Hyponym, and Meronym. We design and investigate several surface features and embedding features, covering word-level and character-level embeddings, together with supervised machine learning methods to address this task. The officially released results show that our system ranks above average.
16:25-16:40 |
Changliang Li, Teng Ma, Jian Cheng and Bo Xu ABSTRACT: Classification of word semantic relations is a challenging task in the natural language processing (NLP) field. In many practical applications, we need to distinguish words with different semantic relations. Much work relies on semantic resources such as Tongyici Cilin and HowNet, which are limited in quality and size. Recently, methods based on word embeddings have received increasing attention for their flexibility and effectiveness in many NLP tasks; furthermore, word vector offsets capture word semantic relations to some extent. This paper proposes a novel framework for identifying Chinese word semantic relations that combines a semantic dictionary, word vectors and linguistic knowledge into a classification system. We conduct experiments on the Chinese Word Semantic Relation Classification shared task of NLPCC 2017 and rank first with an F1 score of 0.859. The results demonstrate that our method is effective. |
Student Workshop 2017-11-11 15:40-17:30, 2F Meeting Room 7(2F 7号会议室), DIFCC Chair: TBD | |
15:40-16:00 |
Adversarial Multi-Criteria Learning for Chinese Word Segmentation Xinchi Chen ABSTRACT: Different linguistic perspectives have produced many Chinese word segmentation corpora with different segmentation criteria. Most existing methods focus on improving segmentation performance on a corpus with a single criterion, so it is worthwhile to exploit corpora with different criteria to improve segmentation. In this talk, we apply the idea of adversarial training and learn from multiple corpora with heterogeneous criteria through multi-task ensemble learning. We treat each segmentation criterion as a task and, under a multi-task learning framework, propose three shared-private models: a shared layer extracts criterion-invariant features, while a private layer extracts criterion-specific features. We further adopt an adversarial training strategy to ensure that the shared layer extracts only features common to all criteria, so that it applies to every criterion. Experiments on corpora with eight different criteria show that, compared with single-criterion learning, our models achieve significant improvements on every corpus. We also evaluate separately on five simplified-Chinese and three traditional-Chinese corpora, and the experiments show that our model can learn information from simplified Chinese that helps improve traditional Chinese segmentation. SHORT BIO: Xinchi Chen is a fifth-year direct-admission Ph.D. student in computer science at Fudan University (Shanghai Institute of Intelligent Media). He works on artificial intelligence, with research interests in deep learning and natural language processing, and has published several papers at top AI conferences such as ACL and EMNLP. |
16:00-16:20 |
Neural Machine Translation: Developed Architectures for Encoder, Attention and Decoder Biao Zhang ABSTRACT: Recent years have witnessed the rapid development of neural machine translation (NMT), which relies only on an encoder-attention-decoder framework and has continued to push the state of the art on various translation tasks. In this talk, I will first give an overview of the dominant NMT framework, and then present several architectures from our recent work for the encoder, attention and decoder respectively. I will also share some experience in developing these architectures. SHORT BIO: Biao Zhang is a graduate student in the School of Software at Xiamen University, supervised by Dr. Jinsong Su. His main research interests are natural language processing and deep learning, especially neural machine translation. In recent years, he has published several papers in top conferences and journals, including AAAI, IJCAI, EMNLP, COLING, TASLP and INS. |
16:20-16:40 |
Reading Comprehension and Question Answering Bingning Wang ABSTRACT: Machine reading comprehension is an artificial intelligence task that has emerged in recent years; it aims to have a machine understand a document and then complete a series of tasks such as dialogue and question answering. A reading comprehension task can be decomposed into several parts. The first is answer selection: using the question to find, within the document, the answer that can address it. The second is answer generation: producing the answer from the current supporting sentence and the question. In both parts, a range of existing question answering resources can be used to obtain better results, and other techniques, such as semantic hashing for information retrieval, can improve retrieval efficiency. We have applied this model to several reading comprehension tasks and achieved good results. SHORT BIO: Bingning Wang is a Ph.D. student at the Institute of Automation, Chinese Academy of Sciences. His main research interests are natural language processing and machine reading comprehension. He has published three first-author papers at the top AI and NLP conferences IJCAI and ACL. |
16:40-16:55 |
Mingzhou Yang, Daling Wang, Shi Feng and Yifei Zhang ABSTRACT: Recently, huge amounts of text expressing user consumption intentions have been published on social media platforms such as Twitter and Weibo, and classifying these intentions has great value for both scientific research and commercial applications. User consumption analysis in social media concerns text content representation and intention classification, whose solutions mainly build on traditional machine learning and emerging deep learning techniques. In this paper, we conduct a comprehensive empirical study of the user intention classification problem with learning-based techniques using different text representation methods. We compare different machine learning and deep learning methods, and various combinations of them, for tweet text representation and consumption intention classification. The experimental results show that LSTM models with pre-trained word vector representations achieve the best classification performance. |
16:55-17:10 |
Zhendong Jiang ABSTRACT: It has been difficult to extract concept relations with deep learning and other large-scale knowledge graph construction methods in domains that lack tagged corpora. Alternatively, methods based on transfer learning have been proposed to transfer knowledge from the open domain to the diplomatic domain. In view of the time-sequence features of diplomatic text, we construct a concept relation extraction model with an LSTM-based transfer learning method. By transferring trained network weights and fine-tuning them during retraining, the accuracy of diplomatic concept relation extraction is significantly improved. |
17:10-17:30 |
Discussion |
Machine Learning 2017-11-12 13:30-15:10, 2F Meeting Room 5(2F 5号会议室), DIFCC Chair: TBD | |
13:30-13:50 |
Huijia Wu, Jiajun Zhang and Chengqing Zong ABSTRACT: Deep stacked RNNs are usually hard to train. Recent studies have shown that shortcut connections across different RNN layers bring substantially faster convergence. However, shortcuts increase the computational complexity of the recurrent computations. To reduce the complexity, we propose the shortcut block, a refinement of the shortcut LSTM block. Our approach is to replace the self-connected parts (c_l) with shortcuts (h_{l-2}) in the internal states. We present extensive empirical experiments showing that this design performs better than the original shortcuts. We evaluate our method on the CCG supertagging task, obtaining an 8% relative improvement over current state-of-the-art results. |
13:50-14:10 |
Keegan Kang ABSTRACT: Random projection is a dimension reduction technique in which high-dimensional vectors in R^D are projected down to a smaller subspace in R^k. Certain distances and distance kernels, such as Euclidean distances, inner products [10], and l_p distances [12] between high-dimensional vectors, are approximately preserved in this smaller subspace. Word vectors represented in a bag-of-words model can thus be projected down to a smaller subspace via random projections, and their relative similarity computed via distance metrics. We propose using marginal information and Bayesian probability to improve the estimates of the inner product between pairs of vectors, and demonstrate our results on actual datasets. |
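As a quick illustration of the preservation property this abstract builds on, the sketch below applies plain Gaussian random projection (not the authors' Bayesian estimator; the dimensions D and k are assumed values) and checks that an inner product survives projection with small relative error:

```python
import numpy as np

rng = np.random.default_rng(0)
D, k = 10_000, 1_000               # assumed original and projected dimensions

# Two dense stand-ins for bag-of-words count vectors in R^D
x = rng.random(D)
y = rng.random(D)

# Random projection matrix with i.i.d. N(0, 1) entries, scaled so that
# E[(Rx) . (Ry)] = x . y
R = rng.standard_normal((k, D)) / np.sqrt(k)
x_p, y_p = R @ x, R @ y

exact = x @ y                      # inner product in R^D
approx = x_p @ y_p                 # estimate from the k-dimensional sketch
rel_err = abs(approx - exact) / exact
```

The variance of this plain estimator shrinks as 1/k; the paper's contribution is to shrink it further using marginal information, which this sketch does not attempt.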
14:10-14:30 |
Jiachen Du, Lin Gui, Ruifeng Xu and Yulan He ABSTRACT: Neural network models with attention mechanisms have shown their effectiveness on various tasks. However, there is little research on attention mechanisms for text classification, and existing attention models for text classification lack cognitive intuition and mathematical explanation. In this paper, we propose a new neural network architecture based on the attention model for text classification. In particular, we show that the convolutional neural network (CNN) is mathematically a reasonable model for extracting attention from text sequences. We then propose a novel attention model based on CNNs and introduce a new network architecture which combines a recurrent neural network with our CNN-based attention model. Experimental results on five datasets show that our proposed models can accurately capture the salient parts of sentences and improve the performance of text classification. |
14:30-14:50 |
Hongwei Liu and Yun Xu ABSTRACT: Disease name normalization aims at mapping various disease names to standardized disease vocabulary entries. Disease names vary so widely that dictionary lookup cannot achieve high accuracy on this task. DNorm, the first machine learning approach to the task, is not robust enough due to its strong dependence on the training dataset. In this article, we propose a deep learning approach to disease name representation and normalization. Representations of the composing words can be learned from a large unlabelled literature corpus, encoding rich semantic and syntactic properties of disease names in the process. With this new representation of disease names, a higher accuracy is achieved on the normalization task. |
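The representation-based lookup idea can be sketched minimally as below; the word vectors here are random stand-ins (the paper learns its vectors from a literature corpus), and the two-entry dictionary and disease names are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
# Random stand-ins for word vectors; a real system would learn these
# from a large unlabelled literature corpus.
vocab = {w: rng.standard_normal(50) for w in
         ["renal", "kidney", "failure", "cardiac", "arrest"]}

def embed(name):
    """Represent a disease name as the normalized mean of its word vectors."""
    vecs = [vocab[w] for w in name.lower().split() if w in vocab]
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

dictionary = ["kidney failure", "cardiac arrest"]      # standardized entries

def normalize(mention):
    """Map a disease name variant to the most similar dictionary entry."""
    m = embed(mention)
    return max(dictionary, key=lambda d: float(embed(d) @ m))
```

With real corpus-trained embeddings, a variant like "renal failure" would map to "kidney failure" because "renal" and "kidney" share distributional contexts; with the random stand-ins above, only word overlap contributes to similarity.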
14:50-15:10 |
Lei Sha, Feng Qian and Zhifang Sui ABSTRACT: Repeated reading (re-read), which means reading a sentence twice to get a better understanding, has been applied to machine reading tasks, but there have been no rigorous evaluations showing its exact contribution to natural language processing. In this paper, we design four tasks, each representing a different class of NLP task: (1) part-of-speech tagging, (2) sentiment analysis, (3) semantic relation classification, and (4) event extraction. We take a bidirectional LSTM-RNN architecture as the standard model for these tasks. On top of the standard model, we add a repeated reading mechanism so that the model better “understands” the current sentence by reading it twice. We compare three repeated reading architectures: (1) multi-level attention, (2) deep BiLSTM, and (3) multi-pass BiLSTM, enforcing an apples-to-apples comparison as much as possible. Our goal is to understand better in which situations a repeated reading mechanism can help an NLP task, and which of the three architectures is more appropriate for repeated reading. We find that the repeated reading mechanism does improve performance on some tasks (sentiment analysis, semantic relation classification, event extraction) but not on others (POS tagging). We discuss how these differences may arise in each task, and then give suggestions for choosing whether to use a repeated reading model, and which one, when faced with a new task. Our results thus shed light on the usage of repeated reading in NLP tasks. |
Information Extraction/KG 2017-11-12 13:30-15:10, 2F Meeting Room 6(2F 6号会议室), DIFCC Chair: TBD | |
13:30-13:50 |
Xuelian Li, Qian Liu, Man Zhu, Feifei Xu, Yunxiu Yu, Shang Zhang, Zhaoxi Ni and Zhiqiang Gao ABSTRACT: Multiple-choice questions that compare one entity with another are very common in university entrance examinations such as the Gaokao in China, but they require a high level of knowledge. As a preliminary attempt to address this problem, we build a geography Gaokao-oriented knowledge acquisition system for comparative sentences, based on logic programming, to help solve real geography examinations. Our work consists of two consecutive tasks: identifying comparative sentences in geographical texts, and extracting comparative elements from the identified sentences. Specifically, for the former task, logic programming is employed to filter out non-comparative sentences; for the latter, dependency grammar information and heuristic position information are adopted to represent the relations among comparative elements. The experimental results show that our system achieves outstanding performance for practical use. |
13:50-14:10 |
Weiming Lu, Yangfan Zhou, Haijiao Lu, Pengkun Ma, Zhenyu Zhang and Baogang Wei ABSTRACT: Entity linking (EL) is the task of mapping mentions in natural-language text to their corresponding entities in a knowledge base (KB). Type modeling for mentions and entities can be beneficial for entity linking. In this paper, we propose a type-guided semantic embedding approach to boost collective entity linking. We use a bidirectional long short-term memory network (BiLSTM) and a dynamic convolutional neural network (DCNN) to model the mention and the entity respectively. Then, we build a graph with the semantic relatedness of mentions and entities for collective entity linking. Finally, we evaluate our approach against state-of-the-art entity linking approaches over a wide range of very different data sets, such as TAC-KBP from 2009 to 2013, AIDA, DBpedia Spotlight, N3-Reuters-128, and N3-RSS-500. We also evaluate our approach on a Chinese corpus. The experiments reveal that modeling entity types can be very beneficial to entity linking. |
14:10-14:30 |
Runtao Liu, Liangcai Gao, Dong An, Zhuoren Jiang and Zhi Tang ABSTRACT: Metadata extraction from academic papers is of great value to many applications such as scholar search and digital libraries. The task has attracted much attention over the past decades, and many template-based or statistical machine learning (e.g. SVM, CRF) based extraction methods have been proposed, yet it remains a challenge because of the variety and complexity of page layouts. To address this challenge, we introduce deep learning networks to the task, since deep learning has shown great power in areas like computer vision (CV) and natural language processing (NLP). First, we employ deep networks to model the image information and the text information of paper headers respectively, which allows our approach to perform metadata extraction with little information loss. We then formulate metadata extraction from a paper header as two typical tasks from different areas: object detection in CV, and sequence labeling in NLP. Finally, the two deep networks produced by these two tasks are combined to give the extraction results. Preliminary experiments show that our approach achieves state-of-the-art performance on several open datasets. Moreover, the approach can process both image data and text data, and does not require designing any classification features. |
14:30-14:50 |
Lv Shuning and Dong Zhian ABSTRACT: This paper presents a new approach to Chinese term extraction using URL-keys. Taking the URL as a medium, with the help of the known domains of URL-keys, we can judge the domain to which candidate terms belong. First, with the help of domain URLs manually classified on the Internet, a variance-based method is proposed to identify domain URL-keys, and a dictionary of domain URL-keys is built according to the frequency with which each URL-key appears in various fields. Then, we use pseudo relevance feedback to construct the URL-key vectors of candidate domain terms. Finally, we apply an SVM to extract terms. We conduct experiments on four different domains for Chinese term extraction. Experimental results indicate that the proposed approach is quite effective. In addition, it can effectively solve the recognition problem for low-frequency terms, providing a new way to identify them. |
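The variance idea can be sketched as follows: a URL-key whose frequency distribution over domains has high variance is domain-indicative, while a key spread evenly across domains is not. The counts, key names, and the 0.01 threshold below are invented for illustration, not taken from the paper:

```python
import numpy as np

# Hypothetical occurrence counts of four URL-keys across four domains
keys = ["sports", "finance", "www", "medical"]
counts = np.array([
    [90,  2,  3,  1],   # "sports"  : concentrated in one domain
    [ 1, 85,  2,  4],   # "finance" : concentrated in one domain
    [25, 24, 26, 25],   # "www"     : spread evenly across domains
    [ 2,  3,  1, 80],   # "medical" : concentrated in one domain
], dtype=float)

# Normalize each key's counts into a distribution over domains,
# then score each key by the variance of that distribution
freq = counts / counts.sum(axis=1, keepdims=True)
var = freq.var(axis=1)

# Keys above an (assumed) variance threshold enter the domain URL-key dictionary
domain_keys = [k for k, v in zip(keys, var) if v > 0.01]
```

Generic keys like "www" get a near-uniform distribution and a near-zero variance, so they are filtered out of the dictionary.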
Chatbot/QA 2017-11-12 13:30-15:10, 2F Meeting Room 7(2F 7号会议室), DIFCC Chair: TBD | |
13:30-13:50 |
Liyun Wen, Xiaojie Wang, Zhenjiang Dong and Hong Chen ABSTRACT: Intent classification and slot filling are two critical subtasks of natural language understanding (NLU) in task-oriented dialogue systems. Previous work has used either hierarchical or contextual information when jointly modeling intent classification and slot filling, proving that either is helpful for joint models. This paper proposes a cluster of joint models that encode both types of information at the same time. Experimental results on different datasets show that the proposed models outperform joint models without either hierarchical or contextual information. Moreover, balancing the loss functions of the two subtasks is important for achieving the best overall performance. |
13:50-14:10 |
Lian Meng and Minlie Huang ABSTRACT: Dialogue intent analysis plays an important role in dialogue systems. In this paper, we present a deep hierarchical LSTM model to classify the intent of a dialogue utterance. The model recognizes and classifies a user's dialogue intent efficiently. Moreover, we introduce a memory module into the hierarchical LSTM model so that it can use more context information to perform classification. We evaluate the two proposed models on a real-world conversational dataset from a famous Chinese e-commerce service. The experimental results show that our proposed model outperforms the baselines. |
14:10-14:30 |
Yue Ma, Xiaojie Wang, Zhenjiang Dong and Hong Chen ABSTRACT: This paper proposes a deep neural network model for jointly modeling natural language understanding and dialogue management in goal-driven dialogue systems. The model has three parts. A long short-term memory network (LSTM) at the bottom encodes the utterances of each dialogue turn into a turn embedding. Dialogue embeddings are learned by an LSTM in the middle of the network and updated by feeding in all turn embeddings. The top part is a feed-forward deep neural network which converts dialogue embeddings into Q-values for the different dialogue actions. The cascaded-LSTM reinforcement learning network is jointly optimized using the rewards received at each dialogue turn as the only supervision; there are no explicit NLU outputs or dialogue states in the network. Experimental results show that our model outperforms both a traditional Markov decision process (MDP) model and a single LSTM with a deep Q-network on meeting room booking tasks. Visualization of the dialogue embeddings illustrates that the model can learn representations of dialogue states. |
14:30-14:50 |
Zhiqiang Liu, Mengzhang Li, Tianyu Bai, Rui Yan and Yan Zhang ABSTRACT: Community-based question answering (cQA) sites have become popular Web services, accumulating millions of questions and their associated answers over time. The answer selection component, which ranks the relevant answers to a given question, therefore plays an important role in a cQA system. As the area develops, the problems of noise prevalence and data sparsity become tougher. In this paper, we consider the answer selection task from two aspects: deep semantic matching and user community metadata representation. We propose a novel dual attentive neural network framework (DANN) that embeds question topics and user network structures for answer selection. The representations of questions and answers are first learned by convolutional neural networks (CNNs). The DANN then learns interactions of questions and answers, guided by user network structures and by semantic matching of question topics with double attention. We evaluate our method on the well-known question answering site Stack Exchange. The experiments show that our framework outperforms other state-of-the-art solutions to the problem. |
14:50-15:10 |
Xia Li, Hanfeng Liu and Shengyi Jiang ABSTRACT: Question classification is an important part of automatic question answering systems. Chinese question sentences differ from long texts and from short texts such as product comments: they generally contain interrogative words such as who, which, where or how to specify the information required, and they include complete grammatical components. Based on these characteristics, we propose a more effective feature extraction method for Chinese question classification. We first extract the head verb of the sentence and its dependent words, combined with the sentence's interrogative words, as our base features. We then use latent semantic analysis to remove semantic noise from the base features. Finally, we expand these features into semantic representation features with our weighted word-embedding method. Experimental results show that our joint semantic feature extraction method outperforms classical syntax-based and content-vector-based methods, and is superior to a convolutional neural network based sentence classification method. |
Alibaba Workshop 1: The Road to the Intelligence of New Customer Service 2017-11-12 13:30-15:10, 1F Joyous Gathering Palace A(1F 聚和宫A区), DIFCC Chair: Jian SUN (Alibaba Group) | |
13:30-14:10 |
Very Large-scale Intelligent Customer Service in Alibaba Jianrong Liu, Product Manager, Alibaba ABSTRACT: 1. Challenges of customer service in a very large-scale e-commerce ecosystem. 2. How to build intelligent customer service using AI. 3. The future of customer service + AI in Alibaba. SHORT BIO: Jianrong Liu is a product manager in the Intelligent Service Division, Alibaba. |
14:10-14:50 |
Upgrade Customer Service by Machine Intelligence Dr. Yikun Guo, Senior Staff Engineer, Alibaba ABSTRACT: In this talk, I will first review A.I. applications in the customer service area, and then introduce our experience in using such technology to build large-scale applications that disrupt traditional customer services. More specifically, I will focus on the use of A.I. to build the AliMe chatbot, Wali (our new-generation decision-making assistant), and our next-generation knowledge base, where deep learning is applied to automatically generate answers. SHORT BIO: Dr. Yikun Guo graduated from Fudan University, majoring in NLP and machine learning. After graduation, he worked in the UK for companies such as LexisNexis and KPMG, building NLP applications with A.I. technology. His work includes the machine learning models used by Samsung's mobile chatbot application (S-Voice), which has been deployed to answer millions of questions from Samsung Galaxy users every day. At present, Dr. Guo is a senior staff engineer in the Intelligent Service Division, Alibaba. |
Social Network 2017-11-12 15:40-17:00, 2F Meeting Room 5(2F 5号会议室), DIFCC Chair: TBD | |
15:40-16:00 |
Jianjun Wu, Ying Sha, Rui Li, Qi Liang, Bo Jiang, Jianlong Tan and Bin Wang ABSTRACT: Identifying influential users in social networks is of significant interest, as it can help improve the propagation of ideas or innovations. Various factors affect the relationships and the formation of influence between users. Although many studies have examined this domain, the effect of the correlation between messages and behaviors on measuring users' influence in social networks has not been adequately explored; as a result, influential users cannot be accurately evaluated. We therefore propose a topic-behavior influence tree algorithm that identifies influential users using six types of relationships over the following factors: message content, hashtag titles, retweets, replies, and mentions. By maximizing the number of affected users and minimizing the propagation path, we improve the accuracy of identifying influential users. Experimental comparisons with state-of-the-art algorithms on various datasets, and visualization on the TUAW dataset, validate the effectiveness of the proposed algorithm. |
16:00-16:20 |
Jin Qian, Yeyun Gong, Qi Zhang and Xuanjing Huang ABSTRACT: The hierarchical Dirichlet process model has been successfully used for extracting the topical or semantic content of documents and other kinds of sparse count data. With the growth of social media, there have been simultaneous increases in the amounts of textual information and social structural information. To incorporate the information contained in these structures, we propose a novel non-parametric model, the social hierarchical Dirichlet process (sHDP). We assume that the topic distributions of documents are similar to each other if their authors are related in social networks. The proposed method extends the hierarchical Dirichlet process model. We evaluate its utility on three data sets: papers from the NIPS proceedings, a subset of articles from Cora, and microblogs with a social network. Experimental results demonstrate that the proposed method achieves better performance than state-of-the-art methods on all three data sets. |
16:20-16:40 |
Pan Xiao, Yongquan Fan and Yajun Du ABSTRACT: With the popularity of microblogging sites, followee recommendation plays an important role in information sharing over microblogging platforms, but as their popularity increases, so does the difficulty of deciding whom to follow. Users' interests and emotions often vary in their real lives; by contrast, some other microblog features are largely unchangeable and cannot describe user characteristics well. To solve this problem, we propose a personality-aware followee recommendation model (PSER) based on text semantics and sentiment analysis: a novel personality-based followee recommendation scheme over microblogging systems built on user attributes and the big-five personality model. It quantitatively analyses the effects of user personality on followee selection by combining personality traits with the text semantics of microblogs and sentiment analysis of users. We conduct comprehensive experiments on a large-scale dataset collected from Sina Weibo, the most popular microblogging system in China. The results show that our scheme greatly outperforms existing schemes in terms of precision, and that an accurate appreciation of this model, tied to a quantitative analysis of personality, is crucial for selecting potential followees and thus enhancing recommendation. |
16:40-17:00 |
Donglei Liu, Yipeng Su, Xudong Li and Zhendong Niu ABSTRACT: Community structure is the basic structure of a social network. Nodes of a social network naturally form communities: nodes are densely connected within the same community and sparsely connected between different communities. Community detection, an important task in understanding the features of networks and in graph analysis, aims to reveal the latent community structure of a social network; existing methods include graph-based and heuristic-information-based approaches. However, approaches based on graph theory are complex and computationally expensive. In this paper, we extend the density concept and propose a density-peaks-based community detection method. The method first computes two metrics for each node in a network, the local density ρ and the minimum climb distance δ; it then identifies nodes with both high ρ and high δ in their local fields as community centers. Finally, the remaining nodes are assigned the corresponding community labels. The complete process is simple but efficient. We test our approach on four classic baseline datasets. Experimental results demonstrate that the proposed density-peaks-based method is more accurate and has low computational complexity. |
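The two-metric procedure described in the abstract can be sketched on a toy point set; the Gaussian density kernel, the cutoff d_c, and the fixed number of communities below are assumptions of this sketch, not details from the paper:

```python
import numpy as np

def density_peaks(points, d_c, n_communities):
    """Sketch of the rho/delta procedure: compute a local density rho and a
    distance delta to the nearest denser node, take nodes where both are
    large as community centers, and let every remaining node inherit the
    label of its nearest denser neighbor."""
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0   # Gaussian local density
    n = len(points)
    delta = np.empty(n)
    parent = np.full(n, -1)                              # nearest denser neighbor
    for i in range(n):
        denser = np.flatnonzero(rho > rho[i])
        if denser.size == 0:
            delta[i] = dist[i].max()                     # global density peak
        else:
            parent[i] = denser[np.argmin(dist[i, denser])]
            delta[i] = dist[i, parent[i]]
    centers = np.argsort(rho * delta)[-n_communities:]   # high rho AND high delta
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_communities)
    for i in np.argsort(-rho):                           # descending density order
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

# Two well-separated groups of nodes, represented as 2-D coordinates
pts = np.array([[0.0, 0.0], [0.3, 0.1], [0.1, 0.4], [0.2, 0.2],
                [10.0, 10.0], [10.3, 10.1], [10.1, 10.4], [10.2, 10.2]])
labels = density_peaks(pts, d_c=1.0, n_communities=2)
```

Because each group's densest point is far from any denser point, the two group centers stand out with both high ρ and high δ, and the rest of the nodes follow their nearest denser neighbor into the matching community.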
Machine Translation II 2017-11-12 15:40-17:00, 2F Meeting Room 6(2F 6号会议室), DIFCC Chair: TBD | |
15:40-16:00 |
Na Ye, Ping Xu, Chuang Wu and Guiping Zhang ABSTRACT: Recent research on machine translation has achieved substantial progress. However, machine translation results are still not error-free and need to be post-edited by a human translator (the user) to produce correct translations. Interactive machine translation enhances human-computer collaboration by having the human validate the longest correct prefix of the suggested translation. In this paper, we refine the interactivity protocol to provide more natural collaboration: users are allowed to validate bilingual segments, which give more direct guidance to the decoder and more hints to the users. Moreover, validating bilingual segments is easier than identifying correct segments in incorrect translations. Experimental results with real users show that the new protocol improves translation efficiency and translation quality on three Chinese-English translation tasks. |
16:00-16:20 |
Yonghe Wang, Feilong Bao, Hongwei Zhang and Guanglai Gao ABSTRACT: The deep neural network (DNN) model has achieved significant results on the Mongolian speech recognition task, but compared with Chinese, English and other languages, there is still room for improvement. This paper presents the first application of the feed-forward sequential memory network (FSMN) to Mongolian speech recognition, modeling long-term dependencies in time series without recurrent feedback. Furthermore, by modeling the speaker in the feature space, we extract i-vector features and combine them with Fbank features as the input to validate their effectiveness on Mongolian ASR tasks. Finally, discriminative training was conducted over the FSMN for the first time, using maximum mutual information (MMI) and state-level minimum Bayes risk (sMBR) respectively. The experimental results show that the FSMN outperforms the DNN on Mongolian ASR, and that using i-vector features combined with Fbank features as the FSMN input, together with discriminative training, relatively reduces the word error rate (WER) by 17.9% compared with the DNN baseline. |
16:20-16:40 |
Chen Zhao, Yanchao Liu, Jianyi Guo, Wei Chen, Xin Yan, Zhengtao Yu and Xiuqin Chen ABSTRACT: POS tagging is a fundamental task in natural language processing that determines the quality of subsequent processing, and the ambiguity of multi-category words directly affects the accuracy of Vietnamese POS tagging. At present, POS tagging of English and Chinese has achieved good results, but the accuracy of Vietnamese POS tagging still needs improvement. To address this problem, this paper proposes a novel method for Vietnamese POS tagging based on a multi-category word disambiguation model and a part-of-speech dictionary. A multi-category word dictionary and a non-multi-category word dictionary are generated from the Vietnamese dictionary and used to build a POS tagging corpus. 396,946 multi-category words were extracted from the corpus; using statistical methods, a maximum entropy disambiguation model for Vietnamese parts of speech is constructed, on the basis of which both multi-category and non-multi-category words are tagged. Experimental results show that the proposed method outperforms existing models, proving that it is feasible and effective. |
16:40-17:00 |
Nan Zhou, Yue Zhao, Yaoqiang Li, Xiaona Xu, Lamu Caiwang and Licheng Wu ABSTRACT: Deep neural networks are now widely used in speech recognition; although their posterior features offer high robustness and semantic discrimination, these features cannot be used in the GMM-HMM acoustic modeling framework. A neural network with a narrow bottleneck layer solves this problem: its bottleneck features not only capture long-term context dependence and give a compact representation of the speech signal, but can also replace traditional MFCC features for GMM-HMM acoustic modeling. In this paper, we study applying bottleneck features, and their concatenation with MFCC features, to Lhasa-Tibetan continuous speech recognition. The experimental results show that the concatenation of bottleneck and MFCC features achieves better performance than either the posterior features of a deep neural network or bottleneck features alone. |
NLP Applications 2017-11-12 15:40-17:00, 2F Meeting Room 7(2F 7号会议室), DIFCC Chair: TBD | |
15:40-16:00 |
Jian Tang, Yu Hong, Mengyi Liu, Jiashuo Zhang and Jianmin Yao ABSTRACT: Image Description Translation (IDT) is the task of automatically translating image captions (i.e., image descriptions) into a target language. Current statistical machine translation (SMT) cannot perform as well as usual on this task because of the lack of topic information available for translation model generation. In this paper, we focus on acquiring the possible contexts of the captions so as to generate topic models with rich and reliable information. Image matching is used to acquire Wikipedia texts relevant to the captions, including the captions of similar Wikipedia images, the full articles that contain those images, and the paragraphs that semantically correspond to them. On this basis, we go further to approach topic modelling using the obtained contexts. Our experimental results show that the obtained topic information enhances the SMT of image captions, yielding a performance gain of no less than 1% in BLEU score. |
16:00-16:20 |
Yufeng Diao and Hongfei Lin ABSTRACT: Homographic puns have a long history in human writing, being a common source of humor in jokes and other comedic works. It remains a difficult challenge to construct computational models that discover the latent semantic structures behind homographic puns so as to recognize them. In this work, we design several latent semantic structures for homographic puns based on relevant theory, design sets of effective features for each structure, and then apply an effective computational approach to identify homographic puns. Results on the SemEval-2017 Task 7 and Pun of the Day datasets indicate that our proposed latent semantic structures and features are effective at distinguishing homographic-pun texts from non-pun texts. We believe that our novel findings will facilitate and stimulate the booming field of computational pun research in the future. |
16:20-16:40 |
Mengqiao Han, Zhendong Niu and Ou Wu ABSTRACT: In this paper, we focus on the problem of text style transfer, which is considered a subtask of paraphrasing. Most previous paraphrasing studies have focused on the replacement of words and phrases, which depends exclusively on the availability of parallel or pseudo-parallel corpora. However, existing methods cannot transfer the style of a text completely while remaining independent of pair-wise corpora. This paper presents a novel sequence-to-sequence (Seq2Seq) deep neural network model that uses two switches with tensor products to control style transfer in the encoding and decoding processes. Since massive parallel corpora are usually unavailable, the switches enable the model to learn without supervision, which is, to the best of our knowledge, an initial investigation into the task of text style transfer. The results are analyzed quantitatively and qualitatively, showing that the model can handle paraphrasing at different levels of text style transfer. |
16:40-17:00 |
Xing Wei, Wei Wang, Jingping Chen, Yanlu Xie and Jinsong Zhang ABSTRACT: To reduce the time cost and inconsistency of annotation, this paper proposes using an articulatory-feature-based mispronunciation detection system to provide Top-N feedback that assists manual annotation. As a result, the consistency rate of phoneme labels in our system increased from 80.7% to 92.48%. In addition, the time cost of annotating each sentence dropped from 10 minutes to 3 minutes. The results indicate that our automatic annotation system is practical, and that there is still room for further improvement. |
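The Top-N feedback idea reduces annotation to choosing from a short, model-ranked candidate list. A minimal sketch of that ranking step, with invented phoneme posteriors (the paper's actual model and scores are not reproduced here):

```python
def top_n_feedback(posteriors, n=3):
    """Rank phoneme candidates by model score and return the Top-N list
    shown to the annotator."""
    ranked = sorted(posteriors.items(), key=lambda kv: -kv[1])
    return [phoneme for phoneme, _ in ranked[:n]]

# Hypothetical posteriors from the articulatory model for one segment.
posteriors = {"l": 0.46, "n": 0.31, "r": 0.14, "w": 0.09}
candidates = top_n_feedback(posteriors, n=3)
print(candidates)   # ['l', 'n', 'r']
```

The annotator then picks from (or rejects) the short list instead of labelling from the full phoneme inventory, which is what drives down both labelling time and disagreement.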
Alibaba Workshop 2: The Road to the Intelligence of New Customer Service 2017-11-12 15:40-17:00, 1F Joyous Gathering Palace A(1F 聚和宫A区), DIFCC Chair: Jian SUN (Alibaba Group) | |
15:00-15:40 |
Knowledge Engineering: Empowering Machine Intelligence in Big Data Prof. Juanzi Li, Tsinghua University ABSTRACT: Machine-computable knowledge gives computer systems the ability to organize, manage and understand Internet content, and is the key to achieving intelligent Internet services in today's big data era. Building large-scale knowledge bases and mining knowledge from big data are important and challenging problems. In this talk, I will first give a state-of-the-art overview of knowledge engineering in big data, and then introduce our related research. SHORT BIO: Juanzi Li is a professor at Tsinghua University. Her research interests include the Semantic Web and knowledge graph construction; in particular, she focuses on semantic technology combining the Semantic Web, Natural Language Processing and Data Mining. She is the chair of the Knowledge and Language Computing Committee of the Chinese Information Processing Society of China. She is the principal investigator of many important projects supported by the Natural Science Foundation of China, EU cooperation projects under the FP7 framework, and others. She has published over 90 papers in top international conferences and journals such as WWW, ACL, SIGIR, IJCAI, TKDE and TKDD. |
15:40-16:20 |
Question Answering Dr. Yansong Feng, Peking University ABSTRACT: TBD SHORT BIO: |
16:20-17:00 |
Teaching Machines to Converse Dr. Jiwei Li, Stanford University ABSTRACT: Recent advances in neural network models present both new opportunities and challenges for developing conversational agents. Current chatbot systems still face a variety of issues: they tend to output dull, generic responses such as "I don't know what you are talking about"; they lack a consistent or coherent persona; they are usually optimized over single-turn conversations and cannot handle the long-term success of a conversation; and they are unable to take advantage of interactions with humans. In this talk, I will discuss how we can handle these issues and how to design a chatbot that outputs more interesting, interactive and human-like responses. Specifically, I will talk about how to avoid the pitfall of dull responses using mutual information; how to incorporate speaker embeddings into the neural generation model to endow a bot with a coherent persona; how to handle the long-term success of a conversation using reinforcement learning and adversarial learning; and how to give a bot the ability to ask questions and make it smart about when to ask them. SHORT BIO: Jiwei Li recently received his Ph.D. in Computer Science from Stanford University, advised by Prof. Dan Jurafsky. His research interests lie in Natural Language Processing, with a focus on deep learning applications. He was a recipient of the Facebook Fellowship in 2015 and the Baidu Fellowship in 2016. |
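The mutual-information trick mentioned in the talk can be pictured as reranking: instead of picking the response that maximizes p(t|s) alone, score candidates by log p(t|s) - lambda * log p(t), which penalizes replies that are likely regardless of the input. The sketch below uses toy, hand-picked probabilities to show the effect; lambda and the candidate set are illustrative assumptions.

```python
import math

def mmi_rerank(candidates, lam=0.5):
    """Rerank responses by log p(t|s) - lam * log p(t): penalizing
    high-frequency (generic) replies promotes specific ones."""
    def score(c):
        return math.log(c["p_t_given_s"]) - lam * math.log(c["p_t"])
    return sorted(candidates, key=score, reverse=True)

# Toy candidates: the generic reply is likely under both models; the
# specific reply is rarer overall, so the MMI objective prefers it.
candidates = [
    {"text": "I don't know.",           "p_t_given_s": 0.30, "p_t": 0.20},
    {"text": "It opens at 9 tomorrow.", "p_t_given_s": 0.25, "p_t": 0.01},
]
best = mmi_rerank(candidates)[0]["text"]
print(best)
```

With lam = 0 this reduces to ordinary likelihood ranking and the generic reply wins; the language-model penalty is what flips the preference.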