◆ Industrial Talk, by Jun Xie
Topic: Enhanced Model, Unified Quality Evaluation, and Light-Weight Domain Adaptation for Neural Machine Translation
Date and Time: Sep.24 10:00-10:15
Meeting Room: Ballroom 1&2, 1st floor
Short Bio: Jun Xie is a principal algorithm engineer at Alibaba DAMO Academy. He received his Ph.D. in 2012 from the Institute of Computing Technology (ICT), Chinese Academy of Sciences. His research focuses on machine translation, text generation, and natural language processing, and he has published 20+ research papers at top-tier conferences including ACL, EMNLP, and AAAI. He is now working on a high-quality practical neural machine translation system that supports multiple languages and transfers flexibly to multiple domains.
Abstract: High quality, trustworthiness, and transferability are fundamental and crucial for a practical machine translation system that must meet different demands in different domains. In this talk, I will introduce our efforts on these three aspects: CSANMT (Continuous Semantic Augmented Neural Machine Translation), UniTE (Unified Translation Evaluation), and Adapter-NMT together with a revised kNN-MT, respectively. CSANMT equips the Transformer with a continuous semantic space that simulates the real distribution of semantic equivalence between source and target languages, enabling generalization to more unseen instances from very limited training data. UniTE is a universal evaluation framework with better human correlation, which can score a translation hypothesis against the source (quality estimation setting), the reference (metric setting), or both source and reference. With Adapter-NMT and the revised kNN-MT, we can transfer a base NMT model to new domains with few new trainable parameters, or even without any training procedure at all.
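As a rough illustration of the light-weight adaptation idea, below is a minimal, hypothetical PyTorch sketch of a bottleneck adapter and a kNN-MT-style interpolation; all dimensions, names, and the interpolation weight are illustrative assumptions, not the exact designs presented in the talk.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, d_model: int = 512, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the frozen base model's behavior intact.
        return x + self.up(self.act(self.down(x)))

# Freeze the pretrained base NMT model; only adapter weights receive gradients.
base_model = nn.Transformer(d_model=512)  # stand-in for a pretrained NMT model
for p in base_model.parameters():
    p.requires_grad = False
adapter = Adapter(d_model=512)            # the few new trainable parameters
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)

# kNN-MT needs no training at all: at each decoding step, the base model's
# token distribution is interpolated with a nearest-neighbor distribution
# retrieved from a domain datastore of (hidden state, target token) pairs.
def interpolate(p_nmt: torch.Tensor, p_knn: torch.Tensor, lam: float = 0.5):
    return lam * p_knn + (1.0 - lam) * p_nmt
```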
◆ Industrial workshop speech, by Ziyan Chen
Topic: A cross-lingual knowledge-centric pre-trained framework and application
Date and Time: Sep.24 16:00-16:20
Meeting Room: Guilin room (桂林厅)
Short Bio: Ziyan Chen, currently the director of the NLP field at GTCOM's 2030 Artificial Intelligence Research Institute, received his Ph.D. from the Institute of Electronics, Chinese Academy of Sciences. His main research interests include information extraction, natural language generation, and cross-lingual knowledge graph construction. He has published more than 10 papers in international academic journals and conferences, and has participated in a number of major national projects, including pre-research projects and National Natural Science Foundation of China projects.
Abstract: With the globalization of open-source data, multilingual NLP technology and cross-lingual alignment have become core issues to be solved. In recent years, the rapid development of multilingual pre-trained models has brought profound changes to global language intelligence technology. GTCOM has built a multilingual pre-trained model based on its large-scale cross-lingual industry knowledge graph and its global multilingual big data system. Meanwhile, on top of these big models, GTCOM has released a natural language processing algorithm library covering 61 kinds of algorithms and more than 60 languages, enabling vertical industry applications in sectors such as the military, finance, and technology.
◆ Industrial workshop speech, by Wenxiang Jiao
Topic: Leveraging Multilingual Pretrained Models for Machine Translation
Date and Time: Sep.24 16:20-16:40
Meeting Room: Guilin room (桂林厅)
Short Bio: Wenxiang Jiao is a senior researcher at Tencent AI Lab. He received his Ph.D. degree from the Chinese University of Hong Kong in 2021, under the supervision of Prof. Irwin King and Prof. Michael R. Lyu. Before that, he received his Bachelor's and MPhil degrees from Nanjing University in 2015 and 2017, respectively. Wenxiang is interested in research directions such as conversational emotion recognition, machine translation, and multilingual pretraining, and has published papers in top conferences and journals such as ACL, EMNLP, NAACL, AAAI, and TASLP.
Abstract: Multilingual pretraining is an effective approach to boosting the performance of NLP tasks across languages by learning representations from large-scale unlabeled multilingual corpora. It is well suited to machine translation (MT), which usually involves two or more languages. While previous multilingual pretraining for MT generally focused only on the Transformer encoder, recent studies such as mBART pretrain a complete autoregressive sequence-to-sequence (Seq2Seq) model, which remedies the architecture gap between pretraining and finetuning.
In this talk, we present a substantial step toward better understanding such multilingual Seq2Seq pretrained models through three questions: (1) How much does the jointly pretrained decoder matter? (2) How do the discrepancies between pretraining and finetuning affect the downstream performance? (3) How does multilingual Seq2Seq pretraining perform for multilingual MT? We find that multilingual Seq2Seq pretraining is a double-edged sword: on one hand, it helps MT models produce more diverse translations and reduces adequacy-related translation errors; on the other hand, the discrepancies between multilingual Seq2Seq pretraining and MT finetuning limit the translation quality (i.e., domain discrepancy) and induce an over-estimation issue (i.e., objective discrepancy). As for multilingual MT, multilingual Seq2Seq pretrained models consistently improve the performance of supervised translation but harm that of zero-shot translation by introducing more off-target issues. Based on these findings, we propose simple yet effective approaches to better leverage multilingual Seq2Seq pretrained models and achieve significant improvements on various translation tasks.
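To make the setting concrete, here is a minimal, hypothetical sketch of finetuning a multilingual Seq2Seq pretrained model (mBART-50) for MT with the Hugging Face transformers library; the checkpoint name, language codes, and sentence pair are illustrative assumptions, not the experimental setup of the talk.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Load a multilingual Seq2Seq pretrained model: both encoder and decoder are
# jointly pretrained, unlike encoder-only multilingual pretraining.
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="zh_CN"
)

# One standard MT finetuning step: cross-entropy on a parallel sentence pair.
batch = tokenizer(
    "Multilingual pretraining boosts machine translation.",
    text_target="多语言预训练提升机器翻译。",
    return_tensors="pt",
)
loss = model(**batch).loss
loss.backward()
```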
◆ Industrial workshop speech, by Fei Mi
Topic: Overview of Recent Progress on Foundation Models at Huawei Noah’s Ark Lab
Date and Time: Sep.24 16:40-17:00
Meeting Room: Guilin room (桂林厅)
Short Bio: Fei Mi is a research scientist at Huawei Noah’s Ark Lab specializing in foundation NLP models and dialog systems. He obtained his Ph.D. degree in Computer Science from the Swiss Federal Institute of Technology Lausanne (EPFL) in 2021, supervised by Prof. Boi Faltings. Prior to that, he obtained his MPhil degree from the Hong Kong University of Science and Technology (HKUST) under the supervision of Prof. Dit-Yan Yeung, and his Bachelor's degree from a joint program between Sun Yat-sen University (SYSU) and HKUST. His research interests mainly lie in dialog systems, foundation models, few-shot learning, domain adaptation, and recommendation, and he has published more than 10 papers at top AI conferences such as ACL, EMNLP, NAACL, AAAI, and IJCAI.
Abstract: Foundation models (a.k.a. pre-trained language models, or large models) are a new paradigm that enables NLP technologies to power a wide range of NLP applications. Foundation models are equipped with increasingly large numbers of parameters and are trained on broad data at scale through self-supervision to cope with a wide range of downstream tasks. A variety of foundation models have been recognized as a major transformation in how powerful AI systems can be built in different scenarios, such as language understanding (BERT), language generation (GPT), dialogue applications (LaMDA), multi-modality (DALL-E, Flamingo, Florence), decision making (GATO), and so on. In this talk, the speaker will briefly introduce recent leading research on foundation models at Huawei Noah’s Ark Lab, including (1) large-scale (200-billion-parameter) language model pretraining fully based on the Huawei technology stack (PanGu-Alpha); (2) model compression and acceleration (TinyBERT, QuantGPT, …); (3) open-domain dialogue model pretraining (PanGu-Bot); (4) code intelligence pretraining (PanGu-Coder); (5) multi-modality pretraining (SPIRAL, FILIP). Besides foundation models, the speaker will also briefly overview some other research interests and roadmaps at Huawei Noah’s Ark Lab – Speech & NLP Lab.
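As background for item (2), below is a toy sketch of the soft-label distillation objective that compression methods in the TinyBERT family build on; the temperature and weighting are illustrative assumptions, not the lab's actual recipe (TinyBERT additionally distills intermediate-layer and attention representations).

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 2.0, alpha: float = 0.5) -> torch.Tensor:
    """Blend hard-label cross-entropy with KL divergence to the teacher's
    temperature-softened predictions (classic knowledge distillation)."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```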
◆ Industrial workshop speech, by Shikun Feng
Topic: ERNIE-ViLG 2.0: a Vision-Language Generation Model in Baidu Wenxin
Date and Time: Sep.24 17:10-17:30
Meeting Room: Guilin room (桂林厅)
Short Bio: Shikun Feng, currently the principal architect of the natural language processing department at Baidu Inc., graduated from the Institute of Automation, Chinese Academy of Sciences. He is responsible for semantic representation, graph learning, intelligent document understanding, and other directions. His research and development results are widely used in search, information feeds, smart speakers, maps, and other Baidu products, significantly improving the user experience of hundreds of millions of users. He has won more than ten world championships in AI competitions, including KDD CUP, GLUE, SuperGLUE, SemEval, and DocVQA. He has published a number of papers at top international AI conferences such as CVPR, AAAI, IJCAI, KDD, ACM MM, and CIKM, one of which was rated among the most influential AAAI 2020 papers by Paper Digest. He holds more than 40 domestic and foreign technology patents and has won the China Excellent Patent Award. He has also received the SAIL Award, the highest award of the World Artificial Intelligence Conference, the Outstanding Science and Technology Achievement Award of the Chinese Association for Artificial Intelligence, and the Baidu Highest Award twice.
Abstract: Recently the field of text-to-image synthesis has attracted more and more attention; its goal is to generate artistic or realistic images from an input text prompt. The techniques behind this field have evolved very quickly, from GANs to Seq2Seq models, and most recently to diffusion models.
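For orientation, here is a toy sketch of a single reverse (denoising) step in a DDPM-style diffusion model; the noise-prediction network eps_model and the schedules alphas/alphas_bar are placeholders, and none of this reflects ERNIE-ViLG internals.

```python
import torch

def ddpm_reverse_step(x_t, t, eps_model, alphas, alphas_bar):
    """Sample x_{t-1} from x_t using the predicted noise eps_model(x_t, t)."""
    eps = eps_model(x_t, t)                                # predicted noise
    coef = (1.0 - alphas[t]) / torch.sqrt(1.0 - alphas_bar[t])
    mean = (x_t - coef * eps) / torch.sqrt(alphas[t])      # posterior mean
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    sigma = torch.sqrt(1.0 - alphas[t])                    # one common variance choice
    return mean + sigma * noise
```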
In this talk, I will first give a brief introduction to text-to-image synthesis techniques, and then present the two generations of text-to-image models in Baidu Wenxin: ERNIE-ViLG 1.0, a unified generative pre-training framework for bidirectional vision-language generation, and ERNIE-ViLG 2.0, a new state-of-the-art text-to-image model that generates images from Chinese text. In addition, I will share several useful skills (Prompt Books) for playing with ERNIE-ViLG 2.0 and show some interesting generated cases. The audience is welcome to visit this link (https://wenxin.baidu.com/moduleApi/ernieVilg) to try out the ERNIE-ViLG 2.0 API service.