Today, the integration of Baidu Search with large language models is delivering even greater value to users. AI has redefined search, transforming Baidu Search from a simple list of web results returned for text queries into an intelligent engine that can "listen and see." Its understanding of and adaptability to user queries have steadily improved, its content and services have become more accurate and diverse, and it has grown increasingly user-friendly. In this session, three engineers from Baidu Search will each discuss their explorations of LLM integration, practical applications of generative question answering, and the open-source retrieval engine PUCK. We welcome everyone to join the conversation.
Speaker: Junxian He
Title: Evaluating Synthetic Data for LLM Alignment
Abstract: In this talk, I will cover our recent work on manipulating synthetic data during LLM alignment. Specifically, I will discuss (1) evaluating alignment data along the axes of difficulty, diversity, and quality to enhance alignment performance; and (2) controlling the difficulty distribution of synthetic mathematical data to produce state-of-the-art chain-of-thought data for mathematical reasoning. I will also present our insights on the role of data during LLMs' self-alignment and self-improvement.
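To give a feel for this style of data selection, here is a minimal, hypothetical sketch of a difficulty/diversity/quality filtering loop; the `Example` fields, the `embed` function, and all thresholds are illustrative assumptions, not the method presented in the talk.

```python
# Hypothetical sketch: select alignment data by quality gate, difficulty
# ranking, and embedding-based deduplication for diversity. Not the
# talk's actual method; scorers and thresholds are assumptions.
import math
from dataclasses import dataclass

@dataclass
class Example:
    prompt: str
    response: str
    difficulty: float  # e.g., 1 - pass rate of a reference model on this prompt
    quality: float     # e.g., a reward-model score in [0, 1]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_alignment_data(pool, embed, min_quality=0.7, sim_cap=0.9):
    """Keep high-quality examples, prefer harder ones, drop near-duplicates."""
    # Quality gate first, then rank the survivors by difficulty.
    ranked = sorted(
        (ex for ex in pool if ex.quality >= min_quality),
        key=lambda ex: ex.difficulty,
        reverse=True,
    )
    selected, kept_vecs = [], []
    for ex in ranked:
        v = embed(ex.prompt)  # any sentence-embedding function works here
        # Diversity: skip examples too similar to anything already kept.
        if all(cosine(v, u) < sim_cap for u in kept_vecs):
            selected.append(ex)
            kept_vecs.append(v)
    return selected
```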
Speaker: Junxian He is an assistant professor in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology. He received his PhD in natural language processing from the Language Technologies Institute at Carnegie Mellon University. His recent research focuses on reasoning, synthetic data, and evaluation of large language models. He has served as an area chair for ICLR, ACL, and EMNLP.
Speaker: Bowen Yu
Title: Automated Alignment via Inductive Bias
Abstract: Alignment is the most critical step in building large language models (LLMs) that meet human needs. As LLMs develop rapidly and gradually surpass human capabilities, traditional alignment methods based on human annotation are increasingly unable to meet the demands of scale. There is therefore an urgent need to explore new sources of automated alignment signals and new technical approaches. This presentation will discuss aligning LLMs through inductive bias, which automatically steers the model toward desired behaviors by introducing suitable assumptions and constraints, without any training signal beyond the model itself.
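To make the notion of an inductive bias concrete, the sketch below shows one such family of assumptions, best-of-n self-selection, where the model's own judgment of its candidate outputs supplies the alignment signal; `generate` and `score_with_model` are hypothetical stand-ins for an LLM API and are not the techniques presented in the talk.

```python
# Minimal sketch of "alignment via inductive bias": no external labels,
# only the assumption that the response the model itself judges best is
# the one most aligned with the desired behavior.

def best_of_n(prompt, generate, score_with_model, n=8):
    """Sample n candidates and keep the one the model itself rates highest."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score_with_model(prompt, c))

def self_preference_pairs(prompts, generate, score_with_model, n=8):
    """Turn self-judged rankings into (chosen, rejected) pairs for
    preference optimization, again without human annotation."""
    pairs = []
    for p in prompts:
        cands = sorted(
            ((score_with_model(p, c), c) for c in (generate(p) for _ in range(n))),
            reverse=True,
        )
        pairs.append({"prompt": p, "chosen": cands[0][1], "rejected": cands[-1][1]})
    return pairs
```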
Speaker: Bowen Yu is an algorithm expert at Qwen, part of the Alibaba Group. He currently focuses on Qwen's post-training research and the development of the Qwen-Instruct models, contributing to a series of models including Qwen1.5-Chat and Qwen2-Instruct. He obtained his Ph.D. from the Institute of Information Engineering at the Chinese Academy of Sciences in 2022. To date, he has published over 50 papers in top-tier conferences and journals, including ICML, WWW, SIGIR, ACL, EMNLP, AAAI, and TOIS, and his work has received more than 3,000 citations.
Speaker: Zhenfei Yin
Title: Ensuring Trustworthiness Throughout the AI Life Cycle
Abstract: Current alignment techniques, such as Reinforcement Learning from Human Feedback (RLHF), often focus solely on the post-training stage. However, risks can be introduced throughout the entire AI lifecycle, including use-case design, data preparation, model training and fine-tuning, deployment, and product delivery. A systematic approach is necessary to mitigate risks at every stage. Guided by the concept of AI lifecycle trustworthiness, and considering the current state of foundation-model development, the AI Safety Research Center at the Shanghai AI Lab has carried out a series of open-source projects on alignment and evaluation at each stage of the AI lifecycle. Our research covers not only Large Language Models (LLMs) but also Multi-modal Large Language Models (MLLMs) and AI agents. Furthermore, we will present our outlook on future research in broader AGI alignment. We hope that our research will inspire and contribute to the ongoing dialogue on AI trustworthiness in society.
Speaker: Zhenfei Yin is a researcher at the Shanghai AI Lab, guided by Dr. Jing Shao. He is also a Ph.D. candidate at the University of Sydney, under the supervision of Professor Wanli Ouyang. Before his doctoral studies, he worked full-time at SenseTime's AGI group, supervised by Dr. Junjie Yan and Professor Xiaogang Wang. His primary research interests include multi-modal foundation models, multi-agent systems, trustworthy foundation models, and embodied agents. He has published dozens of papers in conferences and journals, including ICLR, NeurIPS, CVPR, ECCV, ACL, and IJCV. He also serves as a reviewer for ICLR, NeurIPS, ICML, ECCV, ICME, and TPAMI, among others, and has organized workshops and competitions at international conferences like ICML, ECCV, and CVPR.
Speaker: Rui Zheng
Title: Process Supervision via Reverse Curriculum Reinforcement Learning
Abstract: Building on OpenAI's classic reverse curriculum reinforcement learning method, we explore how to approximate process supervision under limited resources, relying only on demonstrations and outcome supervision. Through two works, "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" (ICML 2024) and "StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback" (ACL 2024), we verify the effectiveness of reverse curriculum reinforcement learning on reasoning and coding tasks, respectively. Finally, we summarize the method's shortcomings and offer directions and thoughts for further improvement.
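For readers unfamiliar with the technique, below is a minimal sketch of a reverse curriculum loop under outcome-only rewards: episodes start near the end of a demonstration, and the start point moves backward as the policy's success rate rises. `run_episode`, `rl_update`, and the episode's `success` attribute are hypothetical placeholders; for LLM reasoning, the "states" would be progressively shorter prefixes of a demonstrated solution.

```python
# Illustrative sketch of reverse curriculum reinforcement learning.
# Train from states late in a demonstration first (easy, near the goal),
# then move the start state backward once the policy succeeds reliably.

def reverse_curriculum_train(demo_states, policy, run_episode, rl_update,
                             success_target=0.8, batch=32, max_rounds=1000):
    """demo_states: states along one demonstration, ordered start -> goal."""
    start_idx = len(demo_states) - 2  # begin one step before the goal
    for _ in range(max_rounds):
        if start_idx < 0:
            break  # curriculum has reached the true start state
        episodes = [run_episode(policy, demo_states[start_idx])
                    for _ in range(batch)]
        rl_update(policy, episodes)  # outcome supervision only: success/failure
        if sum(e.success for e in episodes) / batch >= success_target:
            start_idx -= 1  # move the start point backward along the demo
    return policy
```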
Speaker: Rui Zheng is a large-model algorithm engineer at ByteDance. He holds a Ph.D. in computer science from Fudan University, where he was supervised by Professor Qi Zhang. His research interests include large-model alignment and applications in complex scenarios. He leads the open-source project MOSS-RLHF and is a core contributor to the text-robustness evaluation tool TextFlint. He has published more than ten academic papers in conferences such as ICLR, ACL, EMNLP, and COLING.
Speaker: Ning Ding
Title: From Specialist, to Generalist, to Specialized Generalist
Abstract: Large language models, one of the most significant advances in artificial intelligence in recent years, are gradually being applied across industries. Generally, the usability of large language models stems from their generalization capabilities, while their value creation relies on specialized expertise. However, current large language models still struggle to deliver high value in critical scenarios, and a gap remains before they can be fully transformed into productivity tools. This presentation will explore the evolution from early pre-trained language models tailored to specific tasks, to today's more general large language models, and finally to future foundation models that combine generalization with specialization.
Speaker: Ning Ding is a postdoctoral researcher in the Department of Electronic Engineering at Tsinghua University. His research interests include machine learning, the mechanisms of large language models, alignment methods, and applications. He received his Ph.D. in Computer Science and Technology from Tsinghua University in 2024. He has been selected for the Young Elite Scientists Sponsorship Program by CAST and has received several honors, including the ACL Best System Demonstration Paper Award, the World Artificial Intelligence Conference Young Excellent Paper Award, and the Baidu Scholarship.
Speaker: Tao Ji
Title: Hallucination Suppression in Large Vision-Language Model Alignment
Abstract: Large Vision-Language Models (LVLMs) have achieved impressive performance, yet research has pointed to a serious problem of object hallucination in these models. However, there is no clear conclusion about which part of the model these hallucinations originate from. In this talk, we present an in-depth investigation of the object hallucination problem specifically within the CLIP model, which serves as the backbone for many state-of-the-art vision-language systems. We show that even in isolation, the CLIP model is prone to object hallucinations, suggesting that the hallucination problem is not solely due to the interaction between the vision and language modalities. To address this, we propose a counterfactual data augmentation method that creates negative samples exhibiting a variety of hallucination issues.
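As a rough illustration of the counterfactual-negative idea, the sketch below injects an object known to be absent from the image into a caption and penalizes the model when that hallucinated caption scores too close to the faithful one. The `clip_similarity` function, the toy object injector, and the margin loss are illustrative assumptions; the talk's actual construction of negative samples is more varied.

```python
# Hypothetical sketch of counterfactual data augmentation against
# object hallucination: build a negative caption mentioning an object
# absent from the image, then enforce a similarity margin.
import random

def make_negative(caption, absent_objects):
    """Inject an object that does not appear in the image (counterfactual)."""
    obj = random.choice(absent_objects)
    return f"{caption.rstrip('.')} with a {obj}."

def hallucination_margin_loss(image, caption, absent_objects,
                              clip_similarity, margin=0.2):
    """Hinge loss: the faithful caption should beat the hallucinated one
    by at least `margin` in image-text similarity."""
    pos = clip_similarity(image, caption)
    neg = clip_similarity(image, make_negative(caption, absent_objects))
    return max(0.0, margin - (pos - neg))
```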
Speaker: Tao Ji is a postdoctoral researcher in the Department of Computer Science at Fudan University. He received his PhD in Computer Science and Technology from East China Normal University in 2023. His research interests include large multimodal models and efficient inference for LLMs. He has published more than 20 papers in international conferences and journals such as ACL, CoLM, EMNLP, and NAACL, and has received honors including the Shanghai Super Postdoctoral Fellowship, the Fudan University Super Postdoctoral Fellowship, and the Future Scientists and Scholars Training Program of East China Normal University.