◆ Arman Cohan, Assistant Professor, Yale University; Research Scientist, Allen Institute for AI (AI2)
Title: Piecing the Puzzle: Language Models for Multi-Document Contexts
Time: Saturday, 10/14, 9:00-10:00
Abstract: Large language models have shown significant capabilities across a variety of NLP tasks. Despite these advances, their ability to process multi-document tasks remains underexplored. In this talk, I will first discuss our earlier work on extending long-context language models to tasks that require cross-document understanding, such as multi-document summarization. I will then turn to our work on enhancing these models to handle both short- and long-form generation across multiple documents, which proposes a novel pre-training approach to improve language models' ability to understand and integrate cross-document information. Finally, I will present our recent work on extreme multi-document scenarios and the role of retrieval, providing further insights into these tasks. This talk describes joint work with Avi Caciularu, Wen Xiao, John Giorgi, Yilun Zhao, and several other collaborators.
Bio: Arman Cohan is an Assistant Professor of Computer Science at Yale University and a faculty Research Scientist at the Allen Institute for AI (AI2). His research spans various problems at the intersection of Natural Language Processing and Machine Learning, including Language Modeling, Representation Learning, Generation, and their applications to specialized domains, including science. His research has been recognized with multiple awards, including a best paper award at EMNLP, an outstanding paper award at EACL, and an honorable mention at COLING. Prior to joining Yale, he was a Research Scientist at the Allen Institute for AI (AI2) and an Affiliate Assistant Professor at the University of Washington.
Homepage: https://armancohan.com/.
◆ Xia "Ben" Hu, Associate Professor, Rice University
Title: ChatGPT in Action: An Experimental Investigation of Its Effectiveness in NLP Tasks
Time: Saturday, 10/14, 10:45-11:45
Abstract: Recent progress in large language models has produced highly effective models like OpenAI's ChatGPT, which have demonstrated exceptional performance on various tasks, including question answering, essay writing, and code generation. This presentation will cover the evolution of LLMs from BERT to ChatGPT and showcase their use cases. Although LLMs are useful for many NLP tasks, one significant concern is the inadvertent disclosure of sensitive information, especially in the healthcare industry, where patient privacy is crucial. To address this concern, we developed a novel framework that generates high-quality synthetic data using ChatGPT and fine-tunes a local offline model for downstream tasks. The use of synthetic data improved downstream task performance, reduced the time and resources required for data collection and labeling, and addressed privacy concerns. Finally, we will discuss the regulation of LLMs, motivated in part by concerns about cheating in education, and introduce our recent survey on LLM-generated text detection along with the opportunities and challenges it presents.
Bio: Dr. Xia "Ben" Hu is an Associate Professor in the Department of Computer Science at Rice University. Dr. Hu has published over 200 papers in major academic venues, including NeurIPS, ICLR, KDD, WWW, IJCAI, and AAAI. An open-source package developed by his group, AutoKeras, has become the most widely used automated deep learning system on GitHub (with over 8,000 stars and 1,000 forks). His work on deep collaborative filtering, anomaly detection, and knowledge graphs has been incorporated into the TensorFlow package, Apple's production systems, and Bing's production systems, respectively. His papers have received ten Best Paper (Candidate) awards from venues such as ICML, WWW, WSDM, ICDM, AMIA, and INFORMS. He is a recipient of the NSF CAREER Award and the ACM SIGKDD Rising Star Award. His work has been cited more than 20,000 times, with an h-index of 60. He served as conference General Co-Chair for WSDM 2020 and ICHI 2023, and is the founder of AI POW LLC.
Homepage: https://cs.rice.edu/~xh37/index.html.
◆ Denny Zhou, Principal Scientist/Research Director, Google DeepMind
Title: Teach Language Models to Reason
Time: Sunday, 10/15, 9:00-10:00
Abstract: Over the past decades, the machine learning community has developed numerous data-driven techniques aimed at improving learning efficiency, such as semi-supervised learning, meta learning, active learning, and transfer learning. However, none of these techniques have proven highly effective for real-world natural language processing tasks. This shortcoming points to a fundamental flaw in machine learning: the absence of reasoning. Humans often learn from just a few examples because of their capacity to reason, as opposed to relying on data statistics. In this talk, I will describe the large language model (LLM) reasoning work that we pioneered and show that the techniques we developed can greatly narrow the gap between human intelligence and machine learning, surpassing the state of the art in the literature while requiring only a few annotated examples and no training. Our work was presented by Google CEO Sundar Pichai at Google I/O 2022 as a showcase of Google AI.
Bio: Denny Zhou is a Principal Scientist / Research Director at Google DeepMind, where he founded and leads the Reasoning Team. His research centers on building and teaching large language models (LLMs) to achieve human-level reasoning. His notable work includes chain-of-thought prompting, self-consistency decoding, least-to-most prompting, instruction tuning (FLAN2), LLM self-debugging, and various investigations of emergent properties of LLMs. He won the Google Research Tech Impact Award in 2022.
Homepage: https://dennyzhou.github.io/.
◆ Diyi Yang, Assistant Professor, Stanford University
Title: Human-Centered NLP for Positive Impact
Time: Sunday, 10/15, 10:30-11:30
Abstract: Large language models have revolutionized the way humans interact with AI systems, transforming a wide range of fields and disciplines. However, there is growing evidence of and concern about the negative aspects of NLP systems, such as biases and the lack of input from users. How can we build NLP systems that are more user-centric and more aware of human factors? In this talk, we present two case studies on how human-centered design can be leveraged to build responsible NLP applications. The first examines linguistic prejudice, using a participatory design approach to develop dialect-inclusive language tools and adaptation techniques for low-resource languages and dialects. The second introduces CARE, an interactive AI agent that supports therapists through LLM-empowered feedback and deliberate practice, as an initial step toward democratizing skill training with AI. We conclude by discussing the challenges and hidden risks of building human-centered NLP systems for positive impact.
Bio: Diyi Yang is an Assistant Professor in the Computer Science Department at Stanford University. Her research interests are computational social science and human-centered natural language processing. Her work has received multiple best paper nominations or awards at top NLP and HCI conferences (e.g., ACL, EMNLP, SIGCHI, and CSCW). She is a recipient of the IEEE “AI's 10 to Watch” award (2020), the Intel Rising Star Faculty Award (2021), the Samsung AI Researcher of the Year award (2021), the Microsoft Research Faculty Fellowship (2021), and the NSF CAREER Award (2022).
Homepage: https://cs.stanford.edu/~diyiy/.