Live stream: August 9 to 11, 2021
Monday, August 9
Interpretability - What Now?
Abstract:
TBA
Why Distillation Helps: A Statistical Perspective
Abstract:
Knowledge distillation is a technique for improving the performance of a simple "student" model by replacing its one-hot training labels with a distribution over labels obtained from a complex "teacher" model. While this simple approach has proven widely effective, a basic question remains unresolved: why does distillation help? This talk presents a statistical perspective on distillation which addresses this question, and provides a novel connection to extreme multiclass retrieval techniques. Our core observation is that the teacher seeks to estimate the underlying (Bayes) class-probability function. Building on this, we establish a fundamental bias-variance tradeoff in the student's objective: this quantifies how approximate knowledge of these class-probabilities can significantly aid learning. Finally, we show how distillation complements existing negative mining techniques for extreme multiclass retrieval, and propose a unified objective which combines these ideas.
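As a rough illustration of the label replacement described in this abstract, the sketch below compares a student's cross-entropy against one-hot labels with its cross-entropy against a teacher's label distribution. The logits, the `alpha` blending weight, and all function names are illustrative assumptions, not the speaker's formulation; `alpha=1.0` corresponds to fully replacing the one-hot label with the teacher's distribution.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy of a predicted distribution against target probabilities."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)

def distillation_loss(student_logits, teacher_probs, one_hot, alpha=1.0):
    """Blend of the soft-label loss (teacher) and the hard-label loss (one-hot).

    alpha is a hypothetical mixing weight introduced for this sketch only.
    """
    student_probs = softmax(student_logits)
    soft = cross_entropy(teacher_probs, student_probs)
    hard = cross_entropy(one_hot, student_probs)
    return alpha * soft + (1.0 - alpha) * hard

# Made-up numbers for a 3-class problem.
teacher_probs = softmax([2.0, 1.0, -1.0])   # teacher's estimate of class probabilities
one_hot = [1.0, 0.0, 0.0]                   # original training label
student_logits = [0.5, 0.2, 0.1]

loss_hard = distillation_loss(student_logits, teacher_probs, one_hot, alpha=0.0)
loss_soft = distillation_loss(student_logits, teacher_probs, one_hot, alpha=1.0)
loss_mix = distillation_loss(student_logits, teacher_probs, one_hot, alpha=0.5)
```

The soft-label loss is smallest when the student's distribution matches the teacher's, which is one way to see why the quality of the teacher's class-probability estimate matters.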
Toward a Tractable Solution for Human-in-the-loop Reinforcement Learning: Algorithm and Benchmark
Abstract:
Deep Reinforcement Learning (RL) has been successful in a range of challenging domains, such as board games, video games, and robotic control tasks. Scaling RL to many applications, however, is yet precluded by a number of challenges. One such challenge lies in designing a suitable reward function that is sufficiently informative yet easy enough to provide. Human-in-the-loop RL methods allow practitioners to instead interactively teach agents through tailored feedback; however, such approaches have been challenging to scale since human feedback is very expensive. In this talk, I’ll present PEBBLE: a feedback-efficient RL algorithm by which learning is largely autonomous and supplemented by a practical number of preferences provided by a supervisor. Based on off-policy RL and unsupervised pre-training, our method is able to utilize real-time human feedback to effectively prevent reward exploitation and learn new behaviors that are difficult to specify with standard reward functions. Additionally, I’ll introduce B-Pref: a benchmark specially designed for preference-based RL to further our understanding of the strengths of existing algorithms.
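To make the idea of learning from preferences concrete, here is a minimal sketch of how a reward model can be scored against a supervisor's pairwise preference between two behavior segments. It assumes a Bradley-Terry style preference model, which is a common choice in preference-based RL; the abstract does not spell out the model, and the toy reward function and segments below are hypothetical.

```python
import math

def segment_return(reward_fn, segment):
    """Sum of predicted rewards over the states in one behavior segment."""
    return sum(reward_fn(state) for state in segment)

def preference_prob(reward_fn, seg_a, seg_b):
    """Probability that seg_a is preferred over seg_b (Bradley-Terry form)."""
    ra = segment_return(reward_fn, seg_a)
    rb = segment_return(reward_fn, seg_b)
    return 1.0 / (1.0 + math.exp(rb - ra))

def preference_loss(reward_fn, seg_a, seg_b, label):
    """Binary cross-entropy; label is 1.0 if the supervisor chose seg_a, else 0.0."""
    p = preference_prob(reward_fn, seg_a, seg_b)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

# Toy setup: states are plain numbers and the reward model is linear (assumption).
reward_fn = lambda state: 1.0 * state
seg_a = [1.0, 2.0]
seg_b = [0.5, 0.5]

p_a = preference_prob(reward_fn, seg_a, seg_b)
loss_if_a_chosen = preference_loss(reward_fn, seg_a, seg_b, 1.0)
loss_if_b_chosen = preference_loss(reward_fn, seg_a, seg_b, 0.0)
```

Minimizing this loss over many labeled pairs pushes the reward model to agree with the supervisor, so each preference label does the work that a hand-designed reward function would otherwise have to do.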
Universal Learning Machines for Human-Level AI
Abstract:
Despite great progress, current machine learning methods are restricted to the function mapping framework. To achieve a truly human-level intelligence, we propose to build universal learning machines (ULMs), i.e. machines that recursively improve their performance by learning continually 24/7 without human intervention. We identify three key elements of the ULM and describe an architecture for employing these elements progressively as we move from Level 2 AI through Levels 3, 4, and to Level 5 (Human-Level AI) and beyond.
Training Dynamics and Complexity of Infinitely Wide WGAN and Minimax Optimization
Abstract:
Adversarial training based on solving a minimax optimization is empirically trickier and theoretically less understood than training dynamics based on loss minimization. This talk presents recent results on the minimax training dynamics under two simplifying setups: (i) the limit of the generator network in a WGAN being infinitely wide (ii) the loss being convex-concave. Under assumption (i), we show that the loss landscape has no spurious stationary points. Under assumption (ii), we present an acceleration mechanism distinct from Nesterov's acceleration and establish an optimal O(1/k^2) rate of convergence.
Recent Methods on Improving ML Generalization via Data Augmentation and Mixup
Abstract:
In this talk, I will first introduce various recent data augmentation and data mixup techniques for improving the generalization ability of learned ML models. Then, I will discuss our latest research on state-of-the-art mixup methods, which appeared at ICML 2020 and ICLR 2021 (oral).
Graph Learning for Drug Discovery and Personalized Medicine
Abstract:
"Many research problems in drug discovery and personalized medicine can be naturally formulated as graphs. In this formulation, each drug or patient is represented as a graph and classifying drugs or patients requires graph-level learning unlike node-level or link-level learning in many graph learning problems. In this talk, I will discuss some of recent works, a graph learning problem for toxicity prediction of small molecule drugs and another graph learning problem for cancer subtype and metastasis prediction using gene expression data, a.k.a, transcriptome. The first problem of toxicity prediction for drugs is a graph mining problem in search of sets of subgraph structures or toxicophores that can represent toxicity of drugs as graphs. Since we do not know toxicophores, a naive approach is to enumerate all possible subgraphs that are frequent in toxic drugs. However, this approach is computationally infeasible. We use Markovian random walks on chemical graphs to generate subgraphs and these subgraphs are screened for over-representation in toxic chemicals using information theory. However, not many of these subgraphs are over-represented, thus we use graph pruning techniques to refine and search for toxicophore candidates until enough number of over-represented subgraphs are discovered. Finally, subgraphs are combinatorially combined as toxicophores using frequent pattern mining techniques. The second problem of using gene expression data for cancer subtype and metastasis prediction is formulated as graphs where a graph represents a patient. In the patient graph, nodes are genes and edges are known interactions between genes in the template of protein interaction network. In this formulation, classifying patients is to classify a set of graphs with labels. For the cancer subtype classification, we combined spectral graph learning and relation learning together. 
For the metastasis prediction for early oral cancer, we used a novel graph reduction techniques using both biological knowledge and gene expression information to create a low dimensional embedding space where patients with and without metastatic potentials are distinguished.
Joint work with Drs. Sangsoo Lim, Sungmin Rhee, and Minsoo Kim"
How to Train Your Virtual Dragon via Deep Learning
Abstract:
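The human body consists of about 650 muscles and about 300 bones. Even a simple movement such as walking requires sending an appropriate level of muscle contraction signal to hundreds of muscles at every moment in order to keep balance and step forward. In patients with conditions such as cerebral palsy, musculoskeletal changes, such as weakened or stiffened muscles or deformed bones, ultimately lead to a pathological gait. Over the past five years, Professor Jehee Lee's research has focused on three problems: predictive gait simulation, which infers walking motion from body conditions; intelligent gait analysis, which conversely infers body conditions from walking motion; and patient-specific musculoskeletal modeling, which builds a musculoskeletal model for a specific individual. This work also continues in collaboration with the Department of Orthopedic Surgery at Seoul National University Bundang Hospital, with the aim of applying the results to medicine in practice. In parallel, the research covers character animation for computer games and the reproduction of the movements of various animals, such as birds flying in the sky and octopuses swimming in water. This lecture discusses how character animation technology has advanced over the past 20 years, how deep learning is changing the field of character animation, and how the latest techniques can be applied to interactive applications such as games.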
Tuesday, August 10
Machine Learning for Deep Image Manipulation
Abstract:
Deep generative models, such as Generative Adversarial Networks (GANs) in particular, can sample realistic images from Gaussian noise. However, are they good for image editing? Image editing requires the output to retain some resemblance to the user-provided input image. In this talk, I will discuss a different formulation in which the generator network is trained to transform one image to another. I will explore some interesting ways to constrain the generator to respect the input images, and show that they are indeed useful for image editing and other practical tasks.
Understanding hardware-mapping-model co-design space for efficient deep learning inference
Abstract:
Deep learning-based applications are widely adopted from data centers to edge devices to deliver high-quality results to users. Many such applications have deep neural networks (DNNs) as their backbones, and service providers train the model and provide services based on the trained model. Using the trained model, users mainly run inference on their own devices. However, such devices (e.g., smartphones and AR glasses) often have limited computing power and battery capacity, so custom chips for accelerating DNN inference, known as DNN accelerators, have emerged.
DNN accelerators are one of the key approaches to efficient and powerful AI chips for inference, enabling quick responses, long battery life, and low heat dissipation. However, although DNN accelerators offer high potential inference efficiency (computing power and energy efficiency), the effective inference efficiency also depends on the workload (DNN model) and on the mapping strategy (dataflow + tiling) of DNN operators onto the accelerator.
Therefore, this lecture aims to understand the complex trade-off space across the DNN model, mapping, and DNN accelerator design. For that, I will first discuss simple examples for intuition and show the impact of mapping and hardware choices using a DNN accelerator performance/cost model, MAESTRO. Based on the observation from data showing that no single mapping strategy is ideal for all DNNs/hardware, I will introduce one of the latest works at Facebook Reality Labs, Herald, which explores heterogeneous dataflow accelerator architectures.
Writing with Artificial Intelligence
Abstract:
In this talk, we cover topics centered around the broad theme of “Writing with Artificial Intelligence (AI),” with the goal of providing both a high-level summary of this emerging research area and in-depth discussions of several recent works. We first focus on the general problem of using AI to enhance productivity in writing, including ways to develop autocomplete systems and evaluate existing thesaurus systems. We then transition to the arguably more challenging question of “how can AI enhance human creativity?” Here, we discuss the current landscape, along with our ongoing work on understanding human-LM (language model) interaction. Finally, we conclude by mentioning some risks/opportunities that might arise from increased adoption of AI-assisted writing systems.
Artificial Intelligence and Privacy
Abstract:
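In the era of 'ontact', which combines the 'untact' (contact-free) trend triggered by COVID-19 with 'hyper-connectivity', protecting personal information and data privacy is a matter of survival, not of choice, in nearly every industry, including IT, BT, NT, medicine and healthcare, finance, and culture. This lecture focuses on the theoretical and practical challenges of designing privacy-preserving AI systems and algorithms, and briefly introduces recent techniques in the core technologies of AI for privacy protection: differential privacy, federated learning, homomorphic encryption, and adversarial machine learning.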
Private AI and HE
Abstract:
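With the arrival of the big data era, both the demand for and the importance of privacy-preserving data analysis, which supports the use and the protection of data at the same time, are growing. Methods that transform data to protect personal information during analysis include statistical de-identification, differential privacy, federated learning, and homomorphic encryption. Homomorphic encryption in particular performs all computation on encrypted data, giving complete control over leakage of personal information.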
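The idea of computing on encrypted data can be sketched with a toy Paillier-style scheme, which is only additively homomorphic and is not necessarily the scheme covered in this talk. The tiny primes, key sizes, and function names below are assumptions chosen for readability and are far too small to be secure.

```python
import math
import random

def keygen(p=61, q=53):
    """Build a toy key pair from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return n, (lam, mu)

def encrypt(n, message):
    """Encrypt an integer 0 <= message < n with fresh randomness."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # Uses generator g = n + 1, so g**message mod n^2 equals 1 + message * n.
    return ((1 + message * n) * pow(r, n, n2)) % n2

def decrypt(n, private_key, ciphertext):
    """Recover the plaintext integer from a ciphertext."""
    lam, mu = private_key
    l_value = (pow(ciphertext, lam, n * n) - 1) // n
    return (l_value * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

n, private_key = keygen()
c_sum = add_encrypted(n, encrypt(n, 12), encrypt(n, 30))
total = decrypt(n, private_key, c_sum)
```

The party that calls `add_encrypted` never sees 12, 30, or their sum in the clear; only the holder of the private key can decrypt the result.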
Self-Supervised Learning for Videos
Abstract:
In this talk, I will introduce three of our recent works on self-supervised learning. First, we introduce techniques for reducing the number of parameters of multimodal Transformers in the context of audio-visual video representation learning. Second, we present an efficient self-supervised approach that learns video representations directly from compressed videos. Finally, we discuss neural activation coding (NAC), a novel approach for learning deep representations from unlabeled data by maximizing the mutual information between activation patterns of the encoder and the data over a noisy communication channel. All of these works were recently published at ICLR 2021 and ICML 2021.
Edge-Cloud Collaborative Systems for Live Video Analytics
Abstract:
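Demand for real-time video analytics keeps growing in areas such as mixed reality, autonomous driving, and video surveillance. This talk surveys state-of-the-art techniques for real-time video analytics, including systems developed in our lab.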
Algorithm/Hardware Co-Design for Extreme-Scale Deep Neural Networks
Abstract:
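Extreme-scale models have recently been actively developed and deployed, mainly for natural language processing (NLP) applications. This seminar introduces algorithm-hardware optimization techniques developed in our lab for training and running inference on such extreme-scale models efficiently. Specifically, it covers A3 [HPCA'20] and ELSA [ISCA'21], hardware acceleration techniques for efficiently processing the attention operation, a key building block of Transformer-based models, and Behemoth [FAST'21], a NAND flash-based training solution that greatly improves the cost efficiency of extreme-scale model training platforms.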
Wednesday, August 11
Addressing Information Seeking Queries: From Finding to Presenting Answers
Abstract:
We aim for a QA system that can satisfy users' information needs. How can we build such a system? In this talk, I will discuss the full stack of addressing information-seeking queries, from collecting datasets and building models to communicating answers to the questioners. First, we discuss the implicit assumptions in existing datasets, such as assumed geographical and temporal contexts. Then, we analyze the status of current QA models, quantifying the remaining headroom. Despite rapid progress in QA, models are not robust and cannot always return correct answers. Thus, in the last part, I will cover how to better communicate model predictions, such as teaching models when to abstain from answering.
Styles and Interactions in Human-centered NLP
Abstract:
Despite the recent advances of massive language models like GPT-3, texts predicted by such systems are far from any human-written text. In fact, they most often produce nonfactual, incoherent, or pragmatically inappropriate text. Also, the lack of interaction with real users makes the systems less controllable and impractical. Our lab is focused on developing linguistically informed NLP models and building interactive NLP systems. In this talk, I introduce two recent works that make NLP systems more human-centered: stylistic variation of language (ACL 2021) and interactions with NLP systems (CHI 2021).
Toward Real-World Natural Language Understanding
Abstract:
Despite rapid progress in natural language understanding (NLU), current systems are developed based on a number of unrealistic assumptions (background knowledge being given, well-defined input from users, and each model being trained on a single data distribution), making it infeasible to apply such systems in real life. In this talk, we relax these assumptions by (1) introducing advanced models that retrieve and process world knowledge from external sources, (2) embracing potential ambiguities and false presuppositions in the input from users, and (3) proposing novel methods that prompt a single model for a range of tasks. I will present how our work has made breakthroughs in representative NLU tasks such as question answering, and highlight avenues for future work.
Systems for Large-scale AI
Abstract:
Since GPT-3 was announced in 2020, large-scale AI has been revolutionizing various domains such as natural language processing and multi-modal models. The development of large-scale AI has been made possible by a software stack that accelerates its execution. In this talk, I will cover the main system components in the stack, along with several important research questions and solutions that drive large-scale AI, including the solution built by FriendliAI.
Geometry in the Deep Learning Era
Abstract:
While deep learning has been a powerful tool for computer vision in many downstream tasks, its application to 3D geometry is not straightforward. Processing 3D data requires significantly more memory and computation, and it is harder to acquire 3D measurements or large-scale fine-grained annotation than the 2D (image) counterpart. In this talk, I will discuss the efforts to utilize three-dimensional physics and metric information in various degrees in addition to the rich semantic information obtained from image evidence. First, I will show an example of using 3D measurements only for shape completion. Second, I will discuss how we can combine 3D measurements to assist panoramic image localization. Lastly, I will explain how we can infer physical interactions without explicitly modeling 3D knowledge. The ultimate goal is to find the optimal framework to represent the dynamic environment around us and enable spatial perception for various applications of embodied intelligence.
Stock Prediction with AI
Abstract:
"주식 가격 예측은 데이터에 노이즈가 많고 수많은 요인에 의존하기 때문에 인공지능에서 가장 어려운 문제중 하나다. 하지만 주식 가격은 완전히 무작위로 움직이는 것이 아니며, 주식 가격의 움직임을 정확하게 예측하는 것은 일반인들의 자산 관리를 혁신할수 있는 중요한 기술이다. 본 발표에서는 인공지능을 이용한 최신 주식 가격 예측 기술을 설명한다. "
Node Classification with Belief Propagation
Abstract:
Given an undirected graph, how can we classify its nodes accurately and efficiently? Belief propagation (BP) is an inference algorithm widely used for this purpose, with applications including fraud detection, malware detection, and recommendation. I introduce our works that utilize BP for node classification in difficult scenarios that cannot be handled by typical classifiers such as graph convolutional networks. We focus on cases where no node attributes are given, test nodes have no neighbors, or the given graph is too large.
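For readers unfamiliar with BP, the following is a minimal sketch of sum-product message passing for node classification on a small undirected graph. It assumes a simple homophily edge potential (neighbors tend to share a class) and made-up priors; it is a textbook-style illustration, not the speaker's method, and the function and parameter names are hypothetical.

```python
def belief_propagation(edges, priors, homophily=0.9, iterations=10):
    """Return per-node class beliefs after synchronous message passing."""
    neighbors = {node: [] for node in priors}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    k = len(next(iter(priors.values())))  # number of classes

    def potential(a, b):
        # Assumed edge potential: same-class neighbors are more compatible.
        return homophily if a == b else (1.0 - homophily) / (k - 1)

    # messages[(u, v)] is what node u currently tells node v about v's class.
    messages = {(u, v): [1.0 / k] * k for u in neighbors for v in neighbors[u]}
    for _ in range(iterations):
        updated = {}
        for (u, v) in messages:
            outgoing = []
            for b in range(k):
                total = 0.0
                for a in range(k):
                    product = priors[u][a]
                    for w in neighbors[u]:
                        if w != v:  # exclude the message that came from v
                            product *= messages[(w, u)][a]
                    total += potential(a, b) * product
                outgoing.append(total)
            norm = sum(outgoing)
            updated[(u, v)] = [x / norm for x in outgoing]
        messages = updated

    beliefs = {}
    for node in neighbors:
        belief = list(priors[node])
        for w in neighbors[node]:
            belief = [bi * mi for bi, mi in zip(belief, messages[(w, node)])]
        norm = sum(belief)
        beliefs[node] = [x / norm for x in belief]
    return beliefs

# Chain 0 - 1 - 2: only node 0 has an informative prior (made-up values).
edges = [(0, 1), (1, 2)]
priors = {0: [0.9, 0.1], 1: [0.5, 0.5], 2: [0.5, 0.5]}
beliefs = belief_propagation(edges, priors)
```

Nodes 1 and 2 start with uninformative priors, yet their beliefs lean toward class 0 because evidence from node 0 propagates along the edges. On graphs with cycles this procedure becomes loopy BP, which is approximate.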
Modeling Cells through Deep Learning of Biological Pathways and Epigenetic Regulations
Abstract:
Despite the success of deep learning in many image and text-processing applications, it has achieved limited success in many scientific fields such as biology, medicine, and healthcare. The major bottleneck of current black-box deep learning models in these domains is their large data requirement. Biomedical datasets typically have few samples and many features, resulting in a high-dimensional, low-sample-size (HDLSS) problem. Recently, Knowledge Guided Machine Learning (KGML), which integrates machine learning models with domain knowledge accumulated over decades, has been highlighted as a way to bridge the gap between data requirements and model performance. In this talk, I introduce two of our KGML works in the biology domain: (1) a biological pathway-guided graph neural network model for predicting cancer subtypes, and (2) an epigenetic gene regulation-guided deep learning model for predicting gene expression.
Knowledge-Enhanced Text Generation on Colloquial Language
Abstract:
TBA
Mitigating Shortcut Learning of NLP Models
Abstract:
TBA
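Online lecture link
* The 2021 AI Summer School is streamed live simultaneously on Zoom and YouTube.
Every day during the AI Summer School, there is a prize giveaway sponsored by the Institute of Computer Technology.
Each day, AirPods and mechanical keyboards are awarded to attendees who write excellent attendance reports.
Organized by: SNU Department of Computer Science and Engineering, AI Institute, Institute of Computer Technology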
Copyright 2020 Seoul National University All Rights Reserved.