
Research

Today, data is the most important resource affecting all walks of life - business, healthcare, finance, manufacturing, education, entertainment, and more. Once data is generated, we need to decide how to store it. To protect data against local failures and hacker attacks, and to make the best use of the available storage space, data is stored over a distributed network. Important issues in distributed storage include download speed, repair speed after local failures, storage efficiency, privacy guarantees and the ability to withstand attacks. Blockchain is an exciting, emerging example of a distributed and decentralized way of storing and managing massive data. We utilize tools and insights from mathematics, communication theory, information theory, statistics and error-correction coding theory to help construct the best possible distributed storage systems.

Data processing is another crucial issue. Machine learning is a type of data processing in which the necessary information is extracted from massive data using modern learning methods. Machine learning is increasingly built into critical autonomous decision-making processes in new applications such as self-driving cars and smart factories. Machine learning based on deep neural networks (DNNs) has found great recent success in many applications. However, DNNs require exorbitant amounts of training data as well as massively scaled networks. Often both the training data and the computing resources are widely scattered across the network. It thus becomes essential to train and operate a neural network whose parts reside at different locations in the network while jointly performing a given job. Distributed and decentralized learning and computing are critical in this sense and represent a new and exciting area of research.

Another imminent issue in machine learning is complexity and energy consumption. In certain applications, learning has to take place on a machine with limited computational capability and a finite energy or power budget, without resorting to powerful general-purpose GPUs. In this sense, hardware-friendly learning targeting low computational load and low energy/power consumption is of great practical interest.

Computing power, communication bandwidth and storage space are resources that can be traded with one another in operating a powerful distributed machine learning algorithm. What are the mathematical laws behind the optimal tradeoff? The practices and theories of machine learning, modern communication and distributed storage all come together in answering this question.

A few of our ongoing research efforts are described in more detail below:


Quick Menu

Meta Learning
Multi-Modal Learning
Incremental Few-Shot Learning
Few-Shot Segmentation and Edge-Detection
Generative Models for Robust Learning
Distributed Learning
Federated Learning
Low-Complexity Learning
Distributed Computing
Blockchain
Distributed Storage System



  • Meta Learning

    Meta learning aims to learn a general strategy for learning new tasks. We focus on theoretical analysis of meta learning and on developing meta-learning algorithms for fast, adaptive learning. We have proposed a novel meta-learning algorithm that augments a neural network with task-adaptive projection for few-shot learning [1], [2].
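
    As an illustration, the minimal sketch below shows episodic few-shot classification with a task-adaptive projection, in the spirit of (but not identical to) the algorithms in [1], [2]; the encoder, the projection network proj_net and the projection rule are illustrative assumptions.

      # Minimal sketch: episodic few-shot classification with a task-adaptive
      # projection (illustrative, not the exact algorithm of [1], [2]).
      import torch
      import torch.nn.functional as F

      def episode(encoder, proj_net, support_x, support_y, query_x, n_way):
          """Build class prototypes from the support set, compute a task-adaptive
          projection from them, and classify queries by nearest projected prototype."""
          z_s = encoder(support_x)                       # embed support samples
          protos = torch.stack([z_s[support_y == c].mean(0) for c in range(n_way)])
          M = proj_net(protos.mean(0, keepdim=True))     # task-adaptive projection (toy rule)
          M = M.view(z_s.size(1), z_s.size(1))
          q = encoder(query_x) @ M                       # project queries and prototypes
          p = protos @ M
          return -torch.cdist(q, p)                      # nearest-prototype logits

      # toy instantiation: 5-way 1-shot with random data
      d = 16
      encoder = torch.nn.Sequential(torch.nn.Linear(8, d), torch.nn.ReLU(), torch.nn.Linear(d, d))
      proj_net = torch.nn.Linear(d, d * d)
      support_x, support_y = torch.randn(5, 8), torch.arange(5)
      query_x = torch.randn(10, 8)
      logits = episode(encoder, proj_net, support_x, support_y, query_x, n_way=5)
      print(logits.argmax(dim=1))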



  • Multi-Modal Learning

    A multi-modal learning system combines signals from various types of sensors to derive optimal solutions for extremely complex applications (e.g., autonomous vehicles). We focus on developing hardware-friendly multi-modal learning algorithms based on clustered neural network structures, distributed processors and information-exchange modules.




  • Incremental Few-Shot Learning

    Learning novel concepts while preserving prior knowledge is a long-standing challenge in machine learning. The challenge becomes greater when a novel task is given with only a few labeled examples, a problem known as incremental few-shot learning. We proposed XtarNet [3], which learns to extract a task-adaptive representation (TAR) for facilitating incremental few-shot learning. The method utilizes a backbone network pretrained on a set of base categories while also employing additional modules that are meta-trained across episodes. Given a new task, the novel feature extracted by the meta-trained modules is mixed with the base feature obtained from the pretrained model. Combining the two features yields the TAR, which carries effective information for classifying both novel and base categories. Experiments on standard image datasets indicate that XtarNet achieves state-of-the-art incremental few-shot learning performance. The concept of TAR can also be used in conjunction with existing incremental few-shot learning methods.
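
    A rough sketch of this feature-mixing idea is given below; the module shapes and the mixing rule are illustrative assumptions rather than the exact XtarNet architecture of [3].

      # Minimal sketch of the feature-mixing idea: base features from a pretrained
      # backbone and novel features from meta-trained modules are combined into a
      # task-adaptive representation (TAR). Shapes and mixing rule are illustrative.
      import torch
      import torch.nn as nn

      class TARExtractor(nn.Module):
          def __init__(self, backbone, d):
              super().__init__()
              self.backbone = backbone            # pretrained on base categories (kept frozen)
              self.novel_net = nn.Linear(d, d)    # meta-trained across episodes
              self.mixer = nn.Linear(2 * d, d)    # meta-trained combiner producing the TAR

          def forward(self, x):
              with torch.no_grad():
                  base_feat = self.backbone(x)            # base feature (pretrained)
              novel_feat = self.novel_net(base_feat)      # novel feature (meta-trained)
              tar = self.mixer(torch.cat([base_feat, novel_feat], dim=-1))
              return tar                                  # used to classify base + novel classes

      # toy usage with a stand-in backbone and random inputs
      d = 32
      backbone = nn.Sequential(nn.Linear(8, d), nn.ReLU(), nn.Linear(d, d))
      model = TARExtractor(backbone, d)
      print(model(torch.randn(4, 8)).shape)   # torch.Size([4, 32])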




  • Few-Shot Segmentation and Edge-Detection

    Training a deep neural network requires a large amount of labeled data, which is scarce or expensive in many cases. Few-shot learning algorithms aim to tackle this problem, and advances in few-shot learning now allow machines to handle previously unseen classification tasks with only a few labeled samples in certain cases. Recently, more complicated few-shot learning problems such as few-shot object detection and few-shot semantic segmentation have attracted much interest. Labels for object detection or semantic segmentation are even harder to obtain, naturally motivating the formulation of few-shot detection and few-shot segmentation problems. Our work on both few-shot segmentation and few-shot edge detection suggests that task-adaptive feature transformation can be used to turn task-driven features into task-agnostic features that are well suited to the jobs at hand.




  • Generative Models for Robust Learning

    The ability to generalize well to unseen data is of paramount importance to the success of modern deep predictive models. At the same time, the models must also be robust against perturbations at inference time. Achieving these two objectives together presents a great challenge. Our recent work [4] on data augmentation using generative adversarial networks suggests that it is possible to achieve these two seemingly competing objectives. In our approach, augmented data points are generated in between the learned class manifolds. Experiments on both synthetic and real data show that our scheme far exceeds known methods in both generalization and adversarial robustness.
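
    The toy sketch below conveys the general idea of generating augmented points between class manifolds by interpolating in a generator's latent space; the untrained stand-in generator and the sampling rule are assumptions for illustration only, not the method of [4].

      # Toy sketch: decode interpolated latent codes of two classes to obtain
      # augmented samples lying between the class manifolds (illustrative only;
      # the actual method of [4] trains a GAN with a specific sampling rule).
      import torch
      import torch.nn as nn

      latent_dim, data_dim = 16, 64
      generator = nn.Sequential(nn.Linear(latent_dim + 1, 128), nn.ReLU(),
                                nn.Linear(128, data_dim))   # class-conditional generator (stand-in)

      def between_class_samples(z_a, z_b, label_a, label_b, n=8):
          """Interpolate latent codes of two classes and decode the midpoints."""
          alphas = torch.linspace(0.2, 0.8, n).unsqueeze(1)          # stay strictly between classes
          z = alphas * z_a + (1 - alphas) * z_b                      # latent interpolation
          soft_label = alphas * label_a + (1 - alphas) * label_b     # interpolated (soft) label
          x_aug = generator(torch.cat([z, soft_label], dim=1))
          return x_aug, soft_label

      z_a, z_b = torch.randn(1, latent_dim), torch.randn(1, latent_dim)
      x_aug, y_aug = between_class_samples(z_a, z_b, torch.ones(1, 1), torch.zeros(1, 1))
      print(x_aug.shape, y_aug.squeeze(1))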




  • Distributed Learning

    Distributed learning enables training of a large-scale learning model on a massive dataset. We investigate ways of speeding up and securing distributed learning in various scenarios. We propose coding schemes specifically geared to the tiered and broadcast nature of wireless edge networks.



    We also propose election coding, a coding framework for protecting a communication-efficient distributed learning algorithm (SignSGD with majority vote) against Byzantine attacks. This framework explores new information-theoretic limits of finding the majority opinion when some workers may be malicious, and paves the way toward robust and efficient distributed learning algorithms.
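
    For reference, the sketch below shows plain SignSGD with majority vote under Byzantine sign-flipping workers, i.e., the baseline that election coding is designed to protect; the noise model and worker counts are illustrative, and the coding layer itself is omitted.

      # Minimal sketch: SignSGD with majority vote, with Byzantine workers
      # modeled as flipping their sign votes (election coding layer omitted).
      import numpy as np

      rng = np.random.default_rng(0)
      n_workers, dim, n_byzantine = 9, 5, 2
      true_grad = rng.standard_normal(dim)

      votes = []
      for w in range(n_workers):
          local_grad = true_grad + 0.5 * rng.standard_normal(dim)  # noisy local gradient
          vote = np.sign(local_grad)                                # 1-bit compression per coordinate
          if w < n_byzantine:                                       # Byzantine workers flip their votes
              vote = -vote
          votes.append(vote)

      majority = np.sign(np.sum(votes, axis=0))   # server aggregates by majority vote
      print("true signs:    ", np.sign(true_grad))
      print("majority vote: ", majority)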






  • Federated Learning

    With the explosive growth in the number of smartphones, wearable devices and IoT sensors, a large portion of the data generated nowadays is collected outside the cloud, especially at distributed end-devices at the edge. Federated learning is a recent paradigm for this setup, which enables training of a machine learning model over a distributed network while significantly alleviating the privacy concerns of the individual devices. However, training requires repeated downloading and uploading of the models between the parameter server (PS) and the devices, presenting significant challenges in terms of the communication bottleneck at the PS and the non-IID data characteristics across devices. We attack these problems while also finding ways to meet stronger privacy requirements (even in federated learning, user data can be reconstructed from the model updates transmitted by the user). We take a coding-theoretic approach while also taking advantage of the inherent nature of the wireless edge network. Federated learning may also suffer from slow devices known as stragglers, as well as adversaries mounting model-update poisoning, data poisoning and backdoor attacks. We provide effective solutions to these practical problems in the form of semi-synchronous aggregation with entropy filtering and loss averaging using a small amount of public data.
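
    The sketch below illustrates a basic FedAvg-style training loop with non-IID devices, i.e., the setting described above; the local model, data and hyperparameters are toy assumptions, and compression, straggler handling and the defenses mentioned above are omitted.

      # Minimal sketch: FedAvg-style rounds where each device updates the global
      # model on its own (non-IID) data and the parameter server averages the
      # updates, weighted by local data size.
      import numpy as np

      rng = np.random.default_rng(1)
      dim, n_devices = 4, 5
      global_w = np.zeros(dim)
      local_data = []
      for k in range(n_devices):                           # non-IID: device-specific targets and sizes
          X = rng.standard_normal((20 + 10 * k, dim))
          y = X @ (np.ones(dim) + 0.3 * k)
          local_data.append((X, y))

      def local_update(w, X, y, lr=0.05, epochs=5):
          for _ in range(epochs):                          # local gradient steps (full-batch for simplicity)
              w = w - lr * X.T @ (X @ w - y) / len(y)
          return w

      for rnd in range(10):                                # communication rounds
          updates = [local_update(global_w, X, y) for X, y in local_data]
          sizes = np.array([len(y) for _, y in local_data], dtype=float)
          global_w = np.average(updates, axis=0, weights=sizes)   # PS aggregation
      print("global model after 10 rounds:", np.round(global_w, 2))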




  • Low-Complexity Learning

    Deep neural networks typically require a very large number of learnable parameters. As on-device learning, which relies heavily on local computing resources as well as local data, becomes increasingly important for low-latency optimization of distributed systems, low-complexity and hardware-friendly learning methods are in great demand. We take an algorithmic approach to hardware-constrained learning: instead of taking a hardware-centric view and attempting a direct hardware reduction of the full-blown model architecture at a desired level of performance, we strive for algorithmic innovations that reduce the overall model complexity. Examples of our approach include gradient filtering to stabilize backpropagation under heavily quantized gradients, as well as knowledge distillation combined with differential error representation learning.
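
    The toy example below conveys the flavor of the gradient-filtering idea, smoothing heavily quantized (1-bit) gradients with a running average before the weight update; it is a simplified stand-in under our own modeling assumptions, not the exact scheme used in our work.

      # Toy illustration: 1-bit quantized gradients are low-pass filtered before
      # the weight update, damping quantization noise (simplified stand-in).
      import numpy as np

      rng = np.random.default_rng(2)
      dim = 8
      w_true = rng.standard_normal(dim)
      w = np.zeros(dim)
      filt = np.zeros(dim)                              # filtered gradient state
      beta, lr, q_step = 0.9, 0.1, 0.1

      for t in range(200):
          X = rng.standard_normal((16, dim))
          grad = X.T @ (X @ (w - w_true)) / 16          # gradient of a quadratic loss
          q_grad = q_step * np.sign(grad)               # heavy (1-bit) quantization
          filt = beta * filt + (1 - beta) * q_grad      # low-pass filter over quantized gradients
          w -= lr * filt
      print("distance to optimum:", np.round(np.linalg.norm(w - w_true), 3))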


  • Distributed Computing

    Coding for distributed computing supports low-latency computation by relieving the burden caused by straggling workers. We propose various coding schemes that reflect the architecture of real-world distributed computing systems, and show that they outperform existing schemes in many practical scenarios [5]-[8].
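
    The following minimal example shows the basic principle with a toy (3, 2) code for matrix-vector multiplication, where the results of any two of three workers suffice; our schemes in [5]-[8] use more general codes tailored to real system architectures.

      # Minimal sketch: coded matrix-vector multiplication. A is split into
      # k = 2 blocks and encoded into n = 3 coded tasks (A1, A2, A1+A2), so
      # A @ x is recoverable from ANY 2 workers and one straggler can be ignored.
      import numpy as np

      rng = np.random.default_rng(3)
      A, x = rng.standard_normal((6, 4)), rng.standard_normal(4)
      A1, A2 = A[:3], A[3:]                          # split the computation into k = 2 parts
      tasks = {0: A1, 1: A2, 2: A1 + A2}             # coded tasks for n = 3 workers

      # suppose worker 1 straggles: only workers 0 and 2 return their results
      results = {w: tasks[w] @ x for w in (0, 2)}
      y1 = results[0]                                # A1 @ x directly
      y2 = results[2] - results[0]                   # recover A2 @ x = (A1 + A2) x - A1 x
      recovered = np.concatenate([y1, y2])
      print("max error:", np.max(np.abs(recovered - A @ x)))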



  • Blockchain

    Blockchain is a peer-to-peer distributed ledger technology that enables decentralization without any trusted central authority. We focus on applying coding and information theory to blockchain to enhance its performance for practical use [9].




  • Distributed Storage System

    A distributed storage system is a network of storage nodes that stores data reliably over a long period of time. We focus on network coding for distributed storage, considering bandwidth efficiency, latency and security issues [10]-[12].
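
    The sketch below illustrates the basic erasure-coding principle behind such systems with a toy (4, 2) MDS code over the reals; practical systems, and our work in [10]-[12], use finite-field codes and additionally optimize repair bandwidth and security.

      # Minimal sketch: (n, k) = (4, 2) erasure-coded storage. Any 2 surviving
      # nodes suffice to recover the original data (toy MDS code over the reals).
      import numpy as np

      k, n = 2, 4
      G = np.array([[1, 0], [0, 1], [1, 1], [1, 2]], dtype=float)  # generator matrix (any 2 rows invertible)
      data = np.array([[3.0, 1.0, 4.0],
                       [1.0, 5.0, 9.0]])                           # k data blocks (3 symbols each)
      stored = G @ data                                            # node i stores coded block stored[i]

      alive = [1, 3]                                               # nodes 0 and 2 have failed
      decoded = np.linalg.solve(G[alive], stored[alive])           # recover data from any k alive nodes
      print("recovered correctly:", np.allclose(decoded, data))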




