Research Overview

I develop trustworthy and scalable ML algorithms and systems while making the most efficient use of data.

Q1. How can we design trustworthy and scalable ML systems?

Q2. How can we design efficient ML algorithms?

Q3. How much data is needed for reliable and efficient ML, and what should I do if I don’t have enough data?

Q4. How can we solve real-world problems using ML?

Some of my recent work can be roughly clustered as follows.

Q1. How can we design trustworthy and scalable ML systems?

  • Improving Fairness via Federated Learning
    Y. Zeng, H. Chen, and K. Lee
    MLSys-CrossFL 2022
    AAAI 2022 Workshop on Federated Learning
  • Dynamic Decentralized Federated Learning
    S. Dai, K. Lee, and S. Banerjee
    MLSys-CrossFL 2022
  • Breaking Fair Binary Classification with Optimal Flipping Attacks
    C. Jo, J. Sohn, and K. Lee
    arXiv 2022
  • Debiasing Pre-Trained Language Models via Efficient Fine-tuning
    M. Gira, R. Zhang, and K. Lee
    ACL Workshop on Language Technology for Equality, Diversity, Inclusion 2022
  • Federated Unsupervised Clustering with Generative Models
    J. Chung, K. Lee, and K. Ramchandran
    AAAI 2022 Workshop on Federated Learning
  • Sample Selection for Fair and Robust Training
    Y. Roh, K. Lee, S. Whang, and C. Suh
    NeurIPS 2021
  • Gradient Inversion with Generative Image Prior
    J. Kim, J. Jeon, K. Lee, S. Oh, and J. Ok
    NeurIPS 2021
  • Coded-InvNet for Resilient Prediction Serving Systems
    T. Dinh and K. Lee
    ICML 2021 long oral
  • Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification
    S. Agarwal, H. Wang, K. Lee, S. Venkataraman, and D. Papailiopoulos
    MLSys 2021
  • FairBatch: Batch Selection for Model Fairness
    Y. Roh, K. Lee, S. Whang, and C. Suh
    ICLR 2021
  • Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
    H. Wang, K. Sreenivasan, S. Rajput, H. Vishwakarma, S. Agarwal, J. Sohn, K. Lee, and D. Papailiopoulos
    NeurIPS 2020
  • FR-Train: A mutual information-based approach to fair and robust training
    Y. Roh, K. Lee, S. Whang, and C. Suh
    ICML 2020
  • Improving Model Robustness via Automatically Incorporating Self-supervision Tasks
    D. Kim, K. Lee, and C. Suh
    NeurIPS 2019 Workshop on Meta-Learning
  • UberShuffle: Communication-efficient Data Shuffling for SGD via Coding Theory
    J. Chung, K. Lee, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran
    SysML 2018, NIPS Workshop on Machine Learning Systems 2017
  • Speeding Up Distributed Machine Learning Using Codes
    K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran
    IEEE Transactions on Information Theory January 2018
    The Joint Communications Society/Information Theory Society Paper Award, 2020
  • High-Dimensional Coded Matrix Multiplication
    K. Lee, C. Suh, and K. Ramchandran
    IEEE ISIT 2017
  • The MDS Queue: Analysing the Latency Performance of Erasure Codes
    K. Lee, N. Shah, L. Huang, and K. Ramchandran
    IEEE Transactions on Information Theory May 2017
  • On Scheduling Redundant Requests With Cancellation Overheads
    K. Lee, R. Pedarsani, and K. Ramchandran
    IEEE/ACM Transactions on Networking April 2017
  • When Do Redundant Requests Reduce Latency?
    N. Shah, K. Lee, and K. Ramchandran
    IEEE Transactions on Communications February 2016
  • A VoD System for Massively Scaled, Heterogeneous Environments: Design and Implementation
    K. Lee, L. Yan, A. Parekh, and K. Ramchandran
    IEEE MASCOTS 2013

Q2. How can we design efficient ML algorithms?

  • LIFT: Language-Interfaced FineTuning for Non-Language Machine Learning Tasks
    T. Dinh*, Y. Zeng*, R. Zhang, Z. Lin, M. Gira, S. Rajput, J. Sohn, D. Papailiopoulos, and K. Lee
    arXiv 2022
  • Super Seeds: extreme model compression by trading off storage with computation
    N. Lee*, S. Rajput*, J. Sohn, H. Wang, A. Nagle, E. Xing, K. Lee, and D. Papailiopoulos
    ICML Workshop on Updatable Machine Learning (UpML 2022)
  • Improved Input Reprogramming for GAN Conditioning
    T. Dinh, D. Seo, Z. Du, L. Shang, and K. Lee
    ICML Workshop on Updatable Machine Learning (UpML 2022)
  • Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment
    T. Dinh, J. Sohn, S. Rajput, T. Ossowski, Y. Ming, J. Hu, D. Papailiopoulos, and K. Lee
    arXiv 2022
  • Rare Gems: Finding Lottery Tickets at Initialization
    K. Sreenivasan, J. Sohn, L. Yang, M. Grinde, A. Nagle, H. Wang, K. Lee, and D. Papailiopoulos
    arXiv 2022
  • Permutation-Based SGD: Is Random Optimal?
    S. Rajput, K. Lee, and D. Papailiopoulos
    ICLR 2022
  • GenLabel: Mixup Relabeling using Generative Models
    J. Sohn, L. Shang, H. Chen, J. Moon, D. Papailiopoulos, and K. Lee
    ICML 2022

Q3. How much data is needed for reliable and efficient ML, and what should I do if I don’t have enough data?

  • Improved Input Reprogramming for GAN Conditioning
    T. Dinh, D. Seo, Z. Du, L. Shang, and K. Lee
    arXiv 2022
  • Discrete-Valued Latent Preference Matrix Estimation with Graph Side Information
    C. Jo and K. Lee
    ICML 2021
  • Reprogramming GANs via Input Noise Design
    K. Lee, C. Suh, and K. Ramchandran
    ECML PKDD 2020
  • SAFFRON: Sparse-Graph Code Framework for Group Testing
    K. Lee, R. Pedarsani, and K. Ramchandran
    IEEE Transactions on Signal Processing 2019
  • Community Recovery in Hypergraphs
    K. Ahn, K. Lee, and C. Suh
    IEEE Transactions on Information Theory 2019
  • Binary Rating Estimation with Graph Side Information
    K. Ahn, K. Lee, H. Cha, and C. Suh
    NeurIPS 2018
  • Hypergraph Spectral Clustering in the Weighted Stochastic Block Model
    K. Ahn, K. Lee, and C. Suh
    IEEE Journal of Selected Topics in Signal Processing October 2018
  • Simulated+Unsupervised Learning With Adaptive Data Generation and Bidirectional Mappings
    K. Lee, H. Kim, and C. Suh
    ICLR 2018
  • Information-theoretic Limits of Subspace Clustering
    K. Ahn, K. Lee, and C. Suh
    IEEE ISIT 2017
  • PhaseCode: Fast and Efficient Compressive Phase Retrieval based on Sparse-Graph-Codes
    R. Pedarsani, D. Yin, K. Lee, and K. Ramchandran
    IEEE Transactions on Information Theory June 2017

Q4. How can we solve real-world problems using ML?

  • Crash to Not Crash: Learn to Identify Dangerous Vehicles using a Simulator
    H. Kim, K. Lee, G. Hwang, and C. Suh
    AAAI 2019, ICML Workshop on Machine Learning for Autonomous Vehicles 2017
  • Large-scale and Interpretable Collaborative Filtering for Educational Data
    K. Lee, J. Chung, and C. Suh
    KDD Workshop on Advancing Education with Data 2017
  • Machine Learning Approaches for Learning Analytics: Collaborative Filtering or Regression With Experts?
    K. Lee, J. Chung, Y. Cha, and C. Suh
    NIPS Workshop on Machine Learning for Education 2016