About me
Howdy! I am a final-year PhD candidate in the Computer Science Department at UT Austin, advised by Prof. Inderjit S. Dhillon and Prof. Sujay Sanghavi. Broadly speaking, I am interested in developing provably better optimization algorithms and generalization-improving techniques for machine learning, especially under data-centric constraints. Previously, I received a combined B.Tech. and M.Tech. degree in Electrical Engineering from IIT Bombay, where I worked under the guidance of Prof. Subhasis Chaudhuri and was awarded the Undergraduate Research Award.
I will join Google Research as a Research Scientist in 2025! During my PhD, I have interned at Google DeepMind, Google Research, and Amazon.
You can check out my outdated CV here. My email is rdas(at)utexas(dot)edu.
Papers
“Retraining with Predicted Hard Labels Provably Increases Model Accuracy” - Rudrajit Das, Inderjit S. Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi and Peilin Zhong.
Preprint. Download here.
“Towards Quantifying the Preconditioning Effect of Adam” - Rudrajit Das, Naman Agarwal, Sujay Sanghavi and Inderjit S. Dhillon.
Preprint. Download here.
“Understanding the Training Speedup from Sampling with Approximate Losses” - Rudrajit Das, Xi Chen, Bertram Ieong, Parikshit Bansal and Sujay Sanghavi.
ICML 2024. Download paper here.
“Understanding Self-Distillation in the Presence of Label Noise” - Rudrajit Das and Sujay Sanghavi.
ICML 2023. Download paper here.
“On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data” - Jianyu Wang, Rudrajit Das, Gauri Joshi, Satyen Kale, Zheng Xu and Tong Zhang.
TMLR. Download paper here.
“Beyond Uniform Lipschitz Condition in Differentially Private Optimization” - Rudrajit Das, Satyen Kale, Zheng Xu, Tong Zhang and Sujay Sanghavi.
ICML 2023. Download paper here.
“Differentially Private Federated Learning with Normalized Updates” - Rudrajit Das, Abolfazl Hashemi, Sujay Sanghavi and Inderjit S. Dhillon.
Download preprint here. Short version presented at the OPT2022 workshop at NeurIPS 2022; download here.
“Faster Non-Convex Federated Learning via Global and Local Momentum” - Rudrajit Das, Anish Acharya, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon and Ufuk Topcu.
“On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization” - Abolfazl Hashemi, Anish Acharya^, Rudrajit Das^, Haris Vikalo, Sujay Sanghavi and Inderjit Dhillon (^ denotes equal contribution).
IEEE Transactions on Parallel and Distributed Systems. Download paper here and preprint here.
“On the Convergence of a Biased Version of Stochastic Gradient Descent” - Rudrajit Das, Jiong Zhang and Inderjit Dhillon.
NeurIPS 2019 Beyond First Order Methods in ML workshop. Download paper here.
“On the Separability of Classes with the Cross-Entropy Loss Function” - Rudrajit Das and Subhasis Chaudhuri.
Preprint. Download here.
“Nonlinear Blind Compressed Sensing under Signal-Dependent Noise” - Rudrajit Das and Ajit Rajwade.
IEEE International Conference on Image Processing (ICIP) 2019. Download paper here.
“Sparse Kernel PCA for Outlier Detection” - Rudrajit Das, Aditya Golatkar and Suyash Awate.
IEEE International Conference on Machine Learning and Applications (ICMLA) 2018 Oral. Download paper here.
iFood Challenge, FGVC Workshop, CVPR 2018 - Parth Kothari^, Arka Sadhu^, Aditya Golatkar^ and Rudrajit Das^ (^ denotes equal contribution).
Finished $2^{nd}$ on the public leaderboard and $3^{rd}$ on the private leaderboard (Team name: Invincibles). Leaderboard Link. Invited to present our method at CVPR 2018 (slides can be found here).
Internships
- Student Researcher at Google Research, New York City, NY (June ‘24 - August ‘24)
Host: Kyriakos Axiotis
- Worked on improving the quality of pruned large language models.
- Student Researcher at Google Research (Remote) (November ‘23 - March ‘24)
Host: Alessandro Epasto
- Theoretically showed that retraining with predicted hard labels improves model accuracy in the presence of label noise. Empirically, showed that appropriate retraining can significantly improve training with label differential privacy.
- Student Researcher at Google DeepMind, Princeton, NJ (June ‘23 - October ‘23)
Host: Naman Agarwal
- Derived new theoretical results to quantify the preconditioning effect of the Adam optimizer, and empirically benchmarked several optimization algorithms based on Adam.
- Research Intern at Google (Remote) (June ‘21 - August ‘21)
Hosts: Zheng Xu, Satyen Kale, and Tong Zhang
- Clipped gradient methods are commonly used in practice for differentially private (DP) training, e.g., DP-SGD, but a sound theoretical understanding of these methods has been elusive. Provided principled guidance on choosing the clipping threshold in DP-SGD and derived novel convergence results for DP-SGD in heavy-tailed settings.
- Applied Scientist Intern at Amazon Search, Berkeley, CA (May ‘20 - August ‘20)
Mentor: Dan Hill; Manager: Sujay Sanghavi
- Worked on customer-specific query correction by leveraging "session data" (i.e., a customer's previous searches) with state-of-the-art Transformer models. Our model generated better candidates than the production system.
- Institute for Biomechanics, ETH Zürich, Zürich, Switzerland (May ‘17 - July ‘17)
Guides: Dr. Patrik Christen and Prof. Dr. Ralph Müller, D-HEST
- Proposed a stable linear model (with a closed-form solution) and a fuzzy Boolean network for bone remodeling. Also developed an automated 2D-3D image registration framework for histology images from scratch.