- Policy Learning with Abstention
- It's Hard to Be Normal: The Impact of Noise on Structure-agnostic Estimation
- Solving Inequality Proofs with Large Language Models
- Structure-agnostic Optimality of Doubly Robust Learning for Treatment Effect Estimation
- Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
- Learning Causal Representations from General Environments: Identifiability and Intrinsic Ambiguity
- Minimax Optimal Kernel Operator Learning via Multilevel Training
- Understanding Incremental Learning of Gradient Descent -- A Fine-grained Analysis of Matrix Sensing
- Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power
- Understanding Riemannian Acceleration via a Proximal Extragradient Framework
- Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis
- Improved Analysis of Clipping Algorithms for Non-convex Optimization