“There are some things which cannot be learned quickly, and time, which is all we have, must be paid heavily for their acquiring. They are the very simplest things, and because it takes a man’s life to know them the little new that each man gets from life is very costly and the only heritage he has to leave.” - Ernest Hemingway
News #
I will be posting good news, bad news, and every other kind of news here.
Posts #
Recent Advances in Neural Network Optimization for LLM Training
The optimization landscape for LLM training looks very different from two years ago. AdamW still dominates production runs, but a wave of research is eroding that dominance from multiple angles simultaneously: matrix-aware optimizers, horizon-free schedulers, a sharply revised understanding of µP, and communication-efficient distributed methods. This post synthesizes 18 recent papers across five interconnected fronts. The unifying thread is an active re-examination of long-held assumptions, from whether gradient geometry matters, to what µP is actually doing, to whether weight decay is a regularizer at all.
The Invariant Subspace Problem
Few questions in functional analysis have attracted sustained attention across as many decades as this one. It sits at the confluence of operator theory, spectral theory, and complex analysis, and every partial result has opened new territory rather than narrowing the problem to a routine case.

Problem (Invariant Subspace Problem) Does every bounded linear operator $T$ on an infinite-dimensional separable complex Hilbert space $\mathcal{H}$ have a non-trivial closed invariant subspace?
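
To make the terms concrete (a standard remark, not taken from the post itself): a closed subspace $M \subseteq \mathcal{H}$ is a non-trivial invariant subspace of $T$ when

$$ T(M) \subseteq M, \qquad \{0\} \neq M \neq \mathcal{H}. $$

The finite-dimensional analogue is elementary. If $2 \le \dim \mathcal{H} < \infty$, the characteristic polynomial of $T$ has a complex root $\lambda$, so there is some $v \neq 0$ with

$$ Tv = \lambda v, \qquad\text{and hence}\qquad T(\mathbb{C}v) \subseteq \mathbb{C}v. $$

In infinite dimensions a bounded operator need not have any eigenvalue, so this argument is not available.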
Something Like Picard for 1-Forms
Picard’s great theorem is a statement about how wildly a holomorphic function can behave near an essential singularity. The conjecture below asks whether injectivity of local primitives of a 1-form is enough to rule out such wild behaviour at the origin, forcing the 1-form to extend meromorphically across the puncture.

Conjecture (Elsner, 2010) Let $D$ be the open unit disk and let $U_1,\dots,U_n$ be open sets with $\bigcup_{j=1}^n U_j = D\setminus\{0\}$. Suppose there are injective holomorphic functions $f_j : U_j \to \mathbb{C}$ such that
$$\mathrm{d}f_j = \mathrm{d}f_k \quad \text{on every connected component of } U_j \cap U_k.$$
Then the $\mathrm{d}f_j$ glue together to a meromorphic 1-form on $D$.
Criterion for Boundedness of Power Series
Introduction & Problem Statement #
Power series constitute one of the most ubiquitous objects in analysis. A power series $\sum_{n=0}^{\infty}a_n x^n$ with infinite radius of convergence defines a real-entire function $f:\mathbb{R}\to\mathbb{R}$. Whereas the question of convergence is completely settled by Cauchy–Hadamard theory, the question of boundedness of the sum function is far subtler and, as of this writing, remains open.

Question 1 (Rüdinger, 2009) Let $(a_n)_{n\ge 0}$ be a sequence of real numbers such that the power series $\sum_{n=0}^{\infty}a_n x^n$ converges for every $x\in\mathbb{R}$, thereby defining a smooth function $f:\mathbb{R}\to\mathbb{R}$. Give a necessary and sufficient criterion on $(a_n)$ for $f$ to be bounded on $\mathbb{R}$.
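
For orientation, here is the convergence side of the question written out, followed by a pair of textbook examples (chosen here for illustration, not taken from the post). By the Cauchy–Hadamard formula, the radius of convergence $R$ satisfies

$$ \frac{1}{R} = \limsup_{n\to\infty} |a_n|^{1/n}, \qquad\text{so}\qquad R = \infty \iff \lim_{n\to\infty} |a_n|^{1/n} = 0. $$

Boundedness cannot be read off from the sizes $|a_n|$ alone. The two series

$$ \cos x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\,x^{2k}, \qquad \cosh x = \sum_{k=0}^{\infty} \frac{1}{(2k)!}\,x^{2k} $$

have coefficients of identical absolute value, yet $\cos$ is bounded on $\mathbb{R}$ and $\cosh$ is not. Any criterion of the kind asked for must therefore be sensitive to the signs of the $a_n$, not only to their decay.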
Brezis' first open problem - An elliptic equation involving the critical exponent in 3D
Yamabe problem #
Suppose $(\mathcal{M}, g_0)$ is a closed (compact, without boundary) Riemannian manifold of dimension $N \geq 3$. Does there exist a conformal metric $g = u^{\frac{4}{N-2}}g_0$ with constant scalar curvature $R_g \equiv C$? Equivalently, find $u > 0$ on $\mathcal{M}$ such that
$$ -\frac{4(N-1)}{N-2}\Delta_{g_0}u + R_{g_0}u = Cu^{\frac{N+2}{N-2}}\qquad\text{on }\mathcal{M}. $$
Some results:
Trudinger [1968]: the case where $g_0$ has non-positive scalar curvature.
Aubin [1976]: $N \geq 6$ and $(\mathcal{M}, g_0)$ not locally conformally flat.
Schoen [1984]: the remaining cases in any dimension, assuming the Positive Mass Theorem of Schoen-Yau [1979].

A special case #
Consider the special case where $\mathcal{M}$ is a bounded domain $\Omega$ in $\mathbb{R}^{N}$:
$$ \begin{cases} -\Delta u = u^{\frac{N+2}{N-2}}\qquad\text{in }\Omega, \\ u > 0\qquad\text{in }\Omega, \\ u = 0\qquad\text{on }\partial\Omega. \end{cases} $$
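
As a sketch of why this system is the Euclidean form of the Yamabe equation (a standard computation that the excerpt does not spell out; the scaling constant $\lambda$ and the shorthand $p$ are notation introduced here): for the flat metric on $\Omega$ we have $R_{g_0} = 0$ and $\Delta_{g_0} = \Delta$, so with $p = \frac{N+2}{N-2}$ and $C > 0$ the equation becomes

$$ -\frac{4(N-1)}{N-2}\Delta u = C u^{p}. $$

Writing $u = \lambda v$ with $\lambda > 0$ and using $p - 1 = \frac{4}{N-2}$ gives

$$ -\Delta v = \frac{C(N-2)}{4(N-1)}\,\lambda^{p-1} v^{p}, \qquad \lambda = \left(\frac{4(N-1)}{C(N-2)}\right)^{\frac{N-2}{4}} \;\Longrightarrow\; -\Delta v = v^{p}. $$

The Dirichlet condition $u = 0$ on $\partial\Omega$ is an extra requirement of the special case, not a consequence of the rescaling. In dimension $N = 3$, the setting named in the title, the critical exponent is $p = \frac{3+2}{3-2} = 5$.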