<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Posts on Nam Le</title><link>https://blog.namln.org/en/posts/</link><description>Recent content in Posts on Nam Le</description><generator>Hugo</generator><language>en-US</language><lastBuildDate>Fri, 29 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.namln.org/en/posts/index.xml" rel="self" type="application/rss+xml"/><item><title>Navier–Stokes Existence and Smoothness</title><link>https://blog.namln.org/en/posts/navier-stokes-existence-smoothness/</link><pubDate>Fri, 29 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/navier-stokes-existence-smoothness/</guid><description>&lt;p&gt;The motion of a viscous incompressible fluid is described by the Navier–Stokes
equations, first written down by Claude-Louis Navier in 1822 and given their modern
form by George Gabriel Stokes. Whether smooth solutions to these equations can
always be continued for all time (or whether they can spontaneously develop a
singularity at some finite time) is one of the deepest open problems in mathematics,
and one of the seven &lt;a href="https://www.claymath.org/millennium-problems/"&gt;Clay Millennium Prize Problems&lt;/a&gt;,
carrying a one-million-dollar prize for a solution.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Problem (Clay Millennium Prize, Fefferman 2000)&lt;/span&gt;
&lt;p&gt;Let $u_0 : \mathbb{R}^3 \to \mathbb{R}^3$ be a smooth divergence-free vector field.
Does there exist a smooth solution $u(x,t)$, $p(x,t)$ to the 3D incompressible
Navier–Stokes equations
$$\partial_t u + (u \cdot \nabla)u - \nu\Delta u + \nabla p = 0, \qquad \nabla \cdot u = 0,
\qquad u(\cdot,0) = u_0$$
defined for all $t &amp;gt; 0$ and satisfying $\int_{\mathbb{R}^3}|u(x,t)|^2\,dx &amp;lt; C$
for all $t \geq 0$? Either a proof of existence or a counterexample (a smooth $u_0$ for which no
such smooth solution exists) qualifies for the prize.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-equations-and-their-scaling"&gt;
 The Equations and Their Scaling&lt;span class="heading__anchor"&gt; &lt;a href="#the-equations-and-their-scaling"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Compared to the Euler equations (which describe inviscid flow), the Navier–Stokes
equations add the viscous term $\nu\Delta u$, where $\nu &amp;gt; 0$ is the kinematic
viscosity. This term dissipates energy and regularises the flow locally. The central
tension is that the nonlinear term $(u\cdot\nabla)u$ can concentrate energy at
small spatial scales faster than viscosity can diffuse it away.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scaling symmetry.&lt;/strong&gt; The Navier–Stokes equations are invariant under the rescaling
$$u(x,t) \mapsto \lambda u(\lambda x,\, \lambda^2 t), \qquad
p(x,t) \mapsto \lambda^2 p(\lambda x,\, \lambda^2 t).$$
A norm is &lt;em&gt;critical&lt;/em&gt; (or &lt;em&gt;scale-invariant&lt;/em&gt;) if it is preserved by this rescaling.
The critical norm in $L^p(\mathbb{R}^3)$ is $L^3$, since
$\|\lambda u(\lambda\,\cdot)\|_{L^3} = \|u\|_{L^3}$.
The energy norm $\|u\|_{L^2}$ is &lt;em&gt;supercritical&lt;/em&gt;: the rescaled field has norm $\lambda^{-1/2}\|u\|_{L^2}$,
which shrinks as $\lambda \to \infty$ (i.e., as one zooms into small
scales). This mismatch is the core of the difficulty: fine-scale, high-amplitude copies of a solution cost very little energy, so global energy control does
not prevent concentration at arbitrarily small scales.&lt;/p&gt;
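&lt;p&gt;As a quick check of these exponents (a sketch, writing $u_\lambda(x) = \lambda u(\lambda x)$ for the rescaled velocity at a fixed time, a notation introduced only for this computation), the substitution $y = \lambda x$ gives, for $1 \leq p &amp;lt; \infty$,
$$\|u_\lambda\|_{L^p(\mathbb{R}^3)}^p = \lambda^p \int_{\mathbb{R}^3} |u(\lambda x)|^p\,dx = \lambda^{p-3} \int_{\mathbb{R}^3} |u(y)|^p\,dy,
\qquad\text{so}\qquad \|u_\lambda\|_{L^p} = \lambda^{1-3/p}\,\|u\|_{L^p}.$$
The exponent $1 - 3/p$ vanishes exactly at $p = 3$ and equals $-1/2$ at $p = 2$.&lt;/p&gt;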
&lt;p&gt;&lt;strong&gt;2D global regularity.&lt;/strong&gt; In two dimensions the scaling is different: the energy
$\|u\|_{L^2}^2$ is itself scale-invariant, and the enstrophy $\|\nabla u\|_{L^2}^2$ is non-increasing in time because vortex stretching is absent. Global
regularity in 2D follows from these estimates, a fact known since the 1960s.
In 3D no analogous critical quantity is controlled globally, and the problem is open.&lt;/p&gt;
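&lt;p&gt;The 2D enstrophy estimate can be sketched formally as follows, assuming a smooth, decaying solution and writing $\omega = \partial_1 u_2 - \partial_2 u_1$ for the scalar vorticity (a symbol not used elsewhere in this post). Taking the curl of the momentum equation in 2D gives a transport–diffusion equation with no stretching term,
$$\partial_t \omega + (u\cdot\nabla)\omega = \nu\Delta\omega,$$
and multiplying by $\omega$ and integrating by parts, using $\nabla\cdot u = 0$, yields
$$\dfrac{1}{2}\dfrac{d}{dt}\|\omega\|_{L^2}^2 = -\nu\|\nabla\omega\|_{L^2}^2 \leq 0.$$
Since $\|\omega\|_{L^2} = \|\nabla u\|_{L^2}$ for divergence-free fields, the enstrophy cannot grow. In 3D the vorticity equation has the additional term $(\omega\cdot\nabla)u$, and this argument breaks down.&lt;/p&gt;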
&lt;hr&gt;
&lt;h2 class="heading" id="the-hierarchy-of-known-results"&gt;
 The Hierarchy of Known Results&lt;span class="heading__anchor"&gt; &lt;a href="#the-hierarchy-of-known-results"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="lerayhopf-weak-solutions-1934"&gt;
 Leray–Hopf Weak Solutions (1934)&lt;span class="heading__anchor"&gt; &lt;a href="#lerayhopf-weak-solutions-1934"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Leray 1934, Hopf 1951)&lt;/span&gt;
&lt;p&gt;For any $u_0 \in L^2(\mathbb{R}^3)$ divergence-free, there exists a global
&lt;em&gt;weak solution&lt;/em&gt; $u \in L^\infty(0,\infty;\, L^2) \cap L^2(0,\infty;\, H^1)$
satisfying the energy inequality
$$\|u(t)\|_{L^2}^2 + 2\nu\int_0^t \|\nabla u\|_{L^2}^2\, ds \leq \|u_0\|_{L^2}^2.$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Leray&amp;rsquo;s construction, via a compactness argument on regularised equations, produces
a solution that is globally defined but potentially not smooth. The term &amp;ldquo;weak&amp;rdquo;
refers to the fact that the equations are satisfied only in an integral (distributional)
sense, not pointwise. The energy inequality is the only bound available globally.
A Leray–Hopf solution coincides with the smooth solution for as long as the latter exists (weak–strong uniqueness), but whether Leray–Hopf solutions are unique in general, and whether they stay
smooth for all time when the initial data is smooth, is unknown.&lt;/p&gt;
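&lt;p&gt;For orientation, here is the formal computation behind the energy inequality, a sketch valid for smooth solutions with enough decay at infinity (assumptions that weak solutions need not satisfy). Taking the $L^2$ inner product of the momentum equation with $u$, the pressure and convective terms vanish because $\nabla\cdot u = 0$:
$$\int_{\mathbb{R}^3} \nabla p\cdot u\,dx = -\int_{\mathbb{R}^3} p\,(\nabla\cdot u)\,dx = 0, \qquad
\int_{\mathbb{R}^3} (u\cdot\nabla)u\cdot u\,dx = -\dfrac{1}{2}\int_{\mathbb{R}^3} (\nabla\cdot u)\,|u|^2\,dx = 0.$$
What remains is
$$\dfrac{1}{2}\dfrac{d}{dt}\|u(t)\|_{L^2}^2 + \nu\|\nabla u(t)\|_{L^2}^2 = 0,$$
and integrating in time gives the energy inequality with equality. For weak solutions obtained as limits of regularised problems, only the inequality is known to survive the limit.&lt;/p&gt;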
&lt;h3 class="heading" id="partial-regularity-the-ckn-theorem"&gt;
 Partial Regularity: The CKN Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#partial-regularity-the-ckn-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The best known result limiting the size of potential singularities is the following.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Caffarelli–Kohn–Nirenberg, 1982)&lt;/span&gt;
&lt;p&gt;For any &lt;em&gt;suitable weak solution&lt;/em&gt; to the 3D Navier–Stokes equations, the set of
space-time singular points has &lt;em&gt;parabolic Hausdorff dimension at most 1&lt;/em&gt;. In
particular, at any given time the spatial singular set has Hausdorff dimension
at most $1$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;A &amp;ldquo;suitable weak solution&amp;rdquo; is a weak solution satisfying a local energy inequality.
The CKN theorem shows that singularities, if they exist, cannot fill a curve in space-time, let alone a
surface: the singular set has parabolic dimension at most one. This is the
best partial regularity result available, and its proof was simplified by Lin
(1998). Scheffer (1977) had earlier shown that the set of singular times has Hausdorff dimension
at most $\dfrac{1}{2}$.&lt;/p&gt;
&lt;h3 class="heading" id="conditional-regularity-ladyzhenskayaprodiserrin"&gt;
 Conditional Regularity: Ladyzhenskaya–Prodi–Serrin&lt;span class="heading__anchor"&gt; &lt;a href="#conditional-regularity-ladyzhenskayaprodiserrin"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Ladyzhenskaya 1967, Prodi 1959, Serrin 1962)&lt;/span&gt;
&lt;p&gt;If a weak solution additionally satisfies $u \in L^r(0,T;\, L^s(\mathbb{R}^3))$
with $\dfrac{2}{r} + \dfrac{3}{s} = 1$ and $3 &amp;lt; s \leq \infty$, then $u$ is
smooth on $(0,T]$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The condition $\dfrac{2}{r} + \dfrac{3}{s} = 1$ is precisely the scale-invariant
line in the $(r,s)$ plane: membership in any of these spaces implies regularity.
The family runs from $(r,s)=(2,\infty)$ (square-integrable $L^\infty$ control in
time) towards the endpoint $(r,s)=(\infty, 3)$ (critical $L^3$ control in space,
uniform in time), which the condition $s &amp;gt; 3$ excludes. These are &lt;em&gt;conditional&lt;/em&gt; results: they do not prove that a weak solution
lies in such a space, only that if it does, it must be smooth.&lt;/p&gt;
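&lt;p&gt;The scale invariance of this line can be checked directly. As a sketch, write $u_\lambda(x,t) = \lambda u(\lambda x, \lambda^2 t)$ (notation introduced here for illustration); the changes of variables $y = \lambda x$ and $\tau = \lambda^2 t$ give, for finite $r$ and $s$,
$$\|u_\lambda\|_{L^r(0,\,T/\lambda^2;\, L^s(\mathbb{R}^3))} = \lambda^{\,1 - \frac{2}{r} - \frac{3}{s}}\, \|u\|_{L^r(0,T;\, L^s(\mathbb{R}^3))},$$
so the norm is unchanged under rescaling exactly when $\dfrac{2}{r} + \dfrac{3}{s} = 1$. For example, $(r,s) = (4,6)$ and $(r,s) = (8,4)$ both lie on the line.&lt;/p&gt;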
&lt;h3 class="heading" id="the-critical-endpoint-escauriazasereginšverák"&gt;
 The Critical Endpoint: Escauriaza–Seregin–Šverák&lt;span class="heading__anchor"&gt; &lt;a href="#the-critical-endpoint-escauriazaseregin%c5%a1ver%c3%a1k"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Escauriaza–Seregin–Šverák, 2003)&lt;/span&gt;
&lt;p&gt;If $u$ is a Leray–Hopf weak solution with $\sup_{t \in [0,T^*)} \|u(\cdot,t)\|_{L^3(\mathbb{R}^3)} &amp;lt; \infty$,
then $u$ can be extended as a smooth solution past $T^*$.&lt;/p&gt;
&lt;/div&gt;
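&lt;p&gt;The endpoint case $s=3$ of the LPS family is the borderline one: $L^3(\mathbb{R}^3)$
is exactly the scale-invariant Lebesgue norm for Navier–Stokes, and since the time exponent is $r=\infty$, no smallness can be gained by shrinking the time interval. The ESS proof is substantially
harder than the non-endpoint cases; it rescales around a hypothetical singularity, uses a compactness argument to extract a
limiting solution that vanishes at the final time, and then invokes backward uniqueness and unique continuation
theorems for parabolic equations to show that the limit is trivial, which contradicts the presence of a singularity.&lt;/p&gt;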
&lt;h3 class="heading" id="taos-quantitative-criterion"&gt;
 Tao&amp;rsquo;s Quantitative Criterion&lt;span class="heading__anchor"&gt; &lt;a href="#taos-quantitative-criterion"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Tao, 2019)&lt;/span&gt;
&lt;p&gt;If a smooth finite-energy solution first becomes singular at time $T^*$, then
$$\limsup_{t \uparrow T^*}
\dfrac{\|u(\cdot,t)\|_{L^3(\mathbb{R}^3)}}{\bigl(\log\log\log\tfrac{1}{T^*-t}\bigr)^c}
= \infty$$
for some absolute constant $c&amp;gt;0$. In particular, the critical $L^3$ norm must blow
up at least as fast as a triple-logarithm in $(T^*-t)^{-1}$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Tao&amp;rsquo;s result is a quantitative, slightly &lt;em&gt;supercritical&lt;/em&gt; regularity criterion for Navier–Stokes:
it gives information about the blowup rate that goes (by a triple
logarithm) beyond what scaling alone can detect. The proof quantifies the
compactness arguments in the ESS proof, replacing each use of a compactness method
by an explicit Carleman inequality, and propagates lower bounds for the vorticity
across dyadic annuli. Tao&amp;rsquo;s bounds depend triple-exponentially on the $L^3$ norm, which is the source of the triple logarithm; they have since
been localised and sharpened by Barker–Prange (2021) and others.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-supercriticality-problem"&gt;
 The Supercriticality Problem&lt;span class="heading__anchor"&gt; &lt;a href="#the-supercriticality-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The fundamental analytical obstruction is that Navier–Stokes is &lt;em&gt;supercritical&lt;/em&gt;
with respect to the only globally controlled quantity, the energy (the $L^2$ norm).&lt;/p&gt;
&lt;p&gt;Define the &lt;em&gt;critical regularity index&lt;/em&gt; as the Sobolev exponent $s$ such that
$\dot{H}^s(\mathbb{R}^3)$ is scale-invariant. For Navier–Stokes, $s = 1/2$. The
energy controls $\dot{H}^0 = L^2$ (supercritical), and regularity theory requires
control at the critical level: $\dot{H}^{1/2}$ (critical Sobolev norm) or $L^3$ (critical Lebesgue norm).
There is a &lt;em&gt;regularity gap&lt;/em&gt; between what is globally available ($L^2$) and what
is needed ($L^3$ or $\dot{H}^{1/2}$). Every known approach to closing this gap runs
into the same obstruction: the nonlinearity can create structure at arbitrarily
small scales that the supercritical $L^2$ bound cannot see.&lt;/p&gt;
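&lt;p&gt;The value $s = 1/2$ can be verified on the Fourier side. As a sketch, with $u_\lambda(x) = \lambda u(\lambda x)$ (notation introduced for this computation) one has $\widehat{u_\lambda}(\xi) = \lambda^{-2}\,\hat{u}(\xi/\lambda)$, and therefore
$$\|u_\lambda\|_{\dot{H}^s}^2 = \int_{\mathbb{R}^3} |\xi|^{2s}\,\lambda^{-4}\,|\hat{u}(\xi/\lambda)|^2\,d\xi = \lambda^{2s-1}\int_{\mathbb{R}^3} |\eta|^{2s}\,|\hat{u}(\eta)|^2\,d\eta = \lambda^{2s-1}\,\|u\|_{\dot{H}^s}^2.$$
The exponent vanishes only for $s = 1/2$; at $s = 0$ it equals $-1$, and at $s = 1$ it equals $+1$.&lt;/p&gt;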
&lt;p&gt;Tao (2016) made this gap precise by constructing an &lt;em&gt;averaged&lt;/em&gt; Navier–Stokes system, in which the bilinear nonlinearity $(u\cdot\nabla)u$ is replaced by a carefully designed convex average of related nonlinearities that still obeys the energy identity, and for which finite-time blowup
can be rigorously proved. This construction does not produce a counterexample to the true Navier–Stokes equations, but it demonstrates that the specific algebraic structure of the nonlinearity is load-bearing: any proof of global regularity must use something specific about $(u\cdot\nabla)u$ that is not shared by its averages.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-improving-the-quantitative-blowup-rate"&gt;
 1. Improving the Quantitative Blowup Rate&lt;span class="heading__anchor"&gt; &lt;a href="#1-improving-the-quantitative-blowup-rate"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Tao&amp;rsquo;s triple-logarithmic rate is the sharpest known lower bound on blowup of the
critical $L^3$ norm. Scaling considerations suggest that the true rate, if blowup
occurs, should be much faster; conjecturally $\|u\|_{L^3} \sim (T^*-t)^{-\delta}$
for some $\delta &amp;gt; 0$, analogous to Type I blowup in nonlinear heat equations. The
gap between the triple-logarithmic lower bound and the conjectured power-law rate
represents the frontier of quantitative regularity theory. Closing even part of this
gap, for instance establishing a single-logarithmic or power-of-log lower bound,
would require new ideas beyond Carleman estimates.&lt;/p&gt;
&lt;h3 class="heading" id="2-type-i-vs-type-ii-blowup"&gt;
 2. Type I vs. Type II Blowup&lt;span class="heading__anchor"&gt; &lt;a href="#2-type-i-vs-type-ii-blowup"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A blowup is called &lt;em&gt;Type I&lt;/em&gt; if the sup norm $\|u(\cdot,t)\|_{L^\infty}$
grows no faster than $C(T^*-t)^{-1/2}$ near $T^*$, the rate dictated by the scaling symmetry. It is &lt;em&gt;Type II&lt;/em&gt; otherwise.
For the Navier–Stokes equations, ruling out Type I blowup would be a significant
advance: all self-similar singularities (where $u(x,t) = (T^*-t)^{-1/2}U(x/(T^*-t)^{1/2})$)
are of Type I, and several results (including work of Růžička and Seregin) already
rule them out under mild additional assumptions. Whether all Type I blowup can be
excluded, leaving only the less structured Type II, is open.&lt;/p&gt;
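&lt;p&gt;To see why self-similar singularities are of Type I, here is a short check, assuming a profile $U \in L^3 \cap L^\infty$ (an assumption made for this illustration). For $u(x,t) = (T^*-t)^{-1/2}U(x/(T^*-t)^{1/2})$,
$$\|u(\cdot,t)\|_{L^\infty} = (T^*-t)^{-1/2}\,\|U\|_{L^\infty}, \qquad
\|u(\cdot,t)\|_{L^3}^3 = (T^*-t)^{-3/2}\int_{\mathbb{R}^3} |U(x/(T^*-t)^{1/2})|^3\,dx = \|U\|_{L^3}^3.$$
The sup norm grows at exactly the scaling rate $(T^*-t)^{-1/2}$, while the critical $L^3$ norm stays constant in time.&lt;/p&gt;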
&lt;h3 class="heading" id="3-uniqueness-of-weak-solutions"&gt;
 3. Uniqueness of Weak Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#3-uniqueness-of-weak-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Leray–Hopf weak solutions exist globally, but they may not be unique. This is a
separate, equally deep question: even if all smooth solutions extend globally,
uniqueness could still fail for rough initial data in $L^2$, where no smooth solution is
available for comparison. Recent work of Buckmaster and Vicol (2019) showed that weak solutions
below the Ladyzhenskaya–Prodi–Serrin threshold are indeed non-unique, using
convex integration techniques developed for the Euler equations (De Lellis–Székelyhidi).
Whether Leray–Hopf solutions with the energy inequality are unique is still open
and is perhaps the central problem in the weak solution theory.&lt;/p&gt;
&lt;h3 class="heading" id="4-self-similar-and-discretely-self-similar-solutions"&gt;
 4. Self-Similar and Discretely Self-Similar Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#4-self-similar-and-discretely-self-similar-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Self-similar solutions of the form $u(x,t) = (T^*-t)^{-1/2} U(x/(T^*-t)^{1/2})$
satisfy a nonlinear elliptic system for the profile $U$. Several non-existence
theorems show that backward self-similar solutions with certain integrability must
be trivial (Nečas–Růžička–Šverák, 1996). The case of &lt;em&gt;discretely&lt;/em&gt; self-similar
solutions, where $u(x,t) = \lambda u(\lambda x, \lambda^2 t)$ for a fixed
$\lambda \neq 1$, is less understood and was recently revisited. Whether the
set of self-similar profiles that could appear as blowup limits is empty is not known.&lt;/p&gt;
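&lt;p&gt;For concreteness, the profile system can be derived as follows (a sketch, with the similarity variable $y = x/(T^*-t)^{1/2}$ and a pressure ansatz $p(x,t) = (T^*-t)^{-1}P(y)$, both introduced here for illustration). Each term of the momentum equation carries the common factor $(T^*-t)^{-3/2}$; for instance
$$\partial_t u = (T^*-t)^{-3/2}\Bigl[\tfrac{1}{2}U + \tfrac{1}{2}(y\cdot\nabla)U\Bigr].$$
Cancelling that factor leaves the stationary system
$$-\nu\Delta U + \tfrac{1}{2}U + \tfrac{1}{2}(y\cdot\nabla)U + (U\cdot\nabla)U + \nabla P = 0, \qquad \nabla\cdot U = 0,$$
in which all derivatives are taken with respect to $y$.&lt;/p&gt;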
&lt;h3 class="heading" id="5-computer-assisted-proofs-via-rigorous-numerics"&gt;
 5. Computer-Assisted Proofs via Rigorous Numerics&lt;span class="heading__anchor"&gt; &lt;a href="#5-computer-assisted-proofs-via-rigorous-numerics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Chen–Hou approach to Euler singularities (2025) used a computer-assisted proof
framework: construct a numerical approximate profile, then verify its stability
rigorously using interval arithmetic. For Navier–Stokes the presence of viscosity
complicates such an approach (the profile is dissipated rather than transported),
but the same framework (dynamical rescaling plus nonlinear stability verification)
might in principle detect or rule out singularities in specific axisymmetric
geometries. Applying and adapting the Hou group&amp;rsquo;s methods to the viscous problem
is an active direction.&lt;/p&gt;
&lt;h3 class="heading" id="6-the-zero-viscosity-limit-and-eulernavierstokes-connection"&gt;
 6. The Zero-Viscosity Limit and Euler–Navier–Stokes Connection&lt;span class="heading__anchor"&gt; &lt;a href="#6-the-zero-viscosity-limit-and-eulernavierstokes-connection"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;As $\nu \to 0$, Navier–Stokes formally converges to Euler. The precise relationship
is subtle: in the presence of boundaries (Prandtl layers) or after a potential Euler
singularity, the zero-viscosity limit can fail to hold in strong norms. If Euler
develops a finite-time singularity at time $T^*_E$ from smooth data (as Chen–Hou
suggest for bounded domains), then for small $\nu$ the Navier–Stokes solution must
either also develop a near-singularity or be regularised by viscosity before $T^*_E$.
Whether viscosity is always sufficient to regularise an Euler singularity, or whether
a Navier–Stokes singularity can arise from a nearby Euler one, is entirely open.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Fefferman, C. L. (2000). Existence and smoothness of the Navier–Stokes equation. Clay Mathematics Institute Millennium Prize Problems. &lt;a href="https://www.claymath.org/wp-content/uploads/2022/06/navierstokes.pdf"&gt;https://www.claymath.org/wp-content/uploads/2022/06/navierstokes.pdf&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Leray, J. (1934). Sur le mouvement d&amp;rsquo;un liquide visqueux emplissant l&amp;rsquo;espace. &lt;em&gt;Acta Mathematica&lt;/em&gt;, &lt;strong&gt;63&lt;/strong&gt;, 193–248.&lt;/li&gt;
&lt;li&gt;Hopf, E. (1951). Über die Anfangswertaufgabe für die hydrodynamischen Grundgleichungen. &lt;em&gt;Mathematische Nachrichten&lt;/em&gt;, &lt;strong&gt;4&lt;/strong&gt;(1–6), 213–231.&lt;/li&gt;
&lt;li&gt;Caffarelli, L., Kohn, R., &amp;amp; Nirenberg, L. (1982). Partial regularity of suitable weak solutions of the Navier–Stokes equations. &lt;em&gt;Communications on Pure and Applied Mathematics&lt;/em&gt;, &lt;strong&gt;35&lt;/strong&gt;(6), 771–831.&lt;/li&gt;
&lt;li&gt;Ladyzhenskaya, O. A. (1967). On uniqueness and smoothness of generalized solutions to the Navier–Stokes equations. &lt;em&gt;Zapiski Nauchnykh Seminarov LOMI&lt;/em&gt;, &lt;strong&gt;5&lt;/strong&gt;, 169–185.&lt;/li&gt;
&lt;li&gt;Escauriaza, L., Seregin, G. A., &amp;amp; Šverák, V. (2003). $L_{3,\infty}$-solutions of the Navier–Stokes equations and backward uniqueness. &lt;em&gt;Russian Mathematical Surveys&lt;/em&gt;, &lt;strong&gt;58&lt;/strong&gt;(2), 211–250.&lt;/li&gt;
&lt;li&gt;Tao, T. (2019). Quantitative bounds for critically bounded solutions to the Navier–Stokes equations. arXiv:1908.04958. Published in &lt;em&gt;Nine Mathematical Challenges&lt;/em&gt;, AMS, 2021, pp. 149–193.&lt;/li&gt;
&lt;li&gt;Tao, T. (2016). Finite time blowup for an averaged three-dimensional Navier–Stokes equation. &lt;em&gt;Journal of the American Mathematical Society&lt;/em&gt;, &lt;strong&gt;29&lt;/strong&gt;(3), 601–674.&lt;/li&gt;
&lt;li&gt;Buckmaster, T. &amp;amp; Vicol, V. (2019). Nonuniqueness of weak solutions to the Navier–Stokes equation. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;189&lt;/strong&gt;(1), 101–144.&lt;/li&gt;
&lt;li&gt;Barker, T. &amp;amp; Prange, C. (2021). Localized quantitative estimates and potential blow-up rates for the Navier–Stokes equations. &lt;em&gt;Communications in Mathematical Physics&lt;/em&gt;, &lt;strong&gt;385&lt;/strong&gt;, 717–792.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Navier–Stokes Regularity: The Uniqueness of Weak Solutions</title><link>https://blog.namln.org/en/posts/navier-stokes-weak-uniqueness/</link><pubDate>Fri, 29 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/navier-stokes-weak-uniqueness/</guid><description>&lt;p&gt;The &lt;a href="../navier-stokes-existence-smoothness"&gt;companion post on Navier–Stokes existence and smoothness&lt;/a&gt;
asked whether smooth solutions can break down in finite time. This post asks the
opposite question: when a solution is only weakly defined, satisfying the equations
in an integral sense rather than pointwise, is it uniquely determined by its initial
data? The answer, developed over the last two decades through a dramatic series of
results, is a resounding &lt;em&gt;no&lt;/em&gt; in many regimes. The frontier is now whether the
physically natural class of Leray–Hopf weak solutions retains uniqueness.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Question (Weak Uniqueness)&lt;/span&gt;
&lt;p&gt;Are Leray–Hopf weak solutions of the 3D incompressible Navier–Stokes equations
$$\partial_t u + (u\cdot\nabla)u - \nu\Delta u + \nabla p = 0, \qquad \nabla\cdot u = 0$$
uniquely determined by their initial data $u_0 \in L^2(\mathbb{R}^3)$?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The question is one of the most urgent open problems in the PDE theory of fluid
dynamics. It is logically independent of the blowup question: Leray–Hopf solutions
exist globally for all time regardless of whether smooth solutions break down. What
is not known is whether two Leray–Hopf solutions started from the same data must
coincide.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="nashs-h-principle-the-conceptual-ancestor"&gt;
 Nash&amp;rsquo;s h-Principle: The Conceptual Ancestor&lt;span class="heading__anchor"&gt; &lt;a href="#nashs-h-principle-the-conceptual-ancestor"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The story begins not in fluid mechanics but in differential geometry. In 1954,
John Nash proved that any Riemannian manifold admits a $C^1$ isometric embedding
into a Euclidean space of surprisingly low dimension, a result that contradicted the expectation, based on the rigid
behaviour of $C^2$ embeddings (Cauchy), that the metric should impose strong
constraints. The key insight is that $C^1$ embeddings are &lt;em&gt;flexible&lt;/em&gt;: one can
deform them by adding high-frequency oscillations that are invisible at the large
scale but locally produce any prescribed metric tensor.&lt;/p&gt;
&lt;p&gt;Gromov formulated this phenomenon as the &lt;em&gt;h-principle&lt;/em&gt;: for certain underdetermined
differential relations, the topological (homotopy-theoretic) obstructions are the
only ones, and any formal solution can be deformed into an actual solution. The
h-principle is a flexibility result: it says geometry is surprisingly unconstrained
below a critical regularity threshold.&lt;/p&gt;
&lt;p&gt;De Lellis and Székelyhidi recognised in the mid-2000s that the incompressible Euler
equations are formally analogous to Nash&amp;rsquo;s embedding problem. The Euler system is
underdetermined (more unknowns than equations), and one can attempt to construct
wild solutions by adding high-frequency oscillations. The crucial observation is that
the nonlinearity $u\otimes u$ in the Reynolds stress tensor plays the role of the
metric tensor in Nash&amp;rsquo;s problem.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="wild-euler-solutions"&gt;
 Wild Euler Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#wild-euler-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The first step was to show that the Euler equations possess infinitely many weak
solutions for given initial data.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (De Lellis–Székelyhidi, 2009–2013)&lt;/span&gt;
&lt;p&gt;For any divergence-free $u_0 \in L^2(\mathbb{T}^3)$ and any prescribed energy
profile $e(t) \in C^\infty([0,T])$ with $e(t) &amp;gt; \|u_0\|_{L^2}^2$ for all $t &amp;gt; 0$,
there exist infinitely many weak solutions $u \in C_t^0 L_x^2$ of the 3D Euler
equations with $u(\cdot,0) = u_0$ and $\|u(\cdot,t)\|_{L^2}^2 = e(t)$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In particular, the Euler equations admit weak solutions that spontaneously gain or
lose kinetic energy without any external forcing: &lt;em&gt;wild solutions&lt;/em&gt;. The construction proceeds by
convex integration: one builds the solution iteratively, at each stage adding a
high-frequency perturbation (a &lt;em&gt;Beltrami wave&lt;/em&gt;) that corrects the error in the
momentum equation while remaining small in weak norms.&lt;/p&gt;
&lt;p&gt;Earlier, Scheffer (1993) and Shnirelman (1997) had shown the existence of weak Euler
solutions with compact support in space-time (the fluid is at rest, then spontaneously
moves, then returns to rest), but their constructions were indirect. De Lellis and
Székelyhidi&amp;rsquo;s convex integration scheme gave the first systematic and quantitative
approach.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="onsagers-conjecture"&gt;
 Onsager&amp;rsquo;s Conjecture&lt;span class="heading__anchor"&gt; &lt;a href="#onsagers-conjecture"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The De Lellis–Székelyhidi results raise an immediate question: at what regularity
does the fluid behaviour transition from flexible (wild, non-unique) to rigid
(energy-conserving, unique)? This is precisely what Lars Onsager conjectured in 1949.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Onsager's Conjecture (1949)&lt;/span&gt;
&lt;p&gt;For the 3D incompressible Euler equations, the threshold regularity for energy
conservation is the Hölder exponent $1/3$:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If $u \in C^{0,\alpha}$ with $\alpha &amp;gt; 1/3$, then every weak solution conserves
kinetic energy.&lt;/li&gt;
&lt;li&gt;For every $\alpha &amp;lt; 1/3$, there exist weak solutions in $C^{0,\alpha}$ that
dissipate energy.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;The &lt;strong&gt;positive direction&lt;/strong&gt; (conservation above $1/3$) was proved by
Constantin–E–Titi (1994). The &lt;strong&gt;negative direction&lt;/strong&gt; (dissipation possible below
$1/3$) required much more work and was fully resolved only recently.&lt;/p&gt;
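&lt;p&gt;The exponent $1/3$ can be motivated by a standard heuristic (a dimensional sketch, not a proof; the symbols $\ell$, $\delta u$ and $\Pi_\ell$ are introduced here for illustration). If $u \in C^{0,\alpha}$, velocity increments across a distance $\ell$ satisfy $|\delta u(\ell)| \lesssim \ell^{\alpha}$, and the energy flux through scale $\ell$ is cubic in the increments and carries one derivative:
$$\Pi_\ell \sim \dfrac{|\delta u(\ell)|^3}{\ell} \lesssim \ell^{\,3\alpha - 1}.$$
As $\ell \to 0$ this bound tends to zero precisely when $\alpha &amp;gt; 1/3$, so no energy can escape to infinitely small scales above that threshold, while for $\alpha \leq 1/3$ the bound does not exclude a nonzero flux.&lt;/p&gt;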
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Isett, 2018)&lt;/span&gt;
&lt;p&gt;For every $\alpha &amp;lt; 1/3$ there exist weak solutions $u \in C^{0,\alpha}(\mathbb{T}^3\times[0,T])$
of the 3D Euler equations that fail to conserve kinetic energy.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Isett&amp;rsquo;s proof, published in the &lt;em&gt;Annals of Mathematics&lt;/em&gt; in 2018, was the culmination
of a decade of refinements of the De Lellis–Székelyhidi scheme. The key difficulty in approaching
regularity $1/3$ is that the high-frequency perturbations must be sized to
cancel the Reynolds stress error while staying in $C^{1/3-}$; Isett achieved this by combining
Mikado flows (introduced by Daneri and Székelyhidi) with a gluing technique that localises the error in time. Buckmaster, De Lellis,
Székelyhidi, and Vicol also obtained solutions attaining any prescribed
energy profile in $C^{1/3-}$. Onsager&amp;rsquo;s conjecture is now a theorem.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="viscous-non-uniqueness-buckmastervicol"&gt;
 Viscous Non-Uniqueness: Buckmaster–Vicol&lt;span class="heading__anchor"&gt; &lt;a href="#viscous-non-uniqueness-buckmastervicol"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Adapting the convex integration scheme from Euler to Navier–Stokes requires overcoming
the viscous term $\nu\Delta u$, which smooths out high-frequency oscillations. Buckmaster and Vicol
introduced &lt;em&gt;intermittent&lt;/em&gt; Beltrami waves, which concentrate energy on sparse spatial sets,
reducing their interaction with the Laplacian, and used them
to bring convex integration into the viscous setting.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Buckmaster–Vicol, 2019)&lt;/span&gt;
&lt;p&gt;There exist infinitely many weak solutions $u \in C_t^0 L_x^2(\mathbb{T}^3)$ of the
3D Navier–Stokes equations with finite kinetic energy (a class weaker than that of Leray–Hopf
solutions, which also lie in $L^2_t H^1_x$) that do not satisfy the global energy inequality. In particular, weak
solutions of 3D Navier–Stokes are not unique in the class $C_t^0 L_x^2$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The Buckmaster–Vicol solutions, published in the &lt;em&gt;Annals of Mathematics&lt;/em&gt; &lt;strong&gt;189&lt;/strong&gt;
(2019), 101–144, are weak in both the PDE sense and the energy sense: they satisfy
the equations distributionally and have finite kinetic energy, but they can gain
energy spontaneously, violating the natural dissipation law
$\partial_t\|u\|_{L^2}^2 \leq -2\nu\|\nabla u\|_{L^2}^2$.&lt;/p&gt;
&lt;p&gt;This non-uniqueness is striking but also limited: the Buckmaster–Vicol solutions
are not Leray–Hopf solutions, because Leray–Hopf solutions are required to satisfy
the &lt;em&gt;energy inequality&lt;/em&gt;, which in particular gives $\|u(t)\|_{L^2}^2 \leq \|u_0\|_{L^2}^2$. Whether this
single additional constraint, that energy does not increase, suffices to restore
uniqueness is the open question.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="crossing-the-energy-barrier-albrittonbruécolombo"&gt;
 Crossing the Energy Barrier: Albritton–Brué–Colombo&lt;span class="heading__anchor"&gt; &lt;a href="#crossing-the-energy-barrier-albrittonbru%c3%a9colombo"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The energy inequality distinguishing Leray–Hopf solutions from Buckmaster–Vicol wild
solutions seemed for a long time to be a genuine barrier to non-uniqueness. The
following result crossed this barrier, but required introducing an external force.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Albritton–Brué–Colombo, 2022)&lt;/span&gt;
&lt;p&gt;There exist a body force $f \in L^1(0,T;\, L^2(\mathbb{R}^3))$ and two distinct
Leray–Hopf weak solutions of the &lt;strong&gt;forced&lt;/strong&gt; 3D Navier–Stokes equations
$\partial_t u + (u\cdot\nabla)u - \nu\Delta u + \nabla p = f$ with the same initial
data $u_0 \equiv 0$ and the same force $f$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Published in the &lt;em&gt;Annals of Mathematics&lt;/em&gt; &lt;strong&gt;196&lt;/strong&gt; (2022), 415–455, the proof uses a
completely different mechanism from convex integration. The key ingredient is an
&lt;em&gt;unstable&lt;/em&gt; background solution: using Vishik&amp;rsquo;s construction of spectrally unstable
steady states of the 2D Euler equations, Albritton–Brué–Colombo lift an unstable 2D
vortex to an axisymmetric 3D vortex ring and embed it into the Navier–Stokes flow
via a self-similar change of variables. The force $f$ is chosen precisely to make
this background exactly solve the forced equations; the instability then allows two
different solutions to branch from the same initial data.&lt;/p&gt;
&lt;p&gt;The force is singular: it belongs to $L^1_t L^2_x$ but is not smooth, and it is
concentrated near the initial time $t=0$. Whether the same non-uniqueness can be
achieved with a smooth or zero force is the remaining open problem.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-unforced-case-current-frontier"&gt;
 The Unforced Case: Current Frontier&lt;span class="heading__anchor"&gt; &lt;a href="#the-unforced-case-current-frontier"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Non-uniqueness of Leray–Hopf solutions for the &lt;em&gt;unforced&lt;/em&gt; Navier–Stokes equations
remains open. The route to the unforced case requires finding a self-similar
background profile that solves the unforced equations exactly and has an unstable
eigenvalue, a far more demanding task than the forced case, where the profile can
be any divergence-free function.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Open Problem (Jia–Šverák Programme)&lt;/span&gt;
&lt;p&gt;Do there exist two distinct Leray–Hopf solutions of the 3D Navier–Stokes equations
with the same initial data and no external force?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Jia and Šverák (2013–2014) showed that non-uniqueness would follow from a spectral
assumption: if there exists a forward self-similar Navier–Stokes solution whose
linearised operator has an eigenvalue with positive real part, then Leray–Hopf
solutions are non-unique. Guillod and Šverák (2017) provided compelling numerical
evidence that such an unstable self-similar profile exists.&lt;/p&gt;
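&lt;p&gt;A heuristic sketch of why an unstable eigenvalue produces a second solution with the same data (the similarity variables $\xi = x/\sqrt{t}$, $\tau = \log t$, the profile $\bar U$, and the eigenpair $(\lambda, \eta)$ are notation assumed here): a linearised perturbation of the self-similar solution grows like $e^{\lambda\tau} = t^{\lambda}$ in similarity variables, so in physical variables one expects
$$u(x,t) \approx \frac{1}{\sqrt{t}}\,\bar U\!\left(\frac{x}{\sqrt{t}}\right) + \frac{1}{\sqrt{t}}\,\mathrm{Re}\!\left(t^{\lambda}\,\eta\!\left(\frac{x}{\sqrt{t}}\right)\right), \qquad \left\|\frac{1}{\sqrt{t}}\,t^{\lambda}\,\eta\!\left(\frac{\cdot}{\sqrt{t}}\right)\right\|_{L^2} = t^{\,\mathrm{Re}\,\lambda + 1/4}\,\|\eta\|_{L^2}.$$
When $\mathrm{Re}\,\lambda &amp;gt; 0$ the perturbation vanishes in $L^2$ as $t \to 0^+$ but is nonzero for $t &amp;gt; 0$, so the perturbed and unperturbed flows share their initial data. This ignores the nonlinear correction and the truncation needed to obtain finite energy, both of which the rigorous argument must handle.&lt;/p&gt;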
&lt;p&gt;In September 2025, Giri and Kwon posted a preprint (arXiv:2509.25116) claiming a
computer-assisted proof of the existence of an unstable self-similar profile for
the unforced equations, which, via the Jia–Šverák mechanism, would establish
non-uniqueness of Leray–Hopf solutions. The proof uses rigorous interval arithmetic
to verify the existence of an unstable eigenvalue. As of this writing the preprint
is under review by the community.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-regularity-threshold"&gt;
 The Regularity Threshold&lt;span class="heading__anchor"&gt; &lt;a href="#the-regularity-threshold"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The accumulated results suggest the following picture of the
&lt;strong&gt;flexibility-rigidity dichotomy&lt;/strong&gt; for the Euler and Navier–Stokes equations.&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Regularity class&lt;/th&gt;
					&lt;th&gt;Euler&lt;/th&gt;
					&lt;th&gt;Navier–Stokes&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;$C^{0,\alpha}$, $\alpha &amp;lt; 1/3$&lt;/td&gt;
					&lt;td&gt;non-unique, dissipative (Isett 2018)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$C^{0,\alpha}$, $\alpha &amp;gt; 1/3$&lt;/td&gt;
					&lt;td&gt;energy-conserving (Constantin–E–Titi 1994)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$L^2$ (global energy inequality)&lt;/td&gt;
					&lt;td&gt;non-unique&lt;/td&gt;
					&lt;td&gt;&lt;strong&gt;open (unforced); non-unique forced (ABC 2022)&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$L^\infty_t L^3_x$ (LPS regularity)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;unique and smooth (ESS 2003)&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The Leray–Hopf class sits precisely at the boundary where uniqueness is expected
to break down but has not yet been proved to do so in the unforced case.&lt;/p&gt;
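&lt;p&gt;The position of each row relative to the others can be read off from the scaling symmetry of the equations. As a sketch on $\mathbb{R}^3 \times (0,\infty)$ (the parameter $\lambda &amp;gt; 0$ and the exponents $p$, $q$ are notation introduced here): if $(u,p)$ solves Navier–Stokes, so does
$$u_\lambda(x,t) = \lambda\,u(\lambda x, \lambda^2 t), \qquad p_\lambda(x,t) = \lambda^2\,p(\lambda x, \lambda^2 t),$$
and a change of variables gives
$$\|u_\lambda\|_{L^p_t L^q_x} = \lambda^{\,1 - \frac{2}{p} - \frac{3}{q}}\,\|u\|_{L^p_t L^q_x}.$$
The exponent vanishes for $L^\infty_t L^3_x$, which is therefore scale-invariant (critical), while for the energy norm $L^\infty_t L^2_x$ it equals $-1/2$: the energy of $u_\lambda$ shrinks as $\lambda \to \infty$, so an energy bound does not by itself rule out concentration at small scales.&lt;/p&gt;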
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-resolving-the-jiašverák-spectral-condition"&gt;
 1. Resolving the Jia–Šverák Spectral Condition&lt;span class="heading__anchor"&gt; &lt;a href="#1-resolving-the-jia%c5%a1ver%c3%a1k-spectral-condition"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The most direct path to unforced Leray–Hopf non-uniqueness is to rigorously confirm
or refute the spectral condition of Jia–Šverák: find (or prove the nonexistence of)
a forward self-similar Navier–Stokes profile with an unstable linearised eigenvalue.
The 2025 Giri–Kwon computer-assisted preprint claims this is now done. If confirmed,
the consequence is striking: Leray&amp;rsquo;s 1934 existence theorem cannot be supplemented
by uniqueness, and the Navier–Stokes Cauchy problem is &lt;em&gt;ill-posed&lt;/em&gt; in the Leray–Hopf
class.&lt;/p&gt;
&lt;h3 class="heading" id="2-selection-principles-and-physical-solutions"&gt;
 2. Selection Principles and Physical Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#2-selection-principles-and-physical-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;If Leray–Hopf solutions are indeed non-unique, a fundamental question becomes which
solution is the physically correct one, the one observed in experiments and computed
in simulations. Several selection criteria have been proposed:
the &lt;em&gt;vanishing viscosity&lt;/em&gt; limit of the Navier–Stokes solution as $\nu\to 0$ from
above, &lt;em&gt;entropy conditions&lt;/em&gt; analogous to those for hyperbolic conservation laws,
and &lt;em&gt;renormalisation group&lt;/em&gt; or &lt;em&gt;statistical ensemble&lt;/em&gt; approaches motivated by
turbulence theory. None of these has been rigorously validated as a selection
criterion that distinguishes a unique Leray–Hopf solution from the others.&lt;/p&gt;
&lt;h3 class="heading" id="3-sharp-regularity-thresholds-for-navierstokes"&gt;
 3. Sharp Regularity Thresholds for Navier–Stokes&lt;span class="heading__anchor"&gt; &lt;a href="#3-sharp-regularity-thresholds-for-navierstokes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;For Euler, Onsager&amp;rsquo;s conjecture identifies $C^{1/3}$ as the sharp regularity
threshold for energy conservation. What is the analogous threshold for Navier–Stokes?
The Buckmaster–Vicol solutions are in $C_t^0 L_x^2$ (very rough), while the
Ladyzhenskaya–Prodi–Serrin class gives uniqueness. The precise exponent at which
uniqueness breaks down, if it does, is not known. Determining the sharp Sobolev
or Hölder regularity threshold for Navier–Stokes uniqueness, analogous to Onsager&amp;rsquo;s
$1/3$, is a central open problem.&lt;/p&gt;
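&lt;p&gt;The origin of the exponent $1/3$ can be seen in a short scaling estimate, sketched here in the spirit of the Constantin–E–Titi argument (the mollification scale $\ell$ and the notation $u_\ell$, $R_\ell$ are introduced for illustration, and the setting is the periodic torus so that all integrals are finite). Mollifying an Euler solution at scale $\ell$ and testing against $u_\ell$ gives
$$\frac{d}{dt}\,\frac12\int |u_\ell|^2\,dx = \int \nabla u_\ell : R_\ell\,dx, \qquad R_\ell = (u\otimes u)_\ell - u_\ell\otimes u_\ell.$$
For $u \in C^{0,\alpha}$ one has $\|\nabla u_\ell\|_{L^\infty} \lesssim \ell^{\alpha-1}$ and $\|R_\ell\|_{L^\infty} \lesssim \ell^{2\alpha}$, so the right-hand side is $O(\ell^{3\alpha-1})$. It vanishes as $\ell \to 0$ when $\alpha &amp;gt; 1/3$, which forces energy conservation, and the bound gives no information when $\alpha \leq 1/3$.&lt;/p&gt;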
&lt;h3 class="heading" id="4-uniqueness-for-axisymmetric-initial-data"&gt;
 4. Uniqueness for Axisymmetric Initial Data&lt;span class="heading__anchor"&gt; &lt;a href="#4-uniqueness-for-axisymmetric-initial-data"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A natural restricted problem is whether Leray–Hopf solutions with axisymmetric,
swirl-free initial data are unique. Such data imposes a strong geometric constraint
that eliminates most of the degrees of freedom available to convex integration.
Partial results are known (e.g., for sufficiently regular axisymmetric data without
swirl, global regularity is classical, due to Ladyzhenskaya and to Ukhovskii–Yudovich in
1968, while the case with swirl remains open), but uniqueness for general finite-energy
data in this class has not been established. If the Giri–Kwon instability is confirmed, understanding
whether the instability mechanism survives axisymmetric perturbations is an
immediate question.&lt;/p&gt;
&lt;h3 class="heading" id="5-stochastic-regularisation"&gt;
 5. Stochastic Regularisation&lt;span class="heading__anchor"&gt; &lt;a href="#5-stochastic-regularisation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;There is a well-studied phenomenon, &lt;em&gt;regularisation by noise&lt;/em&gt;, in which adding
a stochastic forcing term to an ill-posed deterministic PDE restores well-posedness.
For the Navier–Stokes equations, Hofmanová–Zhu–Zhu (2023) showed non-uniqueness
persists even under multiplicative noise for certain body forces, by adapting the
Albritton–Brué–Colombo construction. Whether a generic stochastic perturbation
can restore uniqueness of Leray–Hopf solutions, and what the appropriate notion of
&amp;ldquo;generic&amp;rdquo; should be, is a rich open direction combining convex integration with stochastic
analysis.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Nash, J. (1954). $C^1$ isometric imbeddings. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;60&lt;/strong&gt;(3), 383–396.&lt;/li&gt;
&lt;li&gt;De Lellis, C. &amp;amp; Székelyhidi, L. (2009). The Euler equations as a differential inclusion. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;170&lt;/strong&gt;(3), 1417–1436.&lt;/li&gt;
&lt;li&gt;De Lellis, C. &amp;amp; Székelyhidi, L. (2013). Dissipative continuous Euler flows. &lt;em&gt;Inventiones Mathematicae&lt;/em&gt;, &lt;strong&gt;193&lt;/strong&gt;(2), 377–407.&lt;/li&gt;
&lt;li&gt;Constantin, P., E, W., &amp;amp; Titi, E. S. (1994). Onsager&amp;rsquo;s conjecture on the energy conservation for solutions of Euler&amp;rsquo;s equation. &lt;em&gt;Communications in Mathematical Physics&lt;/em&gt;, &lt;strong&gt;165&lt;/strong&gt;(1), 207–209.&lt;/li&gt;
&lt;li&gt;Isett, P. (2018). A proof of Onsager&amp;rsquo;s conjecture. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;188&lt;/strong&gt;(3), 871–963.&lt;/li&gt;
&lt;li&gt;Buckmaster, T. &amp;amp; Vicol, V. (2019). Nonuniqueness of weak solutions to the Navier–Stokes equation. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;189&lt;/strong&gt;(1), 101–144.&lt;/li&gt;
&lt;li&gt;Buckmaster, T. &amp;amp; Vicol, V. (2019). Convex integration and phenomenologies in turbulence. &lt;em&gt;EMS Surveys in Mathematical Sciences&lt;/em&gt;, &lt;strong&gt;6&lt;/strong&gt;(1–2), 1–88.&lt;/li&gt;
&lt;li&gt;Albritton, D., Brué, E., &amp;amp; Colombo, M. (2022). Non-uniqueness of Leray solutions of the forced Navier–Stokes equations. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;196&lt;/strong&gt;(1), 415–455.&lt;/li&gt;
&lt;li&gt;Jia, H. &amp;amp; Šverák, V. (2014). Local-in-space estimates near initial time for weak solutions of the Navier–Stokes equations and forward self-similar solutions. &lt;em&gt;Inventiones Mathematicae&lt;/em&gt;, &lt;strong&gt;196&lt;/strong&gt;(1), 233–265.&lt;/li&gt;
&lt;li&gt;Giri, V. &amp;amp; Kwon, H. (2025). Nonuniqueness of Leray–Hopf solutions to the unforced incompressible 3D Navier–Stokes equation. arXiv:2509.25116.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>The Regularity Problem for the 3D Euler Equations</title><link>https://blog.namln.org/en/posts/euler-regularity-problem/</link><pubDate>Fri, 29 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/euler-regularity-problem/</guid><description>&lt;p&gt;Leonhard Euler wrote down the equations governing the motion of an ideal
incompressible fluid in 1757. Whether smooth solutions to these equations can
develop a singularity in finite time, a point at which derivatives of the
velocity blow up, has been an open problem ever since, and remains one of the
central questions in mathematical fluid dynamics.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Problem (Euler Regularity)&lt;/span&gt;
&lt;p&gt;Let $u_0 : \mathbb{R}^3 \to \mathbb{R}^3$ be a smooth, divergence-free initial
velocity field with sufficient decay at infinity. Does the unique local smooth
solution $u(x,t)$ to the 3D incompressible Euler equations
$$\partial_t u + (u \cdot \nabla)u + \nabla p = 0, \qquad \nabla \cdot u = 0, \qquad u(\cdot,0)=u_0$$
remain smooth for all time $t &amp;gt; 0$?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;L4&lt;/em&gt; on &lt;a href="https://www.unsolvedmath.com/problems/PDE-001"&gt;UnsolvedMath&lt;/a&gt;,
reflecting its depth, and is closely related to the Clay Millennium Prize Problem
on the Navier–Stokes equations. The two questions are linked through the
zero-viscosity limit, but neither implies the other.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-equations-and-what-regularity-means"&gt;
 The Equations and What Regularity Means&lt;span class="heading__anchor"&gt; &lt;a href="#the-equations-and-what-regularity-means"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The Euler equations express conservation of momentum (first equation) and
incompressibility (second equation) for an inviscid fluid. The unknowns are the
velocity field $u(x,t) \in \mathbb{R}^3$ and pressure $p(x,t) \in \mathbb{R}$;
the pressure is determined implicitly by incompressibility via an elliptic equation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Vorticity.&lt;/strong&gt; The central quantity for singularity analysis is the vorticity
$\omega = \nabla \times u$, which satisfies the vorticity equation
$$\partial_t \omega + (u \cdot \nabla)\omega = (\omega \cdot \nabla)u.$$
The right-hand side, the &lt;em&gt;vortex stretching&lt;/em&gt; term, is the essential source of
difficulty. It creates a quadratic feedback: large $\omega$ produces large
$(\omega \cdot \nabla)u$, which can further amplify $\omega$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Local well-posedness.&lt;/strong&gt; For $u_0 \in H^s(\mathbb{R}^3)$ with $s &amp;gt; 5/2$, there
exists a unique smooth solution on a time interval $[0, T^*)$ for some $T^* &amp;gt; 0$
depending on $\|u_0\|_{H^s}$ (Kato, 1972). The question is whether $T^*$ can be
taken equal to $+\infty$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why 2D is easy, 3D is not.&lt;/strong&gt; In two dimensions the vortex stretching term
$(\omega \cdot \nabla)u$ vanishes identically, because the vorticity is perpendicular to
the plane of motion while $u$ does not vary in that direction. The scalar vorticity
$\omega = \partial_1 u_2 - \partial_2 u_1$ is then simply transported along fluid
particle paths without amplification, and $\|\omega\|_{L^\infty}$ is conserved.
Global regularity in 2D then follows. In 3D no such conservation holds,
and the problem is genuinely open.&lt;/p&gt;
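&lt;p&gt;In components, the 2D computation reads as follows (a sketch; the particle-path map $X$ is notation introduced here). Viewing a planar flow as $u = (u_1(x_1,x_2,t),\, u_2(x_1,x_2,t),\, 0)$, its vorticity is $(0,0,\omega)$ and
$$(\omega\cdot\nabla)u = \omega\,\partial_{x_3}u = 0.$$
The vorticity equation reduces to pure transport, so along particle paths $\frac{d}{dt}X(a,t) = u(X(a,t),t)$ with $X(a,0) = a$,
$$\frac{d}{dt}\,\omega(X(a,t),t) = 0, \qquad \text{hence} \qquad \omega(X(a,t),t) = \omega_0(a),$$
and the supremum of $|\omega|$ cannot change in time.&lt;/p&gt;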
&lt;hr&gt;
&lt;h2 class="heading" id="the-bealekatomajda-criterion"&gt;
 The Beale–Kato–Majda Criterion&lt;span class="heading__anchor"&gt; &lt;a href="#the-bealekatomajda-criterion"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The first major structural result reduces the regularity problem to a single quantity.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Beale–Kato–Majda, 1984)&lt;/span&gt;
&lt;p&gt;A smooth solution $u$ of the 3D Euler equations loses regularity at time $T^*$ if
and only if
$$\int_0^{T^*} \|\omega(\cdot,t)\|_{L^\infty(\mathbb{R}^3)}\, dt = +\infty.$$
In particular, if the vorticity remains bounded in $L^\infty$ on $[0,T]$ for every
finite $T$, the solution remains smooth globally.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The BKM criterion redirects the problem: to prove regularity, one must show that the
time integral of $\|\omega\|_{L^\infty}$ cannot diverge in finite time. Since $\omega$
satisfies a transport-stretching equation, this requires understanding the geometric
structure of the vorticity field under its own evolution.&lt;/p&gt;
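&lt;p&gt;The mechanism behind the criterion can be sketched schematically (constants $C$ are not tracked, $s &amp;gt; 5/2$ is fixed, and the auxiliary quantity $y$ is notation introduced here). Two estimates are combined: an energy estimate and a logarithmic bound on the velocity gradient,
$$\frac{d}{dt}\|u\|_{H^s} \le C\,\|\nabla u\|_{L^\infty}\,\|u\|_{H^s}, \qquad \|\nabla u\|_{L^\infty} \le C\Bigl(1 + \|\omega\|_{L^2} + \|\omega\|_{L^\infty}\bigl(1 + \log^+\|u\|_{H^s}\bigr)\Bigr).$$
With $y(t) = \log\bigl(e + \|u(\cdot,t)\|_{H^s}\bigr)$ these give $y&amp;rsquo; \le C\bigl(1 + \|\omega\|_{L^2} + \|\omega\|_{L^\infty}\bigr)\,y$, and since $\|\nabla u\|_{L^2} = \|\omega\|_{L^2}$ for divergence-free fields, the vorticity equation yields $\|\omega(t)\|_{L^2} \le \|\omega_0\|_{L^2}\exp\bigl(C\int_0^t\|\omega\|_{L^\infty}\,ds\bigr)$. Gronwall&amp;rsquo;s inequality then bounds $\|u\|_{H^s}$ on $[0,T]$ in terms of $\int_0^T\|\omega\|_{L^\infty}\,dt$ alone; because the dependence on the high norm is only logarithmic, a finite vorticity integral cannot coexist with loss of regularity.&lt;/p&gt;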
&lt;hr&gt;
&lt;h2 class="heading" id="geometric-conditions-and-depletion-of-stretching"&gt;
 Geometric Conditions and Depletion of Stretching&lt;span class="heading__anchor"&gt; &lt;a href="#geometric-conditions-and-depletion-of-stretching"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The vortex stretching term $(\omega \cdot \nabla)u$ can be decomposed as
$$(\omega \cdot \nabla)u = |\omega|\,(\hat\omega \cdot \nabla)u,$$
where $\hat\omega = \omega/|\omega|$ is the unit vorticity direction. The key
observation is that stretching is governed not only by the magnitude of $\omega$
but also by the &lt;em&gt;geometry&lt;/em&gt; of the vorticity field.&lt;/p&gt;
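&lt;p&gt;This can be made quantitative with a one-line computation (the symbols $S$ and $\alpha$ are notation introduced here). Taking the dot product of the vorticity equation with $\omega$ gives
$$\tfrac12\,(\partial_t + u\cdot\nabla)|\omega|^2 = \omega\cdot(\omega\cdot\nabla)u = \omega\cdot S\,\omega, \qquad S = \tfrac12\bigl(\nabla u + (\nabla u)^{T}\bigr),$$
since the antisymmetric part of $\nabla u$ drops out of the quadratic form. Wherever $\omega \neq 0$ this is equivalent to
$$(\partial_t + u\cdot\nabla)|\omega| = \alpha\,|\omega|, \qquad \alpha = \hat\omega\cdot S\,\hat\omega,$$
so the growth rate of the vorticity magnitude is the strain rate measured along the vorticity direction, a quantity that depends on how $\hat\omega$ is aligned with the eigenvectors of $S$.&lt;/p&gt;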
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Constantin–Fefferman–Majda, 1996)&lt;/span&gt;
&lt;p&gt;If the unit vorticity direction $\hat\omega = \omega/|\omega|$ is uniformly Lipschitz
in a neighbourhood of the set $\{|\omega| &amp;gt; \lambda\}$ for all $t \in [0, T]$ and
some $\lambda &amp;gt; 0$, then the solution remains smooth on $[0,T]$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;This result says that blowup, if it occurs, must be accompanied by violent geometric
irregularity of vortex lines: not just large vorticity magnitude, but also loss of
Lipschitz regularity of the vorticity direction. It has motivated a line of research
on the geometric structure of vortex tubes near potential singularities.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="blowup-for-less-regular-data"&gt;
 Blowup for Less Regular Data&lt;span class="heading__anchor"&gt; &lt;a href="#blowup-for-less-regular-data"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Recent years have seen dramatic progress on singularity formation for initial data
of limited regularity: velocity fields in $C^{1,\alpha}$ rather than $C^\infty$.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Elgindi, 2021)&lt;/span&gt;
&lt;p&gt;There exist axisymmetric, swirl-free initial velocity fields $u_0 \in C^{1,\alpha}(\mathbb{R}^3)$
for sufficiently small $\alpha &amp;gt; 0$ such that the corresponding solution to the 3D
Euler equations develops a finite-time singularity.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Elgindi&amp;rsquo;s proof, published in the &lt;em&gt;Annals of Mathematics&lt;/em&gt; &lt;strong&gt;194&lt;/strong&gt; (2021), 647–727,
constructs a self-similar blowup profile and establishes its nonlinear stability using
a dynamical rescaling formulation. The initial data is not smooth: it belongs to
$C^{1,\alpha}$ but not to $C^2$. The singularity forms at the axis of symmetry $r=0$.&lt;/p&gt;
&lt;p&gt;This was a breakthrough, but it left open the smooth case. Elgindi himself noted the
next target: constructing blowup from initial data that is non-smooth only at a
single point, or eventually from fully smooth data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extending Elgindi&amp;rsquo;s construction.&lt;/strong&gt; Chen and Hou (2021) proved the same type of
$C^{1,\alpha}$ blowup for the 3D axisymmetric Euler equations &lt;em&gt;with boundary&lt;/em&gt; (inside
a periodic cylinder), realising the Hou–Luo blowup scenario numerically proposed in
2014. Subsequent work by Córdoba, Martínez-Zoroa, and Zheng (2025, &lt;em&gt;Annals of PDE&lt;/em&gt;)
showed that the singularity can be formed from initial data in
$C^\infty(\mathbb{R}^3 \setminus \{0\}) \cap C^{1,\alpha}$, with non-smoothness at a
single point, a further step toward the smooth case.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-2025-breakthrough-smooth-blowup-with-boundary"&gt;
 The 2025 Breakthrough: Smooth Blowup with Boundary&lt;span class="heading__anchor"&gt; &lt;a href="#the-2025-breakthrough-smooth-blowup-with-boundary"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The most significant recent development is the following result, which provides a
rigorous proof of finite-time singularity from &lt;em&gt;smooth&lt;/em&gt; initial data.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Chen–Hou, PNAS 2025)&lt;/span&gt;
&lt;p&gt;There exists a family of smooth, finite-energy initial data for the 3D axisymmetric
Euler equations in a smooth bounded domain (periodic cylinder) such that the
corresponding solutions develop a finite-time singularity. The blowup is
nearly self-similar and occurs at the intersection of the boundary $r=1$
and the symmetry plane $z=0$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The paper, contributed by Thomas Hou and published in &lt;em&gt;PNAS&lt;/em&gt; in June 2025
(reviewed by Caflisch, Gómez-Serrano, Šverák, and Tao), provides a
&lt;em&gt;computer-assisted proof&lt;/em&gt;. The strategy is to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;construct a numerical approximate self-similar blowup profile via the dynamical
rescaling formulation,&lt;/li&gt;
&lt;li&gt;prove rigorously that the true solution remains close to this profile using
energy estimates with carefully verified error bounds (computed with interval
arithmetic), and&lt;/li&gt;
&lt;li&gt;conclude nonlinear stability of the blowup via a bootstrap argument.&lt;/li&gt;
&lt;/ol&gt;
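&lt;p&gt;What the profile in step 1 encodes can be illustrated in the idealised, exactly self-similar case (a sketch only: the exponents $c_\omega$, $c_l$, the profile $\Omega$, and the blowup time $T$ are notation assumed here, $x$ is measured from the blowup point, and the actual blowup is only nearly self-similar). With the ansatz
$$\omega(x,t) = \frac{1}{(T-t)^{c_\omega}}\;\Omega\!\left(\frac{x}{(T-t)^{c_l}}\right),$$
the time derivative scales like $(T-t)^{-c_\omega-1}$ while the stretching term $(\omega\cdot\nabla)u$ scales like $(T-t)^{-2c_\omega}$, because $\nabla u$ scales like $\omega$. Balancing the two forces $c_\omega = 1$, so that
$$\|\omega(\cdot,t)\|_{L^\infty} = \frac{\|\Omega\|_{L^\infty}}{T-t}, \qquad \int_0^{T}\|\omega(\cdot,t)\|_{L^\infty}\,dt = +\infty,$$
which is exactly the borderline divergence that a Beale–Kato–Majda type criterion requires at a singularity. The spatial exponent $c_l$ is not fixed by this balance and has to be determined together with the profile.&lt;/p&gt;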
&lt;p&gt;This settles the question in favour of blowup in the setting of smooth data and a
smooth bounded domain. The boundary plays a crucial role: it creates an
antisymmetric flow pattern driving azimuthal vorticity toward a critical ring,
generating intense vortex stretching at a hyperbolic saddle point on the wall.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The remaining open case.&lt;/strong&gt; The problem in $\mathbb{R}^3$ (or on the periodic
torus $\mathbb{T}^3$) &lt;em&gt;without boundary&lt;/em&gt; remains open. It is not known whether
smooth initial data in free space can produce a singularity, or whether the
absence of a boundary provides a genuine stabilising mechanism.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-removing-the-boundary"&gt;
 1. Removing the Boundary&lt;span class="heading__anchor"&gt; &lt;a href="#1-removing-the-boundary"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The most pressing open question is whether the Chen–Hou construction can be
extended to $\mathbb{R}^3$ or $\mathbb{T}^3$. The boundary in the 2025 result
acts as a geometric catalyst: it enforces a no-flow condition that concentrates
vorticity at a specific ring on the wall. Without a boundary, the antisymmetric
flow structure that drives the singularity must be sustained entirely by the
initial data and the nonlinear dynamics. Whether a comparable mechanism can
persist in free space, without the reflective constraint of the wall, is the
central open question.&lt;/p&gt;
&lt;h3 class="heading" id="2-self-similar-blowup-in-full-3d"&gt;
 2. Self-Similar Blowup in Full 3D&lt;span class="heading__anchor"&gt; &lt;a href="#2-self-similar-blowup-in-full-3d"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;All current singularity results are for &lt;em&gt;axisymmetric&lt;/em&gt; flows, which reduce the
problem from 3 spatial dimensions to 2 (the $rz$-plane). In full 3D, the angular
variable $\theta$ is active, and perturbations in the azimuthal direction can either
stabilise or destabilise the singularity. Elgindi, Ghoul, and Masmoudi (2021) proved
stability of the $C^{1,\alpha}$ blowup under axisymmetric perturbations. Whether
the singularity survives &lt;em&gt;fully 3D&lt;/em&gt; (non-axisymmetric) perturbations, a question
Elgindi posed as open, is crucial: a blowup that is destroyed by any non-symmetric
perturbation has limited physical relevance.&lt;/p&gt;
&lt;h3 class="heading" id="3-quantitative-vortex-stretching-and-the-role-of-geometry"&gt;
 3. Quantitative Vortex Stretching and the Role of Geometry&lt;span class="heading__anchor"&gt; &lt;a href="#3-quantitative-vortex-stretching-and-the-role-of-geometry"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The BKM criterion and the Constantin–Fefferman–Majda theorem both express the
same idea from opposite directions: blowup is controlled by the magnitude &lt;em&gt;and&lt;/em&gt;
geometry of the vorticity. Current research asks whether a quantitative version can
be made sharp. Specifically: if the vorticity direction $\hat\omega$ becomes
Hölder-continuous but not Lipschitz, does blowup necessarily follow? Or is there
a finer scale-invariant quantity, perhaps involving the Hessian of the velocity
or the curvature of vortex lines, that governs the problem?&lt;/p&gt;
&lt;h3 class="heading" id="4-weak-solutions-and-non-uniqueness"&gt;
 4. Weak Solutions and Non-Uniqueness&lt;span class="heading__anchor"&gt; &lt;a href="#4-weak-solutions-and-non-uniqueness"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Separate from the question of whether smooth solutions blow up is the question of
what happens &lt;em&gt;after&lt;/em&gt; a potential singularity. De Lellis and Székelyhidi (2009–2013)
proved that the Euler equations have infinitely many weak $L^\infty$ solutions
for suitable initial data, via convex integration. Isett (2018) proved that weak
solutions can dissipate energy, confirming Onsager&amp;rsquo;s 1949 conjecture. These results
show that the solution concept must be carefully chosen. After a smooth solution blows up,
the system likely enters a regime of non-unique weak solutions, and identifying the
physically relevant selection criterion among the abundance of solutions produced by the
$h$-principle (candidates include entropy conditions and the vanishing-viscosity limit)
is a major open problem.&lt;/p&gt;
&lt;h3 class="heading" id="5-vanishing-viscosity-and-the-navierstokes-connection"&gt;
 5. Vanishing Viscosity and the Navier–Stokes Connection&lt;span class="heading__anchor"&gt; &lt;a href="#5-vanishing-viscosity-and-the-navierstokes-connection"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Navier–Stokes equations add a viscous term $\nu \Delta u$ to the right-hand
side. For any $\nu &amp;gt; 0$, global regularity of Navier–Stokes in 3D is itself open
(the Clay Millennium Problem). For the zero-viscosity limit $\nu \to 0$, the
central question is whether Navier–Stokes solutions converge to Euler solutions
uniformly in time, a question tied to boundary layer behaviour (the Prandtl
conjecture) and to the regularity of the Euler solution. If Euler develops a
singularity at time $T^*$, the behaviour of Navier–Stokes solutions near $T^*$
as $\nu \to 0$ is completely unknown.&lt;/p&gt;
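&lt;p&gt;The dependence on the regularity of the Euler solution is visible in a formal energy estimate, sketched here for flows without boundary and with identical initial data (the difference $w = u^\nu - u$ is notation introduced here, and integrations by parts are taken for granted). Subtracting the two equations and testing against $w$ gives
$$\frac12\frac{d}{dt}\|w\|_{L^2}^2 = -\int w\cdot(w\cdot\nabla)u\,dx - \nu\|\nabla w\|_{L^2}^2 - \nu\int\nabla u:\nabla w\,dx \;\le\; \|\nabla u\|_{L^\infty}\|w\|_{L^2}^2 + \frac{\nu}{4}\|\nabla u\|_{L^2}^2,$$
and Gronwall&amp;rsquo;s inequality yields
$$\|w(\cdot,t)\|_{L^2}^2 \;\le\; \frac{\nu}{2}\int_0^t\|\nabla u(\cdot,s)\|_{L^2}^2\,ds\;\exp\!\left(2\int_0^t\|\nabla u(\cdot,s)\|_{L^\infty}\,ds\right).$$
The right-hand side tends to zero with $\nu$ as long as $\int_0^t\|\nabla u\|_{L^\infty}\,ds$ is finite, and the bound degenerates precisely when the Euler solution loses regularity.&lt;/p&gt;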
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Euler, L. (1757). Principes généraux du mouvement des fluides. &lt;em&gt;Mémoires de l&amp;rsquo;Académie des Sciences de Berlin&lt;/em&gt;, &lt;strong&gt;11&lt;/strong&gt;, 274–315.&lt;/li&gt;
&lt;li&gt;Beale, J. T., Kato, T., &amp;amp; Majda, A. (1984). Remarks on the breakdown of smooth solutions for the 3-D Euler equations. &lt;em&gt;Communications in Mathematical Physics&lt;/em&gt;, &lt;strong&gt;94&lt;/strong&gt;(1), 61–66.&lt;/li&gt;
&lt;li&gt;Constantin, P., Fefferman, C., &amp;amp; Majda, A. J. (1996). Geometric constraints on potentially singular solutions for the 3-D Euler equations. &lt;em&gt;Communications in Partial Differential Equations&lt;/em&gt;, &lt;strong&gt;21&lt;/strong&gt;(3–4), 559–571.&lt;/li&gt;
&lt;li&gt;Elgindi, T. M. (2021). Finite-time singularity formation for $C^{1,\alpha}$ solutions to the incompressible Euler equations on $\mathbb{R}^3$. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;194&lt;/strong&gt;(3), 647–727.&lt;/li&gt;
&lt;li&gt;Elgindi, T. M., Ghoul, T.-E., &amp;amp; Masmoudi, N. (2021). On the stability of self-similar blow-up for $C^{1,\alpha}$ solutions to the incompressible Euler equations. &lt;em&gt;Cambridge Journal of Mathematics&lt;/em&gt;, &lt;strong&gt;9&lt;/strong&gt;(4), 1035–1075.&lt;/li&gt;
&lt;li&gt;Chen, J. &amp;amp; Hou, T. Y. (2021). Finite time blowup of 2D Boussinesq and 3D Euler equations with $C^{1,\alpha}$ velocity and boundary. &lt;em&gt;Communications in Mathematical Physics&lt;/em&gt;, &lt;strong&gt;383&lt;/strong&gt;, 1559–1667.&lt;/li&gt;
&lt;li&gt;Chen, J. &amp;amp; Hou, T. Y. (2025). Singularity formation in 3D Euler equations with smooth initial data and boundary. &lt;em&gt;Proceedings of the National Academy of Sciences&lt;/em&gt;, &lt;strong&gt;122&lt;/strong&gt;(27). &lt;a href="https://doi.org/10.1073/pnas.2500940122"&gt;https://doi.org/10.1073/pnas.2500940122&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Córdoba, D., Martínez-Zoroa, L., &amp;amp; Zheng, F. (2025). Finite time singularities to the 3D incompressible Euler equations for solutions in $C^\infty(\mathbb{R}^3\setminus\{0\})\cap C^{1,\alpha}\cap L^2$. &lt;em&gt;Annals of PDE&lt;/em&gt;. &lt;a href="https://doi.org/10.1007/s40818-025-00214-2"&gt;https://doi.org/10.1007/s40818-025-00214-2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Isett, P. (2018). A proof of Onsager&amp;rsquo;s conjecture. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;188&lt;/strong&gt;(3), 871–963.&lt;/li&gt;
&lt;li&gt;Majda, A. J. &amp;amp; Bertozzi, A. L. (2002). &lt;em&gt;Vorticity and Incompressible Flow&lt;/em&gt;. Cambridge University Press.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>$C^r$ Stability Conjecture</title><link>https://blog.namln.org/en/posts/cr-stability-conjecture/</link><pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/cr-stability-conjecture/</guid><description>&lt;p&gt;Structural stability is a global topological property: a dynamical system is
structurally stable if all nearby systems have the same orbit structure, up to
a continuous change of coordinates. Hyperbolicity is a local differential property:
the tangent bundle over the recurrent set splits into uniformly contracting and
expanding directions. That these two conditions should be equivalent is one of the
deepest principles in smooth dynamics.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Conjecture ($C^r$ Stability Conjecture, Palis–Smale, ~1970)&lt;/span&gt;
&lt;p&gt;Let $M$ be a closed smooth manifold and $r \geq 1$. If $f \in \mathrm{Diff}^r(M)$
is $C^r$-structurally stable, then $f$ is hyperbolic, i.e., it satisfies
&lt;strong&gt;Axiom A&lt;/strong&gt; and the &lt;strong&gt;Strong Transversality Condition&lt;/strong&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;L3&lt;/em&gt; on &lt;a href="https://www.unsolvedmath.com/problems/OPG-725"&gt;UnsolvedMath&lt;/a&gt;
and sits at the heart of the global theory of smooth dynamical systems. The case
$r = 1$ is resolved. The case $r \geq 2$ is open, and even basic consequences of
structural stability that are elementary for $r = 1$ remain unknown for $r = 2$.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="key-definitions"&gt;
 Key Definitions&lt;span class="heading__anchor"&gt; &lt;a href="#key-definitions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Structural stability.&lt;/strong&gt; A diffeomorphism $f \in \mathrm{Diff}^r(M)$ is
&lt;em&gt;$C^r$-structurally stable&lt;/em&gt; if there exists a $C^r$-neighborhood $\mathcal{U}$ of $f$
such that every $g \in \mathcal{U}$ is topologically conjugate to $f$: there is a
homeomorphism $h : M \to M$ with $h \circ f = g \circ h$. The system is therefore
robust under $C^r$-small perturbations in the strongest possible sense: topology,
not just orbit counts, is preserved.&lt;/p&gt;
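&lt;p&gt;What the conjugacy equation preserves can be checked directly. Composing $h \circ f = g \circ h$ with itself, and using that $h$ is invertible, gives
$$h \circ f^n = g^n \circ h \qquad \text{for all } n \in \mathbb{Z},$$
so $h$ maps the $f$-orbit of $x$ onto the $g$-orbit of $h(x)$, preserving the order of iteration. In particular, if $f^n(x) = x$ then $g^n(h(x)) = h(f^n(x)) = h(x)$: periodic points of period $n$ correspond bijectively under $h$. Because $h$ is only a homeomorphism, differentiable data such as the eigenvalues of $Df^n$ at a periodic point need not be preserved.&lt;/p&gt;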
&lt;p&gt;&lt;strong&gt;Axiom A.&lt;/strong&gt; The diffeomorphism $f$ satisfies &lt;em&gt;Axiom A&lt;/em&gt; if:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the non-wandering set $\Omega(f)$ is hyperbolic: there is a $Df$-invariant splitting
$T_x M = E^s_x \oplus E^u_x$ over $\Omega(f)$ with uniform exponential contraction
on $E^s$ and expansion on $E^u$;&lt;/li&gt;
&lt;li&gt;the periodic points of $f$ are dense in $\Omega(f)$.&lt;/li&gt;
&lt;/ul&gt;
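&lt;p&gt;Written out, uniform hyperbolicity of $\Omega(f)$ means that there are constants $C &amp;gt; 0$ and $0 &amp;lt; \lambda &amp;lt; 1$ (names chosen here for illustration), independent of the point $x \in \Omega(f)$, such that for all $n \geq 0$
$$\|Df^n_x\,v\| \le C\lambda^n\|v\| \quad \text{for } v \in E^s_x, \qquad \|Df^{-n}_x\,v\| \le C\lambda^n\|v\| \quad \text{for } v \in E^u_x.$$
A standard example is the map of the torus $\mathbb{T}^2$ induced by $(x,y) \mapsto (2x+y,\; x+y)$. Its derivative is the same matrix at every point, with eigenvalues $(3 \pm \sqrt5)/2$, one inside and one outside the unit circle; the two eigendirections give $E^s$ and $E^u$, and the estimates hold with $C = 1$ and $\lambda = (3-\sqrt5)/2$.&lt;/p&gt;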
&lt;p&gt;&lt;strong&gt;Strong Transversality Condition (STC).&lt;/strong&gt; For every $x, y \in \Omega(f)$, the
stable manifold $W^s(x)$ and the unstable manifold $W^u(y)$ intersect transversally.
Tangential intersections, namely &lt;em&gt;homoclinic or heteroclinic tangencies&lt;/em&gt;, are forbidden.&lt;/p&gt;
&lt;p&gt;Together, Axiom A and the STC constitute what is usually meant by saying $f$ is
&lt;em&gt;hyperbolic&lt;/em&gt; in the sense of the stability conjecture.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-two-directions"&gt;
 The Two Directions&lt;span class="heading__anchor"&gt; &lt;a href="#the-two-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The conjecture, as an equivalence, has an easy direction and a hard direction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Structural stability follows from hyperbolicity&lt;/strong&gt; (the easy direction). Robbin (1971)
proved this for $C^2$ diffeomorphisms; Robinson (1976) extended it to $C^1$. Both
proofs use the implicit function theorem on an appropriate space of conjugacies.
Together they cover every $r \geq 1$, because a $C^r$ diffeomorphism is in particular $C^1$
and every $C^r$-small perturbation is also $C^1$-small.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Robbin 1971, Robinson 1976)&lt;/span&gt;
&lt;p&gt;For every $r \geq 1$, if $f \in \mathrm{Diff}^r(M)$ satisfies Axiom A and the
Strong Transversality Condition, then $f$ is $C^r$-structurally stable.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Hyperbolicity follows from structural stability&lt;/strong&gt; (the hard direction) is the
conjecture itself. It requires understanding what structural stability forces on
the dynamics, ruling out every non-hyperbolic mechanism compatible with stability.
This is where the difficulty lies, and where the gap between $r = 1$ and $r \geq 2$
opens.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-c1-case-mañés-theorem"&gt;
 The $C^1$ Case: Mañé&amp;rsquo;s Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#the-c1-case-ma%c3%b1%c3%a9s-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The $C^1$ stability conjecture was fully proved by Mañé in 1987.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Mañé, 1987)&lt;/span&gt;
&lt;p&gt;Every $C^1$-structurally stable diffeomorphism of a closed manifold satisfies
Axiom A and the Strong Transversality Condition.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The proof, published in &lt;em&gt;Publ. Math. IHÉS&lt;/em&gt; &lt;strong&gt;66&lt;/strong&gt; (1987), 161–210, is a tour de
force of $C^1$ perturbation theory. It rests on several tools that are available
only in the $C^1$ topology:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Pugh&amp;rsquo;s $C^1$ closing lemma (1967):&lt;/strong&gt; Given a non-wandering point $x$ of $f$,
one can make an arbitrarily small $C^1$ perturbation of $f$ to create a periodic
orbit passing through $x$. This is the essential mechanism for showing that periodic
points are dense in $\Omega(f)$.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mañé&amp;rsquo;s ergodic closing lemma (1982):&lt;/strong&gt; A more refined version that controls the
Lyapunov exponents of the created periodic orbit, allowing the construction of
hyperbolic periodic points that shadow the orbit of an ergodic measure.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Franks&amp;rsquo; lemma (1971):&lt;/strong&gt; Any sufficiently small perturbation of the
derivatives $Df$ along a periodic orbit can be realised by a $C^1$-small perturbation of $f$
that preserves the orbit, allowing one to test whether a given
splitting is genuinely hyperbolic or can be destroyed by a small $C^1$ perturbation.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The strategy is to assume structural stability and use these tools to show, step by
step, that the non-wandering set must be hyperbolic and that tangencies cannot persist.
Mañé had proved the surface case ($\dim M = 2$, $r = 1$) earlier, with the full
higher-dimensional result completed in the 1987 paper. Aoki (1992) and Hayashi (1992)
subsequently settled the closely related Mañé conjecture on the $C^1$ interior of the
set of diffeomorphisms with all hyperbolic periodic points.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-wall-at-r-geq-2"&gt;
 The Wall at $r \geq 2$&lt;span class="heading__anchor"&gt; &lt;a href="#the-wall-at-r-geq-2"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The $C^r$ case for $r \geq 2$ is not merely an incremental extension. The tools that
power Mañé&amp;rsquo;s proof are fundamentally $C^1$ phenomena.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The $C^r$ closing lemma is open for $r \geq 2$.&lt;/strong&gt; Pugh&amp;rsquo;s proof does not
extend to $r \geq 2$: Gutierrez showed that the local perturbation argument used
for $C^1$ does not work in the $C^2$ topology. A $C^r$ closing lemma is available
only for specific classes of diffeomorphisms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Conservative (volume-preserving) diffeomorphisms on surfaces: Asaoka–Irie
($C^\infty$, 2015), Cristofaro-Gardiner–Prasad–Zhang (2023).&lt;/li&gt;
&lt;li&gt;Partially hyperbolic diffeomorphisms with one-dimensional center bundle (all
$r \geq 2$ including $r = \infty$): Gan–Shi (2022) and the follow-up
$C^r$-chain closing lemma of Shi–Wang (&lt;em&gt;Ergodic Theory Dynam. Syst.&lt;/em&gt; &lt;strong&gt;44&lt;/strong&gt;, 2024).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the absence of a general $C^r$ closing lemma, the first step of Mañé&amp;rsquo;s proof,
showing that periodic points are dense in $\Omega(f)$ under $C^r$ structural
stability, is not known for $r \geq 2$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mañé himself underscored this gap.&lt;/strong&gt; In the 1987 paper, immediately after the
proof of Theorem A, he writes that for $r &amp;gt; 1$ &amp;ldquo;not even [being] known whether a
$C^2$ structurally stable diffeomorphism has at least one periodic point, it seems,
to say the least, difficult to prove that they are dense.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Franks&amp;rsquo; lemma also fails for $r \geq 2$.&lt;/strong&gt; Controlling linear maps along periodic
orbits requires $C^1$ perturbations; in higher regularity the ambient perturbation
must be smooth and the constraints on higher derivatives can prevent the desired
linear behaviour from being achieved.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-the-cr-closing-lemma-for-general-diffeomorphisms"&gt;
 1. The $C^r$ Closing Lemma for General Diffeomorphisms&lt;span class="heading__anchor"&gt; &lt;a href="#1-the-cr-closing-lemma-for-general-diffeomorphisms"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The most direct path to the $C^r$ stability conjecture passes through a general
$C^r$ closing lemma. For $r \geq 2$ this asks: given any non-wandering point of a
$C^r$ diffeomorphism, can one make an arbitrarily small $C^r$ perturbation to close
the orbit? Answering this in the affirmative for all closed manifolds and all
$r \geq 2$ would be a landmark result, and would immediately advance the stability
conjecture. The recent progress in conservative surface dynamics (Cristofaro-Gardiner
et al., 2023) and partially hyperbolic settings shows the question is not hopeless,
but the general dissipative case remains untouched.&lt;/p&gt;
&lt;h3 class="heading" id="2-the-surface-case-dim-m--2-r-geq-2"&gt;
 2. The Surface Case $\dim M = 2$, $r \geq 2$&lt;span class="heading__anchor"&gt; &lt;a href="#2-the-surface-case-dim-m--2-r-geq-2"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;On surfaces the dynamics is simpler: the non-wandering set has lower-dimensional
structure, and the absence of a center bundle means &amp;ldquo;partially hyperbolic&amp;rdquo; reduces
to &amp;ldquo;hyperbolic.&amp;rdquo; Mañé settled the surface case for $r = 1$. The $C^r$ stability
conjecture for surfaces and $r \geq 2$ is already an important open target and may
be the most accessible subcase. Recent $C^\infty$ closing lemmas for conservative
surface diffeomorphisms (Asaoka–Irie) suggest that the conservative surface case
may be reachable.&lt;/p&gt;
&lt;h3 class="heading" id="3-partially-hyperbolic-diffeomorphisms"&gt;
 3. Partially Hyperbolic Diffeomorphisms&lt;span class="heading__anchor"&gt; &lt;a href="#3-partially-hyperbolic-diffeomorphisms"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A diffeomorphism is &lt;em&gt;partially hyperbolic&lt;/em&gt; if the tangent bundle splits as
$TM = E^{ss} \oplus E^c \oplus E^{uu}$ with uniform contraction on $E^{ss}$,
uniform expansion on $E^{uu}$, and an intermediate &amp;ldquo;center&amp;rdquo; bundle $E^c$.
For these systems, Gan–Shi (2022) and Shi–Wang (2024) have established $C^r$
closing and chain-closing lemmas when $\dim E^c = 1$. The question is whether
$C^r$-structural stability of a partially hyperbolic diffeomorphism forces the
center bundle to also become hyperbolic, that is, whether partial hyperbolicity
implies full hyperbolicity under stability.&lt;/p&gt;
&lt;h3 class="heading" id="4-the-palis-global-conjecture"&gt;
 4. The Palis Global Conjecture&lt;span class="heading__anchor"&gt; &lt;a href="#4-the-palis-global-conjecture"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Palis proposed that every diffeomorphism can be approximated either by a hyperbolic
one or by one exhibiting a &lt;em&gt;homoclinic tangency&lt;/em&gt; or a &lt;em&gt;heterodimensional cycle&lt;/em&gt;. This
is a positive description of non-hyperbolic dynamics, and is a strengthening of the
$C^r$ stability conjecture (it would also characterise what structural stability
forbids). In $C^1$ topology this programme is largely complete through Bonatti–
Crovisier&amp;rsquo;s connecting lemma (2004) and related results. For $r \geq 2$ it is wide
open, and progress on the Palis conjecture in $C^r$ would likely resolve the
stability conjecture as a corollary.&lt;/p&gt;
&lt;h3 class="heading" id="5-flows-and-the-vector-field-analogue"&gt;
 5. Flows and the Vector Field Analogue&lt;span class="heading__anchor"&gt; &lt;a href="#5-flows-and-the-vector-field-analogue"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The stability conjecture has a natural analogue for $C^r$ vector fields: a
$C^r$-structurally stable flow should satisfy Axiom A and the strong transversality
condition. For $r = 1$ this is also proved. For $r \geq 2$ it is open. The vector
field setting introduces additional complications from singular points (zeros of the
vector field). Moreover, Labarca–Pacifico showed that on manifolds with boundary stable
flows can fail Axiom A, so the correct formulation may need adaptation. Progress
on the diffeomorphism case would likely shed light on the flow case as well.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Palis, J. &amp;amp; Smale, S. (1970). Structural stability theorems. &lt;em&gt;Proc. Sympos. Pure Math.&lt;/em&gt;, &lt;strong&gt;14&lt;/strong&gt;, 223–231.&lt;/li&gt;
&lt;li&gt;Robbin, J. W. (1971). A structural stability theorem. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;94&lt;/strong&gt;(2), 447–493.&lt;/li&gt;
&lt;li&gt;Robinson, C. (1976). Structural stability of $C^1$ diffeomorphisms. &lt;em&gt;Journal of Differential Equations&lt;/em&gt;, &lt;strong&gt;22&lt;/strong&gt;(1), 28–73.&lt;/li&gt;
&lt;li&gt;Mañé, R. (1987). A proof of the $C^1$ stability conjecture. &lt;em&gt;Publications Mathématiques de l&amp;rsquo;IHÉS&lt;/em&gt;, &lt;strong&gt;66&lt;/strong&gt;, 161–210.&lt;/li&gt;
&lt;li&gt;Aoki, N. (1992). The set of Axiom A diffeomorphisms with no cycles. &lt;em&gt;Bol. Soc. Brasil. Mat.&lt;/em&gt;, &lt;strong&gt;23&lt;/strong&gt;(1–2), 21–65.&lt;/li&gt;
&lt;li&gt;Hayashi, S. (1992). Diffeomorphisms in $\mathcal{F}^1(M)$ satisfy Axiom A. &lt;em&gt;Ergodic Theory Dynam. Systems&lt;/em&gt;, &lt;strong&gt;12&lt;/strong&gt;(2), 233–253.&lt;/li&gt;
&lt;li&gt;Gan, S. &amp;amp; Shi, Y. (2022). $C^r$-closing lemma for partially hyperbolic diffeomorphisms with 1D-center bundle. &lt;em&gt;Journal of Differential Equations&lt;/em&gt;, &lt;strong&gt;334&lt;/strong&gt;, 337–363.&lt;/li&gt;
&lt;li&gt;Shi, Y. &amp;amp; Wang, X. (2024). $C^r$-chain closing lemma for certain partially hyperbolic diffeomorphisms. &lt;em&gt;Ergodic Theory Dynam. Systems&lt;/em&gt;, &lt;strong&gt;44&lt;/strong&gt;(7), 1923–1944.&lt;/li&gt;
&lt;li&gt;Bonatti, C. &amp;amp; Crovisier, S. (2004). Récurrence et généricité. &lt;em&gt;Inventiones Mathematicae&lt;/em&gt;, &lt;strong&gt;158&lt;/strong&gt;(1), 33–104.&lt;/li&gt;
&lt;li&gt;Berger, P. (2017). Lectures on structural stability in dynamics. arXiv:1703.00092.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Inequality for Square-Summable Complex Series</title><link>https://blog.namln.org/en/posts/inequality-square-summable-complex-series/</link><pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/inequality-square-summable-complex-series/</guid><description>&lt;p&gt;Some inequalities look formidable until the right decomposition makes them
transparent. The conjecture below, posed by Zoltan Retkes on the
&lt;a href="http://www.openproblemgarden.org/op/inequality_for_square_summable_complex_series"&gt;Open Problem Garden&lt;/a&gt;
in 2012 with a £10 prize attached, is one such case: once the dyadic structure of
the positive integers is made explicit, the proof reduces to two classical facts.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Conjecture (Retkes, 2012), now proved&lt;/span&gt;
&lt;p&gt;For all $\alpha = (\alpha_1, \alpha_2, \ldots) \in \ell^2(\mathbb{C})$,
$$\sum_{n \geq 1} |\alpha_n|^2 \geq \frac{6}{\pi^2} \sum_{k \geq 0}
\left|\, \sum_{l \geq 0} \frac{\alpha_{2^k(2l+1)}}{l+1} \,\right|^2.$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The conjecture was confirmed by an anonymous comment on the problem page in November
2013. A self-contained proof and an extension to $\ell^p$ were subsequently published
by Ibragimov and Salimova in &lt;em&gt;Elemente der Mathematik&lt;/em&gt; &lt;strong&gt;70&lt;/strong&gt; (2015), 79–81.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-dyadic-decomposition"&gt;
 The Dyadic Decomposition&lt;span class="heading__anchor"&gt; &lt;a href="#the-dyadic-decomposition"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The index $2^k(2l+1)$ running over $k \geq 0$ and $l \geq 0$ is not arbitrary:
it encodes a canonical partition of the positive integers. Every $n \in \mathbb{N}^+$
factors uniquely as
$$n = 2^k \cdot r, \qquad k \geq 0,\quad r \text{ odd positive},$$
where $k = v_2(n)$ is the 2-adic valuation of $n$ and $r = n/2^k$ is its odd part.
Writing $r = 2l+1$ gives the bijection $\mathbb{N}_0 \times \mathbb{N}_0 \to \mathbb{N}^+$,
$(k, l) \mapsto 2^k(2l+1)$. In particular the sets
$$A_k = \{2^k(2l+1) : l \geq 0\} = \{2^k, 3 \cdot 2^k, 5 \cdot 2^k, \ldots\}$$
form a &lt;strong&gt;partition&lt;/strong&gt; of $\mathbb{N}^+$. Explicitly: $A_0 = \{1, 3, 5, 7, \ldots\}$
(odd numbers), $A_1 = \{2, 6, 10, 14, \ldots\}$ (twice an odd number), and so on.
This partition is the key structural fact behind the proof.&lt;/p&gt;
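&lt;p&gt;The decomposition is easy to compute. The following is a minimal sketch; the function
names &lt;code&gt;dyadic_index&lt;/code&gt; and &lt;code&gt;layer&lt;/code&gt; are hypothetical and chosen only
for this illustration. It recovers $(k, l)$ from $n$ and checks on the range $1, \ldots, 64$
that the map $(k, l) \mapsto 2^k(2l+1)$ is one-to-one.&lt;/p&gt;

```python
def dyadic_index(n):
    """Return (k, l) with n == 2**k * (2*l + 1) for a positive integer n."""
    if n == 0 or n != abs(n):
        raise ValueError("n must be a positive integer")
    k = 0
    while n % 2 == 0:  # strip factors of 2; k ends up as the 2-adic valuation
        n //= 2
        k += 1
    return k, (n - 1) // 2  # the remaining odd part equals 2*l + 1

def layer(k, count):
    """First `count` elements of the layer A_k, in increasing order."""
    return [2**k * (2 * l + 1) for l in range(count)]

# Round trip over 1..64: each n is rebuilt from its pair, and pairs never repeat.
pairs = [dyadic_index(n) for n in range(1, 65)]
assert all(2**k * (2 * l + 1) == n for n, (k, l) in enumerate(pairs, start=1))
assert len(set(pairs)) == 64

print(layer(0, 4))       # [1, 3, 5, 7]
print(layer(1, 4))       # [2, 6, 10, 14]
print(dyadic_index(12))  # (2, 1), since 12 = 2**2 * 3
```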
&lt;hr&gt;
&lt;h2 class="heading" id="proof"&gt;
 Proof&lt;span class="heading__anchor"&gt; &lt;a href="#proof"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The argument has two ingredients: the &lt;strong&gt;Basel sum&lt;/strong&gt; $\sum_{l \geq 0}(l+1)^{-2} = \pi^2/6$,
and the &lt;strong&gt;Cauchy–Schwarz inequality&lt;/strong&gt; in $\ell^2(\mathbb{C})$.&lt;/p&gt;
&lt;p&gt;Define two sequences in $\ell^2(\mathbb{C})$:
$$x = \left(1,\, \tfrac{1}{2},\, \tfrac{1}{3},\, \ldots\right), \qquad
y_k = \left(\alpha_{2^k},\, \alpha_{3 \cdot 2^k},\, \alpha_{5 \cdot 2^k},\, \ldots\right)
\quad (k \geq 0).$$&lt;/p&gt;
&lt;p&gt;The inner sum in the conjecture is exactly the $\ell^2$ inner product $\langle x, y_k \rangle$:
$$\sum_{l \geq 0} \frac{\alpha_{2^k(2l+1)}}{l+1} = \langle x, y_k \rangle.$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1: Apply Cauchy–Schwarz.&lt;/strong&gt; For each $k$,&lt;/p&gt;
&lt;p&gt;$$|\langle x, y_k \rangle|^2 \leq \|x\|_2^2 \cdot \|y_k\|_2^2.$$&lt;/p&gt;
&lt;p&gt;Summing over $k \geq 0$,&lt;/p&gt;
&lt;p&gt;$$\sum_{k \geq 0} |\langle x, y_k \rangle|^2 \leq \|x\|_2^2 \sum_{k \geq 0} \|y_k\|_2^2.$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 2: Evaluate using the Basel problem and the partition.&lt;/strong&gt; The Basel problem gives
$$\|x\|_2^2 = \sum_{l \geq 0} \frac{1}{(l+1)^2} = \frac{\pi^2}{6}.$$&lt;/p&gt;
&lt;p&gt;Since the sets $A_k$ partition $\mathbb{N}^+$,
$$\sum_{k \geq 0} \|y_k\|_2^2 = \sum_{k \geq 0} \sum_{l \geq 0} |\alpha_{2^k(2l+1)}|^2
= \sum_{n \geq 1} |\alpha_n|^2.$$&lt;/p&gt;
&lt;p&gt;Combining both steps,
$$\sum_{k \geq 0} \left|\sum_{l \geq 0} \frac{\alpha_{2^k(2l+1)}}{l+1}\right|^2
\leq \frac{\pi^2}{6} \sum_{n \geq 1} |\alpha_n|^2,$$
which is the inequality with the $\frac{6}{\pi^2}$ factor moved to the other side.&lt;/p&gt;
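&lt;p&gt;The inequality can also be sanity-checked numerically on finitely supported sequences.
This is only a sketch under stated assumptions: the helper &lt;code&gt;retkes_sides&lt;/code&gt; is a
hypothetical name, the test sequences are random complex Gaussians of length 200, and a
numerical check is of course no substitute for the proof above.&lt;/p&gt;

```python
import math
import random

def retkes_sides(alpha):
    """Return (lhs, rhs) of the inequality for a finitely supported sequence.

    alpha[0] plays the role of alpha_1; all entries past the end are zero.
    """
    n_max = len(alpha)
    lhs = sum(abs(a) ** 2 for a in alpha)
    rhs = 0.0
    for k in range(n_max.bit_length()):
        base = 2**k
        # base, 3*base, 5*base, ... is the part of the layer A_k inside 1..n_max;
        # enumerate supplies l, so that n == 2**k * (2*l + 1).
        inner = sum(alpha[n - 1] / (l + 1)
                    for l, n in enumerate(range(base, n_max + 1, 2 * base)))
        rhs += abs(inner) ** 2
    return lhs, 6 / math.pi**2 * rhs

random.seed(0)  # fixed seed so the run is reproducible
results = []
for trial in range(5):
    alpha = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
    lhs, rhs = retkes_sides(alpha)
    results.append((lhs, rhs))
    print(f"trial {trial}: lhs = {lhs:.3f}, rhs = {rhs:.3f}, rhs/lhs = {rhs / lhs:.3f}")
```

&lt;p&gt;For generic random input the ratio stays well below $1$, because a random layer $y_k$ is
far from being parallel to $x$.&lt;/p&gt;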
&lt;hr&gt;
&lt;h2 class="heading" id="sharpness-of-the-constant"&gt;
 Sharpness of the Constant&lt;span class="heading__anchor"&gt; &lt;a href="#sharpness-of-the-constant"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The constant $6/\pi^2$ is the best possible. To see this, consider the truncated
sequence $\alpha^{(N)}$ defined by $\alpha^{(N)}_{2l+1} = 1/(l+1)$ for
$l = 0, 1, \ldots, N-1$ and $\alpha^{(N)}_n = 0$ otherwise. Then:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The left-hand side equals $\displaystyle\sum_{l=0}^{N-1} \frac{1}{(l+1)^2} \to \frac{\pi^2}{6}$.&lt;/li&gt;
&lt;li&gt;The only non-zero contribution to the right-hand side comes from $k = 0$
(since all non-zero indices are odd, i.e. in $A_0$), giving
$\displaystyle\frac{6}{\pi^2}\left(\sum_{l=0}^{N-1} \frac{1}{(l+1)^2}\right)^2 \to \frac{6}{\pi^2} \cdot \frac{\pi^4}{36} = \frac{\pi^2}{6}$.&lt;/li&gt;
&lt;/ul&gt;
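&lt;p&gt;The ratio of the right-hand side to the left-hand side therefore tends to $1$ as
$N \to \infty$, so no larger constant than $6/\pi^2$ can hold universally. Equality
is in fact attained: the limiting sequence $\alpha_{2l+1} = 1/(l+1)$, $\alpha_n = 0$ for even $n$,
belongs to $\ell^2(\mathbb{C})$ (its squared norm is $\pi^2/6$) and makes both sides equal to
$\pi^2/6$. More generally, equality holds exactly when each $y_k$ is a scalar multiple of $x$,
which is the equality case of Cauchy–Schwarz.&lt;/p&gt;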
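&lt;p&gt;A quick numerical look at the truncated family makes the convergence visible. In this
sketch the function name &lt;code&gt;ratio_for_truncation&lt;/code&gt; is hypothetical, and the closed
forms for both sides are the ones computed in the two bullet points above.&lt;/p&gt;

```python
import math

def ratio_for_truncation(N):
    """rhs/lhs for alpha_{2l+1} = 1/(l+1), l = 0..N-1, and zero elsewhere."""
    s = sum(1 / (l + 1) ** 2 for l in range(N))  # partial Basel sum
    lhs = s                                      # sum of |alpha_n|^2
    rhs = 6 / math.pi**2 * s**2                  # only the layer k = 0 contributes
    return rhs / lhs

for N in (10, 100, 1000, 10000):
    print(N, ratio_for_truncation(N))
```

&lt;p&gt;The ratio equals $\tfrac{6}{\pi^2}\sum_{l=0}^{N-1}(l+1)^{-2}$, so it increases with $N$
and approaches $1$ from below.&lt;/p&gt;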
&lt;hr&gt;
&lt;h2 class="heading" id="extension-to-ellp"&gt;
 Extension to $\ell^p$&lt;span class="heading__anchor"&gt; &lt;a href="#extension-to-ellp"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The Cauchy–Schwarz inequality used above is a special case of Hölder&amp;rsquo;s inequality,
and the proof generalises immediately.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Ibragimov–Salimova, 2015)&lt;/span&gt;
&lt;p&gt;Let $p, q \in (1,\infty)$ with $\tfrac{1}{p} + \tfrac{1}{q} = 1$. For all
$\alpha = (\alpha_1, \alpha_2, \ldots) \in \ell^p(\mathbb{C})$ and
$x = (x_0, x_1, \ldots) \in \ell^q(\mathbb{C})$,
$$\sum_{n \geq 1} |\alpha_n|^p \geq \left(\sum_{l \geq 0} |x_l|^q\right)^{-p/q}
\sum_{k \geq 0} \left|\sum_{l \geq 0} x_l\, \alpha_{2^k(2l+1)}\right|^p.$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Retkes&amp;rsquo;s original inequality is the case $p = q = 2$ and $x_l = 1/(l+1)$, where
$(\sum_{l\geq 0}|x_l|^2)^{-1} = 6/\pi^2$ by the Basel problem.&lt;/p&gt;
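&lt;p&gt;One way to see why the generalisation is immediate is to repeat the two steps of the
$\ell^2$ proof with Hölder in place of Cauchy–Schwarz. This is a sketch, reusing the notation
$y_k = (\alpha_{2^k(2l+1)})_{l \geq 0}$ from the proof above and assuming $x \neq 0$. For each $k$,
$$\left|\sum_{l \geq 0} x_l\, \alpha_{2^k(2l+1)}\right|^p \leq \|x\|_q^p\, \|y_k\|_p^p
= \left(\sum_{l \geq 0} |x_l|^q\right)^{p/q} \sum_{l \geq 0} |\alpha_{2^k(2l+1)}|^p.$$
Summing over $k \geq 0$ and using that the sets $A_k$ partition $\mathbb{N}^+$,
$$\sum_{k \geq 0} \left|\sum_{l \geq 0} x_l\, \alpha_{2^k(2l+1)}\right|^p
\leq \left(\sum_{l \geq 0} |x_l|^q\right)^{p/q} \sum_{n \geq 1} |\alpha_n|^p,$$
and dividing by the first factor on the right gives the stated inequality.&lt;/p&gt;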
&lt;hr&gt;
&lt;h2 class="heading" id="remarks-on-structure"&gt;
 Remarks on Structure&lt;span class="heading__anchor"&gt; &lt;a href="#remarks-on-structure"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;The role of the dyadic partition.&lt;/strong&gt; The sets $A_k$ are the &lt;em&gt;dyadic layers&lt;/em&gt; of
$\mathbb{N}^+$: each integer sits in exactly one layer determined by its 2-adic
valuation. This structure also appears in the theory of Hardy spaces, where the
dyadic martingale decomposition underpins the $H^1$–BMO duality, and in wavelets,
where the dyadic scaling of the real line organises the multiresolution analysis.
The inequality can be read as a norm comparison between the $\ell^2$ norm and a
weighted sum over dyadic layers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Relation to the Basel problem.&lt;/strong&gt; The constant $6/\pi^2$, the reciprocal of
$\zeta(2)$, appears here because the weight sequence $1/(l+1)$ used in the inner
sum is precisely the harmonic sequence, whose $\ell^2$ norm squared is $\zeta(2)$.
Any other weight sequence $x \in \ell^2(\mathbb{C})$ would produce the analogous
inequality with $\|x\|_2^{-2}$ in place of $6/\pi^2$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The inequality as a rearrangement estimate.&lt;/strong&gt; The right-hand side reorganises the
entries of $\alpha$ by their dyadic layer and applies a weighted average within each
layer. The inequality says the total $\ell^2$ energy cannot be less than $6/\pi^2$
times the energy of this rearranged, averaged version of the sequence, a
quantitative statement about how averaging destroys energy.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="further-questions"&gt;
 Further Questions&lt;span class="heading__anchor"&gt; &lt;a href="#further-questions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;While the original conjecture is settled, several natural variants remain.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #8e44ad; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#8e44ad; font-weight:bold;"&gt;Question 1&lt;/span&gt;
&lt;p&gt;What is the sharp constant in the inequality if the dyadic partition is replaced by
the partition induced by a prime $p \neq 2$, i.e. by the sets
$A_k^{(p)} = \{p^k m : \gcd(m, p) = 1\}$? The same argument applies with
$x_l = w_l$ for any weight sequence $w \in \ell^2(\mathbb{C})$, but the resulting
constant depends on $\|w\|_2$ and the choice of weight, not on $\pi$.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #8e44ad; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#8e44ad; font-weight:bold;"&gt;Question 2&lt;/span&gt;
&lt;p&gt;The inner sum $\sum_{l \geq 0} \alpha_{2^k(2l+1)}/(l+1)$ averages the entries in
layer $A_k$ with the harmonic weights. What happens if the harmonic weight $1/(l+1)$
is replaced by a weight $w(l)$ depending on the position $l$ within the layer in a
more general way, for instance $w(l) = (l+1)^{-s}$ for $s &amp;gt; 1/2$? The sharp constant
would then involve $\zeta(2s)$ instead of $\zeta(2) = \pi^2/6$.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #8e44ad; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#8e44ad; font-weight:bold;"&gt;Question 3&lt;/span&gt;
&lt;p&gt;For $p = 1$ the conjugate exponent is $q = \infty$, which lies outside the range
$p, q \in (1,\infty)$ of the Ibragimov–Salimova theorem, and the Hölder
inequality takes a different form. Does an analogue of Retkes&amp;rsquo;s inequality hold for
$\alpha \in \ell^1(\mathbb{C})$, and if so, what is the sharp constant?&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Ibragimov, Z. O. &amp;amp; Salimova, D. F. (2015). On an inequality in $\ell_p(\mathbb{C})$ involving Basel problem. &lt;em&gt;Elemente der Mathematik&lt;/em&gt;, &lt;strong&gt;70&lt;/strong&gt;(2), 79–81. &lt;a href="https://ems.press/content/serial-article-files/45532"&gt;https://ems.press/content/serial-article-files/45532&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Retkes, Z. (2012). Inequality for square summable complex series. &lt;em&gt;Open Problem Garden&lt;/em&gt;. &lt;a href="http://www.openproblemgarden.org/op/inequality_for_square_summable_complex_series"&gt;http://www.openproblemgarden.org/op/inequality_for_square_summable_complex_series&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Benko, D. &amp;amp; Molokach, J. (2013). The Basel problem as a rearrangement of series. &lt;em&gt;College Mathematics Journal&lt;/em&gt;, &lt;strong&gt;44&lt;/strong&gt;(3), 171–176.&lt;/li&gt;
&lt;li&gt;Ritelli, D. (2013). Another proof of $\zeta(2) = \pi^2/6$ using double integrals. &lt;em&gt;American Mathematical Monthly&lt;/em&gt;, &lt;strong&gt;120&lt;/strong&gt;(7), 642–645.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Recent Advances in Neural Network Optimization for LLM Training</title><link>https://blog.namln.org/en/posts/llm-optimization-2025-survey/</link><pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/llm-optimization-2025-survey/</guid><description>&lt;p&gt;The optimization landscape for LLM training looks very different from two years
ago. AdamW still dominates production runs, but a wave of research is eroding
that dominance from multiple angles simultaneously: matrix-aware optimizers,
horizon-free schedulers, a sharply revised understanding of µP, and
communication-efficient distributed methods. This post synthesizes 18 recent
papers across five interconnected fronts.&lt;/p&gt;
&lt;p&gt;The unifying thread is an active re-examination of long-held assumptions, from
whether gradient geometry matters, to what µP is actually doing, to whether
weight decay is a regularizer at all.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="1-muon-and-non-euclidean-optimizers"&gt;
 1. Muon and Non-Euclidean Optimizers&lt;span class="heading__anchor"&gt; &lt;a href="#1-muon-and-non-euclidean-optimizers"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Muon&lt;/strong&gt; (&lt;em&gt;&lt;strong&gt;M&lt;/strong&gt;oment&lt;strong&gt;U&lt;/strong&gt;m &lt;strong&gt;O&lt;/strong&gt;rthogonalized by &lt;strong&gt;N&lt;/strong&gt;ewton-Schulz&lt;/em&gt;) orthogonalizes the
momentum-averaged gradient of each weight matrix via a Newton-Schulz iteration before each weight
update. Rather than treating each parameter as an independent scalar (as Adam
does), Muon recognizes that weight matrices have geometric structure and
optimizes them accordingly, performing steepest descent under the &lt;strong&gt;spectral
norm&lt;/strong&gt;.&lt;/p&gt;
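&lt;p&gt;To make the orthogonalization target concrete, here is the underlying linear-algebra
fact, stated as a sketch; the symbols $G$, $U$, $\Sigma$, $V$, $\Delta$ and the name
$\mathrm{Ortho}$ are introduced only for this illustration. If the matrix being
orthogonalized has the reduced singular value decomposition $G = U \Sigma V^\top$, then
$$\mathrm{Ortho}(G) = U V^\top,$$
that is, $G$ with every nonzero singular value replaced by $1$. This is the direction
singled out by a spectral-norm constraint, since the maximum in
$$\max_{\|\Delta\|_{\mathrm{op}} \leq 1} \langle G, \Delta \rangle, \qquad
\langle G, \Delta \rangle = \operatorname{tr}(G^\top \Delta),$$
is attained at $\Delta = U V^\top$, and the descent step moves along $-U V^\top$.&lt;/p&gt;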
&lt;p&gt;The core Newton-Schulz iteration, which runs stably in &lt;code&gt;bfloat16&lt;/code&gt; on tensor
cores, is:&lt;/p&gt;
&lt;p&gt;$$
X \leftarrow aX + b(XX^\top)X + c(XX^\top)^2 X
$$&lt;/p&gt;
&lt;p&gt;with coefficients $a = 3.4445$, $b = -4.7750$, $c = 2.0315$. In PyTorch:&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;newtonschulz5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-7&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;3.4445&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;4.7750&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;2.0315&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bfloat16&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;/=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;A ready-to-use implementation lives at
&lt;a href="https://github.com/KellerJordan/Muon"&gt;KellerJordan/Muon&lt;/a&gt;. Install via:&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pip install git+https://github.com/KellerJordan/Muon&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;Muon is intended for hidden-layer matrix weights only. Embeddings, the output
head, and scalar/vector parameters should still use AdamW:&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;muon&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MuonWithAuxAdam&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;hidden_matrix_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndim&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;embed&amp;#34;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;embed_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;embed&amp;#34;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scalar_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndim&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;head_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lm_head&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MuonWithAuxAdam&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="c1"&gt;# Muon group: hidden-layer weight matrices&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;hidden_matrix_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_muon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.02&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="c1"&gt;# Auxiliary AdamW group: embeddings, scalar/vector parameters, output head&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embed_params&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;scalar_params&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;head_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_muon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;         &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;3e-4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight_decay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# The Muon LR is designed with built-in muP scaling, so it should need little retuning as you scale up&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;h3 class="heading" id="scaling-muon-the-moonlight-result"&gt;
 Scaling Muon: the Moonlight result&lt;span class="heading__anchor"&gt; &lt;a href="#scaling-muon-the-moonlight-result"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;MoonshotAI&amp;rsquo;s &lt;strong&gt;Moonlight&lt;/strong&gt; (a Mixture-of-Experts model with 3B activated and 16B total parameters, trained on 5.7T tokens)
provides the strongest evidence yet that Muon scales to real LLM training
(&lt;a href="https://arxiv.org/abs/2502.16982"&gt;arXiv:2502.16982&lt;/a&gt;,
&lt;a href="https://github.com/MoonshotAI/Moonlight"&gt;GitHub&lt;/a&gt;). The paper identifies two changes as
crucial for making Muon work beyond small scale:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Weight decay:&lt;/strong&gt; without it, weight and output RMS norms grow until they
overflow &lt;code&gt;bfloat16&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Per-parameter update scale adjustment:&lt;/strong&gt; rescaling each matrix&amp;rsquo;s update so that its RMS
matches that of AdamW; for a weight of shape $A \times B$ the paper uses a factor of $0.2\sqrt{\max(A, B)}$.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With these in place, scaling-law experiments indicate roughly &lt;strong&gt;2× computational
efficiency&lt;/strong&gt; compared to AdamW at compute-optimal settings.&lt;/p&gt;
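&lt;p&gt;To make the update-scale adjustment concrete, here is a minimal pure-Python sketch rather than Moonlight&amp;rsquo;s actual code. It relies on one fact: a full-rank orthogonalized update of shape $A \times B$ has entry-wise RMS $1/\sqrt{\max(A, B)}$, so multiplying by $\sqrt{\max(A, B)}$ times a target RMS makes the update size independent of the matrix shape. The padded identity standing in for a real orthogonalized update, the helper names, and the &lt;code&gt;target_rms=0.2&lt;/code&gt; default are all illustrative assumptions.&lt;/p&gt;

```python
import math


def semi_orthogonal(rows, cols):
    """Return a rows x cols padded identity: a stand-in for an orthogonalized update."""
    return [[1.0 if i == j else 0.0 for j in range(cols)] for i in range(rows)]


def rms(matrix):
    """Root-mean-square over all entries of a matrix given as a list of rows."""
    total = sum(v * v for row in matrix for v in row)
    count = len(matrix) * len(matrix[0])
    return math.sqrt(total / count)


def rescale_update(update, target_rms=0.2):
    """Scale a semi-orthogonal update so its entry-wise RMS equals target_rms (assumed value)."""
    rows, cols = len(update), len(update[0])
    factor = target_rms * math.sqrt(max(rows, cols))
    return [[factor * v for v in row] for row in update]


update = semi_orthogonal(4, 16)
raw_rms = rms(update)                       # 1 / sqrt(max(4, 16))
matched_rms = rms(rescale_update(update))   # equals target_rms for any shape
```

&lt;p&gt;Here &lt;code&gt;raw_rms&lt;/code&gt; is 0.25 for the 4×16 example and &lt;code&gt;matched_rms&lt;/code&gt; is 0.2. Without the rescaling, wide and narrow matrices would receive updates of different effective size under a shared learning rate.&lt;/p&gt;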

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Train a Qwen-like dense model with Muon (from Moonlight repo)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;python3 examples/toy_train.py &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --model qwen --optimizer muon &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --dataset openwebtext-100k &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --hidden_size &lt;span class="m"&gt;896&lt;/span&gt; --lr 1e-3&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;A further efficiency variant is
&lt;a href="https://github.com/nil0x9/flash-muon"&gt;Flash-Muon&lt;/a&gt;, which reimplements the
Newton-Schulz inner loop using a custom Triton kernel that exploits the symmetry
of the $XX^\top$ product: only one triangle of the result has to be computed, roughly halving the work for that matrix multiplication.&lt;/p&gt;
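&lt;p&gt;The symmetry argument is easy to check in plain Python. The sketch below is not a Triton kernel and is not taken from Flash-Muon; it only counts row dot products to show that filling one triangle of $XX^\top$ and mirroring it gives the same matrix with $n(n+1)/2$ instead of $n^2$ dot products. The function names are hypothetical.&lt;/p&gt;

```python
def gram_full(x):
    """Compute X @ X.T entry by entry; returns (matrix, number of dot products)."""
    n = len(x)
    out = [[0.0] * n for _ in range(n)]
    dots = 0
    for i in range(n):
        for j in range(n):
            out[i][j] = sum(a * b for a, b in zip(x[i], x[j]))
            dots += 1
    return out, dots


def gram_symmetric(x):
    """Compute only the upper triangle of X @ X.T and mirror it into the lower one."""
    n = len(x)
    out = [[0.0] * n for _ in range(n)]
    dots = 0
    for i in range(n):
        for j in range(i, n):
            value = sum(a * b for a, b in zip(x[i], x[j]))
            out[i][j] = value
            out[j][i] = value
            dots += 1
    return out, dots


x = [[1.0, 2.0, 0.5], [0.0, -1.0, 3.0], [2.0, 2.0, 2.0], [1.5, 0.0, -1.0]]
full, full_dots = gram_full(x)
sym, sym_dots = gram_symmetric(x)
```

&lt;p&gt;For this 4-row example the full version uses 16 dot products and the symmetric one 10; the ratio approaches one half as the matrix grows. The other matrix products in a Newton-Schulz step are not reduced by this trick.&lt;/p&gt;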
&lt;h3 class="heading" id="theoretical-foundations"&gt;
 Theoretical foundations&lt;span class="heading__anchor"&gt; &lt;a href="#theoretical-foundations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Kovalev (2025)&lt;/strong&gt; shows in &lt;em&gt;Understanding Gradient Orthogonalization via
Non-Euclidean Trust-Region Optimization&lt;/em&gt; that the orthogonalized gradient update
can be interpreted as a first-order trust-region method where the trust-region is
defined in terms of the matrix spectral norm. This framework unifies Muon with
normalized SGD and signSGD with momentum.&lt;/p&gt;
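&lt;p&gt;As a sketch of what the trust-region view means in formulas (standard identities written in this post&amp;rsquo;s notation, not the paper&amp;rsquo;s own statement): let the gradient, or momentum buffer, have the SVD $G = U \Sigma V^\top$. Minimising the linearised loss over a spectral-norm ball of radius $\eta$ gives&lt;/p&gt;
&lt;p&gt;$$
\min_{\|\Delta\|_{\mathrm{op}} \leq \eta} \langle G, \Delta \rangle = -\eta \|G\|_*, \qquad \text{attained at } \Delta = -\eta\, U V^\top,
$$&lt;/p&gt;
&lt;p&gt;where $\|G\|_*$ is the nuclear norm (the sum of singular values). This is the orthogonalized update that Newton-Schulz approximates. Replacing the spectral norm by the Frobenius norm yields $\Delta = -\eta\, G / \|G\|_F$ (normalized SGD), and replacing it by the entry-wise max norm yields $\Delta = -\eta\, \mathrm{sign}(G)$ (signSGD).&lt;/p&gt;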
&lt;p&gt;&lt;strong&gt;Pethick et al. (2025)&lt;/strong&gt; propose &lt;strong&gt;Scion&lt;/strong&gt;, a family of LMO-based algorithms
that subsumes Muon, AdamW, and normalized SGD under a single framework
(&lt;a href="https://arxiv.org/abs/2502.07529"&gt;arXiv:2502.07529&lt;/a&gt;). By choosing an explicit
norm for deep architectures, Scion also achieves hyperparameter transferability
across model widths.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Polar Express&lt;/strong&gt; (Amsel et al., 2025) computes the polar factor with a polynomial
that is re-optimised at every iteration instead of the fixed Newton-Schulz polynomial: each step&amp;rsquo;s
coefficients solve a minimax problem that minimises the worst-case error. It converges faster than Newton-Schulz in both early and
asymptotic stages, while remaining numerically stable in &lt;code&gt;bfloat16&lt;/code&gt;.&lt;/p&gt;
&lt;h3 class="heading" id="challenging-the-geometric-narrative"&gt;
 Challenging the geometric narrative&lt;span class="heading__anchor"&gt; &lt;a href="#challenging-the-geometric-narrative"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Despite the theoretical appeal, &lt;strong&gt;Shumaylov et al. (2026)&lt;/strong&gt; mount a systematic
challenge in &lt;em&gt;Muon is Not That Special: Random or Inverted Spectra Work Just as
Well&lt;/em&gt;. They introduce:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Freon:&lt;/strong&gt; a family of optimizers based on Schatten (quasi-)norms,
interpolating between SGD and Muon. The best-performing Schatten parameter for
GPT-2 lies in the &lt;em&gt;quasi-norm&lt;/em&gt; regime, which no LMO-based optimizer can
represent.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kaon:&lt;/strong&gt; replaces Muon&amp;rsquo;s singular values with random noise, yet still
matches Muon&amp;rsquo;s validation loss on GPT-2.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Their key insight: performance is primarily controlled by two local quantities,
&lt;em&gt;alignment&lt;/em&gt; (how well the update direction aligns with the gradient) and &lt;em&gt;descent
potential&lt;/em&gt; (step-size optimality). Muon succeeds by guaranteeing step-size
optimality, not by tracking an ideal geometry.&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th style="text-align: left"&gt;Optimizer&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Core mechanism&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Key claim&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Muon&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Newton-Schulz orthogonalization&lt;/td&gt;
					&lt;td style="text-align: left"&gt;~2× efficiency over AdamW at compute-optimal&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Scion&lt;/td&gt;
					&lt;td style="text-align: left"&gt;LMO over norm-ball&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Unifies Muon/Adam; HP transferable across widths&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Polar Express&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Minimax polar decomposition&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Faster convergence; bfloat16-safe&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Freon / Kaon&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Schatten quasi-norms / random SVs&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Geometry is irrelevant; alignment drives performance&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="2-learning-rate-scheduling"&gt;
 2. Learning Rate Scheduling&lt;span class="heading__anchor"&gt; &lt;a href="#2-learning-rate-scheduling"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="linear-decay-is-provably-optimal"&gt;
 Linear decay is provably optimal&lt;span class="heading__anchor"&gt; &lt;a href="#linear-decay-is-provably-optimal"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Defazio et al. (2023/2024)&lt;/strong&gt; close a long-standing gap between theory and
practice in &lt;em&gt;Optimal Linear Decay Learning Rate Schedules and Further
Refinements&lt;/em&gt; (&lt;a href="https://arxiv.org/abs/2310.07831"&gt;arXiv:2310.07831&lt;/a&gt;). Under
worst-case analysis, &lt;strong&gt;linear decay&lt;/strong&gt;, setting $\eta_t \propto (1 - t/T)$, is
the theoretically optimal schedule for a broad class of optimizers including SGD.
Across 10 diverse deep learning benchmarks, it matches or outperforms commonly used default schedules, including cosine annealing.&lt;/p&gt;
&lt;p&gt;$$
\eta_t = \eta_{\max} \cdot \left(1 - \frac{t}{T}\right)
$$&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# PyTorch built-in: decay linearly from the base LR to zero over total_steps&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scheduler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lr_scheduler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LinearLR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start_factor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_factor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_iters&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;total_steps&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;h3 class="heading" id="the-wsd-cooldown-phase"&gt;
 The WSD cooldown phase&lt;span class="heading__anchor"&gt; &lt;a href="#the-wsd-cooldown-phase"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Warmup-Stable-Decay (WSD) scheduler splits training into three phases: a
warmup, a long stable phase at constant LR, and a short cooldown in which the LR
decays rapidly. &lt;strong&gt;Dremov et al. (2025)&lt;/strong&gt; analyse the cooldown phase
specifically in &lt;em&gt;Training Dynamics of the Cooldown Stage in WSD&lt;/em&gt;, finding:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Cooldown shapes that balance exploration and exploitation consistently
outperform purely exploratory or exploitative alternatives.&lt;/li&gt;
&lt;li&gt;There is substantial sensitivity to AdamW&amp;rsquo;s $\beta_2$ parameter during
cooldown, and &lt;strong&gt;higher $\beta_2$ values yield consistent improvements&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Loss-landscape visualisations support the &amp;ldquo;river valley&amp;rdquo; perspective: the
cooldown follows a narrow valley in parameter space.&lt;/li&gt;
&lt;/ul&gt;
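&lt;p&gt;For reference, a WSD schedule fits in a few lines. The sketch below assumes a linear warmup and a linear cooldown to zero, which is only one of the cooldown shapes such a study can compare; the function and argument names are hypothetical.&lt;/p&gt;

```python
def wsd_lr(step, total_steps, peak_lr, warmup_steps, cooldown_steps):
    """Warmup-Stable-Decay learning rate with linear warmup and linear cooldown to zero."""
    # Fraction of peak_lr allowed by the warmup ramp (reaches 1.0 after warmup_steps steps).
    warmup = min(1.0, (step + 1) / warmup_steps)
    # Fraction allowed by the cooldown ramp (1.0 until the last cooldown_steps steps).
    cooldown = max(0.0, min(1.0, (total_steps - step) / cooldown_steps))
    return peak_lr * min(warmup, cooldown)


schedule = [wsd_lr(s, total_steps=1000, peak_lr=1e-3, warmup_steps=100, cooldown_steps=200)
            for s in range(1001)]
```

&lt;p&gt;With these settings the LR ramps up over the first 100 steps, stays at &lt;code&gt;peak_lr&lt;/code&gt; until step 800, and then falls linearly to zero at step 1000. In PyTorch the same function could be wrapped in &lt;code&gt;torch.optim.lr_scheduler.LambdaLR&lt;/code&gt; by returning the multiplier instead of the absolute LR.&lt;/p&gt;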
&lt;h3 class="heading" id="convex-theory-meets-llm-practice"&gt;
 Convex theory meets LLM practice&lt;span class="heading__anchor"&gt; &lt;a href="#convex-theory-meets-llm-practice"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Schaipp et al. (2025)&lt;/strong&gt; show in &lt;em&gt;The Surprising Agreement Between Convex
Optimization Theory and Learning-Rate Scheduling for Large Model Training&lt;/em&gt; that
schedules for large model training obey performance bounds from non-smooth convex
optimisation. For the constant schedule with linear cooldown, the bound is:&lt;/p&gt;
&lt;p&gt;$$
\bar{f}_T - f^* \leq \frac{\|x_0 - x^*\|^2}{2\eta T} + \frac{\eta}{2} \sum_{t=0}^{T-1} \sigma_t^2
$$&lt;/p&gt;
&lt;p&gt;where the cooldown benefit appears explicitly through the absence of logarithmic
terms. This enables &lt;strong&gt;principled LR transfer&lt;/strong&gt;: exploiting the theory yields
noticeable validation loss improvements for 124M and 210M Llama-type models when
extending schedules for continued training.&lt;/p&gt;
&lt;h3 class="heading" id="anytime-schedules-and-weight-averaging"&gt;
 Anytime schedules and weight averaging&lt;span class="heading__anchor"&gt; &lt;a href="#anytime-schedules-and-weight-averaging"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Meterez et al. (2026)&lt;/strong&gt; prove in &lt;em&gt;Anytime Pretraining: Horizon-Free
Learning-Rate Schedules with Weight Averaging&lt;/em&gt;
(&lt;a href="https://arxiv.org/abs/2602.03702"&gt;arXiv:2602.03702&lt;/a&gt;) that horizon-free (anytime)
schedules exist for overparameterised linear regression, with &lt;strong&gt;weight averaging&lt;/strong&gt;
central to achieving minimax-optimal convergence. For models with 150M–300M parameters trained at
1–32× Chinchilla scale, a constant LR with weight averaging matches well-tuned
cosine decay across the full training duration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Weight averaging is a largely underutilised practical lever. It should be a
default, not an afterthought.&lt;/p&gt;
&lt;/blockquote&gt;
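&lt;p&gt;As a minimal illustration of what weight averaging computes, the sketch below keeps a uniform running average over a sequence of weight vectors. The uniform weighting is an assumption made for simplicity (the averaging scheme analysed in the paper may differ), and the names are hypothetical.&lt;/p&gt;

```python
def update_average(avg, weights, num_averaged):
    """Fold a new weight vector into a uniform average of num_averaged earlier vectors."""
    coef = 1.0 / (num_averaged + 1)
    return [a + coef * (w - a) for a, w in zip(avg, weights)]


# Three toy "checkpoints" of a two-parameter model.
checkpoints = [[1.0, 4.0], [3.0, 0.0], [5.0, 2.0]]

avg = list(checkpoints[0])
for k, weights in enumerate(checkpoints[1:], start=1):
    avg = update_average(avg, weights, k)
```

&lt;p&gt;After the loop &lt;code&gt;avg&lt;/code&gt; equals the element-wise mean of the three checkpoints, without ever storing them all at once. For real models, PyTorch ships &lt;code&gt;torch.optim.swa_utils.AveragedModel&lt;/code&gt;, which maintains this kind of average over an &lt;code&gt;nn.Module&lt;/code&gt;.&lt;/p&gt;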
&lt;h3 class="heading" id="schedulefree-at-llm-scale"&gt;
 ScheduleFree+ at LLM scale&lt;span class="heading__anchor"&gt; &lt;a href="#schedulefree-at-llm-scale"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Defazio (2026)&lt;/strong&gt; extends schedule-free learning to full LLM pretraining in
&lt;em&gt;ScheduleFree+: Scaling Learning-Rate-Free and Schedule-Free Learning to Large
Language Models&lt;/em&gt; (&lt;a href="https://arxiv.org/abs/2605.19095"&gt;arXiv:2605.19095&lt;/a&gt;).
Practical fixes for large batch and model sizes enable ScheduleFree+ to achieve
a &lt;strong&gt;31% improvement&lt;/strong&gt; over WSD schedules at 1000 tokens per parameter, while
also providing a theoretical foundation for checkpoint merging during pretraining.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pip install schedulefree&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;
&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;schedulefree&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AdamWScheduleFree&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AdamWScheduleFree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;warmup_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
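&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Keep the optimizer in train mode while taking training steps&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;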
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Must switch to eval mode before evaluation&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;val_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;GitHub: &lt;a href="https://github.com/facebookresearch/schedule_free"&gt;facebookresearch/schedule_free&lt;/a&gt;&lt;/p&gt;
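&lt;p&gt;To see why the train/eval switch exists, it helps to write out the basic Schedule-Free SGD recursion on a toy problem. This is a sketch of the original schedule-free idea, not of ScheduleFree+ and not the package&amp;rsquo;s implementation: gradients are taken at an interpolated point &lt;code&gt;y&lt;/code&gt;, while the point you evaluate is the running average &lt;code&gt;x&lt;/code&gt;. The function name, the scalar quadratic, and the hyperparameter values are assumptions chosen for illustration.&lt;/p&gt;

```python
def schedule_free_sgd(grad, w0, lr=0.1, beta=0.9, steps=2000):
    """Schedule-Free SGD on a scalar: z is the base iterate, x its running average."""
    z = w0
    x = w0
    y = w0
    for t in range(1, steps + 1):
        y = (1 - beta) * z + beta * x   # training point: gradients are evaluated here
        z = z - lr * grad(y)            # plain SGD step on z with a constant LR
        c = 1.0 / t                     # uniform averaging weight
        x = (1 - c) * x + c * z         # evaluation point: average of the z iterates
    return x, y


# Minimise f(w) = 0.5 * (w - 3)^2, whose gradient is w - 3.
x_final, y_final = schedule_free_sgd(lambda w: w - 3.0, w0=0.0)
```

&lt;p&gt;Both sequences approach the minimiser at 3 with no decay schedule at all. In the library, &lt;code&gt;optimizer.train()&lt;/code&gt; and &lt;code&gt;optimizer.eval()&lt;/code&gt; switch the stored parameters between the &lt;code&gt;y&lt;/code&gt; and &lt;code&gt;x&lt;/code&gt; sequences, which is why evaluating without calling &lt;code&gt;eval()&lt;/code&gt; measures the wrong weights.&lt;/p&gt;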
&lt;hr&gt;
&lt;h2 class="heading" id="3-hyperparameter-transfer-and-scaling-laws-µp"&gt;
 3. Hyperparameter Transfer and Scaling Laws (µP)&lt;span class="heading__anchor"&gt; &lt;a href="#3-hyperparameter-transfer-and-scaling-laws-%c2%b5p"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="weight-decay-as-the-true-driver-of-lr-transfer"&gt;
 Weight decay as the true driver of LR transfer&lt;span class="heading__anchor"&gt; &lt;a href="#weight-decay-as-the-true-driver-of-lr-transfer"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Maximal Update Parameterisation (µP) is widely used to transfer optimal
learning rates from proxy models to large ones without re-tuning. &lt;strong&gt;Kosson et al.
(2025/2026)&lt;/strong&gt;, accepted to ICLR 2026, provide a large-scale empirical refutation
of the standard µP narrative in &lt;em&gt;Weight Decay May Matter More than µP for
Learning Rate Transfer in Practice&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Their finding: µP&amp;rsquo;s geometric alignment assumptions, which require alignment
between a layer&amp;rsquo;s inputs, weights, and gradient updates, hold only &lt;strong&gt;briefly at
the start of training&lt;/strong&gt;. For the remainder, it is &lt;strong&gt;weight decay&lt;/strong&gt; that
stabilises update dynamics across widths and facilitates LR transfer. This
implies µP&amp;rsquo;s scaling primarily acts as an implicit warmup, and can be largely
replaced by modified warmup schedules.&lt;/p&gt;
&lt;h3 class="heading" id="embedding-layer-lr-as-the-key-factor"&gt;
 Embedding layer LR as the key factor&lt;span class="heading__anchor"&gt; &lt;a href="#embedding-layer-lr-as-the-key-factor"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Kalra &amp;amp; Barkeshli (2026)&lt;/strong&gt; provide complementary evidence in &lt;em&gt;Quantifying
Hyperparameter Transfer and the Importance of Embedding Layer Learning Rate&lt;/em&gt;,
tracing µP&amp;rsquo;s advantage over standard parameterisation (SP) to a single factor:
the &lt;strong&gt;embedding layer learning rate&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In SP, the embedding LR acts as a training bottleneck. Simply scaling it up in
proportion to model width, as µP does, eliminates most of the gap. The comparison
rests on three quantitative metrics: quality of scaling-law fit, robustness to
extrapolation errors, and asymptotic loss penalty.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;span class="lnt"&gt;9
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Simple fix that captures most of µP&amp;#39;s benefit in SP&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;embed_lr_multiplier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model_width&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;base_width&lt;/span&gt; &lt;span class="c1"&gt;# = d_model / d_model_proxy&lt;/span&gt;
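&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Everything outside the embedding table keeps the base LR&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;non_embed_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;embed.&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;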
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;param_groups&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;params&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;lr&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;base_lr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;embed_lr_multiplier&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;params&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;non_embed_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;lr&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;base_lr&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AdamW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param_groups&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight_decay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;&lt;strong&gt;Open question:&lt;/strong&gt; Kosson et al. argue µP acts as an implicit warmup; Kalra &amp;amp;
Barkeshli argue it is about the embedding LR. Both contradict µP&amp;rsquo;s original
geometric motivation. No consensus has emerged, and the practical implications
differ significantly.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="4-normalization-weight-decay-and-variance-reduction"&gt;
 4. Normalization, Weight Decay, and Variance Reduction&lt;span class="heading__anchor"&gt; &lt;a href="#4-normalization-weight-decay-and-variance-reduction"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="the-end-of-training-gradient-spike"&gt;
 The end-of-training gradient spike&lt;span class="heading__anchor"&gt; &lt;a href="#the-end-of-training-gradient-spike"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Defazio (2025)&lt;/strong&gt; identifies a subtle pathology in &lt;em&gt;Why Gradients Rapidly
Increase Near the End of Training&lt;/em&gt;: gradient norms spike sharply near the end of
long LLM runs. The diagnosis is a three-way interaction between &lt;strong&gt;weight decay&lt;/strong&gt;,
&lt;strong&gt;normalisation layers&lt;/strong&gt;, and the &lt;strong&gt;LR schedule&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;When a layer is followed by normalisation, its scale becomes irrelevant to the
forward pass, but weight decay continues shrinking the parameters. This creates
an implicit competition between the optimizer&amp;rsquo;s effective update size and
normalisation rescaling, causing gradient norms to grow unchecked as the LR
decays.&lt;/p&gt;
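&lt;p&gt;The scale-invariance behind this can be checked numerically. The sketch below is not from the paper: it uses a hypothetical toy loss that sees the weights only through their direction, and a finite-difference gradient, to show that shrinking the weights leaves the loss unchanged while the gradient norm grows in inverse proportion.&lt;/p&gt;

```python
import math


def normalised_loss(w, target):
    """Toy loss that depends on w only through w / ||w||, as a normalised layer would."""
    norm = math.sqrt(sum(x * x for x in w))
    unit = [x / norm for x in w]
    return sum((u - t) ** 2 for u, t in zip(unit, target))


def grad_norm(w, target, eps=1e-6):
    """Central finite-difference estimate of the gradient norm at w."""
    total = 0.0
    for i in range(len(w)):
        up = list(w)
        up[i] += eps
        down = list(w)
        down[i] -= eps
        g_i = (normalised_loss(up, target) - normalised_loss(down, target)) / (2 * eps)
        total += g_i * g_i
    return math.sqrt(total)


# Hypothetical values, chosen only for illustration.
target = [0.0, 1.0, 0.0]
w = [1.0, 2.0, -1.0]
half_w = [0.5 * x for x in w]  # what weight decay does over time: shrink the scale

loss_gap = abs(normalised_loss(w, target) - normalised_loss(half_w, target))
ratio = grad_norm(half_w, target) / grad_norm(w, target)
# loss_gap is essentially zero, while ratio is close to 2:
# halving the weights doubles the gradient norm.
```

&lt;p&gt;Weight decay keeps pushing the scale down without changing the forward pass, so the gradient norm can keep rising once the learning rate is too small to counteract it.&lt;/p&gt;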
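&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; disable weight decay for AdamW-updated layers that are directly
followed by normalisation, which in a typical transformer occurs in every block.
Per-parameter-group settings make this a small change; the name filter below is
illustrative and has to be adapted so that the affected weight matrices end up in
the no-decay group:&lt;/p&gt;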

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;no_wd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;norm&amp;#34;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;embed&amp;#34;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndim&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;no_wd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AdamW&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;params&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;wd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;weight_decay&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;params&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;no_wd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;weight_decay&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;3e-4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;Defazio reports that this both eliminates the spike and lowers the loss
throughout training. The same analysis motivates disabling weight decay for the
AdamW-updated layers in setups such as modded-nanoGPT.&lt;/p&gt;
&lt;h3 class="heading" id="weight-normalisation-as-an-alternative"&gt;
 Weight normalisation as an alternative&lt;span class="heading__anchor"&gt; &lt;a href="#weight-normalisation-as-an-alternative"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Nemotron-Flash&lt;/strong&gt; (Fu et al., NeurIPS 2025) investigates weight
normalisation as a practical mechanism in small language models, finding that it
enables more effective weight updates and improves final convergence. Weight
normalisation sidesteps the weight-decay/normalisation interaction described
above, though it may still end at a slightly worse final loss than a carefully
tuned baseline.&lt;/p&gt;
&lt;h3 class="heading" id="mars-variance-reduction-meets-preconditioned-gradients"&gt;
 MARS: variance reduction meets preconditioned gradients&lt;span class="heading__anchor"&gt; &lt;a href="#mars-variance-reduction-meets-preconditioned-gradients"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Despite decades of theoretical work, variance reduction has largely failed to
yield practical gains in deep learning. &lt;strong&gt;Yuan et al. (2024/2025)&lt;/strong&gt; attempt to
change this in &lt;em&gt;MARS: Unleashing the Power of Variance Reduction for Training
Large Models&lt;/em&gt;, proposing a unified framework that reconciles AdamW, Lion, and
Shampoo with variance reduction via a &lt;strong&gt;scaled stochastic recursive momentum&lt;/strong&gt;
technique.&lt;/p&gt;
&lt;p&gt;The reported GPT-2 training results look strong. However, the comprehensive benchmark by
&lt;strong&gt;Semenov et al. (2025)&lt;/strong&gt;, &lt;em&gt;Benchmarking Optimizers for Large Language Model
Pretraining&lt;/em&gt;, a 73-page study with 44 figures and 48 tables across
standardised scenarios, finds that &lt;strong&gt;MARS does not work well with small batch
sizes&lt;/strong&gt;, which limits its practical applicability in memory-constrained settings.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This underscores the danger of evaluating optimizers on a single benchmark
setup: MARS looks excellent at the batch sizes used in the original paper and
brittle elsewhere.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="5-distributed-training-diloco-and-its-descendants"&gt;
 5. Distributed Training: DiLoCo and Its Descendants&lt;span class="heading__anchor"&gt; &lt;a href="#5-distributed-training-diloco-and-its-descendants"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;DiLoCo (Distributed Low-Communication training) uses AdamW as an &lt;em&gt;inner&lt;/em&gt;
optimizer for $H$ local steps on each worker (typically $H = 500$), then
synchronises by applying Nesterov momentum to the &lt;strong&gt;pseudo-gradient&lt;/strong&gt;: the net
change in parameters accumulated over those inner steps, averaged across workers.
Because workers communicate only once every $H$ steps, synchronisation is $H$ times
less frequent than in standard data-parallel training (500× at $H = 500$).&lt;/p&gt;
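&lt;p&gt;The outer update can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the reference implementation: parameters are a flat list of floats, the inner optimizer is plain SGD standing in for AdamW, and the helper names &lt;code&gt;inner_steps&lt;/code&gt;, &lt;code&gt;outer_step&lt;/code&gt; and &lt;code&gt;make_grad_fn&lt;/code&gt; are hypothetical. The Nesterov form follows the update rule of &lt;code&gt;torch.optim.SGD&lt;/code&gt; with &lt;code&gt;nesterov=True&lt;/code&gt;.&lt;/p&gt;

```python
def inner_steps(params, grad_fn, lr, num_steps):
    """Stand-in for the inner optimizer: plain SGD here, AdamW in DiLoCo proper."""
    local = list(params)
    for _ in range(num_steps):
        grads = grad_fn(local)
        local = [p - lr * g for p, g in zip(local, grads)]
    return local


def outer_step(global_params, worker_params, momentum_buf, outer_lr=0.7, mu=0.9):
    """One synchronisation: Nesterov momentum applied to the averaged pseudo-gradient."""
    num_workers = len(worker_params)
    new_params, new_buf = [], []
    for i, theta in enumerate(global_params):
        # Pseudo-gradient: old global value minus the average of the workers' values.
        delta = theta - sum(w[i] for w in worker_params) / num_workers
        buf = mu * momentum_buf[i] + delta
        # Nesterov look-ahead: step along delta + mu * buf rather than buf alone.
        new_params.append(theta - outer_lr * (delta + mu * buf))
        new_buf.append(buf)
    return new_params, new_buf


def make_grad_fn(centre):
    """Gradient of the toy objective 0.5 * (x - centre)^2 for one worker's data."""
    return lambda ps: [p - centre for p in ps]


global_params = [0.0]
momentum_buf = [0.0]
worker_grads = [make_grad_fn(1.0), make_grad_fn(3.0)]

for _ in range(60):  # 60 outer rounds, each with 5 local steps per worker
    workers = [inner_steps(global_params, g, lr=0.1, num_steps=5) for g in worker_grads]
    global_params, momentum_buf = outer_step(global_params, workers, momentum_buf)
# global_params[0] ends close to 2.0, the average of the two workers' optima.
```

&lt;p&gt;With a single worker the same loop reduces to the non-distributed variant, in which the outer step acts purely as a Lookahead-style wrapper around the inner optimizer.&lt;/p&gt;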
&lt;h3 class="heading" id="opendiloco-the-open-source-foundation"&gt;
 OpenDiLoCo: the open-source foundation&lt;span class="heading__anchor"&gt; &lt;a href="#opendiloco-the-open-source-foundation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;PrimeIntellect&amp;rsquo;s
&lt;a href="https://github.com/PrimeIntellect-ai/OpenDiloco"&gt;OpenDiLoCo&lt;/a&gt; provides a
reproducible drop-in implementation, demonstrated training across two continents
and three countries with 90–95% compute utilisation. It later served as the
foundation for INTELLECT-1, a 10B-parameter model trained globally.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;functools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;open_diloco.hivemind_diloco&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DiLoCoOptimizer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;inner_optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AdamW&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;4e-4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;outer_optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SGD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;momentum&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nesterov&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DiLoCoOptimizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;dht&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dht&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;num_inner_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# sync every 500 steps, 500× fewer communications&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;inner_optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;inner_optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;outer_optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;outer_optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;h3 class="heading" id="why-diloco-works-on-a-single-node-snoo"&gt;
 Why DiLoCo works on a single node: SNOO&lt;span class="heading__anchor"&gt; &lt;a href="#why-diloco-works-on-a-single-node-snoo"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Kallusky et al. (2025)&lt;/strong&gt; show in &lt;em&gt;SNOO: Step-K Nesterov Outer Optimizer&lt;/em&gt; that
DiLoCo&amp;rsquo;s effectiveness, even on a single node, stems from applying &lt;strong&gt;Nesterov
momentum to the pseudo-gradient&lt;/strong&gt;. Their method isolates this as a standalone
Lookahead variant. Results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Compute-efficiency gains of 1.5–2.5×&lt;/strong&gt;, at scales up to $10^{23}$ training FLOPs.&lt;/li&gt;
&lt;li&gt;Improvements &lt;em&gt;increase&lt;/em&gt; with model size.&lt;/li&gt;
&lt;li&gt;Compatible with both AdamW and Muon as inner optimizers.&lt;/li&gt;
&lt;li&gt;Minimal memory overhead.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Single-worker DiLoCo achieves speedups of up to &lt;strong&gt;6.32%&lt;/strong&gt; in steps to
target loss over AdamW on a 160M Llama model.&lt;/p&gt;
&lt;h3 class="heading" id="smoothing-diloco-generalized-primal-averaging-gpa"&gt;
 Smoothing DiLoCo: Generalized Primal Averaging (GPA)&lt;span class="heading__anchor"&gt; &lt;a href="#smoothing-diloco-generalized-primal-averaging-gpa"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Defazio et al. (2025/2026)&lt;/strong&gt; propose &lt;strong&gt;GPA&lt;/strong&gt; in &lt;em&gt;Smoothing DiLoCo with Primal
Averaging for Faster Training of LLMs&lt;/em&gt;
(&lt;a href="https://arxiv.org/abs/2512.17131"&gt;arXiv:2512.17131&lt;/a&gt;), which decouples
DiLoCo&amp;rsquo;s interpolation constants to enable smooth iterate averaging at every
step, replacing uniform averaging with exponential moving averaging.&lt;/p&gt;
&lt;p&gt;GPA unifies single-worker DiLoCo and ScheduleFree within a single non-distributed
framework. Speedups over AdamW in steps-to-target-loss:&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th style="text-align: left"&gt;Model&lt;/th&gt;
					&lt;th style="text-align: right"&gt;Speedup&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Llama-160M&lt;/td&gt;
					&lt;td style="text-align: right"&gt;8.71%&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Llama-1B&lt;/td&gt;
					&lt;td style="text-align: right"&gt;10.13%&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Llama-8B&lt;/td&gt;
					&lt;td style="text-align: right"&gt;9.58%&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 class="heading" id="streaming-diloco-towards-free-distributed-training"&gt;
 Streaming DiLoCo: towards free distributed training&lt;span class="heading__anchor"&gt; &lt;a href="#streaming-diloco-towards-free-distributed-training"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Douillard et al. (2025)&lt;/strong&gt; address the remaining bottleneck in &lt;em&gt;Streaming
DiLoCo with Overlapping Communication: Towards a Distributed Free Lunch&lt;/em&gt;
(&lt;a href="https://arxiv.org/abs/2501.18512"&gt;arXiv:2501.18512&lt;/a&gt;): even with infrequent
synchronisation, each sync exchanges all parameters simultaneously. Three fixes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Streaming sync:&lt;/strong&gt; synchronise only subsets of parameters at a time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Overlapping communication:&lt;/strong&gt; continue training during synchronisation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Quantisation:&lt;/strong&gt; reduce cross-worker data to fewer bits.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Together, these reduce the required bandwidth by &lt;strong&gt;two orders of magnitude&lt;/strong&gt; while
maintaining comparable quality at billion-parameter scale.&lt;/p&gt;
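&lt;p&gt;A rough sketch of how the first and third ideas combine is given below. Everything specific in it is an assumption made for illustration rather than a figure from the paper: the hypothetical helpers &lt;code&gt;fragment_to_sync&lt;/code&gt; and &lt;code&gt;peak_bits_per_sync&lt;/code&gt;, equally sized fragments, an evenly staggered schedule with &lt;code&gt;sync_every&lt;/code&gt; divisible by the fragment count, and a 4-bit value width. Overlapping communication is not modelled.&lt;/p&gt;

```python
def fragment_to_sync(step, num_fragments, sync_every):
    """Return the index of the fragment due for synchronisation at this step, or None.

    Each fragment is synchronised once every sync_every steps, but the fragments
    are staggered evenly so that no two are exchanged at the same step.
    """
    stride = sync_every // num_fragments
    if stride == 0 or step % stride != 0:
        return None
    return (step // stride) % num_fragments


def peak_bits_per_sync(num_params, num_fragments, bits_per_value):
    """Bits one worker sends in its largest single exchange (equal fragments assumed)."""
    fragment_size = -(-num_params // num_fragments)  # ceiling division
    return fragment_size * bits_per_value


# Hypothetical numbers, chosen only for illustration.
num_params = 1_000_000_000
baseline = peak_bits_per_sync(num_params, num_fragments=1, bits_per_value=32)
streamed = peak_bits_per_sync(num_params, num_fragments=10, bits_per_value=4)
reduction = baseline / streamed  # 80.0: 10x from fragmenting times 8x from 4-bit values
```

&lt;p&gt;The number computed here is the size of the largest single exchange, not the total volume sent over a run: fragmenting spreads the same data over time, while quantisation actually shrinks it.&lt;/p&gt;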
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th style="text-align: left"&gt;Method&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Setting&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Key contribution&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Gain&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;SNOO&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Single-node&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Nesterov momentum on pseudo-gradient&lt;/td&gt;
					&lt;td style="text-align: left"&gt;1.5–2.5× FLOP efficiency&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;GPA&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Single-node&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Smooth iterate averaging; unifies DiLoCo + SF&lt;/td&gt;
					&lt;td style="text-align: left"&gt;~9% steps-to-loss&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Streaming DiLoCo&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Distributed&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Streaming sync + quantisation&lt;/td&gt;
					&lt;td style="text-align: left"&gt;~100× bandwidth reduction&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="6-cross-cutting-themes-and-open-questions"&gt;
 6. Cross-Cutting Themes and Open Questions&lt;span class="heading__anchor"&gt; &lt;a href="#6-cross-cutting-themes-and-open-questions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Several recurrent tensions emerge from reading these papers together.&lt;/p&gt;
&lt;h3 class="heading" id="geometry-vs-step-size-calibration-in-muon"&gt;
 Geometry vs. step-size calibration in Muon&lt;span class="heading__anchor"&gt; &lt;a href="#geometry-vs-step-size-calibration-in-muon"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Kovalev, Pethick et al., and Amsel et al. offer geometric explanations for
Muon&amp;rsquo;s success. Shumaylov et al. argue that geometry is practically irrelevant
and step-size optimality is the true driver. Which narrative guides future
research matters: geometry points toward more sophisticated matrix norms; the
step-size interpretation suggests much simpler paths to similar gains.&lt;/p&gt;
&lt;h3 class="heading" id="what-µp-is-actually-doing"&gt;
 What µP is actually doing&lt;span class="heading__anchor"&gt; &lt;a href="#what-%c2%b5p-is-actually-doing"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Kosson et al. argue µP is primarily an implicit warmup mechanism. Kalra &amp;amp;
Barkeshli argue it is essentially about the embedding layer LR. Both stand in
contrast to µP&amp;rsquo;s original geometric motivation. The practical stakes are high:
under the warmup interpretation, µP could be replaced by a schedule change;
under the embedding LR interpretation, by a one-line change to a single learning rate.&lt;/p&gt;
&lt;h3 class="heading" id="weight-decay-as-a-multi-role-hyperparameter"&gt;
 Weight decay as a multi-role hyperparameter&lt;span class="heading__anchor"&gt; &lt;a href="#weight-decay-as-a-multi-role-hyperparameter"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Weight decay appears as a protagonist in three independent stories in this
survey:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Defazio:&lt;/strong&gt; source of end-of-training gradient spikes via interaction with
normalisation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kosson et al.:&lt;/strong&gt; the true driver of LR transfer, not µP geometry.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kalra &amp;amp; Barkeshli:&lt;/strong&gt; improves scaling law fits but &lt;em&gt;hurts&lt;/em&gt; extrapolation
robustness.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is no longer tenable to treat weight decay as a simple regulariser with a
sensible default. It must be understood per-layer and in interaction with your
normalisation strategy.&lt;/p&gt;
&lt;h3 class="heading" id="diloco-as-the-practical-distributed-optimizer"&gt;
 DiLoCo as the practical distributed optimizer&lt;span class="heading__anchor"&gt; &lt;a href="#diloco-as-the-practical-distributed-optimizer"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Despite a large body of research on distributed optimizers, DiLoCo and its
derivatives appear to be the only methods that consistently add value beyond
simply scaling the batch size. The finding that its benefits carry over to
single-node settings (via SNOO and GPA) makes it a particularly important line
of work for practitioners at all scales.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="practical-recommendations-for-2026"&gt;
 Practical Recommendations for 2026&lt;span class="heading__anchor"&gt; &lt;a href="#practical-recommendations-for-2026"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Based on the convergence of evidence across these papers, for a new large
training run consider:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Optimizer:&lt;/strong&gt; Muon for hidden-layer matrix weights + AdamW for
embeddings/head. The Moonlight scaling fixes (weight decay + update scale
adjustment) are necessary above ~1B parameters.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Schedule:&lt;/strong&gt; ScheduleFree+ or linear decay instead of cosine. If you need a
fixed-horizon schedule, WSD with higher $\beta_2$ during cooldown.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Weight decay:&lt;/strong&gt; Disable it for layers directly followed by normalisation to
avoid end-of-training gradient spikes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Outer optimizer:&lt;/strong&gt; Wrap your training loop with single-worker DiLoCo (SNOO
or GPA) with no architectural changes. GPA reports roughly 9% fewer steps to
target loss; SNOO reports 1.5–2.5× compute-efficiency gains.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;µP alternatives:&lt;/strong&gt; Before adopting full µP overhead, try increasing the
embedding layer LR by a factor of $d_{\text{model}} / d_{\text{proxy}}$.
This may reproduce most of the benefit.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;None of these require fundamental architectural changes.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;#&lt;/th&gt;
					&lt;th&gt;Paper&lt;/th&gt;
					&lt;th&gt;Venue&lt;/th&gt;
					&lt;th&gt;Links&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;1&lt;/td&gt;
					&lt;td&gt;Jordan et al. (2024): &lt;em&gt;Muon: An optimizer for hidden layers&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://kellerjordan.github.io/posts/muon/"&gt;blog&lt;/a&gt; · &lt;a href="https://github.com/KellerJordan/Muon"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2&lt;/td&gt;
					&lt;td&gt;Liu et al. (2025): &lt;em&gt;Muon is Scalable for LLM Training&lt;/em&gt; (Moonlight)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2502.16982"&gt;arXiv:2502.16982&lt;/a&gt; · &lt;a href="https://github.com/MoonshotAI/Moonlight"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;3&lt;/td&gt;
					&lt;td&gt;Kovalev (2025): &lt;em&gt;Understanding Gradient Orthogonalization&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;4&lt;/td&gt;
					&lt;td&gt;Pethick et al. (2025): &lt;em&gt;Training Deep Learning Models with Norm-Constrained LMOs&lt;/em&gt; (Scion)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2502.07529"&gt;arXiv:2502.07529&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;5&lt;/td&gt;
					&lt;td&gt;Amsel et al. (2025): &lt;em&gt;The Polar Express&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;6&lt;/td&gt;
					&lt;td&gt;Shumaylov et al. (2026): &lt;em&gt;Muon is Not That Special&lt;/em&gt; (Freon/Kaon)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;7&lt;/td&gt;
					&lt;td&gt;Defazio et al. (2023): &lt;em&gt;Optimal Linear Decay Learning Rate Schedules&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2310.07831"&gt;arXiv:2310.07831&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;8&lt;/td&gt;
					&lt;td&gt;Dremov et al. (2025): &lt;em&gt;Training Dynamics of the Cooldown Stage in WSD&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;9&lt;/td&gt;
					&lt;td&gt;Schaipp et al. (2025): &lt;em&gt;Surprising Agreement Between Convex Theory and LR Scheduling&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;10&lt;/td&gt;
					&lt;td&gt;Meterez et al. (2026): &lt;em&gt;Anytime Pretraining&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2602.03702"&gt;arXiv:2602.03702&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;11&lt;/td&gt;
					&lt;td&gt;Defazio (2026): &lt;em&gt;ScheduleFree+&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2605.19095"&gt;arXiv:2605.19095&lt;/a&gt; · &lt;a href="https://github.com/facebookresearch/schedule_free"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;12&lt;/td&gt;
					&lt;td&gt;Kosson et al. (2026): &lt;em&gt;Weight Decay May Matter More than µP&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;ICLR 2026&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;13&lt;/td&gt;
					&lt;td&gt;Kalra &amp;amp; Barkeshli (2026): &lt;em&gt;Quantifying HP Transfer and Embedding LR&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;14&lt;/td&gt;
					&lt;td&gt;Defazio (2025): &lt;em&gt;Why Gradients Rapidly Increase Near End of Training&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;15&lt;/td&gt;
					&lt;td&gt;Fu et al. (2025): &lt;em&gt;Nemotron-Flash&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;NeurIPS 2025&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;16&lt;/td&gt;
					&lt;td&gt;Yuan et al. (2025): &lt;em&gt;MARS&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;17&lt;/td&gt;
					&lt;td&gt;Semenov et al. (2025): &lt;em&gt;Benchmarking Optimizers for LLM Pretraining&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;18&lt;/td&gt;
					&lt;td&gt;Kallusky et al. (2025): &lt;em&gt;SNOO&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;19&lt;/td&gt;
					&lt;td&gt;Defazio et al. (2026): &lt;em&gt;Smoothing DiLoCo with Primal Averaging (GPA)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2512.17131"&gt;arXiv:2512.17131&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;20&lt;/td&gt;
					&lt;td&gt;Douillard et al. (2025): &lt;em&gt;Streaming DiLoCo&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2501.18512"&gt;arXiv:2501.18512&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;21&lt;/td&gt;
					&lt;td&gt;Douillard et al. (2023/2024): &lt;em&gt;DiLoCo&lt;/em&gt; (original)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2311.08105"&gt;arXiv:2311.08105&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;22&lt;/td&gt;
					&lt;td&gt;PrimeIntellect AI (2024): &lt;em&gt;OpenDiLoCo&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://github.com/PrimeIntellect-ai/OpenDiloco"&gt;GitHub&lt;/a&gt; · &lt;a href="https://www.primeintellect.ai/blog/opendiloco"&gt;blog&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;</description></item><item><title>The Invariant Subspace Problem</title><link>https://blog.namln.org/en/posts/invariant-subspace-problem/</link><pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/invariant-subspace-problem/</guid><description>&lt;p&gt;Few questions in functional analysis have attracted sustained attention across
as many decades as this one. It sits at the confluence of operator theory,
spectral theory, and complex analysis, and every partial result has opened new
territory rather than narrowing the problem to a routine case.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Problem (Invariant Subspace Problem)&lt;/span&gt;
&lt;p&gt;Does every bounded linear operator $T$ on an infinite-dimensional separable
complex Hilbert space $\mathcal{H}$ have a &lt;strong&gt;non-trivial closed invariant subspace&lt;/strong&gt;?&lt;/p&gt;
&lt;p&gt;That is, does there always exist a closed subspace $\mathcal{M} \subsetneq \mathcal{H}$
with $\mathcal{M} \neq \{0\}$ such that $T\mathcal{M} \subseteq \mathcal{M}$?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;medium importance&lt;/em&gt; on the
&lt;a href="http://www.openproblemgarden.org/op/invariant_subspace_problem"&gt;Open Problem Garden&lt;/a&gt;.
It is old enough to have accumulated a rich history of partial results, yet still
open in the Hilbert space setting after more than seventy years.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="trivial-observations-and-why-they-run-out"&gt;
 Trivial Observations and Why They Run Out&lt;span class="heading__anchor"&gt; &lt;a href="#trivial-observations-and-why-they-run-out"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Two subspaces are always invariant: $\{0\}$ and $\mathcal{H}$ itself. These are the
&lt;em&gt;trivial&lt;/em&gt; invariant subspaces; the problem asks whether anything else must exist.&lt;/p&gt;
&lt;p&gt;On finite-dimensional spaces the answer is immediate: every operator on $\mathbb{C}^n$
has an eigenvector (by the fundamental theorem of algebra applied to the characteristic
polynomial), and the span of any eigenvector is a one-dimensional invariant subspace.
This argument fails completely in infinite dimensions, where the spectrum can be
continuous and eigenvectors need not exist.&lt;/p&gt;
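&lt;p&gt;A standard illustration, sketched here with notation that is not used elsewhere in this post, is the unilateral shift on $\ell^2$, $S(x_0, x_1, x_2, \dots) = (0, x_0, x_1, \dots)$. Suppose $Sx = \lambda x$. Comparing coordinates gives
$$0 = \lambda x_0, \qquad x_n = \lambda x_{n+1} \quad (n \ge 0).$$
If $\lambda = 0$, the second relation forces every $x_n = 0$. If $\lambda \neq 0$, the first relation gives $x_0 = 0$, and then $x_{n+1} = x_n/\lambda$ gives $x_n = 0$ for all $n$ by induction. Either way $x = 0$, so $S$ has no eigenvectors, even though it has plenty of invariant subspaces (for example, the sequences with $x_0 = 0$).&lt;/p&gt;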
&lt;p&gt;On non-separable Hilbert spaces the problem is also trivial, but for a different reason:
for any non-zero vector $x \in \mathcal{H}$, the closed linear span
$\overline{\operatorname{span}\{T^n x : n \geq 0\}}$ is a non-zero closed invariant subspace.
It is separable, being the closure of the span of countably many vectors, so if $\mathcal{H}$
is non-separable it cannot equal all of $\mathcal{H}$.
So the problem is genuinely about &lt;strong&gt;separable&lt;/strong&gt; spaces.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="landscape-of-known-results"&gt;
 Landscape of Known Results&lt;span class="heading__anchor"&gt; &lt;a href="#landscape-of-known-results"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="positive-results-classes-with-invariant-subspaces"&gt;
 Positive Results: Classes with Invariant Subspaces&lt;span class="heading__anchor"&gt; &lt;a href="#positive-results-classes-with-invariant-subspaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Aronszajn–Smith, 1954)&lt;/span&gt;
&lt;p&gt;Every compact operator on a Banach space of dimension greater than one has a
non-trivial closed invariant subspace.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The compact case was already known to von Neumann in the 1930s for Hilbert spaces,
but was never published; Aronszajn and Smith gave the first published proof, extended
to Banach spaces. The key idea is that a compact operator can be approximated by
finite-rank operators, each of which has invariant subspaces, and a limiting argument
produces an invariant subspace for the compact operator.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Lomonosov, 1973)&lt;/span&gt;
&lt;p&gt;If a bounded operator $T$ on a complex Banach space is not a scalar multiple of the
identity and commutes with a non-zero compact operator, then $T$ has a non-trivial closed
hyperinvariant subspace (a subspace invariant under every operator that commutes with $T$).&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Lomonosov&amp;rsquo;s proof is strikingly short, less than a page, and uses the
Schauder fixed-point theorem in an unexpected way. It subsumes both the compact
case (an operator commutes with itself) and the polynomially compact case
(an operator commutes with $p(T)$, which is compact by hypothesis).
For several years it seemed that Lomonosov&amp;rsquo;s theorem might resolve the problem
entirely, until Hadwin, Nordgren, Radjavi, and Rosenthal (1980) exhibited an
operator that does not commute with any non-zero compact operator yet still has
invariant subspaces, so the theorem cannot cover every operator.&lt;/p&gt;
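&lt;p&gt;The polynomially compact reduction can be spelled out as a short sketch, assuming $p$ is a non-constant polynomial with $p(T)$ compact and the space is complex. Every polynomial in $T$ commutes with $T$:
$$T\,p(T) = p(T)\,T.$$
If $p(T) \neq 0$, then $T$ commutes with the non-zero compact operator $K = p(T)$, and Lomonosov&amp;rsquo;s theorem applies as long as $T$ is not a scalar multiple of the identity. If $p(T) = 0$, factor $p(z) = c\prod_{j=1}^{d}(z - \lambda_j)$ with $c \neq 0$, so that
$$\prod_{j=1}^{d}(T - \lambda_j) = 0.$$
Some factor $T - \lambda_j$ then fails to be injective, and $\ker(T - \lambda_j)$ is a non-zero closed invariant subspace, proper unless $T = \lambda_j I$.&lt;/p&gt;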
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Brown, 1978)&lt;/span&gt;
&lt;p&gt;Every subnormal operator on a Hilbert space has a non-trivial closed invariant subspace.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;An operator $T$ is &lt;em&gt;subnormal&lt;/em&gt; if it is the restriction of a normal operator on a
larger Hilbert space to an invariant subspace. Normal operators are handled by the spectral
theorem, which produces a rich lattice of invariant subspaces, but those subspaces need not
lie inside the smaller space, so subnormal operators do not simply inherit them by
restriction. Brown&amp;rsquo;s proof instead uses techniques from rational approximation theory.
His later paper (Brown, 1987) extends the method to hyponormal operators with thick spectra.&lt;/p&gt;
&lt;p&gt;Beyond these landmark theorems, invariant subspaces are also known for:
hyponormal operators with some additional conditions, operators whose spectrum has
interior points, operators satisfying growth conditions on the resolvent, and
polynomially bounded operators with spectrum containing the unit circle under
further constraints (Liu, 2017; Réjasse, 2023).&lt;/p&gt;
&lt;h3 class="heading" id="beurlings-theorem-a-complete-classification"&gt;
 Beurling&amp;rsquo;s Theorem: A Complete Classification&lt;span class="heading__anchor"&gt; &lt;a href="#beurlings-theorem-a-complete-classification"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Beurling, 1949)&lt;/span&gt;
&lt;p&gt;The non-zero closed invariant subspaces of the unilateral shift $S : H^2(\mathbb{D}) \to H^2(\mathbb{D})$,
$(Sf)(z) = zf(z)$, are exactly the subspaces of the form $\varphi H^2(\mathbb{D})$
where $\varphi$ is an inner function (i.e. a bounded analytic function on $\mathbb{D}$ with $|\varphi(e^{i\theta})| = 1$ a.e.).&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Beurling&amp;rsquo;s theorem is a landmark because it gives not merely existence but a full
classification of all invariant subspaces for a single operator. The shift on $H^2$
is in many senses the canonical model operator for the Hilbert space invariant subspace
problem: a counterexample to the full problem would be an operator whose lattice of
closed invariant subspaces is as small as possible, and the shift shows how rich that
lattice can be even for a single operator.&lt;/p&gt;
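&lt;p&gt;Two concrete instances, offered as an illustration rather than as part of Beurling&amp;rsquo;s statement: for $\varphi(z) = z^k$ and for the Blaschke factor $\varphi_a(z) = (z - a)/(1 - \bar a z)$ with $a \in \mathbb{D}$, one gets
$$z^k H^2 = \{f \in H^2 : f(0) = f'(0) = \cdots = f^{(k-1)}(0) = 0\}, \qquad \varphi_a H^2 = \{f \in H^2 : f(a) = 0\}.$$
Invariance is visible directly from the form of the subspace, since for any $h \in H^2$
$$S(\varphi h) = z\,\varphi h = \varphi\,(zh) \in \varphi H^2.$$&lt;/p&gt;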
&lt;h3 class="heading" id="negative-results-counterexamples-on-banach-spaces"&gt;
 Negative Results: Counterexamples on Banach Spaces&lt;span class="heading__anchor"&gt; &lt;a href="#negative-results-counterexamples-on-banach-spaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Enflo, 1975/1987; Read, 1984)&lt;/span&gt;
&lt;p&gt;There exist separable Banach spaces and bounded linear operators on them with no
non-trivial closed invariant subspace. In particular, Read constructed such an
operator on $\ell^1$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Enflo&amp;rsquo;s counterexample was the first, constructed in 1975 though not published until
1987 due to its length and complexity. Read&amp;rsquo;s construction (1984) arrived independently
and somewhat earlier in print; a further, more explicit example by Read (1985) lives on
the classical space $\ell^1$. These results make clear that the answer to the invariant
subspace problem is &lt;strong&gt;negative for general Banach spaces&lt;/strong&gt;. The Hilbert space case
remains the central open question precisely because no counterexample on any reflexive
Banach space, much less a Hilbert space, has been found.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-hilbertbanach-gap"&gt;
 The Hilbert–Banach Gap&lt;span class="heading__anchor"&gt; &lt;a href="#the-hilbertbanach-gap"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The separation between Hilbert space and general Banach space behaviour is a
recurring theme. Several features of Hilbert spaces that Banach spaces lack suggest
why counterexamples might not exist in the Hilbert setting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The inner product gives every operator an adjoint $T^*$, and the lattice of invariant
subspaces of $T$ and of $T^*$ are related by orthogonal complementation.&lt;/li&gt;
&lt;li&gt;The spectral theorem for normal operators provides a complete invariant subspace
theory for that class, anchoring intuition.&lt;/li&gt;
&lt;li&gt;Reflexivity and the existence of orthonormal (hence unconditional) bases in separable
Hilbert spaces constrain operator behaviour more than in $\ell^1$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;None of these features has yet been converted into a proof for the general case.&lt;/p&gt;
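&lt;p&gt;The first point in the list can be checked in one line. Suppose $T\mathcal{M} \subseteq \mathcal{M}$ for a closed subspace $\mathcal{M}$, and take $x \in \mathcal{M}$, $y \in \mathcal{M}^\perp$. Then
$$\langle x, T^* y\rangle = \langle Tx, y\rangle = 0,$$
because $Tx \in \mathcal{M}$. Hence $T^*\mathcal{M}^\perp \subseteq \mathcal{M}^\perp$. Applying the same argument to $T^*$ and $\mathcal{M}^\perp$, and using $T^{**} = T$ and $\mathcal{M}^{\perp\perp} = \mathcal{M}$, gives the converse, so $\mathcal{M}$ is a non-trivial invariant subspace for $T$ exactly when $\mathcal{M}^\perp$ is one for $T^*$.&lt;/p&gt;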
&lt;hr&gt;
&lt;h2 class="heading" id="recent-proof-attempts"&gt;
 Recent Proof Attempts&lt;span class="heading__anchor"&gt; &lt;a href="#recent-proof-attempts"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The problem has attracted renewed attention in recent years.&lt;/p&gt;
&lt;p&gt;In May 2023, Per Enflo, the same mathematician who produced the first Banach space
counterexample, posted a preprint to arXiv (2305.15442) claiming a &lt;strong&gt;positive
resolution&lt;/strong&gt; for all separable Hilbert spaces. The original preprint was 13 pages;
a substantially expanded version (52 KB) appeared in April 2024. Enflo himself has
been cautious about the result, noting that expert review is ongoing. As of this
writing the preprint has not received a definitive verdict from the community.&lt;/p&gt;
&lt;p&gt;In July 2023 an independent preprint by Neville (arXiv:2307.08176) also claimed
a positive solution for separable Hilbert spaces.&lt;/p&gt;
&lt;p&gt;In September 2024 a peer-reviewed article in &lt;em&gt;Axioms&lt;/em&gt; by Khalil, Yousef, Alshanti,
and Abu Hammad announced a proof, but basic errors were identified shortly after
publication (Ghatasheh, arXiv:2411.19409, November 2024).&lt;/p&gt;
&lt;p&gt;The problem therefore remains officially open. The cluster of recent attempts reflects
both its difficulty and its continued centrality in functional analysis.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-cyclic-vectors-and-the-spectral-radius-formula"&gt;
 1. Cyclic Vectors and the Spectral Radius Formula&lt;span class="heading__anchor"&gt; &lt;a href="#1-cyclic-vectors-and-the-spectral-radius-formula"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A vector $x \in \mathcal{H}$ is &lt;em&gt;cyclic&lt;/em&gt; for $T$ if $\mathcal{H} = \overline{\operatorname{span}\{T^n x : n \geq 0\}}$. An operator with a non-trivial closed invariant subspace cannot have every non-zero vector be cyclic. Conversely, if every non-zero vector is cyclic, then $T$ has no non-trivial closed invariant subspace and is a counterexample.&lt;/p&gt;
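&lt;p&gt;The equivalence behind this can be sketched as follows, writing $[x]$ (a notation introduced only for this remark) for the closed span of the orbit of $x$:
$$[x] = \overline{\operatorname{span}\{T^n x : n \geq 0\}}, \qquad T[x] \subseteq [x],$$
where the inclusion holds because $T(T^n x) = T^{n+1}x$ and $T$ is continuous. If $x \neq 0$ then $[x] \neq \{0\}$, so an operator with no non-trivial closed invariant subspace must satisfy $[x] = \mathcal{H}$ for every non-zero $x$. In the other direction, if $\mathcal{M}$ is a non-trivial closed invariant subspace and $0 \neq x \in \mathcal{M}$, then $[x] \subseteq \mathcal{M} \neq \mathcal{H}$, so $x$ is not cyclic.&lt;/p&gt;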
&lt;p&gt;Read&amp;rsquo;s Banach-space constructions proceed by building &lt;em&gt;hypercyclic&lt;/em&gt; operators,
that is, operators with dense orbits. On Hilbert spaces, the inner-product geometry is
expected to constrain the density of orbits far more severely. Making this constraint
quantitative, via growth estimates on $\|T^n x\|$ or on the resolvent
$\|(T-\lambda)^{-1}\|$, might close the gap between known positive results and the
general case.&lt;/p&gt;
&lt;h3 class="heading" id="2-dual-algebra-techniques"&gt;
 2. Dual Algebra Techniques&lt;span class="heading__anchor"&gt; &lt;a href="#2-dual-algebra-techniques"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A powerful modern approach studies the &lt;em&gt;dual algebra&lt;/em&gt; $\mathcal{A} _T$, the weak-$*$
closure of the polynomials in $T$ as a subalgebra of $\mathcal{B}(\mathcal{H})$.
The operator is called &lt;em&gt;reflexive&lt;/em&gt; if every operator that leaves invariant all closed
invariant subspaces of $T$ already lies in the weakly closed algebra generated by $T$ and
the identity. A reflexive operator on a space of dimension greater than one automatically
has non-trivial invariant subspaces, and one can sometimes extract them from the structure
of the algebra. Results along these lines have been obtained for $C _{00}$ contractions
(Bercovici, Foiaş, Pearcy) and for polynomially bounded operators under spectral
conditions (Liu, 2017). The key open question is whether the dual algebra approach can be
made to work for all contractions via Sz.-Nagy–Foiaş theory.&lt;/p&gt;
&lt;h3 class="heading" id="3-contractions-and-the-sz-nagyfoiaş-calculus"&gt;
 3. Contractions and the Sz.-Nagy–Foiaş Calculus&lt;span class="heading__anchor"&gt; &lt;a href="#3-contractions-and-the-sz-nagyfoia%c5%9f-calculus"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Every contraction ($\|T\| \leq 1$) on a Hilbert space admits a minimal unitary dilation
(Sz.-Nagy&amp;rsquo;s dilation theorem), and Sz.-Nagy and Foiaş developed a functional calculus for
completely non-unitary contractions based on $H^\infty(\mathbb{D})$. The rich structure of this calculus has
produced invariant subspace theorems for $C_{11}$ contractions and for contractions
whose spectrum is rich enough. The question is whether the calculus can be pushed to
all contractions; the general invariant subspace problem for contractions is equivalent
to the full problem (by rescaling), so this is not a simplification but a different
vantage point that has been productive.&lt;/p&gt;
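&lt;p&gt;The rescaling step is elementary and can be written out explicitly. For $T \neq 0$ put $c = 1/\|T\|$. Then
$$\|cT\| = 1, \qquad (cT)\mathcal{M} \subseteq \mathcal{M} \iff T\mathcal{M} \subseteq \mathcal{M},$$
because $\mathcal{M}$ is a linear subspace and $c \neq 0$. So $T$ and the contraction $cT$ have exactly the same invariant subspaces, and nothing is lost by restricting attention to contractions.&lt;/p&gt;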
&lt;h3 class="heading" id="4-almost-invariant-half-spaces"&gt;
 4. Almost Invariant Half-Spaces&lt;span class="heading__anchor"&gt; &lt;a href="#4-almost-invariant-half-spaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A weaker notion, studied by Androulakis, Popov, Tcaciuc, and Troitsky, asks for
&lt;em&gt;almost invariant half-spaces&lt;/em&gt;: closed subspaces $\mathcal{M}$ of infinite dimension
and infinite codimension such that $T\mathcal{M} \subseteq \mathcal{M} + \mathcal{F}$
for some finite-dimensional subspace $\mathcal{F}$. These exist for every operator
on any infinite-dimensional Banach space. Whether every operator on a Hilbert space
has a genuinely invariant (not just almost invariant) infinite-dimensional subspace
of infinite codimension remains open and is a concrete intermediate target.&lt;/p&gt;
&lt;h3 class="heading" id="5-hyperinvariant-subspaces"&gt;
 5. Hyperinvariant Subspaces&lt;span class="heading__anchor"&gt; &lt;a href="#5-hyperinvariant-subspaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A subspace is &lt;em&gt;hyperinvariant&lt;/em&gt; for $T$ if it is invariant under every operator that
commutes with $T$. Every hyperinvariant subspace is invariant, since $T$ commutes with
itself, so a non-trivial hyperinvariant subspace for every non-scalar operator would imply
a positive answer to the invariant subspace problem.
Lomonosov&amp;rsquo;s 1973 theorem gives hyperinvariant subspaces when a non-scalar $T$ commutes
with a non-zero compact operator. The &lt;em&gt;hyperinvariant subspace problem&lt;/em&gt; (does every
operator on a Hilbert space, other than scalar multiples of the identity, have a non-trivial
hyperinvariant subspace?) is also open and may be harder than the invariant subspace problem itself.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Aronszajn, N. &amp;amp; Smith, K. T. (1954). Invariant subspaces of completely continuous operators. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;60&lt;/strong&gt;(2), 345–350.&lt;/li&gt;
&lt;li&gt;Beurling, A. (1949). On two problems concerning linear transformations in Hilbert space. &lt;em&gt;Acta Mathematica&lt;/em&gt;, &lt;strong&gt;81&lt;/strong&gt;, 239–255.&lt;/li&gt;
&lt;li&gt;Brown, S. (1987). Hyponormal operators with thick spectra have invariant subspaces. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;125&lt;/strong&gt;(1), 93–103.&lt;/li&gt;
&lt;li&gt;Enflo, P. H. (1987). On the invariant subspace problem for Banach spaces. &lt;em&gt;Acta Mathematica&lt;/em&gt;, &lt;strong&gt;158&lt;/strong&gt;, 213–313.&lt;/li&gt;
&lt;li&gt;Enflo, P. H. (2023). On the invariant subspace problem in Hilbert spaces. arXiv:2305.15442.&lt;/li&gt;
&lt;li&gt;Lomonosov, V. I. (1973). Invariant subspaces of operators commuting with compact operators. &lt;em&gt;Functional Analysis and Its Applications&lt;/em&gt;, &lt;strong&gt;7&lt;/strong&gt;(3), 213–214.&lt;/li&gt;
&lt;li&gt;Read, C. J. (1984). A solution to the invariant subspace problem. &lt;em&gt;Bulletin of the London Mathematical Society&lt;/em&gt;, &lt;strong&gt;16&lt;/strong&gt;(4), 337–401.&lt;/li&gt;
&lt;li&gt;Read, C. J. (1985). A solution to the invariant subspace problem on the space $\ell^1$. &lt;em&gt;Bulletin of the London Mathematical Society&lt;/em&gt;, &lt;strong&gt;17&lt;/strong&gt;(4), 305–317.&lt;/li&gt;
&lt;li&gt;Radjavi, H. &amp;amp; Rosenthal, P. (2003). &lt;em&gt;Invariant Subspaces&lt;/em&gt; (2nd ed.). Dover.&lt;/li&gt;
&lt;li&gt;Bercovici, H., Foiaş, C., &amp;amp; Pearcy, C. (1985). &lt;em&gt;Dual Algebras with Applications to Invariant Subspaces and Dilation Theory&lt;/em&gt;. AMS.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Something Like Picard for 1-Forms</title><link>https://blog.namln.org/en/posts/something-like-picard-for-1-forms/</link><pubDate>Wed, 27 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/something-like-picard-for-1-forms/</guid><description>&lt;p&gt;Picard&amp;rsquo;s great theorem is a statement about how wildly a holomorphic function can
behave near an essential singularity. The conjecture below asks whether injectivity
of local primitives of a 1-form is enough to rule out such wild behaviour at the
origin, forcing the 1-form to extend meromorphically across the puncture.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Conjecture (Elsner, 2010)&lt;/span&gt;
&lt;p&gt;Let $D$ be the open unit disk and let $U_1,\dots,U_n$ be open sets with
$\bigcup_{j=1}^n U_j = D\setminus\{0\}$. Suppose there are injective holomorphic
functions $f_j : U_j \to \mathbb{C}$ such that
$$\mathrm{d}f_j = \mathrm{d}f_k \quad \text{on every connected component of } U_j \cap U_k.$$
Then the $\mathrm{d}f_j$ glue together to a &lt;strong&gt;meromorphic&lt;/strong&gt; 1-form on $D$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;medium importance&lt;/em&gt; on the
&lt;a href="http://www.openproblemgarden.org/op/something_like_picard_for_1_forms"&gt;Open Problem Garden&lt;/a&gt;
and is not recommended for undergraduates, reflecting the depth of the tools involved.
It arises from Elsner&amp;rsquo;s study of hyperelliptic action integrals in the context of the
exact WKB method for Schrödinger equations with polynomial potential
(Elsner, &lt;em&gt;Ann. Inst. Fourier&lt;/em&gt; &lt;strong&gt;49&lt;/strong&gt;(1), 1999).&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="setup-and-interpretation"&gt;
 Setup and Interpretation&lt;span class="heading__anchor"&gt; &lt;a href="#setup-and-interpretation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The compatibility condition $\mathrm{d}f_j = \mathrm{d}f_k$ on each connected
component of $U_j \cap U_k$ is equivalent to saying $f_j - f_k$ is locally constant
there. The local differentials therefore glue together unambiguously to a global
holomorphic 1-form
$$\omega \in \Omega^1(D\setminus\{0\})$$
whose restriction to each $U_j$ equals $\mathrm{d}f_j$. The conjecture asserts that
$\omega$ does not have an essential singularity at the origin: it extends to a
meromorphic 1-form on all of $D$, meaning near $0$ it looks like
$$\omega = \left(\frac{c_{-m}}{z^m} + \cdots + \frac{c_{-1}}{z} + c_0 + c_1 z + \cdots\right)dz$$
for some $m \ge 0$.&lt;/p&gt;
&lt;p&gt;The injectivity of each $f_j$ is the crucial hypothesis. Without it the statement is
false: any holomorphic 1-form $\omega$ on $D\setminus\{0\}$ with an essential
singularity at $0$ is locally $\mathrm{d}f_j$ for some holomorphic $f_j$, and these
$f_j$ can be chosen on contractible pieces of the cover; injectivity is what
prohibits essential singularities from arising.&lt;/p&gt;
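&lt;p&gt;A simple example that does satisfy the hypotheses, given here as an illustration with a cover chosen for convenience, is $\omega = \mathrm{d}z/z$. Take
$$U_1 = D\setminus(-1,0], \qquad U_2 = D\setminus[0,1),$$
and let $f_1$, $f_2$ be the branches of the logarithm with $\arg z \in (-\pi,\pi)$ and $\arg z \in (0,2\pi)$ respectively. Each $f_j$ is injective, since $e^{f_j(z)} = z$. The intersection $U_1 \cap U_2$ consists of the open upper and lower half-disks, on which
$$f_2 = f_1 \ \text{(upper half-disk)}, \qquad f_2 = f_1 + 2\pi i \ \text{(lower half-disk)},$$
so $\mathrm{d}f_1 = \mathrm{d}f_2$ on every component. The glued form $\omega = \mathrm{d}z/z$ is meromorphic on $D$ with a simple pole and residue $1$ at the origin, consistent with the conjecture.&lt;/p&gt;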
&lt;hr&gt;
&lt;h2 class="heading" id="what-is-already-known"&gt;
 What Is Already Known&lt;span class="heading__anchor"&gt; &lt;a href="#what-is-already-known"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Partial Result&lt;/span&gt;
&lt;p&gt;Under the hypotheses of the conjecture:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The 1-form $\omega$ is holomorphic on $D\setminus\{0\}$.&lt;/li&gt;
&lt;li&gt;If the residue of $\omega$ at the origin vanishes, Picard&amp;rsquo;s big theorem can be
applied to conclude that $\omega$ extends meromorphically across $0$.&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;p&gt;Point (1) is straightforward: each $\mathrm{d}f_j$ is holomorphic on $U_j$ and the
local forms agree on overlaps, so $\omega$ is holomorphic wherever it is defined,
i.e. on $D\setminus\{0\}$.&lt;/p&gt;
&lt;p&gt;Point (2) is the key partial result recorded by Elsner. If $\operatorname{Res}_0\omega = 0$,
then $\omega$ has trivial monodromy around the origin and admits a single-valued
holomorphic primitive $F$ on the punctured disk: $\omega = \mathrm{d}F$. The
injectivity of each local branch $f_j$ then forces $F$ itself to be injective on
some punctured neighbourhood of $0$ (since $f_j = F + c$ locally). An injective
holomorphic function on a punctured disk cannot have an essential singularity there,
and this is where Picard enters: at an essential singularity, by Picard&amp;rsquo;s big theorem,
every value, with at most one exception, is taken infinitely often in any punctured neighbourhood, contradicting
injectivity. Hence $F$ has at most a pole at $0$, and $\omega = \mathrm{d}F$ is meromorphic.&lt;/p&gt;
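&lt;p&gt;Two zero-residue forms make the dichotomy concrete; both are illustrations rather than examples taken from Elsner. For $F(z) = 1/z$, which is injective on the punctured disk,
$$\omega = \mathrm{d}F = -\frac{\mathrm{d}z}{z^2}$$
has a double pole and residue $0$. For $F(z) = e^{1/z}$,
$$\omega = \mathrm{d}F = -\frac{e^{1/z}}{z^2}\,\mathrm{d}z = -\sum_{n \ge 0} \frac{z^{-n-2}}{n!}\,\mathrm{d}z$$
also has no $z^{-1}$ term, hence residue $0$, but the singularity is essential. In this second case $F(z_k) = 1$ at the points $z_k = 1/(2\pi i k)$, $k = 1, 2, \dots$, which accumulate at $0$, so $F$ is not injective on any punctured neighbourhood of the origin.&lt;/p&gt;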
&lt;p&gt;The &lt;strong&gt;open case&lt;/strong&gt; is when $\operatorname{Res}_0\omega \ne 0$, so that $\omega$ has
non-trivial monodromy and no single-valued global primitive exists. The local
primitives $f_j$ then experience monodromy as one loops around the origin, and the
injectivity constraint must be leveraged in this more delicate multi-valued setting.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="connection-to-picards-theorem"&gt;
 Connection to Picard&amp;rsquo;s Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#connection-to-picards-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The title of the conjecture reflects a precise structural analogy.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Picard's Great Theorem)&lt;/span&gt;
&lt;p&gt;If $f$ has an essential singularity at $z_0$, then in every punctured neighbourhood
of $z_0$ the function $f$ takes every value in $\mathbb{C}$, with at most one exception,
infinitely many times.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In particular, a function with an essential singularity is far from injective near
that point. The conjecture elevates this observation to the level of 1-forms: an
injective holomorphic primitive should preclude essential singularities in the
1-form itself, even when the primitive is only locally and multi-valuedly defined.&lt;/p&gt;
&lt;p&gt;Standard Picard covers the zero-residue case by reducing to a single-valued primitive.
The conjecture asks for an analogue that works when the monodromy is non-trivial, a
genuinely new statement about multi-valued functions and their differential geometry.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="origin-hyperelliptic-action-integrals"&gt;
 Origin: Hyperelliptic Action Integrals&lt;span class="heading__anchor"&gt; &lt;a href="#origin-hyperelliptic-action-integrals"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The problem arises from the &lt;em&gt;exact WKB method&lt;/em&gt; applied to the stationary
Schrödinger equation $-\psi'' + V(x)\psi = E\psi$ with polynomial potential $V$.
The formal WKB ansatz $\psi \sim e^{S/\hbar}$ produces a multivalued &lt;em&gt;action integral&lt;/em&gt;
$$\mathcal{I}(E) = \int_\gamma \sqrt{V(x) - E}\,\mathrm{d}x$$
defined on a hyperelliptic Riemann surface whose branch structure depends on the
energy parameter $E$. Elsner&amp;rsquo;s 1999 paper constructs the Riemann surface of
$\mathcal{I}$ explicitly and shows its branch points accumulate densely in the
value plane, a phenomenon that obstructs Borel–Laplace resummation of the
WKB symbols.&lt;/p&gt;
&lt;p&gt;In this setting the local inverses of $\mathcal{I}$ play the role of the $f_j$: they
are locally injective holomorphic functions whose differentials agree on overlaps.
The conjecture asks whether the only singularity that can obstruct holomorphic extension
across the puncture is a pole, a controlled singularity, rather than an essential one.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-the-non-zero-residue-case"&gt;
 1. The Non-Zero Residue Case&lt;span class="heading__anchor"&gt; &lt;a href="#1-the-non-zero-residue-case"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The open heart of the problem is the case $\operatorname{Res}_0\omega \ne 0$. Here
$\omega$ is not exact near $0$, the monodromy of the primitive is a non-trivial
translation $f_j \mapsto f_j + 2\pi i\,\operatorname{Res}_0\omega$, and no single
injective function encompasses the full behaviour near the singularity.&lt;/p&gt;
&lt;p&gt;A natural approach is to pass to a cover that trivialises the monodromy. Because the
monodromy is a translation of infinite order, no finite cyclic cover suffices; one has to
use the universal (logarithmic) cover $\tilde D \to D\setminus\{0\}$, construct a
single-valued primitive on $\tilde D$, and then try to adapt the zero-residue argument
there. The key difficulty is that the injectivity of each $f_j$ on $U_j$ does not
immediately imply injectivity of the lifted primitive on $\tilde D$, since different
sheets can collide. Making this argument precise, or finding a counterexample, is the
main open problem.&lt;/p&gt;
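&lt;p&gt;The monodromy can be made explicit with the substitution $z = e^w$, where $w$ ranges over the left half-plane; the names $c$, $G$ and $\tilde f$ below are notation introduced only for this sketch. Write $c = \operatorname{Res}_0\omega$. The form $\omega - c\,\mathrm{d}z/z$ has zero residue, so it equals $\mathrm{d}G$ for a single-valued holomorphic $G$ on $D\setminus\{0\}$. Since $\mathrm{d}z/z = \mathrm{d}w$,
$$\omega = c\,\mathrm{d}w + \mathrm{d}\bigl(G(e^w)\bigr), \qquad \tilde f(w) = c\,w + G(e^w),$$
and $\tilde f$ is a single-valued primitive of $\omega$ in the $w$ variable. Going once around the origin corresponds to $w \mapsto w + 2\pi i$, and
$$\tilde f(w + 2\pi i) = \tilde f(w) + 2\pi i\,c,$$
which recovers the translation monodromy. In these terms $\omega$ is meromorphic at $0$ exactly when $G$ has at most a pole there.&lt;/p&gt;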
&lt;h3 class="heading" id="2-quantitative-control-via-nevanlinna-theory"&gt;
 2. Quantitative Control via Nevanlinna Theory&lt;span class="heading__anchor"&gt; &lt;a href="#2-quantitative-control-via-nevanlinna-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;An alternative strategy replaces Picard&amp;rsquo;s theorem by its quantitative form. If $F$ is
a meromorphic function on the punctured disk with an essential singularity, the
Nevanlinna characteristic $T(r,F)$ grows faster than any constant multiple of $\log(1/r)$ as
$r\to 0$. For an injective function the counting functions $N(r,a,F)$, recording
how often $F = a$ in the punctured disk, satisfy strong constraints.&lt;/p&gt;
&lt;p&gt;Nevanlinna-theoretic methods might give a direct bound on $T(r,f_j)$ in terms of the
geometry of the cover $\{U_j\}$ and the injectivity of $f_j$, ruling out essential
singularities of $\omega$ without passing through the monodromy argument. This would
require adapting the standard Nevanlinna machinery to functions that are only locally
defined on an open cover.&lt;/p&gt;
&lt;h3 class="heading" id="3-replacing-injectivity-by-finite-valence"&gt;
 3. Replacing Injectivity by Finite Valence&lt;span class="heading__anchor"&gt; &lt;a href="#3-replacing-injectivity-by-finite-valence"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;One can ask whether the conjecture remains true if &amp;ldquo;injective&amp;rdquo; is weakened to
&amp;ldquo;at most $d$-to-one&amp;rdquo; for some fixed integer $d$. Finite-valence holomorphic functions
cannot have essential singularities either, by Picard&amp;rsquo;s theorem (a function of
valence at most $d$ takes each value at most $d$ times, whereas near an essential
singularity every value, with at most one exception, is taken infinitely often).&lt;/p&gt;
&lt;p&gt;If the conjecture extends to finite valence, the proof strategy will likely yield a
valence-independent argument that illuminates the zero-residue case more transparently.
If it fails for finite valence, the counterexample geometry would clarify what role
injectivity plays beyond the mere avoidance of essential singularities.&lt;/p&gt;
&lt;h3 class="heading" id="4-several-complex-variables"&gt;
 4. Several Complex Variables&lt;span class="heading__anchor"&gt; &lt;a href="#4-several-complex-variables"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;In $\mathbb{C}^n$ for $n \ge 2$ the theory of isolated singularities of holomorphic
functions changes dramatically: by Hartogs&amp;rsquo; extension theorem, isolated singularities
of holomorphic functions are always removable. One would expect the analogous
conjecture for holomorphic 1-forms in $\mathbb{C}^n$ to be more tractable, or even to
follow from known extension results.&lt;/p&gt;
&lt;p&gt;Three steps would clarify which features of the problem are genuinely one-dimensional:
formulating the precise analogue, with the punctured disk replaced by a domain
$\Omega\setminus\{0\}$ in $\mathbb{C}^n$; specifying what &amp;ldquo;meromorphic 1-form&amp;rdquo;
means on a higher-dimensional domain; and checking whether Hartogs-type arguments
already resolve it.&lt;/p&gt;
&lt;h3 class="heading" id="5-geometric-formulation-on-riemann-surfaces"&gt;
 5. Geometric Formulation on Riemann Surfaces&lt;span class="heading__anchor"&gt; &lt;a href="#5-geometric-formulation-on-riemann-surfaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The disk $D$ and the puncture at $0$ are not special: the same question can be posed
on any Riemann surface $X$ with a marked point $p$. Given an open cover of
$X\setminus\{p\}$ and injective holomorphic functions $f_j$ on each piece with
compatible differentials, does $\omega = \mathrm{d}f_j$ extend meromorphically
across $p$?&lt;/p&gt;
&lt;p&gt;The answer may depend on the genus and the function theory of $X$. For the disk
(simply connected, genus 0) the monodromy is a simple translation; for a torus or
higher-genus surface the monodromy group is richer and the argument structure should
change. Comparing these cases may isolate the essential input from the topology versus
the analysis.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Elsner, B. (1999). Hyperelliptic action integral. &lt;em&gt;Annales de l&amp;rsquo;Institut Fourier&lt;/em&gt;, &lt;strong&gt;49&lt;/strong&gt;(1), 303–331. &lt;a href="https://www.numdam.org/item/AIF_1999__49_1_303_0/"&gt;https://www.numdam.org/item/AIF_1999__49_1_303_0/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Ahlfors, L. V. (1979). &lt;em&gt;Complex Analysis&lt;/em&gt; (3rd ed.). McGraw-Hill.&lt;/li&gt;
&lt;li&gt;Conway, J. B. (1978). &lt;em&gt;Functions of One Complex Variable&lt;/em&gt; (2nd ed.). Springer.&lt;/li&gt;
&lt;li&gt;Nevanlinna, R. (1970). &lt;em&gt;Analytic Functions&lt;/em&gt;. Springer.&lt;/li&gt;
&lt;li&gt;Forster, O. (1981). &lt;em&gt;Lectures on Riemann Surfaces&lt;/em&gt;. Springer.&lt;/li&gt;
&lt;li&gt;Delabaere, E., Dillinger, H., &amp;amp; Pham, F. (1993). Résurgence de Voros et périodes des courbes hyperelliptiques. &lt;em&gt;Annales de l&amp;rsquo;Institut Fourier&lt;/em&gt;, &lt;strong&gt;43&lt;/strong&gt;(1), 163–199.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Criterion for Boundedness of Power Series</title><link>https://blog.namln.org/en/posts/power_series_boundedness/</link><pubDate>Tue, 26 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/power_series_boundedness/</guid><description>&lt;h2 class="heading" id="introduction--problem-statement"&gt;
 Introduction &amp;amp; Problem Statement&lt;span class="heading__anchor"&gt; &lt;a href="#introduction--problem-statement"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Power series constitute one of the most ubiquitous objects in analysis.
A power series $\sum_{n=0}^{\infty}a_n x^n$ with infinite radius of
convergence defines a real-entire function $f:\mathbb{R}\to\mathbb{R}$.
Whereas the question of &lt;em&gt;convergence&lt;/em&gt; is completely settled by
Cauchy–Hadamard theory, the question of &lt;em&gt;boundedness&lt;/em&gt; of the sum function
is far subtler and, as of this writing, remains open.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 1 (Rüdinger, 2009)&lt;/span&gt;
&lt;p&gt;Let $(a_n) _{n\ge 0}$ be a sequence of real numbers such that the power
series $\sum _{n=0}^{\infty}a_n x^n$ converges for every $x\in\mathbb{R}$,
thereby defining a smooth function $f:\mathbb{R}\to\mathbb{R}$.
Give a &lt;strong&gt;necessary and sufficient&lt;/strong&gt; criterion on $(a_n)$ for $f$ to be
bounded on $\mathbb{R}$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;low importance&lt;/em&gt; on the
&lt;a href="http://www.openproblemgarden.org/op/criterion_for_boundedness_of_power_series"&gt;Open Problem Garden&lt;/a&gt;
and is recommended as accessible to undergraduates; nevertheless, a complete
answer appears to be unknown.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Motivating examples.&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Function&lt;/th&gt;
					&lt;th&gt;Power series&lt;/th&gt;
					&lt;th&gt;Bounded?&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;$\cos x$&lt;/td&gt;
					&lt;td&gt;$\displaystyle\sum_{k=0}^{\infty}\frac{(-1)^k}{(2k)!}x^{2k}$&lt;/td&gt;
					&lt;td&gt;$|\cos x|\le 1$&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$\sin x$&lt;/td&gt;
					&lt;td&gt;$\displaystyle\sum_{k=0}^{\infty}\frac{(-1)^k}{(2k+1)!}x^{2k+1}$&lt;/td&gt;
					&lt;td&gt;$|\sin x|\le 1$&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$e^x$&lt;/td&gt;
					&lt;td&gt;$\displaystyle\sum_{n=0}^{\infty}\frac{x^n}{n!}$&lt;/td&gt;
					&lt;td&gt;$e^x\to+\infty$&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$p(x)=a_0+\cdots+a_Nx^N,\ N\ge 1$&lt;/td&gt;
					&lt;td&gt;(polynomial)&lt;/td&gt;
					&lt;td&gt;unbounded&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background--prerequisites"&gt;
 Background &amp;amp; Prerequisites&lt;span class="heading__anchor"&gt; &lt;a href="#background--prerequisites"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;This section collects the core mathematical tools needed to engage
seriously with Question 1.&lt;/p&gt;
&lt;h3 class="heading" id="power-series-and-entire-functions"&gt;
 Power Series and Entire Functions&lt;span class="heading__anchor"&gt; &lt;a href="#power-series-and-entire-functions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt; Definition 1 (Power Series &amp;amp; Radius of Convergence)&lt;/span&gt;
&lt;p&gt;A &lt;em&gt;power series&lt;/em&gt; centred at the origin is a formal series
$\sum_{n=0}^{\infty}a_n x^n$ with $a_n\in\mathbb{R}$. Its &lt;em&gt;radius of
convergence&lt;/em&gt; is
$$
R = \frac{1}{\limsup_{n\to\infty}|a_n|^{1/n}} \in [0,+\infty].
$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Throughout this note we always assume $R=+\infty$, i.e.,
$\limsup_{n\to\infty}|a_n|^{1/n}=0$.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt; Definition 2 (Entire Function)&lt;/span&gt;
&lt;p&gt;A function $f:\mathbb{C}\to\mathbb{C}$ is called &lt;em&gt;entire&lt;/em&gt; if it is
holomorphic on all of $\mathbb{C}$. Every real power series with $R=+\infty$
defines a real-entire function; the same series converges for every
$z\in\mathbb{C}$, so its complex extension is entire, and by the identity
theorem that extension is unique.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt; Theorem 1 (Cauchy–Hadamard)&lt;/span&gt;
&lt;p&gt;The radius of convergence of $\sum a_n z^n$ equals
$$
R = \Bigl(\limsup_{n\to\infty}|a_n|^{1/n}\Bigr)^{-1}.
$$&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 1&lt;/span&gt;
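&lt;p&gt;The condition $R=+\infty$ is equivalent to $a_n = O(\varepsilon^n)$ for every
$\varepsilon&amp;gt;0$, i.e., the coefficients decay faster than any geometric sequence.
The stronger condition $a_n = O(r^n/n!)$ for some $r&amp;gt;0$ characterises the entire
functions of exponential type (growth at most $Ce^{A|z|}$), which is the class
to which Paley–Wiener theory applies.&lt;/p&gt;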
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="order-and-type-of-entire-functions"&gt;
 Order and Type of Entire Functions&lt;span class="heading__anchor"&gt; &lt;a href="#order-and-type-of-entire-functions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt; Definition 3 (Order and Type)&lt;/span&gt;
&lt;p&gt;The &lt;em&gt;order&lt;/em&gt; of an entire function $f$ is
$$
\rho = \limsup_{r\to\infty}\frac{\log\log M(r)}{\log r},
\qquad M(r)=\max_{|z|=r}|f(z)|.
$$
The &lt;em&gt;type&lt;/em&gt; $\sigma$ (for $0&amp;lt;\rho&amp;lt;\infty$) is
$$
\sigma = \limsup_{r\to\infty}\frac{\log M(r)}{r^{\rho}}.
$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;An entire function that is bounded on all of $\mathbb{C}$ is constant by
Liouville&amp;rsquo;s theorem (and so has order $\rho=0$), while an entire function that
is bounded only on the real axis can be non-constant. Boundedness on $\mathbb{R}$
is therefore a genuinely real-variable phenomenon.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="liouvilles-theorem-and-its-limitations"&gt;
 Liouville&amp;rsquo;s Theorem and Its Limitations&lt;span class="heading__anchor"&gt; &lt;a href="#liouvilles-theorem-and-its-limitations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt; Theorem 2 (Liouville)&lt;/span&gt;
&lt;p&gt;Every bounded entire function $f:\mathbb{C}\to\mathbb{C}$ is constant.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 2 (Why Liouville does not solve the problem)&lt;/span&gt;
&lt;p&gt;Question 1 concerns boundedness on the real line $\mathbb{R}$ only.
A function may be bounded on $\mathbb{R}$ while its complex extension is
unbounded. For instance, $\cos z$ satisfies $|\cos z|\to\infty$ along
the imaginary axis (since $\cos(iy)=\cosh y\to+\infty$). Liouville&amp;rsquo;s
theorem therefore does not apply, and the problem is genuinely non-trivial.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="algebraic-structure-of-the-relevant-function-space"&gt;
 Algebraic Structure of the Relevant Function Space&lt;span class="heading__anchor"&gt; &lt;a href="#algebraic-structure-of-the-relevant-function-space"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt; Definition 4 (Space of Bounded Power Series)&lt;/span&gt;
&lt;p&gt;Let $\mathcal{B}$ denote the set of all functions $f:\mathbb{R}\to\mathbb{R}$
that can be represented as a convergent power series $\sum_{n\ge 0}a_n x^n$
(with $R=+\infty$) and that are bounded on $\mathbb{R}$.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #e67e22; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#e67e22; font-weight:bold;"&gt; Proposition 1, Algebraic Properties of $\mathcal{B}$ (Rüdinger, 2009)&lt;/span&gt;
&lt;ol&gt;
&lt;li&gt;$\mathcal{B}$ is a &lt;strong&gt;linear subspace&lt;/strong&gt; of $C^\infty(\mathbb{R})$: if
$f,g\in\mathcal{B}$ and $\lambda\in\mathbb{R}$ then $f+\lambda g\in\mathcal{B}$.&lt;/li&gt;
&lt;li&gt;$\mathcal{B}$ is &lt;strong&gt;closed under pointwise multiplication&lt;/strong&gt;: if
$f,g\in\mathcal{B}$ then $fg\in\mathcal{B}$.&lt;/li&gt;
&lt;li&gt;$\mathcal{B}$ contains &lt;strong&gt;all functions of the form&lt;/strong&gt; $c\cos(h(x))$,
where $c\in\mathbb{R}$ and $h:\mathbb{R}\to\mathbb{R}$ is any entire function.&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 3&lt;/span&gt;
&lt;p&gt;Part (3) holds because $\cos\circ h$ is a composition of entire functions,
hence entire with real Taylor coefficients, and $|\cos(h(x))|\le 1$ for real
$x$. The class is strictly larger than
$\{c\cos(bx):c,b\in\mathbb{R}\}$; for example, $\cos(x^3-x)\in\mathcal{B}$.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="known-partial-results"&gt;
 Known Partial Results&lt;span class="heading__anchor"&gt; &lt;a href="#known-partial-results"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="necessary-conditions"&gt;
 Necessary Conditions&lt;span class="heading__anchor"&gt; &lt;a href="#necessary-conditions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #e67e22; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#e67e22; font-weight:bold;"&gt; Proposition 2, Necessary Condition for Boundedness (Rüdinger, 2009)&lt;/span&gt;
&lt;p&gt;Suppose $f(x)=\sum_{n=0}^{\infty}a_n x^n$ is bounded on $\mathbb{R}$.
Then either:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;$a_n=0$ for every $n\ge 1$ (i.e., $f$ is the constant
function $f\equiv a_0$), or&lt;/li&gt;
&lt;li&gt;there are &lt;strong&gt;infinitely many&lt;/strong&gt; indices $n$ with $a_n\neq 0$, and the
signs of the non-zero $a_n$ &lt;strong&gt;change infinitely often&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 4&lt;/span&gt;
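&lt;p&gt;The sign-change condition is necessary: if $a_n\ge 0$ for all $n\ge N$ and
$a_m&amp;gt;0$ for some $m\ge\max(N,1)$, then for $x&amp;gt;0$ one has
$f(x)\ge\sum_{n&amp;lt;N}a_n x^n + a_m x^m$, and the term $a_m x^m$ dominates the
polynomial of degree less than $N$, so $f(x)\to+\infty$ as $x\to+\infty$. If the
coefficients are eventually non-positive, the same argument gives
$f(x)\to-\infty$ as $x\to+\infty$.&lt;/p&gt;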
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #8e44ad; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#8e44ad; font-weight:bold;"&gt; Corollary 1&lt;/span&gt;
&lt;p&gt;Every non-constant polynomial is unbounded on $\mathbb{R}$.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;&lt;em&gt;Proof.&lt;/em&gt;&lt;/summary&gt;
A non-constant polynomial has only finitely many non-zero coefficients, at least
one of them with index $n\ge 1$, so it satisfies neither alternative of
Proposition 2 and cannot be bounded. Directly: the leading term dominates, so
$|p(x)|\to\infty$ as $|x|\to\infty$.
&lt;/details&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="the-sign-change-condition-is-not-sufficient"&gt;
 The Sign-Change Condition Is Not Sufficient&lt;span class="heading__anchor"&gt; &lt;a href="#the-sign-change-condition-is-not-sufficient"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The condition of Proposition 2 is &lt;em&gt;not&lt;/em&gt; sufficient, as the following
examples show.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #16a085; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#16a085; font-weight:bold;"&gt; Example 1&lt;/span&gt;
&lt;p&gt;Consider the geometric series
$$
f(x) = \sum_{n=0}^{\infty}(-1)^n x^{2n} = \frac{1}{1+x^2},
\qquad |x|&amp;lt;1.
$$
The coefficients alternate in sign, yet $R=1\neq+\infty$. One must first
require $R=+\infty$ before the sign-change condition becomes meaningful.&lt;/p&gt;
&lt;p&gt;For a subtler case with $R=+\infty$: take $a_n=(-1)^n/n!$, so
$$
f(x) = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!}x^n = e^{-x}.
$$
The signs alternate, yet $e^{-x}\to+\infty$ as $x\to-\infty$.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 5&lt;/span&gt;
&lt;p&gt;The $e^{-x}$ example reveals the key gap: sign alternation of the
&lt;em&gt;coefficients&lt;/em&gt; does not prevent the &lt;em&gt;function&lt;/em&gt; from growing in one
direction, because the series for $e^{-x}$ reconstructs exponential
growth on the negative half-line. A complete criterion must capture
cancellation in &lt;strong&gt;both&lt;/strong&gt; directions.&lt;/p&gt;
&lt;/div&gt;
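&lt;p&gt;One way to make the two-sided requirement concrete, offered here as a short
sketch and not as a result from the source: apply Proposition 2 to the
reflected function $\tilde f(x)=f(-x)$, which is bounded exactly when $f$ is.
Its expansion is
$$
\tilde f(x)=\sum_{n=0}^{\infty}(-1)^n a_n x^n ,
$$
so for a non-constant $f\in\mathcal{B}$ both sequences $(a_n)$ and
$((-1)^n a_n)$ must change sign infinitely often. For $e^{-x}$ the second
sequence is $(-1)^n\cdot(-1)^n/n! = 1/n!&amp;gt;0$, which has no sign changes, and
this detects the growth as $x\to-\infty$. For $\cos x$, with
$a_{2k}=(-1)^k/(2k)!$ and $a_{2k+1}=0$, the two sequences coincide and both
alternate.&lt;/p&gt;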
&lt;hr&gt;
&lt;h3 class="heading" id="connections-to-entire-function-theory"&gt;
 Connections to Entire Function Theory&lt;span class="heading__anchor"&gt; &lt;a href="#connections-to-entire-function-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt; Theorem 3 (Borel–Carathéodory)&lt;/span&gt;
&lt;p&gt;Let $f$ be holomorphic in $|z|\le R$. Then for $0&amp;lt;r&amp;lt;R$,
$$
M(r) \le \frac{2r}{R-r}\sup_{|z|=R}\operatorname{Re}f(z) + \frac{R+r}{R-r}\,|f(0)|.
$$&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 6&lt;/span&gt;
&lt;p&gt;Borel–Carathéodory shows that the &lt;em&gt;real part&lt;/em&gt; of a complex-valued entire
function controls its modulus. For a real-valued function on $\mathbb{R}$
the analogous control is more delicate, since we only observe the function
on a line, not on a disk.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt; Theorem 4 (Hadamard Factorisation)&lt;/span&gt;
&lt;p&gt;Every entire function $f\not\equiv 0$ of finite order $\rho$ can be written as
$$
f(z) = z^m e^{g(z)}\prod_{n=1}^{\infty} E_p\!\left(\frac{z}{z_n}\right),
$$
where $m\ge 0$ is the order of the zero of $f$ at the origin, $(z_n)$ are the
non-zero zeros of $f$ counted with multiplicity, $p=\lfloor\rho\rfloor$, $g$ is a
polynomial of degree $\le\rho$, and the $E_p$ are Weierstrass elementary factors.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 7&lt;/span&gt;
&lt;p&gt;A real entire function of infinite order that is bounded on $\mathbb{R}$, such
as $\cos(e^x)$ (which lies in $\mathcal{B}$ by Proposition 1 (3)), is not
covered by the Hadamard factorisation. For elements of $\mathcal{B}$ of finite
order, understanding the zero set and the exponential factor $e^{g(z)}$ may be
key to a classification.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-open-sub-question-on-the-generators-of-mathcalb"&gt;
 The Open Sub-Question on the Generators of $\mathcal{B}$&lt;span class="heading__anchor"&gt; &lt;a href="#the-open-sub-question-on-the-generators-of-mathcalb"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 2 (Rüdinger, 2009)&lt;/span&gt;
&lt;p&gt;Does $\mathcal{B}$ consist &lt;em&gt;precisely&lt;/em&gt; of functions of the form $c\cos(h(x))$
and their linear combinations and products, where $h:\mathbb{R}\to\mathbb{R}$
is entire and $c\in\mathbb{R}$?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;A &lt;strong&gt;positive&lt;/strong&gt; answer would give an implicit characterisation via algebraic
generators. A &lt;strong&gt;negative&lt;/strong&gt; answer would require producing an entire
function, bounded on $\mathbb{R}$, that does &lt;em&gt;not&lt;/em&gt; lie in the
$\mathbb{R}$-algebra generated by $\{\cos\circ\, h : h\text{ entire}\}$.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 8&lt;/span&gt;
&lt;p&gt;By Proposition 1 (3), every $c\cos(h(x))$ belongs to $\mathcal{B}$, and
$\mathcal{B}$ is an algebra, so all products and sums remain in
$\mathcal{B}$. What is unknown is whether &lt;em&gt;every&lt;/em&gt; element of $\mathcal{B}$
arises this way. Note that $\sin x = \cos(x-\pi/2) \in \mathcal{B}$, so
sine is already covered.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions-and-conjectures"&gt;
 Research Directions and Conjectures&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions-and-conjectures"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="direction-1-coefficient-growth-rate"&gt;
 Direction 1: Coefficient Growth Rate&lt;span class="heading__anchor"&gt; &lt;a href="#direction-1-coefficient-growth-rate"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A promising approach is to examine the &lt;em&gt;rate&lt;/em&gt; of decay of $|a_n|$, not just
the sign pattern.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 3&lt;/span&gt;
&lt;p&gt;Is there a decay condition on $|a_n|$, combined with the sign-change
condition, that gives a &lt;strong&gt;sufficient&lt;/strong&gt; criterion for $f\in\mathcal{B}$?&lt;/p&gt;
&lt;/div&gt;
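&lt;p&gt;&lt;em&gt;Approach.&lt;/em&gt; The Cauchy estimates give $|a_n| = |f^{(n)}(0)|/n!\le M(r)/r^n$
for all $r&amp;gt;0$, where $M(r)$ is the maximum of $|f|$ over the complex circle
$|z|=r$. A bound $|f|\le K$ on $\mathbb{R}$ does not control $M(r)$: if it did,
$|a_n|\le K/r^n$ for every $r&amp;gt;0$ would force $a_n=0$ for all $n\ge 1$, which is
Liouville&amp;rsquo;s theorem again. The Cauchy estimates therefore encode growth in
the complex plane, not boundedness on the line. Is there a constraint on
$(a_n)$ that detects real boundedness directly?&lt;/p&gt;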
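&lt;p&gt;As a worked illustration (a sketch added for concreteness, using only the
standard identity $|\cos(x+iy)|^2=\cos^2 x+\sinh^2 y$), take $f=\cos$. On the
circle $|z|=r$ the maximum of $|\cos z|$ is attained at $z=\pm ir$, so
$$
M(r)=\cosh r,\qquad |a_{2k}|=\frac{1}{(2k)!}\le\frac{\cosh r}{r^{2k}}\quad\text{for every }r&amp;gt;0 .
$$
Choosing $r=2k$ and comparing with Stirling&amp;rsquo;s formula
$(2k)!\sim\sqrt{4\pi k}\,(2k/e)^{2k}$ gives
$$
\frac{\cosh(2k)}{(2k)^{2k}}\sim\frac{e^{2k}}{2\,(2k)^{2k}}
=\frac{\sqrt{\pi k}}{\sqrt{4\pi k}\,(2k/e)^{2k}}\sim\frac{\sqrt{\pi k}}{(2k)!},
$$
so the Cauchy estimate is sharp up to a factor of order $\sqrt{k}$. The bound
$|\cos x|\le 1$ on $\mathbb{R}$ plays no role in this computation; only the
growth $\cosh r$ off the real axis enters.&lt;/p&gt;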
&lt;hr&gt;
&lt;h3 class="heading" id="direction-2-fourier-analytic-approach"&gt;
 Direction 2: Fourier-Analytic Approach&lt;span class="heading__anchor"&gt; &lt;a href="#direction-2-fourier-analytic-approach"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Every $f\in L^\infty(\mathbb{R})\cap L^2(\mathbb{R})$ possesses a
square-integrable Fourier transform. If $f$ is moreover entire of exponential
type, the Paley–Wiener theorem forces the transform to be compactly supported.
However, a generic $f\in\mathcal{B}$ need not lie in $L^2$ (e.g.,
$\cos x\notin L^2(\mathbb{R})$) and need not be of exponential type (e.g.,
$\cos(x^3-x)$, which has order $3$).&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 4&lt;/span&gt;
&lt;p&gt;Can the Fourier theory for tempered distributions give a necessary and
sufficient condition for $f\in\mathcal{B}$ in terms of the spectral
support of $f$?&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="direction-3-differential-equation-characterisation"&gt;
 Direction 3: Differential Equation Characterisation&lt;span class="heading__anchor"&gt; &lt;a href="#direction-3-differential-equation-characterisation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Entire functions bounded on $\mathbb{R}$ often arise as solutions to ODEs. For instance,
$y''+y=0$ has the bounded solutions $A\cos x + B\sin x$. More generally,
$y''+\omega(x)y=0$ with $\omega$ entire and bounded on $\mathbb{R}$ can produce bounded
solutions.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 5&lt;/span&gt;
&lt;p&gt;Characterise those linear differential operators $L$ with entire coefficients
whose full solution space lies within $\mathcal{B}$.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="direction-4-evenodd-decomposition-and-reduction"&gt;
 Direction 4: Even/Odd Decomposition and Reduction&lt;span class="heading__anchor"&gt; &lt;a href="#direction-4-evenodd-decomposition-and-reduction"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Every $f\in\mathcal{B}$ splits as $f=f_e+f_o$ where
$$
f_e(x)=\tfrac{1}{2}(f(x)+f(-x))=\sum_{k\ge 0}a_{2k}x^{2k}
\quad\text{and}\quad
f_o(x)=\tfrac{1}{2}(f(x)-f(-x))=\sum_{k\ge 0}a_{2k+1}x^{2k+1}.
$$
Since $f_e(x)=g(x^2)$ for the entire function $g(t)=\sum_{k\ge 0}a_{2k}t^k$,
boundedness of $f_e$ reduces to: &lt;em&gt;is $g$ bounded on $[0,+\infty)$?&lt;/em&gt; This
reduction may make the even and odd parts easier to study separately.&lt;/p&gt;
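&lt;p&gt;The odd part admits the same reduction; the following sketch introduces the
auxiliary name $h$, which does not appear in the source. Writing
$$
f_o(x)=x\,h(x^2),\qquad h(t)=\sum_{k\ge 0}a_{2k+1}t^k ,
$$
boundedness of $f_o$ on $\mathbb{R}$ is equivalent to boundedness of
$\sqrt{t}\,h(t)$ on $[0,+\infty)$. Moreover, since
$|f_e|,|f_o|\le\sup|f|$ and $|f|\le|f_e|+|f_o|$, the function $f$ is bounded
if and only if both $f_e$ and $f_o$ are. For $f=\sin$ one finds $f_e=0$ and
$h(t)=\sum_{k\ge 0}(-1)^k t^k/(2k+1)!=\sin(\sqrt{t})/\sqrt{t}$ for $t&amp;gt;0$, so
$\sqrt{t}\,h(t)=\sin\sqrt{t}$ is indeed bounded.&lt;/p&gt;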
&lt;hr&gt;
&lt;h3 class="heading" id="direction-5-polynomial-approximation-and-numerics"&gt;
 Direction 5: Polynomial Approximation and Numerics&lt;span class="heading__anchor"&gt; &lt;a href="#direction-5-polynomial-approximation-and-numerics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 6&lt;/span&gt;
&lt;p&gt;If the partial sums $S_N(x)=\sum_{n=0}^{N}a_n x^n$ are uniformly bounded
on growing intervals $[-R_N,R_N]$ (with $R_N\to\infty$), does it follow
that $f\in\mathcal{B}$? Conversely, if $f\in\mathcal{B}$, how fast must
$R_N$ grow relative to $N$ for the bound to hold?&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="summary-of-open-problems"&gt;
 Summary of Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#summary-of-open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;#&lt;/th&gt;
					&lt;th&gt;Statement&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q1&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Give a necessary and sufficient condition on $(a_n)$ for $f=\sum a_n x^n$ to be bounded on $\mathbb{R}$.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q2&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Is $\mathcal{B}$ generated (as an algebra) precisely by $\{c\cos(h(x)):h\text{ entire}\}$?&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q3&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Does a sharper decay condition on $|a_n|$, combined with the sign-change condition, give a sufficient criterion for $f\in\mathcal{B}$?&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q4&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Can spectral-support (Paley–Wiener / distribution) theory characterise $\mathcal{B}$?&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q5&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Which linear ODEs with entire coefficients have solution space $\subseteq\mathcal{B}$?&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q6&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;What is the precise relationship between truncation bounds on $[-R_N,R_N]$ and $f\in\mathcal{B}$?&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Ahlfors, L. V. (1979). &lt;em&gt;Complex Analysis&lt;/em&gt;, 3rd ed. McGraw-Hill.&lt;/li&gt;
&lt;li&gt;Boas, R. P. (1954). &lt;em&gt;Entire Functions&lt;/em&gt;. Academic Press.&lt;/li&gt;
&lt;li&gt;Conway, J. B. (1978). &lt;em&gt;Functions of One Complex Variable&lt;/em&gt;, 2nd ed. Springer.&lt;/li&gt;
&lt;li&gt;Levin, B. Ya. (1996). &lt;em&gt;Lectures on Entire Functions&lt;/em&gt;. AMS Translations of Mathematical Monographs, vol. 150.&lt;/li&gt;
&lt;li&gt;Rudin, W. (1976). &lt;em&gt;Principles of Mathematical Analysis&lt;/em&gt;, 3rd ed. McGraw-Hill.&lt;/li&gt;
&lt;li&gt;Rudin, W. (1987). &lt;em&gt;Real and Complex Analysis&lt;/em&gt;, 3rd ed. McGraw-Hill.&lt;/li&gt;
&lt;li&gt;Rüdinger, A. (2009). Criterion for boundedness of power series. &lt;em&gt;Open Problem Garden&lt;/em&gt;. &lt;a href="http://www.openproblemgarden.org/op/criterion_for_boundedness_of_power_series"&gt;http://www.openproblemgarden.org/op/criterion_for_boundedness_of_power_series&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Stein, E. M. and Shakarchi, R. (2003). &lt;em&gt;Fourier Analysis: An Introduction&lt;/em&gt;. Princeton University Press.&lt;/li&gt;
&lt;li&gt;Stein, E. M. and Shakarchi, R. (2010). &lt;em&gt;Complex Analysis&lt;/em&gt;. Princeton University Press.&lt;/li&gt;
&lt;li&gt;Titchmarsh, E. C. (1939). &lt;em&gt;The Theory of Functions&lt;/em&gt;, 2nd ed. Oxford University Press.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Brezis' first open problem - An elliptic equation involving the critical exponent in 3D</title><link>https://blog.namln.org/en/posts/brezis-first-open-problem/</link><pubDate>Sat, 18 Apr 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/brezis-first-open-problem/</guid><description>&lt;h2 class="heading" id="yamabe-problem"&gt;
 Yamabe problem&lt;span class="heading__anchor"&gt; &lt;a href="#yamabe-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Yamabe problem: let $(\mathcal{M}, g_0)$ be a closed (compact, without boundary) Riemannian manifold of dimension $N \geq 3$. Does there exist a conformal metric $g = u^{\frac{4}{N-2}}g_0$ with constant scalar curvature $R_g \equiv C$?&lt;/p&gt;
&lt;p&gt;Equivalently, find $u &amp;gt; 0$ on $\mathcal{M}$ such that
$$
-\frac{4(N-1)}{N-2}\Delta_{g_0}u + R_{g_0}u = Cu^{\frac{N+2}{N-2}}\qquad\text{on }\mathcal{M}.
$$&lt;/p&gt;
&lt;p&gt;Some results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Trudinger [1968]: solved if $g_0$ has non-positive scalar curvature.&lt;/li&gt;
&lt;li&gt;Aubin [1976]: solved if $N \geq 6$ and $(\mathcal{M}, g_0)$ is not locally conformally flat.&lt;/li&gt;
&lt;li&gt;Schoen [1984]: solved in the remaining cases, in every dimension, using the Positive Mass Theorem of Schoen-Yau [1979].&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="a-special-case"&gt;
 A special case&lt;span class="heading__anchor"&gt; &lt;a href="#a-special-case"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Consider the special case where $\mathcal{M}$ is a bounded domain $\Omega$ in $\mathbb{R}^{N}$:
$$
\begin{cases}
-\Delta u = u^{\frac{N+2}{N-2}}\qquad\text{in }\Omega, \\
u &amp;gt; 0\qquad\text{in }\Omega, \\
u = 0\qquad\text{on }\partial\Omega.
\end{cases}
$$&lt;/p&gt;
&lt;p&gt;Pohozaev [1965]: if $\Omega$ is star-shaped, then there is no nontrivial solution.&lt;/p&gt;
&lt;h2 class="heading" id="brezis-nirenberg-problem"&gt;
 Brezis-Nirenberg problem&lt;span class="heading__anchor"&gt; &lt;a href="#brezis-nirenberg-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Consider a lower-order perturbation:
$$
\begin{cases}
-\Delta u = u^{\frac{N+2}{N-2}} + \lambda u\qquad\text{in }\Omega, \\
u &amp;gt; 0\qquad\text{in }\Omega, \\
u = 0\qquad\text{on }\partial\Omega.
\end{cases}
$$&lt;/p&gt;
&lt;p&gt;Some results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pohozaev&amp;rsquo;s result also yields nonexistence when $\lambda \leq 0$ and $\Omega$ is star-shaped.&lt;/li&gt;
&lt;li&gt;If a positive solution exists, then necessarily $\lambda &amp;lt; \lambda_1$, where $\lambda_1$ is the first eigenvalue of $-\Delta$ on $\Omega$ with zero Dirichlet boundary condition.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Hence, for positive solutions on star-shaped domains,
$$
0 &amp;lt; \lambda &amp;lt; \lambda_1.
$$&lt;/p&gt;
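&lt;p&gt;Both bounds can be sketched in a few lines. The derivation below is a standard
argument supplied for orientation, with $\varphi_1&amp;gt;0$ denoting a first Dirichlet
eigenfunction and $\nu$ the outward unit normal (notation introduced here).
Testing the equation against $\varphi_1$ and integrating by parts gives
$$
\lambda_1\int_\Omega u\,\varphi_1\,dx=\int_\Omega u^{\frac{N+2}{N-2}}\varphi_1\,dx+\lambda\int_\Omega u\,\varphi_1\,dx
&amp;gt;\lambda\int_\Omega u\,\varphi_1\,dx ,
$$
hence $\lambda&amp;lt;\lambda_1$. For the lower bound, the Pohozaev identity for this
nonlinearity reads
$$
\lambda\int_\Omega u^2\,dx=\frac12\int_{\partial\Omega}(x\cdot\nu)\left|\frac{\partial u}{\partial\nu}\right|^2 d\sigma ,
$$
and on a domain that is star-shaped about the origin $x\cdot\nu\ge 0$, so the
right-hand side is non-negative and $\lambda&amp;lt;0$ is impossible. The case
$\lambda=0$ is excluded by Pohozaev&amp;rsquo;s theorem quoted above.&lt;/p&gt;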
&lt;h2 class="heading" id="brezis-open-problem-11"&gt;
 Brezis&amp;rsquo; Open Problem 1.1&lt;span class="heading__anchor"&gt; &lt;a href="#brezis-open-problem-11"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Let $N=3$, and let $\Omega = B_1 \subset \mathbb{R}^3$ be the unit ball. Consider
$$
\begin{cases}
-\Delta u = u^5 + \lambda u \qquad \text{in } B_1, \\
u = 0 \qquad \text{on } \partial B_1.
\end{cases}
$$
We ask whether this problem admits a nontrivial solution $u \not\equiv 0$, not necessarily positive.&lt;/p&gt;
&lt;p&gt;Here the exponent $5 = \frac{N+2}{N-2}$ is the critical Sobolev exponent when $N=3$, and this is exactly the source of the main compactness difficulty.&lt;/p&gt;
&lt;p&gt;Let $\lambda_1$ be the first Dirichlet eigenvalue of $-\Delta$ on $B_1$. The classical Brezis-Nirenberg theory shows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If $\lambda \leq 0$, then the only solution is $u \equiv 0$.&lt;/li&gt;
&lt;li&gt;If $\frac{1}{4}\lambda_1 &amp;lt; \lambda &amp;lt; \lambda_1$, then there exists a positive radial solution.&lt;/li&gt;
&lt;li&gt;If $0 &amp;lt; \lambda \leq \frac{1}{4}\lambda_1$, then any radial solution must be trivial. Since positive solutions on a ball are radial by the Gidas-Ni-Nirenberg theorem, there is no positive solution in this range.&lt;/li&gt;
&lt;li&gt;If $\lambda &amp;gt; \lambda_1$, there exist sign-changing solutions, but no positive solution.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Therefore the unresolved case is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Open Problem 1.1.&lt;/strong&gt; Assume
$$
0 &amp;lt; \lambda \leq \frac{1}{4}\lambda_1.
$$
Does there exist a nontrivial solution?&lt;br&gt;
Since every positive solution on a ball is radial and no nontrivial radial solution exists in this range, such a solution would have to be sign-changing and non-radial.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This problem has remained open for decades, even if one restricts further to a smaller interval such as
$$
0 &amp;lt; \lambda &amp;lt; \varepsilon
$$
for some sufficiently small $\varepsilon &amp;gt; 0$.&lt;/p&gt;
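&lt;p&gt;For concreteness (a standard computation, not taken from the references
below): on the unit ball of $\mathbb{R}^3$ the first Dirichlet eigenfunction is
radial,
$$
\varphi_1(x)=\frac{\sin(\pi|x|)}{|x|},\qquad -\Delta\varphi_1=\pi^2\varphi_1,\qquad \lambda_1=\pi^2 ,
$$
so the open range is $0&amp;lt;\lambda\le\pi^2/4\approx 2.47$, and the classical
existence range is $\pi^2/4&amp;lt;\lambda&amp;lt;\pi^2\approx 9.87$.&lt;/p&gt;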
&lt;h2 class="heading" id="remarks"&gt;
 Remarks&lt;span class="heading__anchor"&gt; &lt;a href="#remarks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;A few points are worth emphasizing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;By the Gidas-Ni-Nirenberg symmetry theorem, every positive solution on a ball is radial. Since any radial solution in this regime must vanish, a nontrivial solution, if one exists, would have to change sign and be genuinely non-radial.&lt;/li&gt;
&lt;li&gt;This makes dimension $3$ sharply different from higher-dimensional cases, where the Brezis-Nirenberg existence theory is better understood.&lt;/li&gt;
&lt;li&gt;The bifurcation picture suggests branches of sign-changing non-radial solutions emerging from higher eigenvalues, but it is not known whether such branches can reach the interval $\left(0,\frac14\lambda_1\right]$.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;H. Brezis and L. Nirenberg, &lt;em&gt;Positive solutions of nonlinear elliptic equations involving critical Sobolev exponents&lt;/em&gt;, Comm. Pure Appl. Math. 36 (1983), 437&amp;ndash;477.&lt;/li&gt;
&lt;li&gt;H. Brezis, &lt;em&gt;Some of My Favorite Open Problems&lt;/em&gt;, Open Problem 1.1.&lt;/li&gt;
&lt;li&gt;M. Comte, &lt;em&gt;Solutions of elliptic equations with critical Sobolev exponent in dimension three&lt;/em&gt;, Nonlinear Anal. 17 (1991), 445&amp;ndash;455.&lt;/li&gt;
&lt;li&gt;O. Druet, &lt;em&gt;Elliptic equations with critical Sobolev exponents in dimension 3&lt;/em&gt;, Ann. Inst. H. Poincaré Anal. Non Linéaire 19 (2002), 125&amp;ndash;142.&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Recent Advances in KAN-Based Numerical PDE Solvers</title><link>https://blog.namln.org/en/posts/kan-pde-solvers/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/kan-pde-solvers/</guid><description>&lt;p&gt;Kolmogorov-Arnold Networks (KANs), introduced in 2024, have rapidly become one of the most active frontiers in scientific machine learning for solving partial differential equations (PDEs) (Liu et al., 2024). Unlike Multi-Layer Perceptrons (MLPs), which apply fixed activation functions at nodes, KANs place &lt;strong&gt;learnable univariate activation functions on edges&lt;/strong&gt;, grounded in the Kolmogorov-Arnold representation theorem: every continuous multivariate function can be expressed as a composition of univariate functions and summations. This structural difference gives KANs two key properties relevant to PDE numerics — &lt;strong&gt;higher interpretability&lt;/strong&gt; and &lt;strong&gt;parameter efficiency&lt;/strong&gt; — making them an appealing successor to MLP-based Physics-Informed Neural Networks (PINNs).&lt;/p&gt;
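&lt;p&gt;For reference, the classical form of the representation theorem (stated here
in its standard textbook version, for continuous $f$ on the cube $[0,1]^n$) is
$$
f(x_1,\dots,x_n)=\sum_{q=0}^{2n}\Phi_q\!\left(\sum_{p=1}^{n}\phi_{q,p}(x_p)\right),
$$
with continuous univariate functions $\Phi_q$ and $\phi_{q,p}$. A KAN layer can
be read as a learnable, stackable analogue of the inner sums: each edge $(p,q)$
carries its own trainable univariate function.&lt;/p&gt;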
&lt;p&gt;From 2024 through early 2026, researchers have published dozens of frameworks combining KANs with classical numerical concepts (spectral methods, operator learning, energy-stable time-stepping, neural operators) and targeting problems ranging from single PDEs to high-dimensional systems with hundreds of variables.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="overview"&gt;
 Overview&lt;span class="heading__anchor"&gt; &lt;a href="#overview"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The KAN-for-PDEs landscape divides into seven interrelated research threads:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Physics-Informed KAN Frameworks (PIKANs / KINN)&lt;/strong&gt; — direct replacements of MLP layers in PINNs with KAN layers, using strong, energy, and inverse PDE formulations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Spectral-Basis and Wavelet-Enriched KANs&lt;/strong&gt; — embedding orthogonal polynomial or wavelet bases to combat spectral bias.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;KAN-Based Neural Operators&lt;/strong&gt; — KAN sub-networks inside DeepONet, FNO, and pseudo-differential operator frameworks for learning PDE solution maps.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Time-Dependent and Evolutionary KANs&lt;/strong&gt; — energy-stable schemes, KAN-ODEs, and moving-boundary solvers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Discontinuities, Shock Waves, and Turbulence&lt;/strong&gt; — specialised architectures for sharp transitions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;High-Dimensional PDEs&lt;/strong&gt; — separable and tensor-product KAN surrogates scaling to hundreds of dimensions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data-Driven Discovery and Inverse Problems&lt;/strong&gt; — interpretability-driven model identification.&lt;/li&gt;
&lt;/ol&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Architecture&lt;/th&gt;
					&lt;th&gt;Key Strength&lt;/th&gt;
					&lt;th&gt;Representative Work&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;KINN&lt;/td&gt;
					&lt;td&gt;Forward/inverse problems, strong/energy/inverse forms&lt;/td&gt;
					&lt;td&gt;Wang et al., 2024&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;ChebPIKAN&lt;/td&gt;
					&lt;td&gt;Fluid mechanics PDEs, orthogonal basis&lt;/td&gt;
					&lt;td&gt;Cui et al., 2024&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;KANO&lt;/td&gt;
					&lt;td&gt;Symbolic operator recovery, variable-coefficient PDEs&lt;/td&gt;
					&lt;td&gt;arXiv:2509.16825&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;EvoKAN&lt;/td&gt;
					&lt;td&gt;Long-horizon time evolution, energy stability&lt;/td&gt;
					&lt;td&gt;arXiv:2503.01618&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Anant-KAN&lt;/td&gt;
					&lt;td&gt;High-dimensional PDEs (up to 300D)&lt;/td&gt;
					&lt;td&gt;arXiv:2505.03595&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;DPINN&lt;/td&gt;
					&lt;td&gt;Shock waves and discontinuities&lt;/td&gt;
					&lt;td&gt;arXiv:2507.08338&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="the-kolmogorov-arnold-representation-theorem"&gt;
 The Kolmogorov-Arnold Representation Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#the-kolmogorov-arnold-representation-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The theoretical foundation of KANs is the Kolmogorov-Arnold theorem: any continuous function $f: [0,1]^n \to \mathbb{R}$ can be written as&lt;/p&gt;
&lt;p&gt;$$f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right),$$&lt;/p&gt;
&lt;p&gt;where $\phi_{q,p}: [0,1] \to \mathbb{R}$ and $\Phi_q: \mathbb{R} \to \mathbb{R}$ are univariate continuous functions. In contrast to MLPs — where activations are fixed and weights are learned — KANs &lt;strong&gt;parameterise the activation functions themselves&lt;/strong&gt; (typically as B-splines or orthogonal polynomials) on each edge of the network graph.&lt;/p&gt;
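&lt;p&gt;To make the structure concrete, the sketch below checks a hand-picked representation of the product $f(x,y) = xy$ in this form, using only two nonzero outer terms: $xy = \tfrac{1}{4}(x+y)^2 - \tfrac{1}{4}(x-y)^2$. The names &lt;code&gt;phi&lt;/code&gt;, &lt;code&gt;Phi&lt;/code&gt;, and &lt;code&gt;f_ka&lt;/code&gt; are illustrative choices, not part of any KAN library; a KAN would learn such univariate functions from data rather than have them written by hand.&lt;/p&gt;

```python
# Illustrative only: write f(x, y) = x * y in the Kolmogorov-Arnold form
# sum_q Phi_q(phi_{q,1}(x) + phi_{q,2}(y)) with two nonzero outer terms.

def phi(q, p, t):
    """Inner univariate functions phi_{q,p}; only phi_{1,2} flips the sign."""
    return -t if (q, p) == (1, 2) else t

def Phi(q, s):
    """Outer univariate functions: Phi_0(s) = s^2 / 4, Phi_1(s) = -s^2 / 4."""
    return s * s / 4.0 if q == 0 else -s * s / 4.0

def f_ka(x, y):
    """Evaluate the two-term representation of x * y."""
    return sum(Phi(q, phi(q, 1, x) + phi(q, 2, y)) for q in range(2))

print(f_ka(0.3, 0.7))  # agrees with 0.3 * 0.7 up to rounding
```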
&lt;h3 class="heading" id="physics-informed-neural-networks-pinns--the-starting-point"&gt;
 Physics-Informed Neural Networks (PINNs) — The Starting Point&lt;span class="heading__anchor"&gt; &lt;a href="#physics-informed-neural-networks-pinns--the-starting-point"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;PINNs (Raissi, Perdikaris, &amp;amp; Karniadakis, 2019) embed physical laws directly into the neural network loss function. For a PDE $\mathcal{N}[u] = f$ on domain $\Omega$ with boundary condition $\mathcal{B}[u] = g$ on $\partial\Omega$, the PINN loss is&lt;/p&gt;
&lt;p&gt;$$\mathcal{L} = \underbrace{\frac{1}{N _r}\sum _{i=1}^{N _r}|\mathcal{N}[u _\theta](x _i) - f(x _i)|^2} _{\text{PDE residual}} + \underbrace{\frac{1}{N _b}\sum _{j=1}^{N _b}|\mathcal{B}[u _\theta](x _j) - g(x _j)|^2} _{\text{boundary condition}}.$$&lt;/p&gt;
&lt;p&gt;The substitution of MLP layers with KAN layers in this framework is the basic idea behind all PIKAN architectures.&lt;/p&gt;
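&lt;p&gt;A minimal sketch of this loss, under strong simplifying assumptions: the model problem is $-u'' = \pi^2\sin(\pi x)$ on $(0,1)$ with $u(0)=u(1)=0$, and the "network" is the hypothetical one-parameter trial function $u_\theta(x) = \theta\sin(\pi x)$, whose second derivative is written out by hand. A real PINN or PIKAN would replace the trial function with a network and obtain derivatives by automatic differentiation.&lt;/p&gt;

```python
import math

def residual(theta, x):
    """PDE residual N[u_theta](x) - f(x) with N[u] = -u''."""
    minus_u_xx = theta * math.pi ** 2 * math.sin(math.pi * x)
    f = math.pi ** 2 * math.sin(math.pi * x)
    return minus_u_xx - f

def boundary_mismatch(theta, x):
    """Boundary term B[u_theta](x) - g(x) with B[u] = u and g = 0."""
    return theta * math.sin(math.pi * x)

def pinn_loss(theta, n_r=32):
    """Mean squared PDE residual plus mean squared boundary mismatch."""
    interior = [(i + 0.5) / n_r for i in range(n_r)]  # N_r collocation points
    boundary = [0.0, 1.0]                             # N_b = 2 boundary points
    loss_r = sum(residual(theta, x) ** 2 for x in interior) / len(interior)
    loss_b = sum(boundary_mismatch(theta, x) ** 2 for x in boundary) / len(boundary)
    return loss_r + loss_b

# The loss vanishes (up to rounding) at the exact solution theta = 1.
print(pinn_loss(0.5), pinn_loss(1.0))
```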
&lt;hr&gt;
&lt;h2 class="heading" id="recent-developments"&gt;
 Recent Developments&lt;span class="heading__anchor"&gt; &lt;a href="#recent-developments"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-physics-informed-kan-frameworks"&gt;
 1. Physics-Informed KAN Frameworks&lt;span class="heading__anchor"&gt; &lt;a href="#1-physics-informed-kan-frameworks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h4 class="heading" id="kinn--the-foundational-framework"&gt;
 KINN — The Foundational Framework&lt;span class="heading__anchor"&gt; &lt;a href="#kinn--the-foundational-framework"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;The &lt;strong&gt;Kolmogorov-Arnold-Informed Neural Network (KINN)&lt;/strong&gt; is the primary physics-informed framework replacing MLP layers in PINNs with KAN layers (Wang et al., 2024). KINN supports three PDE formulations: the &lt;strong&gt;strong form&lt;/strong&gt; (collocating the PDE residual directly), the &lt;strong&gt;energy form&lt;/strong&gt; (minimising a variational energy functional), and the &lt;strong&gt;inverse form&lt;/strong&gt; (recovering unknown parameters from observations).&lt;/p&gt;
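&lt;p&gt;As an illustration (a textbook model problem, not an example taken from the KINN paper), consider the Poisson problem $-\Delta u = f$ in $\Omega$ with $u = 0$ on $\partial\Omega$. The strong form penalises the pointwise residual at collocation points $x_i$,&lt;/p&gt;
&lt;p&gt;$$\mathcal{L}_{\text{strong}}(\theta) = \frac{1}{N_r}\sum_{i=1}^{N_r}\left|\Delta u_\theta(x_i) + f(x_i)\right|^2,$$&lt;/p&gt;
&lt;p&gt;whereas the energy form minimises the associated variational functional,&lt;/p&gt;
&lt;p&gt;$$\mathcal{L}_{\text{energy}}(\theta) = \int_\Omega \left(\tfrac{1}{2}|\nabla u_\theta|^2 - f u_\theta\right) dx,$$&lt;/p&gt;
&lt;p&gt;which needs only first derivatives of $u_\theta$ but requires numerical quadrature and a trial space that satisfies (or penalises) the boundary condition. An inverse formulation typically keeps a residual loss of the first kind, adds a data-misfit term, and treats the unknown coefficients as additional trainable parameters.&lt;/p&gt;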
&lt;p&gt;Systematic benchmarks demonstrate that KINN significantly outperforms MLP-based PINNs in accuracy and convergence speed for multi-scale problems, stress concentration, singularities, nonlinear hyperelasticity, and heterogeneous materials. The one exception is problems on complex geometries, where MLPs remain competitive or better. Published in &lt;em&gt;Computer Methods in Applied Mechanics and Engineering&lt;/em&gt; (2024), KINN has become the canonical reference for subsequent KAN-PDE research.&lt;/p&gt;
&lt;h4 class="heading" id="chebyshev-and-polynomial-basis-pikans"&gt;
 Chebyshev and Polynomial Basis PIKANs&lt;span class="heading__anchor"&gt; &lt;a href="#chebyshev-and-polynomial-basis-pikans"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;A major architectural refinement has been substituting B-spline basis functions with &lt;strong&gt;orthogonal polynomial bases&lt;/strong&gt;. The &lt;strong&gt;ChebPIKAN&lt;/strong&gt; model leverages the orthogonality of Chebyshev polynomials and integrates physics-informed loss functions for fluid-mechanics problems, including the Allen-Cahn, Burgers, and Helmholtz equations as well as Kovasznay flow, cylinder wake flow, and cavity flow (Cui et al., 2024). ChebPIKAN significantly outperforms vanilla KAN by embedding essential physical information and alleviating overfitting.&lt;/p&gt;
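&lt;p&gt;A Chebyshev-basis edge function can be sketched as follows. This is a generic illustration, not the ChebPIKAN implementation: the function names are hypothetical, and squashing the input with &lt;code&gt;tanh&lt;/code&gt; so that it lies in $[-1,1]$ is an assumed normalisation.&lt;/p&gt;

```python
import math

def chebyshev_basis(x, degree):
    """Return [T_0(x), ..., T_degree(x)] via T_{k+1} = 2 x T_k - T_{k-1}."""
    values = [1.0, x]
    for _ in range(degree - 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: degree + 1]

def chebyshev_edge(x, coeffs):
    """Learnable edge function phi(x) = sum_k c_k T_k(tanh(x))."""
    z = math.tanh(x)  # assumption: squash the input into [-1, 1]
    basis = chebyshev_basis(z, len(coeffs) - 1)
    return sum(c * t for c, t in zip(coeffs, basis))

print(chebyshev_basis(0.5, 3))  # [1.0, 0.5, -0.5, -1.0]
```

&lt;p&gt;In a trained model the coefficients passed as &lt;code&gt;coeffs&lt;/code&gt; would be the learnable parameters of one edge.&lt;/p&gt;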
&lt;p&gt;The &lt;strong&gt;AC-PKAN&lt;/strong&gt; (Attention-Enhanced Chebyshev PKAN) further addresses the &lt;em&gt;rank collapse&lt;/em&gt; problem in Chebyshev-based KANs by integrating wavelet-activated MLPs with an internal attention mechanism, provably preserving a full-rank Jacobian and approximating PDEs of arbitrary order (arXiv:2505.08687). An external &lt;strong&gt;Residual Gradient Attention (RGA)&lt;/strong&gt; mechanism dynamically re-weights individual loss terms based on gradient norms, stabilising training of stiff PDE systems.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Legendre-KAN&lt;/strong&gt; method applies Legendre polynomial orthogonality to solve the fully nonlinear Monge-Ampère equation with Dirichlet boundary conditions, demonstrating effectiveness on both smooth and singular solutions across various dimensions and in the optimal transport problem.&lt;/p&gt;
&lt;h4 class="heading" id="hybrid-kanmlp-and-augmented-lagrangian-approaches"&gt;
 Hybrid KAN–MLP and Augmented Lagrangian Approaches&lt;span class="heading__anchor"&gt; &lt;a href="#hybrid-kanmlp-and-augmented-lagrangian-approaches"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;The &lt;strong&gt;AL-PKAN&lt;/strong&gt; introduces a hybrid encoder-decoder architecture where the decoder maps hidden variable features from high-dimensional latent space into trainable univariate activation functions via KAN (Zhang et al., 2025). An augmented Lagrangian function treats penalty factors and Lagrangian multipliers as learnable parameters to dynamically balance constraint terms. This approach typically improves prediction accuracy by &lt;strong&gt;one to two orders of magnitude&lt;/strong&gt; compared to traditional neural networks.&lt;/p&gt;
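&lt;p&gt;In generic form (the exact loss used in AL-PKAN may differ), an augmented Lagrangian for a PDE residual loss $\mathcal{L}_r(\theta)$ subject to constraints $c_j(\theta) = 0$, for instance boundary or initial conditions evaluated at sample points, reads&lt;/p&gt;
&lt;p&gt;$$\mathcal{L}_A(\theta, \lambda, \mu) = \mathcal{L}_r(\theta) + \sum_j \lambda_j c_j(\theta) + \frac{\mu}{2}\sum_j c_j(\theta)^2,$$&lt;/p&gt;
&lt;p&gt;with multipliers $\lambda_j$ and a positive penalty factor $\mu$. The classical method alternates minimisation over $\theta$ with the update $\lambda_j \leftarrow \lambda_j + \mu c_j(\theta)$; treating $\lambda$ and $\mu$ as learnable replaces that outer loop with gradient-based updates.&lt;/p&gt;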
&lt;p&gt;The &lt;strong&gt;HPKM-PINN&lt;/strong&gt; combines MLP and KAN branches with a trainable convex mixing parameter to blend features optimally across subdomains, especially effective for multi-scale problems.&lt;/p&gt;
&lt;h3 class="heading" id="2-spectral-basis-and-wavelet-enriched-kans"&gt;
 2. Spectral-Basis and Wavelet-Enriched KANs&lt;span class="heading__anchor"&gt; &lt;a href="#2-spectral-basis-and-wavelet-enriched-kans"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Wav-KAN&lt;/strong&gt; incorporates wavelet functions into the KAN structure, capturing both high-frequency and low-frequency components via continuous dyadic wavelet transforms for multiresolution analysis. This directly addresses the &lt;em&gt;spectral bias&lt;/em&gt; problem inherent in standard neural networks, which struggle to resolve high-frequency features in PDE solutions.&lt;/p&gt;
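&lt;p&gt;A wavelet edge function of this kind can be sketched with the Mexican-hat (Ricker) mother wavelet. The choice of wavelet and the parameter names &lt;code&gt;weight&lt;/code&gt;, &lt;code&gt;scale&lt;/code&gt;, and &lt;code&gt;shift&lt;/code&gt; are assumptions made for illustration rather than the Wav-KAN reference code.&lt;/p&gt;

```python
import math

def mexican_hat(t):
    """Mexican-hat (Ricker) mother wavelet, normalised to unit L2 norm."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-0.5 * t * t)

def wavelet_edge(x, weight, scale, shift):
    """Edge function weight * psi((x - shift) / scale)."""
    return weight * mexican_hat((x - shift) / scale)

# Small scales respond to fine detail, large scales to slowly varying trends.
for scale in (0.25, 1.0, 4.0):
    print(scale, wavelet_edge(0.5, 1.0, scale, 0.0))
```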
&lt;p&gt;PIKANs have been extended to &lt;strong&gt;multi-resolution spectral hybridisations (HWF-PIKAN)&lt;/strong&gt;, combining wavelet and Fourier features to explicitly counteract spectral bias and accelerate convergence for advection-dominated and kinetic equations.&lt;/p&gt;
&lt;p&gt;A unified benchmark published in February 2026 provides a &lt;strong&gt;systematic, controlled comparison between MLP-based PINNs and KAN-based PIKANs&lt;/strong&gt; across a representative collection of ODEs and PDEs (arXiv:2602.15068). The results show that PIKANs consistently achieve more accurate solutions, converge in fewer iterations, and yield superior gradient estimates.&lt;/p&gt;
&lt;h3 class="heading" id="3-kan-based-neural-operators"&gt;
 3. KAN-Based Neural Operators&lt;span class="heading__anchor"&gt; &lt;a href="#3-kan-based-neural-operators"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Neural operators learn mappings between infinite-dimensional function spaces, enabling generalisation across families of PDEs. KANs are increasingly embedded in operator architectures.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DeepOKAN&lt;/strong&gt; replaces MLP sub-networks in the Deep Operator Network (DeepONet) framework with KAN sub-networks using Gaussian Radial Basis Functions (Abueidda et al., 2024). The branch and trunk networks of DeepONet are re-implemented as RBF-KAN layers. Evaluated on 1D sinusoidal waves, 2D orthotropic elasticity, and transient Poisson problems, DeepOKAN consistently achieves lower training losses and more accurate predictions compared to standard DeepONet.&lt;/p&gt;
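&lt;p&gt;For orientation, the standard DeepONet ansatz evaluates the learned operator $G_\theta$ on an input function $u$, sampled at sensor points $x_1, \ldots, x_m$, and a query location $y$ as&lt;/p&gt;
&lt;p&gt;$$G_\theta(u)(y) = \sum_{k=1}^{p} b_k\big(u(x_1), \ldots, u(x_m)\big) t_k(y),$$&lt;/p&gt;
&lt;p&gt;where $b_k$ are the branch outputs and $t_k$ the trunk outputs. In an RBF-KAN layer each edge carries a univariate function of the form&lt;/p&gt;
&lt;p&gt;$$\phi(x) = \sum_{j} w_j \exp\left(-\frac{(x - c_j)^2}{2\sigma^2}\right),$$&lt;/p&gt;
&lt;p&gt;with trainable weights $w_j$. The symbols $c_j$ (centres) and $\sigma$ (width) are notation assumed here; the precise parameterisation in DeepOKAN may differ.&lt;/p&gt;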
&lt;p&gt;&lt;strong&gt;PO-CKAN&lt;/strong&gt; (Physics-informed Deep Operator KAN with Chunk Rational Structure) integrates PDE residual loss into a DeepONet-style branch–trunk architecture using Chunkwise Rational KAN sub-networks (arXiv:2510.08795). On Burgers&amp;rsquo; equation with viscosity $\nu = 0.01$, PO-CKAN reduces mean relative $L^2$ error by approximately &lt;strong&gt;48%&lt;/strong&gt; compared to PI-DeepONet.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KANO&lt;/strong&gt; (Kolmogorov-Arnold Neural Operator) is the most theoretically ambitious framework, jointly parameterising operators in both &lt;strong&gt;spectral and spatial bases&lt;/strong&gt; within a pseudo-differential operator framework (arXiv:2509.16825). KANO overcomes the pure-spectral bottleneck of Fourier Neural Operators (FNO): while FNO remains practical only for spectrally sparse operators, KANO remains expressive over generic variable-coefficient PDEs. Crucially, KANO achieves &lt;strong&gt;symbolic recovery of the learned operator&lt;/strong&gt;, enabling closed-form extraction of governing equations. On the quantum Hamiltonian learning benchmark, KANO attains state infidelity $\approx 6 \times 10^{-6}$ compared to FNO&amp;rsquo;s $\approx 1.5 \times 10^{-2}$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KAN-ONets&lt;/strong&gt; embeds adaptive, learnable B-spline activations from KAN into FNO (yielding FNO-KAN for uniform grids) and into the attention-based GNOT (yielding GNOT-KAN for arbitrary grids). Across seven challenging PDE benchmarks, KAN-ONets achieves &lt;strong&gt;MSE reductions of 10.2–30.2%&lt;/strong&gt; compared to existing models.&lt;/p&gt;
&lt;h3 class="heading" id="4-time-dependent-and-evolutionary-kans"&gt;
 4. Time-Dependent and Evolutionary KANs&lt;span class="heading__anchor"&gt; &lt;a href="#4-time-dependent-and-evolutionary-kans"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;EvoKAN&lt;/strong&gt; (Evolutionary Kolmogorov-Arnold Network, March 2025) introduces a novel paradigm: rather than retraining repeatedly, EvoKAN &lt;strong&gt;encodes only the PDE&amp;rsquo;s initial state&lt;/strong&gt; during an initial learning phase, then evolves the network parameters numerically, governed by the same PDE (arXiv:2503.01618). KAN weights are treated as time-dependent functions updated through time steps, enabling prediction over arbitrarily long time horizons.&lt;/p&gt;
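&lt;p&gt;One common way to realise this idea, following evolutional neural-network solvers in general (the exact update used by EvoKAN may differ), starts from a PDE $\partial_t u = \mathcal{N}[u]$ and a network $u_\theta$ with time-dependent parameters $\theta(t)$. The chain rule gives&lt;/p&gt;
&lt;p&gt;$$\frac{\partial u_\theta}{\partial t} = \sum_j \frac{\partial u_\theta}{\partial \theta_j}\frac{d\theta_j}{dt} = \mathcal{N}[u_\theta],$$&lt;/p&gt;
&lt;p&gt;so the parameter velocity can be obtained from a least-squares problem over collocation points $x_i$,&lt;/p&gt;
&lt;p&gt;$$\dot{\theta} = \arg\min_{\gamma} \sum_i \Big|\sum_j \frac{\partial u_\theta}{\partial \theta_j}(x_i)\gamma_j - \mathcal{N}[u_\theta](x_i)\Big|^2,$$&lt;/p&gt;
&lt;p&gt;and advanced with any time integrator, for example $\theta^{n+1} = \theta^n + \Delta t\,\dot{\theta}^n$. No retraining is needed after the initial state has been fitted.&lt;/p&gt;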
&lt;p&gt;EvoKAN integrates the &lt;strong&gt;Scalar Auxiliary Variable (SAV) method&lt;/strong&gt; to guarantee unconditional energy stability: at each time step, SAV requires only solving decoupled linear systems with constant coefficients. EvoKAN has been validated on the 1D and 2D Allen-Cahn equations (phase-field phenomena with sharp interfaces) and the 2D Navier-Stokes equations (turbulent flows), closely matching analytical references.&lt;/p&gt;
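&lt;p&gt;The energy-stability mechanism of SAV can be seen on a toy problem. The sketch below applies the standard first-order SAV scheme to the scalar gradient flow $u' = -(\lambda u + F'(u))$ with energy $E(u) = \tfrac{\lambda}{2}u^2 + F(u)$ and $F(u) = \tfrac{1}{4}(u^2-1)^2$. The values of &lt;code&gt;LAM&lt;/code&gt;, &lt;code&gt;C0&lt;/code&gt;, and the time step are illustrative assumptions, and EvoKAN applies SAV to the evolving network rather than to this scalar ODE.&lt;/p&gt;

```python
import math

LAM, C0 = 1.0, 1.0  # linear coefficient and SAV shift (illustrative values)

def F(u):
    """Nonlinear part of the energy."""
    return 0.25 * (u * u - 1.0) ** 2

def dF(u):
    """Derivative F'(u)."""
    return u * (u * u - 1.0)

def sav_step(u, r, dt):
    """One first-order SAV step with auxiliary variable r = sqrt(F(u) + C0)."""
    b = dF(u) / math.sqrt(F(u) + C0)
    # Eliminating r_new leaves a single linear equation for u_new whose
    # coefficient is positive for every dt, so no nonlinear solve is needed.
    denom = 1.0 + dt * LAM + 0.5 * dt * b * b
    u_new = (u * (1.0 + 0.5 * dt * b * b) - dt * r * b) / denom
    r_new = r + 0.5 * b * (u_new - u)
    return u_new, r_new

def modified_energy(u, r):
    """SAV energy: quadratic part plus r^2 (the constant C0 is dropped)."""
    return 0.5 * LAM * u * u + r * r

u, r = 2.0, math.sqrt(F(2.0) + C0)
energies = [modified_energy(u, r)]
for _ in range(20):
    u, r = sav_step(u, r, dt=10.0)  # deliberately large time step
    energies.append(modified_energy(u, r))
print(energies[0], energies[-1])
```

&lt;p&gt;Even with the large step, the modified energy does not increase from one step to the next, which is the sense in which the scheme is unconditionally energy stable; accuracy, by contrast, still depends on the step size.&lt;/p&gt;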
&lt;p&gt;&lt;strong&gt;KAN-ODEs&lt;/strong&gt; apply KANs as the backbone of neural ordinary differential equation (ODE) frameworks, enabling data-driven discovery of governing dynamics with greater interpretability compared to MLP-based neural ODEs (arXiv:2407.04192).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Shallow-KAN&lt;/strong&gt; addresses Stefan-type moving boundary problems (melting, solidification) by approximating the temperature distribution and moving interface while enforcing governing PDEs, phase equilibrium, and the Stefan condition through physics-informed residuals (arXiv:2601.09818). A key finding is that &lt;strong&gt;two hidden layers with tens of learnable parameters&lt;/strong&gt; suffice — far fewer than the nearly one million parameters required by standard MLP-based PINNs for the same problem.&lt;/p&gt;
&lt;h3 class="heading" id="5-discontinuities-shock-waves-and-turbulence"&gt;
 5. Discontinuities, Shock Waves, and Turbulence&lt;span class="heading__anchor"&gt; &lt;a href="#5-discontinuities-shock-waves-and-turbulence"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A known weakness of smooth neural networks is difficulty resolving &lt;strong&gt;sharp spatial transitions and discontinuities&lt;/strong&gt; such as shock waves. Two specialised frameworks address this:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DPINN&lt;/strong&gt; (Discontinuity-aware PINN) incorporates a discontinuity-aware KAN for modelling shock-wave properties, combined with an adaptive Fourier-feature embedding layer to mitigate spectral bias, mesh transformation for complex geometries, and learnable local artificial viscosity to stabilise the algorithm near discontinuities (arXiv:2507.08338). Numerical experiments on the inviscid Burgers&amp;rsquo; equation and transonic/supersonic airfoil flows demonstrate superior accuracy over existing methods.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;Physics-Infused KAN for Turbulence&lt;/strong&gt; (2026) targets turbulent flow prediction integrated with CFD, applying KAN within the Reynolds-Averaged Navier-Stokes (RANS) framework. It addresses the &lt;em&gt;information bottleneck&lt;/em&gt; phenomenon in multi-output KANs and proposes pruning-based network optimisation, achieving high prediction accuracy for Navier-Stokes equations.&lt;/p&gt;
&lt;h3 class="heading" id="6-high-dimensional-pdes-and-the-curse-of-dimensionality"&gt;
 6. High-Dimensional PDEs and the Curse of Dimensionality&lt;span class="heading__anchor"&gt; &lt;a href="#6-high-dimensional-pdes-and-the-curse-of-dimensionality"&gt;#&lt;/a&gt;&lt;/span&gt;
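&lt;/h3&gt;&lt;p&gt;High-dimensional PDEs (tens to hundreds of dimensions) are where conventional grid-based numerical methods become intractable, because their cost grows exponentially with dimension: a tensor grid with only 10 points per axis already has $10^{100}$ nodes in 100 dimensions. KANs have shown early promise here.&lt;/p&gt;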
&lt;p&gt;&lt;strong&gt;Anant-Net&lt;/strong&gt; (2025) is a scalable neural surrogate employing a tensor product formulation with dimension-wise sweeps and selective automatic differentiation (arXiv:2505.03595). Benchmarked on the Poisson, Sine-Gordon, Allen-Cahn, and transient heat equations, Anant-Net &lt;strong&gt;solves PDEs in up to 300 dimensions on a single GPU within a few hours&lt;/strong&gt;. The framework includes &lt;strong&gt;Anant-KAN&lt;/strong&gt;, an interpretable KAN-based variant offering deeper insights into the learned solution structure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Separable PIKANs (SPIKANs)&lt;/strong&gt; decompose the PDE solution into products of one-dimensional KAN networks, drastically reducing computational complexity for high-dimensional problems while retaining accuracy and interpretability.&lt;/p&gt;
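&lt;p&gt;The saving comes from evaluating one-dimensional factors once and combining them by products. The sketch below uses the exact rank-2 identity $\sin(x+y) = \sin x\cos y + \cos x\sin y$; in a SPIKAN-style model each factor would be a small one-dimensional KAN, so the closed-form functions here are stand-ins and all names are illustrative.&lt;/p&gt;

```python
import math

# One list of 1D factors per coordinate; entry r of each list forms term r.
factors_x = [math.sin, math.cos]
factors_y = [math.cos, math.sin]

def separable_eval(xs, ys):
    """Evaluate sum_r f_r(x) * g_r(y) on the full tensor grid."""
    fx = [[f(x) for x in xs] for f in factors_x]  # rank * len(xs) calls
    fy = [[g(y) for y in ys] for g in factors_y]  # rank * len(ys) calls
    rank = len(fx)
    return [[sum(fx[r][i] * fy[r][j] for r in range(rank))
             for j in range(len(ys))] for i in range(len(xs))]

xs = [0.1 * i for i in range(50)]
ys = [0.1 * j for j in range(50)]
grid = separable_eval(xs, ys)  # 2500 grid values from 200 factor evaluations
print(grid[3][7], math.sin(xs[3] + ys[7]))
```

&lt;p&gt;With $d$ coordinates and $N$ points per axis, the number of factor evaluations grows like $dN$ per term, while the grid itself has $N^d$ points.&lt;/p&gt;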
&lt;h3 class="heading" id="7-data-driven-discovery-and-inverse-problems"&gt;
 7. Data-Driven Discovery and Inverse Problems&lt;span class="heading__anchor"&gt; &lt;a href="#7-data-driven-discovery-and-inverse-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;KANs are especially powerful for &lt;strong&gt;scientific discovery tasks&lt;/strong&gt; where interpretability of the learned function is critical.&lt;/p&gt;
&lt;p&gt;Data-driven model discovery with KANs has been demonstrated on complex dynamical systems — including the Ikeda map and optical-cavity systems — where sparse optimisation methods fail due to non-sparse governing equations (arXiv:2409.15167). KAN captures complex behaviour while offering interpretability through its edge-wise univariate functions, providing insight into governing dynamics inaccessible in black-box MLPs.&lt;/p&gt;
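&lt;p&gt;The Ikeda map shows why sparse regression struggles here. In its common real two-variable form (assumed below; the parameter value &lt;code&gt;u=0.9&lt;/code&gt; and the function names are illustrative and may not match the cited paper), the update composes trigonometric functions with a rational function of the state, which has no short expansion in a polynomial library.&lt;/p&gt;

```python
import math

def ikeda_step(x, y, u=0.9):
    """One iteration of the Ikeda map in its common real form."""
    t = 0.4 - 6.0 / (1.0 + x * x + y * y)
    x_next = 1.0 + u * (x * math.cos(t) - y * math.sin(t))
    y_next = u * (x * math.sin(t) + y * math.cos(t))
    return x_next, y_next

def ikeda_orbit(x, y, steps, u=0.9):
    """Collect an orbit that could serve as training data for a model."""
    orbit = [(x, y)]
    for _ in range(steps):
        x, y = ikeda_step(x, y, u)
        orbit.append((x, y))
    return orbit

print(ikeda_orbit(0.0, 0.0, 3))
```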
&lt;p&gt;&lt;strong&gt;PI-KAN-PointNet&lt;/strong&gt; extends PIKAN to simultaneously solve inverse problems over multiple irregular geometries within a single training run, demonstrated on natural convection over 135 geometries with sparse data. &lt;strong&gt;KINN for Inverse Problems&lt;/strong&gt; enables identification of unknown material parameters in heterogeneous or hyperelastic materials from partial observations. &lt;strong&gt;KANHedge&lt;/strong&gt; applies KANs to high-dimensional BSDE solvers for option pricing, demonstrating improved hedging performance over MLP-based deep BSDE solvers (arXiv:2601.11097).&lt;/p&gt;
&lt;h3 class="heading" id="8-comparative-analysis-kan-vs-mlp-for-pdes"&gt;
 8. Comparative Analysis: KAN vs. MLP for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#8-comparative-analysis-kan-vs-mlp-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A comprehensive comparison between MLP and KAN representations for differential equations establishes nuanced findings (arXiv:2406.02917):&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Architecture&lt;/th&gt;
					&lt;th&gt;Shallow Networks&lt;/th&gt;
					&lt;th&gt;Deep Networks&lt;/th&gt;
					&lt;th&gt;Robustness&lt;/th&gt;
					&lt;th&gt;Interpretability&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;KAN (B-spline)&lt;/td&gt;
					&lt;td&gt;Superior accuracy&lt;/td&gt;
					&lt;td&gt;Comparable to MLP&lt;/td&gt;
					&lt;td&gt;Lower (may diverge with different seeds)&lt;/td&gt;
					&lt;td&gt;High — symbolic extraction possible&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;KAN (Chebyshev/Legendre)&lt;/td&gt;
					&lt;td&gt;High accuracy&lt;/td&gt;
					&lt;td&gt;Competitive&lt;/td&gt;
					&lt;td&gt;Moderate — rank collapse risk&lt;/td&gt;
					&lt;td&gt;High&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;MLP/PINN&lt;/td&gt;
					&lt;td&gt;Moderate accuracy&lt;/td&gt;
					&lt;td&gt;Robust&lt;/td&gt;
					&lt;td&gt;High&lt;/td&gt;
					&lt;td&gt;Low&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;PIKAN (optimised)&lt;/td&gt;
					&lt;td&gt;Superior&lt;/td&gt;
					&lt;td&gt;Superior or comparable&lt;/td&gt;
					&lt;td&gt;Moderate&lt;/td&gt;
					&lt;td&gt;High&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Key findings: KANs in &lt;strong&gt;shallow settings significantly outperform MLPs&lt;/strong&gt;, leveraging per-edge nonlinear expressiveness. In deep settings, KANs do not consistently outperform MLPs, but when properly optimised (e.g., with quasi-Newton optimisers such as L-BFGS or self-scaled Broyden methods), they can achieve superior accuracy. &lt;strong&gt;JAX-based PIKAN implementations&lt;/strong&gt; have reported training speedups of up to 84× over the original PyTorch-based KAN implementation.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="open-problems"&gt;
 Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Despite rapid progress, several challenges remain:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Computational cost.&lt;/strong&gt; B-spline activations are evaluated through a recursive basis construction (the Cox-de Boor recursion) on every edge, making KANs significantly slower per parameter than MLPs. Variants like PowerMLP propose more efficient formulations (arXiv:2412.13571), but a satisfactory solution to raw training speed at scale is still outstanding.&lt;/p&gt;
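&lt;p&gt;The source of the spline evaluation cost is visible in the Cox-de Boor recursion, the textbook definition of B-spline basis functions. The sketch below is a deliberately naive recursive version (production KAN code vectorises this, and the function name is illustrative): each basis function of degree $k$ is built from two of degree $k-1$.&lt;/p&gt;

```python
from bisect import bisect_right

def bspline_basis(i, k, x, knots):
    """Value of the i-th B-spline basis function of degree k at x."""
    if k == 0:
        # Indicator of the half-open knot interval [knots[i], knots[i + 1]).
        return 1.0 if bisect_right(knots, x) - 1 == i else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    right = 0.0
    if left_den != 0.0:
        left = (x - knots[i]) / left_den * bspline_basis(i, k - 1, x, knots)
    if right_den != 0.0:
        right = (knots[i + k + 1] - x) / right_den * bspline_basis(i + 1, k - 1, x, knots)
    return left + right

knots = [float(j) for j in range(8)]  # uniform knots 0, 1, ..., 7
values = [bspline_basis(i, 3, 3.5, knots) for i in range(4)]
print(values, sum(values))  # the cubic basis functions sum to 1 at x = 3.5
```

&lt;p&gt;A spline activation is a weighted sum of these basis values, so every edge repeats this work for every input, whereas an MLP applies one fixed nonlinearity per node.&lt;/p&gt;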
&lt;p&gt;&lt;strong&gt;Scalability to complex geometries.&lt;/strong&gt; KINN and standard PIKANs underperform MLPs on irregular geometry problems. This remains a practical bottleneck for engineering applications involving complex domains.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Gradient instability in deep KANs.&lt;/strong&gt; Deep PIKANs face vanishing/exploding gradient challenges, motivating Glorot-like initialisation strategies and residual-gated architectures.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Theoretical guarantees.&lt;/strong&gt; Generalisation bounds for KANs trained on PDE collocation have been studied — bounds scale with $\ell_1$ norms of spline coefficients — but practical understanding of how architecture choices affect convergence and generalisation remains incomplete (arXiv:2410.08026).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Operator learning completeness.&lt;/strong&gt; While KANO achieves symbolic operator recovery, the theoretical relationship between KAN architecture depth/width and approximation of PDE solution operators is still under active development.&lt;/p&gt;
&lt;p&gt;The trajectory is clear: KAN-based PDE solvers are moving from proof-of-concept demonstrations on canonical benchmarks toward &lt;strong&gt;production-ready frameworks&lt;/strong&gt; for engineering simulation, turbulence modelling, inverse problems, and high-dimensional scientific computing. The combination of interpretability, parameter efficiency, and growing theoretical foundations positions KANs as a genuinely transformative architecture for numerical PDEs.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Abueidda, D. W., Pantidis, P., &amp;amp; Mobasher, M. E. (2024). &lt;em&gt;DeepOKAN: Deep operator network based on Kolmogorov Arnold networks for mechanics problems&lt;/em&gt;. arXiv:2405.19143. &lt;a href="https://www.alphaxiv.org/overview/2405.19143v3"&gt;https://www.alphaxiv.org/overview/2405.19143v3&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Cui, Z., et al. (2024). Physics-informed Kolmogorov–Arnold network with Chebyshev polynomials for fluid mechanics. &lt;em&gt;Physics of Fluids, 37&lt;/em&gt;(9), 095120. &lt;a href="https://pubs.aip.org/aip/pof/article-abstract/37/9/095120/3361431"&gt;https://pubs.aip.org/aip/pof/article-abstract/37/9/095120/3361431&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Knottenbelt, W., et al. (2026). &lt;em&gt;KANHedge: Efficient hedging of high-dimensional options using Kolmogorov-Arnold network-based BSDE solver&lt;/em&gt;. arXiv:2601.11097. &lt;a href="https://arxiv.org/abs/2601.11097"&gt;https://arxiv.org/abs/2601.11097&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Kovachki, N., et al. (2023). Neural operator: Learning maps between function spaces with applications to PDEs. &lt;em&gt;Journal of Machine Learning Research, 24&lt;/em&gt;(89), 1–97.&lt;/p&gt;
&lt;p&gt;Li, Z., et al. (2025). &lt;em&gt;Discontinuity-aware KAN-based physics-informed neural networks&lt;/em&gt;. arXiv:2507.08338. &lt;a href="https://arxiv.org/html/2507.08338v1"&gt;https://arxiv.org/html/2507.08338v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Liu, Z., et al. (2024). &lt;em&gt;KAN: Kolmogorov–Arnold Networks&lt;/em&gt;. arXiv:2404.19756. &lt;a href="https://storage.prod.researchhub.com/uploads/papers/2024/05/04/2404.19756.pdf"&gt;https://storage.prod.researchhub.com/uploads/papers/2024/05/04/2404.19756.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Liu, Z., et al. (2026). &lt;em&gt;A unified benchmark of physics-informed neural networks and Kolmogorov-Arnold networks&lt;/em&gt;. arXiv:2602.15068. &lt;a href="https://arxiv.org/html/2602.15068v1"&gt;https://arxiv.org/html/2602.15068v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Peng, W., et al. (2025). &lt;em&gt;KANO: Kolmogorov-Arnold Neural Operator&lt;/em&gt;. arXiv:2509.16825. &lt;a href="https://arxiv.org/abs/2509.16825"&gt;https://arxiv.org/abs/2509.16825&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Shukla, K., Toscano, J. D., Wang, Z., Zou, Z., &amp;amp; Karniadakis, G. E. (2024). &lt;em&gt;A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks&lt;/em&gt;. arXiv:2406.02917. &lt;a href="https://arxiv.org/abs/2406.02917"&gt;https://arxiv.org/abs/2406.02917&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Shukla, K., et al. (2025). &lt;em&gt;Anant-Net: Breaking the curse of dimensionality with scalable and interpretable neural surrogates for high-dimensional PDEs&lt;/em&gt;. arXiv:2505.03595. &lt;a href="https://arxiv.org/html/2505.03595v3"&gt;https://arxiv.org/html/2505.03595v3&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Tang, K., et al. (2025). &lt;em&gt;AC-PKAN: Attention-enhanced and Chebyshev polynomial-based Kolmogorov-Arnold networks&lt;/em&gt;. arXiv:2505.08687. &lt;a href="https://arxiv.org/html/2505.08687v2"&gt;https://arxiv.org/html/2505.08687v2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Wang, Z., et al. (2025). &lt;em&gt;EvoKAN: Energy-dissipative evolutionary Kolmogorov-Arnold networks for complex PDE systems&lt;/em&gt;. arXiv:2503.01618. &lt;a href="https://arxiv.org/abs/2503.01618"&gt;https://arxiv.org/abs/2503.01618&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Wang, Z., et al. (2024). Kolmogorov–Arnold-Informed neural network: A physics-informed deep learning framework for solving forward and inverse problems based on Kolmogorov–Arnold Networks. &lt;em&gt;Computer Methods in Applied Mechanics and Engineering&lt;/em&gt;. arXiv:2406.11045. &lt;a href="https://www.sciencedirect.com/science/article/abs/pii/S0045782524007722"&gt;https://www.sciencedirect.com/science/article/abs/pii/S0045782524007722&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xu, Y., et al. (2026). &lt;em&gt;Shallow-KAN based solution of moving boundary PDEs&lt;/em&gt;. arXiv:2601.09818. &lt;a href="https://arxiv.org/html/2601.09818v1"&gt;https://arxiv.org/html/2601.09818v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Yang, L., et al. (2025). &lt;em&gt;KAN-ODEs: Kolmogorov-Arnold network ordinary differential equations for learning dynamical systems and hidden physics&lt;/em&gt;. arXiv:2407.04192. &lt;a href="https://arxiv.org/html/2407.04192v1"&gt;https://arxiv.org/html/2407.04192v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zhang, Z., et al. (2025). Physics-informed neural networks with hybrid Kolmogorov-Arnold networks. &lt;em&gt;PMC&lt;/em&gt;. &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC11950322/"&gt;https://pmc.ncbi.nlm.nih.gov/articles/PMC11950322/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zuo, Q., et al. (2025). &lt;em&gt;Data-driven model discovery with Kolmogorov-Arnold networks&lt;/em&gt;. arXiv:2409.15167. &lt;a href="https://arxiv.org/abs/2409.15167"&gt;https://arxiv.org/abs/2409.15167&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Recent Advances in Numerical PDEs</title><link>https://blog.namln.org/en/posts/recent-numerical-pde/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/recent-numerical-pde/</guid><description>&lt;p&gt;Numerical methods for partial differential equations (PDEs) have entered a period of rapid transformation, driven by two converging forces: deep learning&amp;rsquo;s maturation as a tool for high-dimensional function approximation, and the resurgence of classical methods augmented by machine learning. The field broadly divides into &lt;em&gt;physics-informed machine learning&lt;/em&gt;, &lt;em&gt;neural operator learning&lt;/em&gt;, &lt;em&gt;foundation models for PDEs&lt;/em&gt;, and the continuing evolution of &lt;em&gt;classical high-order&lt;/em&gt;, &lt;em&gt;structure-preserving&lt;/em&gt;, and &lt;em&gt;data-driven discovery&lt;/em&gt; methods. Quantum computing and laser-based hardware solvers are also beginning to enter the landscape. This survey organises the most active research fronts, highlights landmark and recent key papers, and identifies open problems as of early 2026.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="overview"&gt;
 Overview&lt;span class="heading__anchor"&gt; &lt;a href="#overview"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The table below summarises the major approaches covered in this survey, their representative key papers, and their current status.&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Approach&lt;/th&gt;
					&lt;th&gt;Representative Key Papers&lt;/th&gt;
					&lt;th&gt;Status&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;PINNs (adaptive/staged training)&lt;/td&gt;
					&lt;td&gt;Raissi et al. (2019); IEEE 2025 staged training; PhysicsNeMo/Modulus&lt;/td&gt;
					&lt;td&gt;Production-ready&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;KANs for PDEs&lt;/td&gt;
					&lt;td&gt;Liu et al. (2024, ICLR 2025); KINN; PI-KAN; HRKANs&lt;/td&gt;
					&lt;td&gt;Active frontier&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Fourier Neural Operators&lt;/td&gt;
					&lt;td&gt;Li et al. (2020); O-FNO (2025); ReBA accelerator&lt;/td&gt;
					&lt;td&gt;Widely adopted&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;DeepONet variants&lt;/td&gt;
					&lt;td&gt;Lu et al. (2019); L-DeepONet; Hybrid KAN-DeepONet; Quantum DeepONet&lt;/td&gt;
					&lt;td&gt;Mature + expanding&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;PDE Foundation Models&lt;/td&gt;
					&lt;td&gt;Poseidon; OmniArch; PDEformer; Geo-NeW&lt;/td&gt;
					&lt;td&gt;Emerging (2024–2026)&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Deep BSDE &amp;amp; high-dimensional&lt;/td&gt;
					&lt;td&gt;Han, Jentzen, &amp;amp; E (PNAS 2018); Deep Shotgun; DRDM; Heun-BSDE&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Data-driven PDE discovery&lt;/td&gt;
					&lt;td&gt;SINDy (Brunton et al.); GN-SINDy; Evo-SINDy; Bayesian-SINDy&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Structure-preserving methods&lt;/td&gt;
					&lt;td&gt;Hairer et al. (2006); Stochastic multisymplectic; Geo-NeW&lt;/td&gt;
					&lt;td&gt;Maturing&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;High-order FEM/DG&lt;/td&gt;
					&lt;td&gt;hp-DGFEM Boltzmann; ML-accelerated FEM; FEX-PG&lt;/td&gt;
					&lt;td&gt;Mature + augmented&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Fractional PDEs&lt;/td&gt;
					&lt;td&gt;Review (2024); O-FNO for fractional Poisson; Fractional Laplacian meshfree&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Hamilton–Jacobi PDEs&lt;/td&gt;
					&lt;td&gt;Review arXiv:2502.20833; Actor-critic NN; Deep BSDE for HJB&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Multiscale / ROM&lt;/td&gt;
					&lt;td&gt;MLP-based multiscale; POD-DL-ROM; Multi-fidelity ROM&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Uncertainty quantification&lt;/td&gt;
					&lt;td&gt;QMC/RQMC; PDE-DKL&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Quantum computing&lt;/td&gt;
					&lt;td&gt;Schrödingerisation; H-DES (ColibriTD); Quantum DeepONet&lt;/td&gt;
					&lt;td&gt;Early-stage&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Photonic/analog solvers&lt;/td&gt;
					&lt;td&gt;LightSolver LPU&lt;/td&gt;
					&lt;td&gt;Very early-stage&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="the-classical-pde-problem"&gt;
 The Classical PDE Problem&lt;span class="heading__anchor"&gt; &lt;a href="#the-classical-pde-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A general PDE on a domain $\Omega \subseteq \mathbb{R}^d$ takes the form&lt;/p&gt;
&lt;p&gt;$$\mathcal{N} [u] (x) = f(x), \quad x \in \Omega, \qquad \mathcal{B} [u] (x) = g(x), \quad x \in \partial \Omega,$$&lt;/p&gt;
&lt;p&gt;where $\mathcal{N}$ is a (possibly nonlinear) differential operator, $\mathcal{B}$ encodes boundary or initial conditions, and $u: \Omega \to \mathbb{R}$ is the unknown. Classical mesh-based methods, namely finite element (FEM), finite difference (FDM), finite volume (FVM), and spectral methods, discretise $\Omega$ into $N$ degrees of freedom and solve the resulting algebraic system. Their complexity typically scales as $O(N^\alpha)$ for some $\alpha \geq 1$, and in $d$ dimensions $N \sim h^{-d}$ for mesh spacing $h$, so the cost grows exponentially with $d$ (the curse of dimensionality).&lt;/p&gt;
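&lt;p&gt;As a rough worked example of this scaling (the spacing $h = 10^{-2}$ and the dimensions below are illustrative choices, not figures from any cited paper), a uniform grid on the unit cube has&lt;/p&gt;
&lt;p&gt;$$N \approx h^{-d} = 10^{2d}: \qquad d = 1 \Rightarrow N \approx 10^{2}, \quad d = 3 \Rightarrow N \approx 10^{6}, \quad d = 10 \Rightarrow N \approx 10^{20}.$$&lt;/p&gt;
&lt;p&gt;Even a solver with optimal $O(N)$ cost is therefore out of reach at $d = 10$, which is why the high-dimensional methods discussed later in this survey avoid grids altogether.&lt;/p&gt;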
&lt;h3 class="heading" id="the-deep-learning-turn"&gt;
 The Deep Learning Turn&lt;span class="heading__anchor"&gt; &lt;a href="#the-deep-learning-turn"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The 2019 PINN paper by Raissi, Perdikaris, and Karniadakis, and the 2020 FNO paper by Li et al., triggered an explosion of mesh-free and operator-learning approaches. Rather than discretising $\Omega$, these methods parameterise $u$ (or the solution operator $\mathcal{N}^{-1}$) as a neural network and minimise a physics-informed or data-driven loss. The key advantages are mesh-free flexibility, natural handling of inverse problems, and — in the operator-learning setting — the ability to generalise across PDE instances.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="recent-developments"&gt;
 Recent Developments&lt;span class="heading__anchor"&gt; &lt;a href="#recent-developments"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-physics-informed-neural-networks-pinns-and-variants"&gt;
 1. Physics-Informed Neural Networks (PINNs) and Variants&lt;span class="heading__anchor"&gt; &lt;a href="#1-physics-informed-neural-networks-pinns-and-variants"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;PINNs, introduced by Raissi, Perdikaris, and Karniadakis (2019), embed physical laws directly into the neural network loss function as residual terms of the form $\mathcal{L}_{\text{phys}} = \|\mathcal{N}[\hat{u}] - f\|^2$, where $\hat{u}$ is the network approximation of $u$, supplemented by data, boundary, and initial condition constraints. Their appeal lies in a mesh-free design that handles irregular geometries and inverse problems naturally. Yet PINN training is notoriously fragile (subject to spectral bias, loss imbalance, and stiffness), which has motivated a rich line of training improvements.&lt;/p&gt;
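&lt;p&gt;In the notation of the Background section, one common way to write the full training objective is sketched below. The collocation points $x_i^r \in \Omega$ and $x_j^b \in \partial\Omega$, the counts $N_r$ and $N_b$, and the weights $\lambda_r$ and $\lambda_b$ are notation introduced here for illustration; individual papers differ in how they sample and weight the terms.&lt;/p&gt;
&lt;p&gt;$$\mathcal{L}(\theta) = \frac{\lambda_r}{N_r} \sum_{i=1}^{N_r} \left| \mathcal{N}[\hat{u}_\theta](x_i^r) - f(x_i^r) \right|^2 + \frac{\lambda_b}{N_b} \sum_{j=1}^{N_b} \left| \mathcal{B}[\hat{u}_\theta](x_j^b) - g(x_j^b) \right|^2.$$&lt;/p&gt;
&lt;p&gt;Loss imbalance then corresponds to the two sums having very different magnitudes or gradient scales, which is what adaptive choices of $\lambda_r$ and $\lambda_b$ try to correct.&lt;/p&gt;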
&lt;p&gt;&lt;strong&gt;Staged training strategies.&lt;/strong&gt; A 2025 IEEE paper proposes a two-stage process: a short-time pretraining phase followed by extension to the full time domain, combined with uncertainty-guided sampling. This significantly improves accuracy and efficiency for time-dependent PDEs compared to standard PINNs (IEEE, 2025).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Evolutionary optimisation of PINNs.&lt;/strong&gt; A 2025 arXiv paper uses evolutionary optimisation to tune PINN architectures, improving robustness when data are scarce because the physics terms in the training loss enforce compliance with the governing laws (arXiv:2501.06572).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Automatic structure discovery via knowledge distillation.&lt;/strong&gt; A 2025 &lt;em&gt;Nature Communications&lt;/em&gt; paper proposes a physics-informed distillation framework that decouples physical and parameter regularisation in teacher–student networks, then uses clustering and parameter reconstruction to embed physically meaningful structures. Experiments on Laplace, Burgers, Poisson, and fluid mechanics equations show improved accuracy, training efficiency, and transferability (arXiv:2502.06026).&lt;/p&gt;
&lt;p&gt;Production-ready frameworks include &lt;em&gt;PhysicsNeMo/Modulus&lt;/em&gt; (CUDA-optimised kernels with reported speedups of about 4×) and &lt;em&gt;DeepXDE&lt;/em&gt;. Between them they support adaptive loss weighting, curriculum learning, adaptive residual-point sampling, and domain decomposition for stiff problems.&lt;/p&gt;
&lt;h3 class="heading" id="2-kolmogorovarnold-networks-kans-for-pdes"&gt;
 2. Kolmogorov–Arnold Networks (KANs) for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#2-kolmogorovarnold-networks-kans-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Proposed by Liu, Wang, Vaidya et al. (2024, accepted ICLR 2025), &lt;strong&gt;KANs&lt;/strong&gt; replace fixed activation functions at MLP nodes with learnable spline-parameterised functions on each edge. This change — inspired by the Kolmogorov-Arnold representation theorem — provides faster neural scaling laws, improved interpretability, and comparable or better accuracy with far fewer parameters, especially for scientific AI tasks. The major PINN-KAN hybrid architectures are as follows:&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Architecture&lt;/th&gt;
					&lt;th&gt;PDE focus&lt;/th&gt;
					&lt;th&gt;Key claim&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;KINN&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Solid mechanics, multi-scale, singularities&lt;/td&gt;
					&lt;td&gt;Significantly outperforms MLP-PINNs in accuracy and convergence speed&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;PI-KAN&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Navier–Stokes (forward)&lt;/td&gt;
					&lt;td&gt;High prediction accuracy; addresses information bottleneck&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;HRKANs&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Poisson, Burgers&lt;/td&gt;
					&lt;td&gt;Highest fitting accuracy, lowest training time vs. KAN and ReLU-KAN&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;PIKANs&lt;/strong&gt; (adaptive grid)&lt;/td&gt;
					&lt;td&gt;Forward PDE problems&lt;/td&gt;
					&lt;td&gt;Up to 84× faster training; adaptive state transition reduces $L^2$ error by 43%&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;EvoKAN&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Complex PDE systems&lt;/td&gt;
					&lt;td&gt;Energy-dissipative; encodes only the initial state, avoiding retraining&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;KAN-ODEs&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Schrödinger, Allen–Cahn, dynamical systems&lt;/td&gt;
					&lt;td&gt;Improved performance over Neural ODEs in discovering hidden physics&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
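&lt;p&gt;For reference, the representation theorem and the layer structure described before the table can be written as follows. The symbols $\Phi_q$, $\phi_{q,p}$, and $\phi_{l,j,i}$ follow notation commonly used in the KAN literature and are not defined elsewhere in this survey.&lt;/p&gt;
&lt;p&gt;$$f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q \left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right), \qquad x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}).$$&lt;/p&gt;
&lt;p&gt;The first identity is the Kolmogorov-Arnold representation of a continuous $f$ on $[0,1]^n$ by univariate functions. The second is a single KAN layer of width $n_l$: each edge $(i, j)$ carries its own learnable univariate function $\phi_{l,j,i}$ (a spline in the original design), and each node simply sums its inputs.&lt;/p&gt;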
&lt;p&gt;KANs are also being used inside &lt;strong&gt;DeepONet branch/trunk networks&lt;/strong&gt; for hybrid neural operator surrogates in porous media flows, including Darcy flow and 2D/3D multiphase problems (arXiv:2511.02962). For a deeper treatment of KAN architectures for PDEs, see the companion post in this series.&lt;/p&gt;
&lt;h3 class="heading" id="3-neural-operator-learning"&gt;
 3. Neural Operator Learning&lt;span class="heading__anchor"&gt; &lt;a href="#3-neural-operator-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Neural operators learn mappings between infinite-dimensional function spaces — enabling resolution-invariant, discretisation-agnostic PDE solvers. The two dominant architectures are the &lt;strong&gt;Fourier Neural Operator (FNO)&lt;/strong&gt; and &lt;strong&gt;Deep Operator Networks (DeepONet)&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;FNO&lt;/strong&gt; applies global convolution in Fourier space, giving resolution invariance and fast inference. The 2025 &lt;em&gt;Optimised FNO (O-FNO)&lt;/em&gt; integrates residual connections and enhanced spectral resolution for the 2D fractional Poisson equation, achieving over 98% test accuracy and outperforming both base FNO and DeepONet. A hardware/algorithm co-design chip, &lt;strong&gt;ReBA&lt;/strong&gt;, implements the Galerkin Transformer achieving 34.57× speedup over CPUs and up to 51.26× over prior accelerators (IEEE, 2025).&lt;/p&gt;
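&lt;p&gt;The global convolution in Fourier space can be illustrated with a minimal single-channel, one-dimensional sketch in NumPy. It is not the reference FNO implementation: a full layer also mixes channels, adds a pointwise linear path, and applies a nonlinearity. The function name &lt;code&gt;spectral_conv_1d&lt;/code&gt;, the helper &lt;code&gt;apply_on_grid&lt;/code&gt;, and the fixed &lt;code&gt;weights&lt;/code&gt; are assumptions made for illustration; in a trained model the weights would be learned.&lt;/p&gt;

```python
import numpy as np

def spectral_conv_1d(v, weights):
    """One Fourier-space convolution of a real, periodic, uniformly sampled signal.

    `weights` is a complex vector holding one multiplier per retained
    low-frequency mode (learned in an FNO; fixed by hand in this sketch).
    """
    n = v.shape[0]
    n_modes = weights.shape[0]          # must not exceed n // 2 + 1
    v_hat = np.fft.rfft(v)              # forward FFT of the real input
    out_hat = np.zeros_like(v_hat)      # modes beyond n_modes are truncated
    out_hat[:n_modes] = weights * v_hat[:n_modes]
    return np.fft.irfft(out_hat, n=n)   # back to physical space

# Hypothetical weights: keep the mean, double mode 1, drop mode 2, halve mode 3.
weights = np.array([1.0, 2.0, 0.0, 0.5], dtype=complex)

def apply_on_grid(n):
    """Sample a test signal on n points and apply the same spectral weights."""
    x = np.arange(n) / n
    v = np.sin(2 * np.pi * x) + 0.5 * np.cos(6 * np.pi * x)
    return x, spectral_conv_1d(v, weights)

# The same weights are applied on a coarse and on a fine grid.
x_coarse, out_coarse = apply_on_grid(64)
x_fine, out_fine = apply_on_grid(256)
```

&lt;p&gt;Because the weights are indexed by Fourier mode rather than by grid point, the two outputs sample the same underlying function, which is the sense in which the layer is resolution-invariant.&lt;/p&gt;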
&lt;p&gt;&lt;strong&gt;DeepONet&amp;rsquo;s&lt;/strong&gt; branch-trunk architecture is often more robust than FNO to noisy data and complex geometries. Recent extensions include multi-fidelity physics-guided DeepONet (2025), Fusion DeepONet for hypersonic flow predictions on arbitrary grids (arXiv:2501.01934), and &lt;strong&gt;Latent-space DeepONet (L-DeepONet)&lt;/strong&gt; (&lt;em&gt;Nature Communications&lt;/em&gt;, 2024), which is reported to outperform the neural operators it was compared against while using small latent dimensions ($d \leq 100$), enabling real-time high-dimensional predictions. Ensemble and Mixture-of-Experts DeepONets are reported to achieve 2–4× lower relative $\ell_2$ errors through basis enrichment and spatial locality (arXiv:2405.11907). &lt;strong&gt;Taylor Mode Neural Operators&lt;/strong&gt; use Taylor-mode automatic differentiation to compute high-order derivatives, giving an order-of-magnitude speed-up for DeepONet and 8× for FNO.&lt;/p&gt;
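&lt;p&gt;The branch-trunk structure can be summarised in one formula. Here $u$ is the input function sampled at fixed sensor locations $x_1, \dots, x_m$, $y$ is the query point, and $p$ is the number of basis terms; $b_k$, $t_k$, and $b_0$ are the usual symbols from the DeepONet literature and are introduced here only for illustration.&lt;/p&gt;
&lt;p&gt;$$G_\theta(u)(y) = \sum_{k=1}^{p} b_k\big(u(x_1), \dots, u(x_m)\big) t_k(y) + b_0.$$&lt;/p&gt;
&lt;p&gt;The branch network outputs the coefficients $b_k$ and the trunk network outputs the basis functions $t_k$, so the query point $y$ can be chosen freely at inference time.&lt;/p&gt;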
&lt;p&gt;&lt;strong&gt;Graph Neural Operator Methods.&lt;/strong&gt; The &lt;strong&gt;GOLA framework&lt;/strong&gt; (2025) addresses the limitation of regular-grid assumptions by constructing graphs from irregularly sampled spatial points with a Fourier-based encoder for learnable complex-coefficient embeddings, outperforming baselines in data-scarce regimes across 2D Darcy, Advection, Eikonal, and Nonlinear Diffusion problems (arXiv:2505.18923).&lt;/p&gt;
&lt;h3 class="heading" id="4-foundation-models-for-pdes"&gt;
 4. Foundation Models for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#4-foundation-models-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Inspired by the success of LLMs, PDE foundation models represent a paradigm shift: large transformers pre-trained on diverse physical systems that can be fine-tuned for downstream tasks with minimal data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Poseidon&lt;/strong&gt; (ETH Zurich, 2024) is a multiscale operator transformer with time-conditioned layer norms, enabling continuous-in-time evaluation. Pre-trained on diverse physical systems, it exploits the semigroup property of time-dependent PDEs for significant data scaling (arXiv:2405.19101).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OmniArch&lt;/strong&gt; (ICML 2025) is described by its authors as the first multi-scale and multi-physics foundation model for scientific computing. It features a Fourier encoder-decoder and transformer backbone with a &lt;em&gt;PDE-Aligner&lt;/em&gt; for physics-informed fine-tuning, achieves unified 1D-2D-3D pre-training on PDEBench, and demonstrates zero-shot learning on new physics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PDEformer&lt;/strong&gt; (2024) represents PDEs as computational graphs integrating symbolic and numerical information; a graph transformer with implicit neural representation enables mesh-free predictions with zero-shot accuracy comparable to specialist models (arXiv:2402.12652).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Multimodal PDE Foundation Model&lt;/strong&gt; (UCLA, 2025) integrates both numerical inputs (equation parameters, initial conditions) and text descriptions. It achieves average relative error below 3.3% in-distribution and generates interpretable scientific text — bridging NLP and scientific computing (arXiv:2502.06026).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Physics-informed fine-tuning&lt;/strong&gt; (arXiv:2603.15431, 2026) establishes that hybrid fine-tuning (combining physics-informed and data-driven objectives) achieves superior extrapolation to downstream tasks and enables data-free learning of unseen PDE families.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Geo-NeW&lt;/strong&gt; (arXiv:2602.02788, Feb 2026) — General-Geometry Neural Whitney Forms — is a data-driven finite element method jointly learning differential operators and compatible finite element spaces on the geometry. It exactly preserves physical conservation laws via Finite Element Exterior Calculus, with state-of-the-art performance on out-of-distribution geometries.&lt;/p&gt;
&lt;h3 class="heading" id="5-deep-learning-for-high-dimensional-pdes"&gt;
 5. Deep Learning for High-Dimensional PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#5-deep-learning-for-high-dimensional-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Classical mesh-based methods suffer exponential complexity growth in dimension $d$. Three principal deep learning paradigms address this.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Deep BSDE method&lt;/strong&gt; (Han, Jentzen, &amp;amp; E, &lt;em&gt;PNAS&lt;/em&gt;, 2018) reformulates semilinear parabolic PDEs using backward stochastic differential equations (BSDEs) and learns the gradient of the solution with neural networks, enabling solution of PDEs in hundreds to thousands of dimensions. A 2025 review by the original authors traces subsequent advances. Key recent improvements include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Deep Shotgun Method&lt;/strong&gt; (&lt;em&gt;J. Sci. Comput.&lt;/em&gt;, 2025): avoids full trajectory simulation, using only data distribution, achieving results up to dimension 10,000 (Springer, 2025).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;XNet-enhanced Deep BSDE&lt;/strong&gt; (2025): a new network architecture with fewer parameters, significantly improving computational efficiency and accuracy (arXiv:2502.06238).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deep Random Difference Method (DRDM)&lt;/strong&gt; (2025): approximates the convection-diffusion operator using only first-order differences, avoiding Hessian computations, with proved first-order accuracy in time step $h$ (arXiv:2506.20308).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stratonovich-based BSDE with Heun integration&lt;/strong&gt; (2025): identifies that Euler-Maruyama discretisation bias is the root cause of BSDE underperformance relative to PINNs; Heun integration eliminates this bias and achieves competitive results across high-dimensional benchmarks (arXiv:2505.01078).&lt;/li&gt;
&lt;/ul&gt;
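&lt;p&gt;For orientation, the reformulation underlying the Deep BSDE method can be sketched as follows, in the standard notation of the BSDE literature. The drift $\mu$, diffusion $\sigma$, Brownian motion $W_t$, processes $X_t$, $Y_t$, $Z_t$, nonlinearity $f$, terminal condition $g$, and time grid $t_0, \dots, t_N = T$ are introduced here for illustration and differ from the symbols of the Background section. For the terminal-value problem&lt;/p&gt;
&lt;p&gt;$$\partial_t u + \tfrac{1}{2} \mathrm{Tr}\left(\sigma\sigma^{\top} \nabla^2 u\right) + \mu \cdot \nabla u + f\left(t, x, u, \sigma^{\top}\nabla u\right) = 0, \qquad u(T, x) = g(x),$$&lt;/p&gt;
&lt;p&gt;with forward process $dX_t = \mu dt + \sigma dW_t$, the pair $Y_t = u(t, X_t)$, $Z_t = \sigma^{\top}\nabla u(t, X_t)$ satisfies a BSDE whose Euler-Maruyama discretisation reads&lt;/p&gt;
&lt;p&gt;$$Y_{t_{n+1}} \approx Y_{t_n} - f\left(t_n, X_{t_n}, Y_{t_n}, Z_{t_n}\right)\Delta t_n + Z_{t_n}^{\top} \Delta W_n, \qquad \text{loss} = \mathbb{E}\left| g(X_{t_N}) - Y_{t_N} \right|^2.$$&lt;/p&gt;
&lt;p&gt;Neural networks parameterise $Y_{t_0}$ and each $Z_{t_n}$, and training minimises the mismatch at the terminal time. The Euler-Maruyama step in the second display is the kind of discretisation that the Heun-based variant in the list above replaces.&lt;/p&gt;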
&lt;p&gt;The &lt;strong&gt;Deep Ritz method&lt;/strong&gt; (E &amp;amp; Yu, 2018) minimises energy functionals using neural networks. Extensions to multiscale problems leverage scale convergence theory to derive $\Gamma$-limits of oscillatory energy functionals.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Full History Recursive Multilevel Picard (MLP)&lt;/strong&gt; methodology — combining Picard iterations with multilevel Monte Carlo — was the first method proven to overcome the curse of dimensionality for semilinear parabolic PDEs and remains one of very few methods with such proven guarantees.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PDE-DKL&lt;/strong&gt; (2025) combines deep learning for low-dimensional latent representations with Gaussian Processes for kernel regression under explicit PDE constraints, providing both high accuracy and principled uncertainty quantification in limited-data regimes (arXiv:2501.18258).&lt;/p&gt;
&lt;h3 class="heading" id="6-classical-high-order-methods-fem-dg-and-spectral"&gt;
 6. Classical High-Order Methods: FEM, DG, and Spectral&lt;span class="heading__anchor"&gt; &lt;a href="#6-classical-high-order-methods-fem-dg-and-spectral"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Despite the deep learning surge, classical methods continue to mature, particularly in rigorous error analysis and efficiency.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;hp-version DG finite element method for the Boltzmann transport problem&lt;/strong&gt; (&lt;em&gt;J. Sci. Comput.&lt;/em&gt;, 2024) achieves arbitrary-order convergence rates and handles polytopic elements, enabling efficient parallel implementation within existing multigroup discrete ordinates software. High-order DG methods for unsteady compressible flows — targeting acoustic waves, turbulence, and magnetohydrodynamics — benefit from block-diagonal mass matrices allowing efficient explicit time-stepping.&lt;/p&gt;
&lt;p&gt;A systematic 2024 approach uses neural networks to learn the element-wise solution map of PDEs, accelerating finite element-type methods in an &amp;ldquo;element neural network&amp;rdquo; paradigm that generalises across element geometries. Machine learning-based spectral methods combine orthogonal function expansions (Fourier, Legendre) with deep neural operator learning for highly accurate solutions with fewer grid points.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;FEX-PG&lt;/strong&gt; (2024) solves high-dimensional partial integro-differential equations using parameter grouping to reduce coefficient count and Taylor series approximation for integral terms, achieving relative errors on the order of single-precision machine epsilon while providing &lt;em&gt;interpretable, explicit&lt;/em&gt; solution formulas absent from most DL methods (arXiv:2410.00835).&lt;/p&gt;
&lt;h3 class="heading" id="7-structure-preserving-numerical-methods"&gt;
 7. Structure-Preserving Numerical Methods&lt;span class="heading__anchor"&gt; &lt;a href="#7-structure-preserving-numerical-methods"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Structure-preserving methods retain intrinsic properties of the continuous system — symplecticity, energy conservation, divergence-free constraints — at the discrete level. They enhance numerical stability and long-term accuracy, ensuring computed solutions respect the underlying mathematical structure.&lt;/p&gt;
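&lt;p&gt;To illustrate what preserving structure at the discrete level buys, the sketch below compares explicit Euler with symplectic Euler on the harmonic oscillator with Hamiltonian $H = (q^2 + p^2)/2$. The test problem, the step size, and all function names are illustrative assumptions and are not taken from any of the cited papers.&lt;/p&gt;

```python
def explicit_euler(q, p, dt):
    """Non-symplectic step: both updates use the old state."""
    return q + dt * p, p - dt * q

def symplectic_euler(q, p, dt):
    """Symplectic step: update p first, then update q with the new p."""
    p_new = p - dt * q
    q_new = q + dt * p_new
    return q_new, p_new

def energy(q, p):
    """Hamiltonian of the unit harmonic oscillator."""
    return 0.5 * (q * q + p * p)

def final_energy(step, n_steps=10_000, dt=0.01):
    """Integrate from (q, p) = (1, 0) and return the energy at the end."""
    q, p = 1.0, 0.0
    for _ in range(n_steps):
        q, p = step(q, p, dt)
    return energy(q, p)

# The exact energy is 0.5 for all time.
e_explicit = final_energy(explicit_euler)
e_symplectic = final_energy(symplectic_euler)
```

&lt;p&gt;Explicit Euler multiplies the energy by $1 + \Delta t^2$ at every step, so it drifts without bound, whereas the symplectic variant keeps the energy within a small band around $0.5$ over the whole run. Both schemes are first order; the difference lies entirely in the preserved structure.&lt;/p&gt;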
&lt;p&gt;Recent research encompasses geometric integrators and mimetic discretisations for conservative finite element, difference, and volume schemes; stochastic multisymplectic PDEs and their structure-preserving discretisations (&lt;em&gt;Studies in Applied Mathematics&lt;/em&gt;, 2025); and structure-preserving learning via the Geo-NeW model, which exactly preserves physical conservation laws through Finite Element Exterior Calculus. A 2024 University of Maryland workshop identified integration of structure-preserving methods with uncertainty quantification as a key open problem.&lt;/p&gt;
&lt;h3 class="heading" id="8-data-driven-pde-discovery"&gt;
 8. Data-Driven PDE Discovery&lt;span class="heading__anchor"&gt; &lt;a href="#8-data-driven-pde-discovery"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;SINDy&lt;/strong&gt; and its extensions use sparse regression over a dictionary of candidate functions. &lt;strong&gt;GN-SINDy&lt;/strong&gt; (2024–2026) addresses high dimensionality and large datasets by combining Q-DEIM greedy sampling, differentiable surrogate modelling, and sparse regression, showing robustness on Burgers, Allen-Cahn, and KdV equations. &lt;strong&gt;Evo-SINDy&lt;/strong&gt; (ACM, 2025) uses multi-population co-evolutionary algorithms for universal PDE identification. &lt;strong&gt;Bayesian-SINDy&lt;/strong&gt; quantifies parameter uncertainty robustly (arXiv:2402.15357).&lt;/p&gt;
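&lt;p&gt;The sparse-regression step at the core of SINDy can be sketched with sequentially thresholded least squares on a toy problem. This is a minimal NumPy illustration rather than the implementation of any SINDy variant named above; the synthetic equation, the noise level, the threshold, and the function name &lt;code&gt;stlsq&lt;/code&gt; are assumptions chosen for the example.&lt;/p&gt;

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares over a candidate library theta."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(xi) >= threshold      # terms that survive this round
        xi[~keep] = 0.0                     # prune small coefficients
        if not keep.any():
            break
        # Refit using only the surviving library terms.
        xi[keep] = np.linalg.lstsq(theta[:, keep], dxdt, rcond=None)[0]
    return xi

# Synthetic data from dx/dt = -2 x + 0.5 x^3 with small noise (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 200)
dxdt = -2.0 * x + 0.5 * x**3 + 0.01 * rng.standard_normal(x.shape)

# Candidate library with columns 1, x, x^2, x^3.
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
xi = stlsq(theta, dxdt)
```

&lt;p&gt;The returned vector &lt;code&gt;xi&lt;/code&gt; holds one coefficient per library column, with the constant and quadratic terms pruned to exactly zero. For PDE discovery the library would contain spatial derivatives such as $u_x$, $u u_x$, and $u_{xx}$ instead of monomials.&lt;/p&gt;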
&lt;p&gt;On the neural-symbolic front, &lt;strong&gt;Mechanistic PDE Networks&lt;/strong&gt; (arXiv:2502.18377, 2025) represent spatiotemporal data as space-time dependent linear PDEs within neural network hidden representations, then solve and decode for specific tasks. &lt;strong&gt;MORL4PDEs&lt;/strong&gt; (&lt;em&gt;Chaos Solitons Fractals&lt;/em&gt;, 2024) uses reinforcement learning and genetic algorithms for symbolic PDE regression without pre-specified candidate libraries. The &lt;strong&gt;Physics-Informed Information Criterion (PIC)&lt;/strong&gt; (&lt;em&gt;Research&lt;/em&gt;, 2022) selects the most appropriate PDE from candidates by incorporating symmetry constraints.&lt;/p&gt;
&lt;h3 class="heading" id="9-hamiltonjacobi-pdes"&gt;
 9. Hamilton–Jacobi PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#9-hamiltonjacobi-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Hamilton–Jacobi (HJ) PDEs govern optimal control, level-set methods, and front propagation. A comprehensive 2025 review (arXiv:2502.20833) covers grid-based methods, representation formula methods, Monte Carlo via Laplace&amp;rsquo;s method, and deep learning approaches. Key deep learning advances include actor-critic neural network frameworks for static HJ equations (convergence analysed in 2024), and variational methods that solve HJ PDEs up to 100 dimensions with relative errors of 1–5%. Deep BSDE methods naturally apply to Hamilton-Jacobi-Bellman (HJB) equations arising in stochastic optimal control.&lt;/p&gt;
&lt;h3 class="heading" id="10-fractional-and-non-local-pdes"&gt;
 10. Fractional and Non-Local PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#10-fractional-and-non-local-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Fractional-order derivatives model anomalous diffusion, viscoelastic behaviour, and memory effects that integer-order PDEs cannot capture. Recent advances include semi-analytical methods (Adomian Decomposition, Variational Iteration) applied to 3D time-fractional diffusion, telegraph, and wave equations; a 2024 comprehensive review of fractional stochastic PDEs covering the latest numerical methods and practical implementations; the Optimised FNO (O-FNO, 2025) achieving 98%+ test accuracy for fractional Poisson equations; and a 2025 meshfree finite difference scheme for the fractional Laplacian on arbitrary bounded domains.&lt;/p&gt;
&lt;h3 class="heading" id="11-multiscale-methods-and-model-order-reduction"&gt;
 11. Multiscale Methods and Model Order Reduction&lt;span class="heading__anchor"&gt; &lt;a href="#11-multiscale-methods-and-model-order-reduction"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The 2024 &lt;em&gt;Numerical Multiscale Methods&lt;/em&gt; dissertation establishes an equivalence between time averaging and space homogenisation, and extends Deep Ritz to multiscale problems via scale convergence theory. &lt;strong&gt;Multi-fidelity reduced order models&lt;/strong&gt; for PDE-constrained optimisation (arXiv:2503.21252, 2025) use a hierarchical trust region algorithm with active learning, constructing a full/reduced/ML model hierarchy on-the-fly. &lt;strong&gt;POD-DL-ROMs&lt;/strong&gt; (Politecnico di Milano, 2024) combine proper orthogonal decomposition with autoencoder architectures for nonlinear parametric PDEs, providing a mathematically rigorous framework enhancing accuracy of reduced models.&lt;/p&gt;
&lt;h3 class="heading" id="12-uncertainty-quantification-and-stochastic-pdes"&gt;
 12. Uncertainty Quantification and Stochastic PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#12-uncertainty-quantification-and-stochastic-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Quasi-Monte Carlo (QMC) methods&lt;/strong&gt; converge faster than Monte Carlo for sufficiently smooth integrands, at rates close to $O(n^{-1})$ rather than $O(n^{-1/2})$ in the number of sample points $n$. A 2024 paper studies QMC with generalised Gaussian random variables and Gevrey-regular inputs, relaxing the standard uniform-boundedness assumption, and jointly analyses the dimension-truncation, FEM, and QMC errors for randomly shifted rank-1 lattice rules (arXiv:2411.03793). &lt;strong&gt;Randomised QMC (RQMC)&lt;/strong&gt; with scrambled Sobol&amp;rsquo; sequences achieves smaller bias and RMSE than Monte Carlo for risk-averse optimisation (arXiv:2408.02842). A 2024 ICERM semester at Brown University (&amp;ldquo;Numerical PDEs: Analysis, Algorithms, and Data Challenges&amp;rdquo;) served as a major gathering point for researchers integrating uncertainty quantification with PDE methods.&lt;/p&gt;
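&lt;p&gt;A randomly shifted rank-1 lattice rule is simple enough to sketch directly. The example below uses a two-dimensional Fibonacci lattice and a smooth test integrand; both are illustrative assumptions, whereas the cited analyses concern high-dimensional PDE inputs with specially constructed generating vectors. The function names are hypothetical.&lt;/p&gt;

```python
import numpy as np

def shifted_lattice_points(n, z, rng):
    """Rank-1 lattice points frac(i * z / n + shift) with one uniform random shift."""
    i = np.arange(n).reshape(-1, 1)
    shift = rng.random(len(z))
    return (i * np.asarray(z) / n + shift) % 1.0

def integrand(points):
    """Smooth test function exp(x + y) on the unit square."""
    return np.exp(points.sum(axis=1))

rng = np.random.default_rng(0)
n, z = 987, (1, 610)           # Fibonacci lattice: n = F_16, z = (1, F_15)
exact = (np.e - 1.0) ** 2      # exact value of the integral

qmc_estimate = integrand(shifted_lattice_points(n, z, rng)).mean()
mc_estimate = integrand(rng.random((n, 2))).mean()

qmc_error = abs(qmc_estimate - exact)
mc_error = abs(mc_estimate - exact)
```

&lt;p&gt;Printing &lt;code&gt;qmc_error&lt;/code&gt; and &lt;code&gt;mc_error&lt;/code&gt; for a few seeds gives a feel for the gap; on smooth integrands like this one the lattice estimate is typically much closer to the exact value at the same number of points. Averaging over several independent shifts also yields a practical error estimate.&lt;/p&gt;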
&lt;h3 class="heading" id="13-quantum-and-photonic-computing-for-pdes"&gt;
 13. Quantum and Photonic Computing for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#13-quantum-and-photonic-computing-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Schrödingerisation&lt;/strong&gt; techniques convert general linear PDEs into Schrödinger-type equations via the &amp;ldquo;warped phase transformation,&amp;rdquo; enabling direct quantum Hamiltonian simulation. A 2024 paper in &lt;em&gt;Quantum&lt;/em&gt; provides explicit quantum circuit implementations for the heat and advection equations, with a complexity analysis indicating a quantum advantage in high dimensions. &lt;strong&gt;ColibriTD&amp;rsquo;s H-DES&lt;/strong&gt; (March 2025) was reported as the first real-hardware solution of a PDE via a variational quantum algorithm, executing on IBM&amp;rsquo;s 156-qubit Heron R2 processor for the inviscid Burgers&amp;rsquo; equation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LightSolver&amp;rsquo;s Laser Processing Unit (LPU)&lt;/strong&gt; (announced September 2025) is reported to map and solve PDEs directly, with constant-time iteration steps independent of problem size. The company claims up to 100× speed gains over GPU solvers and has announced a partnership with Ansys for engineering integration.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="open-problems"&gt;
 Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;PINN training stability.&lt;/strong&gt; Despite many improvements, PINN training remains fragile for stiff and multi-scale problems. A general theory of loss landscape conditioning and principled hyperparameter selection is lacking.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Neural operator generalisation theory.&lt;/strong&gt; While FNO and DeepONet generalise empirically across PDE instances, rigorous approximation-theoretic guarantees relating operator-learning error to network width, depth, and training data remain incomplete.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Foundation model reliability and extrapolation.&lt;/strong&gt; PDE foundation models show impressive zero-shot accuracy within their pre-training distribution, but their failure modes on out-of-distribution physics — and the extent to which physics-informed fine-tuning can compensate — are not yet well understood.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;High-dimensional solvers beyond parabolic PDEs.&lt;/strong&gt; The Deep BSDE method and the multilevel Picard (MLP) method primarily address semilinear parabolic PDEs. Extending them, together with the curse-of-dimensionality guarantees proven for MLP, to elliptic, hyperbolic, or fully nonlinear PDEs remains largely open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Structure-preserving deep learning.&lt;/strong&gt; Integrating conservation laws and geometric structure (symplecticity, divergence-free constraints) into neural PDE solvers at scale — beyond the Geo-NeW approach for specific exterior calculus structures — is an active and unresolved challenge.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quantum hardware advantage.&lt;/strong&gt; Near-term quantum devices face noise and connectivity limitations that restrict their practical advantage over classical HPC for PDE solving. Demonstrating genuine quantum speedup for industrially relevant PDEs on real hardware remains an open goal.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Brunton, S. L., Proctor, J. L., &amp;amp; Kutz, J. N. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems. &lt;em&gt;PNAS, 113&lt;/em&gt;(15), 3932–3937.&lt;/p&gt;
&lt;p&gt;ColibriTD. (2025, March). &lt;em&gt;H-DES: First real-hardware PDE solver via variational quantum algorithm&lt;/em&gt;. The Quantum Insider. &lt;a href="https://thequantuminsider.com/2025/03/25/colibritd-announces-h-des-pde-solver-as-a-step-toward-accessible-quantum-simulation-in-engineering/"&gt;https://thequantuminsider.com/2025/03/25/colibritd-announces-h-des-pde-solver-as-a-step-toward-accessible-quantum-simulation-in-engineering/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;E, W., &amp;amp; Yu, B. (2018). The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. &lt;em&gt;Communications in Mathematics and Statistics, 6&lt;/em&gt;(1), 1–12.&lt;/p&gt;
&lt;p&gt;E, W., Han, J., &amp;amp; Jentzen, A. (2022). Algorithms for solving high dimensional PDEs: From nonlinear Monte Carlo to machine learning. &lt;em&gt;Nonlinearity, 35&lt;/em&gt;(1), 278.&lt;/p&gt;
&lt;p&gt;Han, J., Jentzen, A., &amp;amp; E, W. (2018). Solving high-dimensional partial differential equations using deep learning. &lt;em&gt;PNAS, 115&lt;/em&gt;(34), 8505–8510. &lt;a href="https://www.pnas.org/doi/10.1073/pnas.1718942115"&gt;https://www.pnas.org/doi/10.1073/pnas.1718942115&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Han, J., Jentzen, A., &amp;amp; E, W. (2025). &lt;em&gt;A brief review of the Deep BSDE method for solving high-dimensional partial differential equations&lt;/em&gt;. arXiv:2505.17032. &lt;a href="https://arxiv.org/abs/2505.17032"&gt;https://arxiv.org/abs/2505.17032&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hu, J., Jin, S., Liu, N., &amp;amp; Zhang, L. (2024). Quantum circuits for partial differential equations via Schrödingerisation. &lt;em&gt;Quantum, 8&lt;/em&gt;, 1563. &lt;a href="https://quantum-journal.org/papers/q-2024-12-12-1563/"&gt;https://quantum-journal.org/papers/q-2024-12-12-1563/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;IEEE. (2025). &lt;em&gt;A staged training approach for physics-informed neural networks in solving partial differential equations&lt;/em&gt;. &lt;a href="https://ieeexplore.ieee.org/document/11172661/"&gt;https://ieeexplore.ieee.org/document/11172661/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;IEEE. (2025). &lt;em&gt;Higher-order-ReLU-KANs (HRKANs) for solving physics-informed neural networks more accurately, robustly and faster&lt;/em&gt;. &lt;a href="https://ieeexplore.ieee.org/document/11105234/"&gt;https://ieeexplore.ieee.org/document/11105234/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;IEEE. (2025). &lt;em&gt;ReBA: A hybrid sparse reconfigurable butterfly accelerator for solving PDEs via hardware and algorithm co-design&lt;/em&gt;. &lt;a href="https://ieeexplore.ieee.org/document/11044078/"&gt;https://ieeexplore.ieee.org/document/11044078/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;IEEE. (2025). &lt;em&gt;An optimized Fourier neural operator for the 2D fractional Poisson equation&lt;/em&gt;. &lt;a href="https://ieeexplore.ieee.org/document/11405135/"&gt;https://ieeexplore.ieee.org/document/11405135/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Li, Z., et al. (2020). &lt;em&gt;Fourier neural operator for parametric partial differential equations&lt;/em&gt;. arXiv:2010.08895.&lt;/p&gt;
&lt;p&gt;LightSolver. (2025, September). &lt;em&gt;LightSolver announces advance in physical modeling on the LPU&lt;/em&gt;. The Quantum Insider. &lt;a href="https://thequantuminsider.com/2025/09/16/lightsolver-announces-advance-in-physical-modeling-on-the-lpu-and-new-roadmap-for-optical-analog-pde-solving/"&gt;https://thequantuminsider.com/2025/09/16/lightsolver-announces-advance-in-physical-modeling-on-the-lpu-and-new-roadmap-for-optical-analog-pde-solving/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Liu, Z., et al. (2024). &lt;em&gt;KAN: Kolmogorov-Arnold Networks&lt;/em&gt;. arXiv:2404.19756. ICLR 2025. &lt;a href="https://arxiv.org/abs/2404.19756"&gt;https://arxiv.org/abs/2404.19756&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Lu, L., Jin, P., Pang, G., Zhang, Z., &amp;amp; Karniadakis, G. E. (2021). Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. &lt;em&gt;Nature Machine Intelligence, 3&lt;/em&gt;, 218–229.&lt;/p&gt;
&lt;p&gt;Lu, L., et al. (2024). Learning nonlinear operators in latent spaces for real-time predictions of complex dynamics in physical systems. &lt;em&gt;Nature Communications&lt;/em&gt;. &lt;a href="https://www.nature.com/articles/s41467-024-49411-w"&gt;https://www.nature.com/articles/s41467-024-49411-w&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;McCabe, M., et al. (2025). &lt;em&gt;Poseidon: Efficient foundation models for PDEs&lt;/em&gt;. arXiv:2405.19101. &lt;a href="https://arxiv.org/html/2405.19101v2"&gt;https://arxiv.org/html/2405.19101v2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Peng, W., et al. (2025). &lt;em&gt;OmniArch: Building foundation model for scientific computing&lt;/em&gt;. ICML 2025. &lt;a href="https://icml.cc/virtual/2025/poster/45099"&gt;https://icml.cc/virtual/2025/poster/45099&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Raissi, M., Perdikaris, P., &amp;amp; Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. &lt;em&gt;Journal of Computational Physics, 378&lt;/em&gt;, 686–707.&lt;/p&gt;
&lt;p&gt;Shi, Z., et al. (2025). &lt;em&gt;Physics-informed fine-tuning of foundation models for partial differential equations&lt;/em&gt;. arXiv:2603.15431. &lt;a href="https://arxiv.org/html/2603.15431v1"&gt;https://arxiv.org/html/2603.15431v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Wang, S., et al. (2025). &lt;em&gt;Geo-NeW: Structure-preserving learning improves geometry generalization in PDEs&lt;/em&gt;. arXiv:2602.02788. &lt;a href="https://arxiv.org/abs/2602.02788"&gt;https://arxiv.org/abs/2602.02788&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Wang, Z., et al. (2024). Kolmogorov–Arnold-Informed neural network: A physics-informed deep learning framework for solving forward and inverse problems. &lt;em&gt;Computer Methods in Applied Mechanics and Engineering&lt;/em&gt;. &lt;a href="https://linkinghub.elsevier.com/retrieve/pii/S0045782524007722"&gt;https://linkinghub.elsevier.com/retrieve/pii/S0045782524007722&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xiao, P., et al. (2025). Quantum DeepONet: Neural operators accelerated by quantum computing. &lt;em&gt;Quantum, 9&lt;/em&gt;, 1761. &lt;a href="https://quantum-journal.org/papers/q-2025-06-04-1761/"&gt;https://quantum-journal.org/papers/q-2025-06-04-1761/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xie, Z., et al. (2025). &lt;em&gt;Anant-Net: Breaking the curse of dimensionality with scalable and interpretable neural surrogates&lt;/em&gt;. arXiv:2505.03595. &lt;a href="https://arxiv.org/html/2505.03595v3"&gt;https://arxiv.org/html/2505.03595v3&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xie, Z., et al. (2025). &lt;em&gt;A deep shotgun method for solving high-dimensional parabolic partial differential equations&lt;/em&gt;. &lt;em&gt;Journal of Scientific Computing&lt;/em&gt;. &lt;a href="https://link.springer.com/10.1007/s10915-025-02983-1"&gt;https://link.springer.com/10.1007/s10915-025-02983-1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xu, K., &amp;amp; Darve, E. (2025). &lt;em&gt;Integration matters for learning PDEs with backwards SDEs&lt;/em&gt;. arXiv:2505.01078. &lt;a href="https://arxiv.org/abs/2505.01078"&gt;https://arxiv.org/abs/2505.01078&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zeng, Q., et al. (2025). Automatic network structure discovery of physics informed neural networks via knowledge distillation. &lt;em&gt;Nature Communications&lt;/em&gt;. &lt;a href="https://www.nature.com/articles/s41467-025-64624-3"&gt;https://www.nature.com/articles/s41467-025-64624-3&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zhang, Y., et al. (2024). &lt;em&gt;PDEformer: Towards a foundation model for one-dimensional partial differential equations&lt;/em&gt;. arXiv:2402.12652. &lt;a href="http://arxiv.org/pdf/2402.12652.pdf"&gt;http://arxiv.org/pdf/2402.12652.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zhang, Y., et al. (2025). &lt;em&gt;A multimodal PDE foundation model for prediction and scientific text descriptions&lt;/em&gt;. arXiv:2502.06026. &lt;a href="https://arxiv.org/abs/2502.06026"&gt;https://arxiv.org/abs/2502.06026&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Recent Advances in Steady States of Navier-Stokes Equations</title><link>https://blog.namln.org/en/posts/ss-nse/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/ss-nse/</guid><description>&lt;p&gt;The study of steady-state and self-similar solutions of the incompressible Navier-Stokes equations (NSE) has undergone remarkable progress in the 2020s. This post surveys landmark results from 2024–2026 touching on existence, uniqueness, classification, and stability of such solutions. The stationary (steady) NSE in $\mathbb{R}^3$ reads:&lt;/p&gt;
&lt;p&gt;$$-\nu \Delta u + (u \cdot \nabla) u + \nabla p = 0, \quad \operatorname{div} u = 0.$$&lt;/p&gt;
&lt;p&gt;A central object of the self-similar theory is the class of &lt;strong&gt;$(-1)$-homogeneous&lt;/strong&gt; (scale-invariant) fields: a function $u$ is $(-1)$-homogeneous if $u(\lambda x) = \lambda^{-1} u(x)$ for all $\lambda &amp;gt; 0$. Such fields play two roles: when they solve the stationary NSE they are scale-invariant steady solutions, and they are the natural initial data for forward self-similar solutions $u(x,t) = t^{-1/2} U(x/\sqrt{t})$ of the time-dependent NSE, whose profile $U$ is in general not itself homogeneous.&lt;/p&gt;
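&lt;p&gt;A short computation shows where the exponent $-1$ comes from. It is sketched here in notation that is not taken from the papers surveyed: $u_\lambda$ and $p_\lambda$ are labels introduced for illustration. If $(u, p)$ solves the time-dependent NSE, then so does the rescaled pair&lt;/p&gt;
&lt;p&gt;$$u_\lambda(x,t) = \lambda u(\lambda x, \lambda^2 t), \qquad p_\lambda(x,t) = \lambda^2 p(\lambda x, \lambda^2 t), \qquad \lambda &amp;gt; 0,$$&lt;/p&gt;
&lt;p&gt;because each of $\partial_t u_\lambda$, $(u_\lambda \cdot \nabla) u_\lambda$, $\nu \Delta u_\lambda$, and $\nabla p_\lambda$ equals $\lambda^3$ times the corresponding term of the original equation evaluated at $(\lambda x, \lambda^2 t)$. A time-independent $u$ satisfies $u_\lambda = u$ for every $\lambda$ exactly when $\lambda u(\lambda x) = u(x)$, which is the $(-1)$-homogeneity condition.&lt;/p&gt;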
&lt;hr&gt;
&lt;h2 class="heading" id="overview"&gt;
 Overview&lt;span class="heading__anchor"&gt; &lt;a href="#overview"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Five landmark results define the frontier of this area in 2024–2026:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Non-uniqueness of Leray–Hopf solutions&lt;/strong&gt; via a computer-assisted proof in the self-similar framework (Hou, Wang, &amp;amp; Yang, 2025).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Forward self-similar solutions in 2D&lt;/strong&gt; for arbitrarily large initial data (Albritton, Guillod, Korobkov, &amp;amp; Ren, 2026).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Existence of self-similar solutions in high dimensions&lt;/strong&gt; ($4 \leq n \leq 16$) without smallness conditions (Bang, Gui, Liu, Wang, &amp;amp; Xie, 2025).&lt;/li&gt;
&lt;li&gt;Sharp &lt;strong&gt;removable singularity results&lt;/strong&gt; for $(-1)$-homogeneous solutions with singular rays (Li, Li, &amp;amp; Yan, 2024).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Steady NSE in junction domains&lt;/strong&gt; without smallness assumptions on the fluxes (Gazzola, Korobkov, Ren, &amp;amp; Sperone, 2025).&lt;/li&gt;
&lt;/ol&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Paper&lt;/th&gt;
					&lt;th&gt;Authors&lt;/th&gt;
					&lt;th&gt;Contribution&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2410.11170&lt;/td&gt;
					&lt;td&gt;Li, Li, Yan&lt;/td&gt;
					&lt;td&gt;Optimal removable singularity for $(-1)$-homogeneous solutions&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2412.07283&lt;/td&gt;
					&lt;td&gt;Bang, Gui, Liu, Wang, Xie&lt;/td&gt;
					&lt;td&gt;Self-similar solutions in 2D sector: existence/non-uniqueness&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2505.14642&lt;/td&gt;
					&lt;td&gt;Gazzola, Korobkov, Ren, Sperone&lt;/td&gt;
					&lt;td&gt;Steady NSE in junction channels, non-small fluxes&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2509.25116&lt;/td&gt;
					&lt;td&gt;Hou, Wang, Yang&lt;/td&gt;
					&lt;td&gt;&lt;strong&gt;First rigorous non-uniqueness of Leray–Hopf solutions (unforced 3D NSE)&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2510.10488&lt;/td&gt;
					&lt;td&gt;Bang, Gui, Liu, Wang, Xie&lt;/td&gt;
					&lt;td&gt;$(-1)$-homogeneous solutions, dimensions $4 \leq n \leq 16$&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2601.03161&lt;/td&gt;
					&lt;td&gt;Albritton, Guillod, Korobkov, Ren&lt;/td&gt;
					&lt;td&gt;Forward self-similar solutions, 2D, large data&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2601.03833&lt;/td&gt;
					&lt;td&gt;Gui, Liu, Xie&lt;/td&gt;
					&lt;td&gt;Global existence of 2D forward self-similar solutions&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2602.19846&lt;/td&gt;
					&lt;td&gt;Fujii&lt;/td&gt;
					&lt;td&gt;Sharp uniqueness/non-uniqueness in critical Besov spaces&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="landau-solutions-and-šveráks-classification"&gt;
 Landau Solutions and Šverák&amp;rsquo;s Classification&lt;span class="heading__anchor"&gt; &lt;a href="#landau-solutions-and-%c5%a1ver%c3%a1ks-classification"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;In 1944, Landau discovered a three-parameter explicit family of $(-1)$-homogeneous axisymmetric no-swirl solutions of the 3D stationary NSE. Known as &lt;strong&gt;Landau solutions&lt;/strong&gt;, they are parameterized by vectors $b \in \mathbb{R}^3$ and represent fluid jets emanating from the origin. A seminal result of Šverák (2006) established that all $(-1)$-homogeneous solutions smooth on $\mathbb{S}^2$ must be Landau solutions — the only scale-invariant flows without singularities on the sphere.&lt;/p&gt;
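&lt;p&gt;For concreteness, here is the form in which Landau solutions are usually quoted in textbooks, offered as a sketch rather than as the notation of the papers discussed here. It uses spherical coordinates $(r, \theta, \varphi)$ about the jet axis and a scalar parameter $A &amp;gt; 1$; the symbol $A$ is an assumption of this sketch, and its correspondence with the vector $b$ above is not spelled out here.&lt;/p&gt;
&lt;p&gt;$$u_r = \frac{2\nu}{r}\left[\frac{A^2 - 1}{(A - \cos\theta)^2} - 1\right], \qquad u_\theta = -\frac{2\nu \sin\theta}{r (A - \cos\theta)}, \qquad u_\varphi = 0.$$&lt;/p&gt;
&lt;p&gt;Both nonzero components scale like $1/r$, so the field is $(-1)$-homogeneous, and $u_\varphi = 0$ expresses the absence of swirl. Incompressibility can be checked directly:&lt;/p&gt;
&lt;p&gt;$$\nabla \cdot u = \frac{1}{r^2}\partial_r (r^2 u_r) + \frac{1}{r \sin\theta}\partial_\theta(\sin\theta \, u_\theta) = \frac{2\nu}{r^2 (A - \cos\theta)^2}\Big[(A^2 - 1) - (A - \cos\theta)^2 - 2\cos\theta (A - \cos\theta) + \sin^2\theta\Big] = 0,$$&lt;/p&gt;
&lt;p&gt;since the bracket expands to $\cos^2\theta + \sin^2\theta - 1$.&lt;/p&gt;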
&lt;h3 class="heading" id="forward-self-similar-solutions"&gt;
 Forward Self-Similar Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#forward-self-similar-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A &lt;strong&gt;forward self-similar solution&lt;/strong&gt; takes the form&lt;/p&gt;
&lt;p&gt;$$u(x, t) = \frac{1}{\sqrt{t}} U\!\left(\frac{x}{\sqrt{t}}\right),$$&lt;/p&gt;
&lt;p&gt;where the self-similar profile $U$ solves the stationary scaled NSE. Jia and Šverák (2014) showed that for any divergence-free $(-1)$-homogeneous initial data smooth away from the origin, at least one global forward self-similar solution exists, with &lt;strong&gt;no smallness restriction&lt;/strong&gt; on the data. Because large data rule out a contraction-mapping argument, existence is obtained instead from Leray–Schauder degree theory combined with a priori estimates.&lt;/p&gt;
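&lt;p&gt;To make the phrase &amp;ldquo;stationary scaled NSE&amp;rdquo; concrete, one can substitute the ansatz into the time-dependent equations. The following is a sketch under the assumption that the pressure takes the matching form $p(x,t) = t^{-1} P(x/\sqrt{t})$; the symbols $P$ and $y = x/\sqrt{t}$ are introduced here for illustration. Every term then carries the common factor $t^{-3/2}$, for instance $\partial_t u = -\tfrac12 t^{-3/2}\big[U + (y \cdot \nabla) U\big]$, and dividing it out leaves&lt;/p&gt;
&lt;p&gt;$$-\nu \Delta U - \tfrac12 U - \tfrac12 (y \cdot \nabla) U + (U \cdot \nabla) U + \nabla P = 0, \qquad \nabla \cdot U = 0.$$&lt;/p&gt;
&lt;p&gt;The initial condition becomes a condition at infinity: $u(x,t) \to u_0(x)$ as $t \to 0^+$ requires $U(y)$ to approach the $(-1)$-homogeneous field $u_0(y)$ as $|y| \to \infty$, because $t^{-1/2} u_0(x/\sqrt{t}) = u_0(x)$.&lt;/p&gt;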
&lt;p&gt;&lt;strong&gt;Discretely self-similar&lt;/strong&gt; (DSS) solutions, where $u(\lambda x, \lambda^2 t) = \lambda^{-1} u(x,t)$ for a specific $\lambda &amp;gt; 1$, were constructed for large data by Tsai (2014).&lt;/p&gt;
&lt;h3 class="heading" id="classification-of--1-homogeneous-solutions"&gt;
 Classification of $(-1)$-Homogeneous Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#classification-of--1-homogeneous-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Tian and Xin (1998) proved that all $(-1)$-homogeneous axisymmetric solutions with exactly one singularity must be Landau solutions. A key series of papers by Li, Li, and Yan (2016–2023) classified all $(-1)$-homogeneous axisymmetric no-swirl solutions with singularities at both the north and south poles of $\mathbb{S}^2$, parameterizing them as a four-dimensional surface with boundary. They also constructed the first &lt;strong&gt;non-axisymmetric&lt;/strong&gt; $(-1)$-homogeneous solutions with swirl using the Weierstrass representation of minimal surfaces.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="recent-developments"&gt;
 Recent Developments&lt;span class="heading__anchor"&gt; &lt;a href="#recent-developments"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-removable-singularity-theorem-li-li--yan-2024"&gt;
 1. Removable Singularity Theorem (Li, Li, &amp;amp; Yan, 2024)&lt;span class="heading__anchor"&gt; &lt;a href="#1-removable-singularity-theorem-li-li--yan-2024"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;One of the sharpest results of 2024 is the &lt;strong&gt;removable singularity theorem&lt;/strong&gt; proved by Li, Li, and Yan (arXiv:2410.11170, to appear in &lt;em&gt;Trans. Amer. Math. Soc.&lt;/em&gt;): any local $(-1)$-homogeneous solution $u$ near a potential singular ray through $P \in \mathbb{S}^2$ extends smoothly across $P$, &lt;strong&gt;provided&lt;/strong&gt; $u = o(\ln \operatorname{dist}(x, P))$ on $\mathbb{S}^2$.&lt;/p&gt;
&lt;p&gt;The result is &lt;strong&gt;sharp&lt;/strong&gt;: for any $\alpha &amp;gt; 0$, there exist local solutions with $|u(x)| / \ln |x'| \to -\alpha$ as $x \to P$ on $\mathbb{S}^2$, so the $o(\ln)$ growth condition cannot be weakened to $O(\ln)$. The paper also establishes existence of solutions with any finite number of singularities located arbitrarily on $\mathbb{S}^2$. A companion survey by Li and Yan (arXiv:2509.07243, Sep 2025) provides a state-of-the-art exposition of this topic.&lt;/p&gt;
&lt;h3 class="heading" id="2-self-similar-solutions-in-high-dimensions-bang-et-al-2025"&gt;
 2. Self-Similar Solutions in High Dimensions (Bang et al., 2025)&lt;span class="heading__anchor"&gt; &lt;a href="#2-self-similar-solutions-in-high-dimensions-bang-et-al-2025"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Bang, Gui, Liu, Wang, and Xie (arXiv:2510.10488, Oct 2025) proved existence of $(-1)$-homogeneous solutions to the steady NSE in &lt;strong&gt;high spatial dimensions&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For any $(-3)$-homogeneous, locally Lipschitz external force on $\mathbb{R}^n \setminus \{0\}$ with $4 \leq n \leq 16$, the steady NSE admit at least one $(-1)$-homogeneous solution that is scale-invariant and regular away from the origin.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Global uniqueness&lt;/strong&gt; holds when the external force is small. The key novelty is a &lt;strong&gt;dimension-reduction effect&lt;/strong&gt; from self-similarity: integral estimates of the positive part of the total head pressure enable energy estimates even in the supercritical dimension regime. For forces with only a nonnegative radial component, existence extends to &lt;strong&gt;all $n \geq 4$&lt;/strong&gt;.&lt;/p&gt;
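&lt;p&gt;The homogeneity degrees in the theorem are dictated by scaling. As a sketch (the symbols $f$ for the force and $\Phi$ for the total head pressure are notation assumed here, and the definition of $\Phi$ below is the customary one rather than a quotation from the paper): if $u$ is $(-1)$-homogeneous and $p$ is $(-2)$-homogeneous, then every term of&lt;/p&gt;
&lt;p&gt;$$-\nu \Delta u + (u \cdot \nabla) u + \nabla p = f$$&lt;/p&gt;
&lt;p&gt;is $(-3)$-homogeneous, because each derivative lowers the degree by one: $\Delta u$ has degree $-1 - 2$, $(u \cdot \nabla) u$ has degree $-1 + (-2)$, and $\nabla p$ has degree $-2 - 1$. A force of any other homogeneity could not be balanced by a scale-invariant solution. The total head pressure is commonly defined as $\Phi = p + \tfrac12 |u|^2$, which is $(-2)$-homogeneous under the same assumptions.&lt;/p&gt;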
&lt;p&gt;The same group (arXiv:2412.07283, Dec 2024) also established existence, uniqueness, and non-uniqueness of self-similar solutions to the steady NSE in &lt;strong&gt;2D sectors&lt;/strong&gt; with no-slip boundary conditions, providing rigorous corrections to classical Rosenhead (1940) calculations.&lt;/p&gt;
&lt;h3 class="heading" id="3-forward-self-similar-solutions-in-2d-for-large-data-2026"&gt;
 3. Forward Self-Similar Solutions in 2D for Large Data (2026)&lt;span class="heading__anchor"&gt; &lt;a href="#3-forward-self-similar-solutions-in-2d-for-large-data-2026"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Two independent papers in January 2026 addressed the 2D problem, where classical local energy estimates break down because a $(-1)$-homogeneous initial velocity is not locally square-integrable in the plane, and the associated $(-2)$-homogeneous vorticity is not locally integrable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Gui, Liu, and Xie&lt;/strong&gt; (arXiv:2601.03833) established global existence of forward self-similar solutions for any divergence-free, $(-1)$-homogeneous, locally Hölder continuous initial velocity, with &lt;strong&gt;no smallness assumption&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Albritton, Guillod, Korobkov, and Ren&lt;/strong&gt; (arXiv:2601.03161) independently constructed such solutions from &lt;strong&gt;arbitrarily large&lt;/strong&gt; initial data and reported &lt;strong&gt;numerical evidence for non-uniqueness&lt;/strong&gt; in the 2D self-similar problem. The non-uniqueness is supported numerically, not proved.&lt;/li&gt;
&lt;/ul&gt;
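&lt;p&gt;The obstruction in 2D can be seen from a one-line computation with the model field $|u_0(x)| \sim |x|^{-1}$ (a sketch in polar and spherical coordinates, not taken from either paper). The local energy near the origin is&lt;/p&gt;
&lt;p&gt;$$\int_{|x| &amp;lt; 1} |x|^{-2} dx = 2\pi \int_0^1 \frac{dr}{r} = \infty \quad (n = 2), \qquad \int_{|x| &amp;lt; 1} |x|^{-2} dx = 4\pi \int_0^1 dr = 4\pi \quad (n = 3).$$&lt;/p&gt;
&lt;p&gt;In 3D, scale-invariant data have finite local energy, which is what local energy methods rely on; in 2D the same integral diverges logarithmically.&lt;/p&gt;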
&lt;h3 class="heading" id="4-non-uniqueness-of-lerayhopf-solutions-hou-wang--yang-2025"&gt;
 4. Non-Uniqueness of Leray–Hopf Solutions (Hou, Wang, &amp;amp; Yang, 2025)&lt;span class="heading__anchor"&gt; &lt;a href="#4-non-uniqueness-of-lerayhopf-solutions-hou-wang--yang-2025"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The most dramatic recent development is the &lt;strong&gt;first rigorous computer-assisted proof of non-uniqueness of Leray–Hopf solutions&lt;/strong&gt; to the unforced 3D NSE by Hou, Wang, and Yang (arXiv:2509.25116, Sep 2025, revised Mar 2026):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There exist &lt;strong&gt;infinitely many distinct suitable Leray–Hopf solutions&lt;/strong&gt; to the 3D NSE on $\mathbb{R}^3 \times [0,1]$ with the same compactly supported, divergence-free initial condition $u_{in} \in L^q$ for any $q &amp;lt; 3$.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The proof executes the &lt;strong&gt;Jia–Šverák program&lt;/strong&gt; (Jia &amp;amp; Šverák, 2015), which requires finding a large forward self-similar background flow whose linearized operator has an &lt;strong&gt;unstable eigenvalue&lt;/strong&gt; (positive real part), then bifurcating to produce infinitely many Leray–Hopf solutions. The key steps are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A finite-element + spectral-basis numerical method computes a highly precise candidate profile $\tilde{U}$.&lt;/li&gt;
&lt;li&gt;The linearized operator $L_{\tilde{U}}$ is decomposed into a coercive part plus a finite-rank perturbation, whose invertibility is certified by &lt;strong&gt;computer-assisted interval arithmetic&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;This certifies an unstable eigenpair $(\tilde{v}, \tilde{\lambda})$ with $\operatorname{Re}(\tilde{\lambda}) &amp;gt; 0$, yielding the second (and infinitely many) solutions via Riesz projection and Duhamel analysis.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These solutions just miss the Prodi–Serrin condition that guarantees uniqueness. Guillod and Šverák (2017) had provided strong numerical evidence that such unstable profiles exist, but the rigorous proof remained elusive until Hou et al.&lt;/p&gt;
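&lt;p&gt;A heuristic sketch shows why an unstable eigenvalue leads to a second solution with the same initial data. It is written in similarity variables that are notation assumed here ($y = x/\sqrt{t}$, $\tau = \ln t$, $u(x,t) = t^{-1/2} W(y, \tau)$, $\nu = 1$) and is not necessarily the formulation used by Hou, Wang, and Yang. In these variables the NSE become&lt;/p&gt;
&lt;p&gt;$$\partial_\tau W = \Delta W + \tfrac12 W + \tfrac12 (y \cdot \nabla) W - (W \cdot \nabla) W - \nabla \Pi, \qquad \nabla \cdot W = 0,$$&lt;/p&gt;
&lt;p&gt;whose steady states are the self-similar profiles $U$. Linearizing around $U$ gives $\partial_\tau v = L_U v$ with&lt;/p&gt;
&lt;p&gt;$$L_U v = \Delta v + \tfrac12 v + \tfrac12 (y \cdot \nabla) v - (U \cdot \nabla) v - (v \cdot \nabla) U - \nabla q.$$&lt;/p&gt;
&lt;p&gt;If $L_U v = \lambda v$ with $\operatorname{Re}\lambda &amp;gt; 0$, the linear mode $e^{\lambda \tau} v = t^{\lambda} v$ grows in $\tau$. Back in physical variables it is the perturbation $t^{\lambda - 1/2} v(x/\sqrt{t})$, whose $L^2(\mathbb{R}^3)$ norm equals $t^{\operatorname{Re}\lambda + 1/4} \|v\|_{L^2}$ and therefore tends to $0$ as $t \to 0^+$. At the linear level, the perturbed and unperturbed flows start from the same data; turning this picture into genuine nonlinear, finite-energy solutions is the difficult part carried out in the works cited.&lt;/p&gt;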
&lt;h3 class="heading" id="5-sharp-non-uniqueness-for-weak-solutions-via-convex-integration-20222026"&gt;
 5. Sharp Non-Uniqueness for Weak Solutions via Convex Integration (2022–2026)&lt;span class="heading__anchor"&gt; &lt;a href="#5-sharp-non-uniqueness-for-weak-solutions-via-convex-integration-20222026"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A parallel program uses convex integration to prove non-uniqueness of weak solutions. Cheskidov and Luo (&lt;em&gt;Invent. Math.&lt;/em&gt;, 2022) proved sharp non-uniqueness in $L^p_t L^\infty$ for any $p &amp;lt; 2$ in the periodic setting. Miao, Nie, and Ye (arXiv:2412.09637, Dec 2024) extended this to $\mathbb{R}^3$. Fujii (arXiv:2602.19846, Feb 2026) completed a sharp classification in critical Besov spaces $C([0,T); \dot{B}^{n/p-1}_{p,q}(\mathbb{R}^n))$, finding that large-time asymptotics of non-unique solutions are governed by non-trivial &lt;strong&gt;stationary flows&lt;/strong&gt; — a first in the critical regularity setting.&lt;/p&gt;
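&lt;p&gt;The word &amp;ldquo;sharp&amp;rdquo; can be read off from scaling. As a sketch based on the standard Ladyzhenskaya–Prodi–Serrin exponents (stated from the general theory, on the whole space and for all times, rather than quoted from the papers above): under $u_\lambda(x,t) = \lambda u(\lambda x, \lambda^2 t)$ in $\mathbb{R}^d$,&lt;/p&gt;
&lt;p&gt;$$\|u_\lambda\|_{L^p_t L^q_x} = \lambda^{1 - 2/p - d/q} \|u\|_{L^p_t L^q_x},$$&lt;/p&gt;
&lt;p&gt;so the norm is scale-invariant exactly when $2/p + d/q = 1$. With $q = \infty$ this gives $p = 2$, the endpoint of the classical uniqueness criteria, and the convex-integration results produce non-uniqueness for every $p &amp;lt; 2$. The same bookkeeping explains the Besov index: the homogeneous norm of $\dot{B}^{s}_{p,q}(\mathbb{R}^n)$ scales like $\lambda^{s - n/p}$ under $f \mapsto f(\lambda \cdot)$, so $f \mapsto \lambda f(\lambda \cdot)$ leaves it invariant precisely when $s = n/p - 1$.&lt;/p&gt;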
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Result&lt;/th&gt;
					&lt;th&gt;Authors&lt;/th&gt;
					&lt;th&gt;Year&lt;/th&gt;
					&lt;th&gt;Setting&lt;/th&gt;
					&lt;th style="text-align: center"&gt;Self-similar?&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;Non-uniqueness, $L^p_t L^\infty$, torus&lt;/td&gt;
					&lt;td&gt;Cheskidov &amp;amp; Luo&lt;/td&gt;
					&lt;td&gt;2022&lt;/td&gt;
					&lt;td&gt;3D periodic&lt;/td&gt;
					&lt;td style="text-align: center"&gt;No&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Non-uniqueness, $L^p_t L^\infty$, $\mathbb{R}^3$&lt;/td&gt;
					&lt;td&gt;Miao, Nie &amp;amp; Ye&lt;/td&gt;
					&lt;td&gt;2024&lt;/td&gt;
					&lt;td&gt;3D whole space&lt;/td&gt;
					&lt;td style="text-align: center"&gt;No&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Non-uniqueness of Leray–Hopf, 3D&lt;/td&gt;
					&lt;td&gt;Hou, Wang &amp;amp; Yang&lt;/td&gt;
					&lt;td&gt;2025&lt;/td&gt;
					&lt;td&gt;3D whole space&lt;/td&gt;
					&lt;td style="text-align: center"&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Forward self-similar, 2D, large data&lt;/td&gt;
					&lt;td&gt;Albritton et al.&lt;/td&gt;
					&lt;td&gt;2026&lt;/td&gt;
					&lt;td&gt;2D whole space&lt;/td&gt;
					&lt;td style="text-align: center"&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Steady NSE in 2D sector&lt;/td&gt;
					&lt;td&gt;Bang et al.&lt;/td&gt;
					&lt;td&gt;2024&lt;/td&gt;
					&lt;td&gt;2D sector&lt;/td&gt;
					&lt;td style="text-align: center"&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 class="heading" id="6-liouville-theorems-and-stability-of-landau-solutions"&gt;
 6. Liouville Theorems and Stability of Landau Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#6-liouville-theorems-and-stability-of-landau-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Tan&lt;/strong&gt; (arXiv:2501.03609, Jan 2025) proved new Liouville theorems for the stationary NSE (including the fractional case) under growth conditions in Lebesgue spaces. &lt;strong&gt;Ding and Tan&lt;/strong&gt; (arXiv:2501.03615, Jan 2025) proved a Liouville theorem for the stationary &lt;strong&gt;inhomogeneous&lt;/strong&gt; NSE via frequency localization of the Dirichlet energy near the origin.&lt;/p&gt;
&lt;p&gt;The asymptotic stability of small Landau solutions in $L^3$ was sharpened by &lt;strong&gt;Bradshaw and Wang&lt;/strong&gt; (arXiv:2409.12918, Sep 2024): asymptotic stability holds for perturbations in the Lorentz spaces $L^{3,q}$ with $q &amp;lt; \infty$ but &lt;strong&gt;fails&lt;/strong&gt; in $L^{3,\infty}$ (weak-$L^3$), which marks the precise boundary of stability.&lt;/p&gt;
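&lt;p&gt;To see why weak-$L^3$ is the natural borderline, consider the model function $f(x) = |x|^{-1}$ on $\mathbb{R}^3$, which has the same homogeneity as a Landau solution. The following sketch uses the standard distribution-function definition of the Lorentz quasi-norm; the notation $d_f$ is introduced here and is not taken from the paper.&lt;/p&gt;
&lt;p&gt;$$d_f(s) = \big|\{x : |x|^{-1} &amp;gt; s\}\big| = \tfrac{4\pi}{3} s^{-3}, \qquad \|f\|_{L^{3,q}}^q = 3 \int_0^\infty \big(s \, d_f(s)^{1/3}\big)^q \frac{ds}{s} = 3 \left(\tfrac{4\pi}{3}\right)^{q/3} \int_0^\infty \frac{ds}{s} = \infty \quad (q &amp;lt; \infty),$$&lt;/p&gt;
&lt;p&gt;whereas $\|f\|_{L^{3,\infty}} = \sup_{s &amp;gt; 0} s \, d_f(s)^{1/3} = (4\pi/3)^{1/3}$ is finite. Scale-invariant fields therefore belong to $L^{3,\infty}$ but to no $L^{3,q}$ with $q &amp;lt; \infty$, which is consistent with $L^{3,\infty}$ being the endpoint at which the stability behaviour changes.&lt;/p&gt;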
&lt;h3 class="heading" id="7-steady-nse-in-bounded-and-unbounded-domains"&gt;
 7. Steady NSE in Bounded and Unbounded Domains&lt;span class="heading__anchor"&gt; &lt;a href="#7-steady-nse-in-bounded-and-unbounded-domains"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A major reference work by Korobkov, Pileckas, and Russo (&lt;em&gt;Springer/Birkhäuser&lt;/em&gt;, March 2024) provides the first comprehensive book treatment of &lt;strong&gt;Leray&amp;rsquo;s problem&lt;/strong&gt;: existence of a solution in bounded domains under only the condition of zero total flux — without smallness on the boundary data.&lt;/p&gt;
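&lt;p&gt;In formulas, and in notation assumed for this sketch ($\Omega$ a bounded domain whose boundary has connected components $\Gamma_1, \dots, \Gamma_N$, boundary velocity $a$, outward unit normal $n$), the problem reads&lt;/p&gt;
&lt;p&gt;$$-\nu \Delta u + (u \cdot \nabla) u + \nabla p = 0, \quad \operatorname{div} u = 0 \ \text{ in } \Omega, \qquad u = a \ \text{ on } \partial\Omega.$$&lt;/p&gt;
&lt;p&gt;Incompressibility and the divergence theorem force&lt;/p&gt;
&lt;p&gt;$$0 = \int_\Omega \operatorname{div} u \, dx = \int_{\partial\Omega} a \cdot n \, dS = \sum_{j=1}^{N} F_j, \qquad F_j = \int_{\Gamma_j} a \cdot n \, dS,$$&lt;/p&gt;
&lt;p&gt;so zero total flux is necessary. Leray&amp;rsquo;s problem asks whether it is also sufficient, that is, whether a solution exists when the individual fluxes $F_j$ are allowed to be nonzero and large as long as they sum to zero.&lt;/p&gt;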
&lt;p&gt;Gazzola, Korobkov, Ren, and Sperone (arXiv:2505.14642, May 2025) studied steady NSE in a &lt;strong&gt;junction of unbounded channels&lt;/strong&gt; with sources and sinks, under inhomogeneous Dirichlet boundary conditions and without smallness of fluxes. They prove existence of a solution with uniformly bounded Dirichlet integral in every compact subset via Leray&amp;rsquo;s &lt;em&gt;reductio ad absurdum&lt;/em&gt; argument using Morse–Sard-type theorems in Sobolev spaces.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="open-problems"&gt;
 Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Several central questions remain unresolved or only partially answered:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Clay Millennium Prize Problem.&lt;/strong&gt; Whether 3D NSE solutions from smooth initial data can blow up in finite time is not resolved. The Hou et al. non-uniqueness result concerns Leray–Hopf solutions from &lt;em&gt;singular&lt;/em&gt; $L^q$ ($q &amp;lt; 3$) initial data, not smooth data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Complete classification of $(-1)$-homogeneous solutions in 3D.&lt;/strong&gt; The axisymmetric no-swirl case is fully classified, and swirl solutions are well-studied, but a complete classification for all $(-1)$-homogeneous solutions with arbitrarily many singular rays and all possible swirl configurations is not yet achieved.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Non-uniqueness of forward self-similar solutions in 3D without computer assistance.&lt;/strong&gt; Guillod and Šverák (2017) gave numerical evidence for the unstable profiles required by the Jia–Šverák program, and the proof of Hou, Wang, and Yang certifies such a profile with interval arithmetic. A proof of non-uniqueness for the forward (not backward) self-similar 3D problem that does not rely on computer assistance remains open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Asymptotic stability of large Landau solutions.&lt;/strong&gt; While small Landau solutions are asymptotically stable in $L^3$, stability for large-parameter Landau solutions is not fully understood.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Leray problem in non-axisymmetric 3D exterior domains without flux restrictions.&lt;/strong&gt; The axisymmetric case was solved by Korobkov, Pileckas, and Russo, but the general 3D exterior domain problem under large flux remains open.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Albritton, D., Guillod, J., Korobkov, M., &amp;amp; Ren, X. (2026). &lt;em&gt;Forward self-similar solutions to the 2D Navier-Stokes equations from large data&lt;/em&gt;. arXiv:2601.03161. &lt;a href="https://arxiv.org/abs/2601.03161"&gt;https://arxiv.org/abs/2601.03161&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bang, J., Gui, C., Liu, Y., Wang, C., &amp;amp; Xie, C. (2024). &lt;em&gt;Self-similar solutions to the steady Navier-Stokes equations in 2D sectors&lt;/em&gt;. arXiv:2412.07283. &lt;a href="https://arxiv.org/abs/2412.07283"&gt;https://arxiv.org/abs/2412.07283&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bang, J., Gui, C., Liu, Y., Wang, C., &amp;amp; Xie, C. (2025). &lt;em&gt;On the existence of self-similar solutions to the steady Navier-Stokes equations in high dimensions&lt;/em&gt;. arXiv:2510.10488. &lt;a href="https://arxiv.org/abs/2510.10488"&gt;https://arxiv.org/abs/2510.10488&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bradshaw, Z., &amp;amp; Wang, X. (2024). &lt;em&gt;Asymptotic stability of Landau solutions in Lorentz spaces&lt;/em&gt;. arXiv:2409.12918. &lt;a href="https://arxiv.org/pdf/2409.12918.pdf"&gt;https://arxiv.org/pdf/2409.12918.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Cheskidov, A., &amp;amp; Luo, X. (2022). Sharp nonuniqueness for the Navier-Stokes equations. &lt;em&gt;Inventiones Mathematicae&lt;/em&gt;. arXiv:2009.06596. &lt;a href="https://arxiv.org/abs/2009.06596"&gt;https://arxiv.org/abs/2009.06596&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Ding, M., &amp;amp; Tan, W. (2025). &lt;em&gt;Liouville-type theorem for the stationary inhomogeneous Navier-Stokes equations&lt;/em&gt;. arXiv:2501.03615. &lt;a href="https://arxiv.org/abs/2501.03615"&gt;https://arxiv.org/abs/2501.03615&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Fujii, M. (2026). &lt;em&gt;Sharp non-uniqueness for the Navier-Stokes equations in critical Besov spaces&lt;/em&gt;. arXiv:2602.19846. &lt;a href="https://arxiv.org/html/2602.19846v1"&gt;https://arxiv.org/html/2602.19846v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Gazzola, F., Korobkov, M., Ren, X., &amp;amp; Sperone, G. (2025). &lt;em&gt;The steady Navier-Stokes equations in a system of unbounded channels with sources and sinks&lt;/em&gt;. arXiv:2505.14642. &lt;a href="https://arxiv.org/abs/2505.14642"&gt;https://arxiv.org/abs/2505.14642&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Gui, C., Liu, Y., &amp;amp; Xie, C. (2026). &lt;em&gt;On the forward self-similar solutions to the two-dimensional Navier-Stokes equations&lt;/em&gt;. arXiv:2601.03833. &lt;a href="https://arxiv.org/html/2601.03833v2"&gt;https://arxiv.org/html/2601.03833v2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hou, T., Wang, Y., &amp;amp; Yang, C. (2025). &lt;em&gt;Nonuniqueness of Leray-Hopf solutions to the unforced incompressible 3D Navier-Stokes equations&lt;/em&gt;. arXiv:2509.25116. &lt;a href="https://arxiv.org/abs/2509.25116"&gt;https://arxiv.org/abs/2509.25116&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Jia, H., &amp;amp; Šverák, V. (2015). Are the incompressible 3d Navier–Stokes equations locally ill-posed in the natural energy space? &lt;em&gt;Journal of Functional Analysis, 268&lt;/em&gt;(12), 3734–3766. &lt;a href="https://www.sciencedirect.com/science/article/pii/S002212361500138X"&gt;https://www.sciencedirect.com/science/article/pii/S002212361500138X&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Korobkov, M., Pileckas, K., &amp;amp; Russo, R. (2024). &lt;em&gt;The Steady Navier-Stokes System: Basics of the Theory and the Leray Problem&lt;/em&gt;. Springer/Birkhäuser. &lt;a href="https://books.google.com/books/about/The_Steady_Navier_Stokes_System.html?id=GOf8EAAAQBAJ"&gt;https://books.google.com/books/about/The_Steady_Navier_Stokes_System.html?id=GOf8EAAAQBAJ&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Korobkov, M., &amp;amp; Ren, X. (2024). &lt;em&gt;On basic velocity estimates for the plane steady-state Navier-Stokes equations in convex domains&lt;/em&gt;. arXiv:2405.17884. &lt;a href="https://arxiv.org/abs/2405.17884"&gt;https://arxiv.org/abs/2405.17884&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Li, L., Li, Y., &amp;amp; Yan, Y. (2024). &lt;em&gt;Removable singularity of $(-1)$-homogeneous solutions of stationary Navier-Stokes equations&lt;/em&gt;. &lt;em&gt;Transactions of the American Mathematical Society&lt;/em&gt;. arXiv:2410.11170. &lt;a href="https://arxiv.org/abs/2410.11170"&gt;https://arxiv.org/abs/2410.11170&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Li, Y., &amp;amp; Yan, Y. (2025). &lt;em&gt;Recent research on $(-1)$-homogeneous solutions of stationary Navier-Stokes equations&lt;/em&gt;. arXiv:2509.07243. &lt;a href="https://arxiv.org/abs/2509.07243"&gt;https://arxiv.org/abs/2509.07243&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Miao, C., Nie, Y., &amp;amp; Ye, W. (2024). &lt;em&gt;Sharp non-uniqueness for the Navier-Stokes equations in the whole space&lt;/em&gt;. arXiv:2412.09637. &lt;a href="https://arxiv.org/abs/2412.09637"&gt;https://arxiv.org/abs/2412.09637&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Tan, W. (2025). &lt;em&gt;New Liouville type theorems for the stationary Navier-Stokes equations&lt;/em&gt;. arXiv:2501.03609. &lt;a href="https://arxiv.org/pdf/2501.03609.pdf"&gt;https://arxiv.org/pdf/2501.03609.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Tsai, T.-P. (2014). &lt;em&gt;Forward discretely self-similar solutions of the Navier-Stokes equations&lt;/em&gt;. arXiv:1210.2783. &lt;a href="https://arxiv.org/abs/1210.2783"&gt;https://arxiv.org/abs/1210.2783&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Recent Research Directions in Analysis of PDEs 2021–2026</title><link>https://blog.namln.org/en/posts/recent-pde-2126/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/recent-pde-2126/</guid><description>&lt;p&gt;The arXiv section of Analysis of Partial Differential Equations is one of the most prolific areas of pure mathematics, producing over 400 preprints per month as of early 2026. The period 2021–2026 has witnessed landmark breakthroughs — including a computer-assisted proof of finite-time singularity in the 3D Euler equations, the resolution of Hilbert&amp;rsquo;s Sixth Problem via kinetic theory, and the emergence of probabilistic and nonlocal operator methods as dominant paradigms. This survey identifies, categorises, and profiles the key research directions and landmark papers in math.AP during this era.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="overview"&gt;
 Overview&lt;span class="heading__anchor"&gt; &lt;a href="#overview"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The landscape of math.AP in 2021–2026 organises into several major research directions:&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Direction&lt;/th&gt;
					&lt;th&gt;Landmark Papers&lt;/th&gt;
					&lt;th&gt;Landmark Results&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;Fluid singularity (Euler)&lt;/td&gt;
					&lt;td&gt;Chen &amp;amp; Hou (2022–2023)&lt;/td&gt;
					&lt;td&gt;Finite-time blowup for 3D Euler/2D Boussinesq, smooth data (&lt;em&gt;PNAS&lt;/em&gt; 2025)&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;NS non-uniqueness&lt;/td&gt;
					&lt;td&gt;Albritton, Brué &amp;amp; Colombo (2021)&lt;/td&gt;
					&lt;td&gt;Non-unique Leray–Hopf solutions for forced NS&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Hilbert&amp;rsquo;s 6th Problem&lt;/td&gt;
					&lt;td&gt;Deng, Hani &amp;amp; Ma (2024–2025)&lt;/td&gt;
					&lt;td&gt;Long-time Boltzmann derivation; fluid equations from Newton&amp;rsquo;s laws&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Wave kinetic equation&lt;/td&gt;
					&lt;td&gt;Deng &amp;amp; Hani (2021)&lt;/td&gt;
					&lt;td&gt;Rigorous WKE derivation from cubic NLS&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Mixed local-nonlocal operators&lt;/td&gt;
					&lt;td&gt;Biagi, Dipierro, Valdinoci et al. (2020–2022)&lt;/td&gt;
					&lt;td&gt;Regularity, max. principles, Faber-Krahn inequalities&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Double phase functionals&lt;/td&gt;
					&lt;td&gt;De Filippis &amp;amp; Mingione (2022–2023)&lt;/td&gt;
					&lt;td&gt;Gradient regularity in mixed/double phase settings&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Normalized Schrödinger&lt;/td&gt;
					&lt;td&gt;Wei &amp;amp; Wu (2021); Jeanjean &amp;amp; Le (2020)&lt;/td&gt;
					&lt;td&gt;Critical mass constraints, ground states, NLS&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;MFG inverse problems&lt;/td&gt;
					&lt;td&gt;Imanuvilov, Liu &amp;amp; Yamamoto (2023)&lt;/td&gt;
					&lt;td&gt;Lipschitz stability, Carleman estimates for MFG&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Keller-Segel chemotaxis&lt;/td&gt;
					&lt;td&gt;Li &amp;amp; Winkler (2022); Lyu &amp;amp; Wang (2021)&lt;/td&gt;
					&lt;td&gt;Signal-dependent motility, global regularity&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Stefan/free boundary&lt;/td&gt;
					&lt;td&gt;Ferrari et al. (2024); Arya, Jeon &amp;amp; Julin (2026)&lt;/td&gt;
					&lt;td&gt;$C^{1,\alpha}$ regularity, supercooled Stefan, volume-preserving mean curvature flow&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Stochastic PDEs&lt;/td&gt;
					&lt;td&gt;Bailleul &amp;amp; Bruned (2021); Bailleul &amp;amp; Hoshino (2025)&lt;/td&gt;
					&lt;td&gt;Renormalisation, regularity structures&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Calderón inverse problem&lt;/td&gt;
					&lt;td&gt;Cârstea, Uhlmann et al. (2021); Krupchyk (2025)&lt;/td&gt;
					&lt;td&gt;Nonlinear and fractional settings&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Dispersive PDEs&lt;/td&gt;
					&lt;td&gt;Deng, Nahmod &amp;amp; Yue (2020); Gubinelli et al. (2025)&lt;/td&gt;
					&lt;td&gt;Random tensors, modulated dispersive equations&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="the-mathap-landscape"&gt;
 The math.AP Landscape&lt;span class="heading__anchor"&gt; &lt;a href="#the-mathap-landscape"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Analysis of PDEs is the mathematical study of equations involving unknown functions and their partial derivatives, arising in physics, geometry, probability, and engineering. The arXiv math.AP category encompasses everything from regularity theory for elliptic and parabolic equations to global well-posedness for dispersive equations, from geometric flows to inverse problems, and from kinetic theory to stochastic PDEs. With roughly 300–400 papers per month (408 in February 2026 alone), it is one of the most active and interconnected areas of pure mathematics.&lt;/p&gt;
&lt;p&gt;The period 2021–2026 is characterised by three broad trends. First, &lt;strong&gt;grand-challenge resolutions&lt;/strong&gt;: several longstanding open problems — including Hilbert&amp;rsquo;s Sixth Problem and the existence of finite-time singularities for 3D Euler equations with smooth data — were settled using novel combinations of rigorous analysis, Feynman-diagram combinatorics, and computer-assisted numerics. Second, &lt;strong&gt;new paradigm emergence&lt;/strong&gt;: mixed local-nonlocal operators, double phase functionals, and normalised solutions have matured from isolated curiosities into systematic research programmes with their own regularity theories. Third, &lt;strong&gt;interdisciplinary expansion&lt;/strong&gt;: MFG systems, optimal transport, SPDEs, and AI-assisted methods have become structural parts of the math.AP ecosystem.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="recent-developments"&gt;
 Recent Developments&lt;span class="heading__anchor"&gt; &lt;a href="#recent-developments"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-mathematical-fluid-dynamics-singularity-non-uniqueness-and-stability"&gt;
 1. Mathematical Fluid Dynamics: Singularity, Non-Uniqueness, and Stability&lt;span class="heading__anchor"&gt; &lt;a href="#1-mathematical-fluid-dynamics-singularity-non-uniqueness-and-stability"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h4 class="heading" id="finite-time-blowup-of-the-3d-euler-equations"&gt;
 Finite-Time Blowup of the 3D Euler Equations&lt;span class="heading__anchor"&gt; &lt;a href="#finite-time-blowup-of-the-3d-euler-equations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;The question of whether the 3D incompressible Euler equations&lt;/p&gt;
&lt;p&gt;$$\partial_t u + (u \cdot \nabla) u + \nabla p = 0, \qquad \operatorname{div} u = 0,$$&lt;/p&gt;
&lt;p&gt;can develop a singularity from smooth initial data — open since Euler introduced the equations in 1757 — saw a decisive resolution in a bounded-domain setting through a landmark two-part series by &lt;strong&gt;Jiajie Chen and Thomas Y. Hou&lt;/strong&gt; (arXiv:2210.07191, arXiv:2305.05660, &lt;em&gt;PNAS&lt;/em&gt; 2025). Their work proves finite-time, nearly self-similar blowup of both the &lt;strong&gt;2D Boussinesq&lt;/strong&gt; and &lt;strong&gt;3D axisymmetric Euler&lt;/strong&gt; equations with smooth initial data and finite energy in the presence of a solid boundary. The proof employs weighted $L^\infty$ and $C^{1/2}$ norms, sharp functional inequalities inspired by optimal transport, and computer-assisted rigorous numerics to verify nonlinear stability constants. The result was praised as one of the most significant advances in mathematical fluid mechanics in decades.&lt;/p&gt;
&lt;p&gt;Prior to Chen–Hou, &lt;strong&gt;Tarek Elgindi&lt;/strong&gt; (2021) showed finite-time singularity for the 3D axisymmetric Euler equations without swirl from $C^{1,\alpha}$ initial velocity (equivalently, vorticity that is only Hölder continuous), so the data in that result are not smooth. Chen and Hou&amp;rsquo;s 2021 paper on the Hou-Luo (HL) model proved asymptotically self-similar blowup from smooth data for that model. Concurrently, Hou and collaborators presented numerical evidence for singularity in 3D Navier-Stokes achieving a $10^7$-fold increase in maximum vorticity, and DeepMind (2025) used AI-assisted methods to discover families of unstable singularities in the Incompressible Porous Media and Boussinesq equations.&lt;/p&gt;
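&lt;p&gt;For orientation, the 2D Boussinesq system can be written as follows. This is a sketch in one common normalisation; the temperature (or density) variable $\theta$, the unit vector $e_2$, and the scalar vorticity $\omega$ are notation assumed here rather than taken from the papers cited above.&lt;/p&gt;
&lt;p&gt;$$\partial_t u + (u \cdot \nabla) u + \nabla p = \theta e_2, \qquad \operatorname{div} u = 0, \qquad \partial_t \theta + u \cdot \nabla \theta = 0.$$&lt;/p&gt;
&lt;p&gt;Taking the curl with $\omega = \partial_{x_1} u_2 - \partial_{x_2} u_1$ gives&lt;/p&gt;
&lt;p&gt;$$\partial_t \omega + u \cdot \nabla \omega = \partial_{x_1} \theta,$$&lt;/p&gt;
&lt;p&gt;so horizontal temperature gradients generate vorticity. This forcing term is what distinguishes the system from 2D Euler, where $\omega$ is simply transported, and it is the reason the 2D Boussinesq equations are often used as a model for 3D axisymmetric Euler with swirl away from the symmetry axis.&lt;/p&gt;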
&lt;h4 class="heading" id="non-uniqueness-of-lerayhopf-solutions-for-navier-stokes"&gt;
 Non-Uniqueness of Leray–Hopf Solutions for Navier-Stokes&lt;span class="heading__anchor"&gt; &lt;a href="#non-uniqueness-of-lerayhopf-solutions-for-navier-stokes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;A 2021 breakthrough by &lt;strong&gt;Dallas Albritton, Elia Brué, and Maria Colombo&lt;/strong&gt; proved non-uniqueness of Leray–Hopf solutions to the &lt;em&gt;forced&lt;/em&gt; 3D Navier-Stokes equations: they exhibited two distinct Leray solutions with zero initial velocity and identical body force, exploiting the extreme instability of a self-similar background solution. Recognised as the most influential 2021 math.AP paper on arXiv by Paper Digest, the result was subsequently extended to bounded domains via gluing methods (arXiv:2209.03530) and to stochastic settings (&lt;em&gt;Electronic Journal of Probability&lt;/em&gt;, 2024).&lt;/p&gt;
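&lt;p&gt;The role of self-similarity can be made concrete with a short computation. This is a sketch with viscosity normalised to $1$; the profiles $U$, $P$, $F$ and the variables $\xi$, $\tau$ are notation introduced here for illustration. The forced equations $\partial_t u + (u \cdot \nabla)u - \Delta u + \nabla p = f$, $\nabla \cdot u = 0$ are invariant under the scaling&lt;/p&gt;
&lt;p&gt;$$u_\lambda(x,t) = \lambda\, u(\lambda x, \lambda^2 t), \qquad p_\lambda(x,t) = \lambda^2 p(\lambda x, \lambda^2 t), \qquad f_\lambda(x,t) = \lambda^3 f(\lambda x, \lambda^2 t).$$&lt;/p&gt;
&lt;p&gt;Writing $u = t^{-1/2} U(\xi,\tau)$, $p = t^{-1} P(\xi,\tau)$, $f = t^{-3/2} F(\xi,\tau)$ with $\xi = x/\sqrt{t}$ and $\tau = \log t$, the equations become&lt;/p&gt;
&lt;p&gt;$$\partial_\tau U - \tfrac{1}{2}\bigl(1 + \xi \cdot \nabla_\xi\bigr) U - \Delta_\xi U + (U \cdot \nabla_\xi) U + \nabla_\xi P = F, \qquad \nabla_\xi \cdot U = 0.$$&lt;/p&gt;
&lt;p&gt;A self-similar solution is a steady state $\bar U$ of this system. Since $t \to 0^+$ corresponds to $\tau \to -\infty$, a linearly unstable steady state leaves room for a second trajectory that converges to $\bar U$ as $\tau \to -\infty$ and then departs from it. Roughly speaking, both trajectories correspond to the same force and the same initial data in the original variables.&lt;/p&gt;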
&lt;h4 class="heading" id="stability-of-shear-flows-and-kinetic-theory"&gt;
 Stability of Shear Flows and Kinetic Theory&lt;span class="heading__anchor"&gt; &lt;a href="#stability-of-shear-flows-and-kinetic-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;Parallel to the singularity programme, sharp asymptotic stability results for &lt;strong&gt;2D monotone shear flows&lt;/strong&gt; with no-slip boundary conditions, and extensive work on &lt;strong&gt;inviscid damping&lt;/strong&gt; and enhanced dissipation near shear flows, have appeared throughout 2025–2026.&lt;/p&gt;
&lt;p&gt;Arguably the most monumental result in kinetic PDE theory during this period is due to &lt;strong&gt;Yu Deng, Zaher Hani, and Xiao Ma&lt;/strong&gt;, who provided a rigorous long-time derivation of the Boltzmann equation from hard-sphere dynamics (arXiv:2408.07818, 2024), extending Lanford&amp;rsquo;s 1975 short-time theorem to all times within the lifespan of the Boltzmann solution. In a companion paper (arXiv:2503.01800, 2025), they completed the derivation of the &lt;strong&gt;compressible Euler&lt;/strong&gt; and &lt;strong&gt;incompressible Navier-Stokes-Fourier&lt;/strong&gt; equations from Newton&amp;rsquo;s laws, effectively resolving &lt;strong&gt;Hilbert&amp;rsquo;s Sixth Problem&lt;/strong&gt; for rarefied hard-sphere gases. The proof uses cumulant ansätze, Feynman-diagram combinatorics, and a molecule-reduction algorithm. This built on Deng and Hani&amp;rsquo;s 2021 derivation of the &lt;strong&gt;wave kinetic equation&lt;/strong&gt; from the cubic NLS.&lt;/p&gt;
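&lt;p&gt;For reference, the hard-sphere Boltzmann equation being derived can be sketched as follows. The notation is assumed here: $f(t,x,v)$ is the one-particle density, $d$ the dimension, $N$ the number of spheres, $\varepsilon$ their diameter, and $\alpha$ the collision rate fixed by the Boltzmann-Grad scaling $N\varepsilon^{d-1} = \alpha$ as $N \to \infty$.&lt;/p&gt;
&lt;p&gt;$$\partial_t f + v \cdot \nabla_x f = \alpha\, Q(f,f), \qquad Q(f,f)(v) = \int_{\mathbb{R}^d} \int_{S^{d-1}} \bigl((v - v_*) \cdot \omega\bigr)_+ \bigl(f(v')f(v_*') - f(v)f(v_*)\bigr)\,d\omega\,dv_*,$$&lt;/p&gt;
&lt;p&gt;$$v' = v - \bigl((v - v_*) \cdot \omega\bigr)\omega, \qquad v_*' = v_* + \bigl((v - v_*) \cdot \omega\bigr)\omega.$$&lt;/p&gt;
&lt;p&gt;The collision rule conserves momentum and kinetic energy, $v' + v_*' = v + v_*$ and $|v'|^2 + |v_*'|^2 = |v|^2 + |v_*|^2$, which is what allows conservation laws (and hence fluid equations) to emerge in the hydrodynamic limit.&lt;/p&gt;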
&lt;h3 class="heading" id="2-nonlocal-and-fractional-pdes-mixed-local-nonlocal-operators"&gt;
 2. Nonlocal and Fractional PDEs: Mixed Local-Nonlocal Operators&lt;span class="heading__anchor"&gt; &lt;a href="#2-nonlocal-and-fractional-pdes-mixed-local-nonlocal-operators"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;One of the dominant new paradigms of the 2020s is the study of operators of the form&lt;/p&gt;
&lt;p&gt;$$\mathcal{L} u = -\Delta u + (-\Delta)^s u, \quad s \in (0,1),$$&lt;/p&gt;
&lt;p&gt;which superpose a classical Laplacian with a fractional (nonlocal) Laplacian. These arise naturally in models combining Brownian and Lévy diffusion processes. The foundational paper by &lt;strong&gt;Biagi, Dipierro, Valdinoci, and Vecchi&lt;/strong&gt; (2020/2021) initiated a systematic theory of regularity and maximum principles for such operators.&lt;/p&gt;
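&lt;p&gt;To see how the two parts interact, it helps to write the fractional Laplacian explicitly. This is a sketch: the dimension $N$, the normalising constant $c_{N,s}$, and the Fourier normalisation (chosen so that $-\Delta$ has symbol $|\xi|^2$) are conventions assumed here.&lt;/p&gt;
&lt;p&gt;$$(-\Delta)^s u(x) = c_{N,s}\,\mathrm{P.V.}\int_{\mathbb{R}^N} \frac{u(x) - u(y)}{|x - y|^{N+2s}}\,dy, \qquad \widehat{\mathcal{L}u}(\xi) = \bigl(|\xi|^2 + |\xi|^{2s}\bigr)\,\hat{u}(\xi).$$&lt;/p&gt;
&lt;p&gt;Because $s \in (0,1)$, the symbol satisfies $|\xi|^2 \leq |\xi|^2 + |\xi|^{2s} \leq 2|\xi|^2$ for $|\xi| \geq 1$ and $|\xi|^{2s} \leq |\xi|^2 + |\xi|^{2s} \leq 2|\xi|^{2s}$ for $|\xi| \leq 1$. The local part therefore dominates at high frequencies (small scales), while the nonlocal part dominates at low frequencies (large scales). This is also why the operator has no single scaling invariance.&lt;/p&gt;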
&lt;p&gt;Between 2021 and 2026 an explosion of activity produced: gradient regularity for mixed local-nonlocal problems via De Filippis and Mingione (2022, minimisers of mixed functionals are locally $C^{1,\beta}$-regular); Hölder regularity for mixed local-nonlocal degenerate elliptic equations (Garain &amp;amp; Lindgren, 2022); the Wiener criterion for nonlocal Dirichlet problems (Kim, Lee &amp;amp; Lee, 2022); and a Faber-Krahn inequality for mixed operators (Biagi, Dipierro, Valdinoci &amp;amp; Vecchi, 2021). &lt;strong&gt;Serena Dipierro&lt;/strong&gt; and &lt;strong&gt;Enrico Valdinoci&lt;/strong&gt; were among the most prolific contributors, publishing on nonlocal logistic equations with Neumann conditions, ecological niches for mixed dispersal, and Sobolev inequalities for mixed operators.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Giovanni Leoni&amp;rsquo;s&lt;/strong&gt; 2023 treatise &lt;em&gt;A First Course in Fractional Sobolev Spaces&lt;/em&gt; provided a self-contained reference covering definitions, embeddings, Hardy inequalities, and interpolation inequalities, and ranked among the most-cited arXiv math.AP papers of 2023. More recently, a 2025 paper established well-posedness and regularity theory for time-fractional stochastic PDEs involving Caputo derivatives and general nonlocal operators driven by Gaussian and Lévy noise (arXiv:2512.03754).&lt;/p&gt;
&lt;h3 class="heading" id="3-double-phase-operators-and-nonstandard-growth"&gt;
 3. Double Phase Operators and Nonstandard Growth&lt;span class="heading__anchor"&gt; &lt;a href="#3-double-phase-operators-and-nonstandard-growth"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The &lt;strong&gt;double phase functional&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;$$\mathcal{H}(u) := \int_\Omega \bigl(|Du|^p + a(x)|Du|^q\bigr)\,dx, \quad q &amp;gt; p &amp;gt; 1,\ a(x) \geq 0,$$&lt;/p&gt;
&lt;p&gt;introduced by Colombo and Mingione, generated a remarkable surge of activity throughout 2021–2026.&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Year&lt;/th&gt;
					&lt;th&gt;Paper&lt;/th&gt;
					&lt;th&gt;Authors&lt;/th&gt;
					&lt;th&gt;Key Contribution&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;2021&lt;/td&gt;
					&lt;td&gt;A new class of double phase variable exponent problems&lt;/td&gt;
					&lt;td&gt;Crespo-Blanco, Gasiński, Harjulehto, Winkert&lt;/td&gt;
					&lt;td&gt;Existence/uniqueness for new double phase with variable exponents&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2021&lt;/td&gt;
					&lt;td&gt;Double phase implicit obstacle problems&lt;/td&gt;
					&lt;td&gt;Zeng, Rădulescu, Winkert&lt;/td&gt;
					&lt;td&gt;Mixed BVPs with convection and multivalued conditions&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2022&lt;/td&gt;
					&lt;td&gt;Nonuniformly elliptic Schauder theory&lt;/td&gt;
					&lt;td&gt;De Filippis, Mingione&lt;/td&gt;
					&lt;td&gt;Schauder estimates in nonuniform elliptic settings&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2022&lt;/td&gt;
					&lt;td&gt;New embedding results for double phase problems&lt;/td&gt;
					&lt;td&gt;Ho, Winkert&lt;/td&gt;
					&lt;td&gt;Musielak-Orlicz Sobolev spaces with variable exponent&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2023&lt;/td&gt;
					&lt;td&gt;Regularity at nearly linear growth&lt;/td&gt;
					&lt;td&gt;De Filippis, Mingione&lt;/td&gt;
					&lt;td&gt;Hölder gradient regularity for log-type functionals&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2025&lt;/td&gt;
					&lt;td&gt;Partial regularity for parabolic double phase systems&lt;/td&gt;
					&lt;td&gt;Ok, Scilla, Stroffolini&lt;/td&gt;
					&lt;td&gt;Partial Hölder regularity for parabolic systems&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The work of &lt;strong&gt;Cristiana De Filippis&lt;/strong&gt; and &lt;strong&gt;Giuseppe Mingione&lt;/strong&gt; is particularly prominent throughout, providing a comprehensive regularity theory for double phase and nonuniformly elliptic functionals (arXiv:2308.10222).&lt;/p&gt;
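&lt;p&gt;The difficulty of the double phase setting can be read off from the integrand itself. This is a sketch: the symbol $H$, the Hölder exponent $\alpha$ of the coefficient, and the dimension $n$ are notation assumed here. Writing $H(x,z) = |z|^p + a(x)|z|^q$ and assuming $a$ is bounded, one has&lt;/p&gt;
&lt;p&gt;$$|z|^p \leq H(x,z) \leq \bigl(1 + \|a\|_{L^\infty}\bigr)\bigl(1 + |z|^q\bigr),$$&lt;/p&gt;
&lt;p&gt;so the functional has $p$-growth from below but only $q$-growth from above. On the set $\{a = 0\}$ it behaves like the $p$-Dirichlet energy, and on $\{a &amp;gt; 0\}$ like the $q$-Dirichlet energy. Regularity results in this literature typically require the gap between the two phases to be controlled by the smoothness of the coefficient, for instance through a bound of the form&lt;/p&gt;
&lt;p&gt;$$\frac{q}{p} \leq 1 + \frac{\alpha}{n}, \qquad a \in C^{0,\alpha}(\Omega),$$&lt;/p&gt;
&lt;p&gt;although the precise hypotheses vary from paper to paper and should be checked against the original sources.&lt;/p&gt;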
&lt;h3 class="heading" id="4-normalized-solutions-and-variational-methods-for-schrödinger-equations"&gt;
 4. Normalized Solutions and Variational Methods for Schrödinger Equations&lt;span class="heading__anchor"&gt; &lt;a href="#4-normalized-solutions-and-variational-methods-for-schr%c3%b6dinger-equations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The problem of finding solutions $u \in H^1(\mathbb{R}^N)$ with prescribed $L^2$-norm — the &lt;em&gt;mass constraint&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;$$\int_{\mathbb{R}^N} |u|^2\,dx = c$$&lt;/p&gt;
&lt;p&gt;— has become a central theme in the study of nonlinear Schrödinger equations. The influential papers by &lt;strong&gt;Louis Jeanjean and Thanh Trung Le&lt;/strong&gt; on multiple normalized solutions for Sobolev critical equations (2020–2021) and by &lt;strong&gt;Juncheng Wei and Yuanze Wu&lt;/strong&gt; on normalized solutions with critical Sobolev exponent and mixed nonlinearities (2021) launched a wave of activity. Key directions include: normalized ground states for NLS with potential (Bartsch, Molle, Rizzi &amp;amp; Verzini); normalized solutions for Schrödinger-Poisson-Slater equations; and standing waves and stability for &lt;strong&gt;Choquard equations&lt;/strong&gt;. The March 2026 arXiv listings confirm that sharp exponents, existence and asymptotics for Choquard equations, and boosted ground states for pseudo-relativistic Schrödinger equations remain highly active.&lt;/p&gt;
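&lt;p&gt;A short scaling computation shows why the exponent matters under a mass constraint. This is a sketch for the model pure-power equation $-\Delta u + \lambda u = |u|^{p-2}u$, which is assumed here for illustration; $\lambda$ plays the role of a Lagrange multiplier and $\mu &amp;gt; 0$ is a dilation parameter. The energy and the mass-preserving dilation are&lt;/p&gt;
&lt;p&gt;$$E(u) = \frac{1}{2}\int_{\mathbb{R}^N} |\nabla u|^2\,dx - \frac{1}{p}\int_{\mathbb{R}^N} |u|^p\,dx, \qquad u_\mu(x) = \mu^{N/2} u(\mu x), \qquad \int_{\mathbb{R}^N} |u_\mu|^2\,dx = \int_{\mathbb{R}^N} |u|^2\,dx.$$&lt;/p&gt;
&lt;p&gt;A change of variables gives&lt;/p&gt;
&lt;p&gt;$$E(u_\mu) = \frac{\mu^2}{2}\int_{\mathbb{R}^N} |\nabla u|^2\,dx - \frac{\mu^{N(p-2)/2}}{p}\int_{\mathbb{R}^N} |u|^p\,dx.$$&lt;/p&gt;
&lt;p&gt;If $N(p-2)/2 &amp;gt; 2$, that is $p &amp;gt; 2 + 4/N$, then $E(u_\mu) \to -\infty$ as $\mu \to \infty$, so the energy is unbounded below on the constraint and global minimisation is impossible. For $p &amp;lt; 2 + 4/N$ the Gagliardo-Nirenberg inequality keeps the energy bounded below. The threshold $p = 2 + 4/N$ is the mass-critical exponent.&lt;/p&gt;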
&lt;p&gt;Parallel work on eigenvalue problems addresses &lt;strong&gt;Steklov eigenvalues&lt;/strong&gt; (monotonicity for regular $N$-gons, sharp geometric bounds), eigenvalues of &lt;strong&gt;Pucci&amp;rsquo;s extremal operator&lt;/strong&gt; in 3D, and &lt;strong&gt;biharmonic Steklov problems&lt;/strong&gt; on thin sets.&lt;/p&gt;
&lt;h3 class="heading" id="5-mean-field-games-and-aggregation-diffusion-pdes"&gt;
 5. Mean Field Games and Aggregation-Diffusion PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#5-mean-field-games-and-aggregation-diffusion-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mean field game theory&lt;/strong&gt; generated a prolific suite of PDE questions between 2021 and 2026. Highlights include: Imanuvilov, Liu, and Yamamoto (2023) proving Lipschitz stability for determining states and inverse sources in MFG equations using Carleman estimates; Klibanov, Li, and Liu (2023) on Hölder stability via Carleman estimates; the inverse boundary problem for first-order master equations (Liu &amp;amp; Zhang, 2022); and Bresch, Jabin, and Soler (2022) introducing a novel probabilistic derivation of the mean-field limit applicable to Vlasov-Poisson-Fokker-Planck in 2D. By 2025–2026, nonlocal MFG models with spatial interactions and new work on &lt;strong&gt;Wasserstein gradient flows of kernel mean discrepancies&lt;/strong&gt; with connections to machine learning appeared on arXiv (arXiv:2506.01200).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Optimal transport&lt;/strong&gt; has deeply influenced aggregation-diffusion equations and gradient flows. The March 2026 arXiv listings include a major 73-page paper by &lt;strong&gt;Carrillo, Gwiazda, and Skrzeczkowski&lt;/strong&gt; presenting a new formula for the Wasserstein distance between solutions to nonlinear continuity equations.&lt;/p&gt;
&lt;h3 class="heading" id="6-chemotaxis-and-reaction-diffusion-systems"&gt;
 6. Chemotaxis and Reaction-Diffusion Systems&lt;span class="heading__anchor"&gt; &lt;a href="#6-chemotaxis-and-reaction-diffusion-systems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Chemotaxis systems — in particular Keller-Segel models with &lt;strong&gt;signal-dependent motility&lt;/strong&gt; (density-suppressed diffusion) — generated intense activity. Key papers include logistic damping effects and global classical solutions for reaction-diffusion systems with density-suppressed motility (Lyu &amp;amp; Wang, 2021), refined regularity analysis for Keller-Segel-consumption systems (Li &amp;amp; Winkler, 2022), and global existence with uniform boundedness under signal-dependent motility (Jiang &amp;amp; Laurençot, 2021). In 2024, a construction of smooth finite-time blowup solutions for the &lt;strong&gt;3D Keller-Segel-Navier-Stokes&lt;/strong&gt; (chemotaxis-fluid) system with buoyancy appeared, using a quantitative method that directly constructs the singular solution (arXiv:2404.17228).&lt;/p&gt;
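&lt;p&gt;A prototypical system with signal-dependent motility can be sketched as follows. This is a standard model form assumed here for illustration, with $u$ the cell density, $v$ the signal concentration, and $\gamma$ a positive decreasing motility function; the cited papers add further terms such as logistic sources.&lt;/p&gt;
&lt;p&gt;$$\partial_t u = \Delta\bigl(\gamma(v)\,u\bigr), \qquad \partial_t v = \Delta v - v + u.$$&lt;/p&gt;
&lt;p&gt;Expanding the first equation gives&lt;/p&gt;
&lt;p&gt;$$\Delta\bigl(\gamma(v)\,u\bigr) = \nabla \cdot \bigl(\gamma(v)\nabla u\bigr) + \nabla \cdot \bigl(u\,\gamma'(v)\nabla v\bigr),$$&lt;/p&gt;
&lt;p&gt;so when $\gamma' &amp;lt; 0$ the model is a Keller-Segel system with diffusivity $\gamma(v)$ and chemotactic sensitivity $-\gamma'(v)$. The analytical difficulty is that the diffusion degenerates wherever $\gamma(v)$ becomes small, which is where the signal is large.&lt;/p&gt;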
&lt;p&gt;In parallel, &lt;strong&gt;free boundary reaction-diffusion models&lt;/strong&gt; for species spreading and SIS epidemic models — including 2026 work on asymmetric kernels in advective periodic environments — continue to produce threshold and long-time dynamics results.&lt;/p&gt;
&lt;h3 class="heading" id="7-free-boundary-problems"&gt;
 7. Free Boundary Problems&lt;span class="heading__anchor"&gt; &lt;a href="#7-free-boundary-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Stefan problem (modelling solidification and melting) remained highly active throughout 2021–2026. Key results include $C^{1,\alpha}$ regularity of flat free boundaries for the &lt;strong&gt;inhomogeneous one-phase Stefan problem&lt;/strong&gt; (Ferrari, Forcillo, Giovagnoli &amp;amp; Jesus, 2024; arXiv:2404.07535); regularity of the free boundary for the &lt;strong&gt;supercooled Stefan problem&lt;/strong&gt; in arbitrary dimensions (2025; arXiv:2512.10136), where the free boundary decomposes into regular, singular, and jump parts with the singular part having controlled parabolic dimension; and well-posedness and regularity of physical solutions for the supercooled Stefan problem assuming only integrable initial temperature, with explicit classification of free boundary points (2025; arXiv:2506.18741). These results use obstacle problem techniques, non-degeneracy estimates, and sharp free boundary classification arguments.&lt;/p&gt;
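&lt;p&gt;For orientation, the classical one-phase Stefan problem can be sketched as follows. This is the homogeneous model with all physical constants normalised to $1$, assumed here for illustration, with $u \geq 0$ the temperature of the liquid; the inhomogeneous and supercooled variants discussed above modify it.&lt;/p&gt;
&lt;p&gt;$$\partial_t u = \Delta u \quad \text{in } \{u &amp;gt; 0\}, \qquad \partial_t u = |\nabla u|^2 \quad \text{on } \partial\{u &amp;gt; 0\}.$$&lt;/p&gt;
&lt;p&gt;The boundary condition encodes the speed of the interface. If the free boundary moves with normal velocity $V$ in the direction $\nu = -\nabla u/|\nabla u|$ pointing out of $\{u &amp;gt; 0\}$, differentiating $u = 0$ along the interface gives $\partial_t u - V|\nabla u| = 0$. The Stefan condition $V = |\nabla u|$ then yields $\partial_t u = |\nabla u|^2$.&lt;/p&gt;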
&lt;p&gt;Shape optimisation for &lt;strong&gt;principal eigenvalues of Pucci operators&lt;/strong&gt; and $\Gamma$-convergence of convolution-type functionals for free discontinuity problems are active related directions in 2026.&lt;/p&gt;
&lt;h3 class="heading" id="8-stochastic-pdes-and-regularity-structures"&gt;
 8. Stochastic PDEs and Regularity Structures&lt;span class="heading__anchor"&gt; &lt;a href="#8-stochastic-pdes-and-regularity-structures"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Martin Hairer&amp;rsquo;s theory of regularity structures generated deep ongoing activity. The period 2021–2026 saw Bailleul and Bruned (2021) extending the algebraic renormalisation framework of regularity structures to a broader class of singular SPDEs (arXiv:2101.11949); the publication of &lt;strong&gt;&amp;ldquo;A tourist&amp;rsquo;s guide to regularity structures&amp;rdquo;&lt;/strong&gt; by Bailleul and Hoshino (2025/2026) in &lt;em&gt;EMS Surveys&lt;/em&gt; as an essentially self-contained treatment; applications to stochastic quantisation ($\Phi^4_3$), the &lt;strong&gt;KPZ equation&lt;/strong&gt;, and stochastic geometric flows (Hairer, 2021); and variance renormalisation in regularity structures for the 2D generalised Parabolic Anderson Model (Gerencsér &amp;amp; Hsu, 2026).&lt;/p&gt;
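&lt;p&gt;The meaning of renormalisation in this context can be illustrated on the two model equations mentioned above. This is a sketch: $\xi_\varepsilon$ denotes space-time white noise mollified at scale $\varepsilon$, and $C_\varepsilon$ a diverging constant, both notation assumed here.&lt;/p&gt;
&lt;p&gt;$$\partial_t \phi_\varepsilon = \Delta \phi_\varepsilon - \phi_\varepsilon^3 + C_\varepsilon \phi_\varepsilon + \xi_\varepsilon \quad (\Phi^4_3), \qquad \partial_t h_\varepsilon = \partial_x^2 h_\varepsilon + (\partial_x h_\varepsilon)^2 - C_\varepsilon + \xi_\varepsilon \quad (\text{KPZ}).$$&lt;/p&gt;
&lt;p&gt;Without the counterterms the nonlinearities have no limit as $\varepsilon \to 0$, because the solutions are too irregular for $\phi^3$ or $(\partial_x h)^2$ to be defined classically. The theory shows that for a suitable choice of $C_\varepsilon \to \infty$ the solutions converge to a limit that does not depend on the mollifier.&lt;/p&gt;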
&lt;p&gt;On the fluid side, global unique solvability for &lt;strong&gt;stochastic Navier-Stokes-Korteweg&lt;/strong&gt; equations and &lt;strong&gt;stochastic Allen-Cahn-Navier-Stokes&lt;/strong&gt; systems with ergodic invariant measures appeared in 2025, and non-uniqueness of Leray-Hopf solutions was extended to the stochastic forced setting.&lt;/p&gt;
&lt;h3 class="heading" id="9-dispersive-pdes-wave-turbulence-well-posedness-and-blowup"&gt;
 9. Dispersive PDEs: Wave Turbulence, Well-Posedness, and Blowup&lt;span class="heading__anchor"&gt; &lt;a href="#9-dispersive-pdes-wave-turbulence-well-posedness-and-blowup"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The &lt;strong&gt;full derivation of the wave kinetic equation&lt;/strong&gt; from the cubic NLS by Deng and Hani (arXiv:1912.09518, 2021) was the most impactful dispersive result of the era. Their analysis relies on absolutely convergent Feynman-diagram (paired-tree) expansions and identifies favourable scaling laws $\alpha \sim L^{-\varepsilon}$ for the kinetic limit.&lt;/p&gt;
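&lt;p&gt;For reference, the wave kinetic equation for the cubic NLS has the following structure. This is a sketch up to a positive normalising constant, with notation assumed here: $n(t,k)$ is the limiting energy spectrum, $n = n(t,k)$, and $n_j = n(t,k_j)$.&lt;/p&gt;
&lt;p&gt;$$\partial_t n(t,k) = \int_{(\mathbb{R}^d)^3} n\,n_1 n_2 n_3 \Bigl(\frac{1}{n} + \frac{1}{n_1} - \frac{1}{n_2} - \frac{1}{n_3}\Bigr)\,\delta(k + k_1 - k_2 - k_3)\,\delta\bigl(|k|^2 + |k_1|^2 - |k_2|^2 - |k_3|^2\bigr)\,dk_1\,dk_2\,dk_3.$$&lt;/p&gt;
&lt;p&gt;The two delta functions restrict the integral to resonant four-wave interactions, which conserve both momentum and the dispersion relation $|k|^2$. Expanding the bracket gives gain terms $n_2 n_3 (n + n_1)$ and loss terms $n\,n_1 (n_2 + n_3)$, in direct analogy with the Boltzmann collision operator.&lt;/p&gt;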
&lt;p&gt;Ongoing work includes polynomial growth of Sobolev norms for the fractional NLS on $\mathbb{T}^d$ (Wang, 2026); low-regularity global well-posedness for generalised Zakharov-Kuznetsov equations (Nowicki-Koth, 2026); &lt;strong&gt;modulated dispersive equations&lt;/strong&gt; (modulated KdV with normal form reduction; Gubinelli, Li, Li &amp;amp; Oh, 2025; arXiv:2505.24270); and probabilistic well-posedness of dispersive PDEs beyond variance blowup (2025; arXiv:2509.02344). Scattering results for the quintic generalised Benjamin-Bona-Mahony equation and the 3D Zakharov-Kuznetsov equation, and long-time asymptotics via Riemann-Hilbert and inverse scattering methods for integrable equations, appear in the March 2026 listings.&lt;/p&gt;
&lt;h3 class="heading" id="10-geometric-pdes"&gt;
 10. Geometric PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#10-geometric-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Ricci flow&lt;/strong&gt; uniqueness in the non-compact setting (Lee, 2025; arXiv:2503.20292) and a new non-Kähler expanding Ricci soliton construction with Kähler tangent cone at infinity (Bamler, Chen &amp;amp; Conlon, 2026) reflect the continued health of geometric flows. The &lt;strong&gt;volume-preserving mean curvature flow&lt;/strong&gt; regularity in dimensions 2 and 3 appeared in March 2026 (Arya, Jeon &amp;amp; Julin).&lt;/p&gt;
&lt;p&gt;Regularity theory for &lt;strong&gt;Monge-Ampère equations&lt;/strong&gt; received major contributions via a geometric approach: Brendle, Léger, McCann, and Rankin (2023; arXiv:2311.10208) derived the Pogorelov second-derivative bound using Kim-McCann-Warren&amp;rsquo;s pseudo-Riemannian geometry, providing a new approach to $C^1$ estimates for optimal transport maps. Liouville theorems and sharp solvability for the &lt;strong&gt;parabolic Monge-Ampère equation&lt;/strong&gt; with periodic data appeared in March 2026.&lt;/p&gt;
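&lt;p&gt;The link between second-derivative bounds and regularity of transport maps is clearest in the quadratic-cost case. This is a sketch of that special case only, with source and target densities $\rho_0$, $\rho_1$ as notation assumed here; the cited work treats more general costs. By Brenier&amp;rsquo;s theorem the optimal map is $T = \nabla u$ for a convex potential $u$, and the condition that $T$ pushes $\rho_0$ forward to $\rho_1$ becomes the Monge-Ampère equation&lt;/p&gt;
&lt;p&gt;$$\det D^2 u(x) = \frac{\rho_0(x)}{\rho_1(\nabla u(x))}.$$&lt;/p&gt;
&lt;p&gt;Since $DT = D^2 u$, a bound on the second derivatives of $u$ is exactly a $C^1$ bound on the transport map.&lt;/p&gt;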
&lt;h3 class="heading" id="11-inverse-problems-for-pdes"&gt;
 11. Inverse Problems for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#11-inverse-problems-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The &lt;strong&gt;Calderón problem&lt;/strong&gt; — recovering a coefficient from boundary Dirichlet-to-Neumann data — attracted major advances: the quasilinear setting (Cârstea, Feizmohammadi, Kian, Krupchyk &amp;amp; Uhlmann, 2021), inverse problems for fractional semilinear elliptic equations (Lai &amp;amp; Lin, 2020), the Calderón problem via Vekua theory (Clifford analysis framework, 2026; arXiv:2601.17313), and the convex lifting approach (Alberti, Petit &amp;amp; Sanna, 2025; arXiv:2507.00645). The &lt;strong&gt;anisotropic Calderón problem&lt;/strong&gt; for fractional Schrödinger operators on closed Riemannian manifolds (Krupchyk, 2025) was an important further advance.&lt;/p&gt;
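&lt;p&gt;In its classical form the problem can be stated as follows. This is a sketch of the linear conductivity case, with the conductivity $\gamma$, the boundary voltage $f$, and the outward unit normal $\nu$ as notation assumed here. Given a bounded domain $\Omega$ and a positive conductivity $\gamma$, solve&lt;/p&gt;
&lt;p&gt;$$\nabla \cdot (\gamma \nabla u) = 0 \ \text{ in } \Omega, \qquad u = f \ \text{ on } \partial\Omega, \qquad \Lambda_\gamma f := \gamma\,\partial_\nu u\big|_{\partial\Omega}.$$&lt;/p&gt;
&lt;p&gt;The Dirichlet-to-Neumann map $\Lambda_\gamma$ sends boundary voltages to boundary currents, and the inverse problem asks whether $\Lambda_\gamma$ determines $\gamma$. In weak form,&lt;/p&gt;
&lt;p&gt;$$\langle \Lambda_\gamma f, g \rangle = \int_\Omega \gamma\,\nabla u \cdot \nabla v\,dx \quad \text{for any } v \text{ with } v|_{\partial\Omega} = g,$$&lt;/p&gt;
&lt;p&gt;which shows that $\Lambda_\gamma$ depends nonlinearly on $\gamma$ through the solution $u$. The quasilinear, fractional, and anisotropic variants listed above replace the conductivity equation by the corresponding operator.&lt;/p&gt;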
&lt;p&gt;&lt;strong&gt;Inverse moving source problems for parabolic equations&lt;/strong&gt; (Zhao, 2023), reconstruction of scalar parameters in subdiffusion, and inverse problems for &lt;strong&gt;multi-term time-fractional diffusion&lt;/strong&gt; with Caputo derivatives are active in 2025–2026.&lt;/p&gt;
&lt;h3 class="heading" id="12-semi-classical-analysis-spectral-theory-and-nonlinear-elliptic-theory"&gt;
 12. Semi-Classical Analysis, Spectral Theory, and Nonlinear Elliptic Theory&lt;span class="heading__anchor"&gt; &lt;a href="#12-semi-classical-analysis-spectral-theory-and-nonlinear-elliptic-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A 2024 arXiv survey on &lt;strong&gt;semi-classical analysis&lt;/strong&gt;, which introduces three representative topics and was ranked the top 2024 math.AP paper by Paper Digest, and a 2026 paper celebrating the &lt;strong&gt;100th anniversary of the WKB papers&lt;/strong&gt; (Vũ Ngọc) both indicate that semi-classical methods remain foundational.&lt;/p&gt;
&lt;p&gt;In nonlinear elliptic and parabolic theory, major contributions include: &lt;em&gt;Regularity Theory for Elliptic PDEs&lt;/em&gt; by Fernández-Real and Ros-Oton (2023), a comprehensive self-contained reference; Fujita-type results for degenerate parabolic equations on &lt;strong&gt;Heisenberg groups&lt;/strong&gt; (Fino, Ruzhansky &amp;amp; Torebek, 2023), ranked the highest-impact 2023 math.AP paper; and singularity formation for nonlinear heat equations on infinite graphs (Punko &amp;amp; Zucchero, 2026).&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="emerging-and-cross-cutting-themes-20252026"&gt;
 Emerging and Cross-Cutting Themes (2025–2026)&lt;span class="heading__anchor"&gt; &lt;a href="#emerging-and-cross-cutting-themes-20252026"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Computer-assisted proofs and rigorous numerics.&lt;/strong&gt; The Chen–Hou Euler blowup proof and related work on the CLM model (Hou-Wang, 2026) demonstrate that computer-assisted methods with rigorous error control are becoming standard for complex nonlinear stability analyses. These methods combine spectral Galerkin approximations with interval arithmetic and weighted norm frameworks to certify nonlinear stability constants — a methodology likely to expand further.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AI and machine learning for PDEs.&lt;/strong&gt; The 2026 workshop &lt;em&gt;MLPDES26&lt;/em&gt; and the NSF/AMS report on AI for the mathematical sciences signal growing interplay between pure math.AP and deep learning. Neural PDE networks for equation discovery (arXiv:2502.18377), geometric operator learning via optimal transport (arXiv:2507.20065), and AI-assisted singularity discovery (DeepMind, 2025) represent this interdisciplinary frontier.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PDE methods in geometry and probability.&lt;/strong&gt; The intersection of math.AP with differential geometry, probability (SPDEs), and mathematical physics remains extremely active. The March 2026 listings span general relativity (tensorial wave equations), Kähler geometry (Ricci solitons), and stochastic PDEs — confirming that math.AP functions as a hub connecting multiple mathematical disciplines.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="open-problems"&gt;
 Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Smooth-data Euler regularity beyond bounded domains.&lt;/strong&gt; The Chen–Hou result proves blowup in a bounded domain. Whether finite-time singularity occurs for the 3D Euler equations in all of $\mathbb{R}^3$ from smooth, rapidly decaying initial data — the original Euler problem — remains open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Navier-Stokes uniqueness from smooth initial data.&lt;/strong&gt; The Albritton-Brué-Colombo result proves non-uniqueness for &lt;em&gt;forced&lt;/em&gt; NS from zero initial velocity. Non-uniqueness (or uniqueness) of Leray–Hopf solutions for the &lt;em&gt;unforced&lt;/em&gt; equations from smooth $H^1$ initial data is unresolved (see the companion survey on self-similar solutions).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Optimal regularity theory for double phase problems.&lt;/strong&gt; Despite the comprehensive work of De Filippis and Mingione, optimal Schauder estimates for parabolic double phase systems at the boundary and under critical growth conditions are not fully established.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Complete derivation programme for Hilbert&amp;rsquo;s Sixth Problem.&lt;/strong&gt; Deng-Hani-Ma resolved the case of hard-sphere gases in the Boltzmann regime. The derivation of hydrodynamic equations from particle dynamics in other regimes — dense gases, quantum systems, plasma — remains largely open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Global well-posedness for energy-critical NLS in high dimensions.&lt;/strong&gt; Despite progress on wave kinetic theory and probabilistic well-posedness, the deterministic global well-posedness theory for energy-critical and supercritical dispersive equations in dimensions $d \geq 5$ has significant gaps.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Computer-assisted and numerical computation in pure math.AP.&lt;/strong&gt; The growing use of computer-assisted proofs raises methodological questions about standards of verification, reproducibility, and the scope of problems accessible to these techniques.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Albritton, D., Brué, E., &amp;amp; Colombo, M. (2021). &lt;em&gt;Non-uniqueness of Leray solutions of the forced Navier-Stokes equations&lt;/em&gt;. &lt;a href="https://cvgmt.sns.it/media/doc/paper/5405/main.pdf"&gt;https://cvgmt.sns.it/media/doc/paper/5405/main.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bailleul, I., &amp;amp; Bruned, Y. (2021). &lt;em&gt;Renormalised singular stochastic PDEs&lt;/em&gt;. arXiv:2101.11949. &lt;a href="https://www.pure.ed.ac.uk/ws/portalfiles/portal/194767736/2101.11949.pdf"&gt;https://www.pure.ed.ac.uk/ws/portalfiles/portal/194767736/2101.11949.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bailleul, I., &amp;amp; Hoshino, M. (2025). A tourist&amp;rsquo;s guide to regularity structures and singular stochastic PDEs. &lt;em&gt;EMS Surveys in Mathematical Sciences&lt;/em&gt;. &lt;a href="https://ems.press/journals/emss/articles/14298505"&gt;https://ems.press/journals/emss/articles/14298505&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Brendle, S., Léger, F., McCann, R. J., &amp;amp; Rankin, C. (2023). &lt;em&gt;A geometric approach to a priori estimates for optimal transport maps&lt;/em&gt;. arXiv:2311.10208. &lt;a href="https://arxiv.org/abs/2311.10208"&gt;https://arxiv.org/abs/2311.10208&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Chen, J., &amp;amp; Hou, T. Y. (2022). &lt;em&gt;Stable nearly self-similar blowup of the 2D Boussinesq and 3D Euler equations with smooth data I: Analysis&lt;/em&gt;. arXiv:2210.07191. &lt;a href="https://arxiv.org/abs/2210.07191"&gt;https://arxiv.org/abs/2210.07191&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Chen, J., &amp;amp; Hou, T. Y. (2023). &lt;em&gt;Stable nearly self-similar blowup of the 2D Boussinesq and 3D Euler equations with smooth data II: Rigorous numerics&lt;/em&gt;. arXiv:2305.05660. &lt;a href="https://arxiv.org/abs/2305.05660"&gt;https://arxiv.org/abs/2305.05660&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Chen, J., &amp;amp; Hou, T. Y. (2025). Singularity formation in 3D Euler equations with smooth initial data. &lt;em&gt;PNAS, 122&lt;/em&gt;(28). &lt;a href="https://www.pnas.org/doi/10.1073/pnas.2500940122"&gt;https://www.pnas.org/doi/10.1073/pnas.2500940122&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;De Filippis, C., &amp;amp; Mingione, G. (2023). &lt;em&gt;Regularity for double phase problems at nearly linear growth&lt;/em&gt;. arXiv:2308.10222. &lt;a href="https://arxiv.org/abs/2308.10222"&gt;https://arxiv.org/abs/2308.10222&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;DeepMind. (2025). &lt;em&gt;Discovering new solutions to century-old problems in fluid dynamics&lt;/em&gt;. &lt;a href="https://deepmind.google/blog/discovering-new-solutions-to-century-old-problems-in-fluid-dynamics/"&gt;https://deepmind.google/blog/discovering-new-solutions-to-century-old-problems-in-fluid-dynamics/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Deng, Y., &amp;amp; Hani, Z. (2021). &lt;em&gt;On the derivation of the wave kinetic equation for NLS&lt;/em&gt;. arXiv:1912.09518. &lt;a href="http://arxiv.org/pdf/1912.09518.pdf"&gt;http://arxiv.org/pdf/1912.09518.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Deng, Y., Hani, Z., &amp;amp; Ma, X. (2024). &lt;em&gt;Long time derivation of the Boltzmann equation from hard sphere dynamics&lt;/em&gt;. arXiv:2408.07818. &lt;a href="https://www.semanticscholar.org/paper/91b67412a6058c1ace054a32fbf36fa2d2998d3d"&gt;https://www.semanticscholar.org/paper/91b67412a6058c1ace054a32fbf36fa2d2998d3d&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Deng, Y., Hani, Z., &amp;amp; Ma, X. (2025). &lt;em&gt;Hilbert&amp;rsquo;s sixth problem: Derivation of fluid equations via Boltzmann&amp;rsquo;s kinetic theory&lt;/em&gt;. arXiv:2503.01800. &lt;a href="https://www.semanticscholar.org/paper/01d8f11b5d31f7037fb4914797e938db11d76ec5"&gt;https://www.semanticscholar.org/paper/01d8f11b5d31f7037fb4914797e938db11d76ec5&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Ferrari, F., Forcillo, N., Giovagnoli, D., &amp;amp; Jesus, B. (2024). &lt;em&gt;Free boundary regularity for the inhomogeneous one-phase Stefan problem&lt;/em&gt;. arXiv:2404.07535. &lt;a href="https://arxiv.org/abs/2404.07535"&gt;https://arxiv.org/abs/2404.07535&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Gubinelli, M., Li, J., Li, T., &amp;amp; Oh, T. (2025). &lt;em&gt;Nonlinear PDEs with modulated dispersion IV: Normal form reduction for modulated KdV&lt;/em&gt;. arXiv:2505.24270. &lt;a href="https://arxiv.org/pdf/2505.24270.pdf"&gt;https://arxiv.org/pdf/2505.24270.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hou, T. Y. (2021). &lt;em&gt;The potentially singular behavior of the 3D Navier-Stokes equations&lt;/em&gt;. arXiv:2107.06509. &lt;a href="https://arxiv.org/abs/2107.06509"&gt;https://arxiv.org/abs/2107.06509&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hu, J., Jin, S., Liu, N., &amp;amp; Zhang, L. (2024). Quantum circuits for partial differential equations via Schrödingerisation. &lt;em&gt;Quantum, 8&lt;/em&gt;, 1563.&lt;/p&gt;
&lt;p&gt;Imanuvilov, O. Y., Liu, Y., &amp;amp; Yamamoto, M. (2023). Lipschitz stability for determining states and inverse sources in MFG equations. &lt;em&gt;[Journal of Mathematical Analysis]&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Ok, J., Scilla, G., &amp;amp; Stroffolini, B. (2025). &lt;em&gt;Partial regularity for parabolic systems of double phase type&lt;/em&gt;. arXiv:2510.03849. &lt;a href="https://arxiv.org/pdf/2510.03849.pdf"&gt;https://arxiv.org/pdf/2510.03849.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Paper Digest. (2025, March). &lt;em&gt;Most influential arXiv (Analysis of PDEs) papers — 2025-03 version&lt;/em&gt;. &lt;a href="https://www.paperdigest.org/2025/03/most-influential-arxiv-analysis-of-pdes-papers-2025-03-version/"&gt;https://www.paperdigest.org/2025/03/most-influential-arxiv-analysis-of-pdes-papers-2025-03-version/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Segata, J., &amp;amp; Chen, M. (2026). &lt;em&gt;Scattering for the 3D Zakharov-Kuznetsov equation&lt;/em&gt; [arXiv preprint]. arXiv math.AP March 2026.&lt;/p&gt;
&lt;p&gt;arXiv math.AP listings. (2026, February–March). &lt;a href="https://arxiv.org/list/math.AP/2026-03"&gt;https://arxiv.org/list/math.AP/2026-03&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Paper Reading - Optimization problems for elliptic PDEs (2601.01591)</title><link>https://blog.namln.org/en/posts/pr-2601.01591/</link><pubDate>Fri, 20 Feb 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/pr-2601.01591/</guid><description>&lt;p&gt;This paper is a panoramic tour of three families of &lt;strong&gt;optimal control problems for elliptic PDEs&lt;/strong&gt;: where the control is the coefficient, the potential, or the source term, unifying and sharpening results from the authors’ previous works.&lt;/p&gt;
&lt;h2 class="heading" id="three-ways-to-control-an-elliptic-pde"&gt;
 Three ways to control an elliptic PDE&lt;span class="heading__anchor"&gt; &lt;a href="#three-ways-to-control-an-elliptic-pde"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The authors always consider a Dirichlet problem on a bounded domain $\Omega \subset \mathbb{R}^d$, with the solution $u$ as the &lt;strong&gt;state&lt;/strong&gt; and a function (or measure) as the &lt;strong&gt;control&lt;/strong&gt;. They study three settings:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal coefficients&lt;/strong&gt; $a(x)$:
$$
-\mathrm{div}(a(x)\nabla u) = f \text{ in } \Omega, \quad u=0 \text{ on } \partial\Omega,
$$
cost function $J(u,a) = \int_\Omega j(u,a)\,dx$, with a constraint $\int_\Omega \psi(a)\,dx \le 1$.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal potentials&lt;/strong&gt; $V(x)$:
$$
-\Delta u + V(x)u = f \text{ in } \Omega, \quad u\in H_0^1(\Omega),
$$
cost function $J(u,V) = \int_\Omega (j(x,u) + \psi(V))\,dx$.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal sources&lt;/strong&gt; $f$:
$$
-\Delta u = f \text{ in } \Omega, \quad u\in H_0^1(\Omega),
$$
cost function $J(f) = \int_\Omega j(x,u_f,f)\,dx$ with $\int_\Omega \psi(f)\,dx \le m$.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In all cases, $\psi$ is convex and lower semi-continuous (l.s.c.), encoding constraints and penalizations on the control. The paper focuses on existence of optimal controls (sometimes as measures), characterization via auxiliary variational problems and adjoint states, bang–bang behavior, and regularity of optimal controls and their induced interfaces.&lt;/p&gt;
&lt;h2 class="heading" id="optimal-coefficients-where-to-put-the-good-material"&gt;
 Optimal Coefficients: Where to Put the Good Material?&lt;span class="heading__anchor"&gt; &lt;a href="#optimal-coefficients-where-to-put-the-good-material"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="minimal-compliance-and-measure-valued-coefficients"&gt;
 Minimal Compliance and Measure-Valued Coefficients&lt;span class="heading__anchor"&gt; &lt;a href="#minimal-compliance-and-measure-valued-coefficients"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The model problem is compliance minimization for $-\mathrm{div}(a(x)\nabla u) = f$ in $\Omega$, $u=0$ on $\partial\Omega$, with nonnegative $a$.&lt;/p&gt;
&lt;p&gt;Compliance is defined as:
$$
C(a) = \int_\Omega f u_a\,dx,
$$
and it relates to the energy
$$
E(a) = \inf_{u\in H_0^1} \int_\Omega \left(\tfrac{1}{2} a|\nabla u|^2 - f u\right)dx
$$
via $C(a) = -2E(a)$.&lt;/p&gt;
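&lt;p&gt;The identity $C(a) = -2E(a)$ follows from the weak form of the state equation. The short derivation below is a sketch that assumes $a$ is regular enough for $u_a \in H_0^1(\Omega)$ to be the unique minimizer in $E(a)$. Testing the equation with $u_a$ itself gives
$$
\int_\Omega a|\nabla u_a|^2\,dx = \int_\Omega f u_a\,dx = C(a),
$$
so that
$$
E(a) = \tfrac{1}{2}\int_\Omega a|\nabla u_a|^2\,dx - \int_\Omega f u_a\,dx = \tfrac{1}{2}C(a) - C(a) = -\tfrac{1}{2}C(a).
$$
In particular, minimizing the compliance is the same as maximizing the energy $E(a)$, which is where the max–min formulation comes from.&lt;/p&gt;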
&lt;p&gt;The optimization problem is written as:
$$
\min_{a \geq 0} \left\{ C(a) + \int_\Omega \psi(a)dx \right\},
$$
or equivalently as a &lt;strong&gt;max–min&lt;/strong&gt; problem in $(a,u)$.&lt;/p&gt;
&lt;p&gt;Two growth regimes of $\psi$ are crucial:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Superlinear&lt;/strong&gt;: $\psi(s)/s \to +\infty$. Then admissible coefficients are in $L^1(\Omega)$, and there exists an optimal $a_{\mathrm{opt}}\in L^1(\Omega)$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Linear growth&lt;/strong&gt;: $\psi(s)/s \to k&amp;gt;0$. Then it is natural to extend the problem to &lt;strong&gt;measures&lt;/strong&gt; $\mu\ge 0$, allowing &amp;ldquo;thin&amp;rdquo; structures on lower-dimensional sets. The cost $\int \psi(\mu)$ is interpreted through the Lebesgue–singular decomposition and the recession function $\psi_\infty$. An optimal measure $\mu_{\mathrm{opt}}\in \mathcal{M}^+(\Omega)$ still exists.&lt;/li&gt;
&lt;/ul&gt;
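&lt;p&gt;As a sketch of how the cost is usually extended to measures (this is the standard convention for convex integrands with linear growth, written in notation that is assumed here rather than taken from the paper): decompose $\mu = a\,dx + \mu^s$ into its absolutely continuous and singular parts and set
$$
\int_\Omega \psi(\mu) := \int_\Omega \psi(a)\,dx + \psi_\infty\,\mu^s(\Omega), \qquad \psi_\infty = \lim_{s\to+\infty}\frac{\psi(s)}{s} = k.
$$
For instance, with $\psi(s) = s$ the cost is simply the total mass $\mu(\Omega)$, so concentrating material on a curve or a surface costs exactly its mass.&lt;/p&gt;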
&lt;p&gt;Because the functional is convex in $u$ and concave in $a$, the authors exchange inf and sup and reduce to an &lt;strong&gt;auxiliary minimization problem in $u$&lt;/strong&gt; alone:
$$
\inf_{u} \int_\Omega \psi^{*}(|\nabla u|^2)\,dx - 2\int_\Omega u\,df,
$$
where $\psi^{*}$ is the Legendre–Fenchel conjugate. Under mild assumptions this problem has a unique minimizer $\bar u$, and the optimal coefficient is recovered point-wise from the optimality condition:
$$
a_{\mathrm{opt}}|\nabla\bar u|^2 = \psi(a_{\mathrm{opt}}) + \psi^*(|\nabla\bar u|^2).
$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Examples&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Power penalization&lt;/strong&gt; $\psi(s) = s^p/p$, $p&amp;gt;1$: The auxiliary problem involves a nonlinear PDE
$$-\Delta_{2p/(p-1)} u = \tfrac{2p}{p-1} f,$$
and the optimal coefficient is $a_{\mathrm{opt}}(x) = |\nabla \bar u(x)|^{2/(p-1)}$. For $\Omega$ a ball and $f=1$ or $f=\delta_0$, the authors give explicit radial formulas and plots for $\bar u$ and $a_{\mathrm{opt}}$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Two-phase box constraint&lt;/strong&gt; $\psi(s) = s$ on $[\alpha,\beta]$, $+\infty$ otherwise: The auxiliary problem yields an optimal coefficient $a_{\mathrm{opt}}\in L^\infty(\Omega)$ taking values in $[\alpha,\beta]$, and under regularity of $\Omega$ and $f$ one gets extra smoothness (e.g. $\nabla a_{\mathrm{opt}}\cdot \nabla \bar u \in L^2(\Omega)$).&lt;/li&gt;
&lt;/ul&gt;
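&lt;p&gt;For the power penalization, the conjugate and the pointwise recovery of $a_{\mathrm{opt}}$ can be checked by hand. The computation below is a sketch; the shorthand $t = |\nabla \bar u|^2$ and $p' = p/(p-1)$ is introduced here for convenience. For $t \ge 0$,
$$
\psi^*(t) = \sup_{s\ge 0}\left(st - \frac{s^p}{p}\right) = \frac{t^{p'}}{p'}, \qquad \text{attained at } s = t^{1/(p-1)},
$$
which gives $a_{\mathrm{opt}} = t^{1/(p-1)} = |\nabla\bar u|^{2/(p-1)}$ and an auxiliary integrand $\psi^*(|\nabla u|^2) = |\nabla u|^{2p/(p-1)}/p'$. The optimality condition then holds as an identity:
$$
\psi(a_{\mathrm{opt}}) + \psi^*(t) = \frac{t^{p'}}{p} + \frac{t^{p'}}{p'} = t^{p'} = a_{\mathrm{opt}}\,t,
$$
using $1/p + 1/p' = 1$ and $1 + 1/(p-1) = p'$.&lt;/p&gt;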
&lt;h3 class="heading" id="general-coefficients-and-g-closure"&gt;
 General Coefficients and G-Closure&lt;span class="heading__anchor"&gt; &lt;a href="#general-coefficients-and-g-closure"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;For a &lt;strong&gt;general cost&lt;/strong&gt;:
$$\min_{a\ge 0}\min_{u} \int_\Omega (j(x,u)+\psi(a))\,dx \quad \text{s.t. } u \text{ solves } -\mathrm{div}(a\nabla u)=f,$$
existence of an optimal $a$ may fail.&lt;/p&gt;
&lt;p&gt;The relaxed problem is naturally expressed via &lt;strong&gt;G-convergence&lt;/strong&gt;: sequences of scalar coefficients $a_n\in[\alpha,\beta]$ can generate limit operators with &lt;strong&gt;matrix-valued coefficients&lt;/strong&gt; $A(x)$, described by the celebrated Murat–Tartar &lt;strong&gt;G-closure&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The G-closure set $\mathcal{A}$ consists of symmetric matrices $A(x)$ whose eigenvalues $\lambda_1\le\cdots\le\lambda_d$ lie in $[\alpha,\beta]$ and satisfy a family of inequalities depending on a mixing parameter $t\in[0,1]$, involving the arithmetic and harmonic means $\mu_t, \nu_t$ of $\alpha,\beta$. For $d=2$, this gives an explicit admissible region in the $(\lambda_1,\lambda_2)$-plane.&lt;/p&gt;
&lt;p&gt;Relaxed functionals of the form $\int \psi(x,a)\,dx$ over G-limits have been studied in special cases, e.g. $\psi(x,a)=g(x)a$, where one can express the relaxation in terms of the largest eigenvalue $\lambda_{\max}(A(x))$. The authors show a numerical example where the relaxed optimal matrix $A_{\mathrm{opt}}$ has eigenvalues $\lambda_1\neq \lambda_2$ on a set of positive measure, revealing genuine microstructure.&lt;/p&gt;
&lt;h2 class="heading" id="optimal-potentials-shaping-the-landscape-vx"&gt;
 Optimal Potentials: Shaping the &amp;ldquo;Landscape&amp;rdquo; $V(x)$&lt;span class="heading__anchor"&gt; &lt;a href="#optimal-potentials-shaping-the-landscape-vx"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Here the control is a &lt;strong&gt;nonnegative potential&lt;/strong&gt; $V$ in
$$-\Delta u + V u = f, \quad u\in H_0^1(\Omega).$$
The cost is:
$$\min \int_\Omega (j(x,u) + \psi(V))\,dx,$$
with $V\ge 0$ and $\psi$ convex, l.s.c., super-linear (so any finite-cost $V$ lies in $L^1(\Omega)$).&lt;/p&gt;
&lt;h3 class="heading" id="compliance-case-eliminating-the-control"&gt;
 Compliance Case: Eliminating the Control&lt;span class="heading__anchor"&gt; &lt;a href="#compliance-case-eliminating-the-control"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;For the compliance choice $j(x,u) = f(x)u$, the problem can again be reduced to a variational problem in $u$ only.&lt;/p&gt;
&lt;p&gt;Define:
$$
E(V) = \min_{u\in H_0^1(\Omega)} \int_\Omega \left(\tfrac{1}{2} |\nabla u|^2 + \tfrac{1}{2} V u^2 - f u\right)dx, \quad \Psi(V)=\int_\Omega \psi(V)\,dx.
$$&lt;/p&gt;
&lt;p&gt;Minimizing $-2E(V)+\Psi(V)$ over $V\ge 0$ is equivalent to:
$$
\min_{u\in H_0^1(\Omega)} \int_\Omega \left(|\nabla u|^2 + \psi^*(u^2) - 2 f u\right)dx,
$$
a semi-linear elliptic problem in $u$ with nonlinearity $g(s)=s(\psi^*)'(s^2)$. The optimal state $\bar u$ solves:
$$
-\Delta u + g(u) = f, \quad u\in H_0^1(\Omega),
$$
and the optimal potential is:
$$
V_{\mathrm{opt}} = (\psi^*)'(\bar u^2).
$$
So in this special case the control can be &lt;strong&gt;explicitly reconstructed&lt;/strong&gt; from the state.&lt;/p&gt;
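&lt;p&gt;The reduction can be sketched formally as follows, assuming the exchange of $\min$ and $\max$ is justified (the functional is convex in $V$ and concave in $u$). Since $-2E(V) = \max_u \int_\Omega (2fu - |\nabla u|^2 - Vu^2)\,dx$,
$$
\min_{V\ge 0}\big\{-2E(V) + \Psi(V)\big\} = \max_{u}\left\{\int_\Omega (2fu - |\nabla u|^2)\,dx - \sup_{V\ge 0}\int_\Omega (Vu^2 - \psi(V))\,dx\right\} = -\min_{u}\int_\Omega \left(|\nabla u|^2 + \psi^*(u^2) - 2fu\right)dx.
$$
As an illustrative choice (not taken from the paper), let $\psi(s) = s^2/2$ for $s\ge 0$. Then $\psi^*(t) = t^2/2$ for $t \ge 0$, so $g(s) = s^3$, the optimal state solves $-\Delta u + u^3 = f$, and $V_{\mathrm{opt}} = \bar u^2$.&lt;/p&gt;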
&lt;h3 class="heading" id="general-costs-adjoint-equation-and-regularity"&gt;
 General Costs, Adjoint Equation, and Regularity&lt;span class="heading__anchor"&gt; &lt;a href="#general-costs-adjoint-equation-and-regularity"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;For a general $j(x,u)$, the authors prove an &lt;strong&gt;existence theorem&lt;/strong&gt; of an optimal $V_{\mathrm{opt}}\in L^1(\Omega)$ under natural growth and coercivity assumptions on $j$ and super-linearity of $\psi$.&lt;/p&gt;
&lt;p&gt;Optimality conditions involve:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The state $\bar u$ solving $-\Delta u + V_{\mathrm{opt}}u = f$.&lt;/li&gt;
&lt;li&gt;An adjoint state $v$ solving $-\Delta v + V_{\mathrm{opt}} v = \partial_s j(x,\bar u)$.&lt;/li&gt;
&lt;li&gt;A sub-differential relation $\bar u v \in \partial\psi(V_{\mathrm{opt}})$, rewritten as a point-wise inequality $h^{-}(\bar u v) \le V_{\mathrm{opt}} \le h(\bar u v)$, where $h$ is built from the sub-differential of $\psi$.&lt;/li&gt;
&lt;/ul&gt;
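&lt;p&gt;As a consistency check (a sketch under the conditions exactly as stated above), take the compliance cost $j(x,u) = f(x)u$. Then $\partial_s j = f$, so the adjoint equation coincides with the state equation and $v = \bar u$. The sub-differential relation becomes
$$
\bar u^2 \in \partial\psi(V_{\mathrm{opt}}) \iff V_{\mathrm{opt}} \in \partial\psi^*(\bar u^2),
$$
which, when $\psi^*$ is differentiable, is the formula $V_{\mathrm{opt}} = (\psi^*)'(\bar u^2)$ of the compliance case.&lt;/p&gt;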
&lt;p&gt;From here, regularity of $V_{\mathrm{opt}}$ is linked to properties of $h$ and to elliptic regularity for $\bar u$ and $v$. Under strengthened assumptions on $j$, $f$, and $\Omega$, the authors show that $\bar u, v \in W^{2,q}(\Omega)$ for some $q&amp;gt;d/2$ (hence continuous), and the product $\bar u v V_{\mathrm{opt}}$ is in $BV(\Omega)$, so $V_{\mathrm{opt}}\in BV_{\mathrm{loc}}(\Omega\setminus K)$ where $K = \{\bar u v =0\}$. This identifies the &amp;ldquo;degeneracy set&amp;rdquo; $K$ as the core where singularities of the optimal potential may concentrate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bang–Bang Potentials&lt;/strong&gt;: If $\psi$ is flat on an interval $[\alpha,\beta]$ (e.g. $\psi(s) = s$ on $[\alpha,\beta]$, $+\infty$ otherwise), the function $h$ becomes multi-valued and the optimal potential is &lt;strong&gt;bang–bang&lt;/strong&gt;:
$$
V_{\mathrm{opt}} = \alpha + (\beta-\alpha)\mathbf{1}_E
$$
for some set $E$ of finite perimeter. The paper includes numerical simulations showing the geometry of such sets for specific loads $f$.&lt;/p&gt;
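&lt;p&gt;To see where the bang–bang structure comes from, one can compute the sub-differential of the box-constrained example directly. The following is a sketch that takes the relation $\bar u v \in \partial\psi(V_{\mathrm{opt}})$ at face value. For $\psi(s) = s$ on $[\alpha,\beta]$ and $+\infty$ otherwise,
$$
\partial\psi(\alpha) = (-\infty, 1], \qquad \partial\psi(s) = \{1\} \ \text{ for } \alpha &amp;lt; s &amp;lt; \beta, \qquad \partial\psi(\beta) = [1,+\infty),
$$
so $V_{\mathrm{opt}} = \alpha$ wherever $\bar u v &amp;lt; 1$ and $V_{\mathrm{opt}} = \beta$ wherever $\bar u v &amp;gt; 1$. Intermediate values are possible only on the level set $\{\bar u v = 1\}$, and up to that level set one may take $E = \{\bar u v &amp;gt; 1\}$.&lt;/p&gt;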
&lt;h2 class="heading" id="optimal-sources-choosing-the-right-hand-side"&gt;
 Optimal Sources: Choosing the Right-Hand Side&lt;span class="heading__anchor"&gt; &lt;a href="#optimal-sources-choosing-the-right-hand-side"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Finally, the control is the source $f$ in $-\Delta u = f$, $u\in H_0^1(\Omega)$, with cost $J(f) = \int_\Omega j(x,u_f,f)\,dx$ and constraint $\int_\Omega \psi(f)\,dx\le m$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Existence with Superlinear and Linear $\psi$&lt;/strong&gt;: If $\psi$ is &lt;strong&gt;super-linear&lt;/strong&gt; and $j$ satisfies suitable lower bounds and convexity in $f$, then an optimal $f_{\mathrm{opt}}\in L^1(\Omega)$ exists.&lt;/p&gt;
&lt;p&gt;If $\psi$ has &lt;strong&gt;linear growth&lt;/strong&gt;, the natural admissible class is signed measures $f$ with finite total variation, and $\int \psi(f)$ is defined via the Lebesgue–singular decomposition and recession coefficients $c_-(\psi), c_+(\psi)$. Under a decomposition $j(x,s,z)=A(x,s)+B(x,z)$ with specific structure and lower bounds, the functional is lower semi-continuous under weak-* convergence of measures, and there exists an optimal measure-valued source $f_{\mathrm{opt}}$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Optimality Conditions and Bang–Bang Description&lt;/strong&gt;: Introduce the self-adjoint &lt;strong&gt;resolvent&lt;/strong&gt; operator $R$ mapping a source $f$ to the solution $u_f$. Under differentiability and growth conditions on $j$, the authors derive necessary (and, under convexity, sufficient) conditions for optimality. For super-linear $\psi$, define:
$$
w := R\big(\partial_s j(x, R(f_{\mathrm{opt}}), f_{\mathrm{opt}})\big) + \partial_z j(x, R(f_{\mathrm{opt}}), f_{\mathrm{opt}}).
$$
Then there is $\lambda \ge 0$ such that either:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;$\lambda=0$&lt;/strong&gt;: $w$ has a fixed sign and $f_{\mathrm{opt}}$ saturates the endpoints of $\mathrm{dom}(\psi)$ on the regions where $w$ is strictly positive/negative — a &lt;strong&gt;pure bang–bang&lt;/strong&gt; behavior.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;$\lambda&amp;gt;0$&lt;/strong&gt;: the constraint is saturated, $\int \psi(f_{\mathrm{opt}})=m$, and $f_{\mathrm{opt}}$ satisfies a point-wise equality involving $\psi$, its conjugate $\psi^*$, and $w$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For linear-growth $\psi$, a similar structure holds, but the singular part of $f_{\mathrm{opt}}$ is supported on level sets where $w$ hits thresholds determined by the slopes $c_-(\psi), c_+(\psi)$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Spectral Example: Maximizing Energy Under an $L^2$ Constraint&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For:
$$
j(u) = -\tfrac{1}{2} u^2, \quad \psi(s)=\tfrac{1}{2} s^2,
$$
the problem becomes:
$$
\max \left\{\frac{1}{2}\int_\Omega u_f^2\,dx : \int_\Omega f^2\,dx \le 2m \right\}.
$$&lt;/p&gt;
&lt;p&gt;The optimality system shows that the optimal source $f$ satisfies a &lt;strong&gt;fourth-order eigenvalue problem&lt;/strong&gt; $\Delta^2 f = f/\lambda$, equivalent to an eigenvalue problem for the Laplacian. The maximizer is a multiple of the &lt;strong&gt;first Dirichlet eigenfunction&lt;/strong&gt; $\varphi$ of $-\Delta$:
$$
f = \pm \sqrt{2m}\,\varphi, \quad \lambda = 1/\mu_1^2,
$$
where $\mu_1$ is the first eigenvalue and $\varphi$ is normalized in $L^2(\Omega)$. The paper includes a numerical plot for such an optimal source in an ellipse.&lt;/p&gt;
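&lt;p&gt;The role of the first eigenfunction can also be seen from an eigenfunction expansion. The sketch below uses notation assumed here: $(\mu_k, \varphi_k)$ are the Dirichlet eigenpairs of $-\Delta$, with $0 &amp;lt; \mu_1 &amp;lt; \mu_2 \le \cdots$ (the first eigenvalue is simple when $\Omega$ is connected) and $\varphi_k$ orthonormal in $L^2(\Omega)$. Writing $f = \sum_k c_k\varphi_k$ gives $u_f = \sum_k (c_k/\mu_k)\varphi_k$, hence
$$
\int_\Omega u_f^2\,dx = \sum_k \frac{c_k^2}{\mu_k^2} \le \frac{1}{\mu_1^2}\sum_k c_k^2 = \frac{1}{\mu_1^2}\int_\Omega f^2\,dx,
$$
with equality only when $f$ is a multiple of $\varphi_1$. Under the constraint $\int_\Omega \psi(f)\,dx \le m$ with $\psi(s) = \tfrac{1}{2}s^2$, that is $\int_\Omega f^2\,dx \le 2m$, the maximal value of $\tfrac{1}{2}\int_\Omega u_f^2\,dx$ is therefore $m/\mu_1^2$, attained at $f = \pm\sqrt{2m}\,\varphi_1$.&lt;/p&gt;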
&lt;p&gt;&lt;strong&gt;Compliance with Box Constraints on the Source&lt;/strong&gt;: For compliance with box constraints:
$$
\min \left\{\int_\Omega f\,R(f)\,dx : \int_\Omega f\,dx \ge m,\ f\in[\alpha,\beta]\right\}, \quad 0\le \alpha&amp;lt;\beta,
$$
the optimal source is bang–bang:
$$
f_{\mathrm{opt}} = \beta\,\mathbf{1}_E + \alpha\,\mathbf{1}_{\Omega\setminus E},
$$
with $E = \{R(f_{\mathrm{opt}}) &amp;lt; s\}$ and $s$ chosen to fit the mass constraint. The corresponding state solves:
$$
-\Delta u = \beta\,\mathbf{1}_{\{u&amp;lt;s\}} + \alpha\,\mathbf{1}_{\{u&amp;gt;s\}}.
$$&lt;/p&gt;
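&lt;p&gt;The threshold structure can be read off from the first variation. As a sketch (using only that $R$ is linear and self-adjoint), for a perturbation $g$,
$$
\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\int_\Omega (f+\varepsilon g)\,R(f+\varepsilon g)\,dx = 2\int_\Omega R(f)\,g\,dx,
$$
so adding source where $u = R(f)$ is small increases the compliance the least. With a multiplier $s$ for the mass constraint, the pointwise condition is $(u - s)(g - f_{\mathrm{opt}}) \ge 0$ for every value $g\in[\alpha,\beta]$, which forces $f_{\mathrm{opt}} = \beta$ on $\{u &amp;lt; s\}$ and $f_{\mathrm{opt}} = \alpha$ on $\{u &amp;gt; s\}$, in agreement with the state equation above.&lt;/p&gt;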
&lt;p&gt;Using results from their previous work on optimal potentials, the authors prove that $f _{\mathrm{opt}} \in BV(\Omega)$: the interface between the regions where $f=\alpha$ and $f=\beta$ has finite perimeter.&lt;/p&gt;
&lt;p&gt;If $\Omega$ is &lt;strong&gt;convex&lt;/strong&gt;, they go further: in the special case $\alpha = 0$, $\beta = 1$, one has $f_{\mathrm{opt}} = \mathbf{1}_E$ with $E = \{w &amp;lt; s\}$, where $w$ solves $-\Delta w = \mathbf{1}_{\{w&amp;lt;s\}}$. They show that the optimal set $E$ is &lt;strong&gt;convex&lt;/strong&gt; and its boundary is of class $C^1$. So in convex domains, the region where you &amp;ldquo;turn on&amp;rdquo; the source to maximize stiffness is itself a smooth convex set.&lt;/p&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal sources for elliptic PDEs. arXiv preprint arXiv:2509.01521.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025optimal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal sources for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2509.01521}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;[2] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal coefficients for elliptic PDEs. arXiv preprint arXiv:2512.08431.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025coefficients&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal coefficients for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2512.08431}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;[3] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2026). Optimization problems for elliptic PDEs. arXiv preprint arXiv:2601.01591.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2026optimization&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimization problems for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2601.01591}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2026}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;</description></item><item><title>Paper Reading - Optimal coefficients for elliptic PDEs (2512.08431)</title><link>https://blog.namln.org/en/posts/pr-2512.08431/</link><pubDate>Thu, 19 Feb 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/pr-2512.08431/</guid><description>&lt;p&gt;This paper gives a clear, fairly complete picture of how to optimally choose the &lt;strong&gt;coefficient&lt;/strong&gt; $a(x)$ (think &amp;ldquo;material quality&amp;rdquo;) in an elliptic PDE, with compliance as the main model and then a general optimal control formulation.&lt;/p&gt;
&lt;h2 class="heading" id="problem-setup"&gt;
 Problem Setup&lt;span class="heading__anchor"&gt; &lt;a href="#problem-setup"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Consider the boundary value problem:
$$
-{\rm div}(a(x)\nabla u) = f \quad\text{in } \Omega,\qquad u=0 \text{ on } \partial\Omega,
$$
where $\Omega$ is a bounded domain, $f$ is a given load, and $a(x)$ is the design variable.&lt;/p&gt;
&lt;p&gt;Typical assumptions on $a(x)$:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Point-wise bounds $\alpha \le a(x) \le \beta$ (two material qualities, e.g., “soft” vs “stiff”).&lt;/li&gt;
&lt;li&gt;Possibly a budget constraint (e.g., only a fixed fraction of the domain can use the best material $\beta$).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The map $a \mapsto u_a$ is well-defined by elliptic theory: for each admissible $a$, the PDE has a unique weak solution in $H_0^1(\Omega)$.&lt;/p&gt;
&lt;h2 class="heading" id="example"&gt;
 Example&lt;span class="heading__anchor"&gt; &lt;a href="#example"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The &lt;strong&gt;elastic compliance&lt;/strong&gt; is a classical cost in mechanics: it measures how much the structure deforms under the load $f$. In this setting, a standard functional is&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;either $C(a) = \int_\Omega f\,u_a\,dx$ (work of the load),&lt;/li&gt;
&lt;li&gt;or equivalently the elastic energy $\int_\Omega a(x)\,|\nabla u_a|^2\,dx$ up to constants.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Minimizing the compliance means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Given a fixed load and a given volume of good material, distribute $a(x)$ in $\Omega$ so that the resulting displacement $u_a$ is as small as possible in the energy sense.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Key qualitative facts the paper emphasizes in this compliance setting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Existence&lt;/strong&gt;: under standard bounds $\alpha \le a \le \beta$ and a convex constraint (like a fixed integral of $a$), there exists at least one optimal coefficient $a_{\text{opt}}$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Extremal behavior&lt;/strong&gt;: because the compliance functional is convex in $u$ but often leads to a concave dependence on $a$ under constraints, optimal $a_{\text{opt}}$ tend to take values only at the extremes $\alpha$ or $\beta$ almost everywhere, a typical “black-and-white” design phenomenon known in topology optimization.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Intuitively, if we can choose between “bad” and “good” material at each point but only have a limited budget of good material, it is never optimal to mix them continuously; we either go full good or full bad locally and let the PDE determine where gradients are large so good material is most effective.&lt;/p&gt;
&lt;h2 class="heading" id="from-two-phase-design-to-optimal-control"&gt;
 From two-phase design to optimal control&lt;span class="heading__anchor"&gt; &lt;a href="#from-two-phase-design-to-optimal-control"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The authors then move to a more general &lt;strong&gt;PDE-constrained optimal control&lt;/strong&gt; view: $a(x)$ is the control, the PDE is the state equation, and the cost is an abstract functional
$$
J(a) = \int_\Omega j(x, u_a(x), a(x), \nabla u_a(x))\,dx,
$$
possibly plus boundary or integral terms.&lt;/p&gt;
&lt;p&gt;In this general framework:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The admissible set $\mathcal{A}$ of coefficients may encode box constraints, integral constraints, or more refined structure (e.g., multi-phase materials).&lt;/li&gt;
&lt;li&gt;The goal is to minimize $J(a)$ over $\mathcal{A}$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The paper outlines how standard tools of optimal control of PDEs apply:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Adjoint equation&lt;/strong&gt;: one introduces an adjoint state $p$ solving its own elliptic problem linked to derivatives of $j$ with respect to $u$ and $\nabla u$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;First-order optimality&lt;/strong&gt;: optimal coefficients satisfy variational inequalities or pointwise optimality conditions involving $a_{\text{opt}}$, $u_{a_{\text{opt}}}$, and $p$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In simple situations, one gets an explicit “gradient” of the cost with respect to the coefficient:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;local changes in $a(x)$ are weighted by expressions involving $\nabla u$ and $\nabla p$;&lt;/li&gt;
&lt;li&gt;this tells us where increasing stiffness (raising $a$) helps most, and where it is wasteful.&lt;/li&gt;
&lt;/ul&gt;
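&lt;p&gt;As a hedged illustration of the two points above, here is a formal sketch under assumptions that this summary does not fix: the state equation is taken to be $-\nabla\cdot(a\nabla u_a) = f$ in $\Omega$ with $u_a = 0$ on $\partial\Omega$, the integrand $j = j(x,u)$ depends on the state only, and $b$ is a hypothetical perturbation direction of the coefficient. With the adjoint state $p$ defined by
$$
-\nabla\cdot(a\nabla p) = \partial_u j(x, u_a) \quad\text{in }\Omega,\qquad p = 0\text{ on }\partial\Omega,
$$
a formal computation gives the directional derivative
$$
J'(a)[b] = -\int_\Omega b(x)\,\nabla u_a\cdot\nabla p\,dx.
$$
For the compliance, $j = f u$, so $p = u_a$ and
$$
J'(a)[b] = -\int_\Omega b(x)\,|\nabla u_a|^2\,dx \le 0 \quad\text{whenever } b \ge 0,
$$
which matches the intuition that added stiffness pays off most where $|\nabla u_a|$ is large. The sign convention for $p$ is a choice made here and may differ from the paper's.&lt;/p&gt;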
&lt;p&gt;This general perspective makes clear that compliance minimization is just one concrete instance of a broader family of coefficient optimization problems.&lt;/p&gt;
&lt;h2 class="heading" id="bangbang-and-intermediate-materials"&gt;
 Bang–bang and intermediate materials&lt;span class="heading__anchor"&gt; &lt;a href="#bangbang-and-intermediate-materials"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;A recurring theme, already visible in compliance, is whether optimal coefficients are &lt;strong&gt;bang–bang&lt;/strong&gt; (only $\alpha$ or $\beta$) or can take intermediate values.&lt;/p&gt;
&lt;p&gt;The paper’s message, in line with the authors’ broader work, is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Under &lt;strong&gt;linear&lt;/strong&gt; or suitably convex-structured costs and simple constraints, the optimization problem often favors &lt;strong&gt;extreme coefficients&lt;/strong&gt; because any “grey” intermediate material can be improved by redistributing toward the extremes while keeping constraints satisfied.&lt;/li&gt;
&lt;li&gt;If instead the cost penalizes variations of $a$ (e.g., includes $|\nabla a|$ or a strictly convex cost of $a$), then intermediate values can become optimal and the design becomes smoother.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This has practical consequences:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For pure stiffness or compliance problems, we should expect &amp;ldquo;black-and-white&amp;rdquo; topologies.&lt;/li&gt;
&lt;li&gt;For problems where manufacturing or grading costs matter, optimal designs may be graded rather than sharply two-phase.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="applications"&gt;
 Applications&lt;span class="heading__anchor"&gt; &lt;a href="#applications"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Even though the arXiv abstract is brief, the paper’s role is clear: it systematizes and clarifies the theory of &lt;strong&gt;optimal coefficients for elliptic PDEs&lt;/strong&gt; in two complementary regimes—compliance and more general optimal control.&lt;/p&gt;
&lt;p&gt;For engineers and applied mathematicians, the main takeaways are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We can rigorously frame &amp;ldquo;optimal material distribution&amp;rdquo; as an elliptic PDE with a coefficient control and prove &lt;strong&gt;existence&lt;/strong&gt; of optimal designs under realistic constraints.&lt;/li&gt;
&lt;li&gt;In many practically relevant cases (especially compliance), optimal designs heavily favor &lt;strong&gt;extreme phases&lt;/strong&gt;, justifying the common use of binary material models in topology optimization.&lt;/li&gt;
&lt;li&gt;Adjoint-based optimality conditions give a &lt;strong&gt;computable sensitivity&lt;/strong&gt; of the cost to local changes in $a$, providing the mathematical underpinning for gradient-based optimization algorithms.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If we imagine designing a bridge deck or a heat sink, this theory tells us:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;where to place stiff or conductive material,&lt;/li&gt;
&lt;li&gt;why optimal layouts tend to be sharply separated regions of different material,&lt;/li&gt;
&lt;li&gt;and how to systematically refine the design using PDE solutions and their adjoints.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal sources for elliptic PDEs. arXiv preprint arXiv:2509.01521.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025optimal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal sources for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2509.01521}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;[2] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal coefficients for elliptic PDEs. arXiv preprint arXiv:2512.08431.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025coefficients&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal coefficients for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2512.08431}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;</description></item><item><title>Paper Reading - Optimal sources for elliptic PDEs (2509.01521)</title><link>https://blog.namln.org/en/posts/pr-2509.01521/</link><pubDate>Wed, 18 Feb 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/pr-2509.01521/</guid><description>&lt;h2 class="heading" id="introduction"&gt;
 Introduction&lt;span class="heading__anchor"&gt; &lt;a href="#introduction"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The authors study how to &amp;ldquo;best choose&amp;rdquo; a source term $f$ in a Poisson-type equation
$$
-\Delta u = f \quad\quad\text{in }\Omega,\quad u = 0\text{ on }\partial\Omega,
$$
so that a given performance measure (a cost functional) is optimized. The twist is that the source itself is the control, and it can be subject to various constraints (size, bounds, sign, etc.). This makes the problem sit at the intersection of optimal control, shape optimization, and regularity theory.&lt;/p&gt;
&lt;h2 class="heading" id="the-basic-optimization-setup"&gt;
 The basic optimization setup&lt;span class="heading__anchor"&gt; &lt;a href="#the-basic-optimization-setup"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;First, we fix a bounded domain $\Omega \subset \mathbb{R}^d$ and, for each admissible source $f$, we solve the PDE to get the state $u_f$. Then we evaluate a cost
functional defined as follows:
$$
J(f) = \int_\Omega j(x, u_f(x), f(x))\,dx,
$$
and we want to minimize $J$ over all admissible $f$.&lt;/p&gt;
&lt;p&gt;The admissible class is defined via an integral constraint:
$$
\int_\Omega \psi(f)\,dx \le m,
$$
for some convex function $\psi$. Different choices of $\psi$ encode different types of constraints:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Super-linear $\psi$ (growing faster than $|s|$) keeps $f$ in $L^1$ and “penalizes” large values strongly.&lt;/li&gt;
&lt;li&gt;Linearly growing $\psi$ allows $f$ to be a measure (e.g., sums of Dirac masses), not just a function.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The first main result: under mild assumptions on $j$ and $\psi$, the problem always has at least one optimal source $f_{\text{opt}}$ (either as a function or a finite measure, depending on growth).&lt;/p&gt;
&lt;h2 class="heading" id="when-optimal-sources-are-all-or-nothing-bangbang-phenomenon"&gt;
 When optimal sources are “all or nothing” (bang–bang phenomenon)&lt;span class="heading__anchor"&gt; &lt;a href="#when-optimal-sources-are-all-or-nothing-bangbang-phenomenon"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;A central theme is the &lt;strong&gt;bang–bang phenomenon&lt;/strong&gt;: under many natural constraints, the best source uses only its extreme admissible values, like $f = \alpha$ or $f = \beta$, with no intermediate levels.&lt;/p&gt;
&lt;p&gt;This occurs, for instance, when we impose point-wise bounds:
$$
\alpha \le f \le \beta
$$
and choose a suitable $\psi$ that is affine on $[\alpha,\beta]$. Then the optimal source takes the form:
$$
f _{\text{opt}} = \beta\,\mathbf{1} _E + \alpha\,\mathbf{1} _{\Omega\setminus E}
$$
for some measurable set $E\subset \Omega$. At that point the problem becomes a &lt;strong&gt;shape optimization&lt;/strong&gt; problem in the unknown set $E$.&lt;/p&gt;
&lt;p&gt;The authors derive a precise system of necessary optimality conditions using a Lagrange multiplier $\lambda$ and an adjoint state $w$ (solution of another elliptic problem). Roughly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$w$ is built from derivatives of the integrand $j$ with respect to $u$ and $f$.&lt;/li&gt;
&lt;li&gt;The sign of $w+\lambda$ decides whether $f_{\text{opt}}$ equals $\alpha$ or $\beta$ at each point.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They show when these conditions are also sufficient, so we can fully characterize optimal controls in convex cases.&lt;/p&gt;
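&lt;p&gt;The role of $w$ and $\lambda$ in the conditions above can be sketched in the simplest setting. Assume, for illustration only, that $j = j(x,u)$ does not depend on $f$, that $\psi(s) = s$ on $[\alpha,\beta]$, and that the adjoint state is normalized by $-\Delta w = \partial_u j(x, u_f)$ in $\Omega$ with $w = 0$ on $\partial\Omega$ (the paper's sign and normalization conventions may differ). If $u_g$ denotes the state produced by a perturbation $g$ of the source, two integrations by parts give
$$
J'(f)[g] = \int_\Omega \partial_u j(x, u_f)\,u_g\,dx = \int_\Omega (-\Delta w)\,u_g\,dx = \int_\Omega w\,g\,dx.
$$
Minimizing the linearized Lagrangian $\int_\Omega (w+\lambda)\,f\,dx$ pointwise over $\alpha \le f \le \beta$ then formally yields
$$
f_{\text{opt}}(x) = \alpha \ \text{ if } w(x)+\lambda &amp;gt; 0, \qquad f_{\text{opt}}(x) = \beta \ \text{ if } w(x)+\lambda &amp;lt; 0,
$$
so the switching set is governed by the level set $w = -\lambda$.&lt;/p&gt;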
&lt;p&gt;A key structural insight: bang–bang behavior appears if and only if $\psi$ is &lt;strong&gt;not strictly convex&lt;/strong&gt; on some interval (it is affine on a nontrivial segment). If $\psi$ is strictly convex (e.g., $\psi(s)=s^2$), the optimal source is more regular and not bang–bang.&lt;/p&gt;
&lt;h2 class="heading" id="important-model-examples"&gt;
 Important model examples&lt;span class="heading__anchor"&gt; &lt;a href="#important-model-examples"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The paper discusses several instructive choices of $\psi$ and $j$, each corresponding to a classical PDE optimization problem:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Total variation constraint&lt;/strong&gt;: $\psi(s)=|s|$.
&lt;ul&gt;
&lt;li&gt;The admissible sources are bounded measures with total variation at most $m$.&lt;/li&gt;
&lt;li&gt;Optimality conditions show that $f_{\text{opt}}$ is supported where an adjoint field $w$ saturates a threshold.&lt;/li&gt;
&lt;li&gt;In radially symmetric cases (e.g., $\Omega$ a ball, linear cost), the optimal source is a Dirac delta at the center.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Nonnegative sources with mass constraint&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;$\psi(s)=s$ for $s\ge0$, $\psi(s)=+\infty$ otherwise.&lt;/li&gt;
&lt;li&gt;One finds conditions under which the optimal $f$ is a single Dirac mass carrying all the “budget”.&lt;/li&gt;
&lt;li&gt;For certain power-type functionals $\int |u|^p$, existence and structure of maximizers are detailed.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Box-constrained sources&lt;/strong&gt; $\alpha \le f \le \beta$ with a volume (mass) constraint $\int f \le m$:
&lt;ul&gt;
&lt;li&gt;The authors show precisely when the optimal $f$ is constant (always $\alpha$ or always $\beta$) and when it becomes a genuine bang–bang mixture of both extremes.&lt;/li&gt;
&lt;li&gt;Strict monotonicity of $j$ in $u$ tends to force true &lt;em&gt;bang–bang solutions&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tracking a target state&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Cost $J(f)=\int_\Omega |u_f - u_0|^2 dx$ with $\alpha \le f \le \beta$.&lt;/li&gt;
&lt;li&gt;Under mild assumptions on the target $u_0$, the unique optimal control is bang–bang almost everywhere, again determined by the sign of an adjoint field.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Strictly convex $\psi$&lt;/strong&gt;, like $\psi(s)=s^2$:
&lt;ul&gt;
&lt;li&gt;Then the optimal control is not bang–bang but a continuous function explicitly related to $w$ and the mass constraint.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compliance optimization&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Minimize $\int_\Omega f u_f\,dx$ under $\alpha \le f \le \beta$ and $\int f \ge m$.&lt;/li&gt;
&lt;li&gt;This is equivalent to maximizing the elastic energy of the system with bounded loads.&lt;/li&gt;
&lt;li&gt;For $0\le \alpha &amp;lt; \beta$, the optimal right-hand side is bang–bang; the domain splits into two regions where the load is either $\alpha$ or $\beta$.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
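&lt;p&gt;The strictly convex case $\psi(s)=s^2$ from the list above can be made explicit in a simple special case. The following is an illustration under assumptions chosen here, not a result quoted from the paper: take a linear cost $J(f) = \int_\Omega g\,u_f\,dx$ with a given nonzero $g \in L^2(\Omega)$, no pointwise bounds, and the constraint $\int_\Omega f^2\,dx \le m$. Let $w$ solve $-\Delta w = g$ in $\Omega$ with $w = 0$ on $\partial\Omega$. Integrating by parts twice,
$$
J(f) = \int_\Omega (-\Delta w)\,u_f\,dx = \int_\Omega \nabla w\cdot\nabla u_f\,dx = \int_\Omega w\,f\,dx.
$$
By the Cauchy–Schwarz inequality,
$$
\int_\Omega w\,f\,dx \ge -\sqrt{m}\,\|w\|_{L^2(\Omega)},
$$
with equality exactly for
$$
f_{\text{opt}} = -\sqrt{m}\,\frac{w}{\|w\|_{L^2(\Omega)}}.
$$
The optimal source is therefore a multiple of the adjoint state: it inherits the regularity of $w$ and takes a continuum of values instead of two.&lt;/p&gt;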
&lt;h2 class="heading" id="regularity-of-the-optimal-sets-and-interfaces"&gt;
 Regularity of the optimal sets and interfaces&lt;span class="heading__anchor"&gt; &lt;a href="#regularity-of-the-optimal-sets-and-interfaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Once we know the optimal control is bang–bang, the main qualitative object is the &lt;strong&gt;interface&lt;/strong&gt; between the regions where $f=\alpha$ and $f=\beta$.&lt;/p&gt;
&lt;p&gt;The interface is essentially a level set of an elliptic solution $u$ (or of the adjoint $w$), so understanding its geometry is a regularity problem.&lt;/p&gt;
&lt;h3 class="heading" id="bounded-variation-bv-regularity"&gt;
 Bounded variation (BV) regularity&lt;span class="heading__anchor"&gt; &lt;a href="#bounded-variation-bv-regularity"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;In a first model case (compliance with $0\le \alpha &amp;lt; \beta$), the authors show that the optimal source $f_{\text{opt}}$ belongs to the space $BV(\Omega)$. This means the interface set has &lt;strong&gt;finite perimeter&lt;/strong&gt;: geometrically, the boundary between phases has finite $(d-1)$-dimensional measure.&lt;/p&gt;
&lt;p&gt;More generally, they derive estimates that control the curvature-like quantities of $u$ via the $BV$-norm of $f$.&lt;/p&gt;
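&lt;p&gt;The link between $BV$ and finite perimeter is a one-line computation for a bang–bang source. Writing $\operatorname{Per}(E;\Omega)$ for the perimeter of $E$ in $\Omega$ (notation introduced here for illustration), and using $f = \beta\,\mathbf{1}_E + \alpha\,\mathbf{1}_{\Omega\setminus E} = \alpha + (\beta-\alpha)\,\mathbf{1}_E$, the total variation of $f$ is
$$
|Df|(\Omega) = (\beta-\alpha)\,|D\mathbf{1}_E|(\Omega) = (\beta-\alpha)\,\operatorname{Per}(E;\Omega).
$$
Since $\beta &amp;gt; \alpha$, such an $f$ lies in $BV(\Omega)$ exactly when $E$ has finite perimeter in $\Omega$.&lt;/p&gt;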
&lt;h3 class="heading" id="a-refined-view-near-critical-points"&gt;
 A refined view near critical points&lt;span class="heading__anchor"&gt; &lt;a href="#a-refined-view-near-critical-points"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A tougher issue is what happens on the set where $\nabla u=0$, because level sets can get very wild there. The authors prove:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For data $f \in BV(\Omega)$ satisfying a uniform positivity $f \ge \alpha&amp;gt;0$, certain weighted quantities like&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;$$
\int \frac{1}{|\nabla u|}\,\frac{1}{\log^q(1/|\nabla u|)}\,dx
$$&lt;/p&gt;
&lt;p&gt;stay finite for any $q&amp;gt;1$.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;They then construct weights involving $\log(1/|\nabla u|)$ which &amp;ldquo;switch off&amp;rdquo; exactly where $\nabla u=0$, and show that appropriately weighted indicators of level sets belong to $BV$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In particular, they define a refined Hausdorff-type measure $H_{d-1,q}$ with logarithmic weights and prove that, for sufficiently regular $f$, the set $\{\nabla u=0\}$ has zero $H_{d-1,q}$-measure for all $q&amp;gt;1$. This implies that the critical set has Hausdorff dimension at most $d-1$, with an even stronger “thinness” encoded by the log weights.&lt;/p&gt;
&lt;h3 class="heading" id="convex-domains-convex-and-smooth-optimal-regions"&gt;
 Convex domains: convex and smooth optimal regions&lt;span class="heading__anchor"&gt; &lt;a href="#convex-domains-convex-and-smooth-optimal-regions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;In the compliance case on a &lt;strong&gt;convex&lt;/strong&gt; domain $\Omega$, the structure is even nicer. The optimal set $E=\{x : f_{\text{opt}}(x)=\beta\}$ coincides with a sublevel set of a solution to a semi-linear equation.&lt;/p&gt;
&lt;p&gt;Using a Caffarelli–Spruck-type convexity result for level sets, they show:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$E$ is itself &lt;strong&gt;convex&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;One can rule out “corners”, and deduce that the boundary of $E$ is actually of class $C^1$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So in convex domains, the optimal high-load region is a smooth convex set.&lt;/p&gt;
&lt;h2 class="heading" id="summary"&gt;
 Summary&lt;span class="heading__anchor"&gt; &lt;a href="#summary"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;This work gives a unified and quite complete picture of how optimal sources for elliptic PDEs behave under natural constraints:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It establishes existence of optimal controls for broad classes of convex functionals and constraints.&lt;/li&gt;
&lt;li&gt;It identifies exactly when we get bang–bang sources, turning a PDE control problem into a shape optimization problem.&lt;/li&gt;
&lt;li&gt;It provides sharp optimality conditions through adjoint states and sub-differential characterizations, allowing practical characterization and numerical approximation of optimal controls.&lt;/li&gt;
&lt;li&gt;It develops regularity theory for the resulting optimal sets and interfaces, including BV estimates, structure of level sets, and refined control of critical sets.&lt;/li&gt;
&lt;li&gt;For people working in optimal design, structural mechanics, or inverse problems, the message is: if our cost is convex and our constraint has a &amp;ldquo;flat&amp;rdquo; part (non-strictly convex $\psi$), expect extreme, piecewise-constant sources with reasonably regular interfaces that we can analyze geometrically and approximate numerically.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal sources for elliptic PDEs. arXiv preprint arXiv:2509.01521.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025optimal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal sources for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2509.01521}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;</description></item><item><title>Restriction and extension</title><link>https://blog.namln.org/en/posts/restriction/</link><pubDate>Wed, 29 Oct 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/restriction/</guid><description>&lt;p&gt;Consider a smooth compact hyper-surface $S$ in $\mathbb{R}^d$ with surface measure $d\sigma$. Given $f \in L^1(\mathbb{R}^d)$, its Fourier transform is defined as follows:
$$
\begin{equation}
\hat{f}(\xi) = \int_{\mathbb{R}^d}e^{-2\pi i x \cdot \xi}f(x)\,dx
\end{equation}
$$
which by Riemann-Lebesgue is a bounded, continuous function vanishing at infinity.&lt;/p&gt;
&lt;p&gt;Since $\hat{f}$ is continuous on $\mathbb{R}^d$, by &lt;a href="https://en.wikipedia.org/wiki/Riemann%E2%80%93Lebesgue_lemma"&gt;the Riemann–Lebesgue lemma&lt;/a&gt; its restriction to the compact hyper-surface $S \subset \mathbb{R}^d$ is well-defined pointwise. Specifically, the restriction $\hat{f}\mid_{S}: S \rightarrow \mathbb{C}$ is the continuous function given by
$$
\begin{equation}
\hat{f}\mid_{S}(\sigma) = \hat{f}(\sigma) = \int_{\mathbb{R}^d}e^{-2\pi i x \cdot \sigma}f(x)\,dx
\end{equation}
$$
for each $\sigma \in S$. This is bounded (as $\hat{f}$ is bounded) and can be integrated against the surface measure $d\sigma$ on $S$.&lt;/p&gt;
&lt;p&gt;Thus when we restrict $\hat{f}$ to $S$, we get a meaningful function which has finite $L^q(S, d\sigma)$-norm for every $q$.&lt;/p&gt;
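&lt;p&gt;The finiteness of these norms can be written out explicitly. Since $|\hat{f}(\xi)| \le \int_{\mathbb{R}^d}|f(x)|\,dx = \|f\|_{L^1(\mathbb{R}^d)}$ for every $\xi$, we get, for $1 \le q &amp;lt; \infty$,
$$
\|\hat{f}\mid_{S}\|_{L^q(S, d\sigma)} = \left(\int_S |\hat{f}|^q\,d\sigma\right)^{1/q} \le \sigma(S)^{1/q}\,\|f\|_{L^1(\mathbb{R}^d)},
$$
where $\sigma(S)$, the total surface measure of $S$, is finite because $S$ is compact. For $q = \infty$ the bound is simply $\|f\|_{L^1(\mathbb{R}^d)}$.&lt;/p&gt;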
&lt;p&gt;When starting with $f \in L^2(\mathbb{R}^d)$, &lt;strong&gt;the Fourier transform $\hat{f}$ is not well-defined point-wise in general&lt;/strong&gt;, so there is no
meaningful way to restrict an arbitrary $L^2$ function to a set of measure zero such as the hyper-surface $S$.&lt;/p&gt;
&lt;p&gt;More precisely, for any given $f \in L^2(\mathbb{R}^d)$, the Fourier transform is defined in the $L^2$ sense via the &lt;a href="https://en.wikipedia.org/wiki/Plancherel_theorem"&gt;Plancherel theorem&lt;/a&gt;:
$$
\begin{equation}
\mathcal{F}: L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d), \quad \| \hat{f} \| _{L^2} = \| f \| _{L^2}
\end{equation}
$$
It is an isometry. So:
$$
\begin{equation}
\hat{f} \in L^2(\mathbb{R}^d)
\end{equation}
$$
Since $\hat{f}$ is only an $L^2$ function, it is &lt;strong&gt;not necessarily continuous&lt;/strong&gt;, &lt;strong&gt;not necessarily bounded&lt;/strong&gt;, and &lt;strong&gt;defined only up to sets of measure zero&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So the expression:
$$
\begin{equation}
\hat{f}|_S(\sigma) = \hat{f}(\sigma), \quad \sigma \in S
\end{equation}
$$
does not make sense pointwise for arbitrary $f \in L^2$.&lt;/p&gt;
&lt;p&gt;The question arises: what happens for $1 &amp;lt; p &amp;lt; 2$?&lt;/p&gt;
&lt;div style="padding: 6px; border: dodgerblue 2px solid;"&gt;&lt;span style="color:dodgerblue"&gt;&lt;b&gt; Question 1: &lt;/b&gt;&lt;/span&gt; 
&lt;p&gt;For which $p$ and $q$ do we have:
$$
\begin{equation}
||\hat{f}|| _{L^q(S, d\sigma)} \lesssim ||f|| _{L^p(\mathbb{R}^d)}, \quad \forall f.
\end{equation}
$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;This is the restriction problem for Fourier transforms on hyper-surfaces in harmonic analysis.&lt;/p&gt;</description></item><item><title>Proof of Theorem of solution of wave equation in the case $n = 1$</title><link>https://blog.namln.org/en/posts/solution-wave-equation-n1/</link><pubDate>Thu, 31 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/solution-wave-equation-n1/</guid><description>&lt;embed src= "/files/pde/Solution%20of%20wave%20equation%20n%20=%201.pdf" width= "100%" height= "1000px" type="application/pdf" &gt;</description></item><item><title>Solution of Brezis Problem 8.24 (1) and (2)</title><link>https://blog.namln.org/en/posts/problem-8.24-brezis/</link><pubDate>Thu, 31 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/problem-8.24-brezis/</guid><description>&lt;embed src= "/files/pde/Problem%208.24%20Brezis.pdf" width= "100%" height= "1000px" type="application/pdf" &gt;</description></item><item><title>Solution of Evans PDE Problem 13</title><link>https://blog.namln.org/en/posts/problem-13-evans/</link><pubDate>Thu, 31 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/problem-13-evans/</guid><description>&lt;embed src= "/files/pde/Problem%2013%20Evans.pdf" width= "100%" height= "1000px" type="application/pdf" &gt;</description></item><item><title>A lemma of J. L. Lions</title><link>https://blog.namln.org/en/posts/a-lemma-lions/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/a-lemma-lions/</guid><description>&lt;p&gt;This post explores J. L. Lions&amp;rsquo; lemma about Banach spaces with compact injection, including applications to functional analysis.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lemma statement&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;Let $X$, $Y$, and $Z$ be three Banach spaces with norms $|| \cdot ||_X$, $|| \cdot ||_Y$, and $|| \cdot ||_Z$. Assume that $X \subset Y$ with compact injection and that $Y \subset Z$ with continuous injection. Prove that&lt;/p&gt;
&lt;p&gt;$$
\forall \varepsilon &amp;gt; 0, \exists C_\varepsilon &amp;gt; 0 \text{ satisfying } || u ||_Y \leq \varepsilon || u ||_X + C _{\varepsilon}|| u ||_Z,\quad \forall u \in X
$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Applications&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Prove that for every $\varepsilon &amp;gt; 0$ there exists $C_\varepsilon &amp;gt; 0$ satisfying&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\max_{t \in [0,1]} |u(t)| \leq \varepsilon \max_{t \in [0,1]} |u'(t)| + C_\varepsilon ||u ||_{L^1}, \quad \forall u \in C^1([0,1]).
$$&lt;/p&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Pick $p &amp;gt; 1$. Prove that for every $\varepsilon &amp;gt; 0$ there exists $C = C(\varepsilon, p)$ such that&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
|| u || _{L^\infty(0,1)} \leq \varepsilon || u || _{W^{1,p}(0,1)} + C || u || _{L^1(0,1)}, \quad \forall u \in W^{1,p}(0,1).
$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;For the lemma itself, we argue by contradiction. Suppose the claim fails. Then there exist some $\varepsilon_0 &amp;gt; 0$ and, taking $C = n$ for each $n$, a sequence $(u_n)_{n \in \mathbb{Z}^{+}} \subset X$ such that&lt;/p&gt;
&lt;p&gt;$$
|| u_n ||_Y &amp;gt; \varepsilon_0 || u_n ||_X + n || u_n ||_Z
$$&lt;/p&gt;
&lt;p&gt;Then $u_n \ne 0, \forall n \in \mathbb{Z}^{+}$.&lt;/p&gt;
&lt;p&gt;Let $v_n := \dfrac{u_n}{|| u_n||_X}$&lt;/p&gt;
&lt;p&gt;Then clearly, $||v_n||_X = 1$ and we have&lt;/p&gt;
&lt;p&gt;$$
||v_n|| _Y &amp;gt; \varepsilon_0 + n||v_n||_Z
$$&lt;/p&gt;
&lt;p&gt;Since $X \subset Y$ with compact injection and $(v_n)$ is bounded in $X$, some subsequence of $(v_n)$ converges in $Y$.&lt;/p&gt;
&lt;p&gt;Passing to that subsequence, we may assume without loss of generality that there is $v \in Y$ such that $|| v_n - v|| _Y \rightarrow 0$ as $n \rightarrow \infty$. In particular, $(||v_n|| _Y) _{n \in \mathbb{Z}^{+}}$ is bounded, say by $M$. The inequality above gives $n||v_n||_Z &amp;lt; ||v_n||_Y \leq M$, so $||v_n||_Z \rightarrow 0$ as $n \rightarrow \infty$.&lt;/p&gt;
&lt;p&gt;And because $Y \subset Z$ with continuous injection, we obtain:&lt;/p&gt;
&lt;p&gt;$$
||v_n - v||_Z \rightarrow 0 \quad \text{as} \quad n \rightarrow \infty
$$&lt;/p&gt;
&lt;p&gt;By uniqueness of limits in $Z$, $v = 0$, and hence $||v_n||_Y \rightarrow 0$ as $n \rightarrow \infty$.&lt;/p&gt;
&lt;p&gt;On the other hand, passing to the limit in $||v_n||_Y &amp;gt; \varepsilon_0$, we also have&lt;/p&gt;
&lt;p&gt;$$
\lim_{n \rightarrow \infty} ||v_n||_Y \geq \varepsilon_0
$$&lt;/p&gt;
&lt;p&gt;Consequently,&lt;/p&gt;
&lt;p&gt;$$
0 \geq \varepsilon_0 &amp;gt; 0
$$
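which is a contradiction. The two applications follow more or less immediately from the lemma: take $X = C^1([0,1])$, $Y = C([0,1])$, $Z = L^1(0,1)$ for the first (the injection $X \subset Y$ is compact by the Arzelà–Ascoli theorem, and the term $\varepsilon \max |u|$ coming from the $C^1$ norm is absorbed into the left-hand side), and $X = W^{1,p}(0,1)$, $Y = L^\infty(0,1)$, $Z = L^1(0,1)$ for the second. This completes the proof.&lt;/p&gt;</description></item><item><title>Complex Hahn-Banach Theorem</title><link>https://blog.namln.org/en/posts/complex-hahn-banach-theorem/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/complex-hahn-banach-theorem/</guid><description>&lt;p&gt;Let $X$ be a complex vector space, $X_0$ one of its subspaces, $p: X \to \mathbb{R}_+$ such that&lt;/p&gt;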
&lt;p&gt;$$
p(\lambda x) = |\lambda| p(x), \quad \forall \lambda \in \mathbb{C}, x \in X \text{ and } p(x + y) \leq p(x) + p(y), \quad \forall x, y \in X,
$$&lt;/p&gt;
&lt;p&gt;satisfying $|f(x)| \leq p(x)$, $\forall x \in X_0$, where $f: X_0 \to \mathbb{C}$ is linear.&lt;/p&gt;
&lt;p&gt;Under these conditions, there exists a linear functional $F: X \to \mathbb{C}$ such that $F|_{X_0} = f$ and&lt;/p&gt;
&lt;p&gt;$$
|F(x)| \leq p(x), \quad \forall x \in X.
$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;: Since $f$ is linear, it follows that $\text{Re } f: X_0 \to \mathbb{R}$ is linear and
$$
\text{Re } f(x) \leq |f(x)| \leq p(x), \quad \forall x \in X_0.
$$&lt;/p&gt;
&lt;p&gt;By the Real Hahn-Banach Theorem (regarding $X$ as a real vector space), there exists a real-linear functional $g: X \to \mathbb{R}$ that extends $\text{Re } f$ and satisfies $g(x) \leq p(x)$, $\forall x \in X$. We also have $g(x) = -g(-x) \geq -p(-x) = -p(x)$, so $|g(x)| \leq p(x)$, $\forall x \in X$.&lt;/p&gt;
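&lt;p&gt;Define now $F(x) = g(x) - i g(ix)$, $\forall x \in X$. This is clearly real-linear, and one checks that $F(ix) = iF(x)$, so $F$ is complex-linear. If $x \in X_0$, then $ix \in X_0$ as well and we have
$$
F(x) = g(x) - i g(ix) = \text{Re } f(x) - i \text{Re } f(ix) = \text{Re } f(x) - i \text{Re}(i f(x)) =
\text{Re } f(x) + i \text{Im } f(x) = f(x), \quad \forall x \in X_0.
$$&lt;/p&gt;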
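&lt;p&gt;The identity $F(ix) = iF(x)$, which upgrades the real-linearity of $F$ to complex-linearity, is not spelled out in the argument above. A short sketch, using only the real-linearity of $g$, is:
$$
F(ix) = g(ix) - i g(i \cdot ix) = g(ix) - i g(-x) = g(ix) + i g(x) = i \left( g(x) - i g(ix) \right) = i F(x), \quad \forall x \in X.
$$
Together with real-linearity this gives $F((a + ib)x) = aF(x) + bF(ix) = (a + ib)F(x)$ for all real $a, b$.&lt;/p&gt;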
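&lt;p&gt;For the last part, fix $x \in X$ and choose $\theta \in \mathbb{R}$ such that $|F(x)| = e^{i\theta} F(x)$. Then $|F(x)| = e^{i\theta} F(x) = F(e^{i\theta} x) = g(e^{i\theta} x)$, because $F(e^{i\theta} x)$ is a real number and the real part of $F$ is $g$. Furthermore, we have $g(e^{i\theta} x) \leq p(e^{i\theta} x) = |e^{i\theta}| p(x) = p(x)$. Combining the two, we get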
$$
|F(x)| \leq p(x), \quad \forall x \in X,
$$
which proves the theorem.&lt;/p&gt;</description></item><item><title>Real Hahn-Banach Theorem</title><link>https://blog.namln.org/en/posts/real-hahn-banach-theorem/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/real-hahn-banach-theorem/</guid><description>&lt;p&gt;Suppose $X$ is a vector space over $\mathbb{R}$ and $p: X \to \mathbb{R}$, with the following hypotheses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$p(\lambda x) = \lambda p(x)$, $\forall x \in X$, $\lambda \in \mathbb{R}_+$ and $p(x + y) \leq p(x) + p(y)$, $\forall x, y \in X$.&lt;/li&gt;
&lt;li&gt;Let $X_0$ be a subspace of $X$ and $u: X_0 \to \mathbb{R}$ a linear functional such that $u(x) \leq p(x)$, $\forall x \in X_0$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Then we can find a linear functional $f: X \to \mathbb{R}$ such that $f|_{X_0} = u$ and $f(x) \leq p(x)$, $\forall x \in X$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;: Consider the set $M$ of all pairs $(Y, g)$, where $Y$ is a subspace of $X$ containing $X_0$ and $g: Y \to \mathbb{R}$ is a linear functional which extends $u$ and satisfies $g \leq p$ on $Y$. The set $M$ is non-empty because $(X_0, u) \in M$.&lt;/p&gt;
&lt;p&gt;Define an order relation on $M$ as follows: $(Y_1, g_1) \leq (Y_2, g_2)$ if $Y_1 \subset Y_2$ and $g_2$ is an extension of $g_1$.&lt;/p&gt;
&lt;p&gt;We show that in $M$ every chain has an upper bound. Suppose $M_0$ is a totally ordered subset of $M$. Then define $Y_0 = \bigcup_{(Y,g) \in M_0} Y$ and $g_0: Y_0 \to \mathbb{R}$ by $g_0(y) = g(y)$ if $y \in Y$ and $(Y, g) \in M_0$. Because $M_0$ is totally ordered, this function is well defined and linear, and $Y_0$ is a subspace of $X$.&lt;/p&gt;
&lt;p&gt;Furthermore, from the definition of $g_0$, we have that $g_0$ extends $u$ and $g_0 \leq p$ on $Y_0$. Therefore $(Y_0, g_0) \in M$, and it is an upper bound for $M_0$. By Zorn&amp;rsquo;s Lemma, $M$ has at least one maximal element $(Z, h)$.&lt;/p&gt;
&lt;p&gt;Suppose $X \neq Z$. Then we can find $x_0 \in X \setminus Z$. Define $W = \text{Span}\{Z, x_0\} = \mathbb{R} \cdot x_0 \oplus Z$. Then $W$ is a linear subspace of $X$ that strictly contains $Z$. Let $y, z \in Z$. Then
$$
h(y) + h(z) = h(y + z) \leq p(y + z) = p(y - x_0 + x_0 + z) \leq p(y - x_0) + p(x_0 + z)
$$
Exchanging the roles of $y$ and $z$ and rearranging, we have
$$
h(z) - p(-x_0 + z) \leq - h(y) + p(x_0 + y), \quad\forall y, z \in Z
$$&lt;/p&gt;
&lt;p&gt;Therefore, we can say
$$
a = \sup_{z \in Z} (h(z) - p(-x_0 + z)) \leq \inf_{y \in Z} (-h(y) + p(x_0 + y)) = b
$$
Pick one $c \in [a, b]$ and define $h_1(w) = \lambda c + h(y)$, where $w = \lambda x_0 + y \in W$ with $\lambda \in \mathbb{R}$ and $y \in Z$ (unique representation). Then $h_1$ is linear and extends $h$ to $W$, which means that it extends $u$ on $X_0$.&lt;/p&gt;
&lt;p&gt;We can check that $h_1 \leq p$ on $W$, so $(W, h_1) \in M$. Since $Z \subsetneq W$ and $h_1$ extends $h$, this contradicts the maximality of $(Z, h)$.&lt;/p&gt;
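&lt;p&gt;The verification that the extension stays below $p$ on $W$ is left to the reader above. A possible sketch, writing $a = \sup_{z \in Z} (h(z) - p(z - x_0))$ and $b = \inf_{y \in Z} (-h(y) + p(x_0 + y))$ so that $a \leq c \leq b$, and taking $h_1(\lambda x_0 + y) = \lambda c + h(y)$, relies on the positive homogeneity of $p$. For $\lambda &amp;gt; 0$ and $y \in Z$, the bound $c \leq b \leq -h(y/\lambda) + p(x_0 + y/\lambda)$ gives
$$
h_1(\lambda x_0 + y) = \lambda \left( c + h(y/\lambda) \right) \leq \lambda \, p(x_0 + y/\lambda) = p(\lambda x_0 + y).
$$
For $\lambda &amp;lt; 0$, set $\mu = -\lambda &amp;gt; 0$; the bound $c \geq a \geq h(y/\mu) - p(y/\mu - x_0)$ gives
$$
h_1(\lambda x_0 + y) = \mu \left( -c + h(y/\mu) \right) \leq \mu \, p(y/\mu - x_0) = p(y - \mu x_0) = p(\lambda x_0 + y).
$$
For $\lambda = 0$ we simply have $h_1(y) = h(y) \leq p(y)$.&lt;/p&gt;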
&lt;p&gt;Therefore $Z = X$, and the maximal element $h$ is the requested functional $f$.&lt;/p&gt;</description></item><item><title>Riesz Representation Theorem</title><link>https://blog.namln.org/en/posts/riesz-representation-theorem/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/riesz-representation-theorem/</guid><description>&lt;h2 class="heading" id="1-riesz-representation-theorem"&gt;
 1. Riesz Representation Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#1-riesz-representation-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Let $H$ be a Hilbert space over $\mathbb{R}$ or $\mathbb{C}$, and $T$ be a bounded linear functional on $H$ (a bounded operator from $H$ to the field $\mathbb{R}$ or $\mathbb{C}$, where $H$ is defined over that field). The following is known as the Riesz Representation Theorem:&lt;/p&gt;
&lt;div style="padding: 6px; border: dodgerblue 2px solid;"&gt;&lt;span style="color:dodgerblue"&gt;&lt;b&gt; Theorem 1: &lt;/b&gt;&lt;/span&gt; 
&lt;p&gt;If $T$ is a bounded linear functional on the Hilbert space $H$, then there exists $g \in H$ such that for every $f \in H$, we have:
$$
T(f) = \langle f, g \rangle.
$$&lt;/p&gt;
&lt;p&gt;Moreover, $|T| = |g|$ (here $|T|$ denotes the operator norm of $T$, while $|g|$ is the Hilbert space norm of $g$).&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Now, let’s prove this theorem.&lt;/p&gt;
&lt;div style="padding: 6px; border: green 2px solid;"&gt;&lt;span style="color:green"&gt;&lt;b&gt; Proof: &lt;/b&gt;&lt;/span&gt; 
&lt;p&gt;Assume that $H$ is separable for now. The proof for any Hilbert space is not much more difficult, but the separable case nicely uses ideas we have developed related to Fourier analysis. Additionally, we will work over $\mathbb{R}$.&lt;/p&gt;
&lt;p&gt;Since $H$ is separable, we can choose an orthonormal basis $\phi_j$, $j \geq 1$, for $H$. Let $T$ be a bounded linear functional and set $a_j = T(\phi_j)$. For $f \in H$, set $c_j = \langle f, \phi_j \rangle$, and define
$$
f_n = \sum_{j=1}^{n} c_j \phi_j.
$$&lt;/p&gt;
&lt;p&gt;Since the $\phi_j$ form a basis, we know that $|f - f_n| \to 0$ as $n \to \infty$.&lt;/p&gt;
&lt;p&gt;Since $T$ is linear, we have:
$$
T(f_n) = \sum_{j=1}^{n} a_j c_j. \tag{1}
$$&lt;/p&gt;
&lt;p&gt;Since $T$ is bounded, with norm $|T| &amp;lt; \infty$, we have:
$$
|T(f) - T(f_n)| \leq |T| |f - f_n|. \tag{2}
$$&lt;/p&gt;
&lt;p&gt;Because $|f - f_n| \to 0$ as $n \to \infty$, we conclude from equations (1) and (2) that:
$$
T(f) = \lim_{n\to\infty} T(f_n) = \sum_{j=1}^{\infty} a_j c_j. \tag{3}
$$&lt;/p&gt;
&lt;p&gt;In fact, the sequence $a_j$ must be square-summable. To see this, first note that since $|T(f)| \leq |T| |f|$, we have:
$$
\left|\sum_{j=1}^{\infty} c_j a_j\right| \leq |T| \left(\sum_{j=1}^{\infty} c_j^2\right)^{1/2}. \tag{4}
$$&lt;/p&gt;
&lt;p&gt;Equation (4) must hold for every square-summable sequence $c_j$ (since any such $c_j$ corresponds to some element in $H$). Fix a positive integer $N$ and define the sequence $c_j = a_j$ for $j \leq N$, $c_j = 0$ for $j &amp;gt; N$. Clearly, such a sequence is square-summable, and equation (4) gives us:
$$
\left(\sum_{j=1}^{N} a_j^2\right)^{1/2} \leq |T|. \tag{5}
$$&lt;/p&gt;
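&lt;p&gt;To spell out the step from (4) to (5), here is a short sketch. With $c_j = a_j$ for $j \leq N$ and $c_j = 0$ for $j &amp;gt; N$, both sums in (4) become finite and (4) reads
$$
\sum_{j=1}^{N} a_j^2 \leq |T| \left(\sum_{j=1}^{N} a_j^2\right)^{1/2}.
$$
If $a_j = 0$ for all $j \leq N$, then (5) holds trivially; otherwise, dividing both sides by $\left(\sum_{j=1}^{N} a_j^2\right)^{1/2} &amp;gt; 0$ gives (5).&lt;/p&gt;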
&lt;p&gt;Thus, $a_j$ is square-summable, as the sequence of partial sums is bounded above.&lt;/p&gt;
&lt;p&gt;Since $a_j$ is square-summable, the function $g = \sum_{j} a_j \phi_j$ is well-defined as an element of $H$, and $T(f) = \sum_{j} a_j c_j = \langle f, g \rangle$. Finally, letting $N \to \infty$ in equation (5) shows that $|g| \leq |T|$. But from the Cauchy-Schwarz inequality, we also have $|T(f)| = |\langle f, g \rangle| \leq |f| |g|$, or $\frac{|T(f)|}{|f|} \leq |g|$ for $f \neq 0$, implying $|T| \leq |g|$, hence $|T| = |g|$. The proof is complete.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 class="heading" id="2-application-to-pde"&gt;
 2. Application to PDE&lt;span class="heading__anchor"&gt; &lt;a href="#2-application-to-pde"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;This example illustrates how functional analysis methods are used in PDEs (although the example is for an ODE). Consider the ODE:
$$
-f''(x) + b(x)f(x) = q(x) \tag{6}
$$&lt;/p&gt;
&lt;p&gt;on the interval $0 &amp;lt; x &amp;lt; 1$, with $b(x) \geq \delta &amp;gt; 0$ for some $\delta$; assume the functions $b$ and $q$ are continuous on $[0, 1]$. We want to find a solution to equation (6) with $f'(0) = f'(1) = 0$ (other boundary conditions could also be applied). If we multiply (6) by a $C^1$ function $\phi$, integrate from $x = 0$ to $x = 1$, integrate the first term, $-f''\phi$, by parts, and use the boundary conditions, we obtain:
$$
\int_0^1 (f'(x)\phi'(x) + b(x)f(x)\phi(x))\,dx = \int_0^1 q(x)\phi(x)\,dx. \tag{7}
$$&lt;/p&gt;
&lt;p&gt;Equation (7) must hold for every $\phi \in C^1([0, 1])$ if $f$ is a $C^2$ solution of equation (6) on $[0, 1]$ satisfying the boundary conditions $f'(0) = f'(1) = 0$. Conversely, if for a $C^2$ function $f$ we find that (7) holds for every $\phi$, then $f$ must be a solution of equation (6), because if we &amp;ldquo;undo&amp;rdquo; the integration by parts in (7), we get:
$$
\phi(1)f'(1) - \phi(0)f'(0) + \int_0^1 \phi(x)(-f''(x) + b(x)f(x))\,dx = \int_0^1 \phi(x)q(x)\,dx
$$
for every $\phi$.&lt;/p&gt;
&lt;p&gt;A familiar PDE argument then shows that $f'(0) = f'(1) = 0$ and equation (6) must hold.&lt;/p&gt;
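&lt;p&gt;One way this argument is usually carried out, given here only as a sketch, starts from the identity obtained by undoing the integration by parts, valid for every $\phi \in C^1([0, 1])$:
$$
\phi(1)f'(1) - \phi(0)f'(0) + \int_0^1 \phi(x)\left(-f''(x) + b(x)f(x) - q(x)\right)\,dx = 0.
$$
First restrict to test functions with $\phi(0) = \phi(1) = 0$. The boundary terms vanish, and since $-f'' + bf - q$ is continuous and its integral against every such $\phi$ is zero, it must vanish identically on $(0, 1)$, which is equation (6). The integral term is then zero for every $\phi$, leaving
$$
\phi(1)f'(1) - \phi(0)f'(0) = 0.
$$
Choosing $\phi(x) = x$ gives $f'(1) = 0$, and choosing $\phi(x) = 1 - x$ gives $f'(0) = 0$.&lt;/p&gt;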
&lt;p&gt;We will show that there is a unique solution to equation (7). Such a &amp;ldquo;solution&amp;rdquo; does not necessarily need to be twice differentiable as required by equation (6), but it will satisfy equation (7). Equation (7) is often called the &amp;ldquo;weak&amp;rdquo; form of the problem.&lt;/p&gt;
&lt;p&gt;Define an inner product:
$$
\langle g, h \rangle = \int_0^1 (g'(x)h'(x) + b(x)g(x)h(x))\,dx
$$&lt;/p&gt;
&lt;p&gt;on the space $C^1([0, 1])$, and let $H$ denote the completion of this space. This is essentially the procedure used on the third problem of the first exam; the presence of $b(x)$ makes no difference. (Note that we must use $b \geq \delta &amp;gt; 0$ to ensure that $\langle \cdot, \cdot \rangle$ is indeed an inner product, so that $|g| = \sqrt{\langle g, g \rangle} = 0$ if and only if $g \equiv 0$.) The space $H$ is a Hilbert space and can be understood (if needed) as a subspace of $C([0, 1])$.&lt;/p&gt;
&lt;p&gt;Define a functional $T : H \to \mathbb{R}$ by:
$$
T(\phi) = \int_0^1 q(x)\phi(x)\,dx
$$&lt;/p&gt;
&lt;p&gt;You can easily check that $T$ is bounded on $H$ (using Cauchy-Schwarz). From the Riesz Representation Theorem, it follows that there must exist a function $f \in H$ such that:
$$
T(\phi) = \langle f, \phi \rangle
$$&lt;/p&gt;
&lt;p&gt;for every $\phi \in H$. This is exactly equation (7), the weak form of the ODE!&lt;/p&gt;
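&lt;p&gt;The boundedness of $T$ used above can be sketched as follows, writing $|\phi| = \sqrt{\langle \phi, \phi \rangle}$ for the norm of $H$. For $\phi \in C^1([0, 1])$, the Cauchy-Schwarz inequality in $L^2(0, 1)$ and the bound $b \geq \delta$ give
$$
|T(\phi)| \leq \left(\int_0^1 q(x)^2\,dx\right)^{1/2} \left(\int_0^1 \phi(x)^2\,dx\right)^{1/2} \leq \left(\int_0^1 q(x)^2\,dx\right)^{1/2} \left(\frac{1}{\delta}\int_0^1 b(x)\phi(x)^2\,dx\right)^{1/2} \leq \frac{1}{\sqrt{\delta}} \left(\int_0^1 q(x)^2\,dx\right)^{1/2} |\phi|.
$$
Since $C^1([0, 1])$ is dense in its completion $H$, the same bound extends to all of $H$.&lt;/p&gt;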
&lt;p&gt;The function $f$ satisfying equation (7) lies in $H$. Under the conditions on $b$ (specifically, $b \geq \delta &amp;gt; 0$ and $|b|_\infty &amp;lt; \infty$ since $b \in C([0, 1])$), the function $f$ lies in the same space defined in the third problem of the first exam. Specifically, $f$ is a continuous function. Proving that $f$ is actually twice differentiable requires more work, along with additional assumptions about the function $q$.&lt;/p&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] (Original) &lt;a href="https://math.jhu.edu/~lindblad/632/riesz.pdf"&gt;The Riesz Representation Theorem&lt;/a&gt;, MA 466, Kurt Bryan&lt;/p&gt;</description></item><item><title>The application of Hahn-Banach Theorem 01</title><link>https://blog.namln.org/en/posts/hahn-banach-application-1/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/hahn-banach-application-1/</guid><description>&lt;p&gt;Suppose $X$ is a normed space, $X_0$ is a closed subspace of $X$, and $x_0 \in X \setminus X_0$. Then we can find $f \in X'$ such that $f(x_0) = 1$ and $f(x) = 0$, $\forall x \in X_0$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;: Since $X_0$ is closed and $x_0 \notin X_0$, the distance from $x_0$ to $X_0$ is positive, so we can find $\delta &amp;gt; 0$ such that $|x_0 - x| \geq \delta$, $\forall x \in X_0$, which is equivalent to $1 \leq \dfrac{|x_0 - x|}{\delta}$, $\forall x \in X_0$.&lt;/p&gt;
&lt;p&gt;Define $Y = \text{Span}\{x_0, X_0\} = X_0 \oplus \mathbb{K} \cdot x_0$. Then for each $y \in Y$ we can find a unique $\lambda \in \mathbb{K}$ and a unique $x \in X_0$ such that $y = \lambda x_0 + x$. Define $u: Y \to \mathbb{K}$ by $u(y) = u(\lambda x_0 + x) = \lambda$. It is well defined and linear.&lt;/p&gt;
&lt;p&gt;Furthermore, for $\lambda \neq 0$ we have $-\lambda^{-1} x \in X_0$, so:
$$|u(y)| = |\lambda| \leq |\lambda| \frac{|x_0 + \lambda^{-1} x|}{\delta} = \frac{1}{\delta} |\lambda x_0 + x| = \frac{1}{\delta} |y| \quad \text{for } \lambda \neq 0$$
If $\lambda = 0$, then $y \in X_0$ and $u(y) = 0 \leq \frac{1}{\delta} |y|$.&lt;/p&gt;
&lt;p&gt;Therefore, we obtain&lt;br&gt;
$$
|u(y)| \leq \frac{1}{\delta} |y| \quad\forall y \in Y
$$
By the Hahn-Banach Theorem, we can extend $u$ to a linear functional $f: X \to \mathbb{K}$ such that $f|_Y = u$ and $|f(x)| \leq \dfrac{1}{\delta} |x|$, $\forall x \in X$; in particular $f$ is continuous, so $f \in X'$. Therefore $f(x_0) = u(x_0) = 1$ and $f(x) = u(x) = 0$ for every $x \in X_0$.&lt;/p&gt;</description></item><item><title>The application of Hahn-Banach Theorem 02</title><link>https://blog.namln.org/en/posts/hahn-banach-application-2/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/hahn-banach-application-2/</guid><description>&lt;p&gt;Let $X$ be a Banach space over $\mathbb{K}$ with $X \neq \{0\}$, and let $X' = \{ f: X \to \mathbb{K} \mid f \text{ is linear and continuous} \}$. Prove that $X' \neq \{0\}$; in fact, for every $x \in X$ with $x \neq 0$, we can find $f \in X'$ such that $f(x) = |x|$ and $|f| = 1$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;: Pick $x_0 \in X$ with $x_0 \neq 0$. Define $X_0 = x_0 \cdot \mathbb{K}$, a subspace of $X$, and $g: X_0 \to \mathbb{K}$, $g(\lambda x_0) = \lambda |x_0|$, which is linear and satisfies $|g(x)| = |x|$ on $X_0$. Since $g$ and $|\cdot|$ satisfy the conditions of the Hahn-Banach theorem, we can find $f: X \to \mathbb{K}$ such that $f|_{X_0} = g$, $f$ is linear and $|f(x)| \leq |x|$, $\forall x \in X$. Therefore $f(x_0) = g(x_0) = |x_0|$ and $|f| \leq 1$. The equality $f(x_0) = |x_0|$ with $x_0 \neq 0$ guarantees that $|f| \geq 1$, hence $|f| = 1$.&lt;/p&gt;</description></item></channel></rss>