<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Home on Nam Le</title><link>https://blog.namln.org/en/</link><description>Recent content in Home on Nam Le</description><generator>Hugo</generator><language>en-US</language><lastBuildDate>Fri, 29 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.namln.org/en/index.xml" rel="self" type="application/rss+xml"/><item><title>Navier–Stokes Existence and Smoothness</title><link>https://blog.namln.org/en/posts/navier-stokes-existence-smoothness/</link><pubDate>Fri, 29 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/navier-stokes-existence-smoothness/</guid><description>&lt;p&gt;The motion of a viscous incompressible fluid is described by the Navier–Stokes
equations, first written down by Claude-Louis Navier in 1822 and given their modern
form by George Gabriel Stokes. Whether smooth solutions to these equations can
always be continued for all time (or whether they can spontaneously develop a
singularity at some finite time) is one of the deepest open problems in mathematics,
and one of the seven &lt;a href="https://www.claymath.org/millennium-problems/"&gt;Clay Millennium Prize Problems&lt;/a&gt;,
carrying a one-million-dollar prize for a solution.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Problem (Clay Millennium Prize, Fefferman 2000)&lt;/span&gt;
&lt;p&gt;Let $u_0 : \mathbb{R}^3 \to \mathbb{R}^3$ be a smooth divergence-free vector field.
Does there exist a smooth solution $u(x,t)$, $p(x,t)$ to the 3D incompressible
Navier–Stokes equations
$$\partial_t u + (u \cdot \nabla)u - \nu\Delta u + \nabla p = 0, \qquad \nabla \cdot u = 0,
\qquad u(\cdot,0) = u_0$$
defined for all $t &amp;gt; 0$ and satisfying $\int_{\mathbb{R}^3}|u(x,t)|^2\,dx &amp;lt; C$
for all $t \geq 0$? Either a proof of existence or a counterexample (a smooth $u_0$ for
which no such smooth solution exists) qualifies for the prize.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-equations-and-their-scaling"&gt;
 The Equations and Their Scaling&lt;span class="heading__anchor"&gt; &lt;a href="#the-equations-and-their-scaling"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Compared to the Euler equations (which describe inviscid flow), the Navier–Stokes
equations add the viscous term $\nu\Delta u$, where $\nu &amp;gt; 0$ is the kinematic
viscosity. This term dissipates energy and regularises the flow locally. The central
tension is that the nonlinear term $(u\cdot\nabla)u$ can concentrate energy at
small spatial scales faster than viscosity can diffuse it away.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scaling symmetry.&lt;/strong&gt; The Navier–Stokes equations are invariant under the rescaling
$$u(x,t) \mapsto \lambda u(\lambda x,\, \lambda^2 t), \qquad
p(x,t) \mapsto \lambda^2 p(\lambda x,\, \lambda^2 t).$$
A norm is &lt;em&gt;critical&lt;/em&gt; (or &lt;em&gt;scale-invariant&lt;/em&gt;) if it is preserved by this rescaling.
Among the Lebesgue spaces $L^p(\mathbb{R}^3)$ the critical one is $L^3$, since
$\|\lambda u(\lambda\cdot)\| _{L^3} = \|u\| _{L^3}$.
The energy norm $\|u\| _{L^2}$ is &lt;em&gt;supercritical&lt;/em&gt;: the rescaled field has norm $\lambda^{-1/2}\|u\| _{L^2}$,
which shrinks as $\lambda \to \infty$ (i.e., as one zooms into small
scales). This mismatch is the core of the difficulty: global energy control does
not prevent concentration at arbitrarily small scales.&lt;/p&gt;
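&lt;p&gt;As a quick check of these exponents (a worked computation added for illustration, writing $u_\lambda(x) = \lambda u(\lambda x)$ at a fixed time), the substitution $y = \lambda x$ gives, for any $1 \leq p &amp;lt; \infty$,
$$\|u_\lambda\| _{L^p(\mathbb{R}^3)}^p = \lambda^p \int _{\mathbb{R}^3} |u(\lambda x)|^p\, dx = \lambda^{p-3} \int _{\mathbb{R}^3} |u(y)|^p\, dy,
\qquad\text{so}\qquad \|u_\lambda\| _{L^p} = \lambda^{1-3/p}\|u\| _{L^p}.$$
The exponent $1 - 3/p$ vanishes exactly at $p = 3$ and equals $-1/2$ at $p = 2$.&lt;/p&gt;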
&lt;p&gt;&lt;strong&gt;2D global regularity.&lt;/strong&gt; In two dimensions the scaling is different: the energy
$\|u\| _{L^2}^2$ is itself scale-invariant, and the enstrophy $\|\nabla u\| _{L^2}^2$ is non-increasing
in time because the 2D vorticity equation has no vortex-stretching term. Global
regularity in 2D follows from these estimates, a fact known since the 1960s.
In 3D no analogous critical quantity is controlled globally, and the problem is open.&lt;/p&gt;
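&lt;p&gt;The 2D enstrophy estimate can be sketched formally as follows, assuming a smooth, sufficiently decaying solution and writing $\omega = \partial _1 u _2 - \partial _2 u _1$ for the scalar vorticity (notation introduced here). Taking the curl of the momentum equation in 2D gives
$$\partial_t \omega + (u\cdot\nabla)\omega = \nu\Delta\omega,$$
with no stretching term. Multiplying by $\omega$ and integrating, the transport term vanishes because $\nabla\cdot u = 0$, leaving
$$\frac{d}{dt}\|\omega\| _{L^2}^2 = -2\nu\|\nabla\omega\| _{L^2}^2 \leq 0.$$
Since $\|\omega\| _{L^2} = \|\nabla u\| _{L^2}$ for divergence-free fields on $\mathbb{R}^2$, the enstrophy stays bounded by its initial value.&lt;/p&gt;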
&lt;hr&gt;
&lt;h2 class="heading" id="the-hierarchy-of-known-results"&gt;
 The Hierarchy of Known Results&lt;span class="heading__anchor"&gt; &lt;a href="#the-hierarchy-of-known-results"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="lerayhopf-weak-solutions-1934"&gt;
 Leray–Hopf Weak Solutions (1934)&lt;span class="heading__anchor"&gt; &lt;a href="#lerayhopf-weak-solutions-1934"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Leray 1934, Hopf 1951)&lt;/span&gt;
&lt;p&gt;For any $u_0 \in L^2(\mathbb{R}^3)$ divergence-free, there exists a global
&lt;em&gt;weak solution&lt;/em&gt; $u \in L^\infty(0,\infty;\, L^2) \cap L^2(0,\infty;\, H^1)$
satisfying the energy inequality
$$\|u(t)\| _{L^2}^2 + 2\nu\int _0^t \|\nabla u\| _{L^2}^2\, ds \leq \|u_0\| _{L^2}^2.$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Leray&amp;rsquo;s construction, via a compactness argument on regularised equations, produces
a solution that is globally defined but potentially not smooth. The term &amp;ldquo;weak&amp;rdquo;
refers to the fact that the equations are satisfied only in an integral (distributional)
sense, not pointwise. The energy inequality is the only bound available globally.
Whether Leray–Hopf solutions are unique, and whether they stay smooth for all time
when the initial data is smooth, is unknown.&lt;/p&gt;
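&lt;p&gt;The energy inequality has a short formal origin, sketched here for a smooth, decaying solution (an assumption that weak solutions need not satisfy). Taking the $L^2$ inner product of the momentum equation with $u$ and integrating by parts,
$$\int (u\cdot\nabla)u\cdot u\, dx = \tfrac12\int u\cdot\nabla |u|^2\, dx = -\tfrac12\int (\nabla\cdot u)|u|^2\, dx = 0,
\qquad \int \nabla p\cdot u\, dx = -\int p\,\nabla\cdot u\, dx = 0,$$
so that
$$\frac{d}{dt}\|u(t)\| _{L^2}^2 = -2\nu\|\nabla u(t)\| _{L^2}^2.$$
Integrating in time gives the energy identity; for weak solutions obtained as limits of approximations, lower semicontinuity preserves only the inequality.&lt;/p&gt;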
&lt;h3 class="heading" id="partial-regularity-the-ckn-theorem"&gt;
 Partial Regularity: The CKN Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#partial-regularity-the-ckn-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The best known result limiting the size of potential singularities is the following.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Caffarelli–Kohn–Nirenberg, 1982)&lt;/span&gt;
&lt;p&gt;For any &lt;em&gt;suitable weak solution&lt;/em&gt; to the 3D Navier–Stokes equations, the set of
space-time singular points has &lt;em&gt;parabolic Hausdorff dimension at most 1&lt;/em&gt;. In
particular, at any given time the spatial singular set has Hausdorff dimension
at most $1$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;A &amp;ldquo;suitable weak solution&amp;rdquo; is a weak solution satisfying a local energy inequality.
The CKN theorem proves that singularities, if they exist, cannot fill a curve or
surface: they can occupy at most a set of dimension one in space-time. This is the
most quantitative partial regularity result available and was simplified by Lin
(1998). Scheffer (1977) had earlier shown singular times have Hausdorff dimension
at most $\dfrac{1}{2}$.&lt;/p&gt;
&lt;h3 class="heading" id="conditional-regularity-ladyzhenskayaprodiserrin"&gt;
 Conditional Regularity: Ladyzhenskaya–Prodi–Serrin&lt;span class="heading__anchor"&gt; &lt;a href="#conditional-regularity-ladyzhenskayaprodiserrin"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Ladyzhenskaya 1967, Prodi 1959, Serrin 1962)&lt;/span&gt;
&lt;p&gt;If a weak solution additionally satisfies $u \in L^r(0,T;\, L^s(\mathbb{R}^3))$
with $\dfrac{2}{r} + \dfrac{3}{s} = 1$ and $3 &amp;lt; s \leq \infty$, then $u$ is
smooth on $(0,T]$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The condition $\dfrac{2}{r} + \dfrac{3}{s} = 1$ is precisely the scale-invariant
line in the $(r,s)$ plane: membership in any of these spaces implies regularity.
The family ranges from $(r,s)=(2,\infty)$ (square-integrable $L^\infty$ control in
time) towards the excluded endpoint $(r,s)=(\infty, 3)$ (critical $L^3$ control in
space, uniform in time). These are &lt;em&gt;conditional&lt;/em&gt; results: they do not prove that a weak solution
lies in such a space, only that if it does, it must be smooth.&lt;/p&gt;
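&lt;p&gt;To see why this line is scale-invariant (a worked check, with $u_\lambda(x,t) = \lambda u(\lambda x, \lambda^2 t)$ and, for simplicity, the time interval taken to be $(0,\infty)$): the spatial norm at fixed time scales as $\|u_\lambda(\cdot,t)\| _{L^s} = \lambda^{1-3/s}\|u(\cdot,\lambda^2 t)\| _{L^s}$, and the substitution $\tau = \lambda^2 t$ in the time integral contributes a factor $\lambda^{-2/r}$, so
$$\|u_\lambda\| _{L^r _t L^s _x} = \lambda^{\,1 - \frac{2}{r} - \frac{3}{s}}\,\|u\| _{L^r _t L^s _x}.$$
The exponent vanishes precisely when $\dfrac{2}{r} + \dfrac{3}{s} = 1$.&lt;/p&gt;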
&lt;h3 class="heading" id="the-critical-endpoint-escauriazasereginšverák"&gt;
 The Critical Endpoint: Escauriaza–Seregin–Šverák&lt;span class="heading__anchor"&gt; &lt;a href="#the-critical-endpoint-escauriazaseregin%c5%a1ver%c3%a1k"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Escauriaza–Seregin–Šverák, 2003)&lt;/span&gt;
&lt;p&gt;If $u$ is a Leray–Hopf weak solution with $\sup _{t \in [0,T^*)} \|u(\cdot,t)\| _{L^3(\mathbb{R}^3)} &amp;lt; \infty$,
then $u$ can be extended as a smooth solution past $T^*$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The endpoint case $s=3$ of the LPS family is the critical one: $L^3(\mathbb{R}^3)$
is exactly the scale-invariant Lebesgue norm for Navier–Stokes. The ESS proof is substantially
harder than the non-endpoint cases $s &amp;gt; 3$; it uses a compactness (blow-up) argument to reduce to a
limiting solution that vanishes at the final time and then invokes backward uniqueness
and unique continuation theorems for parabolic equations to rule it out.&lt;/p&gt;
&lt;h3 class="heading" id="taos-quantitative-criterion"&gt;
 Tao&amp;rsquo;s Quantitative Criterion&lt;span class="heading__anchor"&gt; &lt;a href="#taos-quantitative-criterion"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Tao, 2019)&lt;/span&gt;
&lt;p&gt;If a smooth finite-energy solution first becomes singular at time $T^*$, then
$$\limsup_{t \uparrow T^*}
\dfrac{\|u(\cdot,t)\| _{L^3(\mathbb{R}^3)}}{\bigl(\log\log\log\tfrac{1}{T^*-t}\bigr)^c}
= \infty$$
for some absolute constant $c&amp;gt;0$. In particular, the critical $L^3$ norm must blow
up at least as fast as a triple-logarithm in $(T^*-t)^{-1}$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Tao&amp;rsquo;s result is the first &lt;em&gt;quantitative&lt;/em&gt; blowup criterion in the critical $L^3$ norm:
it gives explicit information about the blowup rate that goes (by a triple
logarithm) beyond the qualitative ESS theorem. The proof quantifies the
compactness arguments in the ESS proof, replacing each use of a compactness method
by an explicit Carleman inequality, and propagates lower bounds for the vorticity
across dyadic annuli. The triple-exponential dependence in Tao&amp;rsquo;s bound has since
been localised and sharpened by Barker–Prange (2021) and others.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-supercriticality-problem"&gt;
 The Supercriticality Problem&lt;span class="heading__anchor"&gt; &lt;a href="#the-supercriticality-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The fundamental analytical obstruction is that Navier–Stokes is &lt;em&gt;supercritical&lt;/em&gt;
with respect to the only globally controlled quantity, the energy (the $L^2$ norm).&lt;/p&gt;
&lt;p&gt;Define the &lt;em&gt;critical regularity index&lt;/em&gt; as the Sobolev exponent $s$ such that
$\dot{H}^s(\mathbb{R}^3)$ is scale-invariant. For Navier–Stokes, $s = 1/2$. The
energy controls $\dot{H}^0 = L^2$ (supercritical), and regularity theory requires
control at $\dot{H}^{1/2}$ (critical Sobolev norm) or $L^3$ (critical Lebesgue norm).
There is a &lt;em&gt;regularity gap&lt;/em&gt; between what is globally available ($L^2$) and what
is needed ($L^3$ or $\dot{H}^{1/2}$). Every known approach to closing this gap runs
into the same obstruction: the nonlinearity can create structure at arbitrarily
small scales that the supercritical $L^2$ bound cannot see.&lt;/p&gt;
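&lt;p&gt;The value $s = 1/2$ can be checked directly (a worked computation under the rescaling $u_\lambda(x) = \lambda u(\lambda x)$ at fixed time). Since $\widehat{u_\lambda}(\xi) = \lambda^{-2}\hat{u}(\xi/\lambda)$, the substitution $\eta = \xi/\lambda$ gives
$$\|u_\lambda\| _{\dot{H}^s}^2 = \int _{\mathbb{R}^3} |\xi|^{2s}\,\lambda^{-4}\,|\hat{u}(\xi/\lambda)|^2\, d\xi = \lambda^{2s-1}\|u\| _{\dot{H}^s}^2,
\qquad\text{so}\qquad \|u_\lambda\| _{\dot{H}^s} = \lambda^{s-1/2}\|u\| _{\dot{H}^s}.$$
The norm is invariant exactly at $s = 1/2$, while at $s = 0$ it decays like $\lambda^{-1/2}$ as $\lambda \to \infty$.&lt;/p&gt;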
&lt;p&gt;Tao (2016) made this gap precise by constructing an &lt;em&gt;averaged&lt;/em&gt; Navier–Stokes system, where the bilinear nonlinearity $(u\cdot\nabla)u$ is replaced by a carefully designed convex average of related nonlinearities, for which finite-time blowup
can be rigorously proved. This construction does not produce a counterexample to the true Navier–Stokes equations, but it demonstrates that the specific algebraic structure of the nonlinearity is load-bearing: any proof of global regularity must use something specific about $(u\cdot\nabla)u$ that is not shared by its averages.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-improving-the-quantitative-blowup-rate"&gt;
 1. Improving the Quantitative Blowup Rate&lt;span class="heading__anchor"&gt; &lt;a href="#1-improving-the-quantitative-blowup-rate"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Tao&amp;rsquo;s triple-logarithmic rate is the sharpest known lower bound on blowup of the
critical $L^3$ norm. Scaling considerations suggest that the true rate, if blowup
occurs, should be much faster; conjecturally $\|u\| _{L^3} \sim (T^*-t)^{-\delta}$
for some $\delta &amp;gt; 0$, analogous to Type I blowup in nonlinear heat equations. The
gap between the triple-logarithmic lower bound and the conjectured power-law rate
represents the frontier of quantitative regularity theory. Closing even part of this
gap, for instance establishing a single-logarithmic or power-of-log lower bound,
would require new ideas beyond Carleman estimates.&lt;/p&gt;
&lt;h3 class="heading" id="2-type-i-vs-type-ii-blowup"&gt;
 2. Type I vs. Type II Blowup&lt;span class="heading__anchor"&gt; &lt;a href="#2-type-i-vs-type-ii-blowup"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A blowup is called &lt;em&gt;Type I&lt;/em&gt; if $\|u(\cdot,t)\| _{L^\infty}$
grows no faster than $O((T^*-t)^{-1/2})$ near $T^*$, the rate dictated by scaling. It is &lt;em&gt;Type II&lt;/em&gt; otherwise.
For the Navier–Stokes equations, ruling out Type I blowup would be a significant
advance: all self-similar singularities (where $u(x,t) = (T^*-t)^{-1/2}U(x/(T^*-t)^{1/2})$)
are of Type I, and several results (including work of Růžička and Seregin) already
rule them out under mild additional assumptions. Whether all Type I blowup can be
excluded, leaving only the less structured Type II, is open.&lt;/p&gt;
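&lt;p&gt;For the self-similar ansatz the relevant rates can be read off directly (a worked computation, assuming the profile $U$ lies in the spaces involved). With $y = x/(T^*-t)^{1/2}$,
$$\|u(\cdot,t)\| _{L^\infty} = (T^*-t)^{-1/2}\|U\| _{L^\infty},
\qquad
\|u(\cdot,t)\| _{L^p} = (T^*-t)^{-\frac12 + \frac{3}{2p}}\|U\| _{L^p}.$$
The sup norm grows at exactly the rate $(T^*-t)^{-1/2}$, while the critical $L^3$ norm stays constant in time.&lt;/p&gt;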
&lt;h3 class="heading" id="3-uniqueness-of-weak-solutions"&gt;
 3. Uniqueness of Weak Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#3-uniqueness-of-weak-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Leray–Hopf weak solutions exist globally, but they may not be unique. This is a
separate, equally deep question: even if all smooth solutions extend globally, one
must also ask whether weak solutions coincide with smooth ones when started from
smooth data. Recent work of Buckmaster and Vicol (2019) showed that weak solutions
below the Ladyzhenskaya–Prodi–Serrin threshold are indeed non-unique, using
convex integration techniques developed for the Euler equations (De Lellis–Székelyhidi).
Whether Leray–Hopf solutions with the energy inequality are unique is still open
and is perhaps the central problem in the weak solution theory.&lt;/p&gt;
&lt;h3 class="heading" id="4-self-similar-and-discretely-self-similar-solutions"&gt;
 4. Self-Similar and Discretely Self-Similar Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#4-self-similar-and-discretely-self-similar-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Self-similar solutions of the form $u(x,t) = (T^*-t)^{-1/2} U(x/(T^*-t)^{1/2})$
satisfy a nonlinear elliptic system for the profile $U$. Several non-existence
theorems show that backward self-similar solutions with certain integrability must
be trivial (Nečas–Růžička–Šverák, 1996). The case of &lt;em&gt;discretely&lt;/em&gt; self-similar
solutions, where $u(x,t) = \lambda u(\lambda x, \lambda^2 t)$ for a fixed
$\lambda \neq 1$, is less understood and was recently revisited. Whether the
set of self-similar profiles that could appear as blowup limits is empty is not known.&lt;/p&gt;
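&lt;p&gt;The profile system can be written out explicitly (a derivation sketch; the pressure ansatz $p(x,t) = (T^*-t)^{-1}P(y)$ with $y = x/(T^*-t)^{1/2}$ is an assumption consistent with the scaling symmetry). Each term of the momentum equation carries the common factor $(T^*-t)^{-3/2}$:
$$\partial_t u = (T^*-t)^{-3/2}\Bigl(\tfrac12 U + \tfrac12 (y\cdot\nabla)U\Bigr),
\qquad (u\cdot\nabla)u = (T^*-t)^{-3/2}(U\cdot\nabla)U,
\qquad \Delta u = (T^*-t)^{-3/2}\Delta U.$$
Dividing out this factor leaves Leray&amp;rsquo;s equations for the profile,
$$-\nu\Delta U + \tfrac12 U + \tfrac12 (y\cdot\nabla)U + (U\cdot\nabla)U + \nabla P = 0, \qquad \nabla\cdot U = 0.$$&lt;/p&gt;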
&lt;h3 class="heading" id="5-computer-assisted-proofs-via-rigorous-numerics"&gt;
 5. Computer-Assisted Proofs via Rigorous Numerics&lt;span class="heading__anchor"&gt; &lt;a href="#5-computer-assisted-proofs-via-rigorous-numerics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Chen–Hou approach to Euler singularities (2025) used a computer-assisted proof
framework: construct a numerical approximate profile, then verify its stability
rigorously using interval arithmetic. For Navier–Stokes the presence of viscosity
complicates such an approach (the profile is dissipated rather than transported),
but the same framework (dynamical rescaling plus nonlinear stability verification) might in principle detect or rule out singularities in specific axisymmetric geometries. Applying and adapting the Hou group&amp;rsquo;s methods to the viscous problem
is an active direction.&lt;/p&gt;
&lt;h3 class="heading" id="6-the-zero-viscosity-limit-and-eulernavierstokes-connection"&gt;
 6. The Zero-Viscosity Limit and Euler–Navier–Stokes Connection&lt;span class="heading__anchor"&gt; &lt;a href="#6-the-zero-viscosity-limit-and-eulernavierstokes-connection"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;As $\nu \to 0$, Navier–Stokes formally converges to Euler. The precise relationship
is subtle: in the presence of boundaries (Prandtl layers) or after a potential Euler
singularity, the zero-viscosity limit can fail to hold in strong norms. If Euler
develops a finite-time singularity at time $T^*_E$ from smooth data (as Chen–Hou
suggest for bounded domains), then for small $\nu$ the Navier–Stokes solution must
either also develop a near-singularity or be regularised by viscosity before $T^*_E$.
Whether viscosity is always sufficient to regularise an Euler singularity, or whether
a Navier–Stokes singularity can arise from a nearby Euler one, is entirely open.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Fefferman, C. L. (2000). Existence and smoothness of the Navier–Stokes equation. Clay Mathematics Institute Millennium Prize Problems. &lt;a href="https://www.claymath.org/wp-content/uploads/2022/06/navierstokes.pdf"&gt;https://www.claymath.org/wp-content/uploads/2022/06/navierstokes.pdf&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Leray, J. (1934). Sur le mouvement d&amp;rsquo;un liquide visqueux emplissant l&amp;rsquo;espace. &lt;em&gt;Acta Mathematica&lt;/em&gt;, &lt;strong&gt;63&lt;/strong&gt;, 193–248.&lt;/li&gt;
&lt;li&gt;Hopf, E. (1951). Über die Anfangswertaufgabe für die hydrodynamischen Grundgleichungen. &lt;em&gt;Mathematische Nachrichten&lt;/em&gt;, &lt;strong&gt;4&lt;/strong&gt;(1–6), 213–231.&lt;/li&gt;
&lt;li&gt;Caffarelli, L., Kohn, R., &amp;amp; Nirenberg, L. (1982). Partial regularity of suitable weak solutions of the Navier–Stokes equations. &lt;em&gt;Communications on Pure and Applied Mathematics&lt;/em&gt;, &lt;strong&gt;35&lt;/strong&gt;(6), 771–831.&lt;/li&gt;
&lt;li&gt;Ladyzhenskaya, O. A. (1967). On uniqueness and smoothness of generalized solutions to the Navier–Stokes equations. &lt;em&gt;Zapiski Nauchnykh Seminarov LOMI&lt;/em&gt;, &lt;strong&gt;5&lt;/strong&gt;, 169–185.&lt;/li&gt;
&lt;li&gt;Escauriaza, L., Seregin, G. A., &amp;amp; Šverák, V. (2003). $L_{3,\infty}$-solutions of the Navier–Stokes equations and backward uniqueness. &lt;em&gt;Russian Mathematical Surveys&lt;/em&gt;, &lt;strong&gt;58&lt;/strong&gt;(2), 211–250.&lt;/li&gt;
&lt;li&gt;Tao, T. (2019). Quantitative bounds for critically bounded solutions to the Navier–Stokes equations. arXiv:1908.04958. Published in &lt;em&gt;Nine Mathematical Challenges&lt;/em&gt;, AMS, 2021, pp. 149–193.&lt;/li&gt;
&lt;li&gt;Tao, T. (2016). Finite time blowup for an averaged three-dimensional Navier–Stokes equation. &lt;em&gt;Journal of the American Mathematical Society&lt;/em&gt;, &lt;strong&gt;29&lt;/strong&gt;(3), 601–674.&lt;/li&gt;
&lt;li&gt;Buckmaster, T. &amp;amp; Vicol, V. (2019). Nonuniqueness of weak solutions to the Navier–Stokes equation. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;189&lt;/strong&gt;(1), 101–144.&lt;/li&gt;
&lt;li&gt;Barker, T. &amp;amp; Prange, C. (2021). Localized quantitative estimates and potential blow-up rates for the Navier–Stokes equations. &lt;em&gt;Communications in Mathematical Physics&lt;/em&gt;, &lt;strong&gt;385&lt;/strong&gt;, 717–792.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Navier–Stokes Regularity: The Uniqueness of Weak Solutions</title><link>https://blog.namln.org/en/posts/navier-stokes-weak-uniqueness/</link><pubDate>Fri, 29 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/navier-stokes-weak-uniqueness/</guid><description>&lt;p&gt;The &lt;a href="../navier-stokes-existence-smoothness"&gt;companion post on Navier–Stokes existence and smoothness&lt;/a&gt;
asked whether smooth solutions can break down in finite time. This post asks the
opposite question: when a solution is only weakly defined, satisfying the equations
in an integral sense rather than pointwise, is it uniquely determined by its initial
data? The answer, developed over the last two decades through a dramatic series of
results, is a resounding &lt;em&gt;no&lt;/em&gt; in many regimes. The frontier is now whether the
physically natural class of Leray–Hopf weak solutions retains uniqueness.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Question (Weak Uniqueness)&lt;/span&gt;
&lt;p&gt;Are Leray–Hopf weak solutions of the 3D incompressible Navier–Stokes equations
$$\partial_t u + (u\cdot\nabla)u - \nu\Delta u + \nabla p = 0, \qquad \nabla\cdot u = 0$$
uniquely determined by their initial data $u_0 \in L^2(\mathbb{R}^3)$?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The question is one of the most urgent open problems in the PDE theory of fluid
dynamics. It is logically independent of the blowup question: Leray–Hopf solutions
exist globally for all time regardless of whether smooth solutions break down. What
is not known is whether two Leray–Hopf solutions started from the same data must
coincide.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="nashs-h-principle-the-conceptual-ancestor"&gt;
 Nash&amp;rsquo;s h-Principle: The Conceptual Ancestor&lt;span class="heading__anchor"&gt; &lt;a href="#nashs-h-principle-the-conceptual-ancestor"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The story begins not in fluid mechanics but in differential geometry. In 1954,
John Nash proved that any Riemannian manifold admits a $C^1$ isometric embedding
into Euclidean space, a result that contradicted the expectation, based on the rigid
behaviour of $C^2$ embeddings (Cauchy), that the metric should impose strong
constraints. The key insight is that $C^1$ embeddings are &lt;em&gt;flexible&lt;/em&gt;: one can
deform them by adding high-frequency oscillations that are invisible at the large
scale but locally produce any prescribed metric tensor.&lt;/p&gt;
&lt;p&gt;Gromov formulated this phenomenon as the &lt;em&gt;h-principle&lt;/em&gt;: for certain underdetermined
differential relations, the topological (homotopy-theoretic) obstructions are the
only ones, and any formal solution can be deformed into an actual solution. The
h-principle is a flexibility result: it says geometry is surprisingly unconstrained
below a critical regularity threshold.&lt;/p&gt;
&lt;p&gt;De Lellis and Székelyhidi recognised in the mid-2000s that the incompressible Euler
equations are formally analogous to Nash&amp;rsquo;s embedding problem. The Euler system is
underdetermined (more unknowns than equations), and one can attempt to construct
wild solutions by adding high-frequency oscillations. The crucial observation is that
the nonlinearity $u\otimes u$ in the Reynolds stress tensor plays the role of the
metric tensor in Nash&amp;rsquo;s problem.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="wild-euler-solutions"&gt;
 Wild Euler Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#wild-euler-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The first step was to show that the Euler equations possess infinitely many weak
solutions for given initial data.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (De Lellis–Székelyhidi, 2009–2013)&lt;/span&gt;
&lt;p&gt;For any divergence-free $u _0 \in L^2(\mathbb{T}^3)$ and any prescribed energy
profile $e(t) \in C^\infty([0,T])$ with $e(t) &amp;gt; \|u _0\| _{L^2}^2$ for all $t &amp;gt; 0$,
there exist infinitely many weak solutions $u \in C_t^0 L_x^2$ of the 3D Euler
equations with $u(\cdot,0) = u _0$ and $\|u(\cdot,t)\| _{L^2}^2 = e(t)$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In particular, the Euler equations admit weak solutions that spontaneously gain or
lose kinetic energy without any external forcing: &lt;em&gt;wild solutions&lt;/em&gt;. The construction proceeds by
convex integration: one builds the solution iteratively, at each stage adding a
high-frequency perturbation (a &lt;em&gt;Beltrami wave&lt;/em&gt;) that corrects the error in the
momentum equation while staying nearly invisible in the velocity field.&lt;/p&gt;
&lt;p&gt;Earlier, Scheffer (1993) and Shnirelman (1997) had shown the existence of weak Euler
solutions with compact support in space-time: the fluid is at rest, then spontaneously
moves, then returns to rest; but their constructions were indirect. De Lellis and
Székelyhidi&amp;rsquo;s convex integration scheme gave the first systematic and quantitative
approach.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="onsagers-conjecture"&gt;
 Onsager&amp;rsquo;s Conjecture&lt;span class="heading__anchor"&gt; &lt;a href="#onsagers-conjecture"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The De Lellis–Székelyhidi results raise an immediate question: at what regularity
does the fluid behaviour transition from flexible (wild, non-unique) to rigid
(energy-conserving, unique)? This is precisely what Lars Onsager conjectured in 1949.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Onsager's Conjecture (1949)&lt;/span&gt;
&lt;p&gt;For the 3D incompressible Euler equations, the threshold regularity for energy
conservation is the Hölder exponent $1/3$:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If $u \in C^{0,\alpha}$ with $\alpha &amp;gt; 1/3$, then every weak solution conserves
kinetic energy.&lt;/li&gt;
&lt;li&gt;For every $\alpha &amp;lt; 1/3$, there exist weak solutions in $C^{0,\alpha}$ that
dissipate energy.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;The &lt;strong&gt;positive direction&lt;/strong&gt; (conservation above $1/3$) was proved by
Constantin–E–Titi (1994). The &lt;strong&gt;negative direction&lt;/strong&gt; (dissipation possible below
$1/3$) required much more work and was fully resolved only recently.&lt;/p&gt;
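&lt;p&gt;The exponent $1/3$ can be motivated by a heuristic scaling count (a sketch, not the Constantin–E–Titi proof itself; the symbols $\ell$, $\delta_\ell u$ and $\Pi_\ell$ are introduced here). If $u \in C^{0,\alpha}$, velocity increments over distance $\ell$ satisfy $|\delta_\ell u| \lesssim \ell^\alpha$. The energy flux through scale $\ell$ is cubic in the increments and carries one derivative, so
$$|\Pi_\ell| \lesssim \frac{|\delta_\ell u|^3}{\ell} \lesssim \ell^{3\alpha - 1}.$$
As $\ell \to 0$ this tends to zero when $\alpha &amp;gt; 1/3$, which forces energy conservation, while for $\alpha &amp;lt; 1/3$ the bound gives no control and anomalous dissipation is not excluded.&lt;/p&gt;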
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Isett, 2018)&lt;/span&gt;
&lt;p&gt;For every $\alpha &amp;lt; 1/3$ there exist weak solutions $u \in C^{0,\alpha}(\mathbb{T}^3\times[0,T])$
of the 3D Euler equations that fail to conserve kinetic energy.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Isett&amp;rsquo;s proof, published in the &lt;em&gt;Annals of Mathematics&lt;/em&gt; in 2018, was the culmination
of a decade of refinements of the De Lellis–Székelyhidi scheme. The key difficulty near
regularity $1/3$ is that the high-frequency perturbations must be sized to
cancel the Reynolds stress error while staying in $C^{1/3-}$; Isett achieved this by
combining Mikado flows (introduced by Daneri and Székelyhidi) with a gluing technique. Buckmaster,
De Lellis, Székelyhidi, and Vicol also obtained solutions attaining any prescribed
energy profile in $C^{1/3-}$. Onsager&amp;rsquo;s conjecture is now a theorem.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="viscous-non-uniqueness-buckmastervicol"&gt;
 Viscous Non-Uniqueness: Buckmaster–Vicol&lt;span class="heading__anchor"&gt; &lt;a href="#viscous-non-uniqueness-buckmastervicol"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Adapting the convex integration scheme from Euler to Navier–Stokes requires overcoming
the viscous term $\nu\Delta u$, which smooths out high-frequency oscillations.
Buckmaster and Vicol introduced &lt;em&gt;intermittent&lt;/em&gt; Beltrami waves, which concentrate
energy on sparse spatial sets and so reduce their interaction with the Laplacian. This
idea brought convex integration into the viscous setting.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Buckmaster–Vicol, 2019)&lt;/span&gt;
&lt;p&gt;There exist infinitely many weak solutions $u \in C_t^0 L_x^2(\mathbb{T}^3)$ of the
3D Navier–Stokes equations that have finite kinetic energy but do not satisfy the
global energy inequality. In particular, weak
solutions of 3D Navier–Stokes are not unique in the class $C_t^0 L_x^2$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The Buckmaster–Vicol solutions, published in the &lt;em&gt;Annals of Mathematics&lt;/em&gt; &lt;strong&gt;189&lt;/strong&gt;
(2019), 101–144, are weak in both the PDE sense and the energy sense: they satisfy
the equations distributionally and have finite kinetic energy, but they can gain
energy spontaneously, violating the natural dissipation law $\partial _t\|u\| _{L^2}^2
\leq -2\nu\|\nabla u\| _{L^2}^2$.&lt;/p&gt;
&lt;p&gt;This non-uniqueness is striking but also limited: the Buckmaster–Vicol solutions
are not Leray–Hopf solutions, because Leray–Hopf solutions are required to satisfy
the &lt;em&gt;energy inequality&lt;/em&gt;
$\|u(t)\| _{L^2}^2 + 2\nu\int _0^t \|\nabla u\| _{L^2}^2\, ds \leq \|u _0\| _{L^2}^2$. Whether this
single additional constraint, that energy does not increase, suffices to restore
uniqueness is the open question.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="crossing-the-energy-barrier-albrittonbruécolombo"&gt;
 Crossing the Energy Barrier: Albritton–Brué–Colombo&lt;span class="heading__anchor"&gt; &lt;a href="#crossing-the-energy-barrier-albrittonbru%c3%a9colombo"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The energy inequality distinguishing Leray–Hopf solutions from Buckmaster–Vicol wild
solutions seemed for a long time to be a genuine barrier to non-uniqueness. The
following result crossed this barrier, but required introducing an external force.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Albritton–Brué–Colombo, 2022)&lt;/span&gt;
&lt;p&gt;There exist a body force $f \in L^1(0,T;\, L^2(\mathbb{R}^3))$ and two distinct
Leray–Hopf weak solutions of the &lt;strong&gt;forced&lt;/strong&gt; 3D Navier–Stokes equations
$\partial_t u + (u\cdot\nabla)u - \nu\Delta u + \nabla p = f$ with the same initial
data $u_0 \equiv 0$ and the same force $f$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Published in the &lt;em&gt;Annals of Mathematics&lt;/em&gt; &lt;strong&gt;196&lt;/strong&gt; (2022), 415–455, the proof uses a
completely different mechanism from convex integration. The key ingredient is an
&lt;em&gt;unstable&lt;/em&gt; background solution: using Vishik&amp;rsquo;s construction of spectrally unstable
steady states of the 2D Euler equations, Albritton–Brué–Colombo lift a 2D unstable
vortex to an axisymmetric 3D vortex ring and embed it into the Navier–Stokes flow
via a self-similar change of variables. The force $f$ is chosen precisely to make
this background exactly solve the forced equations; the instability then allows two
different solutions to branch from the same initial data.&lt;/p&gt;
&lt;p&gt;The force is singular; it belongs to $L^1_t L^2_x$ but is not smooth, and is
concentrated near the initial time $t=0$. Whether the same non-uniqueness can be
achieved with a smooth or zero force is the remaining open problem.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-unforced-case-current-frontier"&gt;
 The Unforced Case: Current Frontier&lt;span class="heading__anchor"&gt; &lt;a href="#the-unforced-case-current-frontier"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Non-uniqueness of Leray–Hopf solutions for the &lt;em&gt;unforced&lt;/em&gt; Navier–Stokes equations
remains open. The route to the unforced case requires finding a self-similar
background profile that solves the unforced equations exactly and has an unstable
eigenvalue, a far more demanding task than the forced case, where the profile can
be any divergence-free function.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Open Problem (Jia–Šverák Programme)&lt;/span&gt;
&lt;p&gt;Do there exist two distinct Leray–Hopf solutions of the 3D Navier–Stokes equations
with the same initial data and no external force?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Jia and Šverák (2013–2014) showed that non-uniqueness would follow from a spectral
assumption: if there exists a forward self-similar Navier–Stokes solution whose
linearised operator has an eigenvalue with positive real part, then Leray–Hopf
solutions are non-unique. Guillod and Šverák (2017) provided compelling numerical
evidence that such an unstable self-similar profile exists.&lt;/p&gt;
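&lt;p&gt;Schematically, and in notation assumed here rather than quoted from the papers, the spectral
condition reads as follows. In self-similar variables $\xi = x/\sqrt{t}$, $\tau = \log t$, a forward
self-similar solution is $u(x,t) = t^{-1/2}\bar U(\xi)$, and a perturbation $u = t^{-1/2}(\bar U + V)$
evolves to first order in $V$ by
$$\partial_\tau V = L_{\bar U}V, \qquad L_{\bar U}V := \tfrac12\bigl(1+\xi\cdot\nabla\bigr)V + \nu\Delta V - \mathbb{P}\bigl[(\bar U\cdot\nabla)V + (V\cdot\nabla)\bar U\bigr],$$
where $\mathbb{P}$ is the Leray projection onto divergence-free fields. An eigenpair
$L_{\bar U}\eta = \lambda\eta$ with $\operatorname{Re}\lambda &amp;gt; 0$ produces the linear solution
$$V(\xi,\tau) = \operatorname{Re}\bigl(e^{\lambda\tau}\eta(\xi)\bigr) = \operatorname{Re}\bigl(t^{\lambda}\eta(\xi)\bigr),$$
which decays as $t \to 0^+$ and grows as $t$ increases. This growing mode is the candidate for a
second solution with the same initial data.&lt;/p&gt;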
&lt;p&gt;In September 2025, Giri and Kwon posted a preprint (arXiv:2509.25116) claiming a
computer-assisted proof of the existence of an unstable self-similar profile for
the unforced equations, which, via the Jia–Šverák mechanism, would establish
non-uniqueness of Leray–Hopf solutions. The proof uses rigorous interval arithmetic
to verify the existence of an unstable eigenvalue. As of this writing the preprint
is under review by the community.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-regularity-threshold"&gt;
 The Regularity Threshold&lt;span class="heading__anchor"&gt; &lt;a href="#the-regularity-threshold"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The accumulated results suggest the following picture of the
&lt;strong&gt;flexibility-rigidity dichotomy&lt;/strong&gt; for the Euler and Navier–Stokes equations.&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Regularity class&lt;/th&gt;
					&lt;th&gt;Euler&lt;/th&gt;
					&lt;th&gt;Navier–Stokes&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;$C^{0,\alpha}$, $\alpha &amp;lt; 1/3$&lt;/td&gt;
					&lt;td&gt;non-unique, dissipative (Isett 2018)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$C^{0,\alpha}$, $\alpha &amp;gt; 1/3$&lt;/td&gt;
					&lt;td&gt;energy-conserving (Constantin–E–Titi 1994)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$L^2$ (global energy inequality)&lt;/td&gt;
					&lt;td&gt;non-unique&lt;/td&gt;
					&lt;td&gt;&lt;strong&gt;open (unforced); non-unique forced (ABC 2022)&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$L^\infty_t L^3_x$ (LPS regularity)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;unique and smooth (ESS 2003)&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The Leray–Hopf class sits precisely at the boundary where uniqueness is expected
to break down but has not yet been proved to do so in the unforced case.&lt;/p&gt;
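&lt;p&gt;The position of the last two rows can be read off from the scaling symmetry of the equations;
the following standard computation is included as a worked check. If $(u,p)$ solves Navier–Stokes,
so does
$$u_\lambda(x,t) = \lambda\,u(\lambda x,\lambda^2 t), \qquad p_\lambda(x,t) = \lambda^2 p(\lambda x,\lambda^2 t), \qquad \lambda &amp;gt; 0,$$
and a change of variables gives
$$\|u_\lambda(\cdot,t)\|_{L^q(\mathbb{R}^3)} = \lambda^{1-3/q}\,\|u(\cdot,\lambda^2 t)\|_{L^q(\mathbb{R}^3)}.$$
The exponent vanishes exactly at $q=3$, so $L^\infty_t L^3_x$ is scale-invariant (critical). For the
energy, $q=2$, the factor is $\lambda^{-1/2}$: the energy is not scale-invariant, which is why the
energy bound alone is called supercritical and does not by itself control fine scales.&lt;/p&gt;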
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-resolving-the-jiašverák-spectral-condition"&gt;
 1. Resolving the Jia–Šverák Spectral Condition&lt;span class="heading__anchor"&gt; &lt;a href="#1-resolving-the-jia%c5%a1ver%c3%a1k-spectral-condition"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The most direct path to unforced Leray–Hopf non-uniqueness is to rigorously confirm
or refute the spectral condition of Jia–Šverák: find (or prove the nonexistence of)
a forward self-similar Navier–Stokes profile with an unstable linearised eigenvalue.
The 2025 Giri–Kwon computer-assisted preprint claims this is now done. If confirmed,
the consequence is striking: Leray&amp;rsquo;s 1934 existence theorem cannot be supplemented
by uniqueness, and the Navier–Stokes Cauchy problem is &lt;em&gt;ill-posed&lt;/em&gt; in the Leray–Hopf
class.&lt;/p&gt;
&lt;h3 class="heading" id="2-selection-principles-and-physical-solutions"&gt;
 2. Selection Principles and Physical Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#2-selection-principles-and-physical-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;If Leray–Hopf solutions are indeed non-unique, a fundamental question becomes which
solution is the physically correct one, the one observed in experiments and computed
in simulations. Several selection criteria have been proposed:
the &lt;em&gt;vanishing viscosity&lt;/em&gt; limit of the Navier–Stokes solution as $\nu\to 0$ from
above, &lt;em&gt;entropy conditions&lt;/em&gt; analogous to those for hyperbolic conservation laws,
and &lt;em&gt;renormalisation group&lt;/em&gt; or &lt;em&gt;statistical ensemble&lt;/em&gt; approaches motivated by
turbulence theory. None of these has been rigorously validated as a selection
criterion that distinguishes a unique Leray–Hopf solution from the others.&lt;/p&gt;
&lt;h3 class="heading" id="3-sharp-regularity-thresholds-for-navierstokes"&gt;
 3. Sharp Regularity Thresholds for Navier–Stokes&lt;span class="heading__anchor"&gt; &lt;a href="#3-sharp-regularity-thresholds-for-navierstokes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;For Euler, Onsager&amp;rsquo;s conjecture identifies $C^{1/3}$ as the sharp regularity
threshold for energy conservation. What is the analogous threshold for Navier–Stokes?
The Buckmaster–Vicol solutions are in $C_t^0 L_x^2$ (very rough), while the
Ladyzhenskaya–Prodi–Serrin class gives uniqueness. The precise exponent at which
uniqueness breaks down, if it does, is not known. Determining the sharp Sobolev
or Hölder regularity threshold for Navier–Stokes uniqueness, analogous to Onsager&amp;rsquo;s
$1/3$, is a central open problem.&lt;/p&gt;
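&lt;p&gt;The origin of the exponent $1/3$ can be seen in a heuristic scaling count, sketched here under
the assumption that velocity increments obey $|\delta_\ell u| := |u(x+\ell) - u(x)| \lesssim |\ell|^{\alpha}$.
The energy flux through scale $\ell$ is cubic in velocity increments and carries one derivative, so
$$\Pi_\ell \;\sim\; \frac{|\delta_\ell u|^3}{|\ell|} \;\lesssim\; |\ell|^{3\alpha - 1}.$$
For $\alpha &amp;gt; 1/3$ the flux vanishes as $\ell \to 0$ and energy is conserved, which is the content of
the Constantin–E–Titi theorem; for $\alpha &amp;lt; 1/3$ the bound no longer forces the flux to vanish, and
dissipative solutions can exist.&lt;/p&gt;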
&lt;h3 class="heading" id="4-uniqueness-for-axisymmetric-initial-data"&gt;
 4. Uniqueness for Axisymmetric Initial Data&lt;span class="heading__anchor"&gt; &lt;a href="#4-uniqueness-for-axisymmetric-initial-data"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A natural restricted problem is whether Leray–Hopf solutions with axisymmetric,
swirl-free initial data are unique. Such data imposes a strong geometric constraint
that eliminates most of the degrees of freedom available to convex integration.
Partial results are known (e.g., global regularity for smooth axisymmetric data without
swirl was proved by Ladyzhenskaya and by Ukhovskii–Yudovich in 1968, while the case with
swirl remains open), but uniqueness of Leray–Hopf solutions in this class
has not been established. If the Giri–Kwon instability is confirmed, understanding
whether the instability mechanism survives axisymmetric perturbations is an
immediate question.&lt;/p&gt;
&lt;h3 class="heading" id="5-stochastic-regularisation"&gt;
 5. Stochastic Regularisation&lt;span class="heading__anchor"&gt; &lt;a href="#5-stochastic-regularisation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;There is a well-studied phenomenon, &lt;em&gt;regularisation by noise&lt;/em&gt;, in which adding
a stochastic forcing term to an ill-posed deterministic PDE restores well-posedness.
For the Navier–Stokes equations, Hofmanová–Zhu–Zhu (2023) showed non-uniqueness
persists even under multiplicative noise for certain body forces, by adapting the
Albritton–Brué–Colombo construction. Whether a generic stochastic perturbation
can restore uniqueness of Leray–Hopf solutions, and what the appropriate notion of
&amp;ldquo;generic&amp;rdquo; should be, is a rich open direction combining convex integration with stochastic
analysis.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Nash, J. (1954). $C^1$ isometric imbeddings. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;60&lt;/strong&gt;(3), 383–396.&lt;/li&gt;
&lt;li&gt;De Lellis, C. &amp;amp; Székelyhidi, L. (2009). The Euler equations as a differential inclusion. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;170&lt;/strong&gt;(3), 1417–1436.&lt;/li&gt;
&lt;li&gt;De Lellis, C. &amp;amp; Székelyhidi, L. (2013). Dissipative continuous Euler flows. &lt;em&gt;Inventiones Mathematicae&lt;/em&gt;, &lt;strong&gt;193&lt;/strong&gt;(2), 377–407.&lt;/li&gt;
&lt;li&gt;Constantin, P., E, W., &amp;amp; Titi, E. S. (1994). Onsager&amp;rsquo;s conjecture on the energy conservation for solutions of Euler&amp;rsquo;s equation. &lt;em&gt;Communications in Mathematical Physics&lt;/em&gt;, &lt;strong&gt;165&lt;/strong&gt;(1), 207–209.&lt;/li&gt;
&lt;li&gt;Isett, P. (2018). A proof of Onsager&amp;rsquo;s conjecture. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;188&lt;/strong&gt;(3), 871–963.&lt;/li&gt;
&lt;li&gt;Buckmaster, T. &amp;amp; Vicol, V. (2019). Nonuniqueness of weak solutions to the Navier–Stokes equation. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;189&lt;/strong&gt;(1), 101–144.&lt;/li&gt;
&lt;li&gt;Buckmaster, T. &amp;amp; Vicol, V. (2019). Convex integration and phenomenologies in turbulence. &lt;em&gt;EMS Surveys in Mathematical Sciences&lt;/em&gt;, &lt;strong&gt;6&lt;/strong&gt;(1–2), 173–263.&lt;/li&gt;
&lt;li&gt;Albritton, D., Brué, E., &amp;amp; Colombo, M. (2022). Non-uniqueness of Leray solutions of the forced Navier–Stokes equations. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;196&lt;/strong&gt;(1), 415–455.&lt;/li&gt;
&lt;li&gt;Jia, H. &amp;amp; Šverák, V. (2014). Local-in-space estimates near initial time for weak solutions of the Navier–Stokes equations and forward self-similar solutions. &lt;em&gt;Inventiones Mathematicae&lt;/em&gt;, &lt;strong&gt;196&lt;/strong&gt;(1), 233–265.&lt;/li&gt;
&lt;li&gt;Giri, V. &amp;amp; Kwon, H. (2025). Nonuniqueness of Leray–Hopf solutions to the unforced incompressible 3D Navier–Stokes equation. arXiv:2509.25116.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>The Regularity Problem for the 3D Euler Equations</title><link>https://blog.namln.org/en/posts/euler-regularity-problem/</link><pubDate>Fri, 29 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/euler-regularity-problem/</guid><description>&lt;p&gt;Leonhard Euler wrote down the equations governing the motion of an ideal
incompressible fluid in 1757. Whether smooth solutions to these equations can
develop a singularity in finite time, a point at which derivatives of the
velocity blow up, has been an open problem ever since, and remains one of the
central questions in mathematical fluid dynamics.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Problem (Euler Regularity)&lt;/span&gt;
&lt;p&gt;Let $u_0 : \mathbb{R}^3 \to \mathbb{R}^3$ be a smooth, divergence-free initial
velocity field with sufficient decay at infinity. Does the unique local smooth
solution $u(x,t)$ to the 3D incompressible Euler equations
$$\partial_t u + (u \cdot \nabla)u + \nabla p = 0, \qquad \nabla \cdot u = 0, \qquad u(\cdot,0)=u_0$$
remain smooth for all time $t &amp;gt; 0$?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;L4&lt;/em&gt; on &lt;a href="https://www.unsolvedmath.com/problems/PDE-001"&gt;UnsolvedMath&lt;/a&gt;,
reflecting its depth, and is closely related to the Clay Millennium Prize Problem
on the Navier–Stokes equations. The two questions are linked through the
zero-viscosity limit, but neither implies the other.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-equations-and-what-regularity-means"&gt;
 The Equations and What Regularity Means&lt;span class="heading__anchor"&gt; &lt;a href="#the-equations-and-what-regularity-means"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The Euler equations express conservation of momentum (first equation) and
incompressibility (second equation) for an inviscid fluid. The unknowns are the
velocity field $u(x,t) \in \mathbb{R}^3$ and pressure $p(x,t) \in \mathbb{R}$;
the pressure is determined implicitly by incompressibility via an elliptic equation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Vorticity.&lt;/strong&gt; The central quantity for singularity analysis is the vorticity
$\omega = \nabla \times u$, which satisfies the vorticity equation
$$\partial_t \omega + (u \cdot \nabla)\omega = (\omega \cdot \nabla)u.$$
The right-hand side, the &lt;em&gt;vortex stretching&lt;/em&gt; term, is the essential source of
difficulty. It creates a quadratic feedback: large $\omega$ produces large
$(\omega \cdot \nabla)u$, which can further amplify $\omega$.&lt;/p&gt;
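&lt;p&gt;A toy model makes the feedback explicit. This is only a caricature, assumed here for
illustration: it drops the transport term and replaces $\nabla u$ by $\omega$ itself, ignoring the
nonlocal and geometric structure of the true equation. For a scalar $y(t)$ standing in for the
vorticity magnitude,
$$\frac{dy}{dt} = y^2, \quad y(0) = y_0 &amp;gt; 0 \qquad\Longrightarrow\qquad y(t) = \frac{y_0}{1 - y_0 t},$$
which blows up at the finite time $T^* = 1/y_0$. The regularity problem asks whether the actual
stretching term can sustain growth of this kind, or whether the structure ignored by the model
always depletes it.&lt;/p&gt;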
&lt;p&gt;&lt;strong&gt;Local well-posedness.&lt;/strong&gt; For $u_0 \in H^s(\mathbb{R}^3)$ with $s &amp;gt; 5/2$, there
exists a unique smooth solution on a time interval $[0, T^*)$ for some $T^* &amp;gt; 0$
depending on $\|u_0\|_{H^s}$ (Kato, 1972). The question is whether $T^*$ can be
taken equal to $+\infty$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why 2D is easy, 3D is not.&lt;/strong&gt; In two dimensions the vortex stretching term
$(\omega \cdot \nabla)u$ vanishes identically: the vorticity points out of the plane of
motion, and the velocity does not vary in that direction. The scalar vorticity
$\omega = \partial_1 u_2 - \partial_2 u_1$ is then simply transported along fluid
particle paths without amplification, and $\|\omega\|_{L^\infty}$ is conserved.
Global regularity in 2D follows immediately. In 3D no such conservation holds,
and the problem is genuinely open.&lt;/p&gt;
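&lt;p&gt;The vanishing of the stretching term is a one-line computation. Viewing a planar flow as the
3D field $u = (u_1(x_1,x_2,t),\,u_2(x_1,x_2,t),\,0)$, its curl is $(0,0,\omega)$ with the scalar
$\omega = \partial_1 u_2 - \partial_2 u_1$, so the stretching term reduces to
$$\omega\,\partial_3 u = 0, \qquad\text{and hence}\qquad \partial_t\omega + (u\cdot\nabla)\omega = 0.$$
Along a particle path $X(t)$ with $\dot X = u(X,t)$ this gives $\frac{d}{dt}\,\omega(X(t),t) = 0$, so the
values of $\omega$ are merely rearranged and $\|\omega(\cdot,t)\|_{L^\infty} = \|\omega_0\|_{L^\infty}$ for all $t$.&lt;/p&gt;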
&lt;hr&gt;
&lt;h2 class="heading" id="the-bealekatomajda-criterion"&gt;
 The Beale–Kato–Majda Criterion&lt;span class="heading__anchor"&gt; &lt;a href="#the-bealekatomajda-criterion"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The first major structural result reduces the regularity problem to a single quantity.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Beale–Kato–Majda, 1984)&lt;/span&gt;
&lt;p&gt;A smooth solution $u$ of the 3D Euler equations loses regularity at time $T^*$ if
and only if
$$\int_0^{T^*} \|\omega(\cdot,t)\|_{L^\infty(\mathbb{R}^3)}\, dt = +\infty.$$
In particular, if the vorticity remains bounded in $L^\infty$ on $[0,T]$ for every
finite $T$, the solution remains smooth globally.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The BKM criterion redirects the problem: one must show that the time integral of the
vorticity magnitude $\|\omega\|_{L^\infty}$ cannot diverge in finite time. Since $\omega$
satisfies a transport-stretching equation, this requires understanding the geometric
structure of the vorticity field under its own evolution.&lt;/p&gt;
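&lt;p&gt;The mechanism behind the criterion can be sketched in two inequalities, stated here
schematically (constants and lower-order terms are suppressed, and $s &amp;gt; 5/2$ is assumed). An energy
estimate in $H^s$ and a logarithmic Sobolev inequality give
$$\frac{d}{dt}\|u\|_{H^s} \le C\,\|\nabla u\|_{L^\infty}\|u\|_{H^s}, \qquad \|\nabla u\|_{L^\infty} \le C\bigl(1 + \|\omega\|_{L^\infty}\log(e + \|u\|_{H^s})\bigr).$$
Setting $y(t) = \log(e + \|u(\cdot,t)\|_{H^s}) \ge 1$ and combining the two yields
$\dot y \le C(1 + \|\omega\|_{L^\infty})\,y$, so by Grönwall
$$y(t) \le y(0)\exp\Bigl(C\int_0^t \bigl(1 + \|\omega(\cdot,\tau)\|_{L^\infty}\bigr)\,d\tau\Bigr).$$
The $H^s$ norm is therefore bounded, at worst double-exponentially, as long as the time integral
of $\|\omega\|_{L^\infty}$ stays finite.&lt;/p&gt;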
&lt;hr&gt;
&lt;h2 class="heading" id="geometric-conditions-and-depletion-of-stretching"&gt;
 Geometric Conditions and Depletion of Stretching&lt;span class="heading__anchor"&gt; &lt;a href="#geometric-conditions-and-depletion-of-stretching"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The vortex stretching term $(\omega \cdot \nabla)u$ can be decomposed as
$$(\omega \cdot \nabla)u = |\omega|\,(\hat\omega \cdot \nabla)u,$$
where $\hat\omega = \omega/|\omega|$ is the unit vorticity direction. The key
observation is that stretching is governed not only by the magnitude of $\omega$
but also by the &lt;em&gt;geometry&lt;/em&gt; of the vorticity field.&lt;/p&gt;
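&lt;p&gt;The role of geometry becomes explicit in the evolution of the vorticity magnitude. The
following is the standard computation, with $S$ and $\alpha$ as notation introduced here. Taking the
inner product of the vorticity equation with $\omega$, only the symmetric part
$S = \tfrac12(\nabla u + \nabla u^{T})$ of the velocity gradient contributes, and wherever $\omega \neq 0$,
$$\bigl(\partial_t + u\cdot\nabla\bigr)|\omega| = \alpha\,|\omega|, \qquad \alpha = \hat\omega\cdot S\,\hat\omega.$$
The stretching rate $\alpha$ depends on how the direction $\hat\omega$ is aligned with the eigenvectors
of $S$. Since $S$ is itself a singular integral of $\omega$, regularity of $\hat\omega$ can produce
cancellations in $\alpha$, which is the depletion mechanism behind the theorem that follows.&lt;/p&gt;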
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Constantin–Fefferman–Majda, 1996)&lt;/span&gt;
&lt;p&gt;If the unit vorticity direction $\hat\omega = \omega/|\omega|$ is uniformly Lipschitz
in a neighbourhood of the set $\{|\omega| &amp;gt; \lambda\}$ for all $t \in [0, T]$ and
some $\lambda &amp;gt; 0$, then the solution remains smooth on $[0,T]$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;This result says that blowup, if it occurs, must be accompanied by violent geometric
irregularity of vortex lines, not just large vorticity magnitude, but also loss of
Lipschitz regularity of the vorticity direction. It has motivated a line of research
on the geometric structure of vortex tubes near potential singularities.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="blowup-for-less-regular-data"&gt;
 Blowup for Less Regular Data&lt;span class="heading__anchor"&gt; &lt;a href="#blowup-for-less-regular-data"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Recent years have seen dramatic progress on singularity formation for initial data
of limited regularity, with Hölder-continuous rather than smooth vorticity.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Elgindi, 2021)&lt;/span&gt;
&lt;p&gt;There exist axisymmetric, swirl-free initial velocity fields $u_0 \in C^{1,\alpha}(\mathbb{R}^3)$
for sufficiently small $\alpha &amp;gt; 0$ such that the corresponding solution to the 3D
Euler equations develops a finite-time singularity.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Elgindi&amp;rsquo;s proof, published in the &lt;em&gt;Annals of Mathematics&lt;/em&gt; &lt;strong&gt;194&lt;/strong&gt; (2021), 647–727,
constructs a self-similar blowup profile and establishes its nonlinear stability using
a dynamical rescaling formulation. The initial data is not smooth: it belongs to
$C^{1,\alpha}$ but not to $C^2$. The singularity forms at the axis of symmetry $r=0$.&lt;/p&gt;
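&lt;p&gt;A self-similar blowup ansatz has the following general shape. This is a schematic form, with
the profile $\Omega$ and exponent $\gamma$ as placeholders rather than the specific objects of the proof:
$$\omega(x,t) = \frac{1}{T-t}\,\Omega\!\left(\frac{x}{(T-t)^{\gamma}}\right), \qquad \gamma &amp;gt; 0.$$
The prefactor is forced by the vorticity equation, in which the time derivative scales like
$\omega/(T-t)$ and the stretching term like $\omega^2$. It is also consistent with the
Beale–Kato–Majda criterion, since
$$\int_0^{T}\|\omega(\cdot,t)\|_{L^\infty}\,dt = \|\Omega\|_{L^\infty}\int_0^{T}\frac{dt}{T-t} = +\infty.$$&lt;/p&gt;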
&lt;p&gt;This was a breakthrough, but it left open the smooth case. Elgindi himself noted the
next target: constructing blowup from initial data that is non-smooth only at a
single point, or eventually from fully smooth data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extending Elgindi&amp;rsquo;s construction.&lt;/strong&gt; Chen and Hou (2021) proved the same type of
$C^{1,\alpha}$ blowup for the 3D axisymmetric Euler equations &lt;em&gt;with boundary&lt;/em&gt; (inside
a periodic cylinder), realising the Hou–Luo blowup scenario numerically proposed in
2014. Subsequent work by Córdoba, Martínez-Zoroa, and Zheng (2025, &lt;em&gt;Annals of PDE&lt;/em&gt;)
showed that the singularity can be formed from initial data in
$C^\infty(\mathbb{R}^3 \setminus \{0\}) \cap C^{1,\alpha}$, with non-smoothness at a
single point, a further step toward the smooth case.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-2025-breakthrough-smooth-blowup-with-boundary"&gt;
 The 2025 Breakthrough: Smooth Blowup with Boundary&lt;span class="heading__anchor"&gt; &lt;a href="#the-2025-breakthrough-smooth-blowup-with-boundary"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The most significant recent development is the following result, which provides a
rigorous proof of finite-time singularity from &lt;em&gt;smooth&lt;/em&gt; initial data.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Chen–Hou, PNAS 2025)&lt;/span&gt;
&lt;p&gt;There exists a family of smooth, finite-energy initial data for the 3D axisymmetric
Euler equations in a smooth bounded domain (periodic cylinder) such that the
corresponding solutions develop a finite-time singularity. The blowup is
nearly self-similar and occurs at the intersection of the boundary $r=1$
and the symmetry plane $z=0$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The paper, contributed by Thomas Hou and published in &lt;em&gt;PNAS&lt;/em&gt; in June 2025
(reviewed by Caflisch, Gómez-Serrano, Šverák, and Tao), provides a
&lt;em&gt;computer-assisted proof&lt;/em&gt;. The strategy is to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;construct a numerical approximate self-similar blowup profile via the dynamical
rescaling formulation,&lt;/li&gt;
&lt;li&gt;prove rigorously that the true solution remains close to this profile using
energy estimates with carefully verified error bounds (computed with interval
arithmetic), and&lt;/li&gt;
&lt;li&gt;conclude nonlinear stability of the blowup via a bootstrap argument.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This settles the blowup question in the setting of smooth data in a bounded domain
with smooth boundary: finite-time singularities do form. The boundary plays a crucial role: it creates an
antisymmetric flow pattern driving azimuthal vorticity toward a critical ring,
generating intense vortex stretching at a hyperbolic saddle point on the wall.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The remaining open case.&lt;/strong&gt; The problem in $\mathbb{R}^3$ (or on the periodic
torus $\mathbb{T}^3$) &lt;em&gt;without boundary&lt;/em&gt; remains open. It is not known whether
smooth initial data in free space can produce a singularity, or whether the
absence of a boundary provides a genuine stabilising mechanism.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-removing-the-boundary"&gt;
 1. Removing the Boundary&lt;span class="heading__anchor"&gt; &lt;a href="#1-removing-the-boundary"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The most pressing open question is whether the Chen–Hou construction can be
extended to $\mathbb{R}^3$ or $\mathbb{T}^3$. The boundary in the 2025 result
acts as a geometric catalyst: it enforces a no-flow condition that concentrates
vorticity at a specific ring on the wall. Without a boundary, the antisymmetric
flow structure that drives the singularity must be sustained entirely by the
initial data and the nonlinear dynamics. Whether a comparable mechanism can
persist in free space, without the reflective constraint of the wall, is the
central open question.&lt;/p&gt;
&lt;h3 class="heading" id="2-self-similar-blowup-in-full-3d"&gt;
 2. Self-Similar Blowup in Full 3D&lt;span class="heading__anchor"&gt; &lt;a href="#2-self-similar-blowup-in-full-3d"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;All current singularity results are for &lt;em&gt;axisymmetric&lt;/em&gt; flows, which reduce the
problem from 3 spatial dimensions to 2 (the $rz$-plane). In full 3D, the angular
variable $\theta$ is active, and perturbations in the azimuthal direction can either
stabilise or destabilise the singularity. Elgindi, Ghoul, and Masmoudi (2021) proved
stability of the $C^{1,\alpha}$ blowup under axisymmetric perturbations. Whether
the singularity survives &lt;em&gt;fully 3D&lt;/em&gt; (non-axisymmetric) perturbations, a question
Elgindi posed as open, is crucial: a blowup that is destroyed by any non-symmetric
perturbation has limited physical relevance.&lt;/p&gt;
&lt;h3 class="heading" id="3-quantitative-vortex-stretching-and-the-role-of-geometry"&gt;
 3. Quantitative Vortex Stretching and the Role of Geometry&lt;span class="heading__anchor"&gt; &lt;a href="#3-quantitative-vortex-stretching-and-the-role-of-geometry"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The BKM criterion and the Constantin–Fefferman–Majda theorem both express the
same idea from opposite directions: blowup is controlled by the magnitude &lt;em&gt;and&lt;/em&gt;
geometry of the vorticity. Current research asks whether a quantitative version can
be made sharp. Specifically: if the vorticity direction $\hat\omega$ becomes
Hölder-continuous but not Lipschitz, does blowup necessarily follow? Or is there
a finer scale-invariant quantity, perhaps involving the Hessian of the velocity
or the curvature of vortex lines, that governs the problem?&lt;/p&gt;
&lt;h3 class="heading" id="4-weak-solutions-and-non-uniqueness"&gt;
 4. Weak Solutions and Non-Uniqueness&lt;span class="heading__anchor"&gt; &lt;a href="#4-weak-solutions-and-non-uniqueness"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Separate from the question of whether smooth solutions blow up is the question of
what happens &lt;em&gt;after&lt;/em&gt; a potential singularity. De Lellis and Székelyhidi (2009–2013)
proved that the Euler equations have infinitely many weak $L^\infty$ solutions
for generic initial data, via convex integration. Isett (2018) proved that weak
solutions can dissipate energy, confirming Onsager&amp;rsquo;s 1949 conjecture. These results
show that the solution concept must be carefully chosen. After a smooth blowup, the
system likely enters a regime of non-unique weak solutions, and identifying a
physically relevant selection criterion (candidates include entropy conditions and
the vanishing-viscosity limit) is a major open problem.&lt;/p&gt;
&lt;h3 class="heading" id="5-vanishing-viscosity-and-the-navierstokes-connection"&gt;
 5. Vanishing Viscosity and the Navier–Stokes Connection&lt;span class="heading__anchor"&gt; &lt;a href="#5-vanishing-viscosity-and-the-navierstokes-connection"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Navier–Stokes equations add a viscous term $\nu \Delta u$ to the right-hand
side. For any $\nu &amp;gt; 0$, global regularity of Navier–Stokes in 3D is itself open
(the Clay Millennium Problem). For the zero-viscosity limit $\nu \to 0$, the
central question is whether Navier–Stokes solutions converge to Euler solutions
uniformly in time, a question tied to boundary layer behaviour (the Prandtl
conjecture) and to the regularity of the Euler solution. If Euler develops a
singularity at time $T^*$, the behaviour of Navier–Stokes solutions near $T^*$
as $\nu \to 0$ is completely unknown.&lt;/p&gt;
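&lt;p&gt;One way to quantify the limit is through the energy balance. For a smooth Navier–Stokes
solution $u^\nu$ on $\mathbb{R}^3$ (with decay) or $\mathbb{T}^3$ and no force, multiplying the
equation by $u^\nu$ and integrating by parts gives
$$\frac{d}{dt}\,\frac12\|u^\nu(\cdot,t)\|_{L^2}^2 = -\nu\,\|\nabla u^\nu(\cdot,t)\|_{L^2}^2,$$
while smooth Euler solutions conserve energy. Whether the dissipated energy
$\nu\int_0^T\|\nabla u^\nu\|_{L^2}^2\,dt$ tends to zero as $\nu \to 0$ therefore measures how close the
viscous flow stays to an energy-conserving Euler flow; a nonvanishing limit is the anomalous
dissipation linked to Onsager&amp;rsquo;s conjecture.&lt;/p&gt;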
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Euler, L. (1757). Principes généraux du mouvement des fluides. &lt;em&gt;Mémoires de l&amp;rsquo;Académie des Sciences de Berlin&lt;/em&gt;, &lt;strong&gt;11&lt;/strong&gt;, 274–315.&lt;/li&gt;
&lt;li&gt;Beale, J. T., Kato, T., &amp;amp; Majda, A. (1984). Remarks on the breakdown of smooth solutions for the 3-D Euler equations. &lt;em&gt;Communications in Mathematical Physics&lt;/em&gt;, &lt;strong&gt;94&lt;/strong&gt;(1), 61–66.&lt;/li&gt;
&lt;li&gt;Constantin, P., Fefferman, C., &amp;amp; Majda, A. J. (1996). Geometric constraints on potentially singular solutions for the 3-D Euler equations. &lt;em&gt;Communications in Partial Differential Equations&lt;/em&gt;, &lt;strong&gt;21&lt;/strong&gt;(3–4), 559–571.&lt;/li&gt;
&lt;li&gt;Elgindi, T. M. (2021). Finite-time singularity formation for $C^{1,\alpha}$ solutions to the incompressible Euler equations on $\mathbb{R}^3$. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;194&lt;/strong&gt;(3), 647–727.&lt;/li&gt;
&lt;li&gt;Elgindi, T. M., Ghoul, T.-E., &amp;amp; Masmoudi, N. (2021). On the stability of self-similar blow-up for $C^{1,\alpha}$ solutions to the incompressible Euler equations. &lt;em&gt;Cambridge Journal of Mathematics&lt;/em&gt;, &lt;strong&gt;9&lt;/strong&gt;(4), 1035–1075.&lt;/li&gt;
&lt;li&gt;Chen, J. &amp;amp; Hou, T. Y. (2021). Finite time blowup of 2D Boussinesq and 3D Euler equations with $C^{1,\alpha}$ velocity and boundary. &lt;em&gt;Communications in Mathematical Physics&lt;/em&gt;, &lt;strong&gt;383&lt;/strong&gt;, 1559–1667.&lt;/li&gt;
&lt;li&gt;Chen, J. &amp;amp; Hou, T. Y. (2025). Singularity formation in 3D Euler equations with smooth initial data and boundary. &lt;em&gt;Proceedings of the National Academy of Sciences&lt;/em&gt;, &lt;strong&gt;122&lt;/strong&gt;(27). &lt;a href="https://doi.org/10.1073/pnas.2500940122"&gt;https://doi.org/10.1073/pnas.2500940122&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Córdoba, D., Martínez-Zoroa, L., &amp;amp; Zheng, F. (2025). Finite time singularities to the 3D incompressible Euler equations for solutions in $C^\infty(\mathbb{R}^3\setminus\{0\})\cap C^{1,\alpha}\cap L^2$. &lt;em&gt;Annals of PDE&lt;/em&gt;. &lt;a href="https://doi.org/10.1007/s40818-025-00214-2"&gt;https://doi.org/10.1007/s40818-025-00214-2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Isett, P. (2018). A proof of Onsager&amp;rsquo;s conjecture. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;188&lt;/strong&gt;(3), 871–963.&lt;/li&gt;
&lt;li&gt;Majda, A. J. &amp;amp; Bertozzi, A. L. (2002). &lt;em&gt;Vorticity and Incompressible Flow&lt;/em&gt;. Cambridge University Press.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>$C^r$ Stability Conjecture</title><link>https://blog.namln.org/en/posts/cr-stability-conjecture/</link><pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/cr-stability-conjecture/</guid><description>&lt;p&gt;Structural stability is a global topological property: a dynamical system is
structurally stable if all nearby systems have the same orbit structure, up to
continuous reparametrisation. Hyperbolicity is a local differential property:
the tangent bundle over the recurrent set splits into uniformly contracting and
expanding directions. That these two conditions should be equivalent is one of the
deepest principles in smooth dynamics.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Conjecture ($C^r$ Stability Conjecture, Palis–Smale, ~1970)&lt;/span&gt;
&lt;p&gt;Let $M$ be a closed smooth manifold and $r \geq 1$. If $f \in \mathrm{Diff}^r(M)$
is $C^r$-structurally stable, then $f$ is hyperbolic, i.e., it satisfies
&lt;strong&gt;Axiom A&lt;/strong&gt; and the &lt;strong&gt;Strong Transversality Condition&lt;/strong&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;L3&lt;/em&gt; on &lt;a href="https://www.unsolvedmath.com/problems/OPG-725"&gt;UnsolvedMath&lt;/a&gt;
and sits at the heart of the global theory of smooth dynamical systems. The case
$r = 1$ is resolved. The case $r \geq 2$ is open, and even basic consequences of
structural stability that are elementary for $r = 1$ remain unknown for $r = 2$.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="key-definitions"&gt;
 Key Definitions&lt;span class="heading__anchor"&gt; &lt;a href="#key-definitions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Structural stability.&lt;/strong&gt; A diffeomorphism $f \in \mathrm{Diff}^r(M)$ is
&lt;em&gt;$C^r$-structurally stable&lt;/em&gt; if there exists a $C^r$-neighborhood $\mathcal{U}$ of $f$
such that every $g \in \mathcal{U}$ is topologically conjugate to $f$: there is a
homeomorphism $h : M \to M$ with $h \circ f = g \circ h$. The system is therefore
robust under $C^r$-small perturbations in the strongest possible sense: topology,
not just orbit counts, is preserved.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Axiom A.&lt;/strong&gt; The diffeomorphism $f$ satisfies &lt;em&gt;Axiom A&lt;/em&gt; if:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the non-wandering set $\Omega(f)$ is hyperbolic: there is a $Df$-invariant splitting
$T_x M = E^s_x \oplus E^u_x$ over $\Omega(f)$ with uniform exponential contraction
on $E^s$ and expansion on $E^u$;&lt;/li&gt;
&lt;li&gt;the periodic points of $f$ are dense in $\Omega(f)$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Strong Transversality Condition (STC).&lt;/strong&gt; For every $x, y \in \Omega(f)$, the
stable manifold $W^s(x)$ and the unstable manifold $W^u(y)$ intersect transversally.
Tangential intersections, namely &lt;em&gt;homoclinic or heteroclinic tangencies&lt;/em&gt;, are forbidden.&lt;/p&gt;
&lt;p&gt;Together, Axiom A and the STC constitute what is usually meant by saying $f$ is
&lt;em&gt;hyperbolic&lt;/em&gt; in the sense of the stability conjecture.&lt;/p&gt;
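&lt;p&gt;Spelled out, the uniform estimates in the definition of a hyperbolic set read as follows,
with constants $C &amp;gt; 0$ and $0 &amp;lt; \lambda &amp;lt; 1$ introduced here as notation: for all
$x \in \Omega(f)$ and $n \ge 0$,
$$\|Df^n_x v\| \le C\lambda^n\|v\| \ \ (v \in E^s_x), \qquad \|Df^{-n}_x v\| \le C\lambda^n\|v\| \ \ (v \in E^u_x).$$
A standard example is the linear automorphism of the torus $\mathbb{T}^2$ induced by the matrix
$\begin{pmatrix}2 &amp;amp; 1\\ 1 &amp;amp; 1\end{pmatrix}$: its eigenvalues are $(3\pm\sqrt5)/2$, one larger
and one smaller than $1$, the two eigendirections give the splitting at every point, and
$\Omega(f) = \mathbb{T}^2$.&lt;/p&gt;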
&lt;hr&gt;
&lt;h2 class="heading" id="the-two-directions"&gt;
 The Two Directions&lt;span class="heading__anchor"&gt; &lt;a href="#the-two-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The conjecture, as an equivalence, has an easy direction and a hard direction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Structural stability follows from hyperbolicity&lt;/strong&gt; (the easy direction). Robbin (1971)
proved this for $C^2$ diffeomorphisms; Robinson (1976) extended it to $C^1$. Both
proofs use the implicit function theorem on an appropriate space of conjugacies.
Because every $C^r$-small perturbation is also $C^1$-small, the $C^1$ result
covers all $r \geq 1$.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Robbin 1971, Robinson 1976)&lt;/span&gt;
&lt;p&gt;For every $r \geq 1$, if $f \in \mathrm{Diff}^r(M)$ satisfies Axiom A and the
Strong Transversality Condition, then $f$ is $C^r$-structurally stable.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Hyperbolicity follows from structural stability&lt;/strong&gt; (the hard direction) is the
conjecture itself. It requires understanding what structural stability forces on
the dynamics, ruling out every non-hyperbolic mechanism compatible with stability.
This is where the difficulty lies, and where the gap between $r = 1$ and $r \geq 2$
opens.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-c1-case-mañés-theorem"&gt;
 The $C^1$ Case: Mañé&amp;rsquo;s Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#the-c1-case-ma%c3%b1%c3%a9s-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The $C^1$ stability conjecture was fully proved by Mañé in 1987.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Mañé, 1987)&lt;/span&gt;
&lt;p&gt;Every $C^1$-structurally stable diffeomorphism of a closed manifold satisfies
Axiom A and the Strong Transversality Condition.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The proof, published in &lt;em&gt;Publ. Math. IHÉS&lt;/em&gt; &lt;strong&gt;66&lt;/strong&gt; (1987), 161–210, is a tour de
force of $C^1$ perturbation theory. It rests on several tools that are available
only in the $C^1$ topology:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Pugh&amp;rsquo;s $C^1$ closing lemma (1967):&lt;/strong&gt; Given a non-wandering point $x$ of $f$,
one can make an arbitrarily small $C^1$ perturbation of $f$ to create a periodic
orbit passing near $x$. This is the essential mechanism for showing that periodic
points are dense in $\Omega(f)$.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mañé&amp;rsquo;s ergodic closing lemma (1982):&lt;/strong&gt; A more refined version that controls the
Lyapunov exponents of the created periodic orbit, allowing the construction of
hyperbolic periodic points that shadow the orbit of an ergodic measure.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Franks&amp;rsquo; lemma (1971):&lt;/strong&gt; Any sufficiently small perturbation of the
derivative $Df$ along a periodic orbit can be realized by a $C^1$-small perturbation
of $f$ that preserves the orbit, allowing one to test whether a given
splitting is genuinely hyperbolic or can be destroyed by a small $C^1$ perturbation.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The strategy is to assume structural stability and use these tools to show, step by
step, that the non-wandering set must be hyperbolic and that tangencies cannot persist.
Mañé had proved the surface case ($\dim M = 2$, $r = 1$) earlier, with the full
higher-dimensional result completed in the 1987 paper. Aoki (1992) and Hayashi (1992)
subsequently settled the closely related Mañé conjecture on the $C^1$ interior of the
set of diffeomorphisms with all hyperbolic periodic points.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-wall-at-r-geq-2"&gt;
 The Wall at $r \geq 2$&lt;span class="heading__anchor"&gt; &lt;a href="#the-wall-at-r-geq-2"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The $C^r$ case for $r \geq 2$ is not merely an incremental extension. The tools that
power Mañé&amp;rsquo;s proof are fundamentally $C^1$ phenomena.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The $C^r$ closing lemma is open for $r \geq 2$.&lt;/strong&gt; Pugh&amp;rsquo;s proof does not
carry over to $r \geq 2$: Gutierrez showed that the local perturbation argument used
for $C^1$ does not work in the $C^2$ topology. A $C^r$ closing lemma is available
only for specific classes of diffeomorphisms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Conservative (volume-preserving) diffeomorphisms on surfaces: Asaoka–Irie
($C^\infty$, 2015), Cristofaro-Gardiner–Prasad–Zhang (2023).&lt;/li&gt;
&lt;li&gt;Partially hyperbolic diffeomorphisms with one-dimensional center bundle (all
$r \geq 2$ including $r = \infty$): Gan–Shi (2022) and the follow-up
$C^r$-chain closing lemma of Shi–Wang (&lt;em&gt;Ergodic Theory Dynam. Syst.&lt;/em&gt; &lt;strong&gt;44&lt;/strong&gt;, 2024).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the absence of a general $C^r$ closing lemma, the first step of Mañé&amp;rsquo;s proof,
showing that periodic points are dense in $\Omega(f)$ under $C^r$ structural
stability, is not known for $r \geq 2$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mañé himself underscored this gap.&lt;/strong&gt; In the 1987 paper, immediately after the
proof of Theorem A, he writes that for $r &amp;gt; 1$ &amp;ldquo;not even [being] known whether a
$C^2$ structurally stable diffeomorphism has at least one periodic point, it seems,
to say the least, difficult to prove that they are dense.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Franks&amp;rsquo; lemma also fails for $r \geq 2$.&lt;/strong&gt; Controlling linear maps along periodic
orbits requires $C^1$ perturbations; in higher regularity the ambient perturbation
must be smooth and the constraints on higher derivatives can prevent the desired
linear behaviour from being achieved.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-the-cr-closing-lemma-for-general-diffeomorphisms"&gt;
 1. The $C^r$ Closing Lemma for General Diffeomorphisms&lt;span class="heading__anchor"&gt; &lt;a href="#1-the-cr-closing-lemma-for-general-diffeomorphisms"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The most direct path to the $C^r$ stability conjecture passes through a general
$C^r$ closing lemma. For $r \geq 2$ this asks: given any non-wandering point of a
$C^r$ diffeomorphism, can one make an arbitrarily small $C^r$ perturbation to close
the orbit? Answering this in the affirmative for all closed manifolds and all
$r \geq 2$ would be a landmark result, and would immediately advance the stability
conjecture. The recent progress in conservative surface dynamics (Cristofaro-Gardiner
et al., 2023) and partially hyperbolic settings shows the question is not hopeless,
but the general dissipative case remains untouched.&lt;/p&gt;
&lt;h3 class="heading" id="2-the-surface-case-dim-m--2-r-geq-2"&gt;
 2. The Surface Case $\dim M = 2$, $r \geq 2$&lt;span class="heading__anchor"&gt; &lt;a href="#2-the-surface-case-dim-m--2-r-geq-2"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;On surfaces the dynamics is simpler: the non-wandering set has lower-dimensional
structure, and the absence of a center bundle means &amp;ldquo;partially hyperbolic&amp;rdquo; reduces
to &amp;ldquo;hyperbolic.&amp;rdquo; Mañé settled the surface case for $r = 1$. The $C^r$ stability
conjecture for surfaces and $r \geq 2$ is already an important open target and may
be the most accessible subcase. Recent $C^\infty$ closing lemmas for conservative
surface diffeomorphisms (Asaoka–Irie) suggest that the conservative surface case
may be reachable.&lt;/p&gt;
&lt;h3 class="heading" id="3-partially-hyperbolic-diffeomorphisms"&gt;
 3. Partially Hyperbolic Diffeomorphisms&lt;span class="heading__anchor"&gt; &lt;a href="#3-partially-hyperbolic-diffeomorphisms"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A diffeomorphism is &lt;em&gt;partially hyperbolic&lt;/em&gt; if the tangent bundle splits as
$TM = E^{ss} \oplus E^c \oplus E^{uu}$ with uniform contraction on $E^{ss}$,
uniform expansion on $E^{uu}$, and an intermediate &amp;ldquo;center&amp;rdquo; bundle $E^c$.
For these systems, Gan–Shi (2022) and Shi–Wang (2024) have established $C^r$
closing and chain-closing lemmas when $\dim E^c = 1$. The question is whether
$C^r$-structural stability of a partially hyperbolic diffeomorphism forces the
center bundle to also become hyperbolic, that is, whether partial hyperbolicity
implies full hyperbolicity under stability.&lt;/p&gt;
&lt;h3 class="heading" id="4-the-palis-global-conjecture"&gt;
 4. The Palis Global Conjecture&lt;span class="heading__anchor"&gt; &lt;a href="#4-the-palis-global-conjecture"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Palis proposed that every diffeomorphism can be approximated either by a hyperbolic
one or by one exhibiting a &lt;em&gt;homoclinic tangency&lt;/em&gt; or a &lt;em&gt;heterodimensional cycle&lt;/em&gt;. This
is a positive description of non-hyperbolic dynamics, and is a strengthening of the
$C^r$ stability conjecture (it would also characterise what structural stability
forbids). In the $C^1$ topology substantial progress has been made, building on
Bonatti–Crovisier&amp;rsquo;s connecting lemma (2004) and related results. For $r \geq 2$ it is wide
open, and progress on the Palis conjecture in $C^r$ would likely resolve the
stability conjecture as a corollary.&lt;/p&gt;
&lt;h3 class="heading" id="5-flows-and-the-vector-field-analogue"&gt;
 5. Flows and the Vector Field Analogue&lt;span class="heading__anchor"&gt; &lt;a href="#5-flows-and-the-vector-field-analogue"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The stability conjecture has a natural analogue for $C^r$ vector fields: a
$C^r$-structurally stable flow should satisfy Axiom A and the strong transversality
condition. For $r = 1$ this is also proved; for $r \geq 2$ it is open. The vector
field setting introduces additional complications from singular points (zeros of the
vector field). Moreover, Labarca–Pacifico showed that on manifolds with boundary stable
flows can fail Axiom A, so the correct formulation may need adaptation. Progress
on the diffeomorphism case would likely shed light on the flow case as well.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Palis, J. &amp;amp; Smale, S. (1970). Structural stability theorems. &lt;em&gt;Proc. Sympos. Pure Math.&lt;/em&gt;, &lt;strong&gt;14&lt;/strong&gt;, 223–231.&lt;/li&gt;
&lt;li&gt;Robbin, J. W. (1971). A structural stability theorem. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;94&lt;/strong&gt;(2), 447–493.&lt;/li&gt;
&lt;li&gt;Robinson, C. (1976). Structural stability of $C^1$ diffeomorphisms. &lt;em&gt;Journal of Differential Equations&lt;/em&gt;, &lt;strong&gt;22&lt;/strong&gt;(1), 28–73.&lt;/li&gt;
&lt;li&gt;Mañé, R. (1987). A proof of the $C^1$ stability conjecture. &lt;em&gt;Publications Mathématiques de l&amp;rsquo;IHÉS&lt;/em&gt;, &lt;strong&gt;66&lt;/strong&gt;, 161–210.&lt;/li&gt;
&lt;li&gt;Aoki, N. (1992). The set of Axiom A diffeomorphisms with no cycles. &lt;em&gt;Bol. Soc. Brasil. Mat.&lt;/em&gt;, &lt;strong&gt;23&lt;/strong&gt;(1–2), 21–65.&lt;/li&gt;
&lt;li&gt;Hayashi, S. (1992). Diffeomorphisms in $\mathcal{F}^1(M)$ satisfy Axiom A. &lt;em&gt;Ergodic Theory Dynam. Systems&lt;/em&gt;, &lt;strong&gt;12&lt;/strong&gt;(2), 233–253.&lt;/li&gt;
&lt;li&gt;Gan, S. &amp;amp; Shi, Y. (2022). $C^r$-closing lemma for partially hyperbolic diffeomorphisms with 1D-center bundle. &lt;em&gt;Journal of Differential Equations&lt;/em&gt;, &lt;strong&gt;334&lt;/strong&gt;, 337–363.&lt;/li&gt;
&lt;li&gt;Shi, Y. &amp;amp; Wang, X. (2024). $C^r$-chain closing lemma for certain partially hyperbolic diffeomorphisms. &lt;em&gt;Ergodic Theory Dynam. Systems&lt;/em&gt;, &lt;strong&gt;44&lt;/strong&gt;(7), 1923–1944.&lt;/li&gt;
&lt;li&gt;Bonatti, C. &amp;amp; Crovisier, S. (2004). Récurrence et généricité. &lt;em&gt;Inventiones Mathematicae&lt;/em&gt;, &lt;strong&gt;158&lt;/strong&gt;(1), 33–104.&lt;/li&gt;
&lt;li&gt;Berger, P. (2017). Lectures on structural stability in dynamics. arXiv:1703.00092.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Inequality for Square-Summable Complex Series</title><link>https://blog.namln.org/en/posts/inequality-square-summable-complex-series/</link><pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/inequality-square-summable-complex-series/</guid><description>&lt;p&gt;Some inequalities look formidable until the right decomposition makes them
transparent. The conjecture below, posed by Zoltan Retkes on the
&lt;a href="http://www.openproblemgarden.org/op/inequality_for_square_summable_complex_series"&gt;Open Problem Garden&lt;/a&gt;
in 2012 with a £10 prize attached, is one such case: once the dyadic structure of
the positive integers is made explicit, the proof reduces to two classical facts.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Conjecture (Retkes, 2012), now proved&lt;/span&gt;
&lt;p&gt;For all $\alpha = (\alpha_1, \alpha_2, \ldots) \in \ell^2(\mathbb{C})$,
$$\sum_{n \geq 1} |\alpha_n|^2 \geq \frac{6}{\pi^2} \sum_{k \geq 0}
\left|\, \sum_{l \geq 0} \frac{\alpha_{2^k(2l+1)}}{l+1} \,\right|^2.$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The conjecture was confirmed by an anonymous comment on the problem page in November
2013. A self-contained proof and an extension to $\ell^p$ were subsequently published
by Ibragimov and Salimova in &lt;em&gt;Elemente der Mathematik&lt;/em&gt; &lt;strong&gt;70&lt;/strong&gt; (2015), 79–81.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-dyadic-decomposition"&gt;
 The Dyadic Decomposition&lt;span class="heading__anchor"&gt; &lt;a href="#the-dyadic-decomposition"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The index $2^k(2l+1)$ running over $k \geq 0$ and $l \geq 0$ is not arbitrary:
it encodes a canonical partition of the positive integers. Every $n \in \mathbb{N}^+$
factors uniquely as
$$n = 2^k \cdot r, \qquad k \geq 0,\quad r \text{ odd positive},$$
where $k = v_2(n)$ is the 2-adic valuation of $n$ and $r = n/2^k$ is its odd part.
Writing $r = 2l+1$ gives the bijection $\mathbb{N}_0 \times \mathbb{N}_0 \to \mathbb{N}^+$,
$(k, l) \mapsto 2^k(2l+1)$. In particular the sets
$$A_k = \{2^k(2l+1) : l \geq 0\} = \{2^k, 3 \cdot 2^k, 5 \cdot 2^k, \ldots\}$$
form a &lt;strong&gt;partition&lt;/strong&gt; of $\mathbb{N}^+$. Explicitly: $A_0 = \{1, 3, 5, 7, \ldots\}$
(odd numbers), $A_1 = \{2, 6, 10, 14, \ldots\}$ (twice an odd number), and so on.
This partition is the key structural fact behind the proof.&lt;/p&gt;
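&lt;p&gt;The bijection $(k, l) \mapsto 2^k(2l+1)$ and its inverse are easy to compute. The following Python sketch (the function names &lt;code&gt;dyadic_coordinates&lt;/code&gt; and &lt;code&gt;from_dyadic_coordinates&lt;/code&gt; are illustrative, not taken from any library) strips factors of two to find $k = v_2(n)$ and then reads off $l$ from the odd part.&lt;/p&gt;

```python
def dyadic_coordinates(n):
    """Return (k, l) such that n == 2**k * (2*l + 1), for a positive integer n."""
    if max(n, 0) == 0:  # rejects zero and negative inputs
        raise ValueError("n must be a positive integer")
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    # n is now the odd part r = 2*l + 1.
    return k, (n - 1) // 2

def from_dyadic_coordinates(k, l):
    """Inverse map: send (k, l) to 2**k * (2*l + 1)."""
    return 2 ** k * (2 * l + 1)

for n in range(1, 13):
    print(n, dyadic_coordinates(n))
```

&lt;p&gt;Every positive integer appears exactly once as an output of &lt;code&gt;from_dyadic_coordinates&lt;/code&gt;, which is the partition property used in the proof.&lt;/p&gt;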
&lt;hr&gt;
&lt;h2 class="heading" id="proof"&gt;
 Proof&lt;span class="heading__anchor"&gt; &lt;a href="#proof"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The argument has two ingredients: the &lt;strong&gt;Basel sum&lt;/strong&gt; $\sum_{l \geq 0}(l+1)^{-2} = \pi^2/6$,
and the &lt;strong&gt;Cauchy–Schwarz inequality&lt;/strong&gt; in $\ell^2(\mathbb{C})$.&lt;/p&gt;
&lt;p&gt;Define two sequences in $\ell^2(\mathbb{C})$:
$$x = \left(1,\, \tfrac{1}{2},\, \tfrac{1}{3},\, \ldots\right), \qquad
y_k = \left(\alpha_{2^k},\, \alpha_{3 \cdot 2^k},\, \alpha_{5 \cdot 2^k},\, \ldots\right)
\quad (k \geq 0).$$&lt;/p&gt;
&lt;p&gt;Because $x$ has real entries, the inner sum in the conjecture is exactly the $\ell^2$ inner product $\langle x, y_k \rangle$, taken conjugate-linear in the first argument:
$$\sum_{l \geq 0} \frac{\alpha_{2^k(2l+1)}}{l+1} = \langle x, y_k \rangle.$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1: Apply Cauchy–Schwarz.&lt;/strong&gt; For each $k$,&lt;/p&gt;
&lt;p&gt;$$|\langle x, y_k \rangle|^2 \leq \|x\|_2^2 \cdot \|y_k\|_2^2.$$&lt;/p&gt;
&lt;p&gt;Summing over $k \geq 0$,&lt;/p&gt;
&lt;p&gt;$$\sum_{k \geq 0} |\langle x, y_k \rangle|^2 \leq \|x\|_2^2 \sum_{k \geq 0} \|y_k\|_2^2.$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 2: Evaluate using the Basel problem and the partition.&lt;/strong&gt; The Basel problem gives
$$\|x\|_2^2 = \sum_{l \geq 0} \frac{1}{(l+1)^2} = \frac{\pi^2}{6}.$$&lt;/p&gt;
&lt;p&gt;Since the sets $A_k$ partition $\mathbb{N}^+$,
$$\sum_{k \geq 0} \|y_k\|_2^2 = \sum_{k \geq 0} \sum_{l \geq 0} |\alpha_{2^k(2l+1)}|^2
= \sum_{n \geq 1} |\alpha_n|^2.$$&lt;/p&gt;
&lt;p&gt;Combining both steps,
$$\sum_{k \geq 0} \left|\sum_{l \geq 0} \frac{\alpha_{2^k(2l+1)}}{l+1}\right|^2
\leq \frac{\pi^2}{6} \sum_{n \geq 1} |\alpha_n|^2,$$
which is the inequality with the $\frac{6}{\pi^2}$ factor moved to the other side.&lt;/p&gt;
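&lt;p&gt;The inequality can be sanity-checked numerically on finitely supported sequences. The sketch below is an illustration only: the function name &lt;code&gt;retkes_sides&lt;/code&gt; and the random test vector are assumptions made here. It groups the entries of $\alpha$ by dyadic layer, forms the weighted inner sums, and returns both sides of the conjectured inequality.&lt;/p&gt;

```python
import math
import random

def retkes_sides(alpha):
    """Return (lhs, rhs) of the inequality for a finite list alpha.

    alpha[0] plays the role of alpha_1; entries beyond the list are zero.
    """
    inner = {}  # maps k to the inner sum over the layer A_k
    for n, value in enumerate(alpha, start=1):
        k, odd = 0, n
        while odd % 2 == 0:
            odd //= 2
            k += 1
        l = (odd - 1) // 2  # n == 2**k * (2*l + 1)
        inner[k] = inner.get(k, 0) + value / (l + 1)
    lhs = sum(abs(v) ** 2 for v in alpha)
    rhs = (6 / math.pi ** 2) * sum(abs(s) ** 2 for s in inner.values())
    return lhs, rhs

random.seed(0)
alpha = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
lhs, rhs = retkes_sides(alpha)
print("lhs =", lhs, "rhs =", rhs)
```

&lt;p&gt;For a generic random vector the right-hand side is much smaller than the left, since the layers $y_k$ are far from being multiples of $x$.&lt;/p&gt;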
&lt;hr&gt;
&lt;h2 class="heading" id="sharpness-of-the-constant"&gt;
 Sharpness of the Constant&lt;span class="heading__anchor"&gt; &lt;a href="#sharpness-of-the-constant"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The constant $6/\pi^2$ is the best possible. To see this, consider the truncated
sequence $\alpha^{(N)}$ defined by $\alpha^{(N)}_{2l+1} = 1/(l+1)$ for
$l = 0, 1, \ldots, N-1$ and $\alpha^{(N)}_n = 0$ otherwise. Then:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The left-hand side equals $\displaystyle\sum_{l=0}^{N-1} \frac{1}{(l+1)^2} \to \frac{\pi^2}{6}$.&lt;/li&gt;
&lt;li&gt;The only non-zero contribution to the right-hand side comes from $k = 0$
(since all non-zero indices are odd, i.e. in $A_0$), giving
$\displaystyle\frac{6}{\pi^2}\left(\sum_{l=0}^{N-1} \frac{1}{(l+1)^2}\right)^2 \to \frac{6}{\pi^2} \cdot \frac{\pi^4}{36} = \frac{\pi^2}{6}$.&lt;/li&gt;
&lt;/ul&gt;
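&lt;p&gt;The ratio of the right-hand side to the left-hand side therefore tends to $1$ as
$N \to \infty$, so no larger constant than $6/\pi^2$ can hold universally. Equality
is in fact attained: the limiting sequence, with $\alpha_{2l+1} = 1/(l+1)$ and
$\alpha_n = 0$ for even $n$, satisfies $\sum_n |\alpha_n|^2 = \pi^2/6$, so it belongs to
$\ell^2(\mathbb{C})$, and both sides equal $\pi^2/6$. More generally, equality holds
exactly when each layer $y_k$ is a scalar multiple of $x$, the equality case of Cauchy–Schwarz.&lt;/p&gt;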
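&lt;p&gt;The convergence of the ratio to $1$ can be observed directly. In the sketch below (the name &lt;code&gt;sharpness_ratio&lt;/code&gt; is hypothetical), both sides are evaluated in closed form for the truncated sequence $\alpha^{(N)}$, so the ratio reduces to $\frac{6}{\pi^2}\sum_{l=0}^{N-1}(l+1)^{-2}$.&lt;/p&gt;

```python
import math

def sharpness_ratio(N):
    """Ratio rhs / lhs for the truncated extremal sequence of length N."""
    partial = sum(1 / (l + 1) ** 2 for l in range(N))
    lhs = partial                                # sum of |alpha_n|^2
    rhs = (6 / math.pi ** 2) * partial ** 2      # only the layer k = 0 contributes
    return rhs / lhs

for N in [1, 10, 100, 1000, 100000]:
    print(N, sharpness_ratio(N))
```

&lt;p&gt;The gap $1 - \text{ratio}$ decays like $\frac{6}{\pi^2 N}$, reflecting the $1/N$ tail of the Basel series.&lt;/p&gt;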
&lt;hr&gt;
&lt;h2 class="heading" id="extension-to-ellp"&gt;
 Extension to $\ell^p$&lt;span class="heading__anchor"&gt; &lt;a href="#extension-to-ellp"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The Cauchy–Schwarz inequality used above is a special case of Hölder&amp;rsquo;s inequality,
and the proof generalises immediately.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Ibragimov–Salimova, 2015)&lt;/span&gt;
&lt;p&gt;Let $p, q \in (1,\infty)$ with $\tfrac{1}{p} + \tfrac{1}{q} = 1$. For all
$\alpha = (\alpha_1, \alpha_2, \ldots) \in \ell^p(\mathbb{C})$ and
$x = (x_0, x_1, \ldots) \in \ell^q(\mathbb{C})$,
$$\sum_{n \geq 1} |\alpha_n|^p \geq \left(\sum_{l \geq 0} |x_l|^q\right)^{-p/q}
\sum_{k \geq 0} \left|\sum_{l \geq 0} x_l\, \alpha_{2^k(2l+1)}\right|^p.$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Retkes&amp;rsquo;s original inequality is the case $p = q = 2$ and $x_l = 1/(l+1)$, where
$(\sum_{l\geq 0}|x_l|^2)^{-1} = 6/\pi^2$ by the Basel problem.&lt;/p&gt;
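&lt;p&gt;For completeness, the Hölder step can be sketched as follows. This is a reconstruction along the lines of the $\ell^2$ argument above, not a quotation from the paper, and it assumes $x \neq 0$. For each $k \geq 0$, Hölder&amp;rsquo;s inequality applied to $x$ and the layer $(\alpha_{2^k(2l+1)})_{l \geq 0}$ gives
$$\left|\sum_{l \geq 0} x_l\, \alpha_{2^k(2l+1)}\right|^p \leq \left(\sum_{l \geq 0} |x_l|^q\right)^{p/q} \sum_{l \geq 0} |\alpha_{2^k(2l+1)}|^p.$$
Summing over $k \geq 0$ and using that the sets $A_k$ partition $\mathbb{N}^+$,
$$\sum_{k \geq 0} \left|\sum_{l \geq 0} x_l\, \alpha_{2^k(2l+1)}\right|^p \leq \left(\sum_{l \geq 0} |x_l|^q\right)^{p/q} \sum_{n \geq 1} |\alpha_n|^p,$$
and dividing by the factor $\left(\sum_{l \geq 0} |x_l|^q\right)^{p/q}$ yields the stated inequality.&lt;/p&gt;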
&lt;hr&gt;
&lt;h2 class="heading" id="remarks-on-structure"&gt;
 Remarks on Structure&lt;span class="heading__anchor"&gt; &lt;a href="#remarks-on-structure"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;The role of the dyadic partition.&lt;/strong&gt; The sets $A_k$ are the &lt;em&gt;dyadic layers&lt;/em&gt; of
$\mathbb{N}^+$: each integer sits in exactly one layer determined by its 2-adic
valuation. This structure also appears in the theory of Hardy spaces, where the
dyadic martingale decomposition underpins the $H^1$–BMO duality, and in wavelets,
where the dyadic scaling of the real line organises the multiresolution analysis.
The inequality can be read as a norm comparison between the $\ell^2$ norm and a
weighted sum over dyadic layers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Relation to the Basel problem.&lt;/strong&gt; The constant $6/\pi^2$, the reciprocal of
$\zeta(2)$, appears here because the weight sequence $1/(l+1)$ used in the inner
sum is precisely the harmonic sequence, whose $\ell^2$ norm squared is $\zeta(2)$.
Any other weight sequence $x \in \ell^2(\mathbb{C})$ would produce the analogous
inequality with $\|x\|_2^{-2}$ in place of $6/\pi^2$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The inequality as a rearrangement estimate.&lt;/strong&gt; The right-hand side reorganises the
entries of $\alpha$ by their dyadic layer and applies a weighted average within each
layer. The inequality says the total $\ell^2$ energy cannot be less than $6/\pi^2$
times the energy of this rearranged, averaged version of the sequence, a
quantitative statement about how averaging destroys energy.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="further-questions"&gt;
 Further Questions&lt;span class="heading__anchor"&gt; &lt;a href="#further-questions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;While the original conjecture is settled, several natural variants remain.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #8e44ad; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#8e44ad; font-weight:bold;"&gt;Question 1&lt;/span&gt;
&lt;p&gt;What is the sharp constant in the inequality if the dyadic partition is replaced by
the partition induced by a prime $p \neq 2$, i.e. by the sets
$A_k^{(p)} = \{p^k m : m \geq 1,\ \gcd(m, p) = 1\}$? The same argument applies with
$x_l = w_l$ for any weight sequence $w \in \ell^2(\mathbb{C})$, but the resulting
constant is $\|w\|_2^{-2}$, so it depends on the choice of weight and need not involve $\pi$.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #8e44ad; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#8e44ad; font-weight:bold;"&gt;Question 2&lt;/span&gt;
&lt;p&gt;The inner sum $\sum_{l \geq 0} \alpha_{2^k(2l+1)}/(l+1)$ averages the entries in
layer $A_k$ with the harmonic weights. What happens if the harmonic weight $1/(l+1)$
is replaced by a weight $w(l)$ depending on the position $l$ within the layer in a
more general way, for instance $w(l) = (l+1)^{-s}$ for $s &amp;gt; 1/2$? The sharp constant
would then involve $\zeta(2s)$ instead of $\zeta(2) = \pi^2/6$.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #8e44ad; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#8e44ad; font-weight:bold;"&gt;Question 3&lt;/span&gt;
&lt;p&gt;For $p = 1$ the Ibragimov–Salimova theorem requires $q = \infty$, and the Hölder
inequality takes a different form. Does an analogue of Retkes&amp;rsquo;s inequality hold for
$\alpha \in \ell^1(\mathbb{C})$, and if so, what is the sharp constant?&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Ibragimov, Z. O. &amp;amp; Salimova, D. F. (2015). On an inequality in $\ell_p(\mathbb{C})$ involving Basel problem. &lt;em&gt;Elemente der Mathematik&lt;/em&gt;, &lt;strong&gt;70&lt;/strong&gt;(2), 79–81. &lt;a href="https://ems.press/content/serial-article-files/45532"&gt;https://ems.press/content/serial-article-files/45532&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Retkes, Z. (2012). Inequality for square summable complex series. &lt;em&gt;Open Problem Garden&lt;/em&gt;. &lt;a href="http://www.openproblemgarden.org/op/inequality_for_square_summable_complex_series"&gt;http://www.openproblemgarden.org/op/inequality_for_square_summable_complex_series&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Benko, D. &amp;amp; Molokach, J. (2013). The Basel problem as a rearrangement of series. &lt;em&gt;College Mathematics Journal&lt;/em&gt;, &lt;strong&gt;44&lt;/strong&gt;(3), 171–176.&lt;/li&gt;
&lt;li&gt;Ritelli, D. (2013). Another proof of $\zeta(2) = \pi^2/6$ using double integrals. &lt;em&gt;American Mathematical Monthly&lt;/em&gt;, &lt;strong&gt;120&lt;/strong&gt;(7), 642–645.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Recent Advances in Neural Network Optimization for LLM Training</title><link>https://blog.namln.org/en/posts/llm-optimization-2025-survey/</link><pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/llm-optimization-2025-survey/</guid><description>&lt;p&gt;The optimization landscape for LLM training looks very different from two years
ago. AdamW still dominates production runs, but a wave of research is eroding
that dominance from multiple angles simultaneously: matrix-aware optimizers,
horizon-free schedulers, a sharply revised understanding of µP, and
communication-efficient distributed methods. This post synthesizes 18 recent
papers across five interconnected fronts.&lt;/p&gt;
&lt;p&gt;The unifying thread is an active re-examination of long-held assumptions, from
whether gradient geometry matters, to what µP is actually doing, to whether
weight decay is a regularizer at all.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="1-muon-and-non-euclidean-optimizers"&gt;
 1. Muon and Non-Euclidean Optimizers&lt;span class="heading__anchor"&gt; &lt;a href="#1-muon-and-non-euclidean-optimizers"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Muon&lt;/strong&gt; (&lt;em&gt;&lt;strong&gt;M&lt;/strong&gt;oment&lt;strong&gt;U&lt;/strong&gt;m &lt;strong&gt;O&lt;/strong&gt;rthogonalized by &lt;strong&gt;N&lt;/strong&gt;ewton-Schulz&lt;/em&gt;) applies a
gradient orthogonalization step via a Newton-Schulz iteration before each weight
update. Rather than treating each parameter as an independent scalar (as Adam
does), Muon recognizes that weight matrices have geometric structure and
optimizes them accordingly, performing steepest descent under the &lt;strong&gt;spectral
norm&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The core Newton-Schulz iteration, which runs stably in &lt;code&gt;bfloat16&lt;/code&gt; on tensor
cores, is:&lt;/p&gt;
&lt;p&gt;$$
X \leftarrow aX + b(XX^\top)X + c(XX^\top)^2 X
$$&lt;/p&gt;
&lt;p&gt;with coefficients $a = 3.4445$, $b = -4.7750$, $c = 2.0315$. In PyTorch:&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;newtonschulz5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-7&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;3.4445&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;4.7750&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;2.0315&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bfloat16&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;/=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;G&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;        &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;X&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;A ready-to-use implementation lives at
&lt;a href="https://github.com/KellerJordan/Muon"&gt;KellerJordan/Muon&lt;/a&gt;. Install via:&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pip install git+https://github.com/KellerJordan/Muon&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;Muon is intended for hidden-layer matrix weights only. Embeddings, the output
head, and scalar/vector parameters should still use AdamW:&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;muon&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MuonWithAuxAdam&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;hidden_matrix_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndim&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;embed&amp;#34;&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;embed_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;embed&amp;#34;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scalar_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndim&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;head_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lm_head&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Current README API: one list of param groups, each flagged with use_muon&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;param_groups&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;hidden_matrix_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_muon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.02&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;    &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embed_params&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;scalar_params&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;head_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;         &lt;span class="n"&gt;use_muon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;3e-4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight_decay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MuonWithAuxAdam&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param_groups&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Muon&amp;#39;s LR is meant to have built-in muP scaling, so it should need little retuning as you scale up&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;h3 class="heading" id="scaling-muon-the-moonlight-result"&gt;
 Scaling Muon: the Moonlight result&lt;span class="heading__anchor"&gt; &lt;a href="#scaling-muon-the-moonlight-result"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;MoonshotAI&amp;rsquo;s &lt;strong&gt;Moonlight&lt;/strong&gt; (a Mixture-of-Experts model with 3B activated and 16B total parameters, trained on 5.7T tokens)
provides the strongest evidence yet that Muon scales to real LLM training
(&lt;a href="https://arxiv.org/abs/2502.16982"&gt;arXiv:2502.16982&lt;/a&gt;,
&lt;a href="https://github.com/MoonshotAI/Moonlight"&gt;GitHub&lt;/a&gt;). Two fixes are needed to
make Muon work beyond small scale:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Weight decay:&lt;/strong&gt; without it, weight and output RMS norms grow until they
overflow &lt;code&gt;bfloat16&lt;/code&gt;.&lt;/li&gt;
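&lt;li&gt;&lt;strong&gt;Per-parameter update scale adjustment:&lt;/strong&gt; rescaling each matrix&amp;rsquo;s update so that its RMS matches
that of AdamW, roughly $\sqrt{(1-\beta_1)/(1+\beta_1)} \approx 0.23$ at $\beta_1 = 0.9$. The Moonlight paper does
this by multiplying the orthogonalized update of an $A \times B$ weight by $0.2\sqrt{\max(A, B)}$.&lt;/li&gt;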
&lt;/ol&gt;
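&lt;p&gt;The second fix can be checked by hand. A minimal sketch, assuming the $0.2\sqrt{\max(A, B)}$ rule from the Moonlight paper and an idealised full-rank orthogonalized update (all singular values equal to 1). The helper names and the example shapes are hypothetical and not part of any library:&lt;/p&gt;

```python
import math


def adam_update_rms(beta1):
    """Idealised per-element RMS of an Adam update, sqrt((1 - b1) / (1 + b1))."""
    return math.sqrt((1.0 - beta1) / (1.0 + beta1))


def orthogonal_update_rms(rows, cols):
    """RMS of a full-rank semi-orthogonal rows x cols matrix.

    Its squared Frobenius norm equals min(rows, cols), so the per-element
    RMS is sqrt(min / (rows * cols)) = 1 / sqrt(max(rows, cols)).
    """
    return math.sqrt(min(rows, cols) / (rows * cols))


def moonlight_scale(rows, cols, target_rms=0.2):
    """Multiplier that brings the orthogonalized update to target_rms."""
    return target_rms * math.sqrt(max(rows, cols))


# Hypothetical layer shapes: square, wide, and tall.
for shape in [(896, 896), (896, 4864), (151936, 896)]:
    rms = orthogonal_update_rms(*shape) * moonlight_scale(*shape)
    print(shape, round(rms, 4))  # the scaled RMS no longer depends on shape

print(round(adam_update_rms(0.9), 3))  # compare with the 0.2 target
```

&lt;p&gt;Without the shape factor, wide and tall matrices would receive much smaller per-element updates than square ones under the same learning rate.&lt;/p&gt;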
&lt;p&gt;With these in place, scaling-law experiments indicate roughly &lt;strong&gt;2× computational
efficiency&lt;/strong&gt; compared to AdamW at compute-optimal settings.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Train a Qwen-like dense model with Muon (from Moonlight repo)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;python3 examples/toy_train.py &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --model qwen --optimizer muon &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --dataset openwebtext-100k &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --hidden_size &lt;span class="m"&gt;896&lt;/span&gt; --lr 1e-3&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;A further efficiency variant is
&lt;a href="https://github.com/nil0x9/flash-muon"&gt;Flash-Muon&lt;/a&gt;, which reimplements the
Newton-Schulz inner loop using a custom Triton kernel that exploits the symmetry
of the $XX^\top$ product, roughly halving the FLOPs of that matrix multiplication.&lt;/p&gt;
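&lt;p&gt;The saving comes from $XX^\top$ being symmetric, so only entries on or above the diagonal need to be computed. A pure-Python sketch of the idea that counts multiply-adds (this is not the Flash-Muon Triton kernel, and the function name is hypothetical):&lt;/p&gt;

```python
def gram_upper_then_mirror(x):
    """Compute A = X X^T by filling the upper triangle and mirroring it.

    Returns the matrix and the number of multiply-adds actually performed.
    """
    n, k = len(x), len(x[0])
    a = [[0.0] * n for _ in range(n)]
    macs = 0
    for i in range(n):
        for j in range(i, n):  # only entries on or above the diagonal
            a[i][j] = sum(x[i][t] * x[j][t] for t in range(k))
            a[j][i] = a[i][j]  # symmetry supplies the lower triangle
            macs += k
    return a, macs


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
a, macs = gram_upper_then_mirror(x)
full_macs = len(x) * len(x) * len(x[0])
print(a)
print(macs, "multiply-adds instead of", full_macs)
```

&lt;p&gt;For an $n \times n$ Gram matrix this computes $n(n+1)/2$ of the $n^2$ inner products, a fraction that approaches one half as $n$ grows.&lt;/p&gt;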
&lt;h3 class="heading" id="theoretical-foundations"&gt;
 Theoretical foundations&lt;span class="heading__anchor"&gt; &lt;a href="#theoretical-foundations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Kovalev (2025)&lt;/strong&gt; shows in &lt;em&gt;Understanding Gradient Orthogonalization via
Non-Euclidean Trust-Region Optimization&lt;/em&gt; that the orthogonalized gradient update
can be interpreted as a first-order trust-region method whose trust region is
defined in terms of the matrix spectral norm. This framework unifies Muon with
normalized SGD and signSGD with momentum.&lt;/p&gt;
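&lt;p&gt;To make the unification concrete: the steepest direction inside a unit ball depends on which norm defines the ball. The NumPy sketch below is an illustration only, not code from the paper. Momentum is omitted, an exact SVD stands in for Newton-Schulz, and the function name is hypothetical:&lt;/p&gt;

```python
import numpy as np


def lmo_direction(grad, norm):
    """Unit-norm direction D maximising the inner product with grad.

    The optimizer step is then W = W - lr * D.
    """
    if norm == "frobenius":   # normalized SGD
        return grad / np.linalg.norm(grad)
    if norm == "max_abs":     # signSGD (elementwise max-abs norm)
        return np.sign(grad)
    if norm == "spectral":    # Muon-style orthogonalization
        u, _, vt = np.linalg.svd(grad, full_matrices=False)
        return u @ vt
    raise ValueError(f"unknown norm: {norm}")


grad = np.array([[3.0, 0.0], [4.0, 5.0]])
for norm in ("frobenius", "max_abs", "spectral"):
    print(norm)
    print(lmo_direction(grad, norm))
```

&lt;p&gt;Only the spectral-norm case sets every singular value of the update to 1, which is the orthogonalization that Muon approximates.&lt;/p&gt;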
&lt;p&gt;&lt;strong&gt;Pethick et al. (2025)&lt;/strong&gt; propose &lt;strong&gt;Scion&lt;/strong&gt;, a family of LMO-based algorithms
that subsumes Muon, AdamW, and normalized SGD under a single framework
(&lt;a href="https://arxiv.org/abs/2502.07529"&gt;arXiv:2502.07529&lt;/a&gt;). By choosing an explicit
norm for deep architectures, Scion also achieves hyperparameter transferability
across model widths.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Polar Express&lt;/strong&gt; (Amsel et al., 2025) replaces the fixed Newton-Schulz
polynomial with one that is re-optimized at every iteration: each step solves a minimax
problem to minimize the worst-case error of the polar-factor approximation. It converges
faster than Newton-Schulz in both the early and asymptotic stages, while remaining numerically stable in &lt;code&gt;bfloat16&lt;/code&gt;.&lt;/p&gt;
&lt;h3 class="heading" id="challenging-the-geometric-narrative"&gt;
 Challenging the geometric narrative&lt;span class="heading__anchor"&gt; &lt;a href="#challenging-the-geometric-narrative"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Despite the theoretical appeal, &lt;strong&gt;Shumaylov et al. (2026)&lt;/strong&gt; mount a systematic
challenge in &lt;em&gt;Muon is Not That Special: Random or Inverted Spectra Work Just as
Well&lt;/em&gt;. They introduce:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Freon:&lt;/strong&gt; a family of optimizers based on Schatten (quasi-)norms,
interpolating between SGD and Muon. The best-performing Schatten parameter for
GPT-2 lies in the &lt;em&gt;quasi-norm&lt;/em&gt; regime, which no LMO-based optimizer can
represent.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kaon:&lt;/strong&gt; replaces Muon&amp;rsquo;s singular values with random noise, yet still
matches Muon&amp;rsquo;s validation loss on GPT-2.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Their key insight: performance is primarily controlled by two local quantities,
&lt;em&gt;alignment&lt;/em&gt; (how well the update direction aligns with the gradient) and &lt;em&gt;descent
potential&lt;/em&gt; (step-size optimality). Muon succeeds by guaranteeing step-size
optimality, not by tracking an ideal geometry.&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th style="text-align: left"&gt;Optimizer&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Core mechanism&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Key claim&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Muon&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Newton-Schulz orthogonalization&lt;/td&gt;
					&lt;td style="text-align: left"&gt;~2× efficiency over AdamW at compute-optimal&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Scion&lt;/td&gt;
					&lt;td style="text-align: left"&gt;LMO over norm-ball&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Unifies Muon/Adam; HP transferable across widths&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Polar Express&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Minimax polar decomposition&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Faster convergence; bfloat16-safe&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Freon / Kaon&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Schatten quasi-norms / random SVs&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Geometry is irrelevant; alignment drives performance&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="2-learning-rate-scheduling"&gt;
 2. Learning Rate Scheduling&lt;span class="heading__anchor"&gt; &lt;a href="#2-learning-rate-scheduling"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="linear-decay-is-provably-optimal"&gt;
 Linear decay is provably optimal&lt;span class="heading__anchor"&gt; &lt;a href="#linear-decay-is-provably-optimal"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Defazio et al. (2023/2024)&lt;/strong&gt; close a long-standing gap between theory and
practice in &lt;em&gt;Optimal Linear Decay Learning Rate Schedules and Further
Refinements&lt;/em&gt; (&lt;a href="https://arxiv.org/abs/2310.07831"&gt;arXiv:2310.07831&lt;/a&gt;). Under
worst-case analysis, &lt;strong&gt;linear decay&lt;/strong&gt;, setting $\eta_t \propto (1 - t/T)$, is
the theoretically optimal schedule for a broad class of optimizers including SGD.
Across 10 diverse deep learning problems, it matches or outperforms cosine annealing.&lt;/p&gt;
&lt;p&gt;$$
\eta_t = \eta_{\max} \cdot \left(1 - \frac{t}{T}\right)
$$&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# PyTorch built-in, the optimal default&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;scheduler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lr_scheduler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LinearLR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start_factor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_factor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_iters&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;total_steps&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;h3 class="heading" id="the-wsd-cooldown-phase"&gt;
 The WSD cooldown phase&lt;span class="heading__anchor"&gt; &lt;a href="#the-wsd-cooldown-phase"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Warmup-Stable-Decay (WSD) scheduler separates training into distinct phases
ending in a sharp LR drop. &lt;strong&gt;Dremov et al. (2025)&lt;/strong&gt; analyse the cooldown phase
specifically in &lt;em&gt;Training Dynamics of the Cooldown Stage in WSD&lt;/em&gt;, finding:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Cooldown shapes that balance exploration and exploitation consistently
outperform purely exploratory or exploitative alternatives.&lt;/li&gt;
&lt;li&gt;There is substantial sensitivity to AdamW&amp;rsquo;s $\beta_2$ parameter during
cooldown, and &lt;strong&gt;higher $\beta_2$ values yield consistent improvements&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Loss-landscape visualisations support the &amp;ldquo;river valley&amp;rdquo; perspective: the
cooldown follows a narrow valley in parameter space.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="convex-theory-meets-llm-practice"&gt;
 Convex theory meets LLM practice&lt;span class="heading__anchor"&gt; &lt;a href="#convex-theory-meets-llm-practice"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Schaipp et al. (2025)&lt;/strong&gt; show in &lt;em&gt;The Surprising Agreement Between Convex
Optimization Theory and Learning-Rate Scheduling for Large Model Training&lt;/em&gt; that
schedules for large model training obey performance bounds from non-smooth convex
optimisation. For the constant schedule with linear cooldown, the bound is:&lt;/p&gt;
&lt;p&gt;$$
\bar{f}_T - f^* \leq \frac{\|x_0 - x^*\|^2}{2\eta T} + \frac{\eta}{2} \sum_{t=0}^{T-1} \sigma_t^2
$$&lt;/p&gt;
&lt;p&gt;where the cooldown benefit appears explicitly through the absence of logarithmic
terms. This enables &lt;strong&gt;principled LR transfer&lt;/strong&gt;: exploiting the theory yields
noticeable validation loss improvements for 124M and 210M Llama-type models when
extending schedules for continued training.&lt;/p&gt;
&lt;h3 class="heading" id="anytime-schedules-and-weight-averaging"&gt;
 Anytime schedules and weight averaging&lt;span class="heading__anchor"&gt; &lt;a href="#anytime-schedules-and-weight-averaging"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Meterez et al. (2026)&lt;/strong&gt; prove in &lt;em&gt;Anytime Pretraining: Horizon-Free
Learning-Rate Schedules with Weight Averaging&lt;/em&gt;
(&lt;a href="https://arxiv.org/abs/2602.03702"&gt;arXiv:2602.03702&lt;/a&gt;) that horizon-free (anytime)
schedules exist for overparameterised linear regression, with &lt;strong&gt;weight averaging&lt;/strong&gt;
central to achieving minimax-optimal convergence. For 150M–300M-parameter models trained at
1–32× Chinchilla scale, a constant LR with weight averaging matches well-tuned
cosine decay across the full training duration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Weight averaging is a largely underutilised practical lever. It should be a
default, not an afterthought.&lt;/p&gt;
&lt;/blockquote&gt;
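&lt;p&gt;A running average costs one extra copy of the weights. The sketch below assumes a uniform average over all snapshots seen so far, which is only one possible choice and not necessarily the scheme analysed in the paper. Weights are plain Python lists to keep it self-contained, and the class name is hypothetical:&lt;/p&gt;

```python
class RunningWeightAverage:
    """Uniform running mean of weight snapshots."""

    def __init__(self):
        self.mean = None
        self.count = 0

    def update(self, weights):
        self.count += 1
        if self.mean is None:
            self.mean = list(weights)
            return
        # Incremental mean: m_k = m_{k-1} + (w_k - m_{k-1}) / k
        self.mean = [m + (w - m) / self.count
                     for m, w in zip(self.mean, weights)]


averager = RunningWeightAverage()
for snapshot in ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]):
    averager.update(snapshot)  # call after each optimizer step (or every few)
print(averager.mean)
```

&lt;p&gt;Training continues on the raw weights and the averaged copy is used only for evaluation or checkpointing. In PyTorch, &lt;code&gt;torch.optim.swa_utils.AveragedModel&lt;/code&gt; maintains the same kind of running average for module parameters.&lt;/p&gt;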
&lt;h3 class="heading" id="schedulefree-at-llm-scale"&gt;
 ScheduleFree+ at LLM scale&lt;span class="heading__anchor"&gt; &lt;a href="#schedulefree-at-llm-scale"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Defazio (2026)&lt;/strong&gt; extends schedule-free learning to full LLM pretraining in
&lt;em&gt;ScheduleFree+: Scaling Learning-Rate-Free and Schedule-Free Learning to Large
Language Models&lt;/em&gt; (&lt;a href="https://arxiv.org/abs/2605.19095"&gt;arXiv:2605.19095&lt;/a&gt;).
Practical fixes for large batch and model sizes enable ScheduleFree+ to achieve
a &lt;strong&gt;31% improvement&lt;/strong&gt; over WSD schedules at 1000 tokens per parameter, while
also providing a theoretical foundation for checkpoint merging during pretraining.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pip install schedulefree&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;
&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;schedulefree&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AdamWScheduleFree&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AdamWScheduleFree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;warmup_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Must switch to eval mode before evaluation&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;val_loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;GitHub: &lt;a href="https://github.com/facebookresearch/schedule_free"&gt;facebookresearch/schedule_free&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="3-hyperparameter-transfer-and-scaling-laws-µp"&gt;
 3. Hyperparameter Transfer and Scaling Laws (µP)&lt;span class="heading__anchor"&gt; &lt;a href="#3-hyperparameter-transfer-and-scaling-laws-%c2%b5p"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="weight-decay-as-the-true-driver-of-lr-transfer"&gt;
 Weight decay as the true driver of LR transfer&lt;span class="heading__anchor"&gt; &lt;a href="#weight-decay-as-the-true-driver-of-lr-transfer"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Maximal Update Parameterisation (µP) is widely used to transfer optimal
learning rates from proxy models to large ones without re-tuning. &lt;strong&gt;Kosson et al.
(2025/2026)&lt;/strong&gt;, accepted to ICLR 2026, provide a large-scale empirical refutation
of the standard µP narrative in &lt;em&gt;Weight Decay May Matter More than µP for
Learning Rate Transfer in Practice&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Their finding: µP&amp;rsquo;s geometric alignment assumptions, which require alignment
between a layer&amp;rsquo;s inputs, weights, and gradient updates, hold only &lt;strong&gt;briefly at
the start of training&lt;/strong&gt;. For the remainder, it is &lt;strong&gt;weight decay&lt;/strong&gt; that
stabilises update dynamics across widths and facilitates LR transfer. This
implies µP&amp;rsquo;s scaling primarily acts as an implicit warmup, and can be largely
replaced by modified warmup schedules.&lt;/p&gt;
&lt;h3 class="heading" id="embedding-layer-lr-as-the-key-factor"&gt;
 Embedding layer LR as the key factor&lt;span class="heading__anchor"&gt; &lt;a href="#embedding-layer-lr-as-the-key-factor"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Kalra &amp;amp; Barkeshli (2026)&lt;/strong&gt; provide complementary evidence in &lt;em&gt;Quantifying
Hyperparameter Transfer and the Importance of Embedding Layer Learning Rate&lt;/em&gt;,
tracing µP&amp;rsquo;s advantage over standard parameterisation (SP) to a single factor:
the &lt;strong&gt;embedding layer learning rate&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In SP, the embedding LR acts as a training bottleneck. Simply increasing it by a
factor of model width, matching µP, eliminates most of the gap. Three
quantitative metrics are used: quality of scaling law fit, robustness to
extrapolation errors, and asymptotic loss penalty.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;span class="lnt"&gt;9
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Simple fix that captures most of µP&amp;#39;s benefit in SP&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;embed_lr_multiplier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model_width&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;base_width&lt;/span&gt; &lt;span class="c1"&gt;# = d_model / d_model_proxy&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
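&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;non_embed_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;embed.&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;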
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;param_groups&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;params&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;lr&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;base_lr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;embed_lr_multiplier&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;params&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;non_embed_params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;lr&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;base_lr&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AdamW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param_groups&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight_decay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;&lt;strong&gt;Open question:&lt;/strong&gt; Kosson et al. argue µP acts as an implicit warmup; Kalra &amp;amp;
Barkeshli argue it is about the embedding LR. Both contradict µP&amp;rsquo;s original
geometric motivation. No consensus has emerged, and the practical implications
differ significantly.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="4-normalization-weight-decay-and-variance-reduction"&gt;
 4. Normalization, Weight Decay, and Variance Reduction&lt;span class="heading__anchor"&gt; &lt;a href="#4-normalization-weight-decay-and-variance-reduction"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="the-end-of-training-gradient-spike"&gt;
 The end-of-training gradient spike&lt;span class="heading__anchor"&gt; &lt;a href="#the-end-of-training-gradient-spike"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Defazio (2025)&lt;/strong&gt; identifies a subtle pathology in &lt;em&gt;Why Gradients Rapidly
Increase Near the End of Training&lt;/em&gt;: gradient norms spike sharply near the end of
long LLM runs. The diagnosis is a three-way interaction between &lt;strong&gt;weight decay&lt;/strong&gt;,
&lt;strong&gt;normalisation layers&lt;/strong&gt;, and the &lt;strong&gt;LR schedule&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;When a layer is followed by normalisation, its scale becomes irrelevant to the
forward pass, but weight decay continues shrinking the parameters. This creates
an implicit competition between the optimizer&amp;rsquo;s effective update size and
normalisation rescaling, causing gradient norms to grow unchecked as the LR
decays.&lt;/p&gt;
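&lt;p&gt;The scale-invariance half of this argument can be checked numerically. The sketch below is illustrative and not taken from the paper: &lt;code&gt;normalised_loss&lt;/code&gt; is a hypothetical toy loss that depends only on the direction of the weights, standing in for a layer followed by normalisation. Halving the weights (as weight decay gradually would) leaves the loss unchanged but doubles the gradient norm.&lt;/p&gt;

```python
import math


def normalised_loss(w, target):
    """Toy scale-invariant loss: depends only on the direction w / ||w||."""
    norm = math.sqrt(sum(x * x for x in w))
    unit = [x / norm for x in w]
    return sum((u - t) ** 2 for u, t in zip(unit, target))


def grad_norm(w, target, eps=1e-6):
    """Central finite-difference estimate of the gradient norm at w."""
    total = 0.0
    for i in range(len(w)):
        up = list(w)
        up[i] += eps
        down = list(w)
        down[i] -= eps
        g = (normalised_loss(up, target) - normalised_loss(down, target)) / (2 * eps)
        total += g * g
    return math.sqrt(total)


target = [0.6, 0.8, 0.0]       # hypothetical target direction
w = [1.0, 2.0, -0.5]           # hypothetical weights
shrunk = [0.5 * x for x in w]  # same direction, half the norm

loss_ratio = normalised_loss(shrunk, target) / normalised_loss(w, target)
grad_ratio = grad_norm(shrunk, target) / grad_norm(w, target)
# loss_ratio is 1 (the forward pass ignores scale); grad_ratio is close to 2.
print(loss_ratio, grad_ratio)
```

&lt;p&gt;For any scale-invariant loss the gradient norm is inversely proportional to the weight norm, so anything that keeps shrinking the weights keeps inflating the gradients.&lt;/p&gt;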
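&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; disable weight decay for AdamW-updated layers in architectures where
those layers are directly followed by normalisation (e.g. every transformer
block). Per-parameter-group weight decay is the mechanism. The snippet below
exempts norm gains, embeddings, and 1-D parameters; to exempt further layers,
route their parameters into &lt;code&gt;no_wd&lt;/code&gt; as well:&lt;/p&gt;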

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;no_wd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;named_parameters&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;norm&amp;#34;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;embed&amp;#34;&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndim&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;no_wd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;param&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
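&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Decay only the remaining weight matrices; norm gains, embeddings and 1-D params are exempt.&lt;/span&gt;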
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AdamW&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;params&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;wd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;weight_decay&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;params&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;no_wd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;weight_decay&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;3e-4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;This simultaneously eliminates the spike and reduces loss throughout training.
The analysis explains why weight decay should be disabled for AdamW-updated
layers in architectures like modded-nanoGPT.&lt;/p&gt;
&lt;h3 class="heading" id="weight-normalisation-as-an-alternative"&gt;
 Weight normalisation as an alternative&lt;span class="heading__anchor"&gt; &lt;a href="#weight-normalisation-as-an-alternative"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Nemotron-Flash&lt;/strong&gt; (Fu et al., NeurIPS 2025) investigates weight
normalisation as a practical mechanism in small language models, finding that it
enables more effective weight updates and improves final convergence. Weight
normalisation sidesteps the weight-decay/normalisation interaction described
above, though at the cost of slightly worse final loss compared to a well-tuned
baseline.&lt;/p&gt;
&lt;h3 class="heading" id="mars-variance-reduction-meets-preconditioned-gradients"&gt;
 MARS: variance reduction meets preconditioned gradients&lt;span class="heading__anchor"&gt; &lt;a href="#mars-variance-reduction-meets-preconditioned-gradients"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Despite decades of theoretical work, variance reduction has largely failed to
yield practical gains in deep learning. &lt;strong&gt;Yuan et al. (2024/2025)&lt;/strong&gt; attempt to
change this in &lt;em&gt;MARS: Unleashing the Power of Variance Reduction for Training
Large Models&lt;/em&gt;, proposing a unified framework that reconciles AdamW, Lion, and
Shampoo with variance reduction via a &lt;strong&gt;scaled stochastic recursive momentum&lt;/strong&gt;
technique.&lt;/p&gt;
&lt;p&gt;The GPT-2 training results look strong. However, the comprehensive benchmark by
&lt;strong&gt;Semenov et al. (2025)&lt;/strong&gt;, &lt;em&gt;Benchmarking Optimizers for Large Language Model
Pretraining&lt;/em&gt;, a 73-page study with 44 figures and 48 tables covering
standardised scenarios, reveals that &lt;strong&gt;MARS does not work well with small batch
sizes&lt;/strong&gt;, which limits its practical applicability in memory-constrained settings.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This underscores the danger of evaluating optimizers on a single benchmark
setup: MARS looks excellent at the batch sizes used in the original paper and
brittle elsewhere.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="5-distributed-training-diloco-and-its-descendants"&gt;
 5. Distributed Training: DiLoCo and Its Descendants&lt;span class="heading__anchor"&gt; &lt;a href="#5-distributed-training-diloco-and-its-descendants"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;DiLoCo (Distributed Low-Communication training) uses AdamW as an &lt;em&gt;inner&lt;/em&gt;
optimizer for $H$ local steps on each worker (typically $H = 500$), then
synchronises by applying Nesterov momentum to the &lt;strong&gt;pseudo-gradient&lt;/strong&gt;: the
net parameter change each worker accumulated over those inner steps, averaged
across workers. Because workers communicate once per $H$ steps instead of every
step, communication frequency drops by up to 500×.&lt;/p&gt;
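&lt;p&gt;The round structure described above can be sketched in plain Python on a toy problem. Everything here is an illustrative assumption rather than the reference implementation: each worker minimises a simple quadratic, plain SGD stands in for the AdamW inner optimizer, and the helper names (&lt;code&gt;inner_steps&lt;/code&gt;, &lt;code&gt;diloco_round&lt;/code&gt;) and the short 50-step inner loop are hypothetical. The outer step follows the Nesterov update rule of &lt;code&gt;torch.optim.SGD&lt;/code&gt; with &lt;code&gt;nesterov=True&lt;/code&gt;.&lt;/p&gt;

```python
def inner_steps(theta, data_mean, num_steps, lr):
    """Plain SGD (a stand-in for AdamW) on the quadratic 0.5 * (theta - data_mean)^2."""
    theta = list(theta)
    for _ in range(num_steps):
        theta = [t - lr * (t - m) for t, m in zip(theta, data_mean)]
    return theta


def diloco_round(global_theta, momentum_buf, worker_means, num_inner_steps=50,
                 inner_lr=0.05, outer_lr=0.7, outer_momentum=0.9):
    # 1. Every worker starts from the shared parameters and trains locally.
    local = [inner_steps(global_theta, m, num_inner_steps, inner_lr)
             for m in worker_means]
    # 2. Pseudo-gradient: (start - end) of the local run, averaged over workers.
    k = len(local)
    pseudo_grad = [sum(global_theta[i] - w[i] for w in local) / k
                   for i in range(len(global_theta))]
    # 3. Outer Nesterov momentum step applied to the pseudo-gradient.
    new_buf = [outer_momentum * b + g for b, g in zip(momentum_buf, pseudo_grad)]
    step = [g + outer_momentum * b for g, b in zip(pseudo_grad, new_buf)]
    new_theta = [t - outer_lr * s for t, s in zip(global_theta, step)]
    return new_theta, new_buf


worker_means = [[1.0, -2.0], [3.0, 0.0]]  # each worker sees different data
theta, buf = [10.0, 10.0], [0.0, 0.0]
for _ in range(30):
    theta, buf = diloco_round(theta, buf, worker_means)
print(theta)  # approaches the average optimum [2.0, -1.0]
```

&lt;p&gt;Only step 2 needs communication, which is why the number of synchronisations scales with the number of rounds and not with the number of inner steps.&lt;/p&gt;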
&lt;h3 class="heading" id="opendiloco-the-open-source-foundation"&gt;
 OpenDiLoCo: the open-source foundation&lt;span class="heading__anchor"&gt; &lt;a href="#opendiloco-the-open-source-foundation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;PrimeIntellect&amp;rsquo;s
&lt;a href="https://github.com/PrimeIntellect-ai/OpenDiloco"&gt;OpenDiLoCo&lt;/a&gt; provides a
reproducible drop-in implementation, demonstrated training across two continents
and three countries with 90–95% compute utilisation. It later served as the
foundation for INTELLECT-1, a 10B-parameter model trained globally.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;functools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;open_diloco.hivemind_diloco&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DiLoCoOptimizer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;inner_optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AdamW&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;4e-4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;outer_optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;optim&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SGD&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;momentum&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nesterov&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
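&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Assumes torch is imported and that dht (a hivemind DHT instance) and model already exist.&lt;/span&gt;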
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DiLoCoOptimizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;dht&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dht&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;num_inner_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# sync every 500 steps, 500× fewer communications&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;inner_optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;inner_optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;outer_optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;outer_optimizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;h3 class="heading" id="why-diloco-works-on-a-single-node-snoo"&gt;
 Why DiLoCo works on a single node: SNOO&lt;span class="heading__anchor"&gt; &lt;a href="#why-diloco-works-on-a-single-node-snoo"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Kallusky et al. (2025)&lt;/strong&gt; show in &lt;em&gt;SNOO: Step-K Nesterov Outer Optimizer&lt;/em&gt; that
DiLoCo&amp;rsquo;s effectiveness, even on a single node, stems from applying &lt;strong&gt;Nesterov
momentum to the pseudo-gradient&lt;/strong&gt;. Their method isolates this as a standalone
Lookahead variant. Results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;1.5–2.5× FLOP efficiency&lt;/strong&gt; gains at scales up to $10^{23}$ training FLOPs.&lt;/li&gt;
&lt;li&gt;Improvements &lt;em&gt;increase&lt;/em&gt; with model size.&lt;/li&gt;
&lt;li&gt;Compatible with both AdamW and Muon as inner optimizers.&lt;/li&gt;
&lt;li&gt;Minimal memory overhead.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Single-worker DiLoCo achieves speedups of up to &lt;strong&gt;6.32%&lt;/strong&gt; in steps-to-loss
over AdamW on a 160M Llama model.&lt;/p&gt;
&lt;h3 class="heading" id="smoothing-diloco-generalized-primal-averaging-gpa"&gt;
 Smoothing DiLoCo: Generalized Primal Averaging (GPA)&lt;span class="heading__anchor"&gt; &lt;a href="#smoothing-diloco-generalized-primal-averaging-gpa"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Defazio et al. (2025/2026)&lt;/strong&gt; propose &lt;strong&gt;GPA&lt;/strong&gt; in &lt;em&gt;Smoothing DiLoCo with Primal
Averaging for Faster Training of LLMs&lt;/em&gt;
(&lt;a href="https://arxiv.org/abs/2512.17131"&gt;arXiv:2512.17131&lt;/a&gt;), which decouples
DiLoCo&amp;rsquo;s interpolation constants to enable smooth iterate averaging at every
step, replacing uniform averaging with exponential moving averaging.&lt;/p&gt;
&lt;p&gt;GPA unifies single-worker DiLoCo and ScheduleFree within a single non-distributed
framework. Speedups over AdamW in steps-to-target-loss:&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th style="text-align: left"&gt;Model&lt;/th&gt;
					&lt;th style="text-align: right"&gt;Speedup&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Llama-160M&lt;/td&gt;
					&lt;td style="text-align: right"&gt;8.71%&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Llama-1B&lt;/td&gt;
					&lt;td style="text-align: right"&gt;10.13%&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Llama-8B&lt;/td&gt;
					&lt;td style="text-align: right"&gt;9.58%&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 class="heading" id="streaming-diloco-towards-free-distributed-training"&gt;
 Streaming DiLoCo: towards free distributed training&lt;span class="heading__anchor"&gt; &lt;a href="#streaming-diloco-towards-free-distributed-training"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Douillard et al. (2025)&lt;/strong&gt; address the remaining bottleneck in &lt;em&gt;Streaming
DiLoCo with Overlapping Communication: Towards a Distributed Free Lunch&lt;/em&gt;
(&lt;a href="https://arxiv.org/abs/2501.18512"&gt;arXiv:2501.18512&lt;/a&gt;): even with infrequent
synchronisation, each sync exchanges all parameters simultaneously. Three fixes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Streaming sync:&lt;/strong&gt; synchronise only subsets of parameters at a time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Overlapping communication:&lt;/strong&gt; continue training during synchronisation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Quantisation:&lt;/strong&gt; reduce cross-worker data to fewer bits.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Together, these reduce the required bandwidth by &lt;strong&gt;two orders of magnitude&lt;/strong&gt; while
maintaining comparable quality at billion-parameter scale.&lt;/p&gt;
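&lt;p&gt;The first of the three fixes, streaming sync, amounts to a staggered schedule over parameter fragments. The sketch below shows one plausible staggering and makes no claim to reproduce the paper&amp;rsquo;s exact schedule: the helper &lt;code&gt;fragments_due&lt;/code&gt;, the layer names, the strided fragment assignment, and the small values of &lt;code&gt;num_fragments&lt;/code&gt; and &lt;code&gt;sync_every&lt;/code&gt; are all hypothetical.&lt;/p&gt;

```python
def fragments_due(step, num_fragments, sync_every):
    """Return the fragment indices whose turn it is to synchronise at this step."""
    stride = sync_every // num_fragments
    return [p for p in range(num_fragments)
            if step % sync_every == (p * stride) % sync_every]


layers = ['block0', 'block1', 'block2', 'block3', 'block4', 'block5']
num_fragments, sync_every = 3, 12
# Strided assignment: fragment p owns layers p, p + num_fragments, ...
fragment_layers = [layers[p::num_fragments] for p in range(num_fragments)]

schedule = {}
for step in range(1, 25):
    due = fragments_due(step, num_fragments, sync_every)
    if due:
        schedule[step] = [fragment_layers[p] for p in due]

for step, frags in schedule.items():
    print(step, frags)
```

&lt;p&gt;Each fragment is still synchronised once every &lt;code&gt;sync_every&lt;/code&gt; steps, but no single synchronisation moves more than one fragment, so peak bandwidth falls by roughly the number of fragments before overlap and quantisation are applied.&lt;/p&gt;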
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th style="text-align: left"&gt;Method&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Setting&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Key contribution&lt;/th&gt;
					&lt;th style="text-align: left"&gt;Gain&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;SNOO&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Single-node&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Nesterov momentum on pseudo-gradient&lt;/td&gt;
					&lt;td style="text-align: left"&gt;1.5–2.5× FLOP efficiency&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;GPA&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Single-node&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Smooth iterate averaging; unifies DiLoCo + ScheduleFree&lt;/td&gt;
					&lt;td style="text-align: left"&gt;~9–10% steps-to-loss speedup&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td style="text-align: left"&gt;Streaming DiLoCo&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Distributed&lt;/td&gt;
					&lt;td style="text-align: left"&gt;Streaming sync + overlapping communication + quantisation&lt;/td&gt;
					&lt;td style="text-align: left"&gt;~100× bandwidth reduction&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="6-cross-cutting-themes-and-open-questions"&gt;
 6. Cross-Cutting Themes and Open Questions&lt;span class="heading__anchor"&gt; &lt;a href="#6-cross-cutting-themes-and-open-questions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Several recurrent tensions emerge from reading these papers together.&lt;/p&gt;
&lt;h3 class="heading" id="geometry-vs-step-size-calibration-in-muon"&gt;
 Geometry vs. step-size calibration in Muon&lt;span class="heading__anchor"&gt; &lt;a href="#geometry-vs-step-size-calibration-in-muon"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Kovalev, Pethick et al., and Amsel et al. offer geometric explanations for
Muon&amp;rsquo;s success. Shumaylov et al. argue that geometry is practically irrelevant
and step-size optimality is the true driver. Which narrative guides future
research matters: geometry points toward more sophisticated matrix norms; the
step-size interpretation suggests much simpler paths to similar gains.&lt;/p&gt;
&lt;h3 class="heading" id="what-µp-is-actually-doing"&gt;
 What µP is actually doing&lt;span class="heading__anchor"&gt; &lt;a href="#what-%c2%b5p-is-actually-doing"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Kosson et al. argue µP is primarily an implicit warmup mechanism. Kalra &amp;amp;
Barkeshli argue it is essentially about the embedding layer LR. Both stand in
contrast to µP&amp;rsquo;s original geometric motivation. The practical stakes are high:
the warmup interpretation suggests µP can be discarded with a schedule change;
the embedding LR interpretation suggests a single-line fix.&lt;/p&gt;
&lt;h3 class="heading" id="weight-decay-as-a-multi-role-hyperparameter"&gt;
 Weight decay as a multi-role hyperparameter&lt;span class="heading__anchor"&gt; &lt;a href="#weight-decay-as-a-multi-role-hyperparameter"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Weight decay appears as a protagonist in three independent stories in this
survey:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Defazio:&lt;/strong&gt; source of end-of-training gradient spikes via interaction with
normalisation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kosson et al.:&lt;/strong&gt; the true driver of LR transfer, not µP geometry.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kalra &amp;amp; Barkeshli:&lt;/strong&gt; improves scaling law fits but &lt;em&gt;hurts&lt;/em&gt; extrapolation
robustness.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is no longer tenable to treat weight decay as a simple regulariser with a
sensible default. It must be understood per-layer and in interaction with your
normalisation strategy.&lt;/p&gt;
&lt;h3 class="heading" id="diloco-as-the-practical-distributed-optimizer"&gt;
 DiLoCo as the practical distributed optimizer&lt;span class="heading__anchor"&gt; &lt;a href="#diloco-as-the-practical-distributed-optimizer"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Despite a large body of research on distributed optimizers, DiLoCo and its
derivatives appear to be the only methods that consistently add value beyond
simply scaling the batch size. The finding that its benefits carry over to
single-node settings (via SNOO and GPA) makes it a particularly important line
of work for practitioners at all scales.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="practical-recommendations-for-2026"&gt;
 Practical Recommendations for 2026&lt;span class="heading__anchor"&gt; &lt;a href="#practical-recommendations-for-2026"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Based on the convergence of evidence across these papers, for a new large
training run consider:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Optimizer:&lt;/strong&gt; Muon for hidden-layer matrix weights + AdamW for
embeddings/head. The Moonlight scaling fixes (weight decay + update scale
adjustment) are necessary above ~1B parameters.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Schedule:&lt;/strong&gt; ScheduleFree+ or linear decay instead of cosine. If you need a
fixed-horizon schedule, use WSD with a higher $\beta_2$ during cooldown.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Weight decay:&lt;/strong&gt; Disable it for layers directly followed by normalisation to
avoid end-of-training gradient spikes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Outer optimizer:&lt;/strong&gt; Wrap your training loop with single-worker DiLoCo (SNOO
or GPA) for a ~9% efficiency gain with no architectural changes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;µP alternatives:&lt;/strong&gt; Before adopting full µP overhead, try increasing the
embedding layer LR by a factor of $d_{\text{model}} / d_{\text{proxy}}$.
This may reproduce most of the benefit.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;None of these require fundamental architectural changes.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;#&lt;/th&gt;
					&lt;th&gt;Paper&lt;/th&gt;
					&lt;th&gt;Venue&lt;/th&gt;
					&lt;th&gt;Links&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;1&lt;/td&gt;
					&lt;td&gt;Jordan et al. (2024): &lt;em&gt;Muon: An optimizer for hidden layers&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://kellerjordan.github.io/posts/muon/"&gt;blog&lt;/a&gt; · &lt;a href="https://github.com/KellerJordan/Muon"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2&lt;/td&gt;
					&lt;td&gt;Liu et al. (2025): &lt;em&gt;Muon is Scalable for LLM Training&lt;/em&gt; (Moonlight)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2502.16982"&gt;arXiv:2502.16982&lt;/a&gt; · &lt;a href="https://github.com/MoonshotAI/Moonlight"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;3&lt;/td&gt;
					&lt;td&gt;Kovalev (2025): &lt;em&gt;Understanding Gradient Orthogonalization&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;4&lt;/td&gt;
					&lt;td&gt;Pethick et al. (2025): &lt;em&gt;Training Deep Learning Models with Norm-Constrained LMOs&lt;/em&gt; (Scion)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2502.07529"&gt;arXiv:2502.07529&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;5&lt;/td&gt;
					&lt;td&gt;Amsel et al. (2025): &lt;em&gt;The Polar Express&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;6&lt;/td&gt;
					&lt;td&gt;Shumaylov et al. (2026): &lt;em&gt;Muon is Not That Special&lt;/em&gt; (Freon/Kaon)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;7&lt;/td&gt;
					&lt;td&gt;Defazio et al. (2023): &lt;em&gt;Optimal Linear Decay Learning Rate Schedules&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2310.07831"&gt;arXiv:2310.07831&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;8&lt;/td&gt;
					&lt;td&gt;Dremov et al. (2025): &lt;em&gt;Training Dynamics of the Cooldown Stage in WSD&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;9&lt;/td&gt;
					&lt;td&gt;Schaipp et al. (2025): &lt;em&gt;Surprising Agreement Between Convex Theory and LR Scheduling&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;10&lt;/td&gt;
					&lt;td&gt;Meterez et al. (2026): &lt;em&gt;Anytime Pretraining&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2602.03702"&gt;arXiv:2602.03702&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;11&lt;/td&gt;
					&lt;td&gt;Defazio (2026): &lt;em&gt;ScheduleFree+&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2605.19095"&gt;arXiv:2605.19095&lt;/a&gt; · &lt;a href="https://github.com/facebookresearch/schedule_free"&gt;GitHub&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;12&lt;/td&gt;
					&lt;td&gt;Kosson et al. (2026): &lt;em&gt;Weight Decay May Matter More than µP&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;ICLR 2026&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;13&lt;/td&gt;
					&lt;td&gt;Kalra &amp;amp; Barkeshli (2026): &lt;em&gt;Quantifying HP Transfer and Embedding LR&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;14&lt;/td&gt;
					&lt;td&gt;Defazio (2025): &lt;em&gt;Why Gradients Rapidly Increase Near End of Training&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;15&lt;/td&gt;
					&lt;td&gt;Fu et al. (2025): &lt;em&gt;Nemotron-Flash&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;NeurIPS 2025&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;16&lt;/td&gt;
					&lt;td&gt;Yuan et al. (2025): &lt;em&gt;MARS&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;17&lt;/td&gt;
					&lt;td&gt;Semenov et al. (2025): &lt;em&gt;Benchmarking Optimizers for LLM Pretraining&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;18&lt;/td&gt;
					&lt;td&gt;Kallusky et al. (2025): &lt;em&gt;SNOO&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;19&lt;/td&gt;
					&lt;td&gt;Defazio et al. (2026): &lt;em&gt;Smoothing DiLoCo with Primal Averaging (GPA)&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2512.17131"&gt;arXiv:2512.17131&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;20&lt;/td&gt;
					&lt;td&gt;Douillard et al. (2025): &lt;em&gt;Streaming DiLoCo&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2501.18512"&gt;arXiv:2501.18512&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;21&lt;/td&gt;
					&lt;td&gt;Douillard et al. (2023/2024): &lt;em&gt;DiLoCo&lt;/em&gt; (original)&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://arxiv.org/abs/2311.08105"&gt;arXiv:2311.08105&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;22&lt;/td&gt;
					&lt;td&gt;PrimeIntellect AI (2024): &lt;em&gt;OpenDiLoCo&lt;/em&gt;&lt;/td&gt;
					&lt;td&gt;n/a&lt;/td&gt;
					&lt;td&gt;&lt;a href="https://github.com/PrimeIntellect-ai/OpenDiloco"&gt;GitHub&lt;/a&gt; · &lt;a href="https://www.primeintellect.ai/blog/opendiloco"&gt;blog&lt;/a&gt;&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;</description></item><item><title>The Invariant Subspace Problem</title><link>https://blog.namln.org/en/posts/invariant-subspace-problem/</link><pubDate>Thu, 28 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/invariant-subspace-problem/</guid><description>&lt;p&gt;Few questions in functional analysis have attracted sustained attention across
as many decades as this one. It sits at the confluence of operator theory,
spectral theory, and complex analysis, and every partial result has opened new
territory rather than narrowing the problem to a routine case.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Problem (Invariant Subspace Problem)&lt;/span&gt;
&lt;p&gt;Does every bounded linear operator $T$ on an infinite-dimensional separable
complex Hilbert space $\mathcal{H}$ have a &lt;strong&gt;non-trivial closed invariant subspace&lt;/strong&gt;?&lt;/p&gt;
&lt;p&gt;That is, does there always exist a closed subspace $\mathcal{M} \subsetneq \mathcal{H}$
with $\mathcal{M} \neq \{0\}$ such that $T\mathcal{M} \subseteq \mathcal{M}$?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;medium importance&lt;/em&gt; on the
&lt;a href="http://www.openproblemgarden.org/op/invariant_subspace_problem"&gt;Open Problem Garden&lt;/a&gt;.
It is old enough to have accumulated a rich history of partial results, yet still
open in the Hilbert space setting after more than seventy years.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="trivial-observations-and-why-they-run-out"&gt;
 Trivial Observations and Why They Run Out&lt;span class="heading__anchor"&gt; &lt;a href="#trivial-observations-and-why-they-run-out"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Two subspaces are always invariant: $\{0\}$ and $\mathcal{H}$ itself. These are the
&lt;em&gt;trivial&lt;/em&gt; invariant subspaces; the problem asks whether anything else must exist.&lt;/p&gt;
&lt;p&gt;On finite-dimensional spaces the answer is immediate: every operator on $\mathbb{C}^n$
with $n \geq 2$ has an eigenvector (by the fundamental theorem of algebra applied to the
characteristic polynomial), and the span of any eigenvector is a one-dimensional, hence
non-trivial, invariant subspace. This argument fails completely in infinite dimensions,
where the spectrum need not contain any eigenvalues and eigenvectors need not exist.&lt;/p&gt;
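&lt;p&gt;As a minimal illustration of how the eigenvector argument breaks down (a standard textbook example, not specific to this post), consider the unilateral shift $S$ on $\ell^2$, $S(x_0, x_1, x_2, \dots) = (0, x_0, x_1, \dots)$. If $Sx = \lambda x$, comparing coordinates gives
$$0 = \lambda x_0, \qquad x_n = \lambda x_{n+1} \quad (n \geq 0).$$
For $\lambda = 0$ the second relation forces $x = 0$ at once; for $\lambda \neq 0$ the first gives $x_0 = 0$ and then $x_{n+1} = x_n/\lambda = 0$ inductively. So $S$ has no eigenvectors at all, although it still has many invariant subspaces, for example $\{x \in \ell^2 : x_0 = 0\}$.&lt;/p&gt;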
&lt;p&gt;On non-separable Hilbert spaces the problem is also trivial but for a different reason:
for any non-zero vector $x \in \mathcal{H}$, the closed linear span
$\overline{\operatorname{span}\{T^n x : n \geq 0\}}$ is a non-zero closed invariant subspace.
It is separable, being the closed span of a countable set, so if $\mathcal{H}$ is
non-separable it cannot equal all of $\mathcal{H}$.
So the problem is genuinely about &lt;strong&gt;separable&lt;/strong&gt; spaces.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="landscape-of-known-results"&gt;
 Landscape of Known Results&lt;span class="heading__anchor"&gt; &lt;a href="#landscape-of-known-results"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="positive-results-classes-with-invariant-subspaces"&gt;
 Positive Results: Classes with Invariant Subspaces&lt;span class="heading__anchor"&gt; &lt;a href="#positive-results-classes-with-invariant-subspaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Aronszajn–Smith, 1954)&lt;/span&gt;
&lt;p&gt;Every compact operator on a Banach space of dimension greater than one has a
non-trivial closed invariant subspace.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The compact case was already known to von Neumann in the 1930s for Hilbert spaces,
but was never published; Aronszajn and Smith gave the first published proof, extended
to Banach spaces. The key idea is to compress the operator to an increasing sequence of
finite-dimensional subspaces, each compression having invariant subspaces, and to use
compactness in a limiting argument that produces an invariant subspace for the original operator.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Lomonosov, 1973)&lt;/span&gt;
&lt;p&gt;If a bounded operator $T$ on a Banach space is not a scalar multiple of the identity
and commutes with a non-zero compact operator, then $T$ has a non-trivial closed
hyperinvariant subspace (a subspace invariant under every operator that commutes with $T$).&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Lomonosov&amp;rsquo;s proof is strikingly short, less than a page, and uses the
Schauder fixed-point theorem in an unexpected way. It subsumes both the compact
case (a compact operator commutes with itself) and the polynomially compact case
(if $p(T)$ is compact and non-zero, then $T$ commutes with the compact operator $p(T)$).
For several years it seemed that Lomonosov&amp;rsquo;s theorem might resolve the problem
entirely, until Hadwin, Nordgren, Radjavi, and Rosenthal (1980) exhibited an
operator that does not commute with any non-zero compact operator yet still has
invariant subspaces.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Brown, 1987)&lt;/span&gt;
&lt;p&gt;Every subnormal operator on a Hilbert space has a non-trivial invariant subspace.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;An operator $T$ is &lt;em&gt;subnormal&lt;/em&gt; if it is the restriction of a normal operator on a
larger Hilbert space. Normal operators are handled by the spectral theorem, which
produces a rich lattice of invariant subspaces; subnormal operators inherit
invariant subspaces by restriction. Brown&amp;rsquo;s proof uses techniques from rational
approximation theory (the solution of the Halmos problem on subnormal operators).&lt;/p&gt;
&lt;p&gt;Beyond these landmark theorems, invariant subspaces are also known for:
hyponormal operators with some additional conditions, operators whose spectrum has
interior points, operators satisfying growth conditions on the resolvent, and
polynomially bounded operators with spectrum containing the unit circle under
further constraints (Liu, 2017; Réjasse, 2023).&lt;/p&gt;
&lt;h3 class="heading" id="beurlings-theorem-a-complete-classification"&gt;
 Beurling&amp;rsquo;s Theorem: A Complete Classification&lt;span class="heading__anchor"&gt; &lt;a href="#beurlings-theorem-a-complete-classification"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Theorem (Beurling, 1949)&lt;/span&gt;
&lt;p&gt;The non-zero closed invariant subspaces of the unilateral shift $S : H^2(\mathbb{D}) \to H^2(\mathbb{D})$,
$(Sf)(z) = zf(z)$, are exactly the subspaces of the form $\varphi H^2(\mathbb{D})$
where $\varphi$ is an inner function (i.e. a bounded analytic function on $\mathbb{D}$ with $|\varphi(e^{i\theta})| = 1$ a.e.).&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Beurling&amp;rsquo;s theorem is a landmark because it gives not merely existence but a full
classification of all invariant subspaces for a single operator. The shift on $H^2$
is in many senses the canonical test case for the Hilbert space invariant subspace
problem: a counterexample to the full problem would be an operator with no non-trivial
invariant subspaces at all, whereas the shift shows how rich the lattice of invariant
subspaces can be even for a single, very simple operator.&lt;/p&gt;
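&lt;p&gt;Two concrete instances may help make the classification tangible; they are standard examples, and the symbol $b_a$ for the Blaschke factor is notation chosen here. Invariance of $\varphi H^2$ is immediate, since $S(\varphi g) = \varphi \cdot (zg)$, and
$$\varphi(z) = z^k: \quad z^k H^2 = \{f \in H^2 : f(0) = f'(0) = \cdots = f^{(k-1)}(0) = 0\},$$
$$\varphi(z) = b_a(z) = \frac{z - a}{1 - \bar{a} z}, \ a \in \mathbb{D}: \quad b_a H^2 = \{f \in H^2 : f(a) = 0\}.$$
In both cases the invariant subspace is cut out by vanishing conditions, which multiplication by $z$ visibly preserves.&lt;/p&gt;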
&lt;h3 class="heading" id="negative-results-counterexamples-on-banach-spaces"&gt;
 Negative Results: Counterexamples on Banach Spaces&lt;span class="heading__anchor"&gt; &lt;a href="#negative-results-counterexamples-on-banach-spaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Enflo, 1975/1987; Read, 1984)&lt;/span&gt;
&lt;p&gt;There exist separable Banach spaces and bounded linear operators on them with no
non-trivial closed invariant subspace. In particular, Read constructed such an
operator on $\ell^1$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Enflo&amp;rsquo;s counterexample was the first, constructed in 1975 though not published until
1987 due to its length and complexity. Read&amp;rsquo;s construction (1984) arrived independently
and somewhat earlier in print; a further, more explicit example by Read (1985) lives on
the classical space $\ell^1$. These results make clear that the answer to the invariant
subspace problem is &lt;strong&gt;negative for general Banach spaces&lt;/strong&gt;. The Hilbert space case
remains the central open question precisely because no counterexample on any reflexive
Banach space, much less a Hilbert space, has been found.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-hilbertbanach-gap"&gt;
 The Hilbert–Banach Gap&lt;span class="heading__anchor"&gt; &lt;a href="#the-hilbertbanach-gap"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The separation between Hilbert space and general Banach space behaviour is a
recurring theme. Several features of Hilbert spaces that Banach spaces lack suggest
why counterexamples might not exist in the Hilbert setting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The inner product gives every operator an adjoint $T^*$, and the lattices of invariant
subspaces of $T$ and of $T^*$ are related by orthogonal complementation: $\mathcal{M}$ is
invariant for $T$ if and only if $\mathcal{M}^\perp$ is invariant for $T^*$.&lt;/li&gt;
&lt;li&gt;The spectral theorem for normal operators provides a complete invariant subspace
theory for that class, anchoring intuition.&lt;/li&gt;
&lt;li&gt;Reflexivity and the existence of orthonormal bases in every separable Hilbert space
constrain operator behaviour more tightly than the geometry of $\ell^1$ does.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;None of these features has yet been converted into a proof for the general case.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="recent-proof-attempts"&gt;
 Recent Proof Attempts&lt;span class="heading__anchor"&gt; &lt;a href="#recent-proof-attempts"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The problem has attracted renewed attention in recent years.&lt;/p&gt;
&lt;p&gt;In May 2023, Per Enflo, the same mathematician who produced the first Banach space
counterexample, posted a preprint to arXiv (2305.15442) claiming a &lt;strong&gt;positive
resolution&lt;/strong&gt; for all separable Hilbert spaces. The original preprint was 13 pages;
a substantially expanded version (52 KB) appeared in April 2024. Enflo himself has
been cautious about the result, noting that expert review is ongoing. As of this
writing the preprint has not received a definitive verdict from the community.&lt;/p&gt;
&lt;p&gt;In July 2023 an independent preprint by Neville (arXiv:2307.08176) also claimed
a positive solution for separable Hilbert spaces.&lt;/p&gt;
&lt;p&gt;In September 2024 a peer-reviewed article in &lt;em&gt;Axioms&lt;/em&gt; by Khalil, Yousef, Alshanti,
and Abu Hammad announced a proof, but basic errors were identified shortly after
publication (Ghatasheh, arXiv:2411.19409, November 2024).&lt;/p&gt;
&lt;p&gt;The problem therefore remains officially open. The cluster of recent attempts reflects
both its difficulty and its continued centrality in functional analysis.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-cyclic-vectors-and-the-spectral-radius-formula"&gt;
 1. Cyclic Vectors and the Spectral Radius Formula&lt;span class="heading__anchor"&gt; &lt;a href="#1-cyclic-vectors-and-the-spectral-radius-formula"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A vector $x \in \mathcal{H}$ is &lt;em&gt;cyclic&lt;/em&gt; for $T$ if $\mathcal{H} = \overline{\operatorname{span}\{T^n x : n \geq 0\}}$. An operator with a non-trivial closed invariant subspace cannot have every non-zero vector be cyclic. Conversely, if every non-zero vector is cyclic for $T$, then $T$ has no non-trivial closed invariant subspace and is therefore a counterexample.&lt;/p&gt;
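&lt;p&gt;The equivalence behind this remark can be spelled out in a few lines; the notation $[x]_T$ for the cyclic subspace is introduced here only for illustration. Writing $[x]_T = \overline{\operatorname{span}\{T^n x : n \geq 0\}}$, continuity of $T$ gives $T[x]_T \subseteq [x]_T$, and
$$T \text{ has no non-trivial closed invariant subspace} \iff [x]_T = \mathcal{H} \ \text{ for every } x \neq 0.$$
Indeed, if $\mathcal{M}$ is a non-trivial closed invariant subspace and $0 \neq x \in \mathcal{M}$, then $[x]_T \subseteq \mathcal{M} \subsetneq \mathcal{H}$, so $x$ is not cyclic; conversely, if some $x \neq 0$ is not cyclic, then $[x]_T$ is itself a closed invariant subspace that contains $x$ and is not all of $\mathcal{H}$.&lt;/p&gt;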
&lt;p&gt;Read&amp;rsquo;s Banach-space constructions proceed by building &lt;em&gt;hypercyclic&lt;/em&gt; operators whose
orbits are dense. On Hilbert spaces, the geometry of the space severely constrains the
density of orbits. Making this constraint quantitative, via growth estimates on
$\|T^n x\|$ or on the resolvent $\|(T-\lambda)^{-1}\|$, might close the gap between
known positive results and the general case.&lt;/p&gt;
&lt;h3 class="heading" id="2-dual-algebra-techniques"&gt;
 2. Dual Algebra Techniques&lt;span class="heading__anchor"&gt; &lt;a href="#2-dual-algebra-techniques"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A powerful modern approach studies the &lt;em&gt;dual algebra&lt;/em&gt; $\mathcal{A}_T$, the weak-$*$
closure of the polynomials in $T$ as a subalgebra of $\mathcal{B}(\mathcal{H})$.
If $T$ is &lt;em&gt;reflexive&lt;/em&gt; (every operator that leaves invariant all invariant subspaces
of $T$ already lies in the weakly closed algebra generated by $T$ and the identity), the
algebra is determined by its invariant subspaces, and one can sometimes extract invariant
subspaces from the structure of the algebra. Results along these lines have been obtained
for $C_{00}$ contractions (Bercovici, Foiaş, Pearcy) and for polynomially bounded operators
under spectral conditions (Liu, 2017). The key open question is whether the dual algebra
approach can be made to work for all contractions via Sz.-Nagy–Foiaş theory.&lt;/p&gt;
&lt;h3 class="heading" id="3-contractions-and-the-sz-nagyfoiaş-calculus"&gt;
 3. Contractions and the Sz.-Nagy–Foiaş Calculus&lt;span class="heading__anchor"&gt; &lt;a href="#3-contractions-and-the-sz-nagyfoia%c5%9f-calculus"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Every contraction ($\|T\| \leq 1$) on a Hilbert space admits a minimal unitary dilation
(Sz.-Nagy&amp;rsquo;s dilation theorem), and Foiaş developed a functional calculus for
contractions based on $H^\infty(\mathbb{D})$. The rich structure of this calculus has
produced invariant subspace theorems for $C_{11}$ contractions and for contractions
whose spectrum is rich enough. The question is whether the calculus can be pushed to
all contractions; the general invariant subspace problem for contractions is equivalent
to the full problem (by rescaling), so this is not a simplification but a different
vantage point that has been productive.&lt;/p&gt;
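&lt;p&gt;The rescaling step is elementary and can be written out as follows (a sketch, assuming $T \neq 0$). For any scalar $c \neq 0$ and any closed subspace $\mathcal{M}$,
$$T\mathcal{M} \subseteq \mathcal{M} \iff (cT)\mathcal{M} \subseteq \mathcal{M},$$
so $T$ and the contraction $T/\|T\|$ have exactly the same invariant subspaces. An invariant subspace theorem for all contractions would therefore apply to every bounded operator.&lt;/p&gt;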
&lt;h3 class="heading" id="4-almost-invariant-half-spaces"&gt;
 4. Almost Invariant Half-Spaces&lt;span class="heading__anchor"&gt; &lt;a href="#4-almost-invariant-half-spaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A weaker notion, studied by Androulakis, Popov, Tcaciuc, and Troitsky, asks for
&lt;em&gt;almost invariant half-spaces&lt;/em&gt;: closed subspaces $\mathcal{M}$ of infinite dimension
and infinite codimension such that $T\mathcal{M} \subseteq \mathcal{M} + \mathcal{F}$
for some finite-dimensional subspace $\mathcal{F}$. These exist for every operator
on any infinite-dimensional Banach space. Whether every operator on a Hilbert space
has a genuinely invariant (not just almost invariant) infinite-dimensional subspace
of infinite codimension remains open and is a concrete intermediate target.&lt;/p&gt;
&lt;h3 class="heading" id="5-hyperinvariant-subspaces"&gt;
 5. Hyperinvariant Subspaces&lt;span class="heading__anchor"&gt; &lt;a href="#5-hyperinvariant-subspaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A closed subspace is &lt;em&gt;hyperinvariant&lt;/em&gt; for $T$ if it is invariant under every operator that
commutes with $T$. Every hyperinvariant subspace is invariant, so if every non-scalar operator had a
non-trivial hyperinvariant subspace, the invariant subspace problem would have a positive answer.
Lomonosov&amp;rsquo;s 1973 theorem gives hyperinvariant subspaces when a non-scalar $T$ commutes with a
non-zero compact operator. The &lt;em&gt;hyperinvariant subspace problem&lt;/em&gt; (does every operator on a
Hilbert space, other than scalar multiples of the identity, have a non-trivial hyperinvariant
subspace?) is also open and may be harder than the invariant subspace problem itself.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Aronszajn, N. &amp;amp; Smith, K. T. (1954). Invariant subspaces of completely continuous operators. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;60&lt;/strong&gt;(2), 345–350.&lt;/li&gt;
&lt;li&gt;Beurling, A. (1949). On two problems concerning linear transformations in Hilbert space. &lt;em&gt;Acta Mathematica&lt;/em&gt;, &lt;strong&gt;81&lt;/strong&gt;, 239–255.&lt;/li&gt;
&lt;li&gt;Brown, S. (1987). Hyponormal operators with thick spectra have invariant subspaces. &lt;em&gt;Annals of Mathematics&lt;/em&gt;, &lt;strong&gt;125&lt;/strong&gt;(1), 93–103.&lt;/li&gt;
&lt;li&gt;Enflo, P. H. (1987). On the invariant subspace problem for Banach spaces. &lt;em&gt;Acta Mathematica&lt;/em&gt;, &lt;strong&gt;158&lt;/strong&gt;, 213–313.&lt;/li&gt;
&lt;li&gt;Enflo, P. H. (2023). On the invariant subspace problem in Hilbert spaces. arXiv:2305.15442.&lt;/li&gt;
&lt;li&gt;Lomonosov, V. I. (1973). Invariant subspaces of operators commuting with compact operators. &lt;em&gt;Functional Analysis and Its Applications&lt;/em&gt;, &lt;strong&gt;7&lt;/strong&gt;(3), 213–214.&lt;/li&gt;
&lt;li&gt;Read, C. J. (1984). A solution to the invariant subspace problem. &lt;em&gt;Bulletin of the London Mathematical Society&lt;/em&gt;, &lt;strong&gt;16&lt;/strong&gt;(4), 337–401.&lt;/li&gt;
&lt;li&gt;Read, C. J. (1985). A solution to the invariant subspace problem on the space $\ell^1$. &lt;em&gt;Bulletin of the London Mathematical Society&lt;/em&gt;, &lt;strong&gt;17&lt;/strong&gt;(4), 305–317.&lt;/li&gt;
&lt;li&gt;Radjavi, H. &amp;amp; Rosenthal, P. (2003). &lt;em&gt;Invariant Subspaces&lt;/em&gt; (2nd ed.). Dover.&lt;/li&gt;
&lt;li&gt;Bercovici, H., Foiaş, C., &amp;amp; Pearcy, C. (1985). &lt;em&gt;Dual Algebras with Applications to Invariant Subspaces and Dilation Theory&lt;/em&gt;. AMS.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Something Like Picard for 1-Forms</title><link>https://blog.namln.org/en/posts/something-like-picard-for-1-forms/</link><pubDate>Wed, 27 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/something-like-picard-for-1-forms/</guid><description>&lt;p&gt;Picard&amp;rsquo;s great theorem is a statement about how wildly a holomorphic function can
behave near an essential singularity. The conjecture below asks whether injectivity
of local primitives of a 1-form is enough to rule out such wild behaviour at the
origin, forcing the 1-form to extend meromorphically across the puncture.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt;Conjecture (Elsner, 2010)&lt;/span&gt;
&lt;p&gt;Let $D$ be the open unit disk and let $U_1,\dots,U_n$ be open sets with
$\bigcup_{j=1}^n U_j = D\setminus\{0\}$. Suppose there are injective holomorphic
functions $f_j : U_j \to \mathbb{C}$ such that
$$\mathrm{d}f_j = \mathrm{d}f_k \quad \text{on every connected component of } U_j \cap U_k.$$
Then the $\mathrm{d}f_j$ glue together to a &lt;strong&gt;meromorphic&lt;/strong&gt; 1-form on $D$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;medium importance&lt;/em&gt; on the
&lt;a href="http://www.openproblemgarden.org/op/something_like_picard_for_1_forms"&gt;Open Problem Garden&lt;/a&gt;
and is not recommended for undergraduates, reflecting the depth of the tools involved.
It arises from Elsner&amp;rsquo;s study of hyperelliptic action integrals in the context of the
exact WKB method for Schrödinger equations with polynomial potential
(Elsner, &lt;em&gt;Ann. Inst. Fourier&lt;/em&gt; &lt;strong&gt;49&lt;/strong&gt;(1), 1999).&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="setup-and-interpretation"&gt;
 Setup and Interpretation&lt;span class="heading__anchor"&gt; &lt;a href="#setup-and-interpretation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The compatibility condition $\mathrm{d}f_j = \mathrm{d}f_k$ on each connected
component of $U_j \cap U_k$ is equivalent to saying $f_j - f_k$ is locally constant
there. The local differentials therefore glue together unambiguously to a global
holomorphic 1-form
$$\omega \in \Omega^1(D\setminus\{0\})$$
whose restriction to each $U_j$ equals $\mathrm{d}f_j$. The conjecture asserts that
$\omega$ does not have an essential singularity at the origin: it extends to a
meromorphic 1-form on all of $D$, meaning near $0$ it looks like
$$\omega = \left(\frac{c_{-m}}{z^m} + \cdots + \frac{c_{-1}}{z} + c_0 + c_1 z + \cdots\right)dz$$
for some $m \ge 0$.&lt;/p&gt;
&lt;p&gt;The injectivity of each $f_j$ is the crucial hypothesis. Without it the statement is
false: any holomorphic 1-form $\omega$ on $D\setminus\{0\}$ with an essential
singularity at $0$ is locally $\mathrm{d}f_j$ for some holomorphic $f_j$, and these
$f_j$ can be chosen on contractible pieces of the cover; injectivity is what
prohibits essential singularities from arising.&lt;/p&gt;
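&lt;p&gt;A one-set example, given here as an illustration and not drawn from Elsner&amp;rsquo;s paper, shows the failure concretely. Take $n = 1$, $U_1 = D\setminus\{0\}$ and $f_1(z) = e^{1/z}$, so that
$$\omega = \mathrm{d}f_1 = -\frac{e^{1/z}}{z^2}\,\mathrm{d}z = -\sum_{k \geq 0} \frac{1}{k!}\, z^{-k-2}\,\mathrm{d}z,$$
which has infinitely many negative powers and hence an essential singularity at $0$. The function $f_1$ is holomorphic but not injective: $f_1(z) = f_1(z')$ whenever $1/z' = 1/z + 2\pi i m$ for an integer $m$, and such points $z'$ accumulate at the origin as $|m| \to \infty$.&lt;/p&gt;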
&lt;hr&gt;
&lt;h2 class="heading" id="what-is-already-known"&gt;
 What Is Already Known&lt;span class="heading__anchor"&gt; &lt;a href="#what-is-already-known"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt;Partial Result&lt;/span&gt;
&lt;p&gt;Under the hypotheses of the conjecture:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The 1-form $\omega$ is holomorphic on $D\setminus\{0\}$.&lt;/li&gt;
&lt;li&gt;If the residue of $\omega$ at the origin vanishes, Picard&amp;rsquo;s great theorem can be
applied to conclude that $\omega$ extends meromorphically across $0$.&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;p&gt;Point (1) is straightforward: each $\mathrm{d}f_j$ is holomorphic on $U_j$ and the
local forms agree on overlaps, so $\omega$ is holomorphic wherever it is defined,
i.e. on $D\setminus\{0\}$.&lt;/p&gt;
&lt;p&gt;Point (2) is the key partial result recorded by Elsner. If $\operatorname{Res}_0\omega = 0$,
then $\omega$ has trivial monodromy around the origin and admits a single-valued
holomorphic primitive $F$ on the punctured disk: $\omega = \mathrm{d}F$. The
injectivity of each local branch $f_j$ then forces $F$ itself to be injective on
some punctured neighbourhood of $0$ (since $f_j = F + c$ locally). An injective
holomorphic function on a punctured disk cannot have an essential singularity there,
and this is where Picard enters: at an essential singularity, by Picard&amp;rsquo;s great theorem,
every value with at most one exception is taken infinitely often in any punctured neighbourhood,
contradicting injectivity. Hence $F$ has at most a pole at $0$, and $\omega = \mathrm{d}F$ is meromorphic.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;open case&lt;/strong&gt; is when $\operatorname{Res}_0\omega \ne 0$, so that $\omega$ has
non-trivial monodromy and no single-valued global primitive exists. The local
primitives $f_j$ then experience monodromy as one loops around the origin, and the
injectivity constraint must be leveraged in this more delicate multi-valued setting.&lt;/p&gt;
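&lt;p&gt;A simple example, offered here as an illustration rather than taken from Elsner&amp;rsquo;s work, shows what the non-zero residue case looks like when the conclusion of the conjecture holds. Take $\omega = c\,\mathrm{d}z/z$ with $c \neq 0$, cover $D\setminus\{0\}$ by the two slit disks $U_1 = D\setminus(-1,0]$ and $U_2 = D\setminus[0,1)$, and let $f_1 = c\log z$ with $\arg z \in (-\pi,\pi)$ and $f_2 = c\log z$ with $\arg z \in (0,2\pi)$. Each $f_j$ is injective and $\mathrm{d}f_1 = \mathrm{d}f_2 = \omega$, while
$$f_2 - f_1 = 0 \ \text{ on the upper half-disk}, \qquad f_2 - f_1 = 2\pi i\,c = 2\pi i\,\operatorname{Res}_0\omega \ \text{ on the lower half-disk}.$$
No single-valued primitive exists, yet $\omega$ has only a simple pole at $0$, consistent with the conjecture.&lt;/p&gt;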
&lt;hr&gt;
&lt;h2 class="heading" id="connection-to-picards-theorem"&gt;
 Connection to Picard&amp;rsquo;s Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#connection-to-picards-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The title of the conjecture reflects a precise structural analogy.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0;"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt;Theorem (Picard's Great Theorem)&lt;/span&gt;
&lt;p&gt;If $f$ is holomorphic in a punctured neighbourhood of $z_0$ and has an essential singularity at $z_0$, then in every punctured neighbourhood
of $z_0$ the function $f$ takes every value in $\mathbb{C}$, with at most one exception,
infinitely many times.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;In particular, a function with an essential singularity is far from injective near
that point. The conjecture elevates this observation to the level of 1-forms: an
injective holomorphic primitive should preclude essential singularities in the
1-form itself, even when the primitive is only locally and multi-valuedly defined.&lt;/p&gt;
&lt;p&gt;Standard Picard covers the zero-residue case by reducing to a single-valued primitive.
The conjecture asks for an analogue that works when the monodromy is non-trivial, a
genuinely new statement about multi-valued functions and their differential geometry.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="origin-hyperelliptic-action-integrals"&gt;
 Origin: Hyperelliptic Action Integrals&lt;span class="heading__anchor"&gt; &lt;a href="#origin-hyperelliptic-action-integrals"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The problem arises from the &lt;em&gt;exact WKB method&lt;/em&gt; applied to the stationary
Schrödinger equation $-\hbar^2\psi'' + V(x)\psi = E\psi$ with polynomial potential $V$.
The formal WKB ansatz $\psi \sim e^{S/\hbar}$ produces a multivalued &lt;em&gt;action integral&lt;/em&gt;
$$\mathcal{I}(E) = \int_\gamma \sqrt{V(x) - E}\,\mathrm{d}x$$
defined on a hyperelliptic Riemann surface whose branch structure depends on the
energy parameter $E$. Elsner&amp;rsquo;s 1999 paper constructs the Riemann surface of
$\mathcal{I}$ explicitly and shows its branch points accumulate densely in the
value plane, a phenomenon that obstructs Borel–Laplace resummation of the
WKB symbols.&lt;/p&gt;
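&lt;p&gt;The step from the ansatz to the action integral can be sketched as follows, assuming the equation is written with $\hbar$ explicit as $-\hbar^2\psi'' + (V - E)\psi = 0$ (a normalisation chosen here for the illustration). Substituting $\psi = e^{S/\hbar}$ gives $\psi'' = \left(S''/\hbar + (S')^2/\hbar^2\right)\psi$, hence
$$(S')^2 + \hbar S'' = V(x) - E.$$
To leading order as $\hbar \to 0$ this reduces to $S' = \pm\sqrt{V(x) - E}$, so that $S = \pm\int \sqrt{V(x) - E}\,\mathrm{d}x$, which is the integrand appearing in $\mathcal{I}(E)$. The square root is what makes the integral live on a hyperelliptic curve when $V$ is a polynomial.&lt;/p&gt;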
&lt;p&gt;In this setting the local inverses of $\mathcal{I}$ play the role of the $f_j$: they
are locally injective holomorphic functions whose differentials agree on overlaps.
The conjecture asks whether the obstruction to global meromorphic extension can
arise only from a pole, a controlled singularity, rather than an essential one.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions"&gt;
 Research Directions&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-the-non-zero-residue-case"&gt;
 1. The Non-Zero Residue Case&lt;span class="heading__anchor"&gt; &lt;a href="#1-the-non-zero-residue-case"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The open heart of the problem is the case $\operatorname{Res}_0\omega \ne 0$. Here
$\omega$ is not exact near $0$, the monodromy of the primitive is a non-trivial
translation $f_j \mapsto f_j + 2\pi i\,\operatorname{Res}_0\omega$, and no single
injective function encompasses the full behaviour near the singularity.&lt;/p&gt;
&lt;p&gt;A natural approach is to pass to the universal (infinite cyclic) cover
$\tilde D \to D\setminus\{0\}$, on which the monodromy becomes trivial, construct a
single-valued primitive on $\tilde D$, and then try to adapt the zero-residue argument
there. The key difficulty is that the
injectivity of each $f_j$ on $U_j$ does not immediately imply injectivity of the
lifted primitive on $\tilde D$, since different sheets can collide. Making this
argument precise, or finding a counterexample, is the main open problem.&lt;/p&gt;
&lt;h3 class="heading" id="2-quantitative-control-via-nevanlinna-theory"&gt;
 2. Quantitative Control via Nevanlinna Theory&lt;span class="heading__anchor"&gt; &lt;a href="#2-quantitative-control-via-nevanlinna-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;An alternative strategy replaces Picard&amp;rsquo;s theorem by its quantitative form. If $F$ is
a meromorphic function on the punctured disk with an essential singularity at $0$, the
Nevanlinna characteristic $T(r,F)$ grows faster than any constant multiple of $\log(1/r)$ as
$r\to 0$. For an injective function the counting functions $N(r,a,F)$, recording
how often $F = a$ in the punctured disk, satisfy strong constraints.&lt;/p&gt;
&lt;p&gt;Nevanlinna-theoretic methods might give a direct bound on $T(r,f_j)$ in terms of the
geometry of the cover $\{U_j\}$ and the injectivity of $f_j$, ruling out essential
singularities of $\omega$ without passing through the monodromy argument. This would
require adapting the standard Nevanlinna machinery to functions that are only locally
defined on an open cover.&lt;/p&gt;
&lt;h3 class="heading" id="3-replacing-injectivity-by-finite-valence"&gt;
 3. Replacing Injectivity by Finite Valence&lt;span class="heading__anchor"&gt; &lt;a href="#3-replacing-injectivity-by-finite-valence"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;One can ask whether the conjecture remains true if &amp;ldquo;injective&amp;rdquo; is weakened to
&amp;ldquo;at most $d$-to-one&amp;rdquo; for some fixed integer $d$. Finite-valence holomorphic functions
cannot have essential singularities either, by a Picard-type argument: a function of
valence at most $d$ takes each value at most $d$ times, whereas near an essential
singularity every value with at most one exception is taken infinitely often.&lt;/p&gt;
&lt;p&gt;If the conjecture extends to finite valence, the proof strategy will likely yield a
valence-independent argument that illuminates the zero-residue case more transparently.
If it fails for finite valence, the counterexample geometry would clarify what role
injectivity plays beyond the mere avoidance of essential singularities.&lt;/p&gt;
&lt;h3 class="heading" id="4-several-complex-variables"&gt;
 4. Several Complex Variables&lt;span class="heading__anchor"&gt; &lt;a href="#4-several-complex-variables"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;In $\mathbb{C}^n$ for $n \ge 2$ the theory of isolated singularities of holomorphic
functions changes dramatically: by Hartogs&amp;rsquo; extension theorem, isolated singularities
of holomorphic functions are always removable. One would expect the analogous
conjecture for holomorphic 1-forms in $\mathbb{C}^n$ to be more tractable, or even to
follow from known extension results.&lt;/p&gt;
&lt;p&gt;Three steps would clarify which features of the problem are genuinely one-dimensional:
formulating the precise analogue, with the punctured disk replaced by a domain
$\Omega\setminus\{0\}$ in $\mathbb{C}^n$; specifying what &amp;ldquo;meromorphic 1-form&amp;rdquo;
means on a higher-dimensional domain; and checking whether Hartogs-type arguments
already resolve it.&lt;/p&gt;
&lt;h3 class="heading" id="5-geometric-formulation-on-riemann-surfaces"&gt;
 5. Geometric Formulation on Riemann Surfaces&lt;span class="heading__anchor"&gt; &lt;a href="#5-geometric-formulation-on-riemann-surfaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The disk $D$ and the puncture at $0$ are not special: the same question can be posed
on any Riemann surface $X$ with a marked point $p$. Given an open cover of
$X\setminus\{p\}$ and injective holomorphic functions $f_j$ on each piece with
compatible differentials, does $\omega = \mathrm{d}f_j$ extend meromorphically
across $p$?&lt;/p&gt;
&lt;p&gt;The answer may depend on the genus and the function theory of $X$. For the disk
(simply connected, genus 0) the monodromy is a simple translation; for a torus or
higher-genus surface the monodromy group is richer and the argument structure should
change. Comparing these cases may isolate the essential input from the topology versus
the analysis.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Elsner, B. (1999). Hyperelliptic action integral. &lt;em&gt;Annales de l&amp;rsquo;Institut Fourier&lt;/em&gt;, &lt;strong&gt;49&lt;/strong&gt;(1), 303–331. &lt;a href="https://www.numdam.org/item/AIF_1999__49_1_303_0/"&gt;https://www.numdam.org/item/AIF_1999__49_1_303_0/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Ahlfors, L. V. (1979). &lt;em&gt;Complex Analysis&lt;/em&gt; (3rd ed.). McGraw-Hill.&lt;/li&gt;
&lt;li&gt;Conway, J. B. (1978). &lt;em&gt;Functions of One Complex Variable&lt;/em&gt; (2nd ed.). Springer.&lt;/li&gt;
&lt;li&gt;Nevanlinna, R. (1970). &lt;em&gt;Analytic Functions&lt;/em&gt;. Springer.&lt;/li&gt;
&lt;li&gt;Forster, O. (1981). &lt;em&gt;Lectures on Riemann Surfaces&lt;/em&gt;. Springer.&lt;/li&gt;
&lt;li&gt;Delabaere, E., Dillinger, H., &amp;amp; Pham, F. (1993). Résurgence de Voros et périodes des courbes hyperelliptiques. &lt;em&gt;Annales de l&amp;rsquo;Institut Fourier&lt;/em&gt;, &lt;strong&gt;43&lt;/strong&gt;(1), 163–199.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Criterion for Boundedness of Power Series</title><link>https://blog.namln.org/en/posts/power_series_boundedness/</link><pubDate>Tue, 26 May 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/power_series_boundedness/</guid><description>&lt;h2 class="heading" id="introduction--problem-statement"&gt;
 Introduction &amp;amp; Problem Statement&lt;span class="heading__anchor"&gt; &lt;a href="#introduction--problem-statement"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Power series constitute one of the most ubiquitous objects in analysis.
A power series $\sum_{n=0}^{\infty}a_n x^n$ with infinite radius of
convergence defines a real-entire function $f:\mathbb{R}\to\mathbb{R}$.
Whereas the question of &lt;em&gt;convergence&lt;/em&gt; is completely settled by
Cauchy–Hadamard theory, the question of &lt;em&gt;boundedness&lt;/em&gt; of the sum function
is far subtler and, as of this writing, remains open.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 1 (Rüdinger, 2009)&lt;/span&gt;
&lt;p&gt;Let $(a_n) _{n\ge 0}$ be a sequence of real numbers such that the power
series $\sum _{n=0}^{\infty}a_n x^n$ converges for every $x\in\mathbb{R}$,
thereby defining a smooth function $f:\mathbb{R}\to\mathbb{R}$.
Give a &lt;strong&gt;necessary and sufficient&lt;/strong&gt; criterion on $(a_n)$ for $f$ to be
bounded on $\mathbb{R}$.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The problem is rated &lt;em&gt;low importance&lt;/em&gt; on the
&lt;a href="http://www.openproblemgarden.org/op/criterion_for_boundedness_of_power_series"&gt;Open Problem Garden&lt;/a&gt;
and is recommended as accessible to undergraduates; nevertheless, a complete
answer appears to be unknown.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Motivating examples.&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Function&lt;/th&gt;
					&lt;th&gt;Power series&lt;/th&gt;
					&lt;th&gt;Bounded?&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;$\cos x$&lt;/td&gt;
					&lt;td&gt;$\displaystyle\sum_{k=0}^{\infty}\frac{(-1)^k}{(2k)!}x^{2k}$&lt;/td&gt;
					&lt;td&gt;$|\cos x|\le 1$&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$\sin x$&lt;/td&gt;
					&lt;td&gt;$\displaystyle\sum_{k=0}^{\infty}\frac{(-1)^k}{(2k+1)!}x^{2k+1}$&lt;/td&gt;
					&lt;td&gt;$|\sin x|\le 1$&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$e^x$&lt;/td&gt;
					&lt;td&gt;$\displaystyle\sum_{n=0}^{\infty}\frac{x^n}{n!}$&lt;/td&gt;
					&lt;td&gt;$e^x\to+\infty$&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;$p(x)=a_0+\cdots+a_Nx^N,\ N\ge 1$&lt;/td&gt;
					&lt;td&gt;(polynomial)&lt;/td&gt;
					&lt;td&gt;unbounded&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background--prerequisites"&gt;
 Background &amp;amp; Prerequisites&lt;span class="heading__anchor"&gt; &lt;a href="#background--prerequisites"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;This section collects the core mathematical tools needed to engage
seriously with Question 1.&lt;/p&gt;
&lt;h3 class="heading" id="power-series-and-entire-functions"&gt;
 Power Series and Entire Functions&lt;span class="heading__anchor"&gt; &lt;a href="#power-series-and-entire-functions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt; Definition 1 (Power Series &amp;amp; Radius of Convergence)&lt;/span&gt;
&lt;p&gt;A &lt;em&gt;power series&lt;/em&gt; centred at the origin is a formal series
$\sum_{n=0}^{\infty}a_n x^n$ with $a_n\in\mathbb{R}$. Its &lt;em&gt;radius of
convergence&lt;/em&gt; is
$$
R = \frac{1}{\limsup_{n\to\infty}|a_n|^{1/n}} \in [0,+\infty].
$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Throughout this note we always assume $R=+\infty$, i.e.,
$\limsup_{n\to\infty}|a_n|^{1/n}=0$.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt; Definition 2 (Entire Function)&lt;/span&gt;
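&lt;p&gt;A function $f:\mathbb{C}\to\mathbb{C}$ is called &lt;em&gt;entire&lt;/em&gt; if it is
holomorphic on all of $\mathbb{C}$. A power series with real coefficients and $R=+\infty$
converges for every $z\in\mathbb{C}$, so it defines an entire function whose restriction
to $\mathbb{R}$ is real-valued (a &lt;em&gt;real-entire&lt;/em&gt; function); by the identity theorem
this is the only entire extension of that restriction.&lt;/p&gt;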
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt; Theorem 1 (Cauchy–Hadamard)&lt;/span&gt;
&lt;p&gt;For a power series $\sum a_n z^n$, set
$$
R = \Bigl(\limsup_{n\to\infty}|a_n|^{1/n}\Bigr)^{-1}\in[0,+\infty].
$$
Then the series converges absolutely for $|z|&amp;lt;R$ and diverges for $|z|&amp;gt;R$.&lt;/p&gt;
&lt;/div&gt;
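&lt;p&gt;As a quick check of the formula (a worked example, not part of the original statement), take $a_n=1/n!$, the coefficients of $e^x$. The elementary bound $n!\ge (n/e)^n$ gives
$$
|a_n|^{1/n} = \frac{1}{(n!)^{1/n}} \le \frac{e}{n} \longrightarrow 0,
\qquad\text{so}\qquad R=+\infty,
$$
whereas the constant sequence $a_n=1$ gives $|a_n|^{1/n}=1$ and hence $R=1$.&lt;/p&gt;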
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 1&lt;/span&gt;
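&lt;p&gt;The condition $R=+\infty$ is equivalent to $a_n = O(\varepsilon^n)$ for every
$\varepsilon&amp;gt;0$, i.e., the coefficients decay faster than any geometric sequence.
The stronger bound $a_n = O(r^n/n!)$ for some $r&amp;gt;0$ is not implied by $R=+\infty$:
it gives $|f(z)|\le Ce^{r|z|}$, so it singles out the entire functions of exponential
type at most $r$, which is the class that Paley–Wiener type theorems address.&lt;/p&gt;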
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="order-and-type-of-entire-functions"&gt;
 Order and Type of Entire Functions&lt;span class="heading__anchor"&gt; &lt;a href="#order-and-type-of-entire-functions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt; Definition 3 (Order and Type)&lt;/span&gt;
&lt;p&gt;The &lt;em&gt;order&lt;/em&gt; of an entire function $f$ is
$$
\rho = \limsup_{r\to\infty}\frac{\log\log M(r)}{\log r},
\qquad M(r)=\max_{|z|=r}|f(z)|.
$$
The &lt;em&gt;type&lt;/em&gt; $\sigma$ (for $0&amp;lt;\rho&amp;lt;\infty$) is
$$
\sigma = \limsup_{r\to\infty}\frac{\log M(r)}{r^{\rho}}.
$$&lt;/p&gt;
&lt;/div&gt;
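&lt;p&gt;To see the definition in action, here is the computation for $\cos z$, sketched using the standard identity $|\cos(x+iy)|^2=\cos^2 x+\sinh^2 y$. On $|z|=r$ this is at most $1+\sinh^2 r=\cosh^2 r$, with equality at $z=ir$, so
$$
M(r)=\cosh r,\qquad
\rho=\lim_{r\to\infty}\frac{\log\log\cosh r}{\log r}=1,\qquad
\sigma=\lim_{r\to\infty}\frac{\log\cosh r}{r}=1.
$$
Thus $\cos$ is bounded by $1$ on $\mathbb{R}$ while having order $1$ and type $1$ as an entire function.&lt;/p&gt;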
&lt;p&gt;An entire function that is bounded on all of $\mathbb{C}$ is constant by Liouville&amp;rsquo;s
theorem (in particular it has order $\rho=0$), while an entire function that is bounded
only on the real axis can be non-constant, as $\cos$ shows. Boundedness on $\mathbb{R}$ is
therefore a genuinely real-variable phenomenon.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="liouvilles-theorem-and-its-limitations"&gt;
 Liouville&amp;rsquo;s Theorem and Its Limitations&lt;span class="heading__anchor"&gt; &lt;a href="#liouvilles-theorem-and-its-limitations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt; Theorem 2 (Liouville)&lt;/span&gt;
&lt;p&gt;Every bounded entire function $f:\mathbb{C}\to\mathbb{C}$ is constant.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 2 (Why Liouville does not solve the problem)&lt;/span&gt;
&lt;p&gt;Question 1 concerns &lt;em&gt;real-valued&lt;/em&gt; functions $f:\mathbb{R}\to\mathbb{R}$.
A function may be bounded on $\mathbb{R}$ while its complex extension is
unbounded. For instance, $\cos z$ satisfies $|\cos z|\to\infty$ along
the imaginary axis (since $\cos(iy)=\cosh y\to+\infty$). Liouville&amp;rsquo;s
theorem therefore does not apply, and the problem is genuinely non-trivial.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="algebraic-structure-of-the-relevant-function-space"&gt;
 Algebraic Structure of the Relevant Function Space&lt;span class="heading__anchor"&gt; &lt;a href="#algebraic-structure-of-the-relevant-function-space"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #27ae60; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#27ae60; font-weight:bold;"&gt; Definition 4 (Space of Bounded Power Series)&lt;/span&gt;
&lt;p&gt;Let $\mathcal{B}$ denote the set of all functions $f:\mathbb{R}\to\mathbb{R}$
that can be represented as a convergent power series $\sum_{n\ge 0}a_n x^n$
(with $R=+\infty$) and that are bounded on $\mathbb{R}$.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #e67e22; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#e67e22; font-weight:bold;"&gt; Proposition 1, Algebraic Properties of $\mathcal{B}$ (Rüdinger, 2009)&lt;/span&gt;
&lt;ol&gt;
&lt;li&gt;$\mathcal{B}$ is a &lt;strong&gt;linear subspace&lt;/strong&gt; of $C^\infty(\mathbb{R})$: if
$f,g\in\mathcal{B}$ and $\lambda\in\mathbb{R}$ then $f+\lambda g\in\mathcal{B}$.&lt;/li&gt;
&lt;li&gt;$\mathcal{B}$ is &lt;strong&gt;closed under pointwise multiplication&lt;/strong&gt;: if
$f,g\in\mathcal{B}$ then $fg\in\mathcal{B}$.&lt;/li&gt;
&lt;li&gt;$\mathcal{B}$ contains &lt;strong&gt;all functions of the form&lt;/strong&gt; $c\cos(h(x))$,
where $c\in\mathbb{R}$ and $h:\mathbb{R}\to\mathbb{R}$ is any entire function.&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 3&lt;/span&gt;
&lt;p&gt;Part (3) holds because $\cos\circ h$ is a composition of entire functions, hence entire,
and it is real on $\mathbb{R}$ with $|\cos(h(x))|\le 1$ there. The class is strictly larger than
$\{c\cos(bx):c,b\in\mathbb{R}\}$; for example, $\cos(x^3-x)\in\mathcal{B}$.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="known-partial-results"&gt;
 Known Partial Results&lt;span class="heading__anchor"&gt; &lt;a href="#known-partial-results"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="necessary-conditions"&gt;
 Necessary Conditions&lt;span class="heading__anchor"&gt; &lt;a href="#necessary-conditions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #e67e22; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#e67e22; font-weight:bold;"&gt; Proposition 2, Necessary Condition for Boundedness (Rüdinger, 2009)&lt;/span&gt;
&lt;p&gt;Suppose $f(x)=\sum_{n=0}^{\infty}a_n x^n$ is bounded on $\mathbb{R}$.
Then either:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;$a_n=0$ for every $n\ge 1$ (i.e., $f$ is the constant
function $f\equiv a_0$), or&lt;/li&gt;
&lt;li&gt;there are &lt;strong&gt;infinitely many&lt;/strong&gt; indices $n$ with $a_n\neq 0$, and the
signs of the non-zero $a_n$ &lt;strong&gt;change infinitely often&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 4&lt;/span&gt;
&lt;p&gt;The sign-change condition is necessary: if the non-zero coefficients are
eventually of one sign, a dominant-term comparison on the positive half-line shows
$f(x)\to+\infty$ or $f(x)\to-\infty$ as $x\to+\infty$.&lt;/p&gt;
&lt;/div&gt;
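&lt;p&gt;The dominant-term comparison can be spelled out as follows; this is a sketch, and the index bound $N$, the index $m$ and the polynomial $P$ are notation introduced here. Suppose $a_n\ge 0$ for all $n\ge N$ and infinitely many $a_n$ are non-zero (the case $a_n\le 0$ follows by replacing $f$ with $-f$). Write $P(x)=\sum_{n&amp;lt;N}a_n x^n$ and pick $m\ge N$ with $a_m&amp;gt;0$. For $x&amp;gt;0$ every term of the tail is non-negative, so
$$
f(x)=P(x)+\sum_{n\ge N}a_n x^n \ \ge\ P(x)+a_m x^m \longrightarrow +\infty \qquad (x\to+\infty),
$$
because $\deg P&amp;lt;N\le m$. Hence $f$ is unbounded on $[0,+\infty)$.&lt;/p&gt;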
&lt;div style="padding:10px 14px; border:2px solid #8e44ad; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#8e44ad; font-weight:bold;"&gt; Corollary 1&lt;/span&gt;
&lt;p&gt;Every non-constant polynomial is unbounded on $\mathbb{R}$.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;&lt;em&gt;Proof.&lt;/em&gt;&lt;/summary&gt;
A polynomial has only finitely many non-zero coefficients, so alternative (2) of
Proposition 2 cannot hold; a bounded polynomial must therefore satisfy alternative (1)
and be constant. Directly: if $p$ has degree $N\ge 1$, then $|p(x)|/|x|^N\to|a_N|&amp;gt;0$,
so $|p(x)|\to\infty$ as $|x|\to\infty$.
&lt;/details&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="the-sign-change-condition-is-not-sufficient"&gt;
 The Sign-Change Condition Is Not Sufficient&lt;span class="heading__anchor"&gt; &lt;a href="#the-sign-change-condition-is-not-sufficient"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The condition of Proposition 2 is &lt;em&gt;not&lt;/em&gt; sufficient, as the following
examples show.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #16a085; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#16a085; font-weight:bold;"&gt; Example 1&lt;/span&gt;
&lt;p&gt;Consider the geometric series
$$
f(x) = \sum_{n=0}^{\infty}(-1)^n x^{2n} = \frac{1}{1+x^2},
\qquad |x|&amp;lt;1.
$$
The coefficients alternate in sign, yet $R=1\neq+\infty$. One must first
require $R=+\infty$ before the sign-change condition becomes meaningful.&lt;/p&gt;
&lt;p&gt;For a subtler case with $R=+\infty$: take $a_n=(-1)^n/n!$, so
$$
f(x) = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!}x^n = e^{-x}.
$$
The signs alternate, yet $e^{-x}\to+\infty$ as $x\to-\infty$.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 5&lt;/span&gt;
&lt;p&gt;The $e^{-x}$ example reveals the key gap: sign alternation of the
&lt;em&gt;coefficients&lt;/em&gt; does not prevent the &lt;em&gt;function&lt;/em&gt; from growing in one
direction. For $x&amp;lt;0$ every term $(-1)^n x^n/n!=|x|^n/n!$ is positive, so the
alternation produces cancellation only on the positive half-line, and the series
reconstructs exponential growth on the negative half-line. A complete criterion must capture
cancellation in &lt;strong&gt;both&lt;/strong&gt; directions.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="connections-to-entire-function-theory"&gt;
 Connections to Entire Function Theory&lt;span class="heading__anchor"&gt; &lt;a href="#connections-to-entire-function-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt; Theorem 3 (Borel–Carathéodory)&lt;/span&gt;
&lt;p&gt;Let $f$ be holomorphic in $|z|\le R$. Then for $0&amp;lt;r&amp;lt;R$,
$$
M(r) \le \frac{2r}{R-r}\sup_{|z|=R}\operatorname{Re}f(z) + \frac{R+r}{R-r}\,|f(0)|.
$$&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 6&lt;/span&gt;
&lt;p&gt;Borel–Carathéodory shows that the &lt;em&gt;real part&lt;/em&gt; of a complex-valued entire
function controls its modulus. For a real-valued function on $\mathbb{R}$
the analogous control is more delicate, since we only observe the function
on a line, not on a disk.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #c0392b; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#c0392b; font-weight:bold;"&gt; Theorem 4 (Hadamard Factorisation)&lt;/span&gt;
&lt;p&gt;Every entire function $f\not\equiv 0$ of finite order $\rho$ can be written as
$$
f(z) = z^m e^{g(z)}\prod_{n=1}^{\infty} E_p\!\left(\frac{z}{z_n}\right),
$$
where $m\ge 0$ is the order of the zero of $f$ at the origin, the $z_n$ are the non-zero
zeros of $f$ counted with multiplicity, $p=\lfloor\rho\rfloor$, $g$ is a polynomial of degree
$\le\rho$, and the $E_p$ are Weierstrass elementary factors.&lt;/p&gt;
&lt;/div&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 7&lt;/span&gt;
&lt;p&gt;Elements of $\mathcal{B}$ of infinite order, such as $\cos(e^x)$ (which lies in
$\mathcal{B}$ by Proposition 1 (3)), are not covered by the Hadamard factorisation.
For elements of finite order, understanding the zero set and the exponential factor
$e^{g(z)}$ may be key to classifying them.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="the-open-sub-question-on-the-generators-of-mathcalb"&gt;
 The Open Sub-Question on the Generators of $\mathcal{B}$&lt;span class="heading__anchor"&gt; &lt;a href="#the-open-sub-question-on-the-generators-of-mathcalb"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 2 (Rüdinger, 2009)&lt;/span&gt;
&lt;p&gt;Does $\mathcal{B}$ consist &lt;em&gt;precisely&lt;/em&gt; of functions of the form $c\cos(h(x))$
and their linear combinations and products, where $h:\mathbb{R}\to\mathbb{R}$
is entire and $c\in\mathbb{R}$?&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;A &lt;strong&gt;positive&lt;/strong&gt; answer would give an implicit characterisation via algebraic
generators. A &lt;strong&gt;negative&lt;/strong&gt; answer would require producing an entire
function, bounded on $\mathbb{R}$, that does &lt;em&gt;not&lt;/em&gt; lie in the
$\mathbb{R}$-algebra generated by $\{\cos\circ\, h : h\text{ entire}\}$.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid #7f8c8d; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:#7f8c8d; font-weight:bold;"&gt; Remark 8&lt;/span&gt;
&lt;p&gt;By Proposition 1 (3), every $c\cos(h(x))$ belongs to $\mathcal{B}$, and
$\mathcal{B}$ is an algebra, so all products and sums remain in
$\mathcal{B}$. What is unknown is whether &lt;em&gt;every&lt;/em&gt; element of $\mathcal{B}$
arises this way. Note that $\sin x = \cos(x-\pi/2) \in \mathcal{B}$, so
sine is already covered.&lt;/p&gt;
&lt;/div&gt;
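&lt;p&gt;One observation that may simplify Question 2 (a sketch that uses only the product-to-sum identity; the symbols $h_1,h_2,h_j,c_j,J$ are notation introduced here): for real entire $h_1,h_2$,
$$
\cos(h_1(x))\cos(h_2(x))=\tfrac12\cos\bigl(h_1(x)+h_2(x)\bigr)+\tfrac12\cos\bigl(h_1(x)-h_2(x)\bigr),
$$
and $h_1\pm h_2$ are again entire. By induction, every finite product of generators is a finite linear combination of generators, so the algebra in Question 2 coincides with the linear span
$$
\Bigl\{\sum_{j=1}^{J}c_j\cos(h_j(x)) : J\ge 1,\ c_j\in\mathbb{R},\ h_j\text{ entire}\Bigr\}.
$$
Question 2 therefore asks whether every $f\in\mathcal{B}$ is such a finite sum.&lt;/p&gt;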
&lt;hr&gt;
&lt;h2 class="heading" id="research-directions-and-conjectures"&gt;
 Research Directions and Conjectures&lt;span class="heading__anchor"&gt; &lt;a href="#research-directions-and-conjectures"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="direction-1-coefficient-growth-rate"&gt;
 Direction 1: Coefficient Growth Rate&lt;span class="heading__anchor"&gt; &lt;a href="#direction-1-coefficient-growth-rate"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A promising approach is to examine the &lt;em&gt;rate&lt;/em&gt; of decay of $|a_n|$, not just
the sign pattern.&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 3&lt;/span&gt;
&lt;p&gt;Is there a decay condition on $|a_n|$, combined with the sign-change
condition, that gives a &lt;strong&gt;sufficient&lt;/strong&gt; criterion for $f\in\mathcal{B}$?&lt;/p&gt;
&lt;/div&gt;
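&lt;p&gt;&lt;em&gt;Approach.&lt;/em&gt; The Cauchy estimates give $|a_n| = |f^{(n)}(0)|/n!\le M(r)/r^n$
for all $r&amp;gt;0$, where $M(r)$ is the maximum of $|f|$ on the complex circle $|z|=r$.
A bound $|f(x)|\le K$ on the real axis does not control $M(r)$: if it did,
$|a_n|\le K/r^n$ for every $r&amp;gt;0$ would force $a_n=0$ for all $n\ge 1$, which is
Liouville&amp;rsquo;s theorem again. The Cauchy estimates alone therefore recover only the
$R=+\infty$ condition. Is there a sharper constraint that uses the real-axis bound?&lt;/p&gt;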
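&lt;p&gt;For orientation, here is how the Cauchy estimate plays out for $\cos$ (a worked example; it uses $M(r)=\cosh r\le e^{r}$, which measures growth on the complex circle and is not a bound on $\mathbb{R}$). For $n=2k\ge 2$, choosing $r=2k$ gives
$$
|a_{2k}|=\frac{1}{(2k)!}\le\frac{\cosh(2k)}{(2k)^{2k}}\le\Bigl(\frac{e}{2k}\Bigr)^{2k},
$$
which matches the true size of $1/(2k)!$ up to a factor of order $\sqrt{k}$ by Stirling&amp;rsquo;s formula. The estimate is thus nearly sharp, but the information it encodes is the exponential type of $\cos$, not its boundedness on the real axis.&lt;/p&gt;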
&lt;hr&gt;
&lt;h3 class="heading" id="direction-2-fourier-analytic-approach"&gt;
 Direction 2: Fourier-Analytic Approach&lt;span class="heading__anchor"&gt; &lt;a href="#direction-2-fourier-analytic-approach"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Every $f\in L^\infty(\mathbb{R})\cap L^2(\mathbb{R})$ possesses a
square-integrable Fourier transform. If $f$ is also entire of exponential type, the
Paley–Wiener theorem forces the transform to be compactly supported; without the growth
restriction this fails, as $e^{-x^2}$ shows. Moreover, a generic
$f\in\mathcal{B}$ need not lie in $L^2$ (e.g., $\cos x\notin L^2(\mathbb{R})$).&lt;/p&gt;
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 4&lt;/span&gt;
&lt;p&gt;Can the Fourier theory for tempered distributions give a necessary and
sufficient condition for $f\in\mathcal{B}$ in terms of the spectral
support of $f$?&lt;/p&gt;
&lt;/div&gt;
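&lt;p&gt;Two elements of $\mathcal{B}$ illustrate what such a condition would have to accommodate. The normalisation $\hat f(\xi)=\int_{\mathbb{R}}f(x)e^{-ix\xi}\,dx$ is an assumption made here; other conventions change only the constants.
$$
\widehat{\cos}(\xi)=\pi\bigl(\delta(\xi-1)+\delta(\xi+1)\bigr),
\qquad
\widehat{e^{-x^2}}(\xi)=\sqrt{\pi}\,e^{-\xi^2/4}.
$$
The first has spectral support $\{-1,1\}$, which is compact; the second has spectral support all of $\mathbb{R}$. Both functions are entire and bounded on $\mathbb{R}$, so compactness of the spectral support cannot by itself be the criterion.&lt;/p&gt;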
&lt;hr&gt;
&lt;h3 class="heading" id="direction-3-differential-equation-characterisation"&gt;
 Direction 3: Differential Equation Characterisation&lt;span class="heading__anchor"&gt; &lt;a href="#direction-3-differential-equation-characterisation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Bounded entire functions often arise as solutions to ODEs. For instance
$y''+y=0$ has the bounded solutions $A\cos x + B\sin x$. More generally,
$y''+\omega(x)y=0$ with $\omega$ entire and bounded can produce bounded
solutions, although this is not automatic: $\omega\equiv -1$ gives the unbounded
solutions $e^{\pm x}$.&lt;/p&gt;
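&lt;p&gt;The link to the coefficients $(a_n)$ can be made explicit for the model equation (a worked example). Substituting $y=\sum_{n\ge 0}a_n x^n$ into $y''+y=0$ and comparing coefficients of $x^n$ gives
$$
(n+2)(n+1)\,a_{n+2}+a_n=0,
\qquad\text{i.e.}\qquad
a_{n+2}=-\frac{a_n}{(n+2)(n+1)},
$$
so the signs alternate within the even and within the odd coefficients, and the initial data $(a_0,a_1)=(1,0)$ and $(0,1)$ reproduce the series of $\cos x$ and $\sin x$. For $y''-y=0$ the recursion becomes $a_{n+2}=a_n/((n+2)(n+1))$, with no sign change along either subsequence, and every non-zero solution $Ae^{x}+Be^{-x}$ is unbounded on $\mathbb{R}$.&lt;/p&gt;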
&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 5&lt;/span&gt;
&lt;p&gt;Characterise those linear differential operators $L$ with entire coefficients
whose full solution space lies within $\mathcal{B}$.&lt;/p&gt;
&lt;/div&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="direction-4-evenodd-decomposition-and-reduction"&gt;
 Direction 4: Even/Odd Decomposition and Reduction&lt;span class="heading__anchor"&gt; &lt;a href="#direction-4-evenodd-decomposition-and-reduction"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Every $f\in\mathcal{B}$ splits as $f=f_e+f_o$ where
$$
f_e(x)=\tfrac{1}{2}(f(x)+f(-x))=\sum_{k\ge 0}a_{2k}x^{2k}
\quad\text{and}\quad
f_o(x)=\tfrac{1}{2}(f(x)-f(-x))=\sum_{k\ge 0}a_{2k+1}x^{2k+1}.
$$
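Since $f_e(x)$ and $f_o(x)$ are averages of $f(x)$ and $\pm f(-x)$, and $f=f_e+f_o$, the
function $f$ is bounded if and only if both parts are. Moreover $f_e(x)=g(x^2)$ for the
entire function $g(t)=\sum_{k\ge 0}a_{2k}t^k$, so boundedness of $f_e$ reduces to:
&lt;em&gt;is $g$ bounded on $[0,+\infty)$?&lt;/em&gt; Likewise $f_o(x)=x\,q(x^2)$ with
$q(t)=\sum_{k\ge 0}a_{2k+1}t^k$, so $f_o$ is bounded if and only if $\sqrt{t}\,q(t)$ is
bounded on $[0,+\infty)$. The even and odd parts can thus be studied separately.&lt;/p&gt;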
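&lt;p&gt;As an illustration (a worked example; $g$ and $q$ denote the even-part and odd-part series in the variable $t=x^2$, with $q$ being notation introduced here), take $f(x)=\cos x+\sin x$. Then $f_e=\cos$, $f_o=\sin$, and for $t&amp;gt;0$
$$
g(t)=\sum_{k\ge 0}\frac{(-1)^k}{(2k)!}t^k=\cos\sqrt{t},
\qquad
q(t)=\sum_{k\ge 0}\frac{(-1)^k}{(2k+1)!}t^k=\frac{\sin\sqrt{t}}{\sqrt{t}},
$$
with $f_e(x)=g(x^2)$ and $f_o(x)=x\,q(x^2)$. Here $g$ is bounded on $[0,+\infty)$ and $\sqrt{t}\,q(t)=\sin\sqrt{t}$ is bounded there as well, which recovers the boundedness of $\cos$ and $\sin$ from a problem on the half-line.&lt;/p&gt;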
&lt;hr&gt;
&lt;h3 class="heading" id="direction-5-polynomial-approximation-and-numerics"&gt;
 Direction 5: Polynomial Approximation and Numerics&lt;span class="heading__anchor"&gt; &lt;a href="#direction-5-polynomial-approximation-and-numerics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;div style="padding:10px 14px; border:2px solid dodgerblue; border-radius:6px; margin:16px 0"&gt;
&lt;span style="color:dodgerblue; font-weight:bold;"&gt; Question 6&lt;/span&gt;
&lt;p&gt;If the partial sums $S_N(x)=\sum_{n=0}^{N}a_n x^n$ are uniformly bounded
on growing intervals $[-R_N,R_N]$ (with $R_N\to\infty$), does it follow
that $f\in\mathcal{B}$? Conversely, if $f\in\mathcal{B}$, how fast must
$R_N$ grow relative to $N$ for the bound to hold?&lt;/p&gt;
&lt;/div&gt;
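&lt;p&gt;Under one natural reading, the first half of Question 6 has a short answer; the following is a sketch, and the constant $C$ is notation introduced here. Suppose a single constant $C$ satisfies $|S_N(x)|\le C$ for all $N$ and all $|x|\le R_N$. Fix $x\in\mathbb{R}$. Since $R_N\to\infty$, we have $|x|\le R_N$ for all large $N$, and $S_N(x)\to f(x)$, so
$$
|f(x)|=\lim_{N\to\infty}|S_N(x)|\le C .
$$
Hence $f\in\mathcal{B}$ with the same bound, and the substantive part of the question is the converse: how large $R_N$ can be taken for a given $f\in\mathcal{B}$.&lt;/p&gt;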
&lt;hr&gt;
&lt;h2 class="heading" id="summary-of-open-problems"&gt;
 Summary of Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#summary-of-open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;#&lt;/th&gt;
					&lt;th&gt;Statement&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q1&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Give a necessary and sufficient condition on $(a_n)$ for $f=\sum a_n x^n$ to be bounded on $\mathbb{R}$.&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q2&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Is $\mathcal{B}$ generated (as an algebra) precisely by $\{c\cos(h(x)):h\text{ entire}\}$?&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q3&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Does a sharper decay condition on $|a_n|$, combined with the sign-change condition, give a sufficient criterion for $f\in\mathcal{B}$?&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q4&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Can spectral-support (Paley–Wiener / distribution) theory characterise $\mathcal{B}$?&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q5&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Which linear ODEs with entire coefficients have solution space $\subseteq\mathcal{B}$?&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;Q6&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;What is the precise relationship between truncation bounds on $[-R_N,R_N]$ and $f\in\mathcal{B}$?&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Ahlfors, L. V. (1979). &lt;em&gt;Complex Analysis&lt;/em&gt;, 3rd ed. McGraw-Hill.&lt;/li&gt;
&lt;li&gt;Boas, R. P. (1954). &lt;em&gt;Entire Functions&lt;/em&gt;. Academic Press.&lt;/li&gt;
&lt;li&gt;Conway, J. B. (1978). &lt;em&gt;Functions of One Complex Variable&lt;/em&gt;, 2nd ed. Springer.&lt;/li&gt;
&lt;li&gt;Levin, B. Ya. (1996). &lt;em&gt;Lectures on Entire Functions&lt;/em&gt;. AMS Translations of Mathematical Monographs, vol. 150.&lt;/li&gt;
&lt;li&gt;Rudin, W. (1976). &lt;em&gt;Principles of Mathematical Analysis&lt;/em&gt;, 3rd ed. McGraw-Hill.&lt;/li&gt;
&lt;li&gt;Rudin, W. (1987). &lt;em&gt;Real and Complex Analysis&lt;/em&gt;, 3rd ed. McGraw-Hill.&lt;/li&gt;
&lt;li&gt;Rüdinger, A. (2009). Criterion for boundedness of power series. &lt;em&gt;Open Problem Garden&lt;/em&gt;. &lt;a href="http://www.openproblemgarden.org/op/criterion_for_boundedness_of_power_series"&gt;http://www.openproblemgarden.org/op/criterion_for_boundedness_of_power_series&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Stein, E. M. and Shakarchi, R. (2003). &lt;em&gt;Fourier Analysis: An Introduction&lt;/em&gt;. Princeton University Press.&lt;/li&gt;
&lt;li&gt;Stein, E. M. and Shakarchi, R. (2003). &lt;em&gt;Complex Analysis&lt;/em&gt;. Princeton University Press.&lt;/li&gt;
&lt;li&gt;Titchmarsh, E. C. (1939). &lt;em&gt;The Theory of Functions&lt;/em&gt;, 2nd ed. Oxford University Press.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Brezis' first open problem - An elliptic equation involving the critical exponent in 3D</title><link>https://blog.namln.org/en/posts/brezis-first-open-problem/</link><pubDate>Sat, 18 Apr 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/brezis-first-open-problem/</guid><description>&lt;h2 class="heading" id="yamabe-problem"&gt;
 Yamabe problem&lt;span class="heading__anchor"&gt; &lt;a href="#yamabe-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Yamabe problem: given a closed (compact, without boundary) Riemannian manifold $(\mathcal{M}, g_0)$ of dimension $N \geq 3$, does there exist a conformal metric $g = u^{\frac{4}{N-2}}g_0$ with constant scalar curvature $R_g \equiv C$?&lt;/p&gt;
&lt;p&gt;Equivalently, find $u &amp;gt; 0$ on $\mathcal{M}$ such that
$$
-\frac{4(N-1)}{N-2}\Delta_{g_0}u + R_{g_0}u = Cu^{\frac{N+2}{N-2}}\qquad\text{on }\mathcal{M}.
$$&lt;/p&gt;
&lt;p&gt;Some results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Trudinger [1968]: solved when $g_0$ has non-positive scalar curvature.&lt;/li&gt;
&lt;li&gt;Aubin [1976]: solved when $N \geq 6$ and $(\mathcal{M}, g_0)$ is not locally conformally flat.&lt;/li&gt;
&lt;li&gt;Schoen [1984]: solved in the remaining cases, in any dimension, using the Positive Mass Theorem of Schoen-Yau [1979].&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="a-special-case"&gt;
 A special case&lt;span class="heading__anchor"&gt; &lt;a href="#a-special-case"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Consider the special case where $\mathcal{M}$ is a bounded domain $\Omega$ in $\mathbb{R}^{N}$:
$$
\begin{cases}
-\Delta u = u^{\frac{N+2}{N-2}}\qquad\text{in }\Omega, \\
u &amp;gt; 0\qquad\text{in }\Omega, \\
u = 0\qquad\text{on }\partial\Omega.
\end{cases}
$$&lt;/p&gt;
&lt;p&gt;Pohozaev [1965]: if $\Omega$ is star-shaped, then there is no nontrivial solution.&lt;/p&gt;
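&lt;p&gt;The mechanism behind this nonexistence result is the Pohozaev identity. The version below is a sketch for a smooth solution on a smooth bounded domain that is star-shaped about the origin; $\nu$ denotes the outward unit normal and $p=\frac{N+2}{N-2}$, both notation introduced here. Multiplying the equation by $x\cdot\nabla u$ and integrating by parts gives
$$
\Bigl(\frac{N}{p+1}-\frac{N-2}{2}\Bigr)\int_\Omega u^{p+1}\,dx
=\frac12\int_{\partial\Omega}(x\cdot\nu)\Bigl(\frac{\partial u}{\partial\nu}\Bigr)^2 d\sigma .
$$
Since $p+1=\frac{2N}{N-2}$, the left-hand side vanishes. On a star-shaped domain $x\cdot\nu\ge 0$ on $\partial\Omega$, and $\int_{\partial\Omega}x\cdot\nu\,d\sigma=N|\Omega|&amp;gt;0$, so $\partial u/\partial\nu$ must vanish on a boundary set of positive measure, which is incompatible with the Hopf lemma for a positive solution.&lt;/p&gt;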
&lt;h2 class="heading" id="brezis-nirenberg-problem"&gt;
 Brezis-Nirenberg problem&lt;span class="heading__anchor"&gt; &lt;a href="#brezis-nirenberg-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Consider a lower-order perturbation:
$$
\begin{cases}
-\Delta u = u^{\frac{N+2}{N-2}} + \lambda u\qquad\text{in }\Omega, \\
u &amp;gt; 0\qquad\text{in }\Omega, \\
u = 0\qquad\text{on }\partial\Omega.
\end{cases}
$$&lt;/p&gt;
&lt;p&gt;Some results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pohozaev&amp;rsquo;s result also yields nonexistence when $\lambda \leq 0$ and $\Omega$ is star-shaped.&lt;/li&gt;
&lt;li&gt;If a positive solution exists, then necessarily $\lambda &amp;lt; \lambda_1$, where $\lambda_1$ is the first eigenvalue of $-\Delta$ on $\Omega$ with zero Dirichlet boundary condition.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Hence, for positive solutions on star-shaped domains,
$$
0 &amp;lt; \lambda &amp;lt; \lambda_1.
$$&lt;/p&gt;
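&lt;p&gt;The necessary condition $\lambda&amp;lt;\lambda_1$ follows from a short test against the first eigenfunction; this is the standard argument, sketched here with $\varphi_1&amp;gt;0$ denoting a first Dirichlet eigenfunction of $-\Delta$ on $\Omega$ (notation introduced here). Multiplying the equation by $\varphi_1$ and integrating by parts twice,
$$
\lambda_1\int_\Omega u\,\varphi_1\,dx
=\int_\Omega(-\Delta u)\,\varphi_1\,dx
=\int_\Omega u^{\frac{N+2}{N-2}}\varphi_1\,dx+\lambda\int_\Omega u\,\varphi_1\,dx
&amp;gt;\lambda\int_\Omega u\,\varphi_1\,dx ,
$$
and dividing by $\int_\Omega u\,\varphi_1\,dx&amp;gt;0$ gives $\lambda&amp;lt;\lambda_1$.&lt;/p&gt;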
&lt;h2 class="heading" id="brezis-open-problem-11"&gt;
 Brezis&amp;rsquo; Open Problem 1.1&lt;span class="heading__anchor"&gt; &lt;a href="#brezis-open-problem-11"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Let $N=3$, and let $\Omega = B_1 \subset \mathbb{R}^3$ be the unit ball. Consider
$$
\begin{cases}
-\Delta u = u^5 + \lambda u \qquad \text{in } B_1, \\
u = 0 \qquad \text{on } \partial B_1.
\end{cases}
$$
We ask whether this problem admits a nontrivial solution $u \not\equiv 0$, not necessarily positive.&lt;/p&gt;
&lt;p&gt;Here $5 = \frac{N+2}{N-2} = 2^*-1$ for $N=3$, where $2^* = \frac{2N}{N-2} = 6$ is the critical Sobolev exponent; the embedding $H_0^1(B_1)\subset L^6(B_1)$ is not compact, and this is exactly the source of the main compactness difficulty.&lt;/p&gt;
&lt;p&gt;Let $\lambda_1$ be the first Dirichlet eigenvalue of $-\Delta$ on $B_1$. The classical Brezis-Nirenberg theory shows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If $\lambda \leq 0$, then the only solution is $u \equiv 0$.&lt;/li&gt;
&lt;li&gt;If $\frac{1}{4}\lambda_1 &amp;lt; \lambda &amp;lt; \lambda_1$, then there exists a positive radial solution.&lt;/li&gt;
&lt;li&gt;If $0 &amp;lt; \lambda \leq \frac{1}{4}\lambda_1$, then any radial solution must be trivial; hence there is no positive radial solution.&lt;/li&gt;
&lt;li&gt;If $\lambda &amp;gt; \lambda_1$, there exist sign-changing solutions, but no positive solution.&lt;/li&gt;
&lt;/ul&gt;
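&lt;p&gt;For concreteness, the thresholds can be written explicitly; this is a worked computation that assumes the standard radial form of the Laplacian in $\mathbb{R}^3$, $\Delta\varphi=\frac1r(r\varphi)''$ with $r=|x|$, and $\varphi_1$ is notation introduced here. The function $\varphi_1(x)=\frac{\sin(\pi|x|)}{|x|}$ is positive in $B_1$, vanishes on $\partial B_1$, and satisfies
$$
-\Delta\varphi_1=-\frac1r\bigl(\sin(\pi r)\bigr)''=\pi^2\,\frac{\sin(\pi r)}{r}=\pi^2\varphi_1,
$$
so $\lambda_1=\pi^2$. The existence range for positive radial solutions is therefore $\frac{\pi^2}{4}&amp;lt;\lambda&amp;lt;\pi^2$, and the range in which every radial solution is trivial is $0&amp;lt;\lambda\le\frac{\pi^2}{4}$.&lt;/p&gt;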
&lt;p&gt;Therefore the unresolved case is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Open Problem 1.1.&lt;/strong&gt; Assume
$$
0 &amp;lt; \lambda \leq \frac{1}{4}\lambda_1.
$$
Does there exist a nontrivial solution?&lt;br&gt;
Since every radial solution is trivial in this range and, by the Gidas-Ni-Nirenberg theorem, every positive solution on a ball is radial, such a solution would have to be non-radial and sign-changing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This problem has remained open for decades, even if one restricts further to a smaller interval such as
$$
0 &amp;lt; \lambda &amp;lt; \varepsilon
$$
for some sufficiently small $\varepsilon &amp;gt; 0$.&lt;/p&gt;
&lt;h2 class="heading" id="remarks"&gt;
 Remarks&lt;span class="heading__anchor"&gt; &lt;a href="#remarks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;A few points are worth emphasizing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;By the Gidas-Ni-Nirenberg symmetry theorem, every positive solution on a ball is radial; since Brezis observed that in this regime any radial solution must vanish, there is no positive solution, and any nontrivial solution would have to be sign-changing and genuinely non-radial.&lt;/li&gt;
&lt;li&gt;This makes dimension $3$ sharply different from dimensions $N\ge 4$, where Brezis and Nirenberg obtained a positive solution for every $\lambda\in(0,\lambda_1)$.&lt;/li&gt;
&lt;li&gt;The bifurcation picture suggests branches of sign-changing non-radial solutions emerging from higher eigenvalues, but it is not known whether such branches can reach the interval $\left(0,\frac14\lambda_1\right]$.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;H. Brezis and L. Nirenberg, &lt;em&gt;Positive solutions of nonlinear elliptic equations involving critical Sobolev exponents&lt;/em&gt;, Comm. Pure Appl. Math. 36 (1983), 437&amp;ndash;477.&lt;/li&gt;
&lt;li&gt;H. Brezis, &lt;em&gt;Some of My Favorite Open Problems&lt;/em&gt;, Open Problem 1.1.&lt;/li&gt;
&lt;li&gt;M. Comte, &lt;em&gt;Solutions of elliptic equations with critical Sobolev exponent in dimension three&lt;/em&gt;, Nonlinear Anal. 17 (1991), 445&amp;ndash;455.&lt;/li&gt;
&lt;li&gt;O. Druet, &lt;em&gt;Elliptic equations with critical Sobolev exponents in dimension 3&lt;/em&gt;, Ann. Inst. H. Poincaré Anal. Non Linéaire 19 (2002), 125&amp;ndash;142.&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Recent Advances in KAN-Based Numerical PDE Solvers</title><link>https://blog.namln.org/en/posts/kan-pde-solvers/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/kan-pde-solvers/</guid><description>&lt;p&gt;Kolmogorov-Arnold Networks (KANs), introduced in 2024, have rapidly become one of the most active frontiers in scientific machine learning for solving partial differential equations (PDEs) (Liu et al., 2024). Unlike Multi-Layer Perceptrons (MLPs), which apply fixed activation functions at nodes, KANs place &lt;strong&gt;learnable univariate activation functions on edges&lt;/strong&gt;, grounded in the Kolmogorov-Arnold representation theorem: every continuous multivariate function can be expressed as a composition of univariate functions and summations. This structural difference gives KANs two key properties relevant to PDE numerics — &lt;strong&gt;higher interpretability&lt;/strong&gt; and &lt;strong&gt;parameter efficiency&lt;/strong&gt; — making them an appealing successor to MLP-based Physics-Informed Neural Networks (PINNs).&lt;/p&gt;
&lt;p&gt;From 2024 through early 2026, researchers have published dozens of frameworks combining KANs with ideas from numerical analysis and operator learning (spectral bases, energy-stable time-stepping, neural operators) and targeting problems ranging from single PDEs to high-dimensional systems with hundreds of variables.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="overview"&gt;
 Overview&lt;span class="heading__anchor"&gt; &lt;a href="#overview"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The KAN-for-PDEs landscape organises into several interrelated research threads:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Physics-Informed KAN Frameworks (PIKANs / KINN)&lt;/strong&gt; — direct replacements of MLP layers in PINNs with KAN layers, using strong, energy, and inverse PDE formulations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Spectral-Basis and Wavelet-Enriched KANs&lt;/strong&gt; — embedding orthogonal polynomial or wavelet bases to combat spectral bias.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;KAN-Based Neural Operators&lt;/strong&gt; — KAN sub-networks inside DeepONet, FNO, and pseudo-differential operator frameworks for learning PDE solution maps.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Time-Dependent and Evolutionary KANs&lt;/strong&gt; — energy-stable schemes, KAN-ODEs, and moving-boundary solvers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Discontinuities, Shock Waves, and Turbulence&lt;/strong&gt; — specialised architectures for sharp transitions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;High-Dimensional PDEs&lt;/strong&gt; — separable and tensor-product KAN surrogates scaling to hundreds of dimensions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data-Driven Discovery and Inverse Problems&lt;/strong&gt; — interpretability-driven model identification.&lt;/li&gt;
&lt;/ol&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Architecture&lt;/th&gt;
					&lt;th&gt;Key Strength&lt;/th&gt;
					&lt;th&gt;Representative Work&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;KINN&lt;/td&gt;
					&lt;td&gt;Forward/inverse problems, strong/energy/inverse forms&lt;/td&gt;
					&lt;td&gt;Wang et al., 2024&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;ChebPIKAN&lt;/td&gt;
					&lt;td&gt;Fluid mechanics PDEs, orthogonal basis&lt;/td&gt;
					&lt;td&gt;Cui et al., 2024&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;KANO&lt;/td&gt;
					&lt;td&gt;Symbolic operator recovery, variable-coefficient PDEs&lt;/td&gt;
					&lt;td&gt;arXiv:2509.16825&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;EvoKAN&lt;/td&gt;
					&lt;td&gt;Long-horizon time evolution, energy stability&lt;/td&gt;
					&lt;td&gt;arXiv:2503.01618&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Anant-KAN&lt;/td&gt;
					&lt;td&gt;High-dimensional PDEs (up to 300D)&lt;/td&gt;
					&lt;td&gt;arXiv:2505.03595&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;DPINN&lt;/td&gt;
					&lt;td&gt;Shock waves and discontinuities&lt;/td&gt;
					&lt;td&gt;arXiv:2507.08338&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="the-kolmogorov-arnold-representation-theorem"&gt;
 The Kolmogorov-Arnold Representation Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#the-kolmogorov-arnold-representation-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The theoretical foundation of KANs is the Kolmogorov-Arnold theorem: any continuous function $f: [0,1]^n \to \mathbb{R}$ can be written as&lt;/p&gt;
&lt;p&gt;$$f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right),$$&lt;/p&gt;
&lt;p&gt;where $\phi_{q,p}: [0,1] \to \mathbb{R}$ and $\Phi_q: \mathbb{R} \to \mathbb{R}$ are univariate continuous functions. In contrast to MLPs — where activations are fixed and weights are learned — KANs &lt;strong&gt;parameterise the activation functions themselves&lt;/strong&gt; (typically as B-splines or orthogonal polynomials) on each edge of the network graph.&lt;/p&gt;
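&lt;p&gt;As a concrete illustration, one common way to write a single KAN layer is sketched below. It follows the B-spline parameterisation of Liu et al. (2024); the symbols $w_b$, $w_s$, $c_k$, and $B_k$ are notation introduced here for illustration. A layer with $n_l$ inputs and $n_{l+1}$ outputs sums one learnable univariate function per edge:&lt;/p&gt;
&lt;p&gt;$$x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}), \qquad \phi(x) = w_b\,\mathrm{silu}(x) + w_s \sum_{k} c_k B_k(x),$$&lt;/p&gt;
&lt;p&gt;where the $B_k$ are B-spline basis functions on a fixed grid and $w_b$, $w_s$, $c_k$ are trainable. Stacking two such layers with widths $n \to 2n+1 \to 1$ gives a network with the same shape as the representation formula above; practical KANs generalise this to arbitrary widths and depths.&lt;/p&gt;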
&lt;h3 class="heading" id="physics-informed-neural-networks-pinns--the-starting-point"&gt;
 Physics-Informed Neural Networks (PINNs) — The Starting Point&lt;span class="heading__anchor"&gt; &lt;a href="#physics-informed-neural-networks-pinns--the-starting-point"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;PINNs (Raissi, Perdikaris, &amp;amp; Karniadakis, 2019) embed physical laws directly into the neural network loss function. For a PDE $\mathcal{N}[u] = f$ on domain $\Omega$ with boundary condition $\mathcal{B}[u] = g$ on $\partial\Omega$, the PINN loss is&lt;/p&gt;
&lt;p&gt;$$\mathcal{L} = \underbrace{\frac{1}{N _r}\sum _{i=1}^{N _r}|\mathcal{N}[u _\theta](x _i)|^2} _{\text{PDE residual}} + \underbrace{\frac{1}{N _b}\sum _{j=1}^{N _b}|\mathcal{B}[u _\theta](x _j) - g(x _j)|^2} _{\text{boundary condition}}.$$&lt;/p&gt;
&lt;p&gt;The substitution of MLP layers with KAN layers in this framework is the basic idea behind all PIKAN architectures.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="recent-developments"&gt;
 Recent Developments&lt;span class="heading__anchor"&gt; &lt;a href="#recent-developments"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-physics-informed-kan-frameworks"&gt;
 1. Physics-Informed KAN Frameworks&lt;span class="heading__anchor"&gt; &lt;a href="#1-physics-informed-kan-frameworks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h4 class="heading" id="kinn--the-foundational-framework"&gt;
 KINN — The Foundational Framework&lt;span class="heading__anchor"&gt; &lt;a href="#kinn--the-foundational-framework"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;The &lt;strong&gt;Kolmogorov-Arnold-Informed Neural Network (KINN)&lt;/strong&gt; is the primary physics-informed framework replacing MLP layers in PINNs with KAN layers (Wang et al., 2024). KINN supports three PDE formulations: the &lt;strong&gt;strong form&lt;/strong&gt; (collocating the PDE residual directly), the &lt;strong&gt;energy form&lt;/strong&gt; (minimising a variational energy functional), and the &lt;strong&gt;inverse form&lt;/strong&gt; (recovering unknown parameters from observations).&lt;/p&gt;
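&lt;p&gt;To make the distinction concrete, consider the Poisson problem $-\Delta u = f$ in $\Omega$ with $u = 0$ on $\partial\Omega$ (a model problem chosen here for illustration, not one singled out by the KINN paper). With the boundary condition imposed separately, the strong form penalises the pointwise residual, while the energy form minimises the Dirichlet energy, whose Euler–Lagrange equation is the same PDE:&lt;/p&gt;
&lt;p&gt;$$\mathcal{L}_{\text{strong}} = \frac{1}{N_r}\sum_{i=1}^{N_r}\left|\Delta u_\theta(x_i) + f(x_i)\right|^2, \qquad \mathcal{L}_{\text{energy}} = \int_\Omega \left(\tfrac{1}{2}|\nabla u_\theta|^2 - f\,u_\theta\right)dx.$$&lt;/p&gt;
&lt;p&gt;The energy form needs only first derivatives of $u_\theta$ but requires numerical quadrature over $\Omega$; the strong form needs second derivatives but only collocation points. In the inverse form, an unknown coefficient of the PDE is added to the trainable parameters and a data-misfit term on the observations is added to the loss.&lt;/p&gt;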
&lt;p&gt;The authors' benchmarks report that KINN significantly outperforms MLP-based PINNs in accuracy and convergence speed for multi-scale problems, stress concentration, singularities, nonlinear hyperelasticity, and heterogeneous materials. The one exception is problems on complex geometries, where MLPs retain the advantage. Published in &lt;em&gt;Computer Methods in Applied Mechanics and Engineering&lt;/em&gt; (2024), KINN has become a standard reference for subsequent KAN-PDE research.&lt;/p&gt;
&lt;h4 class="heading" id="chebyshev-and-polynomial-basis-pikans"&gt;
 Chebyshev and Polynomial Basis PIKANs&lt;span class="heading__anchor"&gt; &lt;a href="#chebyshev-and-polynomial-basis-pikans"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;A major architectural refinement has been substituting B-spline basis functions with &lt;strong&gt;orthogonal polynomial bases&lt;/strong&gt;. The &lt;strong&gt;ChebPIKAN&lt;/strong&gt; model leverages orthogonality of Chebyshev polynomials and integrates physics-informed loss functions for fluid-mechanics PDEs including the Allen-Cahn, Burgers, Helmholtz, Kovasznay flow, cylinder wake flow, and cavity flow equations (Cui et al., 2024). ChebPIKAN significantly outperforms vanilla KAN by embedding essential physical information and alleviating overfitting.&lt;/p&gt;
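&lt;p&gt;A minimal sketch of a Chebyshev edge function follows. The $\tanh$ squashing onto $[-1,1]$ is a common choice in Chebyshev-based KAN implementations and is assumed here; the exact normalisation used in ChebPIKAN may differ, and $K$, $c_k$ are illustrative notation:&lt;/p&gt;
&lt;p&gt;$$\phi(x) = \sum_{k=0}^{K} c_k\,T_k(\tanh x), \qquad T_0(z) = 1,\quad T_1(z) = z,\quad T_{k+1}(z) = 2z\,T_k(z) - T_{k-1}(z).$$&lt;/p&gt;
&lt;p&gt;The orthogonality being exploited is $\int_{-1}^{1} T_m(z)\,T_n(z)\,(1-z^2)^{-1/2}\,dz = 0$ for $m \neq n$, so the coefficients $c_k$ of different degrees are decoupled in the weighted inner product, and no knot grid has to be maintained as with B-splines.&lt;/p&gt;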
&lt;p&gt;The &lt;strong&gt;AC-PKAN&lt;/strong&gt; (Attention-Enhanced Chebyshev PKAN) further addresses the &lt;em&gt;rank collapse&lt;/em&gt; problem in Chebyshev-based KANs by integrating wavelet-activated MLPs with an internal attention mechanism, provably preserving a full-rank Jacobian and approximating PDEs of arbitrary order (arXiv:2505.08687). An external &lt;strong&gt;Residual Gradient Attention (RGA)&lt;/strong&gt; mechanism dynamically re-weights individual loss terms based on gradient norms, stabilising training of stiff PDE systems.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Legendre-KAN&lt;/strong&gt; method applies Legendre polynomial orthogonality to solve the fully nonlinear Monge-Ampère equation with Dirichlet boundary conditions, demonstrating effectiveness on both smooth and singular solutions across various dimensions and in the optimal transport problem.&lt;/p&gt;
&lt;h4 class="heading" id="hybrid-kanmlp-and-augmented-lagrangian-approaches"&gt;
 Hybrid KAN–MLP and Augmented Lagrangian Approaches&lt;span class="heading__anchor"&gt; &lt;a href="#hybrid-kanmlp-and-augmented-lagrangian-approaches"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;The &lt;strong&gt;AL-PKAN&lt;/strong&gt; introduces a hybrid encoder-decoder architecture in which the decoder maps hidden features from a high-dimensional latent space into trainable univariate activation functions via KAN (Zhang et al., 2025). An augmented Lagrangian function treats the penalty factors and Lagrange multipliers as learnable parameters to dynamically balance the constraint terms. The authors report that this approach typically improves prediction accuracy by &lt;strong&gt;one to two orders of magnitude&lt;/strong&gt; compared to traditional neural networks.&lt;/p&gt;
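&lt;p&gt;For orientation, the generic augmented Lagrangian for a constrained training problem can be written as follows. This is the textbook form, not necessarily the exact loss of the AL-PKAN paper; $\mathcal{L}_r$ denotes the PDE residual loss and $C_j(\theta)$ the constraint violations (for example boundary or initial condition residuals), both names chosen here:&lt;/p&gt;
&lt;p&gt;$$\mathcal{L}_{\text{AL}}(\theta, \lambda, \mu) = \mathcal{L}_r(\theta) + \sum_{j} \lambda_j\,C_j(\theta) + \frac{\mu}{2}\sum_{j} C_j(\theta)^2.$$&lt;/p&gt;
&lt;p&gt;In the classical method the multipliers are updated by $\lambda_j \leftarrow \lambda_j + \mu\,C_j(\theta)$ after each inner minimisation; treating $\lambda$ and $\mu$ as learnable, as described above, replaces this hand-scheduled update with gradient-based adaptation.&lt;/p&gt;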
&lt;p&gt;The &lt;strong&gt;HPKM-PINN&lt;/strong&gt; combines MLP and KAN branches with a trainable convex mixing parameter to blend features optimally across subdomains, especially effective for multi-scale problems.&lt;/p&gt;
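&lt;p&gt;A convex mixing of two branches can be sketched as below. The sigmoid parameterisation of the mixing weight is an assumption made here to keep it in $(0,1)$; the actual HPKM-PINN parameterisation may differ:&lt;/p&gt;
&lt;p&gt;$$u_\theta(x) = \alpha\,u_{\text{KAN}}(x) + (1-\alpha)\,u_{\text{MLP}}(x), \qquad \alpha = \sigma(a) \in (0,1),$$&lt;/p&gt;
&lt;p&gt;where $a$ is a trainable scalar and $\sigma$ is the logistic function. Both limits $\alpha \to 0$ and $\alpha \to 1$ recover a single-branch model, so the hybrid can fall back to whichever branch suits the problem.&lt;/p&gt;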
&lt;h3 class="heading" id="2-spectral-basis-and-wavelet-enriched-kans"&gt;
 2. Spectral-Basis and Wavelet-Enriched KANs&lt;span class="heading__anchor"&gt; &lt;a href="#2-spectral-basis-and-wavelet-enriched-kans"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Wav-KAN&lt;/strong&gt; incorporates wavelet functions into the KAN structure, capturing both high-frequency and low-frequency components via continuous dyadic wavelet transforms for multiresolution analysis. This directly addresses the &lt;em&gt;spectral bias&lt;/em&gt; problem inherent in standard neural networks, which struggle to resolve high-frequency features in PDE solutions.&lt;/p&gt;
&lt;p&gt;PIKANs have been extended to &lt;strong&gt;multi-resolution spectral hybridisations (HWF-PIKAN)&lt;/strong&gt;, combining wavelet and Fourier features to explicitly counteract spectral bias and accelerate convergence for advection-dominated and kinetic equations.&lt;/p&gt;
&lt;p&gt;A unified benchmark published in February 2026 provides a &lt;strong&gt;systematic, controlled comparison between MLP-based PINNs and KAN-based PIKANs&lt;/strong&gt; across a representative collection of ODEs and PDEs (arXiv:2602.15068). The results show that PIKANs consistently achieve more accurate solutions, converge in fewer iterations, and yield superior gradient estimates.&lt;/p&gt;
&lt;h3 class="heading" id="3-kan-based-neural-operators"&gt;
 3. KAN-Based Neural Operators&lt;span class="heading__anchor"&gt; &lt;a href="#3-kan-based-neural-operators"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Neural operators learn mappings between infinite-dimensional function spaces, enabling generalisation across families of PDEs. KANs are increasingly embedded in operator architectures.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DeepOKAN&lt;/strong&gt; replaces MLP sub-networks in the Deep Operator Network (DeepONet) framework with KAN sub-networks using Gaussian Radial Basis Functions (Abueidda et al., 2024). The branch and trunk networks of DeepONet are re-implemented as RBF-KAN layers. Evaluated on 1D sinusoidal waves, 2D orthotropic elasticity, and transient Poisson problems, DeepOKAN consistently achieves lower training losses and more accurate predictions compared to standard DeepONet.&lt;/p&gt;
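&lt;p&gt;The branch–trunk structure and a Gaussian RBF edge function can be sketched as follows. The DeepONet ansatz is standard; the RBF notation (centres $c_i$, width $h$, weights $w_i$) is introduced here and the exact scaling in DeepOKAN may differ:&lt;/p&gt;
&lt;p&gt;$$\mathcal{G}(v)(y) \approx \sum_{k=1}^{p} b_k\big(v(x_1),\ldots,v(x_m)\big)\,t_k(y), \qquad \phi(z) = \sum_{i} w_i \exp\!\left(-\frac{(z-c_i)^2}{2h^2}\right).$$&lt;/p&gt;
&lt;p&gt;Here the branch outputs $b_k$ encode the input function $v$ sampled at $m$ sensor points, the trunk outputs $t_k$ encode the query location $y$, and $\phi$ is the univariate function placed on each edge of both sub-networks.&lt;/p&gt;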
&lt;p&gt;&lt;strong&gt;PO-CKAN&lt;/strong&gt; (Physics-informed Deep Operator KAN with Chunk Rational Structure) integrates PDE residual loss into a DeepONet-style branch–trunk architecture using Chunkwise Rational KAN sub-networks (arXiv:2510.08795). On Burgers&amp;rsquo; equation with viscosity $\nu = 0.01$, PO-CKAN reduces mean relative $L^2$ error by approximately &lt;strong&gt;48%&lt;/strong&gt; compared to PI-DeepONet.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KANO&lt;/strong&gt; (Kolmogorov-Arnold Neural Operator) is the most theoretically ambitious framework, jointly parameterising operators in both &lt;strong&gt;spectral and spatial bases&lt;/strong&gt; within a pseudo-differential operator framework (arXiv:2509.16825). KANO overcomes the pure-spectral bottleneck of Fourier Neural Operators (FNO): while FNO remains practical only for spectrally sparse operators, KANO remains expressive over generic variable-coefficient PDEs. Crucially, KANO achieves &lt;strong&gt;symbolic recovery of the learned operator&lt;/strong&gt;, enabling closed-form extraction of governing equations. On the quantum Hamiltonian learning benchmark, KANO attains state infidelity $\approx 6 \times 10^{-6}$ compared to FNO&amp;rsquo;s $\approx 1.5 \times 10^{-2}$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KAN-ONets&lt;/strong&gt; embeds adaptive, learnable B-spline activations from KAN into FNO (yielding FNO-KAN for uniform grids) and into the attention-based GNOT (yielding GNOT-KAN for arbitrary grids). Across seven challenging PDE benchmarks, KAN-ONets achieves &lt;strong&gt;MSE reductions of 10.2–30.2%&lt;/strong&gt; compared to existing models.&lt;/p&gt;
&lt;h3 class="heading" id="4-time-dependent-and-evolutionary-kans"&gt;
 4. Time-Dependent and Evolutionary KANs&lt;span class="heading__anchor"&gt; &lt;a href="#4-time-dependent-and-evolutionary-kans"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;EvoKAN&lt;/strong&gt; (Evolutionary Kolmogorov-Arnold Network, March 2025) introduces a novel paradigm: rather than retraining repeatedly, EvoKAN &lt;strong&gt;encodes only the PDE&amp;rsquo;s initial state&lt;/strong&gt; during an initial learning phase, then evolves the network parameters numerically, governed by the same PDE (arXiv:2503.01618). KAN weights are treated as time-dependent functions updated through time steps, enabling prediction over arbitrarily long time horizons.&lt;/p&gt;
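&lt;p&gt;The idea of evolving parameters instead of retraining can be sketched as below, following the general evolutionary-network construction; the details of EvoKAN's update may differ. Write the PDE as $\partial_t u = \mathcal{F}[u]$ and the network as $u_{\theta(t)}(x)$, with $\mathcal{F}$, $\gamma$, and the collocation points $x_i$ being notation assumed here. The chain rule gives $\partial_t u_\theta = \nabla_\theta u_\theta \cdot \dot\theta$, so the parameter velocity is obtained from a least-squares problem:&lt;/p&gt;
&lt;p&gt;$$\dot\theta(t) = \arg\min_{\gamma} \sum_{i} \left|\nabla_\theta u_{\theta(t)}(x_i)\cdot\gamma - \mathcal{F}[u_{\theta(t)}](x_i)\right|^2.$$&lt;/p&gt;
&lt;p&gt;Any time integrator can then advance the parameters, for instance forward Euler, $\theta^{n+1} = \theta^n + \Delta t\,\dot\theta^n$, with $\theta^0$ fitted to the initial condition only.&lt;/p&gt;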
&lt;p&gt;EvoKAN integrates the &lt;strong&gt;Scalar Auxiliary Variable (SAV) method&lt;/strong&gt; to guarantee unconditional energy stability: at each time step, SAV requires only solving decoupled linear systems with constant coefficients. EvoKAN has been validated on the 1D and 2D Allen-Cahn equations (phase-field phenomena with sharp interfaces) and the 2D Navier-Stokes equations (turbulent flows), closely matching analytical references.&lt;/p&gt;
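&lt;p&gt;For reference, the classical first-order SAV scheme for a gradient flow is sketched below in function space; how EvoKAN transfers it to parameter space is not reproduced here. Assume an energy $E[u] = \tfrac{1}{2}(u, \mathcal{A}u) + E_1[u]$ with $\mathcal{A}$ linear, self-adjoint, and positive semi-definite, a constant $C_0$ chosen so that $E_1[u] + C_0$ is positive, and $U = \delta E_1/\delta u$. Introducing the scalar $r = \sqrt{E_1[u] + C_0}$, one step reads:&lt;/p&gt;
&lt;p&gt;$$\frac{u^{n+1}-u^n}{\Delta t} = -w^{n+1}, \qquad w^{n+1} = \mathcal{A}u^{n+1} + \frac{r^{n+1}}{\sqrt{E_1[u^n]+C_0}}\,U[u^n], \qquad r^{n+1}-r^n = \frac{\big(U[u^n],\,u^{n+1}-u^n\big)}{2\sqrt{E_1[u^n]+C_0}}.$$&lt;/p&gt;
&lt;p&gt;The nonlinear term is evaluated only at $u^n$, so the unknowns enter linearly with the constant-coefficient operator $\mathcal{A}$, and the modified energy $\tfrac{1}{2}(u^n, \mathcal{A}u^n) + (r^n)^2$ is non-increasing for any $\Delta t$. For the Allen-Cahn equation $\partial_t u = \epsilon^2\Delta u - (u^3 - u)$, the choices are $\mathcal{A} = -\epsilon^2\Delta$, $E_1[u] = \int \tfrac{1}{4}(u^2-1)^2\,dx$, and $U[u] = u^3 - u$.&lt;/p&gt;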
&lt;p&gt;&lt;strong&gt;KAN-ODEs&lt;/strong&gt; apply KANs as the backbone of neural ordinary differential equation (ODE) frameworks, enabling data-driven discovery of governing dynamics with greater interpretability compared to MLP-based neural ODEs (arXiv:2407.04192).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Shallow-KAN&lt;/strong&gt; addresses Stefan-type moving boundary problems (melting, solidification) by approximating the temperature distribution and moving interface while enforcing governing PDEs, phase equilibrium, and the Stefan condition through physics-informed residuals (arXiv:2601.09818). A key finding is that &lt;strong&gt;two hidden layers with tens of learnable parameters&lt;/strong&gt; suffice — far fewer than the nearly one million parameters required by standard MLP-based PINNs for the same problem.&lt;/p&gt;
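&lt;p&gt;As an illustration of the three residuals involved, the classical one-phase Stefan problem in one space dimension reads as follows. The notation ($T$ for temperature, $s(t)$ for the interface position, $T_m$ for the melting temperature, $\alpha$, $k$, $\rho$, $L$ for diffusivity, conductivity, density, and latent heat) is assumed here and may differ from the paper's test cases:&lt;/p&gt;
&lt;p&gt;$$\partial_t T = \alpha\,\partial_{xx} T \ \ \text{for } x \in (0, s(t)), \qquad T(s(t), t) = T_m, \qquad \rho L\,\dot s(t) = -k\,\partial_x T(s(t), t).$$&lt;/p&gt;
&lt;p&gt;A physics-informed solver represents both $T_\theta(x,t)$ and $s_\theta(t)$ with networks and minimises the sum of the mean squared residuals of the heat equation, the phase-equilibrium condition, and the Stefan condition, together with the initial and fixed-boundary data.&lt;/p&gt;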
&lt;h3 class="heading" id="5-discontinuities-shock-waves-and-turbulence"&gt;
 5. Discontinuities, Shock Waves, and Turbulence&lt;span class="heading__anchor"&gt; &lt;a href="#5-discontinuities-shock-waves-and-turbulence"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A known weakness of smooth neural networks is difficulty resolving &lt;strong&gt;sharp spatial transitions and discontinuities&lt;/strong&gt; such as shock waves. Two specialised frameworks address this:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DPINN&lt;/strong&gt; (Discontinuity-aware PINN) incorporates a discontinuity-aware KAN for modelling shock-wave properties, combined with an adaptive Fourier-feature embedding layer to mitigate spectral bias, mesh transformation for complex geometries, and learnable local artificial viscosity to stabilise the algorithm near discontinuities (arXiv:2507.08338). Numerical experiments on the inviscid Burgers&amp;rsquo; equation and transonic/supersonic airfoil flows demonstrate superior accuracy over existing methods.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;Physics-Infused KAN for Turbulence&lt;/strong&gt; (2026) targets turbulent flow prediction integrated with CFD, applying KAN within the Reynolds-Averaged Navier-Stokes (RANS) framework. It addresses the &lt;em&gt;information bottleneck&lt;/em&gt; phenomenon in multi-output KANs and proposes pruning-based network optimisation, achieving high prediction accuracy for Navier-Stokes equations.&lt;/p&gt;
&lt;h3 class="heading" id="6-high-dimensional-pdes-and-the-curse-of-dimensionality"&gt;
 6. High-Dimensional PDEs and the Curse of Dimensionality&lt;span class="heading__anchor"&gt; &lt;a href="#6-high-dimensional-pdes-and-the-curse-of-dimensionality"&gt;#&lt;/a&gt;&lt;/span&gt;
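&lt;/h3&gt;&lt;p&gt;High-dimensional PDEs (tens to hundreds of dimensions) are where conventional grid-based methods become intractable, because their cost grows exponentially with the dimension: a grid with just 10 points per axis already has $10^{d}$ nodes in $d$ dimensions. KANs have shown early promise here.&lt;/p&gt;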
&lt;p&gt;&lt;strong&gt;Anant-Net&lt;/strong&gt; (2025) is a scalable neural surrogate employing a tensor product formulation with dimension-wise sweeps and selective automatic differentiation (arXiv:2505.03595). Benchmarked on the Poisson, Sine-Gordon, Allen-Cahn, and transient heat equations, Anant-Net &lt;strong&gt;solves PDEs in up to 300 dimensions on a single GPU within a few hours&lt;/strong&gt;. The framework includes &lt;strong&gt;Anant-KAN&lt;/strong&gt;, an interpretable KAN-based variant offering deeper insights into the learned solution structure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Separable PIKANs (SPIKANs)&lt;/strong&gt; decompose the PDE solution into products of one-dimensional KAN networks, drastically reducing computational complexity for high-dimensional problems while retaining accuracy and interpretability.&lt;/p&gt;
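&lt;p&gt;The separable ansatz can be written as a sum of rank-one products. The rank $R$ and the factor names $f^{(r)}_i$ are notation introduced here for illustration:&lt;/p&gt;
&lt;p&gt;$$u(x_1,\ldots,x_d) \approx \sum_{r=1}^{R} \prod_{i=1}^{d} f^{(r)}_i(x_i).$$&lt;/p&gt;
&lt;p&gt;On a tensor grid with $N$ points per axis, the one-dimensional networks are evaluated at only $dN$ inputs instead of $N^d$ collocation points. For $d = 3$ and $N = 100$ this is 300 one-dimensional evaluations versus $10^6$ points, although the solution values themselves are still assembled on the full grid by outer products.&lt;/p&gt;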
&lt;h3 class="heading" id="7-data-driven-discovery-and-inverse-problems"&gt;
 7. Data-Driven Discovery and Inverse Problems&lt;span class="heading__anchor"&gt; &lt;a href="#7-data-driven-discovery-and-inverse-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;KANs are especially powerful for &lt;strong&gt;scientific discovery tasks&lt;/strong&gt; where interpretability of the learned function is critical.&lt;/p&gt;
&lt;p&gt;Data-driven model discovery with KANs has been demonstrated on complex dynamical systems — including the Ikeda map and optical-cavity systems — where sparse optimisation methods fail due to non-sparse governing equations (arXiv:2409.15167). KAN captures complex behaviour while offering interpretability through its edge-wise univariate functions, providing insight into governing dynamics inaccessible in black-box MLPs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PI-KAN-PointNet&lt;/strong&gt; extends PIKAN to simultaneously solve inverse problems over multiple irregular geometries within a single training run, demonstrated on natural convection over 135 geometries with sparse data. &lt;strong&gt;KINN for Inverse Problems&lt;/strong&gt; enables identification of unknown material parameters in heterogeneous or hyperelastic materials from partial observations. &lt;strong&gt;KANHedge&lt;/strong&gt; applies KANs to high-dimensional BSDE solvers for option pricing, demonstrating improved hedging performance over MLP-based deep BSDE solvers (arXiv:2601.11097).&lt;/p&gt;
&lt;h3 class="heading" id="8-comparative-analysis-kan-vs-mlp-for-pdes"&gt;
 8. Comparative Analysis: KAN vs. MLP for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#8-comparative-analysis-kan-vs-mlp-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A comprehensive comparison between MLP and KAN representations for differential equations establishes nuanced findings (arXiv:2406.02917):&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Architecture&lt;/th&gt;
					&lt;th&gt;Shallow Networks&lt;/th&gt;
					&lt;th&gt;Deep Networks&lt;/th&gt;
					&lt;th&gt;Robustness&lt;/th&gt;
					&lt;th&gt;Interpretability&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;KAN (B-spline)&lt;/td&gt;
					&lt;td&gt;Superior accuracy&lt;/td&gt;
					&lt;td&gt;Comparable to MLP&lt;/td&gt;
					&lt;td&gt;Lower (may diverge with different seeds)&lt;/td&gt;
					&lt;td&gt;High — symbolic extraction possible&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;KAN (Chebyshev/Legendre)&lt;/td&gt;
					&lt;td&gt;High accuracy&lt;/td&gt;
					&lt;td&gt;Competitive&lt;/td&gt;
					&lt;td&gt;Moderate — rank collapse risk&lt;/td&gt;
					&lt;td&gt;High&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;MLP/PINN&lt;/td&gt;
					&lt;td&gt;Moderate accuracy&lt;/td&gt;
					&lt;td&gt;Robust&lt;/td&gt;
					&lt;td&gt;High&lt;/td&gt;
					&lt;td&gt;Low&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;PIKAN (optimised)&lt;/td&gt;
					&lt;td&gt;Superior&lt;/td&gt;
					&lt;td&gt;Superior or comparable&lt;/td&gt;
					&lt;td&gt;Moderate&lt;/td&gt;
					&lt;td&gt;High&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Key findings: KANs in &lt;strong&gt;shallow settings significantly outperform MLPs&lt;/strong&gt;, leveraging per-edge nonlinear expressiveness. In deep settings, KANs do not consistently outperform MLPs, but when properly optimised (e.g., with quasi-Newton optimisers such as L-BFGS or Self-Scaled Broyden), they can achieve superior accuracy. &lt;strong&gt;JAX-based PIKAN implementations&lt;/strong&gt; have reported up to 84× training speedup over the original NumPy/PyTorch KANs.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="open-problems"&gt;
 Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Despite rapid progress, several challenges remain:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Computational cost.&lt;/strong&gt; Evaluating spline activations requires a recursive, multi-step computation on every edge, which makes KANs significantly slower per parameter than MLPs, whose layers reduce to dense matrix multiplications. Variants like PowerMLP propose more efficient formulations (arXiv:2412.13571), but a satisfactory solution to raw training speed at scale is still outstanding.&lt;/p&gt;
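&lt;p&gt;Assuming the standard B-spline construction on a knot sequence $t_i$, the repeated evaluation steps behind this cost come from the Cox–de Boor recursion, sketched here with $B_{i,k}$ denoting the $i$-th basis function of degree $k$:&lt;/p&gt;
&lt;p&gt;$$B_{i,0}(x) = \mathbf{1}_{[t_i,\,t_{i+1})}(x), \qquad B_{i,k}(x) = \frac{x - t_i}{t_{i+k} - t_i}\,B_{i,k-1}(x) + \frac{t_{i+k+1} - x}{t_{i+k+1} - t_{i+1}}\,B_{i+1,k-1}(x).$$&lt;/p&gt;
&lt;p&gt;A degree-$k$ basis therefore needs $k$ dependent passes per input, each built from the previous one, which is harder to fuse into a single matrix multiplication than an MLP layer.&lt;/p&gt;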
&lt;p&gt;&lt;strong&gt;Scalability to complex geometries.&lt;/strong&gt; KINN and standard PIKANs underperform MLPs on irregular geometry problems. This remains a practical bottleneck for engineering applications involving complex domains.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Gradient instability in deep KANs.&lt;/strong&gt; Deep PIKANs face vanishing/exploding gradient challenges, motivating Glorot-like initialisation strategies and residual-gated architectures.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Theoretical guarantees.&lt;/strong&gt; Generalisation bounds for KANs trained on PDE collocation have been studied — bounds scale with $\ell_1$ norms of spline coefficients — but practical understanding of how architecture choices affect convergence and generalisation remains incomplete (arXiv:2410.08026).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Operator learning completeness.&lt;/strong&gt; While KANO achieves symbolic operator recovery, the theoretical relationship between KAN architecture depth/width and approximation of PDE solution operators is still under active development.&lt;/p&gt;
&lt;p&gt;The overall trajectory is from proof-of-concept demonstrations on canonical benchmarks toward &lt;strong&gt;production-ready frameworks&lt;/strong&gt; for engineering simulation, turbulence modelling, inverse problems, and high-dimensional scientific computing. Provided the open problems above are addressed, the combination of interpretability, parameter efficiency, and growing theoretical foundations positions KANs as a potentially transformative architecture for numerical PDEs.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Abueidda, D. W., Pantidis, P., &amp;amp; Mobasher, M. E. (2024). &lt;em&gt;DeepOKAN: Deep operator network based on Kolmogorov Arnold networks for mechanics problems&lt;/em&gt;. arXiv:2405.19143. &lt;a href="https://www.alphaxiv.org/overview/2405.19143v3"&gt;https://www.alphaxiv.org/overview/2405.19143v3&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Cui, Z., et al. (2024). Physics-informed Kolmogorov–Arnold network with Chebyshev polynomials for fluid mechanics. &lt;em&gt;Physics of Fluids, 37&lt;/em&gt;(9), 095120. &lt;a href="https://pubs.aip.org/aip/pof/article-abstract/37/9/095120/3361431"&gt;https://pubs.aip.org/aip/pof/article-abstract/37/9/095120/3361431&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Knottenbelt, W., et al. (2026). &lt;em&gt;KANHedge: Efficient hedging of high-dimensional options using Kolmogorov-Arnold network-based BSDE solver&lt;/em&gt;. arXiv:2601.11097. &lt;a href="https://arxiv.org/abs/2601.11097"&gt;https://arxiv.org/abs/2601.11097&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Kovachki, N., et al. (2023). Neural operator: Learning maps between function spaces with applications to PDEs. &lt;em&gt;Journal of Machine Learning Research, 24&lt;/em&gt;(89), 1–97.&lt;/p&gt;
&lt;p&gt;Li, Z., et al. (2025). &lt;em&gt;Discontinuity-aware KAN-based physics-informed neural networks&lt;/em&gt;. arXiv:2507.08338. &lt;a href="https://arxiv.org/html/2507.08338v1"&gt;https://arxiv.org/html/2507.08338v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Liu, Z., et al. (2024). &lt;em&gt;KAN: Kolmogorov–Arnold Networks&lt;/em&gt;. arXiv:2404.19756. &lt;a href="https://storage.prod.researchhub.com/uploads/papers/2024/05/04/2404.19756.pdf"&gt;https://storage.prod.researchhub.com/uploads/papers/2024/05/04/2404.19756.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Shukla, K., Toscano, J. D., Wang, Z., Zou, Z., &amp;amp; Karniadakis, G. E. (2024). &lt;em&gt;A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks&lt;/em&gt;. arXiv:2406.02917. &lt;a href="https://arxiv.org/abs/2406.02917"&gt;https://arxiv.org/abs/2406.02917&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Liu, Z., et al. (2026). &lt;em&gt;A unified benchmark of physics-informed neural networks and Kolmogorov-Arnold networks&lt;/em&gt;. arXiv:2602.15068. &lt;a href="https://arxiv.org/html/2602.15068v1"&gt;https://arxiv.org/html/2602.15068v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Peng, W., et al. (2025). &lt;em&gt;KANO: Kolmogorov-Arnold Neural Operator&lt;/em&gt;. arXiv:2509.16825. &lt;a href="https://arxiv.org/abs/2509.16825"&gt;https://arxiv.org/abs/2509.16825&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Shukla, K., et al. (2025). &lt;em&gt;Anant-Net: Breaking the curse of dimensionality with scalable and interpretable neural surrogates for high-dimensional PDEs&lt;/em&gt;. arXiv:2505.03595. &lt;a href="https://arxiv.org/html/2505.03595v3"&gt;https://arxiv.org/html/2505.03595v3&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Tang, K., et al. (2025). &lt;em&gt;AC-PKAN: Attention-enhanced and Chebyshev polynomial-based Kolmogorov-Arnold networks&lt;/em&gt;. arXiv:2505.08687. &lt;a href="https://arxiv.org/html/2505.08687v2"&gt;https://arxiv.org/html/2505.08687v2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Wang, Z., et al. (2025). &lt;em&gt;EvoKAN: Energy-dissipative evolutionary Kolmogorov-Arnold networks for complex PDE systems&lt;/em&gt;. arXiv:2503.01618. &lt;a href="https://arxiv.org/abs/2503.01618"&gt;https://arxiv.org/abs/2503.01618&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Wang, Z., et al. (2024). Kolmogorov–Arnold-Informed neural network: A physics-informed deep learning framework for solving forward and inverse problems based on Kolmogorov–Arnold Networks. &lt;em&gt;Computer Methods in Applied Mechanics and Engineering&lt;/em&gt;. arXiv:2406.11045. &lt;a href="https://www.sciencedirect.com/science/article/abs/pii/S0045782524007722"&gt;https://www.sciencedirect.com/science/article/abs/pii/S0045782524007722&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xu, Y., et al. (2026). &lt;em&gt;Shallow-KAN based solution of moving boundary PDEs&lt;/em&gt;. arXiv:2601.09818. &lt;a href="https://arxiv.org/html/2601.09818v1"&gt;https://arxiv.org/html/2601.09818v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Koenig, B. C., Kim, S., &amp;amp; Deng, S. (2024). &lt;em&gt;KAN-ODEs: Kolmogorov-Arnold network ordinary differential equations for learning dynamical systems and hidden physics&lt;/em&gt;. arXiv:2407.04192. &lt;a href="https://arxiv.org/html/2407.04192v1"&gt;https://arxiv.org/html/2407.04192v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zhang, Z., et al. (2025). Physics-informed neural networks with hybrid Kolmogorov-Arnold networks. &lt;em&gt;PMC&lt;/em&gt;. &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC11950322/"&gt;https://pmc.ncbi.nlm.nih.gov/articles/PMC11950322/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zuo, Q., et al. (2025). &lt;em&gt;Data-driven model discovery with Kolmogorov-Arnold networks&lt;/em&gt;. arXiv:2409.15167. &lt;a href="https://arxiv.org/abs/2409.15167"&gt;https://arxiv.org/abs/2409.15167&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Recent Advances in Numerical PDEs</title><link>https://blog.namln.org/en/posts/recent-numerical-pde/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/recent-numerical-pde/</guid><description>&lt;p&gt;Numerical methods for partial differential equations (PDEs) have entered a period of rapid transformation, driven by two converging forces: deep learning&amp;rsquo;s maturation as a tool for high-dimensional function approximation, and the resurgence of classical methods augmented by machine learning. The field broadly divides into &lt;em&gt;physics-informed machine learning&lt;/em&gt;, &lt;em&gt;neural operator learning&lt;/em&gt;, &lt;em&gt;foundation models for PDEs&lt;/em&gt;, and the continuing evolution of &lt;em&gt;classical high-order&lt;/em&gt;, &lt;em&gt;structure-preserving&lt;/em&gt;, and &lt;em&gt;data-driven discovery&lt;/em&gt; methods. Quantum computing and laser-based hardware solvers are also beginning to enter the landscape. This survey organises the most active research fronts, highlights landmark and recent key papers, and identifies open problems as of early 2026.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="overview"&gt;
 Overview&lt;span class="heading__anchor"&gt; &lt;a href="#overview"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The table below summarises the major approaches covered in this survey, their representative key papers, and their current status.&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Approach&lt;/th&gt;
					&lt;th&gt;Representative Key Papers&lt;/th&gt;
					&lt;th&gt;Status&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;PINNs (adaptive/staged training)&lt;/td&gt;
					&lt;td&gt;Raissi et al. (2019); IEEE 2025 staged training; PhysicsNeMo/Modulus&lt;/td&gt;
					&lt;td&gt;Production-ready&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;KANs for PDEs&lt;/td&gt;
					&lt;td&gt;Liu et al. (2024, ICLR 2025); KINN; PI-KAN; HRKANs&lt;/td&gt;
					&lt;td&gt;Active frontier&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Fourier Neural Operators&lt;/td&gt;
					&lt;td&gt;Li et al. (2020); O-FNO (2025); ReBA accelerator&lt;/td&gt;
					&lt;td&gt;Widely adopted&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;DeepONet variants&lt;/td&gt;
					&lt;td&gt;Lu et al. (2019); L-DeepONet; Hybrid KAN-DeepONet; Quantum DeepONet&lt;/td&gt;
					&lt;td&gt;Mature + expanding&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;PDE Foundation Models&lt;/td&gt;
					&lt;td&gt;Poseidon; OmniArch; PDEformer; Geo-NeW&lt;/td&gt;
					&lt;td&gt;Emerging (2024–2026)&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Deep BSDE &amp;amp; high-dimensional&lt;/td&gt;
					&lt;td&gt;Han, Jentzen, &amp;amp; E (PNAS 2018); Deep Shotgun; DRDM; Heun-BSDE&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Data-driven PDE discovery&lt;/td&gt;
					&lt;td&gt;SINDy (Brunton et al.); GN-SINDy; Evo-SINDy; Bayesian-SINDy&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Structure-preserving methods&lt;/td&gt;
					&lt;td&gt;Hairer et al. (2006); Stochastic multisymplectic; Geo-NeW&lt;/td&gt;
					&lt;td&gt;Maturing&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;High-order FEM/DG&lt;/td&gt;
					&lt;td&gt;hp-DGFEM Boltzmann; ML-accelerated FEM; FEX-PG&lt;/td&gt;
					&lt;td&gt;Mature + augmented&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Fractional PDEs&lt;/td&gt;
					&lt;td&gt;Review (2024); O-FNO for fractional Poisson; Fractional Laplacian meshfree&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Hamilton–Jacobi PDEs&lt;/td&gt;
					&lt;td&gt;Review arXiv:2502.20833; Actor-critic NN; Deep BSDE for HJB&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Multiscale / ROM&lt;/td&gt;
					&lt;td&gt;MLP-based multiscale; POD-DL-ROM; Multi-fidelity ROM&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Uncertainty quantification&lt;/td&gt;
					&lt;td&gt;QMC/RQMC; PDE-DKL&lt;/td&gt;
					&lt;td&gt;Active&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Quantum computing&lt;/td&gt;
					&lt;td&gt;Schrödingerisation; H-DES (ColibriTD); Quantum DeepONet&lt;/td&gt;
					&lt;td&gt;Early-stage&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Photonic/analog solvers&lt;/td&gt;
					&lt;td&gt;LightSolver LPU&lt;/td&gt;
					&lt;td&gt;Very early-stage&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="the-classical-pde-problem"&gt;
 The Classical PDE Problem&lt;span class="heading__anchor"&gt; &lt;a href="#the-classical-pde-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A general PDE on a domain $\Omega \subseteq \mathbb{R}^d$ takes the form&lt;/p&gt;
&lt;p&gt;$$\mathcal{N} [u] (x) = f(x), \quad x \in \Omega, \qquad \mathcal{B} [u] (x) = g(x), \quad x \in \partial \Omega,$$&lt;/p&gt;
&lt;p&gt;where $\mathcal{N}$ is a (possibly nonlinear) differential operator, $\mathcal{B}$ encodes boundary or initial conditions, and $u: \Omega \to \mathbb{R}$ is the unknown. Classical mesh-based methods — finite element (FEM), finite difference (FDM), finite volume (FVM), and spectral methods — discretise $\Omega$ into $N$ degrees of freedom and solve a resulting algebraic system. Their complexity typically scales as $O(N^\alpha)$ for some $\alpha \geq 1$, and in $d$ dimensions $N \sim h^{-d}$ for mesh spacing $h$, leading to exponential cost as $d$ grows.&lt;/p&gt;
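&lt;p&gt;To make the scaling $N \sim h^{-d}$ concrete, the sketch below counts the unknowns of a uniform tensor-product grid. The function name &lt;code&gt;dof_count&lt;/code&gt; and the choice of the same number of points (about $1/h$) along every axis are illustrative assumptions, not taken from any particular solver.&lt;/p&gt;

```python
def dof_count(points_per_axis, dim):
    """Degrees of freedom of a uniform tensor-product grid.

    Illustrative assumption: the same number of points (about 1/h)
    is used along each of the `dim` coordinate axes.
    """
    return points_per_axis ** dim


if __name__ == "__main__":
    # h = 0.1 corresponds to 10 points per axis.
    for dim in (1, 2, 3, 6, 10):
        print(f"d = {dim:2d}: N = {dof_count(10, dim):,}")
```

&lt;p&gt;At ten points per axis, ten dimensions already require $10^{10}$ unknowns before any solver cost $O(N^\alpha)$ is paid.&lt;/p&gt;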
&lt;h3 class="heading" id="the-deep-learning-turn"&gt;
 The Deep Learning Turn&lt;span class="heading__anchor"&gt; &lt;a href="#the-deep-learning-turn"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The 2019 PINN paper by Raissi, Perdikaris, and Karniadakis, and the 2020 FNO paper by Li et al., triggered an explosion of mesh-free and operator-learning approaches. Rather than discretising $\Omega$, these methods parameterise $u$ (or the solution operator $\mathcal{N}^{-1}$) as a neural network and minimise a physics-informed or data-driven loss. The key advantages are mesh-free flexibility, natural handling of inverse problems, and — in the operator-learning setting — the ability to generalise across PDE instances.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="recent-developments"&gt;
 Recent Developments&lt;span class="heading__anchor"&gt; &lt;a href="#recent-developments"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-physics-informed-neural-networks-pinns-and-variants"&gt;
 1. Physics-Informed Neural Networks (PINNs) and Variants&lt;span class="heading__anchor"&gt; &lt;a href="#1-physics-informed-neural-networks-pinns-and-variants"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;PINNs, introduced by Raissi, Perdikaris, and Karniadakis (2019), embed physical laws directly into the neural network loss function as residual terms of the form $\mathcal{L}_{\text{phys}} = \|\mathcal{N}[\hat{u}] - f\|^2$, where $\hat{u}$ is the network approximation of $u$, supplemented by data, boundary, and initial condition constraints. Their appeal lies in a mesh-free design that handles irregular geometries and inverse problems naturally. Yet PINN training is notoriously fragile (subject to spectral bias, loss imbalance, and stiffness), which has motivated a rich line of training improvements.&lt;/p&gt;
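&lt;p&gt;Written out with the notation of the Background section, a typical composite PINN objective has the shape below. The weights $\lambda_r, \lambda_b, \lambda_d$ and the point sets $\{x_i\}$, $\{y_j\}$, $\{z_k\}$ are illustrative assumptions; concrete formulations differ in how they weight and sample each term.&lt;/p&gt;
&lt;p&gt;$$\mathcal{L}(\theta) = \frac{\lambda_r}{N_r} \sum_{i=1}^{N_r} \big| \mathcal{N}[\hat{u}_\theta](x_i) - f(x_i) \big|^2 + \frac{\lambda_b}{N_b} \sum_{j=1}^{N_b} \big| \mathcal{B}[\hat{u}_\theta](y_j) - g(y_j) \big|^2 + \frac{\lambda_d}{N_d} \sum_{k=1}^{N_d} \big| \hat{u}_\theta(z_k) - u_k \big|^2$$&lt;/p&gt;
&lt;p&gt;Here $x_i \in \Omega$ are collocation points, $y_j \in \partial \Omega$ are boundary (or initial-time) points, and $(z_k, u_k)$ are optional measurements. Loss imbalance arises when these three terms have very different magnitudes or gradients.&lt;/p&gt;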
&lt;p&gt;&lt;strong&gt;Staged training strategies.&lt;/strong&gt; A 2025 IEEE paper proposes a two-stage process: a short-time pretraining phase followed by extension to the full time domain, combined with uncertainty-guided sampling. This significantly improves accuracy and efficiency for time-dependent PDEs compared to standard PINNs (IEEE, 2025).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Evolutionary optimisation of PINNs.&lt;/strong&gt; A 2025 arXiv paper applies evolutionary optimisation to tune PINN architectures. Because the training loss enforces the governing physical laws, the resulting models stay robust when data are scarce (arXiv:2501.06572).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Automatic structure discovery via knowledge distillation.&lt;/strong&gt; A 2025 &lt;em&gt;Nature Communications&lt;/em&gt; paper proposes a physics-informed distillation framework that decouples physical and parameter regularisation in teacher–student networks, then uses clustering and parameter reconstruction to embed physically meaningful structures. Experiments on Laplace, Burgers, Poisson, and fluid mechanics equations show improved accuracy, training efficiency, and transferability (arXiv:2502.06026).&lt;/p&gt;
&lt;p&gt;Production-ready frameworks include &lt;em&gt;PhysicsNeMo/Modulus&lt;/em&gt; (CUDA-optimised kernels with 4× speedups) and &lt;em&gt;DeepXDE&lt;/em&gt;, which support adaptive weighting schemes, curriculum learning, intelligent residual point sampling, and domain decomposition for stiff problems.&lt;/p&gt;
&lt;h3 class="heading" id="2-kolmogorovarnold-networks-kans-for-pdes"&gt;
 2. Kolmogorov–Arnold Networks (KANs) for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#2-kolmogorovarnold-networks-kans-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Proposed by Liu, Wang, Vaidya et al. (2024, accepted ICLR 2025), &lt;strong&gt;KANs&lt;/strong&gt; replace fixed activation functions at MLP nodes with learnable spline-parameterised functions on each edge. This change, inspired by the Kolmogorov–Arnold representation theorem, is reported to provide faster neural scaling laws, improved interpretability, and comparable or better accuracy with far fewer parameters, especially for scientific AI tasks. The major PINN-KAN hybrid architectures are as follows:&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Architecture&lt;/th&gt;
					&lt;th&gt;PDE focus&lt;/th&gt;
					&lt;th&gt;Key claim&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;KINN&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Solid mechanics, multi-scale, singularities&lt;/td&gt;
					&lt;td&gt;Significantly outperforms MLP-PINNs in accuracy and convergence speed&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;PI-KAN&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Navier–Stokes (forward)&lt;/td&gt;
					&lt;td&gt;High prediction accuracy; addresses information bottleneck&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;HRKANs&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Poisson, Burgers&lt;/td&gt;
					&lt;td&gt;Highest fitting accuracy, lowest training time vs. KAN and ReLU-KAN&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;PIKANs&lt;/strong&gt; (adaptive grid)&lt;/td&gt;
					&lt;td&gt;Forward PDE problems&lt;/td&gt;
					&lt;td&gt;Up to 84× faster training; adaptive state transition reduces $L^2$ error by 43%&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;EvoKAN&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Complex PDE systems&lt;/td&gt;
					&lt;td&gt;Energy-dissipative; encodes only the initial state, avoiding retraining&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;&lt;strong&gt;KAN-ODEs&lt;/strong&gt;&lt;/td&gt;
					&lt;td&gt;Schrödinger, Allen–Cahn, dynamical systems&lt;/td&gt;
					&lt;td&gt;Improved performance over Neural ODEs in discovering hidden physics&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
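&lt;p&gt;The idea of learnable functions on edges can be sketched in a few lines. This is a minimal illustration, not the reference KAN implementation: it uses piecewise-linear hat functions (order-1 B-splines) on a fixed uniform grid, hand-set coefficients instead of trained ones, and hypothetical names (&lt;code&gt;hat&lt;/code&gt;, &lt;code&gt;edge_function&lt;/code&gt;, &lt;code&gt;kan_layer&lt;/code&gt;).&lt;/p&gt;

```python
def hat(x, knot, spacing):
    """Piecewise-linear (order-1 B-spline) basis function centred at `knot`."""
    return max(0.0, 1.0 - abs(x - knot) / spacing)


def edge_function(x, coeffs, knots):
    """Learnable univariate function: a weighted sum of hat functions."""
    spacing = knots[1] - knots[0]  # assumes a uniform grid of knots
    return sum(c * hat(x, g, spacing) for c, g in zip(coeffs, knots))


def kan_layer(inputs, coeffs, knots):
    """One KAN-style layer: output j sums one edge function per input i.

    coeffs[j][i] holds the coefficients of the edge from input i to output j.
    """
    return [
        sum(edge_function(x, edge, knots) for x, edge in zip(inputs, row))
        for row in coeffs
    ]


if __name__ == "__main__":
    knots = [0.0, 0.25, 0.5, 0.75, 1.0]
    # Hand-set coefficients (hypothetical, not trained): the first edge
    # interpolates x**2 at the knots, the second interpolates x.
    square_edge = [g * g for g in knots]
    identity_edge = list(knots)
    print(kan_layer([0.5, 0.25], [[square_edge, identity_edge]], knots))
```

&lt;p&gt;In training, the entries of &lt;code&gt;coeffs&lt;/code&gt; would be the parameters being optimised, so each edge learns its own activation shape rather than sharing a fixed one per node.&lt;/p&gt;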
&lt;p&gt;KANs are also being used inside &lt;strong&gt;DeepONet branch/trunk networks&lt;/strong&gt; for hybrid neural operator surrogates in porous media flows, including Darcy flow and 2D/3D multiphase problems (arXiv:2511.02962). For a deeper treatment of KAN architectures for PDEs, see the companion post in this series.&lt;/p&gt;
&lt;h3 class="heading" id="3-neural-operator-learning"&gt;
 3. Neural Operator Learning&lt;span class="heading__anchor"&gt; &lt;a href="#3-neural-operator-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Neural operators learn mappings between infinite-dimensional function spaces — enabling resolution-invariant, discretisation-agnostic PDE solvers. The two dominant architectures are the &lt;strong&gt;Fourier Neural Operator (FNO)&lt;/strong&gt; and &lt;strong&gt;Deep Operator Networks (DeepONet)&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;FNO&lt;/strong&gt; applies global convolution in Fourier space, giving resolution invariance and fast inference. The 2025 &lt;em&gt;Optimised FNO (O-FNO)&lt;/em&gt; integrates residual connections and enhanced spectral resolution for the 2D fractional Poisson equation, reporting over 98% test accuracy and outperforming both base FNO and DeepONet. A hardware/algorithm co-design accelerator, &lt;strong&gt;ReBA&lt;/strong&gt;, implements the Galerkin Transformer and reports a 34.57× speedup over CPUs and up to 51.26× over prior accelerators (IEEE, 2025).&lt;/p&gt;
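&lt;p&gt;The core of an FNO layer, a convolution carried out as a multiplication of low Fourier modes, can be sketched as follows. This is a simplified single-channel illustration under stated assumptions: a naive discrete Fourier transform replaces the FFT, the weights are real and hand-chosen rather than learned complex tensors, and the pointwise linear path and nonlinearity of a full FNO layer are omitted. The function names are hypothetical.&lt;/p&gt;

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform (quadratic cost, for clarity only)."""
    n = len(signal)
    return [
        sum(signal[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse of `dft`."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]


def spectral_conv_1d(signal, weights):
    """Scale the lowest len(weights) frequencies and discard the rest.

    weights[m] multiplies the Fourier modes with absolute frequency m.
    Real weights are an assumption made here to keep the output real.
    """
    n = len(signal)
    coeffs = dft(signal)
    filtered = []
    for k in range(n):
        freq = min(k, n - k)  # absolute frequency of coefficient k
        if freq in range(len(weights)):
            filtered.append(weights[freq] * coeffs[k])
        else:
            filtered.append(0j)
    return [value.real for value in inverse_dft(filtered)]
```

&lt;p&gt;Because the weights act on frequencies rather than on grid points, the same &lt;code&gt;weights&lt;/code&gt; list can be applied to a signal sampled on 8 points or on 16 points, which is the resolution invariance referred to above.&lt;/p&gt;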
&lt;p&gt;&lt;strong&gt;DeepONet&amp;rsquo;s&lt;/strong&gt; branch-trunk architecture excels under noise and complex geometries where FNO degrades. Recent extensions include multi-fidelity physics-guided DeepONet (2025), Fusion DeepONet for hypersonic flow predictions on arbitrary grids (arXiv:2501.01934), and &lt;strong&gt;Latent-space DeepONet (L-DeepONet)&lt;/strong&gt; (&lt;em&gt;Nature Communications&lt;/em&gt;, 2024), which is reported to outperform the other neural operators it was benchmarked against while using small latent dimensions ($d \leq 100$), enabling real-time high-dimensional predictions. Ensemble and Mixture-of-Experts DeepONets achieve 2–4× lower relative $\ell_2$ errors through basis enrichment and spatial locality (arXiv:2405.11907). &lt;strong&gt;Taylor Mode Neural Operators&lt;/strong&gt; provide an order-of-magnitude speed-up for DeepONet and 8× for FNO in computing high-order derivatives via Taylor-mode automatic differentiation.&lt;/p&gt;
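&lt;p&gt;For reference, the branch-trunk structure can be written compactly as below. The symbols $m$ (number of sensor points), $p$ (width of the shared latent layer), $b_k$ (branch outputs), and $t_k$ (trunk outputs) follow the usual DeepONet presentation and are introduced here for illustration.&lt;/p&gt;
&lt;p&gt;$$G_\theta(u)(y) \approx \sum_{k=1}^{p} b_k\big(u(x_1), \dots, u(x_m)\big)\, t_k(y)$$&lt;/p&gt;
&lt;p&gt;The branch network sees the input function $u$ only through its values at the fixed sensors $x_1, \dots, x_m$, while the trunk network takes the query location $y$, so the learned operator can be evaluated at arbitrary output points.&lt;/p&gt;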
&lt;p&gt;&lt;strong&gt;Graph Neural Operator Methods.&lt;/strong&gt; The &lt;strong&gt;GOLA framework&lt;/strong&gt; (2025) addresses the limitation of regular-grid assumptions by constructing graphs from irregularly sampled spatial points with a Fourier-based encoder for learnable complex-coefficient embeddings, outperforming baselines in data-scarce regimes across 2D Darcy, Advection, Eikonal, and Nonlinear Diffusion problems (arXiv:2505.18923).&lt;/p&gt;
&lt;h3 class="heading" id="4-foundation-models-for-pdes"&gt;
 4. Foundation Models for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#4-foundation-models-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Inspired by the success of LLMs, PDE foundation models represent a paradigm shift: large transformers pre-trained on diverse physical systems that can be fine-tuned for downstream tasks with minimal data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Poseidon&lt;/strong&gt; (ETH Zurich, 2024) is a multiscale operator transformer with time-conditioned layer norms, enabling continuous-in-time evaluation. Pre-trained on diverse physical systems, it exploits the semigroup property of time-dependent PDEs to scale up the effective amount of training data (arXiv:2405.19101).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OmniArch&lt;/strong&gt; (ICML 2025) is presented as the first multi-scale and multi-physics scientific computing foundation model, featuring a Fourier encoder-decoder and transformer backbone with a &lt;em&gt;PDE-Aligner&lt;/em&gt; for physics-informed fine-tuning. It achieves unified 1D-2D-3D pre-training on PDEBench and demonstrates zero-shot learning on new physics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PDEformer&lt;/strong&gt; (2024) represents PDEs as computational graphs integrating symbolic and numerical information; a graph transformer with implicit neural representation enables mesh-free predictions with zero-shot accuracy comparable to specialist models (arXiv:2402.12652).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Multimodal PDE Foundation Model&lt;/strong&gt; (UCLA, 2025) integrates both numerical inputs (equation parameters, initial conditions) and text descriptions. It achieves average relative error below 3.3% in-distribution and generates interpretable scientific text — bridging NLP and scientific computing (arXiv:2502.06026).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Physics-informed fine-tuning&lt;/strong&gt; (arXiv:2603.15431, 2026) establishes that hybrid fine-tuning (combining physics-informed and data-driven objectives) achieves superior extrapolation to downstream tasks and enables data-free learning of unseen PDE families.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Geo-NeW&lt;/strong&gt; (arXiv:2602.02788, Feb 2026) — General-Geometry Neural Whitney Forms — is a data-driven finite element method jointly learning differential operators and compatible finite element spaces on the geometry. It exactly preserves physical conservation laws via Finite Element Exterior Calculus, with state-of-the-art performance on out-of-distribution geometries.&lt;/p&gt;
&lt;h3 class="heading" id="5-deep-learning-for-high-dimensional-pdes"&gt;
 5. Deep Learning for High-Dimensional PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#5-deep-learning-for-high-dimensional-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Classical mesh-based methods suffer exponential complexity growth in dimension $d$. Three principal paradigms address this: the Deep BSDE method, the Deep Ritz method, and multilevel Picard approximation.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Deep BSDE method&lt;/strong&gt; (Han, Jentzen, &amp;amp; E, &lt;em&gt;PNAS&lt;/em&gt;, 2018) reformulates semilinear parabolic PDEs using backward stochastic differential equations (BSDEs) and learns the gradient of the solution with neural networks, enabling solution of PDEs in hundreds to thousands of dimensions. A 2025 review by the original authors traces subsequent advances. Key recent improvements include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Deep Shotgun Method&lt;/strong&gt; (&lt;em&gt;J. Sci. Comput.&lt;/em&gt;, 2025): avoids full trajectory simulation, using only data distribution, achieving results up to dimension 10,000 (Springer, 2025).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;XNet-enhanced Deep BSDE&lt;/strong&gt; (2025): a new network architecture with fewer parameters, significantly improving computational efficiency and accuracy (arXiv:2502.06238).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deep Random Difference Method (DRDM)&lt;/strong&gt; (2025): approximates the convection-diffusion operator using only first-order differences, avoiding Hessian computations, with proved first-order accuracy in time step $h$ (arXiv:2506.20308).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stratonovich-based BSDE with Heun integration&lt;/strong&gt; (2025): identifies that Euler-Maruyama discretisation bias is the root cause of BSDE underperformance relative to PINNs; Heun integration eliminates this bias and achieves competitive results across high-dimensional benchmarks (arXiv:2505.01078).&lt;/li&gt;
&lt;/ul&gt;
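&lt;p&gt;The forward discretisation and terminal-mismatch loss at the heart of the Deep BSDE approach can be sketched on a toy problem. Everything specific here is an assumption for illustration: the PDE $u_t + \tfrac{1}{2}\Delta u = 0$ with terminal condition $g(x) = |x|^2$, the function name &lt;code&gt;terminal_loss&lt;/code&gt;, and above all the use of the known exact gradient $2x$ in place of the neural network that the method would train.&lt;/p&gt;

```python
import math
import random


def terminal_loss(y0, dim=2, horizon=1.0, steps=50, paths=200, seed=0):
    """Monte Carlo estimate of E|Y_N - g(X_N)|^2 for a given guess y0.

    Toy problem (an assumption for illustration): u_t + 0.5 * Laplacian(u) = 0
    with terminal condition g(x) = |x|^2, started from X_0 = 0. The exact
    solution is u(t, x) = |x|^2 + dim * (horizon - t), so Z = grad u = 2x
    is used in place of the network that Deep BSDE would train.
    """
    rng = random.Random(seed)
    dt = horizon / steps
    total = 0.0
    for _ in range(paths):
        x = [0.0] * dim
        y = y0
        for _ in range(steps):
            dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(dim)]
            z = [2.0 * xi for xi in x]  # stand-in for the gradient network
            # The driver f is zero here, so Y only picks up the martingale term.
            y += sum(zi * dwi for zi, dwi in zip(z, dw))
            x = [xi + dwi for xi, dwi in zip(x, dw)]
        g = sum(xi * xi for xi in x)
        total += (y - g) ** 2
    return total / paths


if __name__ == "__main__":
    # The exact initial value is u(0, 0) = dim * horizon = 2.
    for guess in (1.0, 2.0, 3.0):
        print(guess, terminal_loss(guess))
```

&lt;p&gt;In the actual method both the initial value and the gradient at each time step are trainable, and this loss is minimised by stochastic gradient descent. The small residual loss at the exact initial value comes from the time discretisation, which is the kind of bias the Heun-based variant above targets.&lt;/p&gt;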
&lt;p&gt;The &lt;strong&gt;Deep Ritz method&lt;/strong&gt; (E &amp;amp; Yu, 2018) minimises energy functionals using neural networks. Extensions to multiscale problems leverage scale convergence theory to derive $\Gamma$-limits of oscillatory energy functionals.&lt;/p&gt;
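&lt;p&gt;A minimal sketch of the energy-minimisation idea, under strong simplifying assumptions: the problem is the 1D Poisson equation with homogeneous Dirichlet conditions on the unit interval, and a one-parameter trial function $a \sin(\pi x)$ stands in for the neural network, so the boundary conditions hold by construction and no penalty term is needed. The names &lt;code&gt;energy_gradient&lt;/code&gt; and &lt;code&gt;deep_ritz_fit&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import math
import random


def energy_gradient(a, xs):
    """d/da of the sampled Ritz energy for the trial function a*sin(pi*x).

    Energy density: 0.5 * u'(x)**2 - f(x) * u(x), with
    f(x) = pi**2 * sin(pi*x), so that the exact minimiser is a = 1.
    """
    total = 0.0
    for x in xs:
        du_da = math.pi * math.cos(math.pi * x)  # derivative of u' w.r.t. a
        u_prime = a * du_da
        f = math.pi ** 2 * math.sin(math.pi * x)
        total += u_prime * du_da - f * math.sin(math.pi * x)
    return total / len(xs)


def deep_ritz_fit(steps=200, batch=1024, lr=0.05, seed=0):
    """Stochastic gradient descent on the Monte Carlo energy estimate."""
    rng = random.Random(seed)
    a = 0.0
    for _ in range(steps):
        xs = [rng.random() for _ in range(batch)]  # fresh samples each step
        a -= lr * energy_gradient(a, xs)
    return a


if __name__ == "__main__":
    print(deep_ritz_fit())  # should settle near the exact value a = 1
```

&lt;p&gt;Resampling the quadrature points at every step mirrors how the method treats the energy integral as an expectation, which is what allows it to scale to higher-dimensional domains.&lt;/p&gt;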
&lt;p&gt;The &lt;strong&gt;Full History Recursive Multilevel Picard (MLP)&lt;/strong&gt; methodology, which combines Picard iterations with multilevel Monte Carlo (and is unrelated to the multilayer perceptrons also abbreviated MLP), was the first method proven to overcome the curse of dimensionality for semilinear parabolic PDEs and remains one of very few methods with such proven guarantees.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PDE-DKL&lt;/strong&gt; (2025) combines deep learning for low-dimensional latent representations with Gaussian Processes for kernel regression under explicit PDE constraints, providing both high accuracy and principled uncertainty quantification in limited-data regimes (arXiv:2501.18258).&lt;/p&gt;
&lt;h3 class="heading" id="6-classical-high-order-methods-fem-dg-and-spectral"&gt;
 6. Classical High-Order Methods: FEM, DG, and Spectral&lt;span class="heading__anchor"&gt; &lt;a href="#6-classical-high-order-methods-fem-dg-and-spectral"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Despite the deep learning surge, classical methods continue to mature, particularly in rigorous error analysis and efficiency.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;hp-version DG finite element method for the Boltzmann transport problem&lt;/strong&gt; (&lt;em&gt;J. Sci. Comput.&lt;/em&gt;, 2024) achieves arbitrary-order convergence rates and handles polytopic elements, enabling efficient parallel implementation within existing multigroup discrete ordinates software. High-order DG methods for unsteady compressible flows — targeting acoustic waves, turbulence, and magnetohydrodynamics — benefit from block-diagonal mass matrices allowing efficient explicit time-stepping.&lt;/p&gt;
&lt;p&gt;A systematic 2024 approach uses neural networks to learn the element-wise solution map of PDEs, accelerating finite element-type methods in an &amp;ldquo;element neural network&amp;rdquo; paradigm that generalises across element geometries. Machine learning-based spectral methods combine orthogonal function expansions (Fourier, Legendre) with deep neural operator learning for highly accurate solutions with fewer grid points.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;FEX-PG&lt;/strong&gt; (2024) solves high-dimensional partial integro-differential equations using parameter grouping to reduce coefficient count and Taylor series approximation for integral terms, achieving relative errors on the order of single-precision machine epsilon while providing &lt;em&gt;interpretable, explicit&lt;/em&gt; solution formulas absent from most DL methods (arXiv:2410.00835).&lt;/p&gt;
&lt;h3 class="heading" id="7-structure-preserving-numerical-methods"&gt;
 7. Structure-Preserving Numerical Methods&lt;span class="heading__anchor"&gt; &lt;a href="#7-structure-preserving-numerical-methods"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Structure-preserving methods retain intrinsic properties of the continuous system — symplecticity, energy conservation, divergence-free constraints — at the discrete level. They enhance numerical stability and long-term accuracy, ensuring computed solutions respect the underlying mathematical structure.&lt;/p&gt;
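&lt;p&gt;The effect is easiest to see on the harmonic oscillator with Hamiltonian $H = (p^2 + q^2)/2$. The sketch below compares explicit Euler with symplectic Euler; the test problem, step size, and function names are illustrative choices, not drawn from the works cited in this section.&lt;/p&gt;

```python
def explicit_euler(q, p, h):
    """Non-symplectic step for H = (p**2 + q**2) / 2."""
    return q + h * p, p - h * q


def symplectic_euler(q, p, h):
    """Symplectic step: update p first, then q with the new p."""
    p_new = p - h * q
    return q + h * p_new, p_new


def energy_after(step, n_steps=1000, h=0.1):
    """Energy of the oscillator after n_steps, starting from (q, p) = (1, 0)."""
    q, p = 1.0, 0.0
    for _ in range(n_steps):
        q, p = step(q, p, h)
    return 0.5 * (p * p + q * q)


if __name__ == "__main__":
    print("explicit Euler  :", energy_after(explicit_euler))
    print("symplectic Euler:", energy_after(symplectic_euler))
```

&lt;p&gt;Explicit Euler multiplies $p^2 + q^2$ by $1 + h^2$ at every step, so the energy grows without bound. Symplectic Euler exactly conserves the nearby quantity $p^2 + q^2 - h\,pq$, which keeps the true energy within a few percent of its initial value $0.5$ for all time.&lt;/p&gt;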
&lt;p&gt;Recent research encompasses geometric integrators and mimetic discretisations for conservative finite element, difference, and volume schemes; stochastic multisymplectic PDEs and their structure-preserving discretisations (&lt;em&gt;Studies in Applied Mathematics&lt;/em&gt;, 2025); and structure-preserving learning via the Geo-NeW model, which exactly preserves physical conservation laws through Finite Element Exterior Calculus. A 2024 University of Maryland workshop identified integration of structure-preserving methods with uncertainty quantification as a key open problem.&lt;/p&gt;
&lt;h3 class="heading" id="8-data-driven-pde-discovery"&gt;
 8. Data-Driven PDE Discovery&lt;span class="heading__anchor"&gt; &lt;a href="#8-data-driven-pde-discovery"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;SINDy&lt;/strong&gt; and its extensions use sparse regression over a dictionary of candidate functions. &lt;strong&gt;GN-SINDy&lt;/strong&gt; (2024–2026) addresses high dimensionality and large datasets by combining Q-DEIM greedy sampling, differentiable surrogate modelling, and sparse regression, showing robustness on Burgers, Allen-Cahn, and KdV equations. &lt;strong&gt;Evo-SINDy&lt;/strong&gt; (ACM, 2025) uses multi-population co-evolutionary algorithms for universal PDE identification. &lt;strong&gt;Bayesian-SINDy&lt;/strong&gt; quantifies parameter uncertainty robustly (arXiv:2402.15357).&lt;/p&gt;
&lt;p&gt;On the neural-symbolic front, &lt;strong&gt;Mechanistic PDE Networks&lt;/strong&gt; (arXiv:2502.18377, 2025) represent spatiotemporal data as space-time dependent linear PDEs within neural network hidden representations, then solve and decode for specific tasks. &lt;strong&gt;MORL4PDEs&lt;/strong&gt; (&lt;em&gt;Chaos Solitons Fractals&lt;/em&gt;, 2024) uses reinforcement learning and genetic algorithms for symbolic PDE regression without pre-specified candidate libraries. The &lt;strong&gt;Physics-Informed Information Criterion (PIC)&lt;/strong&gt; (&lt;em&gt;Research&lt;/em&gt;, 2022) selects the most appropriate PDE from candidates by incorporating symmetry constraints.&lt;/p&gt;
&lt;h3 class="heading" id="9-hamiltonjacobi-pdes"&gt;
 9. Hamilton–Jacobi PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#9-hamiltonjacobi-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Hamilton–Jacobi (HJ) PDEs govern optimal control, level-set methods, and front propagation. A comprehensive 2025 review (arXiv:2502.20833) covers grid-based methods, representation formula methods, Monte Carlo via Laplace&amp;rsquo;s method, and deep learning approaches. Key deep learning advances include actor-critic neural network frameworks for static HJ equations (convergence analysed in 2024), and variational methods that solve HJ PDEs up to 100 dimensions with relative errors of 1–5%. Deep BSDE methods naturally apply to Hamilton-Jacobi-Bellman (HJB) equations arising in stochastic optimal control.&lt;/p&gt;
&lt;h3 class="heading" id="10-fractional-and-non-local-pdes"&gt;
 10. Fractional and Non-Local PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#10-fractional-and-non-local-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Fractional-order derivatives model anomalous diffusion, viscoelastic behaviour, and memory effects that integer-order PDEs cannot capture. Recent advances include semi-analytical methods (Adomian Decomposition, Variational Iteration) applied to 3D time-fractional diffusion, telegraph, and wave equations; a 2024 comprehensive review of fractional stochastic PDEs covering the latest numerical methods and practical implementations; the Optimised FNO (O-FNO, 2025) achieving 98%+ test accuracy for fractional Poisson equations; and a 2025 meshfree finite difference scheme for the fractional Laplacian on arbitrary bounded domains.&lt;/p&gt;
&lt;h3 class="heading" id="11-multiscale-methods-and-model-order-reduction"&gt;
 11. Multiscale Methods and Model Order Reduction&lt;span class="heading__anchor"&gt; &lt;a href="#11-multiscale-methods-and-model-order-reduction"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The 2024 &lt;em&gt;Numerical Multiscale Methods&lt;/em&gt; dissertation establishes an equivalence between time averaging and space homogenisation, and extends Deep Ritz to multiscale problems via scale convergence theory. &lt;strong&gt;Multi-fidelity reduced order models&lt;/strong&gt; for PDE-constrained optimisation (arXiv:2503.21252, 2025) use a hierarchical trust region algorithm with active learning, constructing a full/reduced/ML model hierarchy on-the-fly. &lt;strong&gt;POD-DL-ROMs&lt;/strong&gt; (Politecnico di Milano, 2024) combine proper orthogonal decomposition with autoencoder architectures for nonlinear parametric PDEs, providing a mathematically rigorous framework enhancing accuracy of reduced models.&lt;/p&gt;
&lt;h3 class="heading" id="12-uncertainty-quantification-and-stochastic-pdes"&gt;
 12. Uncertainty Quantification and Stochastic PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#12-uncertainty-quantification-and-stochastic-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Quasi-Monte Carlo (QMC) methods&lt;/strong&gt; achieve faster convergence than Monte Carlo for smooth integrands. A 2024 paper studies QMC with generalised Gaussian random variables and Gevrey regular inputs, relaxing the standard uniform boundedness assumption, and jointly analyses the dimension truncation, FEM, and QMC errors for randomly shifted rank-1 lattice rules (arXiv:2411.03793). &lt;strong&gt;Randomised QMC (RQMC)&lt;/strong&gt; with scrambled Sobol&amp;rsquo; sequences achieves smaller bias and RMSE than Monte Carlo for risk-averse optimisation (arXiv:2408.02842). A 2024 ICERM semester at Brown University (&amp;ldquo;Numerical PDEs: Analysis, Algorithms, and Data Challenges&amp;rdquo;) served as a major gathering point for researchers integrating uncertainty quantification with PDE methods.&lt;/p&gt;
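&lt;p&gt;A randomly shifted rank-1 lattice rule is short enough to sketch directly. The example below is illustrative only: the two-dimensional smooth periodic integrand, the Fibonacci lattice with $n = 987$ points and generating vector $(1, 610)$, and all function names are assumptions chosen so that the exact integral is available for comparison.&lt;/p&gt;

```python
import math
import random


def integrand(x, y):
    """Smooth periodic test function on the unit square."""
    return math.exp(math.cos(2 * math.pi * x) + math.cos(2 * math.pi * y))


def exact_integral(terms=30):
    """I_0(1)**2, with the Bessel function I_0(1) summed from its series."""
    bessel = sum(0.25 ** k / math.factorial(k) ** 2 for k in range(terms))
    return bessel ** 2


def monte_carlo(n, rng):
    """Plain Monte Carlo with n independent uniform samples."""
    return sum(integrand(rng.random(), rng.random()) for _ in range(n)) / n


def shifted_lattice(n, z, rng):
    """Rank-1 lattice rule with generating vector (1, z) and a random shift."""
    sx, sy = rng.random(), rng.random()
    return sum(
        integrand((i / n + sx) % 1.0, (i * z / n + sy) % 1.0) for i in range(n)
    ) / n


if __name__ == "__main__":
    rng = random.Random(0)
    exact = exact_integral()
    print("lattice error    :", abs(shifted_lattice(987, 610, rng) - exact))
    print("Monte Carlo error:", abs(monte_carlo(987, rng) - exact))
```

&lt;p&gt;For an integrand this smooth, the lattice rule is accurate to roughly rounding error with the same 987 evaluations that leave plain Monte Carlo with an error governed by its $n^{-1/2}$ rate. Less regular integrands, such as the PDE quantities of interest studied in the papers above, converge more slowly.&lt;/p&gt;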
&lt;h3 class="heading" id="13-quantum-and-photonic-computing-for-pdes"&gt;
 13. Quantum and Photonic Computing for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#13-quantum-and-photonic-computing-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Schrödingerisation&lt;/strong&gt; techniques convert general linear PDEs into Schrödinger-type equations via the &amp;ldquo;warped phase transformation,&amp;rdquo; enabling direct quantum Hamiltonian simulation. A 2024 &lt;em&gt;Quantum&lt;/em&gt; journal paper provides explicit quantum circuit implementations for the heat and advection equations, with a complexity analysis indicating a quantum advantage in high dimensions. &lt;strong&gt;ColibriTD&amp;rsquo;s H-DES&lt;/strong&gt; (March 2025) was reported as the first real-hardware solution of a PDE via variational quantum algorithm, executing on IBM&amp;rsquo;s 156-qubit Heron R2 processor for the inviscid Burgers&amp;rsquo; equation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LightSolver&amp;rsquo;s Laser Processing Unit (LPU)&lt;/strong&gt; (announced September 2025) is reported to map and solve PDEs directly, with constant-time iteration steps independent of problem size. The company claims up to 100× speed gains over GPU solvers and has announced a partnership with Ansys for engineering integration.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="open-problems"&gt;
 Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;PINN training stability.&lt;/strong&gt; Despite many improvements, PINN training remains fragile for stiff and multi-scale problems. A general theory of loss landscape conditioning and principled hyperparameter selection is lacking.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Neural operator generalisation theory.&lt;/strong&gt; While FNO and DeepONet generalise empirically across PDE instances, rigorous approximation-theoretic guarantees relating operator-learning error to network width, depth, and training data remain incomplete.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Foundation model reliability and extrapolation.&lt;/strong&gt; PDE foundation models show impressive zero-shot accuracy within their pre-training distribution, but their failure modes on out-of-distribution physics — and the extent to which physics-informed fine-tuning can compensate — are not yet well understood.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;High-dimensional solvers beyond parabolic PDEs.&lt;/strong&gt; The Deep BSDE method and MLP method primarily address semilinear parabolic PDEs. Extending their curse-of-dimensionality guarantees to elliptic, hyperbolic, or fully nonlinear PDEs remains largely open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Structure-preserving deep learning.&lt;/strong&gt; Integrating conservation laws and geometric structure (symplecticity, divergence-free constraints) into neural PDE solvers at scale — beyond the Geo-NeW approach for specific exterior calculus structures — is an active and unresolved challenge.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quantum hardware advantage.&lt;/strong&gt; Near-term quantum devices face noise and connectivity limitations that restrict their practical advantage over classical HPC for PDE solving. Demonstrating genuine quantum speedup for industrially relevant PDEs on real hardware remains an open goal.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Brunton, S. L., Proctor, J. L., &amp;amp; Kutz, J. N. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems. &lt;em&gt;PNAS, 113&lt;/em&gt;(15), 3932–3937.&lt;/p&gt;
&lt;p&gt;ColibriTD. (2025, March). &lt;em&gt;H-DES: First real-hardware PDE solver via variational quantum algorithm&lt;/em&gt;. The Quantum Insider. &lt;a href="https://thequantuminsider.com/2025/03/25/colibritd-announces-h-des-pde-solver-as-a-step-toward-accessible-quantum-simulation-in-engineering/"&gt;https://thequantuminsider.com/2025/03/25/colibritd-announces-h-des-pde-solver-as-a-step-toward-accessible-quantum-simulation-in-engineering/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;E, W., &amp;amp; Yu, B. (2018). The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. &lt;em&gt;Communications in Mathematics and Statistics, 6&lt;/em&gt;(1), 1–12.&lt;/p&gt;
&lt;p&gt;E, W., Han, J., &amp;amp; Jentzen, A. (2022). Algorithms for solving high dimensional PDEs: From nonlinear Monte Carlo to machine learning. &lt;em&gt;Nonlinearity, 35&lt;/em&gt;(1), 278.&lt;/p&gt;
&lt;p&gt;Han, J., Jentzen, A., &amp;amp; E, W. (2018). Solving high-dimensional partial differential equations using deep learning. &lt;em&gt;PNAS, 115&lt;/em&gt;(34), 8505–8510. &lt;a href="https://www.pnas.org/doi/10.1073/pnas.1718942115"&gt;https://www.pnas.org/doi/10.1073/pnas.1718942115&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Han, J., Jentzen, A., &amp;amp; E, W. (2025). &lt;em&gt;A brief review of the Deep BSDE method for solving high-dimensional partial differential equations&lt;/em&gt;. arXiv:2505.17032. &lt;a href="https://arxiv.org/abs/2505.17032"&gt;https://arxiv.org/abs/2505.17032&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hu, J., Jin, S., Liu, N., &amp;amp; Zhang, L. (2024). Quantum circuits for partial differential equations via Schrödingerisation. &lt;em&gt;Quantum, 8&lt;/em&gt;, 1563. &lt;a href="https://quantum-journal.org/papers/q-2024-12-12-1563/"&gt;https://quantum-journal.org/papers/q-2024-12-12-1563/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;IEEE. (2025). &lt;em&gt;A staged training approach for physics-informed neural networks in solving partial differential equations&lt;/em&gt;. &lt;a href="https://ieeexplore.ieee.org/document/11172661/"&gt;https://ieeexplore.ieee.org/document/11172661/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;IEEE. (2025). &lt;em&gt;Higher-order-ReLU-KANs (HRKANs) for solving physics-informed neural networks more accurately, robustly and faster&lt;/em&gt;. &lt;a href="https://ieeexplore.ieee.org/document/11105234/"&gt;https://ieeexplore.ieee.org/document/11105234/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;IEEE. (2025). &lt;em&gt;ReBA: A hybrid sparse reconfigurable butterfly accelerator for solving PDEs via hardware and algorithm co-design&lt;/em&gt;. &lt;a href="https://ieeexplore.ieee.org/document/11044078/"&gt;https://ieeexplore.ieee.org/document/11044078/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;IEEE. (2025). &lt;em&gt;An optimized Fourier neural operator for the 2D fractional Poisson equation&lt;/em&gt;. &lt;a href="https://ieeexplore.ieee.org/document/11405135/"&gt;https://ieeexplore.ieee.org/document/11405135/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Li, Z., et al. (2020). &lt;em&gt;Fourier neural operator for parametric partial differential equations&lt;/em&gt;. arXiv:2010.08895.&lt;/p&gt;
&lt;p&gt;LightSolver. (2025, September). &lt;em&gt;LightSolver announces advance in physical modeling on the LPU&lt;/em&gt;. The Quantum Insider. &lt;a href="https://thequantuminsider.com/2025/09/16/lightsolver-announces-advance-in-physical-modeling-on-the-lpu-and-new-roadmap-for-optical-analog-pde-solving/"&gt;https://thequantuminsider.com/2025/09/16/lightsolver-announces-advance-in-physical-modeling-on-the-lpu-and-new-roadmap-for-optical-analog-pde-solving/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Liu, Z., et al. (2024). &lt;em&gt;KAN: Kolmogorov-Arnold Networks&lt;/em&gt;. arXiv:2404.19756. ICLR 2025. &lt;a href="https://arxiv.org/abs/2404.19756"&gt;https://arxiv.org/abs/2404.19756&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Lu, L., Jin, P., Pang, G., Zhang, Z., &amp;amp; Karniadakis, G. E. (2021). Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. &lt;em&gt;Nature Machine Intelligence, 3&lt;/em&gt;, 218–229.&lt;/p&gt;
&lt;p&gt;Lu, L., et al. (2024). Learning nonlinear operators in latent spaces for real-time predictions of complex dynamics in physical systems. &lt;em&gt;Nature Communications&lt;/em&gt;. &lt;a href="https://www.nature.com/articles/s41467-024-49411-w"&gt;https://www.nature.com/articles/s41467-024-49411-w&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;McCabe, M., et al. (2025). &lt;em&gt;Poseidon: Efficient foundation models for PDEs&lt;/em&gt;. arXiv:2405.19101. &lt;a href="https://arxiv.org/html/2405.19101v2"&gt;https://arxiv.org/html/2405.19101v2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Peng, W., et al. (2025). &lt;em&gt;OmniArch: Building foundation model for scientific computing&lt;/em&gt;. ICML 2025. &lt;a href="https://icml.cc/virtual/2025/poster/45099"&gt;https://icml.cc/virtual/2025/poster/45099&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Raissi, M., Perdikaris, P., &amp;amp; Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. &lt;em&gt;Journal of Computational Physics, 378&lt;/em&gt;, 686–707.&lt;/p&gt;
&lt;p&gt;Shi, Z., et al. (2025). &lt;em&gt;Physics-informed fine-tuning of foundation models for partial differential equations&lt;/em&gt;. arXiv:2603.15431. &lt;a href="https://arxiv.org/html/2603.15431v1"&gt;https://arxiv.org/html/2603.15431v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Wang, S., et al. (2025). &lt;em&gt;Geo-NeW: Structure-preserving learning improves geometry generalization in PDEs&lt;/em&gt;. arXiv:2602.02788. &lt;a href="https://arxiv.org/abs/2602.02788"&gt;https://arxiv.org/abs/2602.02788&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Wang, Z., et al. (2024). Kolmogorov–Arnold-Informed neural network: A physics-informed deep learning framework for solving forward and inverse problems. &lt;em&gt;Computer Methods in Applied Mechanics and Engineering&lt;/em&gt;. &lt;a href="https://linkinghub.elsevier.com/retrieve/pii/S0045782524007722"&gt;https://linkinghub.elsevier.com/retrieve/pii/S0045782524007722&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xiao, P., et al. (2025). Quantum DeepONet: Neural operators accelerated by quantum computing. &lt;em&gt;Quantum, 9&lt;/em&gt;, 1761. &lt;a href="https://quantum-journal.org/papers/q-2025-06-04-1761/"&gt;https://quantum-journal.org/papers/q-2025-06-04-1761/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xie, Z., et al. (2025). &lt;em&gt;Anant-Net: Breaking the curse of dimensionality with scalable and interpretable neural surrogates&lt;/em&gt;. arXiv:2505.03595. &lt;a href="https://arxiv.org/html/2505.03595v3"&gt;https://arxiv.org/html/2505.03595v3&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xie, Z., et al. (2025). A deep shotgun method for solving high-dimensional parabolic partial differential equations. &lt;em&gt;Journal of Scientific Computing&lt;/em&gt;. &lt;a href="https://link.springer.com/10.1007/s10915-025-02983-1"&gt;https://link.springer.com/10.1007/s10915-025-02983-1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Xu, K., &amp;amp; Darve, E. (2025). &lt;em&gt;Integration matters for learning PDEs with backwards SDEs&lt;/em&gt;. arXiv:2505.01078. &lt;a href="https://arxiv.org/abs/2505.01078"&gt;https://arxiv.org/abs/2505.01078&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zeng, Q., et al. (2025). Automatic network structure discovery of physics informed neural networks via knowledge distillation. &lt;em&gt;Nature Communications&lt;/em&gt;. &lt;a href="https://www.nature.com/articles/s41467-025-64624-3"&gt;https://www.nature.com/articles/s41467-025-64624-3&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zhang, Y., et al. (2024). &lt;em&gt;PDEformer: Towards a foundation model for one-dimensional partial differential equations&lt;/em&gt;. arXiv:2402.12652. &lt;a href="http://arxiv.org/pdf/2402.12652.pdf"&gt;http://arxiv.org/pdf/2402.12652.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Zhang, Y., et al. (2025). &lt;em&gt;A multimodal PDE foundation model for prediction and scientific text descriptions&lt;/em&gt;. arXiv:2502.06026. &lt;a href="https://arxiv.org/abs/2502.06026"&gt;https://arxiv.org/abs/2502.06026&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Recent Advances in Steady States of Navier-Stokes Equations</title><link>https://blog.namln.org/en/posts/ss-nse/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/ss-nse/</guid><description>&lt;p&gt;The study of steady-state and self-similar solutions of the incompressible Navier-Stokes equations (NSE) has undergone remarkable progress in the 2020s. This post surveys landmark results from 2024–2026 touching on existence, uniqueness, classification, and stability of such solutions. The stationary (steady) NSE in $\mathbb{R}^3$ reads:&lt;/p&gt;
&lt;p&gt;$$-\nu \Delta u + (u \cdot \nabla) u + \nabla p = 0, \quad \operatorname{div} u = 0.$$&lt;/p&gt;
&lt;p&gt;A central object of the self-similar theory is the class of &lt;strong&gt;$(-1)$-homogeneous&lt;/strong&gt; (scale-invariant) solutions: a function $u$ is $(-1)$-homogeneous if $u(\lambda x) = \lambda^{-1} u(x)$ for all $\lambda &amp;gt; 0$. Fields of this type are exactly the scale-invariant initial data from which forward self-similar solutions $u(x,t) = t^{-1/2} U(x/\sqrt{t})$ of the time-dependent NSE emanate; the profile $U$ is a different object, which solves a rescaled stationary system.&lt;/p&gt;
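&lt;p&gt;As a short check of why $(-1)$-homogeneity is the scale-invariant class, consider the following rescaling; the symbols $u_\lambda$ and $p_\lambda$ are notation introduced here for illustration. If $(u, p)$ solves the stationary NSE, then for every $\lambda &amp;gt; 0$ so does&lt;/p&gt;
&lt;p&gt;$$u_\lambda(x) = \lambda\, u(\lambda x), \qquad p_\lambda(x) = \lambda^2\, p(\lambda x),$$&lt;/p&gt;
&lt;p&gt;because each of $\nu \Delta u_\lambda$, $(u_\lambda \cdot \nabla) u_\lambda$, and $\nabla p_\lambda$ equals $\lambda^3$ times the corresponding term of $(u, p)$ evaluated at $\lambda x$. The condition $u(\lambda x) = \lambda^{-1} u(x)$ says precisely that $u_\lambda = u$ for all $\lambda &amp;gt; 0$.&lt;/p&gt;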
&lt;hr&gt;
&lt;h2 class="heading" id="overview"&gt;
 Overview&lt;span class="heading__anchor"&gt; &lt;a href="#overview"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Five landmark results define the frontier of this area in 2024–2026:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Non-uniqueness of Leray–Hopf solutions&lt;/strong&gt; via a computer-assisted proof in the self-similar framework (Hou, Wang, &amp;amp; Yang, 2025).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Forward self-similar solutions in 2D&lt;/strong&gt; for arbitrarily large initial data (Albritton, Guillod, Korobkov, &amp;amp; Ren, 2026).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Existence of self-similar solutions in high dimensions&lt;/strong&gt; ($4 \leq n \leq 16$) without smallness conditions (Bang, Gui, Liu, Wang, &amp;amp; Xie, 2025).&lt;/li&gt;
&lt;li&gt;Sharp &lt;strong&gt;removable singularity results&lt;/strong&gt; for $(-1)$-homogeneous solutions with singular rays (Li, Li, &amp;amp; Yan, 2024).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Steady NSE in junction domains&lt;/strong&gt; with no smallness assumption on the fluxes (Gazzola, Korobkov, Ren, &amp;amp; Sperone, 2025).&lt;/li&gt;
&lt;/ol&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Paper&lt;/th&gt;
					&lt;th&gt;Authors&lt;/th&gt;
					&lt;th&gt;Contribution&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2410.11170&lt;/td&gt;
					&lt;td&gt;Li, Li, Yan&lt;/td&gt;
					&lt;td&gt;Optimal removable singularity for $(-1)$-homogeneous solutions&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2412.07283&lt;/td&gt;
					&lt;td&gt;Bang, Gui, Liu, Wang, Xie&lt;/td&gt;
					&lt;td&gt;Self-similar solutions in 2D sector: existence/non-uniqueness&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2505.14642&lt;/td&gt;
					&lt;td&gt;Gazzola, Korobkov, Ren, Sperone&lt;/td&gt;
					&lt;td&gt;Steady NSE in junction channels, non-small fluxes&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2509.25116&lt;/td&gt;
					&lt;td&gt;Hou, Wang, Yang&lt;/td&gt;
					&lt;td&gt;&lt;strong&gt;First rigorous proof of non-uniqueness of Leray–Hopf solutions&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2510.10488&lt;/td&gt;
					&lt;td&gt;Bang, Gui, Liu, Wang, Xie&lt;/td&gt;
					&lt;td&gt;$(-1)$-homogeneous solutions, dimensions $4 \leq n \leq 16$&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2601.03161&lt;/td&gt;
					&lt;td&gt;Albritton, Guillod, Korobkov, Ren&lt;/td&gt;
					&lt;td&gt;Forward self-similar solutions, 2D, large data&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2601.03833&lt;/td&gt;
					&lt;td&gt;Gui, Liu, Xie&lt;/td&gt;
					&lt;td&gt;Global existence of 2D forward self-similar solutions&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;arXiv:2602.19846&lt;/td&gt;
					&lt;td&gt;Fujii&lt;/td&gt;
					&lt;td&gt;Sharp uniqueness/non-uniqueness in critical Besov spaces&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="landau-solutions-and-šveráks-classification"&gt;
 Landau Solutions and Šverák&amp;rsquo;s Classification&lt;span class="heading__anchor"&gt; &lt;a href="#landau-solutions-and-%c5%a1ver%c3%a1ks-classification"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;In 1944, Landau discovered a three-parameter explicit family of $(-1)$-homogeneous axisymmetric no-swirl solutions of the 3D stationary NSE. Known as &lt;strong&gt;Landau solutions&lt;/strong&gt;, they are parameterized by vectors $b \in \mathbb{R}^3$ and represent fluid jets emanating from the origin. A seminal result of Šverák (2006) established that all $(-1)$-homogeneous solutions smooth on $\mathbb{S}^2$ must be Landau solutions — the only scale-invariant flows without singularities on the sphere.&lt;/p&gt;
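&lt;p&gt;For concreteness, one common way to write the velocity field of these solutions is sketched below. The normalization $\nu = 1$, the choice of the $x_3$-axis as jet axis, and the parameter name $a$ are assumptions of this sketch rather than notation taken from the works cited. In spherical coordinates $(r, \theta, \phi)$ with $\theta$ measured from the axis, the velocity has no $\phi$-component and&lt;/p&gt;
&lt;p&gt;$$u_r = \frac{2}{r}\left[\frac{a^2 - 1}{(a - \cos\theta)^2} - 1\right], \qquad u_\theta = -\frac{2 \sin\theta}{r\,(a - \cos\theta)}, \qquad |a| &amp;gt; 1.$$&lt;/p&gt;
&lt;p&gt;Both components scale like $1/r$, which is the $(-1)$-homogeneity, and a direct computation gives $\frac{1}{r^2}\partial_r(r^2 u_r) + \frac{1}{r \sin\theta}\partial_\theta(\sin\theta\, u_\theta) = 0$, so the field is divergence-free. The scalar $a$ together with the direction of the axis accounts for the three parameters.&lt;/p&gt;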
&lt;h3 class="heading" id="forward-self-similar-solutions"&gt;
 Forward Self-Similar Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#forward-self-similar-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A &lt;strong&gt;forward self-similar solution&lt;/strong&gt; takes the form&lt;/p&gt;
&lt;p&gt;$$u(x, t) = \frac{1}{\sqrt{t}} U\!\left(\frac{x}{\sqrt{t}}\right),$$&lt;/p&gt;
&lt;p&gt;where the self-similar profile $U$ solves the stationary scaled NSE. The seminal work of Jia and Šverák (2014) showed that every $(-1)$-homogeneous initial datum that is smooth away from the origin admits at least one global self-similar solution, even for &lt;strong&gt;large data&lt;/strong&gt;, that is, without any smallness restriction. Existence is proved via the Leray–Schauder continuation theorem rather than a fixed-point contraction, which is available only for small data (Jia &amp;amp; Šverák, 2015).&lt;/p&gt;
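&lt;p&gt;To make the system satisfied by the profile concrete, the following substitution is a sketch; the similarity variable $y = x/\sqrt{t}$ and the pressure ansatz $p(x,t) = t^{-1} P(y)$ are notation assumed here. Each term of $\partial_t u - \nu \Delta u + (u \cdot \nabla) u + \nabla p = 0$ then carries the common factor $t^{-3/2}$, with $\partial_t u = -\tfrac{1}{2}\, t^{-3/2} \left(U + (y \cdot \nabla) U\right)$, and dividing by that factor leaves&lt;/p&gt;
&lt;p&gt;$$-\nu \Delta U - \frac{1}{2} U - \frac{1}{2} (y \cdot \nabla) U + (U \cdot \nabla) U + \nabla P = 0, \quad \operatorname{div} U = 0.$$&lt;/p&gt;
&lt;p&gt;Compared with the stationary NSE, the only new terms are $-\tfrac{1}{2} U - \tfrac{1}{2} (y \cdot \nabla) U$, which come from the time derivative.&lt;/p&gt;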
&lt;p&gt;&lt;strong&gt;Discretely self-similar&lt;/strong&gt; (DSS) solutions, where $u(\lambda x, \lambda^2 t) = \lambda^{-1} u(x,t)$ for a specific $\lambda &amp;gt; 1$, were constructed for large data by Tsai (2014).&lt;/p&gt;
&lt;h3 class="heading" id="classification-of--1-homogeneous-solutions"&gt;
 Classification of $(-1)$-Homogeneous Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#classification-of--1-homogeneous-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Tian and Xin (1998) proved that all $(-1)$-homogeneous axisymmetric solutions with exactly one singularity must be Landau solutions. A key series of papers by Li, Li, and Yan (2016–2023) classified all $(-1)$-homogeneous axisymmetric no-swirl solutions with singularities at both the north and south poles of $\mathbb{S}^2$, parameterizing them as a four-dimensional surface with boundary. They also constructed the first &lt;strong&gt;non-axisymmetric&lt;/strong&gt; $(-1)$-homogeneous solutions with swirl using the Weierstrass representation of minimal surfaces.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="recent-developments"&gt;
 Recent Developments&lt;span class="heading__anchor"&gt; &lt;a href="#recent-developments"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-removable-singularity-theorem-li-li--yan-2024"&gt;
 1. Removable Singularity Theorem (Li, Li, &amp;amp; Yan, 2024)&lt;span class="heading__anchor"&gt; &lt;a href="#1-removable-singularity-theorem-li-li--yan-2024"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;One of the sharpest results of 2024 is the &lt;strong&gt;removable singularity theorem&lt;/strong&gt; proved by Li, Li, and Yan (arXiv:2410.11170, to appear in &lt;em&gt;Trans. Amer. Math. Soc.&lt;/em&gt;): any local $(-1)$-homogeneous solution $u$ near a potential singular ray through $P \in \mathbb{S}^2$ extends smoothly across $P$, &lt;strong&gt;provided&lt;/strong&gt; $u = o(\ln \operatorname{dist}(x, P))$ on $\mathbb{S}^2$ as $x \to P$.&lt;/p&gt;
&lt;p&gt;The result is &lt;strong&gt;sharp&lt;/strong&gt;: for any $\alpha &amp;gt; 0$, there exist local solutions with $|u(x)| / \ln |x'| \to -\alpha$ as $x \to P$, so the little-$o$ hypothesis cannot be weakened to a big-$O$ bound. The paper also establishes existence of solutions with any finite number of singularities located arbitrarily on $\mathbb{S}^2$. A companion survey by Li and Yan (arXiv:2509.07243, Sep 2025) provides a state-of-the-art exposition of this topic.&lt;/p&gt;
&lt;h3 class="heading" id="2-self-similar-solutions-in-high-dimensions-bang-et-al-2025"&gt;
 2. Self-Similar Solutions in High Dimensions (Bang et al., 2025)&lt;span class="heading__anchor"&gt; &lt;a href="#2-self-similar-solutions-in-high-dimensions-bang-et-al-2025"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Bang, Gui, Liu, Wang, and Xie (arXiv:2510.10488, Oct 2025) proved existence of $(-1)$-homogeneous solutions to the steady NSE in &lt;strong&gt;high spatial dimensions&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For any $(-3)$-homogeneous, locally Lipschitz external force on $\mathbb{R}^n \setminus \{0\}$ with $4 \leq n \leq 16$, the steady NSE admit at least one $(-1)$-homogeneous solution that is scale-invariant and regular away from the origin.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Global uniqueness&lt;/strong&gt; holds when the external force is small. The key novelty is a &lt;strong&gt;dimension-reduction effect&lt;/strong&gt; from self-similarity: integral estimates of the positive part of the total head pressure enable energy estimates even in the supercritical dimension regime. For forces with only a nonnegative radial component, existence extends to &lt;strong&gt;all $n \geq 4$&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The same group (arXiv:2412.07283, Dec 2024) also established existence, uniqueness, and non-uniqueness of self-similar solutions to the steady NSE in &lt;strong&gt;2D sectors&lt;/strong&gt; with no-slip boundary conditions, providing rigorous corrections to classical Rosenhead (1940) calculations.&lt;/p&gt;
&lt;h3 class="heading" id="3-forward-self-similar-solutions-in-2d-for-large-data-2026"&gt;
 3. Forward Self-Similar Solutions in 2D for Large Data (2026)&lt;span class="heading__anchor"&gt; &lt;a href="#3-forward-self-similar-solutions-in-2d-for-large-data-2026"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Two independent papers in January 2026 addressed the 2D problem, where classical local energy estimates break down because the initial $(-1)$-homogeneous vorticity is not locally integrable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Gui, Liu, and Xie&lt;/strong&gt; (arXiv:2601.03833) established global existence of forward self-similar solutions for any divergence-free, $(-1)$-homogeneous, locally Hölder continuous initial velocity, with &lt;strong&gt;no smallness assumption&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Albritton, Guillod, Korobkov, and Ren&lt;/strong&gt; (arXiv:2601.03161) independently constructed such solutions from &lt;strong&gt;arbitrarily large&lt;/strong&gt; initial data and provided &lt;strong&gt;numerical evidence for non-uniqueness&lt;/strong&gt;, reportedly the first such evidence for the 2D self-similar problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="4-non-uniqueness-of-lerayhopf-solutions-hou-wang--yang-2025"&gt;
 4. Non-Uniqueness of Leray–Hopf Solutions (Hou, Wang, &amp;amp; Yang, 2025)&lt;span class="heading__anchor"&gt; &lt;a href="#4-non-uniqueness-of-lerayhopf-solutions-hou-wang--yang-2025"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The most dramatic recent development is the &lt;strong&gt;first rigorous computer-assisted proof of non-uniqueness of Leray–Hopf solutions&lt;/strong&gt; to the unforced 3D NSE by Hou, Wang, and Yang (arXiv:2509.25116, Sep 2025, revised Mar 2026):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There exist &lt;strong&gt;infinitely many distinct suitable Leray–Hopf solutions&lt;/strong&gt; to the 3D NSE on $\mathbb{R}^3 \times [0,1]$ with the same compactly supported, divergence-free initial condition $u_{in} \in L^q$ for any $q &amp;lt; 3$.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The proof executes the &lt;strong&gt;Jia–Šverák program&lt;/strong&gt; (Jia &amp;amp; Šverák, 2015), which requires finding a large forward self-similar background flow whose linearized operator has an &lt;strong&gt;unstable eigenvalue&lt;/strong&gt; (positive real part), then bifurcating to produce infinitely many Leray–Hopf solutions. The key steps are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A finite-element + spectral-basis numerical method computes a highly precise candidate profile $\tilde{U}$.&lt;/li&gt;
&lt;li&gt;The linearized operator $L_{\tilde{U}}$ is decomposed into a coercive part plus a finite-rank perturbation, whose invertibility is certified by &lt;strong&gt;computer-assisted interval arithmetic&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;This certifies an unstable eigenpair $(\tilde{v}, \tilde{\lambda})$ with $\operatorname{Re}(\tilde{\lambda}) &amp;gt; 0$, which yields a second solution, and in fact infinitely many, via Riesz projection and Duhamel analysis.&lt;/li&gt;
&lt;/ol&gt;
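&lt;p&gt;The role of the unstable eigenvalue in the steps above can be sketched heuristically as follows. The similarity variables $y = x/\sqrt{t}$, $\tau = \ln t$, the generic symbols $U$, $L$, $v$, $\lambda$, and the sign convention for $L$ are assumptions of this sketch and may differ from the conventions of Hou, Wang, and Yang. Writing $u(x,t) = t^{-1/2}\, W(y, \tau)$ turns the NSE into&lt;/p&gt;
&lt;p&gt;$$\partial_\tau W = \nu \Delta W + \frac{1}{2} W + \frac{1}{2} (y \cdot \nabla) W - (W \cdot \nabla) W - \nabla \Pi, \quad \operatorname{div} W = 0,$$&lt;/p&gt;
&lt;p&gt;so a self-similar profile $U$ is a $\tau$-independent solution, and the limit $t \to 0^+$ corresponds to $\tau \to -\infty$. Linearizing around $U$ gives $\partial_\tau \varphi = L \varphi$ with $L \varphi = \nu \Delta \varphi + \tfrac{1}{2} \varphi + \tfrac{1}{2} (y \cdot \nabla) \varphi - (U \cdot \nabla) \varphi - (\varphi \cdot \nabla) U - \nabla \pi$, and an eigenpair $L v = \lambda v$ produces the mode&lt;/p&gt;
&lt;p&gt;$$\varphi(y, \tau) = \operatorname{Re}\left(e^{\lambda \tau}\, v(y)\right) = \operatorname{Re}\left(t^{\lambda}\, v(y)\right),$$&lt;/p&gt;
&lt;p&gt;whose amplitude $t^{\operatorname{Re}(\lambda)}$ tends to zero as $t \to 0^+$ exactly when $\operatorname{Re}(\lambda) &amp;gt; 0$. A perturbation of this kind is invisible at the initial time yet nonzero for $t &amp;gt; 0$, which is the mechanism by which a second solution can branch off from the same initial data.&lt;/p&gt;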
&lt;p&gt;These solutions just miss the Prodi–Serrin condition that guarantees uniqueness. Guillod and Šverák (2017) had provided strong numerical evidence that such unstable profiles exist, but the rigorous proof remained elusive until Hou et al.&lt;/p&gt;
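&lt;p&gt;The sense in which the Prodi–Serrin condition is just missed can be seen from a scaling computation. This is a sketch for a purely self-similar field $u(x,t) = t^{-1/2} U(x/\sqrt{t})$, assuming the profile satisfies $U \in L^q(\mathbb{R}^3)$ for the exponent $q &amp;gt; 3$ in question; the time exponent $p$ is notation introduced here. The change of variables $x = \sqrt{t}\, y$ gives&lt;/p&gt;
&lt;p&gt;$$\|u(\cdot, t)\|_{L^q} = t^{-\frac{1}{2} + \frac{3}{2q}}\, \|U\|_{L^q}, \qquad \int_0^1 \|u(\cdot, t)\|_{L^q}^p \, dt &amp;lt; \infty \iff \frac{2}{p} + \frac{3}{q} &amp;gt; 1.$$&lt;/p&gt;
&lt;p&gt;The Prodi–Serrin uniqueness class is $u \in L^p_t L^q_x$ with $\frac{2}{p} + \frac{3}{q} \leq 1$ and $q &amp;gt; 3$, so a nonzero field of this form belongs to $L^p_t L^q_x$ for every pair with $\frac{2}{p} + \frac{3}{q} &amp;gt; 1$ but to none on the critical line $\frac{2}{p} + \frac{3}{q} = 1$.&lt;/p&gt;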
&lt;h3 class="heading" id="5-sharp-non-uniqueness-for-weak-solutions-via-convex-integration-20222026"&gt;
 5. Sharp Non-Uniqueness for Weak Solutions via Convex Integration (2022–2026)&lt;span class="heading__anchor"&gt; &lt;a href="#5-sharp-non-uniqueness-for-weak-solutions-via-convex-integration-20222026"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A parallel program uses convex integration to prove non-uniqueness of weak solutions. Cheskidov and Luo (&lt;em&gt;Invent. Math.&lt;/em&gt;, 2022) proved sharp non-uniqueness in $L^p_t L^\infty$ for any $p &amp;lt; 2$ in the periodic setting. Miao, Nie, and Ye (arXiv:2412.09637, Dec 2024) extended this to $\mathbb{R}^3$. Fujii (arXiv:2602.19846, Feb 2026) completed a sharp classification in critical Besov spaces $C([0,T); \dot{B}^{n/p-1}_{p,q}(\mathbb{R}^n))$, finding that large-time asymptotics of non-unique solutions are governed by non-trivial &lt;strong&gt;stationary flows&lt;/strong&gt; — a first in the critical regularity setting.&lt;/p&gt;
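&lt;p&gt;The regularity index $n/p - 1$ is the scaling-critical one, as the following sketch indicates. It uses the standard dilation behaviour of homogeneous Besov norms, $\|f(\lambda \cdot)\|_{\dot{B}^{s}_{p,q}} \approx \lambda^{s - n/p} \|f\|_{\dot{B}^{s}_{p,q}}$ with constants independent of $\lambda$; the symbols $f$, $s$, and $u_{0,\lambda}$ are notation introduced here. For the rescaled datum $u_{0,\lambda}(x) = \lambda\, u_0(\lambda x)$,&lt;/p&gt;
&lt;p&gt;$$\|u_{0,\lambda}\|_{\dot{B}^{s}_{p,q}(\mathbb{R}^n)} \approx \lambda^{1 + s - n/p}\, \|u_0\|_{\dot{B}^{s}_{p,q}(\mathbb{R}^n)},$$&lt;/p&gt;
&lt;p&gt;which is independent of $\lambda$ precisely when $s = n/p - 1$.&lt;/p&gt;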
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Result&lt;/th&gt;
					&lt;th&gt;Authors&lt;/th&gt;
					&lt;th&gt;Year&lt;/th&gt;
					&lt;th&gt;Setting&lt;/th&gt;
					&lt;th style="text-align: center"&gt;Self-similar?&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;Non-uniqueness, $L^p_t L^\infty$, torus&lt;/td&gt;
					&lt;td&gt;Cheskidov &amp;amp; Luo&lt;/td&gt;
					&lt;td&gt;2022&lt;/td&gt;
					&lt;td&gt;3D periodic&lt;/td&gt;
					&lt;td style="text-align: center"&gt;No&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Non-uniqueness, $L^p_t L^\infty$, $\mathbb{R}^3$&lt;/td&gt;
					&lt;td&gt;Miao, Nie &amp;amp; Ye&lt;/td&gt;
					&lt;td&gt;2024&lt;/td&gt;
					&lt;td&gt;3D whole space&lt;/td&gt;
					&lt;td style="text-align: center"&gt;No&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Non-uniqueness of Leray–Hopf, 3D&lt;/td&gt;
					&lt;td&gt;Hou, Wang &amp;amp; Yang&lt;/td&gt;
					&lt;td&gt;2025&lt;/td&gt;
					&lt;td&gt;3D whole space&lt;/td&gt;
					&lt;td style="text-align: center"&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Forward self-similar, 2D, large data&lt;/td&gt;
					&lt;td&gt;Albritton et al.&lt;/td&gt;
					&lt;td&gt;2026&lt;/td&gt;
					&lt;td&gt;2D whole space&lt;/td&gt;
					&lt;td style="text-align: center"&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Steady NSE in 2D sector&lt;/td&gt;
					&lt;td&gt;Bang et al.&lt;/td&gt;
					&lt;td&gt;2024&lt;/td&gt;
					&lt;td&gt;2D sector&lt;/td&gt;
					&lt;td style="text-align: center"&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 class="heading" id="6-liouville-theorems-and-stability-of-landau-solutions"&gt;
 6. Liouville Theorems and Stability of Landau Solutions&lt;span class="heading__anchor"&gt; &lt;a href="#6-liouville-theorems-and-stability-of-landau-solutions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Tan&lt;/strong&gt; (arXiv:2501.03609, Jan 2025) proved new Liouville theorems for the stationary NSE (including the fractional case) under growth conditions in Lebesgue spaces. &lt;strong&gt;Ding and Tan&lt;/strong&gt; (arXiv:2501.03615, Jan 2025) proved a Liouville theorem for the stationary &lt;strong&gt;inhomogeneous&lt;/strong&gt; NSE via frequency localization of the Dirichlet energy near the origin.&lt;/p&gt;
&lt;p&gt;The asymptotic stability of small Landau solutions in $L^3$ was sharpened by &lt;strong&gt;Bradshaw and Wang&lt;/strong&gt; (arXiv:2409.12918, Sep 2024): asymptotic stability holds in the Lorentz spaces $L^{3,q}$ for $q &amp;lt; \infty$ but &lt;strong&gt;fails&lt;/strong&gt; in $L^{3,\infty}$ (weak-$L^3$), which marks the precise boundary of stability.&lt;/p&gt;
&lt;h3 class="heading" id="7-steady-nse-in-bounded-and-unbounded-domains"&gt;
 7. Steady NSE in Bounded and Unbounded Domains&lt;span class="heading__anchor"&gt; &lt;a href="#7-steady-nse-in-bounded-and-unbounded-domains"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A major reference work by Korobkov, Pileckas, and Russo (&lt;em&gt;Springer/Birkhäuser&lt;/em&gt;, March 2024) provides the first comprehensive book treatment of &lt;strong&gt;Leray&amp;rsquo;s problem&lt;/strong&gt;: existence of a solution in bounded domains under only the condition of zero total flux — without smallness on the boundary data.&lt;/p&gt;
&lt;p&gt;Gazzola, Korobkov, Ren, and Sperone (arXiv:2505.14642, May 2025) studied steady NSE in a &lt;strong&gt;junction of unbounded channels&lt;/strong&gt; with sources and sinks, under inhomogeneous Dirichlet boundary conditions and without smallness of fluxes. They prove existence of a solution with uniformly bounded Dirichlet integral in every compact subset via Leray&amp;rsquo;s &lt;em&gt;reductio ad absurdum&lt;/em&gt; argument using Morse–Sard-type theorems in Sobolev spaces.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="open-problems"&gt;
 Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Several central questions remain unresolved or only partially answered:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Clay Millennium Prize Problem.&lt;/strong&gt; Whether 3D NSE solutions from smooth initial data can blow up in finite time is not resolved. The Hou et al. non-uniqueness result concerns Leray–Hopf solutions from &lt;em&gt;singular&lt;/em&gt; $L^q$ ($q &amp;lt; 3$) initial data, not smooth data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Complete classification of $(-1)$-homogeneous solutions in 3D.&lt;/strong&gt; The axisymmetric no-swirl case is fully classified, and swirl solutions are well-studied, but a complete classification of all $(-1)$-homogeneous solutions with arbitrarily many singular rays and all possible swirl configurations has not yet been achieved.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Non-computer-assisted proof of non-uniqueness of forward self-similar solutions in 3D.&lt;/strong&gt; The Jia–Šverák program has so far yielded numerical evidence (Guillod &amp;amp; Šverák, 2017) and the computer-assisted proof of Hou, Wang, and Yang (2025), but a fully rigorous proof of non-uniqueness for the forward (not backward) self-similar 3D problem that does not rely on computer assistance remains open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Asymptotic stability of large Landau solutions.&lt;/strong&gt; While small Landau solutions are asymptotically stable in $L^3$, stability for large-parameter Landau solutions is not fully understood.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Leray problem in non-axisymmetric 3D exterior domains without flux restrictions.&lt;/strong&gt; The axisymmetric case was solved by Korobkov, Pileckas, and Russo, but the general 3D exterior domain problem under large flux remains open.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Albritton, D., Guillod, J., Korobkov, M., &amp;amp; Ren, X. (2026). &lt;em&gt;Forward self-similar solutions to the 2D Navier-Stokes equations from large data&lt;/em&gt;. arXiv:2601.03161. &lt;a href="https://arxiv.org/abs/2601.03161"&gt;https://arxiv.org/abs/2601.03161&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bang, J., Gui, C., Liu, Y., Wang, C., &amp;amp; Xie, C. (2024). &lt;em&gt;Self-similar solutions to the steady Navier-Stokes equations in 2D sectors&lt;/em&gt;. arXiv:2412.07283. &lt;a href="https://arxiv.org/abs/2412.07283"&gt;https://arxiv.org/abs/2412.07283&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bang, J., Gui, C., Liu, Y., Wang, C., &amp;amp; Xie, C. (2025). &lt;em&gt;On the existence of self-similar solutions to the steady Navier-Stokes equations in high dimensions&lt;/em&gt;. arXiv:2510.10488. &lt;a href="https://arxiv.org/abs/2510.10488"&gt;https://arxiv.org/abs/2510.10488&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bradshaw, Z., &amp;amp; Wang, X. (2024). &lt;em&gt;Asymptotic stability of Landau solutions in Lorentz spaces&lt;/em&gt;. arXiv:2409.12918. &lt;a href="https://arxiv.org/pdf/2409.12918.pdf"&gt;https://arxiv.org/pdf/2409.12918.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Cheskidov, A., &amp;amp; Luo, X. (2022). Sharp nonuniqueness for the Navier-Stokes equations. &lt;em&gt;Inventiones Mathematicae&lt;/em&gt;. arXiv:2009.06596. &lt;a href="https://arxiv.org/abs/2009.06596"&gt;https://arxiv.org/abs/2009.06596&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Ding, M., &amp;amp; Tan, W. (2025). &lt;em&gt;Liouville-type theorem for the stationary inhomogeneous Navier-Stokes equations&lt;/em&gt;. arXiv:2501.03615. &lt;a href="https://arxiv.org/abs/2501.03615"&gt;https://arxiv.org/abs/2501.03615&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Fujii, M. (2026). &lt;em&gt;Sharp non-uniqueness for the Navier-Stokes equations in critical Besov spaces&lt;/em&gt;. arXiv:2602.19846. &lt;a href="https://arxiv.org/html/2602.19846v1"&gt;https://arxiv.org/html/2602.19846v1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Gazzola, F., Korobkov, M., Ren, X., &amp;amp; Sperone, G. (2025). &lt;em&gt;The steady Navier-Stokes equations in a system of unbounded channels with sources and sinks&lt;/em&gt;. arXiv:2505.14642. &lt;a href="https://arxiv.org/abs/2505.14642"&gt;https://arxiv.org/abs/2505.14642&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Gui, C., Liu, Y., &amp;amp; Xie, C. (2026). &lt;em&gt;On the forward self-similar solutions to the two-dimensional Navier-Stokes equations&lt;/em&gt;. arXiv:2601.03833. &lt;a href="https://arxiv.org/html/2601.03833v2"&gt;https://arxiv.org/html/2601.03833v2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hou, T., Wang, Y., &amp;amp; Yang, C. (2025). &lt;em&gt;Nonuniqueness of Leray-Hopf solutions to the unforced incompressible 3D Navier-Stokes equations&lt;/em&gt;. arXiv:2509.25116. &lt;a href="https://arxiv.org/abs/2509.25116"&gt;https://arxiv.org/abs/2509.25116&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Jia, H., &amp;amp; Šverák, V. (2015). Are the incompressible 3d Navier–Stokes equations locally ill-posed in the natural energy space? &lt;em&gt;Journal of Functional Analysis, 268&lt;/em&gt;(12), 3734–3766. &lt;a href="https://www.sciencedirect.com/science/article/pii/S002212361500138X"&gt;https://www.sciencedirect.com/science/article/pii/S002212361500138X&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Korobkov, M., Pileckas, K., &amp;amp; Russo, R. (2024). &lt;em&gt;The Steady Navier-Stokes System: Basics of the Theory and the Leray Problem&lt;/em&gt;. Springer/Birkhäuser. &lt;a href="https://books.google.com/books/about/The_Steady_Navier_Stokes_System.html?id=GOf8EAAAQBAJ"&gt;https://books.google.com/books/about/The_Steady_Navier_Stokes_System.html?id=GOf8EAAAQBAJ&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Korobkov, M., &amp;amp; Ren, X. (2024). &lt;em&gt;On basic velocity estimates for the plane steady-state Navier-Stokes equations in convex domains&lt;/em&gt;. arXiv:2405.17884. &lt;a href="https://arxiv.org/abs/2405.17884"&gt;https://arxiv.org/abs/2405.17884&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Li, L., Li, Y., &amp;amp; Yan, Y. (2024). Removable singularity of $(-1)$-homogeneous solutions of stationary Navier-Stokes equations. &lt;em&gt;Transactions of the American Mathematical Society&lt;/em&gt; (to appear). arXiv:2410.11170. &lt;a href="https://arxiv.org/abs/2410.11170"&gt;https://arxiv.org/abs/2410.11170&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Li, Y., &amp;amp; Yan, Y. (2025). &lt;em&gt;Recent research on $(-1)$-homogeneous solutions of stationary Navier-Stokes equations&lt;/em&gt;. arXiv:2509.07243. &lt;a href="https://arxiv.org/abs/2509.07243"&gt;https://arxiv.org/abs/2509.07243&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Miao, C., Nie, Y., &amp;amp; Ye, W. (2024). &lt;em&gt;Sharp non-uniqueness for the Navier-Stokes equations in the whole space&lt;/em&gt;. arXiv:2412.09637. &lt;a href="https://arxiv.org/abs/2412.09637"&gt;https://arxiv.org/abs/2412.09637&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Tan, W. (2025). &lt;em&gt;New Liouville type theorems for the stationary Navier-Stokes equations&lt;/em&gt;. arXiv:2501.03609. &lt;a href="https://arxiv.org/pdf/2501.03609.pdf"&gt;https://arxiv.org/pdf/2501.03609.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Tsai, T.-P. (2014). &lt;em&gt;Forward discretely self-similar solutions of the Navier-Stokes equations&lt;/em&gt;. arXiv:1210.2783. &lt;a href="https://arxiv.org/abs/1210.2783"&gt;https://arxiv.org/abs/1210.2783&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Recent Research Directions in Analysis of PDEs 2021–2026</title><link>https://blog.namln.org/en/posts/recent-pde-2126/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/recent-pde-2126/</guid><description>&lt;p&gt;The arXiv section of Analysis of Partial Differential Equations is one of the most prolific areas of pure mathematics, producing over 400 preprints per month as of early 2026. The period 2021–2026 has witnessed landmark breakthroughs — including a computer-assisted proof of finite-time singularity in the 3D Euler equations, the resolution of Hilbert&amp;rsquo;s Sixth Problem via kinetic theory, and the emergence of probabilistic and nonlocal operator methods as dominant paradigms. This survey identifies, categorises, and profiles the key research directions and landmark papers in math.AP during this era.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="overview"&gt;
 Overview&lt;span class="heading__anchor"&gt; &lt;a href="#overview"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The landscape of math.AP in 2021–2026 organises into several major research directions:&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Direction&lt;/th&gt;
					&lt;th&gt;Landmark Papers&lt;/th&gt;
					&lt;th&gt;Landmark Results&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;Fluid singularity (Euler)&lt;/td&gt;
					&lt;td&gt;Chen &amp;amp; Hou (2022–2023)&lt;/td&gt;
					&lt;td&gt;Finite-time blowup for 3D Euler/2D Boussinesq, smooth data (&lt;em&gt;PNAS&lt;/em&gt; 2025)&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;NS non-uniqueness&lt;/td&gt;
					&lt;td&gt;Albritton, Brué &amp;amp; Colombo (2021)&lt;/td&gt;
					&lt;td&gt;Non-unique Leray–Hopf solutions for forced NS&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Hilbert&amp;rsquo;s 6th Problem&lt;/td&gt;
					&lt;td&gt;Deng, Hani &amp;amp; Ma (2024–2025)&lt;/td&gt;
					&lt;td&gt;Long-time Boltzmann derivation; fluid equations from Newton&amp;rsquo;s laws&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Wave kinetic equation&lt;/td&gt;
					&lt;td&gt;Deng &amp;amp; Hani (2021)&lt;/td&gt;
					&lt;td&gt;Rigorous WKE derivation from cubic NLS&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Mixed local-nonlocal operators&lt;/td&gt;
					&lt;td&gt;Biagi, Dipierro, Valdinoci et al. (2020–2022)&lt;/td&gt;
					&lt;td&gt;Regularity, max. principles, Faber-Krahn inequalities&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Double phase functionals&lt;/td&gt;
					&lt;td&gt;De Filippis &amp;amp; Mingione (2022–2023)&lt;/td&gt;
					&lt;td&gt;Gradient regularity in mixed/double phase settings&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Normalized Schrödinger&lt;/td&gt;
					&lt;td&gt;Wei &amp;amp; Wu (2021); Jeanjean &amp;amp; Le (2020)&lt;/td&gt;
					&lt;td&gt;Critical mass constraints, ground states, NLS&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;MFG inverse problems&lt;/td&gt;
					&lt;td&gt;Imanuvilov, Liu &amp;amp; Yamamoto (2023)&lt;/td&gt;
					&lt;td&gt;Lipschitz stability, Carleman estimates for MFG&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Keller-Segel chemotaxis&lt;/td&gt;
					&lt;td&gt;Li &amp;amp; Winkler (2022); Lyu &amp;amp; Wang (2021)&lt;/td&gt;
					&lt;td&gt;Signal-dependent motility, global regularity&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Stefan/free boundary&lt;/td&gt;
					&lt;td&gt;Ferrari et al. (2024); Arya, Jeon &amp;amp; Julin (2026)&lt;/td&gt;
					&lt;td&gt;$C^{1,\alpha}$ regularity, supercooled Stefan&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Stochastic PDEs&lt;/td&gt;
					&lt;td&gt;Bailleul &amp;amp; Bruned (2021); Bailleul &amp;amp; Hoshino (2025)&lt;/td&gt;
					&lt;td&gt;Renormalisation, regularity structures&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Calderón inverse problem&lt;/td&gt;
					&lt;td&gt;Cârstea, Uhlmann et al. (2021); Krupchyk (2025)&lt;/td&gt;
					&lt;td&gt;Nonlinear and fractional settings&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;Dispersive PDEs&lt;/td&gt;
					&lt;td&gt;Deng, Nahmod &amp;amp; Yue (2020); Gubinelli et al. (2025)&lt;/td&gt;
					&lt;td&gt;Random tensors, modulated dispersive equations&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="background"&gt;
 Background&lt;span class="heading__anchor"&gt; &lt;a href="#background"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="the-mathap-landscape"&gt;
 The math.AP Landscape&lt;span class="heading__anchor"&gt; &lt;a href="#the-mathap-landscape"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Analysis of PDEs is the mathematical study of equations involving unknown functions and their partial derivatives, arising in physics, geometry, probability, and engineering. The arXiv math.AP category encompasses everything from regularity theory for elliptic and parabolic equations to global well-posedness for dispersive equations, from geometric flows to inverse problems, and from kinetic theory to stochastic PDEs. With roughly 300–400 papers per month, and as many as 408 in February 2026, it is one of the most active and interconnected areas of pure mathematics.&lt;/p&gt;
&lt;p&gt;The period 2021–2026 is characterised by three broad trends. First, &lt;strong&gt;grand-challenge resolutions&lt;/strong&gt;: several longstanding open problems — including Hilbert&amp;rsquo;s Sixth Problem and the existence of finite-time singularities for 3D Euler equations with smooth data — were settled using novel combinations of rigorous analysis, Feynman-diagram combinatorics, and computer-assisted numerics. Second, &lt;strong&gt;new paradigm emergence&lt;/strong&gt;: mixed local-nonlocal operators, double phase functionals, and normalised solutions have matured from isolated curiosities into systematic research programmes with their own regularity theories. Third, &lt;strong&gt;interdisciplinary expansion&lt;/strong&gt;: MFG systems, optimal transport, SPDEs, and AI-assisted methods have become structural parts of the math.AP ecosystem.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="recent-developments"&gt;
 Recent Developments&lt;span class="heading__anchor"&gt; &lt;a href="#recent-developments"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-mathematical-fluid-dynamics-singularity-non-uniqueness-and-stability"&gt;
 1. Mathematical Fluid Dynamics: Singularity, Non-Uniqueness, and Stability&lt;span class="heading__anchor"&gt; &lt;a href="#1-mathematical-fluid-dynamics-singularity-non-uniqueness-and-stability"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h4 class="heading" id="finite-time-blowup-of-the-3d-euler-equations"&gt;
 Finite-Time Blowup of the 3D Euler Equations&lt;span class="heading__anchor"&gt; &lt;a href="#finite-time-blowup-of-the-3d-euler-equations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;The question of whether the 3D incompressible Euler equations&lt;/p&gt;
&lt;p&gt;$$\partial_t u + (u \cdot \nabla) u + \nabla p = 0, \qquad \operatorname{div} u = 0,$$&lt;/p&gt;
&lt;p&gt;can develop a singularity from smooth initial data — open since Euler introduced the equations in 1757 — saw a decisive resolution in a bounded-domain setting through a landmark two-part series by &lt;strong&gt;Jiajie Chen and Thomas Y. Hou&lt;/strong&gt; (arXiv:2210.07191, arXiv:2305.05660, &lt;em&gt;PNAS&lt;/em&gt; 2025). Their work proves finite-time, nearly self-similar blowup of both the &lt;strong&gt;2D Boussinesq&lt;/strong&gt; and &lt;strong&gt;3D axisymmetric Euler&lt;/strong&gt; equations with smooth initial data and finite energy in the presence of a solid boundary. The proof employs weighted $L^\infty$ and $C^{1/2}$ norms, sharp functional inequalities inspired by optimal transport, and computer-assisted rigorous numerics to verify nonlinear stability constants. The result was praised as one of the most significant advances in mathematical fluid mechanics in decades.&lt;/p&gt;
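&lt;p&gt;For orientation, a nearly self-similar blowup can be written schematically as follows. The profile $\Omega$, the blowup time $T$, and the spatial scaling exponent $c_l$ are illustrative notation introduced here; the precise ansatz and norms in the Chen–Hou papers are more refined.&lt;/p&gt;
&lt;p&gt;$$\omega(x,t) \approx \frac{1}{T-t}\,\Omega\!\left(\frac{x}{(T-t)^{c_l}}\right), \qquad \|\omega(\cdot,t)\|_{L^\infty} \sim \frac{1}{T-t}, \qquad \int_0^T \|\omega(\cdot,t)\|_{L^\infty}\,dt = \infty.$$&lt;/p&gt;
&lt;p&gt;The last condition is the Beale–Kato–Majda blowup criterion: a smooth Euler solution can only lose regularity at time $T$ if the time integral of the maximum vorticity diverges. The growth rate $(T-t)^{-1}$ is the slowest power law consistent with it.&lt;/p&gt;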
&lt;p&gt;Prior to Chen–Hou, &lt;strong&gt;Tarek Elgindi&lt;/strong&gt; (2021) showed finite-time singularity for the 3D axisymmetric Euler equations without swirl from $C^{1,\alpha}$ initial velocity, so that the vorticity is only Hölder continuous. The Chen–Hou 2021 paper on the Hou-Luo (HL) model proved asymptotically self-similar blowup from smooth data. Concurrently, Hou and collaborators presented numerical evidence for a potential singularity in 3D Navier-Stokes, reporting a $10^7$-fold increase in maximum vorticity, and DeepMind (2025) used AI-assisted methods to discover families of unstable singularities in the incompressible porous media and Boussinesq equations.&lt;/p&gt;
&lt;h4 class="heading" id="non-uniqueness-of-lerayhopf-solutions-for-navier-stokes"&gt;
 Non-Uniqueness of Leray–Hopf Solutions for Navier-Stokes&lt;span class="heading__anchor"&gt; &lt;a href="#non-uniqueness-of-lerayhopf-solutions-for-navier-stokes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;A 2021 breakthrough by &lt;strong&gt;Dallas Albritton, Elia Brué, and Maria Colombo&lt;/strong&gt; proved non-uniqueness of Leray–Hopf solutions to the &lt;em&gt;forced&lt;/em&gt; 3D Navier-Stokes equations: they exhibited two distinct Leray solutions with zero initial velocity and identical body force, exploiting the extreme instability of a self-similar background solution. Recognised as the most influential 2021 math.AP paper on arXiv by Paper Digest, the result was subsequently extended to bounded domains via gluing methods (arXiv:2209.03530) and to stochastic settings (&lt;em&gt;Electronic Journal of Probability&lt;/em&gt;, 2024).&lt;/p&gt;
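&lt;p&gt;The mechanism can be sketched in similarity variables. The notation below ($\xi$, $\tau$, $\bar U$, $\lambda$, $\eta$) is introduced here for illustration and simplifies the actual construction.&lt;/p&gt;
&lt;p&gt;$$\xi = \frac{x}{\sqrt{t}}, \quad \tau = \log t, \qquad \bar u(x,t) = \frac{1}{\sqrt{t}}\,\bar U(\xi), \qquad u^{\mathrm{lin}}(x,t) = \frac{1}{\sqrt{t}}\,\operatorname{Re}\bigl(e^{\lambda\tau}\eta(\xi)\bigr) = t^{-1/2}\operatorname{Re}\bigl(t^{\lambda}\eta(\xi)\bigr).$$&lt;/p&gt;
&lt;p&gt;In three dimensions $\|\bar u(\cdot,t)\|_{L^2} = t^{1/4}\|\bar U\|_{L^2}$, which is consistent with zero initial velocity. If the linearisation around $\bar U$ has an unstable eigenvalue with $\operatorname{Re}\lambda &amp;gt; 0$, the perturbation $u^{\mathrm{lin}}$ is smaller than $\bar u$ by a factor $t^{\operatorname{Re}\lambda}$ as $t \to 0^+$, so a second solution can branch off $\bar u$ while sharing the same initial data and force.&lt;/p&gt;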
&lt;h4 class="heading" id="stability-of-shear-flows-and-kinetic-theory"&gt;
 Stability of Shear Flows and Kinetic Theory&lt;span class="heading__anchor"&gt; &lt;a href="#stability-of-shear-flows-and-kinetic-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h4&gt;&lt;p&gt;Parallel to the singularity programme, sharp asymptotic stability results for &lt;strong&gt;2D monotone shear flows&lt;/strong&gt; with no-slip boundary conditions, and extensive work on &lt;strong&gt;inviscid damping&lt;/strong&gt; and enhanced dissipation near shear flows, have appeared throughout 2025–2026.&lt;/p&gt;
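&lt;p&gt;The simplest instance of inviscid damping is the linearised 2D Euler equation around the Couette flow $(y,0)$ on $\mathbb{T}\times\mathbb{R}$. This textbook example is added for illustration and is not taken from the works surveyed.&lt;/p&gt;
&lt;p&gt;$$\partial_t \omega + y\,\partial_x \omega = 0 \quad\Longrightarrow\quad \omega(t,x,y) = \omega_0(x - ty,\, y), \qquad \hat\omega(t,k,\eta) = \hat\omega_0(k,\, \eta + kt).$$&lt;/p&gt;
&lt;p&gt;The vorticity does not decay, but it is sheared to ever higher frequencies in $y$. Since the stream function satisfies $\hat\psi = -\hat\omega/(k^2+\eta^2)$, the velocity in every mode with $k \neq 0$ decays algebraically in time even though there is no viscosity.&lt;/p&gt;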
&lt;p&gt;Arguably the most monumental result in kinetic PDE theory during this period is due to &lt;strong&gt;Yu Deng, Zaher Hani, and Xiao Ma&lt;/strong&gt;, who provided a rigorous long-time derivation of the Boltzmann equation from hard-sphere dynamics (arXiv:2408.07818, 2024), extending Lanford&amp;rsquo;s 1975 short-time theorem to all times within the lifespan of the Boltzmann solution. In a companion paper (arXiv:2503.01800, 2025), they completed the derivation of the &lt;strong&gt;compressible Euler&lt;/strong&gt; and &lt;strong&gt;incompressible Navier-Stokes-Fourier&lt;/strong&gt; equations from Newton&amp;rsquo;s laws, effectively resolving &lt;strong&gt;Hilbert&amp;rsquo;s Sixth Problem&lt;/strong&gt; for rarefied hard-sphere gases. The proof uses cumulant ansätze, Feynman-diagram combinatorics, and a molecule-reduction algorithm. It builds on the 2021 derivation of the &lt;strong&gt;wave kinetic equation&lt;/strong&gt; from the cubic NLS by Deng and Hani.&lt;/p&gt;
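&lt;p&gt;For reference, the hard-sphere Boltzmann equation targeted by this derivation reads as follows in standard notation. The symbols $f$, $v_*$, and $\omega$ are not fixed in the text above, and normalising constants are omitted.&lt;/p&gt;
&lt;p&gt;$$\partial_t f + v\cdot\nabla_x f = Q(f,f), \qquad Q(f,f)(v) = \int_{\mathbb{R}^d}\int_{\mathbb{S}^{d-1}} \bigl[(v-v_*)\cdot\omega\bigr]_+ \bigl(f(v')f(v_*') - f(v)f(v_*)\bigr)\,d\omega\,dv_*,$$&lt;/p&gt;
&lt;p&gt;$$v' = v - \bigl((v-v_*)\cdot\omega\bigr)\omega, \qquad v_*' = v_* + \bigl((v-v_*)\cdot\omega\bigr)\omega.$$&lt;/p&gt;
&lt;p&gt;The limit is taken in the Boltzmann–Grad regime, in which the number of particles $N$ and their diameter $\varepsilon$ satisfy $N\varepsilon^{d-1} \approx \mathrm{const}$ as $N \to \infty$, so that the mean free path stays of order one while the gas becomes infinitely dilute.&lt;/p&gt;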
&lt;h3 class="heading" id="2-nonlocal-and-fractional-pdes-mixed-local-nonlocal-operators"&gt;
 2. Nonlocal and Fractional PDEs: Mixed Local-Nonlocal Operators&lt;span class="heading__anchor"&gt; &lt;a href="#2-nonlocal-and-fractional-pdes-mixed-local-nonlocal-operators"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;One of the dominant new paradigms of the 2020s is the study of operators of the form&lt;/p&gt;
&lt;p&gt;$$\mathcal{L} u = -\Delta u + (-\Delta)^s u, \quad s \in (0,1),$$&lt;/p&gt;
&lt;p&gt;which superpose a classical Laplacian with a fractional (nonlocal) Laplacian. These arise naturally in models combining Brownian and Lévy diffusion processes. The foundational paper by &lt;strong&gt;Biagi, Dipierro, Valdinoci, and Vecchi&lt;/strong&gt; (2020/2021) initiated a systematic theory of regularity and maximum principles for such operators.&lt;/p&gt;
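&lt;p&gt;On $\mathbb{R}^n$ the two pieces of $\mathcal{L}$ are easiest to compare on the Fourier side. The following standard formulas are included as a reminder; $c_{n,s}$ denotes the usual normalising constant and a suitable Fourier convention is assumed.&lt;/p&gt;
&lt;p&gt;$$(-\Delta)^s u(x) = c_{n,s}\,\mathrm{P.V.}\!\int_{\mathbb{R}^n}\frac{u(x)-u(y)}{|x-y|^{n+2s}}\,dy, \qquad \widehat{\mathcal{L}u}(\xi) = \bigl(|\xi|^2 + |\xi|^{2s}\bigr)\hat u(\xi).$$&lt;/p&gt;
&lt;p&gt;The symbol behaves like $|\xi|^{2s}$ at low frequencies and like $|\xi|^2$ at high frequencies. This is one way to see that the operator has no single scaling: local smoothing is driven by the Laplacian, while long-range effects come from the nonlocal part.&lt;/p&gt;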
&lt;p&gt;Between 2021 and 2026 an explosion of activity produced: gradient regularity for mixed local-nonlocal problems by De Filippis and Mingione (2022), who showed that minimisers of mixed functionals are locally $C^{1,\beta}$-regular; Hölder regularity for mixed local-nonlocal degenerate elliptic equations (Garain &amp;amp; Lindgren, 2022); the Wiener criterion for nonlocal Dirichlet problems (Kim, Lee &amp;amp; Lee, 2022); and a Faber-Krahn inequality for mixed operators (Biagi, Dipierro, Valdinoci &amp;amp; Vecchi, 2021). &lt;strong&gt;Serena Dipierro&lt;/strong&gt; and &lt;strong&gt;Enrico Valdinoci&lt;/strong&gt; were among the most prolific contributors, publishing on nonlocal logistic equations with Neumann conditions, ecological niches for mixed dispersal, and Sobolev inequalities for mixed operators.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Giovanni Leoni&amp;rsquo;s&lt;/strong&gt; 2023 treatise &lt;em&gt;A First Course in Fractional Sobolev Spaces&lt;/em&gt; provided a self-contained reference covering definitions, embeddings, Hardy inequalities, and interpolation inequalities, and ranked among the most-cited arXiv math.AP papers of 2023. Concurrently, a 2025 paper established well-posedness and regularity theory for time-fractional stochastic PDEs involving Caputo derivatives and general nonlocal operators driven by Gaussian and Lévy noise (arXiv:2512.03754).&lt;/p&gt;
&lt;h3 class="heading" id="3-double-phase-operators-and-nonstandard-growth"&gt;
 3. Double Phase Operators and Nonstandard Growth&lt;span class="heading__anchor"&gt; &lt;a href="#3-double-phase-operators-and-nonstandard-growth"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The &lt;strong&gt;double phase functional&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;$$\mathcal{H}(u) := \int_\Omega \bigl(|Du|^p + a(x)|Du|^q\bigr)\,dx, \quad q &amp;gt; p &amp;gt; 1,\ a(x) \geq 0,$$&lt;/p&gt;
&lt;p&gt;introduced by Colombo and Mingione, generated a remarkable surge of activity throughout 2021–2026.&lt;/p&gt;
&lt;table&gt;
	&lt;thead&gt;
			&lt;tr&gt;
					&lt;th&gt;Year&lt;/th&gt;
					&lt;th&gt;Paper&lt;/th&gt;
					&lt;th&gt;Authors&lt;/th&gt;
					&lt;th&gt;Key Contribution&lt;/th&gt;
			&lt;/tr&gt;
	&lt;/thead&gt;
	&lt;tbody&gt;
			&lt;tr&gt;
					&lt;td&gt;2021&lt;/td&gt;
					&lt;td&gt;A new class of double phase variable exponent problems&lt;/td&gt;
					&lt;td&gt;Crespo-Blanco, Gasiński, Harjulehto, Winkert&lt;/td&gt;
					&lt;td&gt;Existence/uniqueness for new double phase with variable exponents&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2021&lt;/td&gt;
					&lt;td&gt;Double phase implicit obstacle problems&lt;/td&gt;
					&lt;td&gt;Zeng, Rădulescu, Winkert&lt;/td&gt;
					&lt;td&gt;Mixed BVPs with convection and multivalued conditions&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2022&lt;/td&gt;
					&lt;td&gt;Nonuniformly elliptic Schauder theory&lt;/td&gt;
					&lt;td&gt;De Filippis, Mingione&lt;/td&gt;
					&lt;td&gt;Schauder estimates in nonuniform elliptic settings&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2022&lt;/td&gt;
					&lt;td&gt;New embedding results for double phase problems&lt;/td&gt;
					&lt;td&gt;Ho, Winkert&lt;/td&gt;
					&lt;td&gt;Musielak-Orlicz Sobolev spaces with variable exponent&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2023&lt;/td&gt;
					&lt;td&gt;Regularity at nearly linear growth&lt;/td&gt;
					&lt;td&gt;De Filippis, Mingione&lt;/td&gt;
					&lt;td&gt;Hölder gradient regularity for log-type functionals&lt;/td&gt;
			&lt;/tr&gt;
			&lt;tr&gt;
					&lt;td&gt;2025&lt;/td&gt;
					&lt;td&gt;Partial regularity for parabolic double phase systems&lt;/td&gt;
					&lt;td&gt;Ok, Scilla, Stroffolini&lt;/td&gt;
					&lt;td&gt;Partial Hölder regularity for parabolic systems&lt;/td&gt;
			&lt;/tr&gt;
	&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The work of &lt;strong&gt;Cristiana De Filippis&lt;/strong&gt; and &lt;strong&gt;Giuseppe Mingione&lt;/strong&gt; is particularly prominent throughout, providing a comprehensive regularity theory for double phase and nonuniformly elliptic functionals (arXiv:2308.10222).&lt;/p&gt;
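&lt;p&gt;The regularity theory hinges on how far apart the exponents $p$ and $q$ may be. For the functional $\mathcal{H}$ above with a Hölder continuous coefficient $a \in C^{0,\alpha}(\Omega)$ and $\Omega \subset \mathbb{R}^n$ (here $\alpha$ and $n$ are added notation), the condition usually quoted from the Colombo–Mingione theory is&lt;/p&gt;
&lt;p&gt;$$\frac{q}{p} \le 1 + \frac{\alpha}{n}.$$&lt;/p&gt;
&lt;p&gt;The integrand has $p$-growth on the set $\{a = 0\}$ and $q$-growth where $a &amp;gt; 0$. The bound forces the transition between the two phases to be slow enough for gradient Hölder regularity of minimisers, and known counterexamples indicate that minimisers can be irregular when the gap between $p$ and $q$ is too large.&lt;/p&gt;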
&lt;h3 class="heading" id="4-normalized-solutions-and-variational-methods-for-schrödinger-equations"&gt;
 4. Normalized Solutions and Variational Methods for Schrödinger Equations&lt;span class="heading__anchor"&gt; &lt;a href="#4-normalized-solutions-and-variational-methods-for-schr%c3%b6dinger-equations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The problem of finding solutions $u \in H^1(\mathbb{R}^N)$ with prescribed $L^2$-norm — the &lt;em&gt;mass constraint&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;$$\int_{\mathbb{R}^N} |u|^2\,dx = c$$&lt;/p&gt;
&lt;p&gt;— has become a central theme in the study of nonlinear Schrödinger equations. The influential papers by &lt;strong&gt;Louis Jeanjean and Thanh Trung Le&lt;/strong&gt; on multiple normalized solutions for Sobolev critical equations (2020–2021) and by &lt;strong&gt;Juncheng Wei and Yuanze Wu&lt;/strong&gt; on normalized solutions with critical Sobolev exponent and mixed nonlinearities (2021) launched a wave of activity. Key directions include: normalized ground states for NLS with potential (Bartsch, Molle, Rizzi &amp;amp; Verzini); normalized solutions for Schrödinger-Poisson-Slater equations; and standing waves and stability for &lt;strong&gt;Choquard equations&lt;/strong&gt;. The March 2026 arXiv listings confirm that sharp exponents, existence and asymptotics for Choquard equations, and boosted ground states for pseudo-relativistic Schrödinger equations remain highly active.&lt;/p&gt;
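&lt;p&gt;The role of the critical exponents can be seen from a one-line scaling computation. As a model case, take the pure power energy below and the mass-preserving dilation $u_\theta$ (both are illustrative choices, not the general setting of the cited papers):&lt;/p&gt;
&lt;p&gt;$$E(u) = \frac12\int_{\mathbb{R}^N}|\nabla u|^2\,dx - \frac1p\int_{\mathbb{R}^N}|u|^p\,dx, \qquad u_\theta(x) := \theta^{N/2}u(\theta x), \qquad E(u_\theta) = \frac{\theta^2}{2}\|\nabla u\|_2^2 - \frac{\theta^{N(p-2)/2}}{p}\|u\|_p^p.$$&lt;/p&gt;
&lt;p&gt;Since $\|u_\theta\|_2 = \|u\|_2$, the dilation stays on the constraint, and the two terms scale identically exactly when $N(p-2)/2 = 2$, that is, at the mass-critical exponent $p = 2 + 4/N$. Below it, $E$ is bounded from below on the constraint and can be minimised. Above it, $E(u_\theta) \to -\infty$ as $\theta \to \infty$, so solutions are sought as constrained critical points of mountain-pass type, with the frequency $\lambda$ in $-\Delta u + \lambda u = |u|^{p-2}u$ arising as a Lagrange multiplier.&lt;/p&gt;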
&lt;p&gt;Parallel work on eigenvalue problems addresses &lt;strong&gt;Steklov eigenvalues&lt;/strong&gt; (monotonicity for regular $N$-gons, sharp geometric bounds), eigenvalues of &lt;strong&gt;Pucci&amp;rsquo;s extremal operator&lt;/strong&gt; in 3D, and &lt;strong&gt;biharmonic Steklov problems&lt;/strong&gt; on thin sets.&lt;/p&gt;
&lt;h3 class="heading" id="5-mean-field-games-and-aggregation-diffusion-pdes"&gt;
 5. Mean Field Games and Aggregation-Diffusion PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#5-mean-field-games-and-aggregation-diffusion-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mean field game theory&lt;/strong&gt; generated a prolific suite of PDE questions between 2021 and 2026. Highlights include: Imanuvilov, Liu, and Yamamoto (2023) proving Lipschitz stability for determining states and inverse sources in MFG equations using Carleman estimates; Klibanov, Li, and Liu (2023) on Hölder stability via Carleman estimates; the inverse boundary problem for first-order master equations (Liu &amp;amp; Zhang, 2022); and Bresch, Jabin, and Soler (2022) introducing a novel probabilistic derivation of the mean-field limit applicable to Vlasov-Poisson-Fokker-Planck in 2D. By 2025–2026, nonlocal MFG models with spatial interactions and new work on &lt;strong&gt;Wasserstein gradient flows of kernel mean discrepancies&lt;/strong&gt; with connections to machine learning appeared on arXiv (arXiv:2506.01200).&lt;/p&gt;
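&lt;p&gt;For readers who have not seen it, the second-order mean field game system underlying these inverse problems has the following standard Lasry–Lions form. The symbols $H$, $F$, $G$, and the diffusion constant $\nu$ are generic and not taken from the cited papers.&lt;/p&gt;
&lt;p&gt;$$-\partial_t u - \nu\Delta u + H(x,\nabla u) = F(x,m), \qquad \partial_t m - \nu\Delta m - \operatorname{div}\bigl(m\,\nabla_p H(x,\nabla u)\bigr) = 0,$$&lt;/p&gt;
&lt;p&gt;$$u(x,T) = G(x,m(T)), \qquad m(x,0) = m_0(x).$$&lt;/p&gt;
&lt;p&gt;The value function $u$ solves a Hamilton–Jacobi–Bellman equation backward in time, and the agent density $m$ solves a Fokker–Planck equation forward in time. Inverse problems ask to recover quantities such as $F$, $m_0$, or an intermediate state from partial observations of $(u,m)$.&lt;/p&gt;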
&lt;p&gt;&lt;strong&gt;Optimal transport&lt;/strong&gt; has deeply influenced aggregation-diffusion equations and gradient flows. The March 2026 arXiv listings include a major 73-page paper by &lt;strong&gt;Carrillo, Gwiazda, and Skrzeczkowski&lt;/strong&gt; presenting a new formula for the Wasserstein distance between solutions to nonlinear continuity equations.&lt;/p&gt;
&lt;h3 class="heading" id="6-chemotaxis-and-reaction-diffusion-systems"&gt;
 6. Chemotaxis and Reaction-Diffusion Systems&lt;span class="heading__anchor"&gt; &lt;a href="#6-chemotaxis-and-reaction-diffusion-systems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Chemotaxis systems — in particular Keller-Segel models with &lt;strong&gt;signal-dependent motility&lt;/strong&gt; (density-suppressed diffusion) — generated intense activity. Key papers include logistic damping effects and global classical solutions for reaction-diffusion systems with density-suppressed motility (Lyu &amp;amp; Wang, 2021), refined regularity analysis for Keller-Segel-consumption systems (Li &amp;amp; Winkler, 2022), and global existence with uniform boundedness under signal-dependent motility (Jiang &amp;amp; Laurençot, 2021). In 2024, a construction of smooth finite-time blowup solutions for the &lt;strong&gt;3D Keller-Segel-Navier-Stokes&lt;/strong&gt; (chemotaxis-fluid) system with buoyancy appeared, using a quantitative method that directly constructs the singular solution (arXiv:2404.17228).&lt;/p&gt;
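&lt;p&gt;In its basic form, a Keller-Segel system with signal-dependent motility can be written as below, where $u$ is the cell density, $v$ the signal concentration, and $\gamma$ the motility function. This is a model form with normalised constants; the cited papers add logistic sources, consumption terms, or fluid coupling.&lt;/p&gt;
&lt;p&gt;$$\partial_t u = \Delta\bigl(\gamma(v)\,u\bigr) = \nabla\cdot\bigl(\gamma(v)\nabla u + u\,\gamma'(v)\nabla v\bigr), \qquad \tau\,\partial_t v = \Delta v - v + u.$$&lt;/p&gt;
&lt;p&gt;Expanding the Laplacian shows both effects at once: for a decreasing motility, $\gamma' &amp;lt; 0$, the cross-diffusion term acts as chemotactic attraction toward high signal concentration, while the diffusion coefficient $\gamma(v)$ degenerates where the signal is large.&lt;/p&gt;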
&lt;p&gt;In parallel, &lt;strong&gt;free boundary reaction-diffusion models&lt;/strong&gt; for species spreading and SIS epidemic models — including 2026 work on asymmetric kernels in advective periodic environments — continue to produce threshold and long-time dynamics results.&lt;/p&gt;
&lt;h3 class="heading" id="7-free-boundary-problems"&gt;
 7. Free Boundary Problems&lt;span class="heading__anchor"&gt; &lt;a href="#7-free-boundary-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The Stefan problem (modelling solidification and melting) remained highly active throughout 2021–2026. Key results include $C^{1,\alpha}$ regularity of flat free boundaries for the &lt;strong&gt;inhomogeneous one-phase Stefan problem&lt;/strong&gt; (Ferrari, Forcillo, Giovagnoli &amp;amp; Jesus, 2024; arXiv:2404.07535); regularity of the free boundary for the &lt;strong&gt;supercooled Stefan problem&lt;/strong&gt; in arbitrary dimensions (2025; arXiv:2512.10136), where the free boundary decomposes into regular, singular, and jump parts with the singular part having controlled parabolic dimension; and well-posedness and regularity of physical solutions for the supercooled Stefan problem assuming only integrable initial temperature, with explicit classification of free boundary points (2025; arXiv:2506.18741). These results use obstacle problem techniques, non-degeneracy estimates, and sharp free boundary classification arguments.&lt;/p&gt;
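&lt;p&gt;In its classical one-phase form, with homogeneous right-hand side and normalised constants (the cited works treat more general versions), the Stefan problem for a temperature $u \ge 0$ reads&lt;/p&gt;
&lt;p&gt;$$\partial_t u = \Delta u \ \text{ in } \{u &amp;gt; 0\}, \qquad \partial_t u = |\nabla u|^2 \ \text{ on } \partial\{u &amp;gt; 0\}.$$&lt;/p&gt;
&lt;p&gt;Because the free boundary is the zero level set of $u$, its outward normal velocity is $V = \partial_t u/|\nabla u|$, so the boundary condition says that the interface advances with speed $V = |\nabla u|$.&lt;/p&gt;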
&lt;p&gt;Shape optimisation for &lt;strong&gt;principal eigenvalues of Pucci operators&lt;/strong&gt; and $\Gamma$-convergence of convolution-type functionals for free discontinuity problems are active related directions in 2026.&lt;/p&gt;
&lt;h3 class="heading" id="8-stochastic-pdes-and-regularity-structures"&gt;
 8. Stochastic PDEs and Regularity Structures&lt;span class="heading__anchor"&gt; &lt;a href="#8-stochastic-pdes-and-regularity-structures"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Martin Hairer&amp;rsquo;s theory of regularity structures generated deep ongoing activity. The period 2021–2026 saw Bailleul and Bruned (2021) extending the algebraic renormalisation framework of regularity structures to a broader class of singular SPDEs (arXiv:2101.11949); the publication of &lt;strong&gt;&amp;ldquo;A tourist&amp;rsquo;s guide to regularity structures&amp;rdquo;&lt;/strong&gt; by Bailleul and Hoshino (2025/2026) in &lt;em&gt;EMS Surveys&lt;/em&gt; as an essentially self-contained treatment; applications to stochastic quantisation ($\Phi^4_3$), the &lt;strong&gt;KPZ equation&lt;/strong&gt;, and stochastic geometric flows (Hairer, 2021); and variance renormalisation in regularity structures for the 2D generalised Parabolic Anderson Model (Gerencsér &amp;amp; Hsu, 2026).&lt;/p&gt;
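&lt;p&gt;The two model equations mentioned above illustrate what renormalisation means in practice. In the standard formulation (sign and normalisation conventions vary between references), one regularises the noise and subtracts a diverging counterterm:&lt;/p&gt;
&lt;p&gt;$$\partial_t \phi_\varepsilon = \Delta\phi_\varepsilon - \phi_\varepsilon^3 + C_\varepsilon\phi_\varepsilon + \xi_\varepsilon \quad (\Phi^4_3), \qquad \partial_t h_\varepsilon = \partial_x^2 h_\varepsilon + (\partial_x h_\varepsilon)^2 - C_\varepsilon + \xi_\varepsilon \quad (\text{KPZ}).$$&lt;/p&gt;
&lt;p&gt;Here $\xi_\varepsilon$ is a mollification of space-time white noise and the constants $C_\varepsilon$, different for each equation, diverge as $\varepsilon \to 0$. The theory shows that, for a suitable choice of $C_\varepsilon$, the solutions converge to a limit that does not depend on the mollifier.&lt;/p&gt;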
&lt;p&gt;On the fluid side, global unique solvability for &lt;strong&gt;stochastic Navier-Stokes-Korteweg&lt;/strong&gt; equations and &lt;strong&gt;stochastic Allen-Cahn-Navier-Stokes&lt;/strong&gt; systems with ergodic invariant measures appeared in 2025, and non-uniqueness of Leray-Hopf solutions was extended to the stochastic forced setting.&lt;/p&gt;
&lt;h3 class="heading" id="9-dispersive-pdes-wave-turbulence-well-posedness-and-blowup"&gt;
 9. Dispersive PDEs: Wave Turbulence, Well-Posedness, and Blowup&lt;span class="heading__anchor"&gt; &lt;a href="#9-dispersive-pdes-wave-turbulence-well-posedness-and-blowup"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The &lt;strong&gt;full derivation of the wave kinetic equation&lt;/strong&gt; from the cubic NLS by Deng and Hani (arXiv:1912.09518, 2021) was the most impactful dispersive result of the era. Their analysis relies on absolutely convergent Feynman-diagram (paired-tree) expansions and identifies favourable scaling laws $\alpha \sim L^{-\varepsilon}$ for the kinetic limit.&lt;/p&gt;
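&lt;p&gt;For reference, the wave kinetic equation for the cubic NLS has the following form, written here up to normalising constants and with the shorthand $n = n(t,k)$, $n_j = n(t,k_j)$ (notation added for this sketch):&lt;/p&gt;
&lt;p&gt;$$\partial_t n = \mathcal{K}(n), \qquad \mathcal{K}(n)(k) = \int_{(\mathbb{R}^d)^3} \bigl[n_1 n_3 (n + n_2) - n\,n_2 (n_1 + n_3)\bigr]\,\delta(k_1 - k_2 + k_3 - k)\,\delta\bigl(|k_1|^2 - |k_2|^2 + |k_3|^2 - |k|^2\bigr)\,dk_1\,dk_2\,dk_3.$$&lt;/p&gt;
&lt;p&gt;The two delta functions encode conservation of momentum and energy in four-wave interactions. The quantity $n(t,k)$ approximates $\mathbb{E}|\hat u(t,k)|^2$ for the NLS solution with random initial data, on the kinetic time scale, in the limit of large box size $L$ and weak nonlinearity.&lt;/p&gt;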
&lt;p&gt;Ongoing work includes polynomial growth of Sobolev norms for the fractional NLS on $\mathbb{T}^d$ (Wang, 2026); low-regularity global well-posedness for generalised Zakharov-Kuznetsov equations (Nowicki-Koth, 2026); &lt;strong&gt;modulated dispersive equations&lt;/strong&gt; (modulated KdV with normal form reduction; Gubinelli, Li, Li &amp;amp; Oh, 2025; arXiv:2505.24270); and probabilistic well-posedness of dispersive PDEs beyond variance blowup (2025; arXiv:2509.02344). Scattering results for the quintic generalised Benjamin-Bona-Mahony equation and the 3D Zakharov-Kuznetsov equation, and long-time asymptotics via Riemann-Hilbert and inverse scattering methods for integrable equations, appear in the March 2026 listings.&lt;/p&gt;
&lt;h3 class="heading" id="10-geometric-pdes"&gt;
 10. Geometric PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#10-geometric-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Ricci flow&lt;/strong&gt; uniqueness in the non-compact setting (Lee, 2025; arXiv:2503.20292) and a new non-Kähler expanding Ricci soliton construction with Kähler tangent cone at infinity (Bamler, Chen &amp;amp; Conlon, 2026) reflect the continued health of geometric flows. The &lt;strong&gt;volume-preserving mean curvature flow&lt;/strong&gt; regularity in dimensions 2 and 3 appeared in March 2026 (Arya, Jeon &amp;amp; Julin).&lt;/p&gt;
&lt;p&gt;Regularity theory for &lt;strong&gt;Monge-Ampère equations&lt;/strong&gt; received major contributions via a geometric approach: Brendle, Léger, McCann, and Rankin (2023; arXiv:2311.10208) derived the Pogorelov second-derivative bound using Kim-McCann-Warren&amp;rsquo;s pseudo-Riemannian geometry, providing a new approach to $C^1$ estimates for optimal transport maps. Liouville theorems and sharp solvability for the &lt;strong&gt;parabolic Monge-Ampère equation&lt;/strong&gt; with periodic data appeared in March 2026.&lt;/p&gt;
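&lt;p&gt;The link between second-derivative bounds and $C^1$ estimates for the map is clearest in the model case of the quadratic cost, sketched here with illustrative notation ($f$ and $g$ are the source and target densities, $u$ a convex potential):&lt;/p&gt;
&lt;p&gt;$$T = \nabla u, \qquad \det D^2 u(x) = \frac{f(x)}{g(\nabla u(x))}.$$&lt;/p&gt;
&lt;p&gt;The optimal map is the gradient of the potential, so a bound on $D^2 u$ is a bound on $DT$. For general costs the same structure holds with a modified Hessian, which is where the pseudo-Riemannian geometry enters.&lt;/p&gt;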
&lt;h3 class="heading" id="11-inverse-problems-for-pdes"&gt;
 11. Inverse Problems for PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#11-inverse-problems-for-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The &lt;strong&gt;Calderón problem&lt;/strong&gt; — recovering a coefficient from boundary Dirichlet-to-Neumann data — attracted major advances: the quasilinear setting (Cârstea, Feizmohammadi, Kian, Krupchyk &amp;amp; Uhlmann, 2021), inverse problems for fractional semilinear elliptic equations (Lai &amp;amp; Lin, 2020), the Calderón problem via Vekua theory (Clifford analysis framework, 2026; arXiv:2601.17313), and the convex lifting approach (Alberti, Petit &amp;amp; Sanna, 2025; arXiv:2507.00645). The &lt;strong&gt;anisotropic Calderón problem&lt;/strong&gt; for fractional Schrödinger operators on closed Riemannian manifolds (Krupchyk, 2025) was an important further advance.&lt;/p&gt;
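&lt;p&gt;In its original linear form, with standard notation ($\gamma$ the conductivity, $\nu$ the outward unit normal), the data of the Calderón problem is the Dirichlet-to-Neumann map:&lt;/p&gt;
&lt;p&gt;$$\nabla\cdot(\gamma\nabla u) = 0 \ \text{ in } \Omega, \quad u|_{\partial\Omega} = f, \qquad \Lambda_\gamma f := \gamma\,\partial_\nu u\big|_{\partial\Omega}.$$&lt;/p&gt;
&lt;p&gt;The inverse problem asks whether $\gamma \mapsto \Lambda_\gamma$ is injective and how to reconstruct $\gamma$ from $\Lambda_\gamma$. The quasilinear, fractional, and anisotropic variants above replace the conductivity equation by the corresponding nonlinear, nonlocal, or Riemannian operator.&lt;/p&gt;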
&lt;p&gt;&lt;strong&gt;Inverse moving source problems for parabolic equations&lt;/strong&gt; (Zhao, 2023), reconstruction of scalar parameters in subdiffusion, and inverse problems for &lt;strong&gt;multi-term time-fractional diffusion&lt;/strong&gt; with Caputo derivatives are active in 2025–2026.&lt;/p&gt;
&lt;h3 class="heading" id="12-semi-classical-analysis-spectral-theory-and-nonlinear-elliptic-theory"&gt;
 12. Semi-Classical Analysis, Spectral Theory, and Nonlinear Elliptic Theory&lt;span class="heading__anchor"&gt; &lt;a href="#12-semi-classical-analysis-spectral-theory-and-nonlinear-elliptic-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A 2024 arXiv survey on &lt;strong&gt;semi-classical analysis&lt;/strong&gt; introducing three representative topics, ranked as the top 2024 math.AP paper by Paper Digest, and a 2026 paper celebrating the &lt;strong&gt;100th anniversary of the WKB papers&lt;/strong&gt; (Vũ Ngọc) indicate that semi-classical methods remain foundational.&lt;/p&gt;
&lt;p&gt;In nonlinear elliptic and parabolic theory, major contributions include: &lt;em&gt;Regularity Theory for Elliptic PDEs&lt;/em&gt; by Fernández-Real and Ros-Oton (2023), a comprehensive self-contained reference; Fujita-type results for degenerate parabolic equations on &lt;strong&gt;Heisenberg groups&lt;/strong&gt; (Fino, Ruzhansky &amp;amp; Torebek, 2023), ranked the highest-impact 2023 math.AP paper; and singularity formation for nonlinear heat equations on infinite graphs (Punko &amp;amp; Zucchero, 2026).&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="emerging-and-cross-cutting-themes-20252026"&gt;
 Emerging and Cross-Cutting Themes (2025–2026)&lt;span class="heading__anchor"&gt; &lt;a href="#emerging-and-cross-cutting-themes-20252026"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Computer-assisted proofs and rigorous numerics.&lt;/strong&gt; The Chen–Hou Euler blowup proof and related work on the CLM model (Hou-Wang, 2026) demonstrate that computer-assisted methods with rigorous error control are becoming standard for complex nonlinear stability analyses. These methods combine spectral Galerkin approximations with interval arithmetic and weighted norm frameworks to certify nonlinear stability constants — a methodology likely to expand further.&lt;/p&gt;
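&lt;p&gt;A cartoon of how such a certification closes, not the actual Chen–Hou estimates: suppose an approximate profile has been computed with residual at most $\varepsilon$ in a weighted norm, and the perturbation energy $E(\tau) \ge 0$ around it obeys a differential inequality with damping rate $\lambda &amp;gt; 0$ and nonlinear constant $C &amp;gt; 0$. All three symbols are illustrative.&lt;/p&gt;
&lt;p&gt;$$\frac{dE}{d\tau} \le -\lambda E + C E^2 + \varepsilon, \qquad E_\pm = \frac{\lambda \pm \sqrt{\lambda^2 - 4C\varepsilon}}{2C}, \qquad 4C\varepsilon &amp;lt; \lambda^2 \ \text{ and } \ E(0) &amp;lt; E_+ \ \Longrightarrow\ E(\tau) \le \max\{E(0), E_-\} \ \text{ for all } \tau \ge 0.$$&lt;/p&gt;
&lt;p&gt;The right-hand side is negative for $E_- &amp;lt; E &amp;lt; E_+$, so the energy can never cross a level in that interval from below. The task of the rigorous numerics is to prove that the computed $\lambda$, $C$, and $\varepsilon$ satisfy such a smallness condition, with interval arithmetic bounding every rounding error.&lt;/p&gt;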
&lt;p&gt;&lt;strong&gt;AI and machine learning for PDEs.&lt;/strong&gt; The 2026 workshop &lt;em&gt;MLPDES26&lt;/em&gt; and the NSF/AMS report on AI for the mathematical sciences signal growing interplay between pure math.AP and deep learning. Neural PDE networks for equation discovery (arXiv:2502.18377), geometric operator learning via optimal transport (arXiv:2507.20065), and AI-assisted singularity discovery (DeepMind, 2025) represent this interdisciplinary frontier.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PDE methods in geometry and probability.&lt;/strong&gt; The intersection of math.AP with differential geometry, probability (SPDEs), and mathematical physics remains extremely active. The March 2026 listings span general relativity (tensorial wave equations), Kähler geometry (Ricci solitons), and stochastic PDEs — confirming that math.AP functions as a hub connecting multiple mathematical disciplines.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="open-problems"&gt;
 Open Problems&lt;span class="heading__anchor"&gt; &lt;a href="#open-problems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Smooth-data Euler regularity beyond bounded domains.&lt;/strong&gt; The Chen–Hou result proves blowup in a bounded domain. Whether finite-time singularity occurs for the 3D Euler equations in all of $\mathbb{R}^3$ from smooth, rapidly decaying initial data — the original Euler problem — remains open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Navier-Stokes uniqueness from smooth initial data.&lt;/strong&gt; The Albritton-Brué-Colombo result proves non-uniqueness for &lt;em&gt;forced&lt;/em&gt; NS from zero initial velocity. Non-uniqueness (or uniqueness) of Leray–Hopf solutions for the &lt;em&gt;unforced&lt;/em&gt; equations from smooth $H^1$ initial data is unresolved (see the companion survey on self-similar solutions).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Optimal regularity theory for double phase problems.&lt;/strong&gt; Despite the comprehensive work of De Filippis and Mingione, optimal Schauder estimates for parabolic double phase systems at the boundary and under critical growth conditions are not fully established.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Complete derivation programme for Hilbert&amp;rsquo;s Sixth Problem.&lt;/strong&gt; Deng-Hani-Ma resolved the case of hard-sphere gases in the Boltzmann regime. The derivation of hydrodynamic equations from particle dynamics in other regimes — dense gases, quantum systems, plasma — remains largely open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Global well-posedness for energy-critical NLS in high dimensions.&lt;/strong&gt; Despite progress on wave kinetic theory and probabilistic well-posedness, the deterministic global well-posedness theory for energy-critical and supercritical dispersive equations in dimensions $d \geq 5$ has significant gaps.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Computer-assisted and numerical computation in pure math.AP.&lt;/strong&gt; The growing use of computer-assisted proofs raises methodological questions about standards of verification, reproducibility, and the scope of problems accessible to these techniques.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Albritton, D., Brué, E., &amp;amp; Colombo, M. (2021). &lt;em&gt;Non-uniqueness of Leray solutions of the forced Navier-Stokes equations&lt;/em&gt;. &lt;a href="https://cvgmt.sns.it/media/doc/paper/5405/main.pdf"&gt;https://cvgmt.sns.it/media/doc/paper/5405/main.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bailleul, I., &amp;amp; Bruned, Y. (2021). &lt;em&gt;Renormalised singular stochastic PDEs&lt;/em&gt;. arXiv:2101.11949. &lt;a href="https://www.pure.ed.ac.uk/ws/portalfiles/portal/194767736/2101.11949.pdf"&gt;https://www.pure.ed.ac.uk/ws/portalfiles/portal/194767736/2101.11949.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bailleul, I., &amp;amp; Hoshino, M. (2025). A tourist&amp;rsquo;s guide to regularity structures and singular stochastic PDEs. &lt;em&gt;EMS Surveys in Mathematical Sciences&lt;/em&gt;. &lt;a href="https://ems.press/journals/emss/articles/14298505"&gt;https://ems.press/journals/emss/articles/14298505&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Brendle, S., Léger, F., McCann, R. J., &amp;amp; Rankin, C. (2023). &lt;em&gt;A geometric approach to a priori estimates for optimal transport maps&lt;/em&gt;. arXiv:2311.10208. &lt;a href="https://arxiv.org/abs/2311.10208"&gt;https://arxiv.org/abs/2311.10208&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Chen, J., &amp;amp; Hou, T. Y. (2022). &lt;em&gt;Stable nearly self-similar blowup of the 2D Boussinesq and 3D Euler equations with smooth data I: Analysis&lt;/em&gt;. arXiv:2210.07191. &lt;a href="https://arxiv.org/abs/2210.07191"&gt;https://arxiv.org/abs/2210.07191&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Chen, J., &amp;amp; Hou, T. Y. (2023). &lt;em&gt;Stable nearly self-similar blowup of the 2D Boussinesq and 3D Euler equations with smooth data II: Rigorous numerics&lt;/em&gt;. arXiv:2305.05660. &lt;a href="https://arxiv.org/abs/2305.05660"&gt;https://arxiv.org/abs/2305.05660&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Chen, J., &amp;amp; Hou, T. Y. (2025). Singularity formation in 3D Euler equations with smooth initial data. &lt;em&gt;PNAS, 122&lt;/em&gt;(28). &lt;a href="https://www.pnas.org/doi/10.1073/pnas.2500940122"&gt;https://www.pnas.org/doi/10.1073/pnas.2500940122&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;De Filippis, C., &amp;amp; Mingione, G. (2023). &lt;em&gt;Regularity for double phase problems at nearly linear growth&lt;/em&gt;. arXiv:2308.10222. &lt;a href="https://arxiv.org/abs/2308.10222"&gt;https://arxiv.org/abs/2308.10222&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;DeepMind. (2025). &lt;em&gt;Discovering new solutions to century-old problems in fluid dynamics&lt;/em&gt;. &lt;a href="https://deepmind.google/blog/discovering-new-solutions-to-century-old-problems-in-fluid-dynamics/"&gt;https://deepmind.google/blog/discovering-new-solutions-to-century-old-problems-in-fluid-dynamics/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Deng, Y., &amp;amp; Hani, Z. (2021). &lt;em&gt;On the derivation of the wave kinetic equation for NLS&lt;/em&gt;. arXiv:1912.09518. &lt;a href="http://arxiv.org/pdf/1912.09518.pdf"&gt;http://arxiv.org/pdf/1912.09518.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Deng, Y., Hani, Z., &amp;amp; Ma, X. (2024). &lt;em&gt;Long time derivation of the Boltzmann equation from hard sphere dynamics&lt;/em&gt;. arXiv:2408.07818. &lt;a href="https://www.semanticscholar.org/paper/91b67412a6058c1ace054a32fbf36fa2d2998d3d"&gt;https://www.semanticscholar.org/paper/91b67412a6058c1ace054a32fbf36fa2d2998d3d&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Deng, Y., Hani, Z., &amp;amp; Ma, X. (2025). &lt;em&gt;Hilbert&amp;rsquo;s sixth problem: Derivation of fluid equations via Boltzmann&amp;rsquo;s kinetic theory&lt;/em&gt;. arXiv:2503.01800. &lt;a href="https://www.semanticscholar.org/paper/01d8f11b5d31f7037fb4914797e938db11d76ec5"&gt;https://www.semanticscholar.org/paper/01d8f11b5d31f7037fb4914797e938db11d76ec5&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Ferrari, F., Forcillo, N., Giovagnoli, D., &amp;amp; Jesus, B. (2024). &lt;em&gt;Free boundary regularity for the inhomogeneous one-phase Stefan problem&lt;/em&gt;. arXiv:2404.07535. &lt;a href="https://arxiv.org/abs/2404.07535"&gt;https://arxiv.org/abs/2404.07535&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Gubinelli, M., Li, J., Li, T., &amp;amp; Oh, T. (2025). &lt;em&gt;Nonlinear PDEs with modulated dispersion IV: Normal form reduction for modulated KdV&lt;/em&gt;. arXiv:2505.24270. &lt;a href="https://arxiv.org/pdf/2505.24270.pdf"&gt;https://arxiv.org/pdf/2505.24270.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hou, T. Y. (2021). &lt;em&gt;The potentially singular behavior of the 3D Navier-Stokes equations&lt;/em&gt;. arXiv:2107.06509. &lt;a href="https://arxiv.org/abs/2107.06509"&gt;https://arxiv.org/abs/2107.06509&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hu, J., Jin, S., Liu, N., &amp;amp; Zhang, L. (2024). Quantum circuits for partial differential equations via Schrödingerisation. &lt;em&gt;Quantum, 8&lt;/em&gt;, 1563.&lt;/p&gt;
&lt;p&gt;Imanuvilov, O. Y., Liu, Y., &amp;amp; Yamamoto, M. (2023). Lipschitz stability for determining states and inverse sources in MFG equations. &lt;em&gt;[Journal of Mathematical Analysis]&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Ok, J., Scilla, G., &amp;amp; Stroffolini, B. (2025). &lt;em&gt;Partial regularity for parabolic systems of double phase type&lt;/em&gt;. arXiv:2510.03849. &lt;a href="https://arxiv.org/pdf/2510.03849.pdf"&gt;https://arxiv.org/pdf/2510.03849.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Paper Digest. (2025, March). &lt;em&gt;Most influential arXiv (Analysis of PDEs) papers — 2025-03 version&lt;/em&gt;. &lt;a href="https://www.paperdigest.org/2025/03/most-influential-arxiv-analysis-of-pdes-papers-2025-03-version/"&gt;https://www.paperdigest.org/2025/03/most-influential-arxiv-analysis-of-pdes-papers-2025-03-version/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Segata, J., &amp;amp; Chen, M. (2026). &lt;em&gt;Scattering for the 3D Zakharov-Kuznetsov equation&lt;/em&gt; [arXiv preprint]. arXiv math.AP March 2026.&lt;/p&gt;
&lt;p&gt;arXiv math.AP listings. (2026, February–March). &lt;a href="https://arxiv.org/list/math.AP/2026-03"&gt;https://arxiv.org/list/math.AP/2026-03&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Paper Reading - Optimization problems for elliptic PDEs (2601.01591)</title><link>https://blog.namln.org/en/posts/pr-2601.01591/</link><pubDate>Fri, 20 Feb 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/pr-2601.01591/</guid><description>&lt;p&gt;This paper is a panoramic tour of three families of &lt;strong&gt;optimal control problems for elliptic PDEs&lt;/strong&gt;: where the control is the coefficient, the potential, or the source term, unifying and sharpening results from the authors’ previous works.&lt;/p&gt;
&lt;h2 class="heading" id="three-ways-to-control-an-elliptic-pde"&gt;
 Three ways to control an elliptic PDE&lt;span class="heading__anchor"&gt; &lt;a href="#three-ways-to-control-an-elliptic-pde"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The authors always consider a Dirichlet problem on a bounded domain $\Omega \subset \mathbb{R}^d$, with the solution $u$ as the &lt;strong&gt;state&lt;/strong&gt; and a function (or measure) as the &lt;strong&gt;control&lt;/strong&gt;. They study three settings:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal coefficients&lt;/strong&gt; $a(x)$:
$$
-\mathrm{div}(a(x)\nabla u) = f \text{ in } \Omega, \quad u=0 \text{ on } \partial\Omega,
$$
cost function $J(u,a) = \int_\Omega j(u,a)\,dx$, with a constraint $\int_\Omega \psi(a)\,dx \le 1$.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal potentials&lt;/strong&gt; $V(x)$:
$$
-\Delta u + V(x)u = f \text{ in } \Omega, \quad u\in H_0^1(\Omega),
$$
cost function $J(u,V) = \int_\Omega (j(x,u) + \psi(V))\,dx$.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal sources&lt;/strong&gt; $f$:
$$
-\Delta u = f \text{ in } \Omega, \quad u\in H_0^1(\Omega),
$$
cost function $J(f) = \int_\Omega j(x,u_f,f)\,dx$ with $\int_\Omega \psi(f)\,dx \le m$.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In all cases, $\psi$ is convex and lower semi-continuous (l.s.c.), encoding constraints and penalizations on the control. The paper focuses on existence of optimal controls (sometimes as measures), their characterization via auxiliary variational problems and adjoint states, bang–bang behavior, and regularity of optimal controls and their induced interfaces.&lt;/p&gt;
&lt;h2 class="heading" id="optimal-coefficients-where-to-put-the-good-material"&gt;
 Optimal Coefficients: Where to Put the Good Material?&lt;span class="heading__anchor"&gt; &lt;a href="#optimal-coefficients-where-to-put-the-good-material"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="minimal-compliance-and-measure-valued-coefficients"&gt;
 Minimal Compliance and Measure-Valued Coefficients&lt;span class="heading__anchor"&gt; &lt;a href="#minimal-compliance-and-measure-valued-coefficients"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;The model problem is compliance minimization for $-\mathrm{div}(a(x)\nabla u) = f$, $u=0$ on $\partial\Omega$, with nonnegative $a$.&lt;/p&gt;
&lt;p&gt;Compliance is defined as:
$$
C(a) = \int_\Omega f u_a\,dx,
$$
and it relates to the energy
$$
E(a) = \inf_{u\in H_0^1} \int_\Omega \left(\tfrac{1}{2} a|\nabla u|^2 - f u\right)dx
$$
via $C(a) = -2E(a)$.&lt;/p&gt;
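&lt;p&gt;The identity $C(a) = -2E(a)$ can be checked in two lines. As a sketch, assume $a$ is regular enough that the minimizer $u_a$ of $E(a)$ exists in $H_0^1(\Omega)$. Testing the weak form of the state equation with $u_a$ itself gives
$$
\int_\Omega a|\nabla u_a|^2\,dx = \int_\Omega f u_a\,dx = C(a),
$$
and therefore
$$
E(a) = \tfrac{1}{2}\int_\Omega a|\nabla u_a|^2\,dx - \int_\Omega f u_a\,dx = \tfrac{1}{2}C(a) - C(a) = -\tfrac{1}{2}C(a).
$$&lt;/p&gt;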
&lt;p&gt;The optimization problem is written as:
$$
\min_{a \geq 0} \left\{ C(a) + \int_\Omega \psi(a)dx \right\},
$$
or equivalently as a &lt;strong&gt;max–min&lt;/strong&gt; problem in $(a,u)$.&lt;/p&gt;
&lt;p&gt;Two growth regimes of $\psi$ are crucial:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Superlinear&lt;/strong&gt;: $\psi(s)/s \to +\infty$. Then admissible coefficients are in $L^1(\Omega)$, and there exists an optimal $a_{\mathrm{opt}}\in L^1(\Omega)$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Linear growth&lt;/strong&gt;: $\psi(s)/s \to k&amp;gt;0$. Then it is natural to extend the problem to &lt;strong&gt;measures&lt;/strong&gt; $\mu\ge 0$, allowing &amp;ldquo;thin&amp;rdquo; structures on lower-dimensional sets. The cost $\int \psi(\mu)$ is interpreted through the Lebesgue–singular decomposition and the recession function $\psi_\infty$. An optimal measure $\mu_{\mathrm{opt}}\in \mathcal{M}^+(\Omega)$ still exists.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Because the functional is convex in $u$ and concave in $a$, the authors exchange inf and sup and reduce to an &lt;strong&gt;auxiliary minimization problem in $u$&lt;/strong&gt; alone:
$$
\inf_{u} \int_\Omega \psi^{*}(|\nabla u|^2)dx - 2\int_\Omega u df,
$$
where $\psi^{*}$ is the Legendre–Fenchel conjugate. Under mild assumptions this problem has a unique minimizer $\bar u$, and the optimal coefficient is recovered point-wise from the optimality condition:
$$
a_{\mathrm{opt}}|\nabla\bar u|^2 = \psi(a_{\mathrm{opt}}) + \psi^*(|\nabla\bar u|^2).
$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Examples&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Power penalization&lt;/strong&gt; $\psi(s) = s^p/p$, $p&amp;gt;1$: The auxiliary problem involves a nonlinear PDE
$$-\Delta_{2p/(p-1)} u = f,$$
and the optimal coefficient is $a_{\mathrm{opt}}(x) = |\nabla \bar u(x)|^{2/(p-1)}$. For $\Omega$ a ball and $f=1$ or $f=\delta_0$, the authors give explicit radial formulas and plots for $\bar u$ and $a_{\mathrm{opt}}$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Two-phase box constraint&lt;/strong&gt; $\psi(s) = s$ on $[\alpha,\beta]$, $+\infty$ otherwise: The auxiliary problem yields an optimal coefficient $a_{\mathrm{opt}}\in L^\infty(\Omega)$ taking values in $[\alpha,\beta]$, and under regularity of $\Omega$ and $f$ one gets extra smoothness (e.g. $\nabla a_{\mathrm{opt}}\cdot \nabla \bar u \in L^2(\Omega)$).&lt;/li&gt;
&lt;/ul&gt;
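&lt;p&gt;The power-penalization formulas above can be sanity-checked numerically. The sketch below is only an illustration (it assumes NumPy, and the names &lt;code&gt;psi&lt;/code&gt; and &lt;code&gt;conjugate_on_grid&lt;/code&gt; are made up here, not taken from the paper). It approximates $\psi^*(t) = \sup_{s \ge 0}(ts - \psi(s))$ by brute force on a grid and compares the value and the maximizer with the closed forms $t^{p/(p-1)}(p-1)/p$ and $t^{1/(p-1)}$. With $t = |\nabla \bar u|^2$, the maximizer is exactly the point-wise formula $a_{\mathrm{opt}} = |\nabla \bar u|^{2/(p-1)}$.&lt;/p&gt;

```python
import numpy as np

def psi(s, p):
    """Power penalization psi(s) = s**p / p for nonnegative s."""
    return s**p / p

def conjugate_on_grid(t, p, s_max=10.0, n=200001):
    """Brute-force sup over a grid of s in [0, s_max] of t*s - psi(s).

    Returns the approximate conjugate value and the maximizing s.
    """
    s = np.linspace(0.0, s_max, n)
    values = t * s - psi(s, p)
    k = int(np.argmax(values))
    return values[k], s[k]

p = 3.0
p_conj = p / (p - 1.0)                     # conjugate exponent p' = p/(p-1)
for t in (0.5, 1.0, 4.0):                  # t plays the role of |grad u|^2
    val, s_star = conjugate_on_grid(t, p)
    closed_val = t**p_conj / p_conj        # psi*(t) = t^{p'} / p'
    closed_s = t**(1.0 / (p - 1.0))        # maximizer, equal to (psi*)'(t)
    print(t, val, closed_val, s_star, closed_s)
```

&lt;p&gt;The grid bound &lt;code&gt;s_max&lt;/code&gt; has to exceed the true maximizer, so it would need to grow with $t$ in a more careful version.&lt;/p&gt;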
&lt;h3 class="heading" id="general-coefficients-and-g-closure"&gt;
 General Coefficients and G-Closure&lt;span class="heading__anchor"&gt; &lt;a href="#general-coefficients-and-g-closure"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;For a &lt;strong&gt;general cost&lt;/strong&gt;:
$$\min_{a\ge 0}\min_{u} \int_\Omega (j(x,u)+\psi(a))\,dx \quad \text{s.t. } u \text{ solves } -\mathrm{div}(a\nabla u)=f,$$
existence of an optimal $a$ may fail.&lt;/p&gt;
&lt;p&gt;The relaxed problem is naturally expressed via &lt;strong&gt;G-convergence&lt;/strong&gt;: sequences of scalar coefficients $a_n\in[\alpha,\beta]$ can generate limit operators with &lt;strong&gt;matrix-valued coefficients&lt;/strong&gt; $A(x)$, described by the celebrated Murat–Tartar &lt;strong&gt;G-closure&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The G-closure set $\mathcal{A}$ consists of symmetric matrices $A(x)$ whose eigenvalues $\lambda_1\le\cdots\le\lambda_d$ lie in $[\alpha,\beta]$ and satisfy a family of inequalities depending on a mixing parameter $t\in[0,1]$, involving the arithmetic and harmonic means $\mu_t, \nu_t$ of $\alpha,\beta$. For $d=2$, this gives an explicit admissible region in the $(\lambda_1,\lambda_2)$-plane.&lt;/p&gt;
&lt;p&gt;Relaxed functionals of the form $\int \psi(x,a)\,dx$ over G-limits have been studied in special cases, e.g. $\psi(x,a)=g(x)a$, where one can express the relaxation in terms of the largest eigenvalue $\lambda_{\max}(A(x))$. The authors show a numerical example where the relaxed optimal matrix $A_{\mathrm{opt}}$ has eigenvalues $\lambda_1\neq \lambda_2$ on a set of positive measure, revealing genuine microstructure.&lt;/p&gt;
&lt;h2 class="heading" id="optimal-potentials-shaping-the-landscape-vx"&gt;
 Optimal Potentials: Shaping the &amp;ldquo;Landscape&amp;rdquo; $V(x)$&lt;span class="heading__anchor"&gt; &lt;a href="#optimal-potentials-shaping-the-landscape-vx"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Here the control is a &lt;strong&gt;nonnegative potential&lt;/strong&gt; $V$ in
$$-\Delta u + V u = f, \quad u\in H_0^1(\Omega).$$
The cost is:
$$\min \int_\Omega (j(x,u) + \psi(V))\,dx,$$
with $V\ge 0$ and $\psi$ convex, l.s.c., super-linear (so any finite-cost $V$ lies in $L^1(\Omega)$).&lt;/p&gt;
&lt;h3 class="heading" id="compliance-case-eliminating-the-control"&gt;
 Compliance Case: Eliminating the Control&lt;span class="heading__anchor"&gt; &lt;a href="#compliance-case-eliminating-the-control"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;For the compliance choice $j(x,u) = f(x)u$, the problem can again be reduced to a variational problem in $u$ only.&lt;/p&gt;
&lt;p&gt;Define:
$$
E(V) = \min_{u\in H_0^1(\Omega)} \int_\Omega \left(\tfrac{1}{2} |\nabla u|^2 + \tfrac{1}{2} V u^2 - f u\right)dx, \quad \Psi(V)=\int_\Omega \psi(V)\,dx.
$$&lt;/p&gt;
&lt;p&gt;Minimizing $-2E(V)+\Psi(V)$ over $V\ge 0$ is equivalent to:
$$
\min_{u\in H_0^1(\Omega)} \int_\Omega \left(|\nabla u|^2 + \psi^*(u^2) - 2 f u\right)dx,
$$
a semi-linear elliptic problem in $u$ with nonlinearity $g(s)=s(\psi^*)'(s^2)$. The optimal state $\bar u$ solves:
$$
-\Delta u + g(u) = f, \quad u\in H_0^1(\Omega),
$$
and the optimal potential is:
$$
V_{\mathrm{opt}} = (\psi^*)'(\bar u^2).
$$
So in this special case the control can be &lt;strong&gt;explicitly reconstructed&lt;/strong&gt; from the state.&lt;/p&gt;
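&lt;p&gt;As an illustration (this particular choice of $\psi$ is an assumption made here for concreteness, not an example taken from the paper), take the power penalization $\psi(s) = s^p/p$ with $p&amp;gt;1$, which is convex and super-linear. For $t \ge 0$ its conjugate and derivative are
$$
\psi^*(t) = \frac{p-1}{p}\,t^{p/(p-1)}, \qquad (\psi^*)'(t) = t^{1/(p-1)},
$$
so the nonlinearity, the state equation and the optimal potential become
$$
g(s) = |s|^{2/(p-1)}s, \qquad -\Delta \bar u + |\bar u|^{2/(p-1)}\bar u = f, \qquad V_{\mathrm{opt}} = |\bar u|^{2/(p-1)}.
$$
For $p=2$ this is the cubic equation $-\Delta \bar u + \bar u^3 = f$ with $V_{\mathrm{opt}} = \bar u^2$.&lt;/p&gt;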
&lt;h3 class="heading" id="general-costs-adjoint-equation-and-regularity"&gt;
 General Costs, Adjoint Equation, and Regularity&lt;span class="heading__anchor"&gt; &lt;a href="#general-costs-adjoint-equation-and-regularity"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;For a general $j(x,u)$, the authors prove an &lt;strong&gt;existence theorem&lt;/strong&gt; of an optimal $V_{\mathrm{opt}}\in L^1(\Omega)$ under natural growth and coercivity assumptions on $j$ and super-linearity of $\psi$.&lt;/p&gt;
&lt;p&gt;Optimality conditions involve:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The state $\bar u$ solving $-\Delta u + V_{\mathrm{opt}}u = f$.&lt;/li&gt;
&lt;li&gt;An adjoint state $v$ solving $-\Delta v + V_{\mathrm{opt}} v = \partial_s j(x,\bar u)$.&lt;/li&gt;
&lt;li&gt;A sub-differential relation $\bar u v \in \partial\psi(V_{\mathrm{opt}})$, rewritten as a point-wise inequality $h^{-}(\bar u v) \le V_{\mathrm{opt}} \le h(\bar u v)$, where $h$ is built from the sub-differential of $\psi$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;From here, regularity of $V_{\mathrm{opt}}$ is linked to properties of $h$ and to elliptic regularity for $\bar u$ and $v$. Under strengthened assumptions on $j$, $f$, and $\Omega$, the authors show that $\bar u, v \in W^{2,q}(\Omega)$ for some $q&amp;gt;d/2$ (hence continuous), and the product $\bar u v V_{\mathrm{opt}}$ is in $BV(\Omega)$, so $V_{\mathrm{opt}}\in BV_{\mathrm{loc}}(\Omega\setminus K)$ where $K = \{\bar u v = 0\}$. This identifies the &amp;ldquo;degeneracy set&amp;rdquo; $K$ as the core where singularities of the optimal potential may concentrate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bang–Bang Potentials&lt;/strong&gt;: If $\psi$ is affine on an interval $[\alpha,\beta]$ (e.g. $\psi(s) = s$ on $[\alpha,\beta]$, $+\infty$ otherwise), the function $h$ becomes multi-valued and the optimal potential is &lt;strong&gt;bang–bang&lt;/strong&gt;:
$$
V_{\mathrm{opt}} = \alpha + (\beta-\alpha)\mathbf{1}_E
$$
for some set $E$ of finite perimeter. The paper includes numerical simulations showing the geometry of such sets for specific loads $f$.&lt;/p&gt;
&lt;h2 class="heading" id="optimal-sources-choosing-the-right-hand-side"&gt;
 Optimal Sources: Choosing the Right-Hand Side&lt;span class="heading__anchor"&gt; &lt;a href="#optimal-sources-choosing-the-right-hand-side"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Finally, the control is the source $f$ in $-\Delta u = f$, $u\in H_0^1(\Omega)$, with cost $J(f) = \int_\Omega j(x,u_f,f)\,dx$ and constraint $\int_\Omega \psi(f)\,dx\le m$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Existence with Superlinear and Linear $\psi$&lt;/strong&gt;: If $\psi$ is &lt;strong&gt;super-linear&lt;/strong&gt; and $j$ satisfies suitable lower bounds and convexity in $f$, then an optimal $f_{\mathrm{opt}}\in L^1(\Omega)$ exists.&lt;/p&gt;
&lt;p&gt;If $\psi$ has &lt;strong&gt;linear growth&lt;/strong&gt;, the natural admissible class is signed measures $f$ with finite total variation, and $\int \psi(f)$ is defined via the Lebesgue–singular decomposition and recession coefficients $c_-(\psi), c_+(\psi)$. Under a decomposition $j(x,s,z)=A(x,s)+B(x,z)$ with specific structure and lower bounds, the functional is lower semi-continuous under weak-* convergence of measures, and there exists an optimal measure-valued source $f_{\mathrm{opt}}$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Optimality Conditions and Bang–Bang Description&lt;/strong&gt;: Introduce the self-adjoint &lt;strong&gt;resolvent&lt;/strong&gt; operator $R$ mapping a source $f$ to the solution $u_f$. Under differentiability and growth conditions on $j$, the authors derive necessary (and, under convexity, sufficient) conditions for optimality. For super-linear $\psi$, define:
$$
w := R\big(\partial_s j(x, R(f_{\mathrm{opt}}), f_{\mathrm{opt}})\big) + \partial_z j(x, R(f_{\mathrm{opt}}), f_{\mathrm{opt}}).
$$
Then there is $\lambda \ge 0$ such that either:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;$\lambda=0$&lt;/strong&gt;: $w$ has a fixed sign and $f_{\mathrm{opt}}$ saturates the endpoints of $\mathrm{dom}(\psi)$ on the regions where $w$ is strictly positive/negative — a &lt;strong&gt;pure bang–bang&lt;/strong&gt; behavior.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;$\lambda&amp;gt;0$&lt;/strong&gt;: the constraint is saturated, $\int \psi(f_{\mathrm{opt}})=m$, and $f_{\mathrm{opt}}$ satisfies a point-wise equality involving $\psi$, its conjugate $\psi^*$, and $w$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For linear-growth $\psi$, a similar structure holds, but the singular part of $f_{\mathrm{opt}}$ is supported on level sets where $w$ hits thresholds determined by the slopes $c_-(\psi), c_+(\psi)$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Spectral Example: Maximizing Energy Under an $L^2$ Constraint&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For:
$$
j(u) = -\tfrac{1}{2} u^2, \quad \psi(s)=\tfrac{1}{2} s^2,
$$
the problem becomes:
$$
\max \left\{\frac{1}{2}\int_\Omega u_f^2\,dx : \int_\Omega f^2\,dx \le 2m \right\}.
$$&lt;/p&gt;
&lt;p&gt;The optimality system shows that the optimal source $f$ satisfies a &lt;strong&gt;fourth-order eigenvalue problem&lt;/strong&gt; $\Delta^2 f = f/\lambda$, equivalent to an eigenvalue problem for the Laplacian. The maximizer is a multiple of the &lt;strong&gt;first Dirichlet eigenfunction&lt;/strong&gt; $\varphi$ of $-\Delta$:
$$
f = \pm \sqrt{2m}\,\varphi, \quad \lambda = 1/\mu_1^2,
$$
where $\mu_1$ is the first eigenvalue and $\varphi$ is normalized in $L^2(\Omega)$. The paper includes a numerical plot for such an optimal source in an ellipse.&lt;/p&gt;
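&lt;p&gt;A one-dimensional finite-difference check makes the spectral statement concrete. This is a minimal sketch under assumptions chosen here, not the paper's computation: $\Omega = (0,1)$, a uniform grid, NumPy, and the normalization $2m = 1$. Maximizing $\|R f\|^2$ at fixed $L^2$ norm selects the top eigenvector of $R^2$, which should match $\varphi(x) = \sqrt{2}\sin(\pi x)$, with top eigenvalue close to $1/\mu_1^2 = 1/\pi^4$.&lt;/p&gt;

```python
import numpy as np

n = 200                       # interior grid points on (0, 1)
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)

# Finite-difference Dirichlet Laplacian: (-u'')_i ~ (2u_i - u_{i-1} - u_{i+1}) / h^2
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
R = np.linalg.inv(A)          # discrete resolvent: u_f = R f

# Maximizing ||R f||^2 at fixed norm of f selects the top eigenvector of R^2.
eigvals, eigvecs = np.linalg.eigh(R @ R)
lam = eigvals[-1]             # largest eigenvalue of R^2
f = eigvecs[:, -1]
f = f * np.sign(f[n // 2])              # fix the sign so that f is positive
f = f / np.sqrt(h * np.sum(f**2))       # normalize in the discrete L^2 norm

phi = np.sqrt(2.0) * np.sin(np.pi * x)  # first Dirichlet eigenfunction on (0, 1)
mu1 = np.pi**2                          # first Dirichlet eigenvalue on (0, 1)
print(lam, 1.0 / mu1**2)                # the two values should nearly agree
print(np.max(np.abs(f - phi)))          # distance between f and phi on the grid
```

&lt;p&gt;For a general budget the maximizer is simply rescaled by $\sqrt{2m}$, since the problem is homogeneous in $f$.&lt;/p&gt;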
&lt;p&gt;&lt;strong&gt;Compliance with Box Constraints on the Source&lt;/strong&gt;: For compliance with box constraints:
$$
\min \left\{\int_\Omega f\,R(f)\,dx : \int_\Omega f\,dx \ge m,\ f\in[\alpha,\beta]\right\}, \quad 0\le \alpha&amp;lt;\beta,
$$
the optimal source is bang–bang:
$$
f _{\mathrm{opt}} = \beta\,\mathbf{1} _E + \alpha\,\mathbf{1} _{\Omega\setminus E},
$$
with $E = \{R(f _{\mathrm{opt}}) &amp;lt; s\}$ and $s$ chosen to fit the mass constraint. The corresponding state solves:
$$
-\Delta u = \beta\,\mathbf{1} _{\{u&amp;lt;s\}} + \alpha\,\mathbf{1} _{\{u&amp;gt;s\}}.
$$&lt;/p&gt;
&lt;p&gt;Using results from their previous work on optimal potentials, the authors prove that $f _{\mathrm{opt}} \in BV(\Omega)$: the interface between the regions where $f=\alpha$ and $f=\beta$ has finite perimeter.&lt;/p&gt;
&lt;p&gt;If $\Omega$ is &lt;strong&gt;convex&lt;/strong&gt;, they go further: in the special case $\alpha = 0$, $f _{\mathrm{opt}} = \mathbf{1} _E$ with $E = \{w &amp;lt; s\}$, where $w$ solves $-\Delta w = \mathbf{1} _{\{w&amp;lt;s\}}$. Since $w$ vanishes on $\partial\Omega$, the set $E$ where the source is switched on is a collar around the boundary, so the convexity statement concerns its complement: the switched-off core $\Omega\setminus E = \{w \ge s\}$ is &lt;strong&gt;convex&lt;/strong&gt; and its boundary is of class $C^1$. So in convex domains, the region where you &amp;ldquo;turn off&amp;rdquo; the source to minimize compliance is itself a convex set with $C^1$ boundary.&lt;/p&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal sources for elliptic PDEs. arXiv preprint arXiv:2509.01521.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025optimal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal sources for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2509.01521}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;[2] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal coefficients for elliptic PDEs. arXiv preprint arXiv:2512.08431.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025coefficients&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal coefficients for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2512.08431}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;[3] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2026). Optimization problems for elliptic PDEs. arXiv preprint arXiv:2601.01591.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2026optimization&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimization problems for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2601.01591}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2026}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;</description></item><item><title>Paper Reading - Optimal coefficients for elliptic PDEs (2512.08431)</title><link>https://blog.namln.org/en/posts/pr-2512.08431/</link><pubDate>Thu, 19 Feb 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/pr-2512.08431/</guid><description>&lt;p&gt;This paper gives a clear, fairly complete picture of how to optimally choose the &lt;strong&gt;coefficient&lt;/strong&gt; $a(x)$ (think &amp;ldquo;material quality&amp;rdquo;) in an elliptic PDE, with compliance as the main model and then a general optimal control formulation.&lt;/p&gt;
&lt;h2 class="heading" id="problem-setup"&gt;
 Problem Setup&lt;span class="heading__anchor"&gt; &lt;a href="#problem-setup"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Consider the boundary value problem:
$$
-{\rm div}(a(x)\nabla u) = f \quad\text{in } \Omega,\qquad u=0 \text{ on } \partial\Omega,
$$
where $\Omega$ is a bounded domain, $f$ is a given load, and $a(x)$ is the design variable.&lt;/p&gt;
&lt;p&gt;Typical assumptions on $a(x)$:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Point-wise bounds $\alpha \le a(x) \le \beta$ (two material qualities, e.g., “soft” vs “stiff”).&lt;/li&gt;
&lt;li&gt;Possibly a budget constraint (e.g., only a fixed fraction of the domain can use the best material $\beta$).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The map $a \mapsto u_a$ is well-defined by elliptic theory: for each admissible $a$, the PDE has a unique weak solution in $H_0^1(\Omega)$.&lt;/p&gt;
&lt;h2 class="heading" id="example"&gt;
 Example&lt;span class="heading__anchor"&gt; &lt;a href="#example"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The &lt;strong&gt;elastic compliance&lt;/strong&gt; is a classical cost in mechanics: it measures how much the structure deforms under the load $f$. In this setting, a standard functional is&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;either $C(a) = \int_\Omega f\,u_a\,dx$ (work of the load),&lt;/li&gt;
&lt;li&gt;or equivalently the elastic energy $\int_\Omega a(x)\,|\nabla u_a|^2\,dx$, which equals $C(a)$ because $u_a$ solves the state equation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Minimizing the compliance means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Given a fixed load and a given volume of good material, distribute $a(x)$ in $\Omega$ so that the resulting displacement $u_a$ is as small as possible in the energy sense.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Key qualitative facts the paper emphasizes in this compliance setting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Existence&lt;/strong&gt;: under standard bounds $\alpha \le a \le \beta$ and a convex constraint (like a fixed integral of $a$), there exists at least one optimal coefficient $a_{\text{opt}}$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Extremal behavior&lt;/strong&gt;: because the compliance functional is convex in $u$ but often leads to a concave dependence on $a$ under constraints, optimal $a_{\text{opt}}$ tend to take values only at the extremes $\alpha$ or $\beta$ almost everywhere, a typical “black-and-white” design phenomenon known in topology optimization.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Intuitively, if we can choose between “bad” and “good” material at each point but only have a limited budget of good material, it is never optimal to mix them continuously; we either go full good or full bad locally and let the PDE determine where gradients are large so good material is most effective.&lt;/p&gt;
&lt;h2 class="heading" id="from-two-phase-design-to-optimal-control"&gt;
 From two-phase design to optimal control&lt;span class="heading__anchor"&gt; &lt;a href="#from-two-phase-design-to-optimal-control"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The authors then move to a more general &lt;strong&gt;PDE-constrained optimal control&lt;/strong&gt; view: $a(x)$ is the control, the PDE is the state equation, and the cost is an abstract functional
$$
J(a) = \int_\Omega j(x, u_a(x), a(x), \nabla u_a(x))\,dx,
$$
possibly plus boundary or integral terms.&lt;/p&gt;
&lt;p&gt;In this general framework:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The admissible set $\mathcal{A}$ of coefficients may encode box constraints, integral constraints, or more refined structure (e.g., multi-phase materials).&lt;/li&gt;
&lt;li&gt;The goal is to minimize $J(a)$ over $\mathcal{A}$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The paper outlines how standard tools of optimal control of PDEs apply:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Adjoint equation&lt;/strong&gt;: one introduces an adjoint state $p$ solving its own elliptic problem linked to derivatives of $j$ with respect to $u$ and $\nabla u$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;First-order optimality&lt;/strong&gt;: optimal coefficients satisfy variational inequalities or pointwise optimality conditions involving $a_{\text{opt}}$, $u_{a_{\text{opt}}}$, and $p$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In simple situations, one gets an explicit “gradient” of the cost with respect to the coefficient:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;local changes in $a(x)$ are weighted by expressions involving $\nabla u$ and $\nabla p$;&lt;/li&gt;
&lt;li&gt;this tells us where increasing stiffness (raising $a$) helps most, and where it is wasteful.&lt;/li&gt;
&lt;/ul&gt;
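&lt;p&gt;To make this “gradient” concrete, here is a short formal derivation under simplifying assumptions that are ours rather than the paper’s: the state equation is taken to be $-\operatorname{div}(a\nabla u_a)=f$ in $\Omega$ with $u_a=0$ on $\partial\Omega$, and the integrand $j=j(x,u)$ depends only on the state. For an admissible perturbation $b$ of the coefficient, write $u'$ for the derivative of $u_a$ in the direction $b$, and let $p$ solve the adjoint problem $-\operatorname{div}(a\nabla p)=\partial_u j(x,u_a)$ with $p=0$ on $\partial\Omega$. Differentiating the weak form of the state equation and testing with $p$ gives
$$
\int_\Omega a\,\nabla u'\cdot\nabla p\,dx = -\int_\Omega b\,\nabla u_a\cdot\nabla p\,dx,
$$
while testing the adjoint equation with $u'$ gives
$$
J'(a)[b] = \int_\Omega \partial_u j(x,u_a)\,u'\,dx = \int_\Omega a\,\nabla p\cdot\nabla u'\,dx = -\int_\Omega b\,\nabla u_a\cdot\nabla p\,dx.
$$
For the compliance $J(a)=\int_\Omega f\,u_a\,dx$ we have $\partial_u j=f$, hence $p=u_a$ and $J'(a)[b]=-\int_\Omega b\,|\nabla u_a|^2\,dx$: in this simplified setting, adding stiffness lowers the compliance most where $|\nabla u_a|$ is largest.&lt;/p&gt;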
&lt;p&gt;This general perspective makes clear that compliance minimization is just one concrete instance of a broader family of coefficient optimization problems.&lt;/p&gt;
&lt;h2 class="heading" id="bangbang-and-intermediate-materials"&gt;
 Bang–bang and intermediate materials&lt;span class="heading__anchor"&gt; &lt;a href="#bangbang-and-intermediate-materials"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;A recurring theme, already visible in compliance, is whether optimal coefficients are &lt;strong&gt;bang–bang&lt;/strong&gt; (only $\alpha$ or $\beta$) or can take intermediate values.&lt;/p&gt;
&lt;p&gt;The paper’s message, in line with the authors’ broader work, is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Under &lt;strong&gt;linear&lt;/strong&gt; or suitably convex-structured costs and simple constraints, the optimization problem often favors &lt;strong&gt;extreme coefficients&lt;/strong&gt; because any “grey” intermediate material can be improved by redistributing toward the extremes while keeping constraints satisfied.&lt;/li&gt;
&lt;li&gt;If instead the cost penalizes variations of $a$ (e.g., includes $|\nabla a|$ or a strictly convex cost of $a$), then intermediate values can become optimal and the design becomes smoother.&lt;/li&gt;
&lt;/ul&gt;
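&lt;p&gt;One way to see the first point, sketched here under assumptions of our own (box constraint $\alpha\le a\le\beta$, volume constraint $\int_\Omega a\,dx=m$, and a cost whose first variation can be written as $J'(a)[b]=\int_\Omega g\,b\,dx$ for some sensitivity density $g$ depending on $a$): a minimizer $a_{\text{opt}}$ must also minimize the linear functional $a\mapsto\int_\Omega g\,a\,dx$ over the admissible set, with $g$ frozen at $a_{\text{opt}}$. By the classical bathtub principle there is a threshold $\lambda\in\mathbb{R}$ with
$$
a_{\text{opt}}(x)=\beta \ \text{ where } g(x)&amp;lt;\lambda,\qquad a_{\text{opt}}(x)=\alpha \ \text{ where } g(x)&amp;gt;\lambda,
$$
so intermediate values can only occur on the level set $\{g=\lambda\}$. When that level set has measure zero, the optimal coefficient is bang–bang.&lt;/p&gt;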
&lt;p&gt;This has practical consequences:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For pure stiffness or compliance problems, we should expect &amp;ldquo;black-and-white&amp;rdquo; topologies.&lt;/li&gt;
&lt;li&gt;For problems where manufacturing or grading costs matter, optimal designs may be graded rather than sharply two-phase.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="applications"&gt;
 Applications&lt;span class="heading__anchor"&gt; &lt;a href="#applications"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Even though the arXiv abstract is brief, the paper’s role is clear: it systematizes and clarifies the theory of &lt;strong&gt;optimal coefficients for elliptic PDEs&lt;/strong&gt; in two complementary regimes—compliance and more general optimal control.&lt;/p&gt;
&lt;p&gt;For engineers and applied mathematicians, the main takeaways are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We can rigorously frame &amp;ldquo;optimal material distribution&amp;rdquo; as an elliptic PDE with a coefficient control and prove &lt;strong&gt;existence&lt;/strong&gt; of optimal designs under realistic constraints.&lt;/li&gt;
&lt;li&gt;In many practically relevant cases (especially compliance), optimal designs heavily favor &lt;strong&gt;extreme phases&lt;/strong&gt;, justifying the common use of binary material models in topology optimization.&lt;/li&gt;
&lt;li&gt;Adjoint-based optimality conditions give a &lt;strong&gt;computable sensitivity&lt;/strong&gt; of the cost to local changes in $a$, providing the mathematical underpinning for gradient-based optimization algorithms.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If we imagine designing a bridge deck or a heat sink, this theory tells us:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;where to place stiff or conductive material,&lt;/li&gt;
&lt;li&gt;why optimal layouts tend to be sharply separated regions of different material,&lt;/li&gt;
&lt;li&gt;and how to systematically refine the design using PDE solutions and their adjoints.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal sources for elliptic PDEs. arXiv preprint arXiv:2509.01521.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025optimal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal sources for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2509.01521}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;[2] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal coefficients for elliptic PDEs. arXiv preprint arXiv:2512.08431.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025coefficients&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal coefficients for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2512.08431}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;</description></item><item><title>Paper Reading - Optimal sources for elliptic PDEs (2509.01521)</title><link>https://blog.namln.org/en/posts/pr-2509.01521/</link><pubDate>Wed, 18 Feb 2026 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/pr-2509.01521/</guid><description>&lt;h2 class="heading" id="introduction"&gt;
 Introduction&lt;span class="heading__anchor"&gt; &lt;a href="#introduction"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The authors study how to &amp;ldquo;best choose&amp;rdquo; a source term $f$ in a Poisson-type equation
$$
-\Delta u = f \quad\quad\text{in }\Omega,\quad u = 0\text{ on }\partial\Omega,
$$
so that a given performance measure (a cost functional) is optimized. The twist is that the source itself is the control, and it can be subject to various constraints (size, bounds, sign, etc.). This makes the problem sit at the intersection of optimal control, shape optimization, and regularity theory.&lt;/p&gt;
&lt;h2 class="heading" id="the-basic-optimization-setup"&gt;
 The basic optimization setup&lt;span class="heading__anchor"&gt; &lt;a href="#the-basic-optimization-setup"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;First, we fix a bounded domain $\Omega \subset \mathbb{R}^d$ and, for each admissible source $f$, we solve the PDE to get the state $u_f$. Then we evaluate a cost
functional defined as follows:
$$
J(f) = \int_\Omega j(x, u_f(x), f(x))\,dx,
$$
and we want to minimize $J$ over all admissible $f$.&lt;/p&gt;
&lt;p&gt;The admissible class is defined via an integral constraint:
$$
\int_\Omega \psi(f)\,dx \le m,
$$
for some convex function $\psi$. Different choices of $\psi$ encode different types of constraints:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Super-linear $\psi$ (growing faster than $|s|$) keeps $f$ in $L^1$ and “penalizes” large values strongly.&lt;/li&gt;
&lt;li&gt;Linearly growing $\psi$ allows $f$ to be a measure (e.g., sums of Dirac masses), not just a function.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The first main result: under mild assumptions on $j$ and $\psi$, the problem always has at least one optimal source $f_{\text{opt}}$ (either as a function or a finite measure, depending on growth).&lt;/p&gt;
&lt;h2 class="heading" id="when-optimal-sources-are-all-or-nothing-bangbang-phenomenon"&gt;
 When optimal sources are “all or nothing” (bang–bang phenomenon)&lt;span class="heading__anchor"&gt; &lt;a href="#when-optimal-sources-are-all-or-nothing-bangbang-phenomenon"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;A central theme is the &lt;strong&gt;bang–bang phenomenon&lt;/strong&gt;: under many natural constraints, the best source uses only its extreme admissible values, $f = \alpha$ or $f = \beta$, with no intermediate levels.&lt;/p&gt;
&lt;p&gt;This occurs, for instance, when we impose point-wise bounds:
$$
\alpha \le f \le \beta
$$
and choose a suitable $\psi$ that is affine on $[\alpha,\beta]$. Then the optimal source takes the form:
$$
f _{\text{opt}} = \beta\,\mathbf{1} _E + \alpha\,\mathbf{1} _{\Omega\setminus E}
$$
for some measurable set $E\subset \Omega$. At that point the problem becomes a &lt;strong&gt;shape optimization&lt;/strong&gt; problem in the unknown set $E$.&lt;/p&gt;
&lt;p&gt;The authors derive a precise system of necessary optimality conditions using a Lagrange multiplier $\lambda$ and an adjoint state $w$ (solution of another elliptic problem). Roughly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$w$ is built from derivatives of the integrand $j$ with respect to $u$ and $f$.&lt;/li&gt;
&lt;li&gt;The sign of $w+\lambda$ decides whether $f_{\text{opt}}$ equals $\alpha$ or $\beta$ at each point.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They show when these conditions are also sufficient, so we can fully characterize optimal controls in convex cases.&lt;/p&gt;
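&lt;p&gt;As an illustration of how such conditions arise, consider a simplified case that we assume here for exposition (it is not the paper’s general setting): $j=j(x,u)$ does not depend on $f$, the bounds are $\alpha\le f\le\beta$, and the constraint is $\int_\Omega f\,dx\le m$. Let $w$ solve $-\Delta w=\partial_u j(x,u_f)$ in $\Omega$ with $w=0$ on $\partial\Omega$. For a perturbation $g$ of the source, the state changes by $u_g$ (the solution with right-hand side $g$), and two integrations by parts give
$$
J'(f)[g]=\int_\Omega \partial_u j(x,u_f)\,u_g\,dx=\int_\Omega (-\Delta w)\,u_g\,dx=\int_\Omega w\,(-\Delta u_g)\,dx=\int_\Omega w\,g\,dx.
$$
With a multiplier $\lambda\ge 0$ for the integral constraint, minimizing $\int_\Omega (w+\lambda)\,f\,dx$ pointwise over $[\alpha,\beta]$ yields
$$
f_{\text{opt}}=\alpha \ \text{ where } w+\lambda&amp;gt;0,\qquad f_{\text{opt}}=\beta \ \text{ where } w+\lambda&amp;lt;0.
$$
The sign convention for $w$ and $\lambda$ used here is our own choice and may differ from the paper’s.&lt;/p&gt;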
&lt;p&gt;A key structural insight: bang–bang behavior appears if and only if $\psi$ is &lt;strong&gt;not strictly convex&lt;/strong&gt; on some interval (it is affine on a nontrivial segment). If $\psi$ is strictly convex (e.g., $\psi(s)=s^2$), the optimal source is more regular and not bang–bang.&lt;/p&gt;
&lt;h2 class="heading" id="important-model-examples"&gt;
 Important model examples&lt;span class="heading__anchor"&gt; &lt;a href="#important-model-examples"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;The paper discusses several instructive choices of $\psi$ and $j$, each corresponding to a classical PDE optimization problem:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Total variation constraint&lt;/strong&gt;: $\psi(s)=|s|$.
&lt;ul&gt;
&lt;li&gt;The admissible sources are bounded measures with total variation at most $m$.&lt;/li&gt;
&lt;li&gt;Optimality conditions show that $f_{\text{opt}}$ is supported where an adjoint field $w$ saturates a threshold.&lt;/li&gt;
&lt;li&gt;In radially symmetric cases (e.g., $\Omega$ a ball, linear cost), the optimal source is a Dirac delta at the center.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Nonnegative sources with mass constraint&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;$\psi(s)=s$ for $s\ge0$, $\psi(s)=+\infty$ otherwise.&lt;/li&gt;
&lt;li&gt;One finds conditions under which the optimal $f$ is a single Dirac mass carrying all the “budget”.&lt;/li&gt;
&lt;li&gt;For certain power-type functionals $\int |u|^p$, existence and structure of maximizers are detailed.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Box-constrained sources&lt;/strong&gt; $\alpha \le f \le \beta$ with a volume (mass) constraint $\int f \le m$:
&lt;ul&gt;
&lt;li&gt;The authors show precisely when the optimal $f$ is constant (always $\alpha$ or always $\beta$) and when it becomes a genuine bang–bang mixture of both extremes.&lt;/li&gt;
&lt;li&gt;Strict monotonicity of $j$ in $u$ tends to force true &lt;em&gt;bang–bang solutions&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tracking a target state&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Cost $J(f)=\int_\Omega |u_f - u_0|^2 dx$ with $\alpha \le f \le \beta$.&lt;/li&gt;
&lt;li&gt;Under mild assumptions on the target $u_0$, the unique optimal control is bang–bang almost everywhere, again determined by the sign of an adjoint field.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Strictly convex $\psi$&lt;/strong&gt;, like $\psi(s)=s^2$:
&lt;ul&gt;
&lt;li&gt;Then the optimal control is not bang–bang but a continuous function explicitly related to $w$ and the mass constraint.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compliance optimization&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Minimize $\int_\Omega f u_f\,dx$ under $\alpha \le f \le \beta$ and $\int f \ge m$.&lt;/li&gt;
&lt;li&gt;This is equivalent to maximizing the elastic energy of the system with bounded loads.&lt;/li&gt;
&lt;li&gt;For $0\le \alpha &amp;lt; \beta$, the optimal right-hand side is bang–bang; the domain splits into two regions where the load is either $\alpha$ or $\beta$.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
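&lt;p&gt;To illustrate the strictly convex case (example 5), here is a short computation under assumptions of ours: $\psi(s)=s^2$, so the constraint is $\int_\Omega f^2\,dx\le m$, and the cost is differentiable with $J'(f)[g]=\int_\Omega w\,g\,dx$ for an adjoint state $w\ne 0$. Then $f_{\text{opt}}$ must minimize $g\mapsto\int_\Omega w\,g\,dx$ over the ball $\|g\|_{L^2(\Omega)}\le\sqrt{m}$, and the Cauchy–Schwarz inequality gives
$$
f_{\text{opt}}=-\sqrt{m}\,\frac{w}{\|w\|_{L^2(\Omega)}}.
$$
Since $w$ itself depends on $f_{\text{opt}}$ through the state, this is a fixed-point relation rather than a closed formula, but it already shows that $f_{\text{opt}}$ inherits the regularity of the elliptic solution $w$ instead of jumping between two values.&lt;/p&gt;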
&lt;h2 class="heading" id="regularity-of-the-optimal-sets-and-interfaces"&gt;
 Regularity of the optimal sets and interfaces&lt;span class="heading__anchor"&gt; &lt;a href="#regularity-of-the-optimal-sets-and-interfaces"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Once we know the optimal control is bang–bang, the main qualitative object is the &lt;strong&gt;interface&lt;/strong&gt; between the regions where $f=\alpha$ and $f=\beta$.&lt;/p&gt;
&lt;p&gt;The interface is essentially a level set of an elliptic solution $u$ (or of the adjoint $w$), so understanding its geometry is a regularity problem.&lt;/p&gt;
&lt;h3 class="heading" id="bounded-variation-bv-regularity"&gt;
 Bounded variation (BV) regularity&lt;span class="heading__anchor"&gt; &lt;a href="#bounded-variation-bv-regularity"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;In a first model case (compliance with $0\le \alpha &amp;lt; \beta$), the authors show that the optimal source $f_{\text{opt}}$ belongs to the space $BV(\Omega)$. This means the interface set has &lt;strong&gt;finite perimeter&lt;/strong&gt;: geometrically, the boundary between phases has finite $(d-1)$-dimensional measure.&lt;/p&gt;
&lt;p&gt;More generally, they derive estimates that control the curvature-like quantities of $u$ via the $BV$-norm of $f$.&lt;/p&gt;
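&lt;p&gt;For reference, the link between $BV$ and perimeter used above is standard: the total variation of $f\in L^1(\Omega)$ is
$$
|Df|(\Omega)=\sup\left\{\int_\Omega f\,\operatorname{div}\varphi\,dx \;:\; \varphi\in C_c^1(\Omega;\mathbb{R}^d),\ \|\varphi\|_{L^\infty}\le 1\right\},
$$
and $f\in BV(\Omega)$ means $|Df|(\Omega)&amp;lt;\infty$. For a bang–bang source $f=\beta\,\mathbf{1}_E+\alpha\,\mathbf{1}_{\Omega\setminus E}$ we have $f=\alpha+(\beta-\alpha)\,\mathbf{1}_E$, hence
$$
|Df|(\Omega)=(\beta-\alpha)\,\operatorname{Per}(E;\Omega),
$$
so $f\in BV(\Omega)$ exactly when $E$ has finite perimeter in $\Omega$.&lt;/p&gt;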
&lt;h3 class="heading" id="a-refined-view-near-critical-points"&gt;
 A refined view near critical points&lt;span class="heading__anchor"&gt; &lt;a href="#a-refined-view-near-critical-points"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;A tougher issue is what happens on the set where $\nabla u=0$, because level sets can get very wild there. The authors prove:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For data $f \in BV(\Omega)$ satisfying a uniform positivity $f \ge \alpha&amp;gt;0$, certain weighted quantities like&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;$$
\int \frac{1}{|\nabla u|}\,\frac{1}{\log^q(1/|\nabla u|)}\,dx
$$&lt;/p&gt;
&lt;p&gt;stay finite for any $q&amp;gt;1$.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;They then construct weights involving $\log(1/|\nabla u|)$ which &amp;ldquo;switch off&amp;rdquo; exactly where $\nabla u=0$, and show that appropriately weighted indicators of level sets belong to $BV$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In particular, they define a refined Hausdorff-type measure $H_{d-1,q}$ with logarithmic weights and prove that, for sufficiently regular $f$, the set $\{\nabla u=0\}$ has zero $H_{d-1,q}$-measure for all $q&amp;gt;1$. This implies that the critical set has Hausdorff dimension at most $d-1$, with an even stronger “thinness” encoded by the log weights.&lt;/p&gt;
&lt;h3 class="heading" id="convex-domains-convex-and-smooth-optimal-regions"&gt;
 Convex domains: convex and smooth optimal regions&lt;span class="heading__anchor"&gt; &lt;a href="#convex-domains-convex-and-smooth-optimal-regions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;In the compliance case on a &lt;strong&gt;convex&lt;/strong&gt; domain $\Omega$, the structure is even nicer. The optimal set $E=\{x : f_{\text{opt}}(x)=\beta\}$ coincides with a sublevel set of a solution to a semi-linear equation.&lt;/p&gt;
&lt;p&gt;Using a Caffarelli–Spruck-type convexity result for level sets, they show:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$E$ is itself &lt;strong&gt;convex&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;One can rule out “corners”, and deduce that the boundary of $E$ is actually of class $C^1$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So in convex domains, the optimal high-load region is a smooth convex set.&lt;/p&gt;
&lt;h2 class="heading" id="summary"&gt;
 Summary&lt;span class="heading__anchor"&gt; &lt;a href="#summary"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;This work gives a unified and quite complete picture of how optimal sources for elliptic PDEs behave under natural constraints:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It establishes existence of optimal controls for broad classes of convex functionals and constraints.&lt;/li&gt;
&lt;li&gt;It identifies exactly when we get bang–bang sources, turning a PDE control problem into a shape optimization problem.&lt;/li&gt;
&lt;li&gt;It provides sharp optimality conditions through adjoint states and sub-differential characterizations, allowing practical characterization and numerical approximation of optimal controls.&lt;/li&gt;
&lt;li&gt;It develops regularity theory for the resulting optimal sets and interfaces, including BV estimates, structure of level sets, and refined control of critical sets.&lt;/li&gt;
&lt;li&gt;For people working in optimal design, structural mechanics, or inverse problems, the message is: if our cost is convex and our constraint has a &amp;ldquo;flat&amp;rdquo; part (non-strictly convex $\psi$), expect extreme, piecewise-constant sources with reasonably regular interfaces that we can analyze geometrically and approximate numerically.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Buttazzo, G., Casado-Díaz, J., &amp;amp; Maestre, F. (2025). Optimal sources for elliptic PDEs. arXiv preprint arXiv:2509.01521.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bibtex" data-lang="bibtex"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nc"&gt;@article&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;buttazzo2025optimal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Optimal sources for elliptic PDEs}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{Buttazzo, Giuseppe and Casado-D{\&amp;#39;\i}az, Juan and Maestre, Faustino}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;journal&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{arXiv preprint arXiv:2509.01521}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="na"&gt;year&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;{2025}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;</description></item><item><title>Restriction and extension</title><link>https://blog.namln.org/en/posts/restriction/</link><pubDate>Wed, 29 Oct 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/restriction/</guid><description>&lt;p&gt;Consider a smooth compact hyper-surface $S$ in $\mathbb{R}^d$ with surface measure $d\sigma$. Given $f \in L^1(\mathbb{R}^d)$, its Fourier transform is defined as follows:
$$
\begin{equation}
\hat{f}(\xi) = \int_{\mathbb{R}^d}e^{-2\pi i x \cdot \xi}f(x)\,dx
\end{equation}
$$
which by Riemann-Lebesgue is a bounded, continuous function vanishing at infinity.&lt;/p&gt;
&lt;p&gt;Since $\hat{f}$ is continuous on $\mathbb{R}^d$ by &lt;a href="https://en.wikipedia.org/wiki/Riemann%E2%80%93Lebesgue_lemma"&gt;the Riemann-Lebesgue lemma&lt;/a&gt;, its restriction to the compact hyper-surface $S \subset \mathbb{R}^d$ is well-defined pointwise. Specifically, the restriction $\hat{f}\mid_{S}: S \rightarrow \mathbb{C}$ is the continuous function given by
$$
\begin{equation}
\hat{f}\mid_{S}(\sigma) = \hat{f}(\sigma) = \int_{\mathbb{R}^d}e^{-2\pi i x \cdot \sigma}f(x)\,dx
\end{equation}
$$
for each $\sigma \in S$. This is bounded (as $\hat{f}$ is bounded) and can be integrated against the surface measure $d\sigma$ on $S$.&lt;/p&gt;
&lt;p&gt;Thus when we restrict $\hat{f}$ to $S$, we get a meaningful function, which has finite $L^q(S, d\sigma)$-norm for every $1 \le q \le \infty$, since $\hat{f}$ is bounded and $S$ has finite surface measure.&lt;/p&gt;
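&lt;p&gt;Quantitatively, this is a one-line estimate. Writing $\sigma(S)$ for the total surface measure of $S$ (finite, since $S$ is compact), for $1\le q&amp;lt;\infty$ we have
$$
\|\hat{f}\|_{L^q(S,d\sigma)}\le \sigma(S)^{1/q}\,\sup_{\xi\in S}|\hat{f}(\xi)|\le \sigma(S)^{1/q}\,\|f\|_{L^1(\mathbb{R}^d)},
$$
because $|\hat{f}(\xi)|\le\int_{\mathbb{R}^d}|f(x)|\,dx$ for every $\xi$. So the restriction estimate holds trivially for $p=1$ and every $q$.&lt;/p&gt;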
&lt;p&gt;When starting with $f \in L^2(\mathbb{R}^d)$, &lt;strong&gt;the Fourier transform $\hat{f}$ is not well-defined point-wise in general&lt;/strong&gt;, so there is no
meaningful way to restrict an arbitrary $L^2$ function to a set of measure zero such as the hyper-surface $S$.&lt;/p&gt;
&lt;p&gt;More precisely, for any given $f \in L^2(\mathbb{R}^d)$, the Fourier transform is defined in the $L^2$ sense via the &lt;a href="https://en.wikipedia.org/wiki/Plancherel_theorem"&gt;Plancherel theorem&lt;/a&gt;:
$$
\begin{equation}
\mathcal{F}: L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d), \quad \| \hat{f} \| _{L^2} = \| f \| _{L^2}
\end{equation}
$$
It is an isometry. So:
$$
\begin{equation}
\hat{f} \in L^2(\mathbb{R}^d)
\end{equation}
$$
Thus $\hat{f}$ is only an $L^2$ function: it is &lt;strong&gt;not necessarily continuous&lt;/strong&gt;, &lt;strong&gt;not necessarily bounded&lt;/strong&gt;, and &lt;strong&gt;defined only up to sets of measure zero&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So the expression:
$$
\begin{equation}
\hat{f}|_S(\sigma) = \hat{f}(\sigma), \quad \sigma \in S
\end{equation}
$$
does not make sense pointwise for arbitrary $f \in L^2$.&lt;/p&gt;
&lt;p&gt;The question arises: what happens for $1 &amp;lt; p &amp;lt; 2$?&lt;/p&gt;
&lt;div style="padding: 6px; border: dodgerblue 2px solid;"&gt;&lt;span style="color:dodgerblue"&gt;&lt;b&gt; Question 1: &lt;/b&gt;&lt;/span&gt; 
&lt;p&gt;For which $p$ and $q$ do we have:
$$
\begin{equation}
||\hat{f}|| _{L^q(S, d\sigma)} \lesssim ||f|| _{L^p(\mathbb{R}^d)}, \quad \forall f.
\end{equation}
$$&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;This is the restriction problem for Fourier transforms on hyper-surfaces in harmonic analysis.&lt;/p&gt;</description></item><item><title>Proof of Theorem of solution of wave equation in the case $n = 1$</title><link>https://blog.namln.org/en/posts/solution-wave-equation-n1/</link><pubDate>Thu, 31 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/solution-wave-equation-n1/</guid><description>&lt;embed src= "/files/pde/Solution%20of%20wave%20equation%20n%20=%201.pdf" width= "100%" height= "1000px" type="application/pdf" &gt;</description></item><item><title>Solution of Brezis Problem 8.24 (1) and (2)</title><link>https://blog.namln.org/en/posts/problem-8.24-brezis/</link><pubDate>Thu, 31 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/problem-8.24-brezis/</guid><description>&lt;embed src= "/files/pde/Problem%208.24%20Brezis.pdf" width= "100%" height= "1000px" type="application/pdf" &gt;</description></item><item><title>Solution of Evans PDE Problem 13</title><link>https://blog.namln.org/en/posts/problem-13-evans/</link><pubDate>Thu, 31 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/problem-13-evans/</guid><description>&lt;embed src= "/files/pde/Problem%2013%20Evans.pdf" width= "100%" height= "1000px" type="application/pdf" &gt;</description></item><item><title>Bin Packing Problem (BPP)</title><link>https://blog.namln.org/en/research/ml-co/problems/bin-packing/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/bin-packing/</guid><description>&lt;h1 class="heading" id="bin-packing-problem-bpp"&gt;
 Bin Packing Problem (BPP)&lt;span class="heading__anchor"&gt; &lt;a href="#bin-packing-problem-bpp"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Bin Packing Problem asks how to pack a set of items into bins using as few bins as possible (or at minimum cost). It has many applications in logistics, manufacturing, and resource allocation.&lt;/p&gt;
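&lt;p&gt;For concreteness, the classical one-dimensional version can be written as an integer program. The notation below is ours and is only one of several standard formulations: items $i=1,\dots,n$ have sizes $s_i\le C$, each bin has capacity $C$, $y_j\in\{0,1\}$ indicates whether bin $j$ is used, and $x_{ij}\in\{0,1\}$ indicates whether item $i$ is placed in bin $j$.
$$
\min \sum_{j=1}^{n} y_j \quad\text{subject to}\quad \sum_{j=1}^{n} x_{ij}=1 \ \ (i=1,\dots,n),\qquad \sum_{i=1}^{n} s_i\,x_{ij}\le C\,y_j \ \ (j=1,\dots,n).
$$
The first constraint places every item in exactly one bin; the second enforces the capacity of each open bin. At most $n$ bins are ever needed, which is why $j$ ranges up to $n$.&lt;/p&gt;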
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Small Boxes Big Data: A Deep Learning Approach to Optimize Variable Sized Bin Packing&lt;/strong&gt; BigDataService, 2017. &lt;a href="https://ieeexplore.ieee.org/abstract/document/7944923/?casa_token=mRzI_XBy3ycAAAAA:yD9Le2KBNq1TMpW_1etb0RF-oFVcLJj9Up0Z2qI6XJmA-UffxxSZRIx7RklaQED-yXwuwBC4M_w"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Mao, Feng and Blanco, Edgar and Fu, Mingang and Jain, Rohit and Gupta, Anurag and Mancel, Sebastien and Yuan, Rong and Guo, Stephen and Kumar, Sai and Tian, Yayang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Solving a New 3D Bin Packing Problem with Deep Reinforcement Learning Method&lt;/strong&gt; Arxiv, 2017. &lt;a href="https://arxiv.org/abs/1708.05930"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hu, Haoyuan and Zhang, Xiaodong and Yan, Xiaowei and Wang, Longfei and Xu, Yinghui&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Best Arm Identification in Multi-armed Bandits with Delayed Feedback&lt;/strong&gt; PMLR, 2018. &lt;a href="http://proceedings.mlr.press/v84/grover18b.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Grover, Aditya and Markov, Todor and Attia, Peter and Jin, Norman and Perkins, Nicolas and Cheong, Bryan and Chen, Michael and Yang, Zi and Harris, Stephen and Chueh, William and others&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization&lt;/strong&gt; Arxiv, 2018. &lt;a href="https://arxiv.org/abs/1807.01672"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Laterre, Alexandre and Fu, Yunguan and Jabri, Mohamed Khalil and Cohen, Alain-Sam and Kas, David and Hajjar, Karl and Dahl, Torbjorn S and Kerkeni, Amine and Beguir, Karim&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Multi-task Selected Learning Approach for Solving 3D Bin Packing Problem.&lt;/strong&gt; AAMAS, 2019. &lt;a href="https://arxiv.org/abs/1804.06896"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Duan, Lu and Hu, Haoyuan and Qian, Yu and Gong, Yu and Zhang, Xiaodong and Xu, Yinghui and Wei, Jiangwen.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Data-Driven Approach for Multi-level Packing Problems in Manufacturing Industry&lt;/strong&gt; KDD, 2019. &lt;a href="https://dl.acm.org/doi/abs/10.1145/3292500.3330708"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Lei and Tong, Xialiang and Yuan, Mingxuan and Zeng, Jia and Chen, Lei&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Solving Packing Problems by Conditional Query Learning&lt;/strong&gt; OpenReview, 2019. &lt;a href="https://openreview.net/forum?id=BkgTwRNtPB"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Li, Dongda and Ren, Changwei and Gu, Zhaoquan and Wang, Yuexuan and Lau, Francis&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;RePack: Dense Object Packing Using Deep CNN with Reinforcement Learning&lt;/strong&gt; CACS, 2019. &lt;a href="https://ieeexplore.ieee.org/abstract/document/9024360/?casa_token=ScXezdDDiwMAAAAA:fglP_vbiQUJgLZcM7YZyqnDh_qA8jOjIh-zbH7ru0XSVBghh8OAxpThOU3BqhBeet4NlSrdHPcU"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chu, Yu-Cheng and Lin, Horng-Horng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement learning driven heuristic optimization&lt;/strong&gt; Arxiv, 2019. &lt;a href="https://arxiv.org/pdf/1906.06639.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cai, Qingpeng and Hang, Will and Mirhoseini, Azalia and Tucker, George and Wang, Jingtao and Wei, Wei&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing.&lt;/strong&gt; AAAI Workshop, 2020. &lt;a href="https://arxiv.org/abs/2007.00463"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Verma, Richa and Singhal, Aniruddha and Khadilkar, Harshad and Basumatary, Ansuma and Nayak, Siddharth and Singh, Harsh Vardhan and Kumar, Swagat and Sinha, Rajesh.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Robot Packing with Known Items and Nondeterministic Arrival Order.&lt;/strong&gt; TASAE, 2020. &lt;a href="https://ieeexplore.ieee.org/abstract/document/9205914/"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Fan and Hauser, Kris.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;TAP-Net: Transport-and-Pack using Reinforcement Learning.&lt;/strong&gt; TOG, 2020. &lt;a href="https://dl.acm.org/doi/abs/10.1145/3414685.3417796"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Juzhan/TAP-Net"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hu, Ruizhen and Xu, Juzhan and Chen, Bin and Gong, Minglun and Zhang, Hao and Huang, Hui.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Simultaneous Planning for Item Picking and Placing by Deep Reinforcement Learning&lt;/strong&gt; IROS, 2020. &lt;a href="http://ras.papercept.net/images/temp/IROS/files/0330.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Tanaka, Tatsuya and Kaneko, Toshimitsu and Sekine, Masahiro and Tangkaratt, Voot and Sugiyama, Masashi&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Monte Carlo Tree Search on Perfect Rectangle Packing Problem Instances&lt;/strong&gt; GECCO, 2020. &lt;a href="https://dl.acm.org/doi/abs/10.1145/3377929.3398115"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Pejic, Igor and van den Berg, Daan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;PackIt: A Virtual Environment for Geometric Planning&lt;/strong&gt; ICML, 2020. &lt;a href="http://proceedings.mlr.press/v119/goyal20b.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/princeton-vl/PackIt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Goyal, Ankit and Deng, Jia&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Online 3D Bin Packing with Constrained Deep Reinforcement Learning.&lt;/strong&gt; AAAI, 2021. &lt;a href="https://arxiv.org/abs/2006.14978"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/alexfrom0815/Online-3D-BPP-DRL"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhao, Hang and She, Qijin and Zhu, Chenyang and Yang, Yin and Xu, Kai.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Practically Feasible Policies for Online 3D Bin Packing&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2108.13680"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hang Zhao and Chenyang Zhu and Xin Xu and Hui Huang and Kai Xu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Attend2Pack: Bin Packing through Deep Reinforcement Learning with Attention&lt;/strong&gt; ICML Workshop, 2021. &lt;a href="https://arxiv.org/abs/2107.04333"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jingwei Zhang and Bin Zi and Xiaoyu Ge&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Solving 3D bin packing problem via multimodal deep reinforcement learning&lt;/strong&gt; AAMAS, 2021. &lt;a href="https://www.ifaamas.org/Proceedings/aamas2021/pdfs/p1548.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiang, Yuan, Zhiguang Cao, and Jie Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Solve 3-D Bin Packing Problem via Deep Reinforcement Learning and Constraint Programming&lt;/strong&gt; IEEE Transactions on Cybernetics, 2021. &lt;a href="https://ieeexplore.ieee.org/document/9606618/"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiang, Yuan and Cao, Zhiguang and Zhang, Jie&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Pack: A Data-Driven Tree Search Algorithm for Large-Scale 3D Bin Packing Problem&lt;/strong&gt; CIKM, 2021. &lt;a href="https://dl.acm.org/doi/abs/10.1145/3459637.3481933"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhu, Qianwen and Li, Xihan and Zhang, Zihan and Luo, Zhixing and Tong, Xialiang and Yuan, Mingxuan and Zeng, Jia&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Efficient Online 3D Bin Packing on Packing Configuration Trees.&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=bfuGjlCwAq"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hang Zhao and Kai Xu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/24252"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kim, Wonyoung and Iyengar, Garud and Zeevi, Assaf&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adjustable Robust Reinforcement Learning for Online 3D Bin Packing&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=1mdTYi1jAW"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Pan, Yuxin and Chen, Yize and Lin, Fangzhen&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Neural Column Generation Approach to the Vehicle Routing Problem with Two-Dimensional Loading and Last-In-First-Out Constraints&lt;/strong&gt; IJCAI, 2024. &lt;a href="https://www.ijcai.org/proceedings/2024/0218.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/xyfffff/NCG-for-2L-CVRP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yifan Xia, Xiangyi Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Boolean Satisfiability (SAT)</title><link>https://blog.namln.org/en/research/ml-co/problems/boolean-satisfiability/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/boolean-satisfiability/</guid><description>&lt;h1 class="heading" id="boolean-satisfiability-sat"&gt;
 Boolean Satisfiability (SAT)&lt;span class="heading__anchor"&gt; &lt;a href="#boolean-satisfiability-sat"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Boolean Satisfiability is a fundamental problem in computer science with applications to formal verification and automated reasoning. Machine learning approaches are increasingly being applied to improve SAT solver heuristics.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph neural networks and boolean satisfiability.&lt;/strong&gt; Arxiv, 2017. &lt;a href="https://arxiv.org/pdf/1702.03592"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bünz, Benedikt, and Matthew Lamm.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning a SAT solver from single-bit supervision.&lt;/strong&gt; Arxiv, 2018. &lt;a href="https://arxiv.org/pdf/1802.03685"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/dselsam/neurosat"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Selsam, Daniel, Matthew Lamm, Benedikt Bünz, Percy Liang, Leonardo de Moura, and David L. Dill.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Machine learning-based restart policy for CDCL SAT solvers.&lt;/strong&gt; SAT, 2018. &lt;a href="http://www.t-news.cn/Floc2018/FLoC2018-pages/proceedings_paper_477.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liang, Jia Hui, Chanseok Oh, Minu Mathew, Ciza Thomas, Chunxiao Li, and Vijay Ganesh.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to solve circuit-SAT: An unsupervised differentiable approach.&lt;/strong&gt; ICLR, 2019. &lt;a href="https://openreview.net/pdf?id=BJxgz2R9t7"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/johannaSommer/generalization-neural-co-solvers"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Amizadeh, Saeed, Sergiy Matusevych, and Markus Weimer.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Local Search Heuristics for Boolean Satisfiability.&lt;/strong&gt; NeurIPS, 2019. &lt;a href="https://www.cs.cmu.edu/~eyolcu/papers/learning-local-search-heuristics-sat.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/emreyolcu/sat"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yolcu, Emre and Poczos, Barnabas&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Improving SAT solver heuristics with graph networks and reinforcement learning.&lt;/strong&gt; Arxiv, 2019. &lt;a href="https://arxiv.org/pdf/1909.11830"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kurin, Vitaly, Saad Godil, Shimon Whiteson, and Bryan Catanzaro.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph neural reasoning may fail in certifying boolean unsatisfiability.&lt;/strong&gt; Arxiv, 2019. &lt;a href="https://arxiv.org/pdf/1909.11588"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Ziliang, and Zhanfu Yang.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Guiding high-performance SAT solvers with unsat-core predictions.&lt;/strong&gt; SAT, 2019. &lt;a href="https://arxiv.org/pdf/1903.04671"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Selsam, Daniel, and Nikolaj Bjørner.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;G2SAT: Learning to Generate SAT Formulas.&lt;/strong&gt; NeurIPS, 2019. &lt;a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7138247/"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/JiaxuanYou/G2SAT"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;You, Jiaxuan, Haoze Wu, Clark Barrett, Raghuram Ramanujan, and Jure Leskovec.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Heuristics for Quantified Boolean Formulas through Reinforcement Learning.&lt;/strong&gt; Arxiv, 2019. &lt;a href="https://arxiv.org/pdf/1807.08058"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/lederg/learningqbf"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lederman, Gil, Markus N. Rabe, Edward A. Lee, and Sanjit A. Seshia.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Enhancing SAT solvers with glue variable predictions.&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/pdf/2007.02559"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Han, Jesse Michael.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver?&lt;/strong&gt; NeurIPS, 2020. &lt;a href="http://www.cs.ox.ac.uk/people/shimon.whiteson/pubs/kurinnips20.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kurin, Vitaly, Saad Godil, Shimon Whiteson, and Bryan Catanzaro.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Online Bayesian Moment Matching based SAT Solver Heuristics.&lt;/strong&gt; ICML, 2020. &lt;a href="http://proceedings.mlr.press/v119/duan20c/duan20c.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/saeednj/BMMSAT"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Duan, Haonan, Saeed Nejati, George Trimponias, Pascal Poupart, and Vijay Ganesh.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Clause Deletion Heuristics with Reinforcement Learning.&lt;/strong&gt; AITP, 2020. &lt;a href="http://aitp-conference.org/2020/abstract/paper_25.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Vaezipoor, Pashootan, Gil Lederman, Yuhuai Wu, Roger Grosse, and Fahiem Bacchus.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Classification of SAT problem instances by machine learning methods.&lt;/strong&gt; CEUR, 2020. &lt;a href="http://ceur-ws.org/Vol-2650/paper11.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Danisovszky, Márk, Zijian Győző Yang, and Gábor Kusper.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Predicting Propositional Satisfiability via End-to-End Learning.&lt;/strong&gt; AAAI, 2020. &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/download/5733/5589"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cameron, Chris, Rex Chen, Jason Hartford, and Kevin Leyton-Brown.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural heuristics for SAT solving.&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/pdf/2005.13406"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jaszczur, Sebastian, Michał Łuszczyk, and Henryk Michalewski.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;NLocalSAT: Boosting Local Search with Solution Prediction.&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/pdf/2001.09398"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/myxxxsquared/NLocalSAT"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhang, Wenjie, Zeyu Sun, Qihao Zhu, Ge Li, Shaowei Cai, Yingfei Xiong, and Lu Zhang.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimistic tree search strategies for black-box combinatorial optimization&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=JGLW4DvX11F"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Malherbe, Cedric and Grosnit, Antoine and Tutunov, Rasul and Ammar, Haitham Bou and Wang, Jun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Goal-Aware Neural SAT Solver.&lt;/strong&gt; IJCNN, 2022. &lt;a href="https://ieeexplore.ieee.org/document/9892733"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ozolins, Emils, Karlis Freivalds, Andis Draguns, Eliza Gaile, Ronalds Zakovskis, and Sergejs Kozlovics.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;NeuroComb: Improving SAT Solving with Graph Neural Networks.&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2110.14053"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Wenxi, Yang Hu, Mohit Tiwari, Sarfraz Khurshid, Kenneth McMillan, and Risto Miikkulainen.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Performance of Deep Generative Models of Realistic SAT Instances.&lt;/strong&gt; SAT, 2022. &lt;a href="https://drops.dagstuhl.de/opus/volltexte/2022/16677/pdf/LIPIcs-SAT-2022-3.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Garzón, Iván, Pablo Mesejo, and Jesús Giráldez-Cru.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DeepSAT: An EDA-Driven Learning Framework for SAT.&lt;/strong&gt; Arxiv, 2022. &lt;a href="http://arxiv.org/abs/2205.13745"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Li, Min, Zhengyuan Shi, Qiuxia Lai, Sadaf Khan, and Qiang Xu.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SATformer: Transformers for SAT Solving.&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2209.00953"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Shi, Zhengyuan, Min Li, Sadaf Khan, Hui-Ling Zhen, Mingxuan Yuan, and Qiang Xu.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Augment with Care: Contrastive Learning for Combinatorial Problems.&lt;/strong&gt; ICML, 2022. &lt;a href="https://proceedings.mlr.press/v162/duan22b.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/h4duan/contrastive-sat"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Duan, Haonan, Pashootan Vaezipoor, Max B. Paulus, Yangjun Ruan and Chris J. Maddison&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;NSNet: A General Neural Probabilistic Framework for Satisfiability Problems&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://arxiv.org/abs/2211.03880"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhaoyu Li, Xujie Si&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://arxiv.org/abs/2208.04055"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nikolaos Karalias, Joshua Robinson, Andreas Loukas, Stefanie Jegelka&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=vJZ7dPIjip3"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Simon Geisler, Johanna Sommer, Jan Schuchardt, Aleksandar Bojchevski and Stephan Günnemann&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://arxiv.org/abs/2305.17010"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/zdhNarsil/GFlowNet-CombOpt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐HardSATGEN: Understanding the Difficulty of Hard SAT Formula Generation and A Strong Structure-Hardness-Aware Baseline&lt;/strong&gt; KDD, 2023. &lt;a href="https://dl.acm.org/doi/10.1145/3580305.3599837"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/HardSATGEN"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yang Li, Xinyan Chen, Wenxuan Guo, Xijun Li, Wanqian Luo, Junhua Huang, Hui-Ling Zhen, Mingxuan Yuan, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distributed Constrained Combinatorial Optimization leveraging Hypergraph Neural Networks&lt;/strong&gt; Nature Machine Intelligence, 2024. &lt;a href="https://arxiv.org/abs/2311.09375"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/nasheydari/HypOp"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nasimeh Heydaribeni, Xinrui Zhan, Ruisi Zhang, Tina Eliassi-Rad, Farinaz Koushanfar&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficient Combinatorial Optimization via Heat Diffusion&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=psDrko9v1D"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hengyuan Ma, Wenlian Lu, Jianfeng Feng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=yEwakMNIex"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/UniCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wenzheng Pan, Hao Xiong, Jiale Ma, Wentao Zhao, Yang Li, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Car Dispatch</title><link>https://blog.namln.org/en/research/ml-co/problems/car-dispatch/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/car-dispatch/</guid><description>&lt;h1 class="heading" id="car-dispatch"&gt;
 Car Dispatch&lt;span class="heading__anchor"&gt; &lt;a href="#car-dispatch"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Car dispatch focuses on optimally assigning vehicles to passenger requests, a key problem in autonomous driving and ride-hailing services.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement Learning for Autonomous Taxi Fleet Dispatch&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://arxiv.org/abs/2003.15212"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Philip Thomas, Bruno Castro Da Silva, Kemo Adeyemo, Jacob Tyo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Causal Discovery</title><link>https://blog.namln.org/en/research/ml-co/problems/causal-discovery/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/causal-discovery/</guid><description>&lt;h1 class="heading" id="causal-discovery"&gt;
 Causal Discovery&lt;span class="heading__anchor"&gt; &lt;a href="#causal-discovery"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Causal discovery focuses on learning the causal structure behind observational data, identifying causal relationships between variables.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Scalable and General Framework for Privacy-Preserving Causality-Aware X&lt;/strong&gt; AISTATS, 2024. &lt;a href="https://openreview.net/forum?id=dYPBgLRhMW"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xupeng Cao, Yuming Huang, Zining Zhu, Jing Ma&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scalable Computational Methods for Bayesian Additive Regression Trees&lt;/strong&gt; Journal of Computational and Graphical Statistics, 2021. &lt;a href="https://doi.org/10.1080/10618600.2020.1770054"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Brent R. Linley and Jingyu He and Jesse Windle&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Causal Inference Using Invariant Prediction: Identification and Little&amp;rsquo;s Law of Causal Discovery&lt;/strong&gt; JMLR, 2023. &lt;a href="https://jmlr.org/papers/v85/rotnitzky23a.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andrea Rotnitzky, James M. Robins, Rajeeva Karandikar&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Temporal Causal Graphs for Approximately Stationary Environments&lt;/strong&gt; ICML, 2023. &lt;a href="https://proceedings.mlr.press/v202/marx23a.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/kevinpmarx/stl-causal"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kevin Marx, Jiji Zhang and Kun Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph neural networks for improved electroencephalographic seizure detection&lt;/strong&gt; Nature Communications, 2023. &lt;a href="https://doi.org/10.1038/s41467-023-37199-0"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Akshay Gujral and Eleonora Spinelli and Ibrahim Alachiotis and Cosmin Anitescu and Pieter Collins&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Causal structure learning through deep generative models: Applications to real-world time series in clinical neuroscience&lt;/strong&gt; ICML, 2024. &lt;a href="https://arxiv.org/abs/2406.15268"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kion Fallah, Tim Suereth, Houman Dreyfuss, et al.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph Structure Learning for Temporal Reinforcement Learning&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=ypUK_kCT72S"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, Rémi Munos, Georg Ostrovski&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Causal Graph Learning for Large-scale Heterogeneous Biological Networks&lt;/strong&gt; Nature Machine Intelligence, 2023. &lt;a href="https://doi.org/10.1038/s42256-023-00635-3"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Alexander Statnikov, Constantine F. Aliferis, Ioannis Tsamardinos, Douglas P. Hardin, Melissa Levy&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Constraint-based Causal Discovery with Mixed Data&lt;/strong&gt; Machine Learning, 2023. &lt;a href="https://doi.org/10.1007/s10994-023-06371-2"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiji Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Collected Lectures on Calculus of Variations</title><link>https://blog.namln.org/en/mathematics/analysis/calculus-of-variations/collected-lectures-cv/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/calculus-of-variations/collected-lectures-cv/</guid><description>&lt;h2 class="heading" id="gentle-introductions"&gt;
 Gentle introductions&lt;span class="heading__anchor"&gt; &lt;a href="#gentle-introductions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Blog post &amp;ldquo;&lt;a href="https://bjlkeng.io/posts/the-calculus-of-variations/"&gt;The Calculus of Variations&lt;/a&gt;&amp;rdquo; on Bounded Rationality, with intuitive explanations and worked brachistochrone-style examples.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="classic-introductory-textbooks-pdf"&gt;
 Classic introductory textbooks (PDF)&lt;span class="heading__anchor"&gt; &lt;a href="#classic-introductory-textbooks-pdf"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Gelfand &amp;amp; Fomin – Calculus of Variations (Dover). A standard first text, concise and focused on core theory and mechanics applications.&lt;/li&gt;
&lt;li&gt;Bruce van Brunt – The Calculus of Variations (Springer Universitext). Somewhat more modern, with geometry and physics examples; suitable after multivariable calculus and basic analysis.&lt;/li&gt;
&lt;li&gt;Hunter College notes &amp;ldquo;&lt;a href="https://math.hunter.cuny.edu/mbenders/cofv.pdf"&gt;The Calculus of Variations&lt;/a&gt;&amp;rdquo;: a structured, textbook-like PDF covering lemmas, the Euler–Lagrange equation, the Weierstrass condition, and more.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="lecture-note-sets"&gt;
 Lecture note sets&lt;span class="heading__anchor"&gt; &lt;a href="#lecture-note-sets"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Lukas Koch, &lt;a href="https://personal-homepages.mis.mpg.de/lkoch/LectureNotes.pdf"&gt;&lt;em&gt;Lecture notes for Calculus of Variations&lt;/em&gt;&lt;/a&gt; (Leipzig, 3rd-year course, includes classical theory and direct method, up to modern topics).&lt;/li&gt;
&lt;li&gt;Riccardo Cristoferi, &lt;a href="https://www.math.cmu.edu/cna/Publications/publications2016/papers/16-CNA-007.pdf"&gt;&lt;em&gt;Calculus of Variations Lecture Notes&lt;/em&gt;&lt;/a&gt; (Carnegie Mellon, classical necessary and sufficient conditions, many examples).&lt;/li&gt;
&lt;li&gt;Filip Rindler, &lt;a href="https://warwick.ac.uk/fac/sci/maths/people/staff/filip_rindler/cov_ln.pdf"&gt;&lt;em&gt;Introduction to the Modern Calculus of Variations&lt;/em&gt;&lt;/a&gt; (goes beyond classical theory toward modern functional-analytic treatment).&lt;/li&gt;
&lt;li&gt;Pisa &amp;ldquo;&lt;a href="https://poisson.phc.dm.unipi.it/~fpmaiale/notes/CdV-A.pdf"&gt;&lt;em&gt;Lecture Notes Calculus of Variations A&lt;/em&gt;&lt;/a&gt;&amp;rdquo; (introduction, first variation, Euler–Lagrange, with PDE flavor).&lt;/li&gt;
&lt;li&gt;Long Chen, &lt;a href="https://www.math.uci.edu/~chenlong/290C/2_classicalTheory.pdf"&gt;&lt;em&gt;Classic theory of calculus of variation&lt;/em&gt;&lt;/a&gt; (focused on Euler–Lagrange, Legendre, Jacobi, Weierstrass conditions, weak vs. strong minima).&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Collected Lectures on Complex Analysis</title><link>https://blog.namln.org/en/mathematics/analysis/complex-analysis/collected-lectures-ca/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/complex-analysis/collected-lectures-ca/</guid><description>&lt;ul&gt;
&lt;li&gt;📝 &lt;a href="https://mtaylor.web.unc.edu/wp-content/uploads/sites/16915/2018/04/complex.pdf"&gt;Introduction to Complex Analysis&lt;/a&gt; - Michael Taylor&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.uiuc.edu/~jpda/jpd-complex-geometry-book-5-refs-bip.pdf"&gt;An Introduction to Complex Analysis and Geometry&lt;/a&gt; - John P. D&amp;rsquo;Angelo (University of Illinois)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://math.sfsu.edu/beck/papers/complex.pdf"&gt;A First Course in Complex Analysis&lt;/a&gt; - Matthias Beck, Gerald Marchesi, Dennis Pixton, Lucas Sabalka&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.wustl.edu/~sk/books/guide.pdf"&gt;A Guide to Complex Variables&lt;/a&gt; - Steven G. Krantz&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.maths.manchester.ac.uk/~cwalkden/complex-analysis/complex_analysis.pdf"&gt;Complex Analysis&lt;/a&gt; - Charles Walkden&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.ku.dk/noter/filer/koman-12.pdf"&gt;Complex Analysis&lt;/a&gt; - Christian Berg&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://people.math.sc.edu/girardi/m7034/book/AshComplexVariablesWithHyperlinks.pdf"&gt;Complex Variables&lt;/a&gt; - R. B. Ash, W.P. Novinger&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.maths.lth.se/matematiklu/personal/olofsson/CompHT06.pdf"&gt;Complex Analysis&lt;/a&gt; - Christer Bennewitz&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://web.archive.org/web/20150620124453/https://www.math.washington.edu/~marshall/math_536/Notes.pdf"&gt;Complex Analysis&lt;/a&gt; - Donald E. Marshall&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://gauss.math.yale.edu/~ws442/complex.pdf"&gt;A Concise Course in Complex Analysis and Riemann Surfaces&lt;/a&gt; - Wilhelm Schlag&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://people.math.gatech.edu/%7Ecain/winter99/complex.html"&gt;Complex Analysis&lt;/a&gt; - G. Cain (Georgia Tech)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://complex-analysis.com/"&gt;Complex Analysis&lt;/a&gt; - Juan Carlos Ponce Campuzano&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Collected Lectures on Functional Analysis</title><link>https://blog.namln.org/en/mathematics/analysis/functional-analysis/collected-lectures-fa/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/functional-analysis/collected-lectures-fa/</guid><description>&lt;ul&gt;
&lt;li&gt;📝 &lt;a href="https://www.math.uwaterloo.ca/~lwmarcou/notes/pmath453.pdf"&gt;An Introduction to Functional Analysis&lt;/a&gt; - Laurent W. Marcoux (University of Waterloo)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://users.math.msu.edu/users/jeffrey/920/920notes.pdf"&gt;Functional Analysis: Lecture Notes&lt;/a&gt; - Jeff Schenker (Michigan State University)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://archive.org/details/TB_Ward___Functional_analysis_lecture_notes"&gt;Functional Analysis Lecture Notes&lt;/a&gt; - T.B. Ward (University of East Anglia)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.maths.lancs.ac.uk/~belton/www/notes/fa_notes.pdf"&gt;Functional Analysis&lt;/a&gt; - Alexander C. R. Belton&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://www.mat.univie.ac.at/~gerald/ftp/book-fa/fa.pdf"&gt;Topics in Real and Functional Analysis&lt;/a&gt; - Gerald Teschl&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www2.math.ou.edu/~cremling/teaching/lecturenotes/fa-new/LN-I.pdf"&gt;Functional Analysis&lt;/a&gt; - Christian Remling&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.harvard.edu/~shlomo/docs/Real_Variables.pdf"&gt;Theory of Functions of a Real Variable&lt;/a&gt; - Shlomo Sternberg&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://spot.colorado.edu/~baggett/functional.html"&gt;Functional Analysis&lt;/a&gt; - Lawerence Baggett&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Collected Lectures on Harmonic Analysis</title><link>https://blog.namln.org/en/mathematics/analysis/harmonic-analysis/collected-lectures-hd/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/harmonic-analysis/collected-lectures-hd/</guid><description>&lt;ul&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.uiuc.edu/~laugesen/545/545Lectures.pdf"&gt;Harmonic Analysis Lecture Notes&lt;/a&gt; - Richard S. Laugesen (University of Illinois at Urbana–Champaign)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.uchicago.edu/~schlag/harmonicnotes.pdf"&gt;Harmonic Analysis&lt;/a&gt; - W. Schlag&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://see.stanford.edu/materials/lsoftaee261/book-fall-07.pdf"&gt;Lecture Notes: Fourier Transform and its Applications&lt;/a&gt; - Brad Osgood&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.reed.edu/physics/courses/Physics331.f08/pdf/Fourier.pdf"&gt;Fourier Analysis&lt;/a&gt; - Lucas Illing&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://ccrma.stanford.edu/~jos/mdft"&gt;Mathematics of the Discrete Fourier Transform (DFT) with Audio Applications&lt;/a&gt; - Julius O. Smith III (Stanford University)&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Collected Lectures on Measure Theory</title><link>https://blog.namln.org/en/mathematics/analysis/measure-theory/collected-lectures-measure-theory/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/measure-theory/collected-lectures-measure-theory/</guid><description>&lt;ul&gt;
&lt;li&gt;📝 &lt;a href="https://terrytao.files.wordpress.com/2012/12/gsm-126-tao5-measure-book.pdf"&gt;An Introduction to Measure Theory&lt;/a&gt; - Terence Tao (UCLA)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.mat.uniroma2.it/~cannarsa/cam_0607.pdf"&gt;Lecture Notes on Measure Theory and Functional Analysis&lt;/a&gt; - P. Cannarsa, T. D’Aprile&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.chalmers.se/~borell/MeasureTheory.pdf"&gt;Lecture Notes in Measure Theory&lt;/a&gt; - Christer Borell&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.gold-saucer.org/math/lebesgue/lebesgue.pdf"&gt;A Crash Course on the Lebesgue Integral and Measure Theory&lt;/a&gt; - Steve Cheng&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://www.math.ucdavis.edu/~hunter/measure_theory/measure_notes.pdf"&gt;Measure Theory&lt;/a&gt; - John K. Hunter (University of California at Davis)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://people.math.ethz.ch/~salamon/PREPRINTS/measure.pdf"&gt;Measure and Integration&lt;/a&gt; - Dietmar A. Salamon (ETH Zürich)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.ucsd.edu/~bdriver/240-00-01/Lecture_Notes/measurep.pdf"&gt;Lecture notes: Measure Theory&lt;/a&gt; - Bruce K. Driver&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Collected Lectures on Ordinary Differential Equations (ODE)</title><link>https://blog.namln.org/en/mathematics/analysis/ode/collected-lectures-ode/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/ode/collected-lectures-ode/</guid><description>&lt;ul&gt;
&lt;li&gt;📝 &lt;a href="http://www.synechism.org/wp/difference-equations-to-differential-equations/"&gt;Difference Equations To Differential Equations&lt;/a&gt; - Dan Sloughter&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://www.math.uni-bielefeld.de/~grigor/odelec2008.pdf"&gt;Ordinary Differential Equation&lt;/a&gt; - Alexander Grigorian (University of Bielefeld)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.cs.bgu.ac.il/~leonid/ode_bio_files/Ionascu_LectNotes.pdf"&gt;Ordinary Differential Equations: Lecture Notes&lt;/a&gt; - Eugen J. Ionascu&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.lmu.de/~philip/publications/lectureNotes/ODE.pdf"&gt;Ordinary Differential Equations&lt;/a&gt; - Peter Philip&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://users.math.msu.edu/users/gnagy/teaching/ode.pdf"&gt;Ordinary Differential Equations&lt;/a&gt; - Gabriel Nagy&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.mat.univie.ac.at/~gerald/ftp/book-ode/ode.pdf"&gt;Ordinary Differential Equations and Dynamical Systems&lt;/a&gt; - Gerald Teschl&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://leipper.org/manuals/zip-fill/dn-difeq-notes.pdf"&gt;Notes on Differential Equations&lt;/a&gt; - Bob Terrell&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://digitalcommons.trinity.edu/mono/8/"&gt;Elementary Differential Equations&lt;/a&gt; - William F. Trench&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://digitalcommons.trinity.edu/mono/9/"&gt;Elementary Differential Equations With Boundary Value Problems&lt;/a&gt; - William F. Trench&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.jirka.org/diffyqs/"&gt;Notes on Diffy Qs: Differential Equations for Engineers&lt;/a&gt; - Jiří Lebl&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://djm.cc/library/Differential_Equations_Phillips_edited.pdf"&gt;Differential Equations&lt;/a&gt; - H. B. Phillips (1922)&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Collected Lectures on Partial Differential Equations (PDE)</title><link>https://blog.namln.org/en/mathematics/analysis/pde/collected-lectures-pde/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/pde/collected-lectures-pde/</guid><description>&lt;ul&gt;
&lt;li&gt;📝 &lt;a href="https://www.math.ucdavis.edu/~hunter/pdes/pde_notes.pdf"&gt;Notes on Partial Differential Equations&lt;/a&gt; - John K. Hunter (University of California at Davis)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.uni-leipzig.de/~miersemann/pdebook.pdf"&gt;Partial Differential Equations: Lecture Notes&lt;/a&gt; - Erich Miersemann (Leipzig University)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.mathphysics.com/pde/"&gt;Linear Methods of Applied Mathematics&lt;/a&gt; - E. Harrell, J. Herod (Georgia Tech)&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Collected Lectures on Real Analysis</title><link>https://blog.namln.org/en/mathematics/analysis/real-analysis/collected-lectures-ca/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/real-analysis/collected-lectures-ca/</guid><description>&lt;ul&gt;
&lt;li&gt;📝 &lt;a href="https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/textbook/"&gt;MIT OpenCourseWare Lectures on Calculus&lt;/a&gt; - G. Strang&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.wisc.edu/~keisler/calc.html"&gt;Elementary Calculus: An Approach Using Infinitesimals&lt;/a&gt; - H. Jerome Keisler&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://www.math.ucdavis.edu/~hunter/intro_analysis_pdf/intro_analysis.pdf"&gt;An Introduction to Real Analysis&lt;/a&gt; - John K. Hunter (University of California at Davis)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://ramanujan.math.trinity.edu/wtrench/texts/TRENCH_REAL_ANALYSIS.PDF"&gt;Introduction to Real Analysis&lt;/a&gt; - William F. Trench (Trinity University, Texas)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.jirka.org/ra/realanal.pdf"&gt;Basic Analysis: Introduction to Real Analysis&lt;/a&gt; - Jiří Lebl&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://prac.im.pwr.wroc.pl/~kwasnicki/pl/stuff/tbb-hyper.pdf"&gt;Elementary Real Analysis&lt;/a&gt; - Thomson, Bruckner&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://ms.mcmaster.ca/~sawyer/Publications/Real_Analysis.pdf"&gt;Lecture Notes in Real Analysis&lt;/a&gt; - Eric T. Sawyer (McMaster University)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://math.harvard.edu/~ctm/papers/home/text/class/harvard/212a/course/course.pdf"&gt;Real Analysis&lt;/a&gt; - C. McMullen&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://bass.math.uconn.edu/3rd.pdf"&gt;Real Analysis for Graduate Students&lt;/a&gt; - Richard F. Bass&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.purdue.edu/~torres/pubs/Modern-real-analysis.pdf"&gt;Modern Real Analysis&lt;/a&gt; - William P. Ziemer (Indiana University)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.trillia.com/zakon-analysisI.html"&gt;Mathematical Analysis Vol I&lt;/a&gt; - Elias Zakon&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.trillia.com/zakon-analysisII.html"&gt;Mathematical Analysis Vol II&lt;/a&gt; - Elias Zakon&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.harvard.edu/~shlomo/docs/Advanced_Calculus.pdf"&gt;Advanced Calculus&lt;/a&gt; - Lynn Loomis, Shlomo Sternberg&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://spot.colorado.edu/~baggett/analysis.html"&gt;Analysis of Functions of a Single Variable&lt;/a&gt; - Lawrence Baggett&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.synechism.org/wp/the-calculus-of-functions-of-several-variables/"&gt;The Calculus of Functions of Several Variables&lt;/a&gt; - Dan Sloughter&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://web.pdx.edu/~erdman/PTAC/problemtext_pdf.pdf"&gt;A ProblemText in Advanced Calculus&lt;/a&gt; - John M. Erdman&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://hdl.handle.net/2027/spo.5597602.0001.001"&gt;Calculus and Linear Algebra. Vol. 1&lt;/a&gt; - Wilfred Kaplan, Donald J. Lewis&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://quod.lib.umich.edu/s/spobooks/5597602.0002.001"&gt;Calculus and Linear Algebra. Vol. 2&lt;/a&gt; - Wilfred Kaplan, Donald J. Lewis&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://www.math.odu.edu/~jhh/counter10.html"&gt;Introduction to Calculus I and II&lt;/a&gt; - J.H. Heinbockel&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://faculty.gvsu.edu/boelkinm/Home/Active_Calculus.html"&gt;Active Calculus&lt;/a&gt; - Matt Boelkins&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://math.berkeley.edu/~gbergman/ug.hndts/#Rudin"&gt;Supplements to the Exercises in Chapters 1-7 of Walter Rudin&amp;rsquo;s &amp;ldquo;Principles of Mathematical Analysis&amp;rdquo;&lt;/a&gt; - George M. Bergman&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://calculusmadeeasy.org/"&gt;Calculus Made Easy&lt;/a&gt; - Silvanus P. Thompson (1910)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="http://djm.cc/library/Elements_Differential_Integral_Calculus_Granville_edited_2.pdf"&gt;Elements of Differential and Integral Calculus&lt;/a&gt; - William Anthony Granville (1911)&lt;/li&gt;
&lt;li&gt;📝 &lt;a href="https://stitz-zeager.com/szprecalculus07042013.pdf"&gt;Precalculus&lt;/a&gt; - Carl Stitz, Jeff Zeager&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Combinatorial Drug Recommendation</title><link>https://blog.namln.org/en/research/ml-co/problems/drug-recommendation/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/drug-recommendation/</guid><description>&lt;h1 class="heading" id="combinatorial-drug-recommendation"&gt;
 Combinatorial Drug Recommendation&lt;span class="heading__anchor"&gt; &lt;a href="#combinatorial-drug-recommendation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Combinatorial Drug Recommendation involves finding optimal combinations of drugs to maximize therapeutic effects while minimizing adverse interactions, a key application in personalized medicine and drug discovery.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Combinatorial Drug Recommendations via Graph Neural Networks&lt;/strong&gt; Nature Medicine, 2023. &lt;a href="https://doi.org/10.1038/s41591-023-01485-9"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xin He, Yong Liu, Ying Wei, Yuqiao Zhang, Yizhou Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph Neural Networks for Drug-Drug Interactions&lt;/strong&gt; Bioinformatics, 2021. &lt;a href="https://doi.org/10.1093/bioinformatics/btab194"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yuhaoyang/GNN-DDI"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yu-Hao Yang, Fan Chen, Yajun Wang, Kun Huang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Learning Approaches for Drug Combination Analysis&lt;/strong&gt; Nature Computational Science, 2022. &lt;a href="https://doi.org/10.1038/s43588-022-00242-3"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jing Yang, Fang Liu, Yung-Jen Chen, Kimberly Glass, Jill P. Mesirov&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Knowledge-Guided Neural Networks for Drug Interaction Prediction&lt;/strong&gt; Briefings in Bioinformatics, 2023. &lt;a href="https://doi.org/10.1093/bib/bbac585"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xiaowan Kuang, Yihang Pan, Hongmin Cai, Wentao Liu, De-Shuang Huang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Synergistic Drug Interaction Prediction&lt;/strong&gt; NeurIPS 2023 Workshop on AI for Drug Discovery, Biodesign and Therapeutics, 2023. &lt;a href="https://arxiv.org/abs/2311.13245"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen Wen, Xiaowei Zhang, Tengfei Ma&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Explainable Machine Learning for Drug Combinations&lt;/strong&gt; Machine Learning for Healthcare, 2023. &lt;a href="https://arxiv.org/abs/2308.10956"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nathan Leung, Jingxi Jessica Lu, Michael Vigh&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Transfer Learning for Combinatorial Drug Sensitivity Prediction&lt;/strong&gt; IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2023. &lt;a href="https://doi.org/10.1109/TCBB.2022.3232357"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zheng Zhang, Jing Ma, Yong Liu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Conjunctive Query Containment</title><link>https://blog.namln.org/en/research/ml-co/problems/conjunctive-query-containment/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/conjunctive-query-containment/</guid><description>&lt;h1 class="heading" id="conjunctive-query-containment"&gt;
 Conjunctive Query Containment&lt;span class="heading__anchor"&gt; &lt;a href="#conjunctive-query-containment"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Conjunctive Query Containment (CQC) is a fundamental problem in database theory and reasoning: deciding whether the result of one conjunctive query is guaranteed to be a subset of the result of another on every database instance.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Reason over Relational Data&lt;/strong&gt; ICLR, 2020. &lt;a href="https://arxiv.org/abs/2203.04718"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Dario Amodei, Tom Brown, Ben Wang, Jared Kaplan, Chris Olah, Sam McCandlish&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Differentiable Optimization</title><link>https://blog.namln.org/en/research/ml-co/problems/differentiable-optimization/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/differentiable-optimization/</guid><description>&lt;h1 class="heading" id="differentiable-optimization"&gt;
 Differentiable Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#differentiable-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Differentiable optimization treats the solution of an optimization problem as a differentiable function of the problem&amp;rsquo;s parameters, so that solvers can be embedded as layers in neural networks and trained end to end.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;OptNet: Differentiable Optimization as a Layer in Neural Networks&lt;/strong&gt; ICML, 2017. &lt;a href="https://arxiv.org/abs/1703.00443"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/locuslab/OptNet"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Brandon Amos, J. Zico Kolter&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Differentiation of Blackbox Combinatorial Solvers&lt;/strong&gt; ICLR, 2020. &lt;a href="https://openreview.net/forum?id=SkevoJsCYB"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/google-research/diff_blackbox_solver"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Marin Vlastelica Pogančić, Anselm Paulus, Vít Musil, Georg Martius, Michal Rolínek&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints&lt;/strong&gt; ICML, 2021. &lt;a href="https://arxiv.org/abs/2105.02551"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/kwonmha/CombOptNet"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Anselm Paulus, Michal Rolínek, Vít Musil, Brandon Amos, Georg Martius&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Implicit Differentiation of Nonlinear Optimization Problems&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://openreview.net/forum?id=x6-RhzxRqH4"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/IVRL/differentiation_of_optimization"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jean-Pierre Hespanha, Noureddine Elhadji Boularas, Daniel Cremers&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decision-Focused Learning in Games&lt;/strong&gt; ICML, 2023. &lt;a href="https://proceedings.mlr.press/v202/thesot23a/thesot23b"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yoann Thesot, Maxime Wabartha, Vincent François-Lavet&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Prescribe with Differentiable Optimization&lt;/strong&gt; ICML, 2023. &lt;a href="https://proceedings.mlr.press/v202/donti23a"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/locuslab/learning-to-prescribe"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Niki Zadeh, J. Zico Kolter, Brandon Amos&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Ebooks on Combinatorics</title><link>https://blog.namln.org/en/mathematics/combinatorics/ebooks/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/combinatorics/ebooks/</guid><description/></item><item><title>Electronic Design Automation</title><link>https://blog.namln.org/en/research/ml-co/problems/eda/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/eda/</guid><description>&lt;h1 class="heading" id="electronic-design-automation"&gt;
 Electronic Design Automation&lt;span class="heading__anchor"&gt; &lt;a href="#electronic-design-automation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Electronic Design Automation (EDA) involves computational tools for designing and verifying electronic circuits and systems. ML approaches optimize placement, routing, timing, and other design parameters.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Machine Learning for Electronic Design Automation: A Survey&lt;/strong&gt; ACM Transactions on Design Automation of Electronic Systems, 2021. &lt;a href="https://doi.org/10.1145/3451165"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Guyue Huang, Jingbo Hu, Yifan He, Jialong Liu, Mingjie Liu, Zhaoyang Shen, Jian Shi, Yuanfeng Peng, Chenxi Wang, Bin He, Young-Joon Lee, Haoxing Ren&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Chip Placement with Deep Reinforcement Learning&lt;/strong&gt; ICLR, 2021. &lt;a href="https://openreview.net/forum?id=ipGigyBiBv"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/google-research/chip-placement"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Olivier Bastien, Joe Bobba, Naveen Bobbili, Paul N. Chen, Mike Compt, Paul H. Huang, Abe Kahng, Seunggeun Lee, Megan Li, Lukasz Lew, Mark Marson, Peilin Song, Sameer Vora, Jeff Weinberg, Zihan Ye, Hailong Yun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;RouteNet: Leveraging Graph Neural Networks for Network Modeling and Optimization in SDN&lt;/strong&gt; NSDI, 2019. &lt;a href="https://arxiv.org/abs/1910.11515"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/agupta231/routenet"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Gerardo Ferrando, Eduard Almendares, Miquel Ferriol, Albert López, David Cordobés, Sergi Abadal, Eduard Alarcón, Albert Cabellos-Aparicio, Jordi Suñé&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Heuristics over Large Graphs via Deep Reinforcement Learning&lt;/strong&gt; ICLR, 2018. &lt;a href="https://arxiv.org/abs/1903.01694"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Guyue Huang, Zemin Wang, Haoxing Ren&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GCN-RL Circuit Designer: Transferable Transductive Boundary Search for Analog Circuit Optimization&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=hDEoLiXm_2K"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/PKU-ICST-MIPL/GCN-RL-Circuit-Designer"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Keren Zhu, Mingjie Liu, Yaguang Li, Yisong Yue, Haoxing Ren&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;RL4RewriteRules: Generating Rewrite Rules from Offline Reinforcement Learning Trajectories&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/forum?id=D8XRrnZ8cj"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/OpenXLab-NAS/RL4RewriteRules"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kaiyuan Hu, Runpeng Guo, Changlin Yan, Jianye Hao, Ping Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Facility Location Problem</title><link>https://blog.namln.org/en/research/ml-co/problems/facility-location/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/facility-location/</guid><description>&lt;h1 class="heading" id="facility-location-problem"&gt;
 Facility Location Problem&lt;span class="heading__anchor"&gt; &lt;a href="#facility-location-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Facility Location Problem determines optimal locations for facilities (warehouses, hospitals, etc.) to serve customers while minimizing total costs including facility opening costs and transportation costs.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Combinatorial Optimization via Variational Graph Autoencoders&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://openreview.net/forum?id=fJFJv8yWVzi"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jieyi Bi, Peng Lin, Chao Qu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Learning for Combinatorial Optimization&lt;/strong&gt; IJCAI, 2021. &lt;a href="https://arxiv.org/abs/2104.00038"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Shiyu Zhao, Yong Tao, Keyvan Mohajer&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Game Theoretic Semantics</title><link>https://blog.namln.org/en/research/ml-co/problems/game-theoretic-semantics/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/game-theoretic-semantics/</guid><description>&lt;h1 class="heading" id="game-theoretic-semantics"&gt;
 Game Theoretic Semantics&lt;span class="heading__anchor"&gt; &lt;a href="#game-theoretic-semantics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Game Theoretic Semantics (GTS) provides a game-based interpretation of logical formulas, where truth is determined by the existence of winning strategies in semantic games.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Game-Theoretic Aspects of Computation and Approximation Algorithms for Combinatorial Optimization&lt;/strong&gt; Handbook of Computational Complexity, 2012. &lt;a href="https://doi.org/10.1007/978-1-4614-1800-9_19"&gt;book-chapter&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Steve Chien, Alistair Sinclair&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Generalization</title><link>https://blog.namln.org/en/research/ml-co/problems/generalization/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/generalization/</guid><description>&lt;h1 class="heading" id="generalization"&gt;
 Generalization&lt;span class="heading__anchor"&gt; &lt;a href="#generalization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Generalization is a critical aspect of machine learning for combinatorial optimization. This section covers approaches to improve generalization across different problem instances and scales.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;It&amp;rsquo;s Not What Machines Can Learn, It&amp;rsquo;s What We Cannot Teach&lt;/strong&gt; ICML, 2020. &lt;a href="http://proceedings.mlr.press/v119/yehuda20a/yehuda20a.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Gal Yehuda, Moshe Gabel and Assaf Schuster&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning TSP Requires Rethinking Generalization&lt;/strong&gt; CP, 2021. &lt;a href="https://arxiv.org/pdf/2006.07054.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/chaitjo/learning-tsp"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chaitanya K. Joshi, Quentin Cappart, Louis-Martin Rousseau and Thomas Laurent&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=vJZ7dPIjip3"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Simon Geisler, Johanna Sommer, Jan Schuchardt, Aleksandar Bojchevski and Stephan Günnemann&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning for Robust Combinatorial Optimization: Algorithm and Application&lt;/strong&gt; INFOCOM, 2022. &lt;a href="https://ieeexplore.ieee.org/abstract/document/9796715/"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Shao, Zhihui and Yang, Jianyi and Shen, Cong and Ren, Shaolei&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ROCO: A General Framework for Evaluating Robustness of Combinatorial Optimization Solvers on Graphs&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=2r6YMqz4Mml"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ROCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lu, Han and Li, Zenan and Wang, Runzhong and Ren, Qibing and Li, Xijun and Yuan, Mingxuan and Zeng, Jia and Yang, Xiaokang and Yan, Junchi&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Towards Omni-generalizable Neural Methods for Vehicle Routing Problems&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25267"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/RoyalSkye/Omni-VRP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jianan Zhou, Yaoxin Wu, Wen Song, Zhiguang Cao, Jie Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GOAL: A Generalist Combinatorial Optimization Agent Learner&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=z2z9suDRjw"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Darko Drakulic, Sofia Michel, Jean-Marc Andreoli&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Graph Coloring</title><link>https://blog.namln.org/en/research/ml-co/problems/graph-coloring/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/graph-coloring/</guid><description>&lt;h1 class="heading" id="graph-coloring"&gt;
 Graph Coloring&lt;span class="heading__anchor"&gt; &lt;a href="#graph-coloring"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Graph Coloring is the problem of assigning colors to the vertices of a graph so that no two adjacent vertices share a color, typically using as few colors as possible, with applications in scheduling and frequency assignment.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Learning-based Hybrid Graph-Coloring Algorithm for Register Allocation&lt;/strong&gt; arXiv, 2019. &lt;a href="https://arxiv.org/abs/1912.03700"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Das, Dibyendu and Ahmad, Shahid Asghar and Venkataramanan, Kumar&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Models for Output-Space Invariance in Combinatorial Problems&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=ibrUkC-pbis"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nandwani, Yatin and Jain, Vidit and Singla, Parag and others&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Enhancing Column Generation by a Machine-Learning-Based Pricing Heuristic for Graph Coloring&lt;/strong&gt; AAAI, 2022. &lt;a href="https://www.aaai.org/AAAI22Papers/AAAI-4026.ShenY.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Joey-Shen/MLPH.git"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yunzhuang Shen, Yuan Sun, Xiaodong Li, Andrew Craig Eberhard and Andreas T. Ernst&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Generate Columns with Application to Vertex Coloring&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=JHW30A4DXtO"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yuansuny/mlcg"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sun, Yuan and Ernst, Andreas T and Li, Xiaodong and Weiner, Jake&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Graph Edit Distance (GED)</title><link>https://blog.namln.org/en/research/ml-co/problems/graph-edit-distance/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/graph-edit-distance/</guid><description>&lt;h1 class="heading" id="graph-edit-distance-ged"&gt;
 Graph Edit Distance (GED)&lt;span class="heading__anchor"&gt; &lt;a href="#graph-edit-distance-ged"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Graph Edit Distance measures the minimum total cost of the edit operations (node and edge insertions, deletions, and substitutions) needed to transform one graph into another. It has applications in pattern matching and graph similarity computation.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SimGNN: A Neural Network Approach to Fast Graph Similarity Computation&lt;/strong&gt; WSDM, 2019. &lt;a href="https://arxiv.org/abs/1808.05689"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yunshengb/SimGNN"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bai, Yunsheng and Ding, Hao and Bian, Song and Chen, Ting and Sun, Yizhou and Wang, Wei&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph Matching Networks for Learning the Similarity of Graph Structured Objects&lt;/strong&gt; ICML, 2019. &lt;a href="https://arxiv.org/abs/1904.12787"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Lin-Yijie/Graph-Matching-Networks"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Li, Yujia and Gu, Chenjie and Dullien, Thomas and Vinyals, Oriol and Kohli, Pushmeet&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Convolutional Embedding for Edit Distance&lt;/strong&gt; SIGIR, 2020. &lt;a href="https://dl.acm.org/doi/abs/10.1145/3397271.3401045"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/xinyandai/string-embed"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Dai, Xinyan and Yan, Xiao and Zhou, Kaiwen and Wang, Yuxuan and Yang, Han and Cheng, James&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning-Based Efficient Graph Similarity Computation via Multi-Scale Convolutional Set Matching&lt;/strong&gt; AAAI, 2020. &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/view/5720"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yunshengb/GraphSim"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bai, Yunsheng and Ding, Hao and Gu, Ken and Sun, Yizhou and Wang, Wei&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://arxiv.org/abs/2106.04927"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/PPO-BiHyb"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Hua, Zhigang and Liu, Gan and Zhang, Jiayi and Yan, Junchi and Qi, Feng and Yang, Shuang and Zhou, Jun and Yang, Xiaokang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Combinatorial Learning of Graph Edit Distance via Dynamic Embedding.&lt;/strong&gt; CVPR, 2021. &lt;a href="https://arxiv.org/abs/2011.15039"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/GENN-Astar"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Zhang, Tianqi and Yu, Tianshu and Yan, Junchi and Yang, Xiaokang.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Graph Matching (GM)</title><link>https://blog.namln.org/en/research/ml-co/problems/graph-matching/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/graph-matching/</guid><description>&lt;h1 class="heading" id="graph-matching-gm"&gt;
 Graph Matching (GM)&lt;span class="heading__anchor"&gt; &lt;a href="#graph-matching-gm"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Graph Matching is a fundamental combinatorial optimization problem that involves finding correspondences between vertices of two graphs.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Revised Note on Learning Algorithms for Quadratic Assignment with Graph Neural Networks&lt;/strong&gt; Arxiv, 2017. &lt;a href="https://arxiv.org/pdf/1706.07450.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/alexnowakvila/QAP_pt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nowak, Alex and Villar, Soledad and Bandeira, Afonso S. and Bruna, Joan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Learning of Graph Matching.&lt;/strong&gt; CVPR, 2018. &lt;a href="https://openaccess.thecvf.com/content_cvpr_2018/html/Zanfir_Deep_Learning_of_CVPR_2018_paper.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zanfir, Andrei and Sminchisescu, Cristian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Learning Combinatorial Embedding Networks for Deep Graph Matching.&lt;/strong&gt; ICCV, 2019. &lt;a href="http://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_Learning_Combinatorial_Embedding_Networks_for_Deep_Graph_Matching_ICCV_2019_paper.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ThinkMatch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Yan, Junchi and Yang, Xiaokang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Graphical Feature Learning for the Feature Matching Problem.&lt;/strong&gt; ICCV, 2019. &lt;a href="https://openaccess.thecvf.com/content_ICCV_2019/papers/Zhang_Deep_Graphical_Feature_Learning_for_the_Feature_Matching_Problem_ICCV_2019_paper.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhang, Zhen and Lee, Wee Sun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GLMNet: Graph Learning-Matching Networks for Feature Matching.&lt;/strong&gt; Arxiv, 2019. &lt;a href="https://arxiv.org/abs/1911.07681"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiang, Bo and Sun, Pengfei and Tang, Jin and Luo, Bin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Learning deep graph matching with channel-independent embedding and Hungarian attention.&lt;/strong&gt; ICLR, 2020. &lt;a href="https://openreview.net/forum?id=rJgBd2NYPH"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ThinkMatch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yu, Tianshu and Wang, Runzhong and Yan, Junchi and Li, Baoxin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Graph Matching Consensus.&lt;/strong&gt; ICLR, 2020. &lt;a href="http://arxiv.org/abs/2001.09621"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Fey, Matthias and Lenssen, Jan E. and Morris, Christopher and Masci, Jonathan and Kriege, Nils M.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Graduated Assignment for Joint Multi-Graph Matching and Clustering with Application to Unsupervised Graph Matching Network Learning.&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://papers.NeurIPS.cc/paper/2020/file/e6384711491713d29bc63fc5eeb5ba4f-Paper.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ThinkMatch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Yan, Junchi and Yang, Xiaokang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Combinatorial Learning of Robust Deep Graph Matching: An Embedding Based Approach.&lt;/strong&gt; TPAMI, 2020. &lt;a href="https://doi.org/10.1109/TPAMI.2020.3005590"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ThinkMatch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Yan, Junchi and Yang, Xiaokang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers.&lt;/strong&gt; ECCV, 2020. &lt;a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123730409.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/martius-lab/blackbox-deep-graph-matching"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Rolinek, Michal and Swoboda, Paul and Zietlow, Dominik and Paulus, Anselm and Musil, Vit and Martius, Georg&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Neural Graph Matching Network: Learning Lawler&amp;rsquo;s Quadratic Assignment Problem with Extension to Hypergraph and Multiple-graph Matching.&lt;/strong&gt; TPAMI, 2021. &lt;a href="https://arxiv.org/abs/1911.11308"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ThinkMatch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Yan, Junchi and Yang, Xiaokang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Deep Latent Graph Matching&lt;/strong&gt; ICML, 2021. &lt;a href="http://proceedings.mlr.press/v139/yu21d/yu21d.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yu, Tianshu and Wang, Runzhong and Yan, Junchi and Li, Baoxin.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;IA-GM: A Deep Bidirectional Learning Method for Graph Matching&lt;/strong&gt; AAAI, 2021. &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/view/16461/16268"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhao, Kaixuan and Tu, Shikui and Xu, Lei&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Graph Matching under Quadratic Constraint&lt;/strong&gt; CVPR, 2021. &lt;a href="https://openaccess.thecvf.com/content/CVPR2021/papers/Gao_Deep_Graph_Matching_Under_Quadratic_Constraint_CVPR_2021_paper.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Gao, Quankai and Wang, Fudong and Xue, Nan and Yu, Jin-Gang and Xia, Gui-Song&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GAMnet: Robust Feature Matching via Graph Adversarial-Matching Network&lt;/strong&gt; MM, 2021. &lt;a href="https://dl.acm.org/doi/pdf/10.1145/3474085.3475669"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiang, Bo and Sun, Pengfei and Zhang, Ziyan and Tang, Jin and Luo, Bin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Hypergraph Neural Networks for Hypergraph Matching&lt;/strong&gt; ICCV, 2021. &lt;a href="https://openaccess.thecvf.com/content/ICCV2021/papers/Liao_Hypergraph_Neural_Networks_for_Hypergraph_Matching_ICCV_2021_paper.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liao, Xiaowei and Xu, Yong and Ling, Haibin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Match Features with Seeded Graph Matching Network&lt;/strong&gt; ICCV, 2021. &lt;a href="https://openaccess.thecvf.com/content/ICCV2021/html/Chen_Learning_To_Match_Features_With_Seeded_Graph_Matching_Network_ICCV_2021_paper.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Hongkai and Luo, Zixin and Zhang, Jiahui and Zhou, Lei and Bai, Xuyang and Hu, Zeyu and Tai, Chiew-Lan and Quan, Long&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Appearance and Structure Aware Robust Deep Visual Graph Matching: Attack, Defense and Beyond&lt;/strong&gt; CVPR, 2022. &lt;a href="https://openaccess.thecvf.com/content/CVPR2022/papers/Ren_Appearance_and_Structure_Aware_Robust_Deep_Visual_Graph_Matching_Attack_CVPR_2022_paper.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/RobustMatch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ren, Qibing and Bao, Qingquan and Wang, Runzhong and Yan, Junchi&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Self-supervised Learning of Visual Graph Matching&lt;/strong&gt; ECCV, 2022. &lt;a href="https://link.springer.com/chapter/10.1007/978-3-031-20050-2_22"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ThinkMatch-SCGM"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liu, Chang and Zhang, Shaofeng and Yang, Xiaokang and Yan, Junchi&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching.&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=QjQibO3scV_"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/RGM"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liu, Chang and Jiang, Zetian and Wang, Runzhong and Yan, Junchi and Huang, Lingxiao and Lu, Pinyan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SeedGNN: Graph Neural Network for Supervised Seeded Graph Matching&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/24282"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yu, Liren and Xu, Jiaming and Lin, Xiaojun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/24358"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liu, Xuan and Zhang, Lin and Sun, Jiaqi and Yang, Yujiu and Yang, Haiqing&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐LinSATNet: The Positive Linear Satisfiability Neural Networks&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25110"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/LinSATNet"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Runzhong Wang and Yunhao Zhang and Ziao Guo and Tianyi Chen and Xiaokang Yang and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=xE7oH5iVGK"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/duyhominhnguyen/LVM-Med"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nguyen, Duy MH and Nguyen, Hoang and Diep, Nghiem T and Pham, Tan N and Cao, Tri and Nguyen, Binh T and Swoboda, Paul and Ho, Nhat and Albarqouni, Shadi and Xie, Pengtao and others&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Improving Graph Matching with Positional Reconstruction Encoder-Decoder Network&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=28RTu9MOT6"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhou, Yixiao and Jia, Ruiqi and Lin, Hongxiang and Quan, Hefeng and Zhao, Yumeng and Lyu, Xiaoqing&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Prune Instances of Steiner Tree Problem in Graphs&lt;/strong&gt; INOC, 2024. &lt;a href="https://openproceedings.org/2024/conf/inoc/INOC_31.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/dajwani/alenex22"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiwei Zhang, Dena Tayebi, Saurabh Ray, Deepak Ajwani&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Hamiltonian Cycle Problem (HCP)</title><link>https://blog.namln.org/en/research/ml-co/problems/hamiltonian-cycle/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/hamiltonian-cycle/</guid><description>&lt;h1 class="heading" id="hamiltonian-cycle-problem-hcp"&gt;
 Hamiltonian Cycle Problem (HCP)&lt;span class="heading__anchor"&gt; &lt;a href="#hamiltonian-cycle-problem-hcp"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Hamiltonian Cycle Problem asks whether a graph contains a cycle that visits each vertex exactly once. Its decision version is NP-complete, and it is a standard starting point for NP-hardness reductions.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://arxiv.org/abs/2106.04927"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/PPO-BiHyb"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Hua, Zhigang and Liu, Gan and Zhang, Jiayi and Yan, Junchi and Qi, Feng and Yang, Shuang and Zhou, Jun and Yang, Xiaokang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=yEwakMNIex"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/UniCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wenzheng Pan, Hao Xiong, Jiale Ma, Wentao Zhao, Yang Li, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Influence Maximization</title><link>https://blog.namln.org/en/research/ml-co/problems/influence-maximization/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/influence-maximization/</guid><description>&lt;h1 class="heading" id="influence-maximization"&gt;
 Influence Maximization&lt;span class="heading__anchor"&gt; &lt;a href="#influence-maximization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Influence Maximization seeks to select a set of influential nodes in a network to maximize information spread. It has applications in social network marketing.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Heuristics over Large Graphs via Deep Reinforcement Learning.&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://arxiv.org/abs/1903.03332"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Mittal, Akash and Dhawan, Anuj and Manchanda, Sahil and Medya, Sourav and Ranu, Sayan and Singh, Ambuj.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks.&lt;/strong&gt; ICML, 2021. &lt;a href="https://arxiv.org/abs/2010.05313"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Eli A. Meirom, Haggai Maron, Shie Mannor, Gal Chechik&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;LeNSE: Learning To Navigate Subgraph Embeddings for Large-Scale Combinatorial Optimisation&lt;/strong&gt; ICML, 2022. &lt;a href="https://proceedings.mlr.press/v162/ireland22a.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/davidireland3/LeNSE"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ireland, David and Montana, G.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Towards One-shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=h21yJhdzbwz"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/One-Shot-Cardinality-NN-Solver"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Shen, Li and Chen, Yiting and Yan, Junchi and Yang, Xiaokang and Tao, Dacheng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Graph Representation Learning and Optimization for Influence Maximization&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/24512"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen Ling and Junji Jiang and Junxiang Wang and My T. Thai and Lukas Xue and James Song and Meikang Qiu and Liang Zhao&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Job Shop Scheduling Problem (JSSP)</title><link>https://blog.namln.org/en/research/ml-co/problems/jssp/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/jssp/</guid><description>&lt;h1 class="heading" id="job-shop-scheduling-problem-jssp"&gt;
 Job Shop Scheduling Problem (JSSP)&lt;span class="heading__anchor"&gt; &lt;a href="#job-shop-scheduling-problem-jssp"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Job Shop Scheduling Problem is a classic combinatorial optimization problem in which each job consists of operations that must be processed on specific machines in a prescribed order, typically with the goal of minimizing the makespan.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Smart Manufacturing Scheduling With Edge Computing Using Multiclass Deep Q Network&lt;/strong&gt; Transactions on Industrial Informatics, 2019. &lt;a href="https://ieeexplore.ieee.org/document/8676376"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chun-Cheng Lin, Der-Jiunn Deng, Yen-Ling Chih, Hsin-Ting Chiu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Multi-Agent Reinforcement Learning for Job Shop Scheduling in Flexible Manufacturing Systems&lt;/strong&gt; International Conference on Artificial Intelligence for Industries (AI4I), 2019. &lt;a href="https://ieeexplore.ieee.org/document/9027776"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Schirin Baer, Jupiter Bakakeu, Richard Meyes, Tobias Meisen&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning.&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://arxiv.org/abs/2010.12367"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/zcajiayin/L2D"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhang, Cong and Song, Wen and Cao, Zhiguang and Zhang, Jie and Tan, Puay Siew and Xu, Chi.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;ScheduleNet: Learn to Solve Multi-agent Scheduling Problems with Reinforcement Learning&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2106.03051"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Junyoung Park, Sanjar Bakhtiyar, Jinkyoo Park&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dynamic job-shop scheduling in smart manufacturing using deep reinforcement learning&lt;/strong&gt; Computer Networks, 2021. &lt;a href="https://www.sciencedirect.com/science/article/pii/S1389128621001031"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Libing Wang, Xin Hu, Yin Wang, Sujie Xu, Shijun Ma, Kexin Yang, Zhijun Liu, Weidong Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to schedule job-shop problems: Representation and policy learning using graph neural network and reinforcement learning.&lt;/strong&gt; International Journal of Production Research, 2021. &lt;a href="https://arxiv.org/abs/2106.01086"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Junyoung Park, Jaehyeong Chun, Sang Hun Kim, Youngkook Kim, Jinkyoo Park&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Explainable reinforcement learning in production control of job shop manufacturing system.&lt;/strong&gt; International Journal of Production Research, 2021. &lt;a href="https://www.tandfonline.com/doi/abs/10.1080/00207543.2021.1972179?journalCode=tprs20"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andreas Kuhnle, Marvin Carl May, Louis Schäfer &amp;amp; Gisela Lanza&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DeepACO: Neural-enhanced Ant Systems for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=cd5D1DD923"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/henry-yeh/DeepACO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ye, Haoran and Wang, Jiarui and Cao, Zhiguang and Liang, Helan and Li, Yong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=v6VpqGcGAR"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Grinsztajn, Nathan and Furelos-Blanco, Daniel and Surana, Shikha and Bonnet, Clément and Barrett, Thomas D&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Combinatorial Optimization with Policy Adaptation using Latent Space Search&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=vpMBqdt9Hl"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chalumeau, Felix and Surana, Shikha and Bonnet, Clément and Grinsztajn, Nathan and Pretorius, Arnu and Laterre, Alexandre and Barrett, Thomas D&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural DAG Scheduling via One-Shot Priority Sampling&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=WL8FlAugqQ"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jeon, Wonseok and Gagrani, Mukul and Bartan, Burak and Zeng, Weiliang Will and Teague, Harris and Zappi, Piero and Lott, Christopher&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Robust Scheduling with GFlowNets&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=ZBUthI6wK9h"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhang, David W and Rainone, Corrado and Peschl, Markus and Bondesan, Roberto&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Continual Task Allocation in Meta-Policy Network via Sparse Prompting&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/24080"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yang, Yijun and Zhou, Tianyi and Jiang, Jing and Long, Guodong and Shi, Yuhui&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Applicability of Neural Combinatorial Optimization: A Critical View&lt;/strong&gt; TELO, 2024. &lt;a href="https://dl.acm.org/doi/pdf/10.1145/3647644"&gt;journal&lt;/a&gt;, &lt;a href="https://github.com/TheLeprechaun25/Applicability-NCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andoni I. Garmendia, Josu Ceberio, Alexander Mendiburu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Knapsack Problem</title><link>https://blog.namln.org/en/research/ml-co/problems/knapsack/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/knapsack/</guid><description>&lt;h1 class="heading" id="knapsack-problem"&gt;
 Knapsack Problem&lt;span class="heading__anchor"&gt; &lt;a href="#knapsack-problem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Knapsack Problem is a classic optimization problem where items with weights and values must be selected to maximize total value while respecting a weight constraint.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Novel Method to Solve Neural Knapsack Problems&lt;/strong&gt; ICML, 2021. &lt;a href="http://proceedings.mlr.press/v139/li21m.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/One-Shot-Cardinality-NN-Solver"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Li Duanshun and Liu Jing and Lee Dongeun and Seyedmazloom Ali and Kaushik Giridhar and Lee Kookjin and Park Noseong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DeepACO: Neural-enhanced Ant Systems for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=cd5D1DD923"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/henry-yeh/DeepACO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ye, Haoran and Wang, Jiarui and Cao, Zhiguang and Liang, Helan and Li, Yong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=v6VpqGcGAR"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Grinsztajn, Nathan and Furelos-Blanco, Daniel and Surana, Shikha and Bonnet, Clément and Barrett, Thomas D&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficient Meta Neural Heuristic for Multi-Objective Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=593fc38lhN"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/bill-cjb/EMNH"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Jinbiao and Wang, Jiahai and Zhang, Zizhen and Cao, Zhiguang and Ye, Te and Chen, Siyuan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;BQ-NCO: Bisimulation Quotienting for Efficient Neural Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=BRqlkTDvvm"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/naver/bq-nco"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Drakulic, Darko and Michel, Sofia and Mai, Florian and Sors, Arnaud and Andreoli, Jean-Marc&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Multi-Objective Combinatorial Optimization with Diversity Enhancement&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=N4JkStI1fe"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/bill-cjb/NHDE"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Jinbiao and Zhang, Zizhen and Cao, Zhiguang and Wu, Yaoxin and Ma, Yining and Ye, Te and Wang, Jiahai&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Rethinking Neural Multi-Objective Combinatorial Optimization via Neat Weight Embedding&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=GM7cmQfk2F"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jinbiao Chen, Zhiguang Cao, Jiahai Wang, Yaoxin Wu, Hanzhang Qin, Zizhen Zhang, Yue-Jiao Gong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Approximation algorithms for combinatorial optimization with predictions&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=AEFVa6VMu1"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Antonios Antoniadis, Marek Elias, Adam Polak, Moritz Venzin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Max Clique</title><link>https://blog.namln.org/en/research/ml-co/problems/max-clique/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/max-clique/</guid><description>&lt;h1 class="heading" id="max-clique"&gt;
 Max Clique&lt;span class="heading__anchor"&gt; &lt;a href="#max-clique"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Maximum Clique problem seeks the largest clique in a graph. A clique is a subset of vertices in which every pair of vertices is joined by an edge.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Can Hybrid Geometric Scattering Networks Help Solve the Maximum Clique Problem&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=uxc8hDSs_xh"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yimengmin/geometricscatteringmaximalclique"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yimeng Min, Frederik Wenkel, Michael Perlmutter, Guy Wolf&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Variational Annealing on Graphs for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=SLx7paoaTU"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ml-jku/VAG-CO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sanokowski, Sebastian and Berghammer, Wilhelm Franz and Hochreiter, Sepp and Lehner, Sebastian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DISCS: A Benchmark for Discrete Sampling&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=oi1MUMk5NF"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/google-research/discs"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Katayoon Goshvadi, Haoran Sun, Xingchao Liu, Azade Nova, Ruqi Zhang, Will Sussman Grathwohl, Dale Schuurmans, Hanjun Dai&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning fine-grained search space pruning and heuristics for combinatorial optimization.&lt;/strong&gt; Journal of Heuristics, 2023. &lt;a href="https://dx.doi.org/10.1007/s10732-023-09512-z"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Juho Lauri, Sourav Dutta, Marco Grassia, Deepak Ajwani&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization&lt;/strong&gt; ICML, 2024. &lt;a href="https://arxiv.org/abs/2406.01661"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ml-jku/DIffUCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sanokowski, Sebastian and Hochreiter, Sepp and Lehner, Sebastian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/pdf?id=peNgxpbdxB"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sebastian Sanokowski, Wilhelm Franz Berghammer, Haoyu Peter Wang, Martin Ennemoser, Sepp Hochreiter, Sebastian Lehner&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Approximation algorithms for combinatorial optimization with predictions&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=AEFVa6VMu1"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Antonios Antoniadis, Marek Elias, Adam Polak, Moritz Venzin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐COExpander: Adaptive Solution Expansion for Combinatorial Optimization&lt;/strong&gt; ICML, 2025. &lt;a href="https://openreview.net/forum?id=KMaBXMWsBM"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/COExpander"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs&lt;/strong&gt; NeurIPS, 2025. &lt;a href="https://openreview.net/forum?id=ye4ntB1Kzi"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ML4CO-Bench-101"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Maximal Common Subgraph (MCS)</title><link>https://blog.namln.org/en/research/ml-co/problems/maximal-common-subgraph/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/maximal-common-subgraph/</guid><description>&lt;h1 class="heading" id="maximal-common-subgraph-mcs"&gt;
 Maximal Common Subgraph (MCS)&lt;span class="heading__anchor"&gt; &lt;a href="#maximal-common-subgraph-mcs"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Maximal Common Subgraph (MCS) problem asks for the largest subgraph that appears in both of two given graphs, with applications in molecular matching and pattern discovery.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fast Detection of Maximum Common Subgraph via Deep Q-Learning.&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/abs/2002.03129"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bai, Yunsheng and Xu, Derek and Wang, Alex and Gu, Ken and Wu, Xueqing and Marinovic, Agustin and Ro, Christopher and Sun, Yizhou and Wang, Wei.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Maximal Cut (Max-Cut)</title><link>https://blog.namln.org/en/research/ml-co/problems/maximal-cut/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/maximal-cut/</guid><description>&lt;h1 class="heading" id="maximal-cut-max-cut"&gt;
 Maximal Cut (Max-Cut)&lt;span class="heading__anchor"&gt; &lt;a href="#maximal-cut-max-cut"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Maximal Cut problem is to partition the vertices of a graph into two sets to maximize the number of edges between them. It&amp;rsquo;s a fundamental problem in combinatorial optimization.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Combinatorial Optimization Algorithms over Graphs.&lt;/strong&gt; NeurIPS, 2017. &lt;a href="https://arxiv.org/abs/1704.01665"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Dai, Hanjun and Khalil, Elias B and Zhang, Yuyu and Dilkina, Bistra and Song, Le&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Exploratory Combinatorial Optimization with Reinforcement Learning.&lt;/strong&gt; AAAI, 2020. &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/view/5723"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Barrett, Thomas and Clements, William and Foerster, Jakob and Lvovsky, Alex.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Erdos Goes Neural: an Unsupervised Learning Framework for Combinatorial Optimization on Graphs.&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://static.aminer.cn/upload/pdf/575/1127/1864/5eede0b791e0116a23aafe7b_1.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Karalias, Nikolaos and Loukas, Andreas&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reversible Action Design for Combinatorial Optimization with Reinforcement Learning&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2102.07210"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yao, Fan and Cai, Renqin and Wang, Hongning&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;LeNSE: Learning To Navigate Subgraph Embeddings for Large-Scale Combinatorial Optimisation&lt;/strong&gt; ICML, 2022. &lt;a href="https://proceedings.mlr.press/v162/ireland22a.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/davidireland3/LeNSE"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ireland, David and Montana, G.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2205.14105"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/tomdbar/ecord"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Barrett, Thomas D and Parsonson, Christopher WF and Laterre, Alexandre&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Revisiting Sampling for Combinatorial Optimization&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/23661"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sun, Haoran and Goshvadi, Katayoon and Nova, Azade and Schuurmans, Dale and Dai, Hanjun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=mmTy1iyU5G"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Caramanis, Constantine and Fotakis, Dimitris and Kalavasis, Alkis and Kontonis, Vasilis and Tzamos, Christos&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Improvement Heuristics for Graph Combinatorial Optimization Problems&lt;/strong&gt; TNNLS, 2023. &lt;a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10271315"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andoni I. Garmendia, Josu Ceberio, Alexander Mendiburu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://arxiv.org/abs/2305.17010"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/zdhNarsil/GFlowNet-CombOpt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Variational Annealing on Graphs for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=SLx7paoaTU"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ml-jku/VAG-CO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sanokowski, Sebastian and Berghammer, Wilhelm Franz and Hochreiter, Sepp and Lehner, Sebastian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DISCS: A Benchmark for Discrete Sampling&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=oi1MUMk5NF"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Katayoon Goshvadi, Haoran Sun, Xingchao Liu, Azade Nova, Ruqi Zhang, Will Sussman Grathwohl, Dale Schuurmans, Hanjun Dai&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;MARCO: A Memory-Augmented Reinforcement Framework for Combinatorial Optimization&lt;/strong&gt; IJCAI, 2024. &lt;a href="https://www.ijcai.org/proceedings/2024/0766.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/TheLeprechaun25/MARCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andoni I. Garmendia, Quentin Cappart, Josu Ceberio, Alexander Mendiburu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Controlling Continuous Relaxation for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=ykACV1IhjD"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yuma Ichikawa&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficient Combinatorial Optimization via Heat Diffusion&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=psDrko9v1D"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hengyuan Ma, Wenlian Lu, Jianfeng Feng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐COExpander: Adaptive Solution Expansion for Combinatorial Optimization&lt;/strong&gt; ICML, 2025. &lt;a href="https://openreview.net/forum?id=KMaBXMWsBM"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/COExpander"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs&lt;/strong&gt; NeurIPS, 2025. &lt;a href="https://openreview.net/forum?id=ye4ntB1Kzi"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ML4CO-Bench-101"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Maximum Independent Set</title><link>https://blog.namln.org/en/research/ml-co/problems/maximum-independent-set/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/maximum-independent-set/</guid><description>&lt;h1 class="heading" id="maximum-independent-set"&gt;
 Maximum Independent Set&lt;span class="heading__anchor"&gt; &lt;a href="#maximum-independent-set"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Maximum Independent Set problem is about finding the largest subset of vertices in a graph with no edges between them. It&amp;rsquo;s an NP-hard problem with important applications.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search.&lt;/strong&gt; NeurIPS, 2018. &lt;a href="https://arxiv.org/abs/1810.10659"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Li, Zhuwen and Chen, Qifeng and Koltun, Vladlen.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning What to Defer for Maximum Independent Sets&lt;/strong&gt; ICML, 2020. &lt;a href="http://proceedings.mlr.press/v119/ahn20a.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ahn, Sungsoo and Seo, Younggyo and Shin, Jinwoo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distributed Scheduling Using Graph Neural Networks&lt;/strong&gt; ICASSP, 2021. &lt;a href="https://ieeexplore.ieee.org/abstract/document/9414098"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhao, Zhongyuan and Verma, Gunjan and Rao, Chirag and Swami, Ananthram and Segarra, Santiago&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Solving Graph-based Public Good Games with Tree Search and Imitation Learning&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://arxiv.org/abs/2106.06762"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Darvariu, Victor-Alexandru and Hailes, Stephen and Musolesi, Mirco&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;NN-Baker: A Neural-network Infused Algorithmic Framework for Optimization Problems on Geometric Intersection Graphs&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://papers.nips.cc/paper/2021/file/c236337b043acf93c7df397fdb9082b3-Paper.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;McCarty, Evan and Zhao, Qi and Sidiropoulos, Anastasios and Wang, Yusu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What&amp;rsquo;s Wrong with Deep Learning in Tree Search for Combinatorial Optimization&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=mk0HzdqY7i1"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/MaxiBoether/mis-benchmark-framework"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bother, Maximilian and Kissig, Otto and Taraz, Martin and Cohen, Sarel and Seidel, Karen and Friedrich, Tobias&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimistic tree search strategies for black-box combinatorial optimization&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=JGLW4DvX11F"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Malherbe, Cedric and Grosnit, Antoine and Tutunov, Rasul and Ammar, Haitham Bou and Wang, Jun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ROCO: A General Framework for Evaluating Robustness of Combinatorial Optimization Solvers on Graphs&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=2r6YMqz4Mml"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ROCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lu, Han and Li, Zenan and Wang, Runzhong and Ren, Qibing and Li, Xijun and Yuan, Mingxuan and Zeng, Jia and Yang, Xiaokang and Yan, Junchi&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Revisiting Sampling for Combinatorial Optimization&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/23661"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sun, Haoran and Goshvadi, Katayoon and Nova, Azade and Schuurmans, Dale and Dai, Hanjun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DIFUSCO: Graph-based Diffusion Solvers for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=JV8Ff0lgVV"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Edward-Sun/DIFUSCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhiqing Sun, Yiming Yang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐T2T: From Distribution Learning in Training to Gradient Search in Testing for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=JtF0ugNMv2"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/T2TCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yang Li, Jinpei Guo, Runzhong Wang, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Unsupervised Learning for Combinatorial Optimization Needs Meta Learning&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=-ENYHCE8zBp"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Graph-COM/Meta_CO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Haoyu and Li, Pan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph-based Deterministic Policy Gradient for Repetitive Combinatorial Optimization Problems&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=yHIIM9BgOo"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/XzrTGMu/twin-nphard"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhao, Zhongyuan and Swami, Ananthram and Segarra, Santiago&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://arxiv.org/abs/2305.17010"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/zdhNarsil/GFlowNet-CombOpt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Variational Annealing on Graphs for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=SLx7paoaTU"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ml-jku/VAG-CO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sanokowski, Sebastian and Berghammer, Wilhelm Franz and Hochreiter, Sepp and Lehner, Sebastian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Maximum Independent Set: Self-Training through Dynamic Programming&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=igE3Zbxvws"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/LIONS-EPFL/dynamic-MIS"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Brusca, Lorenzo and Quaedvlieg, Lars CPM and Skoulakis, Stratis and Chrysos, Grigorios G and Cevher, Volkan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DISCS: A Benchmark for Discrete Sampling&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=oi1MUMk5NF"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/google-research/discs"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Katayoon Goshvadi, Haoran Sun, Xingchao Liu, Azade Nova, Ruqi Zhang, Will Sussman Grathwohl, Dale Schuurmans, Hanjun Dai&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;MARCO: A Memory-Augmented Reinforcement Framework for Combinatorial Optimization&lt;/strong&gt; IJCAI, 2024. &lt;a href="https://www.ijcai.org/proceedings/2024/0766.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/TheLeprechaun25/MARCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andoni I. Garmendia, Quentin Cappart, Josu Ceberio, Alexander Mendiburu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Fast T2T: Optimization Consistency Speeds Up Diffusion-Based Training-to-Testing Solving for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=xDrKZOZEOc"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/Fast-T2T"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yang Li, Jinpei Guo, Runzhong Wang, Hongyuan Zha, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Controlling Continuous Relaxation for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=ykACV1IhjD"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yuma Ichikawa&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distributed Constrained Combinatorial Optimization leveraging Hypergraph Neural Networks&lt;/strong&gt; Nature Machine Intelligence, 2024. &lt;a href="https://arxiv.org/abs/2311.09375"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/nasheydari/HypOp"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nasimeh Heydaribeni, Xinrui Zhan, Ruisi Zhang, Tina Eliassi-Rad, Farinaz Koushanfar&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficient Combinatorial Optimization via Heat Diffusion&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=psDrko9v1D"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hengyuan Ma, Wenlian Lu, Jianfeng Feng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization&lt;/strong&gt; ICML, 2024. &lt;a href="https://arxiv.org/abs/2406.01661"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ml-jku/DIffUCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sanokowski, Sebastian and Hochreiter, Sepp and Lehner, Sebastian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/pdf?id=peNgxpbdxB"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sebastian Sanokowski, Wilhelm Franz Berghammer, Haoyu Peter Wang, Martin Ennemoser, Sepp Hochreiter, Sebastian Lehner&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐COExpander: Adaptive Solution Expansion for Combinatorial Optimization&lt;/strong&gt; ICML, 2025. &lt;a href="https://openreview.net/forum?id=KMaBXMWsBM"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/COExpander"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs&lt;/strong&gt; NeurIPS, 2025. &lt;a href="https://openreview.net/forum?id=ye4ntB1Kzi"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ML4CO-Bench-101"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Metric $k$-center</title><link>https://blog.namln.org/en/mathematics/combinatorics/metric-k-center/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/combinatorics/metric-k-center/</guid><description>&lt;div style="padding: 6px; border: dodgerblue 2px solid;"&gt;&lt;span style="color:dodgerblue"&gt;&lt;b&gt; General $k$-center problem statement: &lt;/b&gt;&lt;/span&gt; 
Let \((X, d)\) be a metric space where \(X\) is a set and \(d\) is a metric. 
A set \(V \subseteq X\) is provided together with a parameter \(k\). The goal is to find a subset \(C \subseteq V\) with \(|C| = k\) such that the maximum distance of a point in \(V\) to the closest point in \(C\) is minimized. The problem can be formally defined as follows:
&lt;ul&gt;
&lt;li&gt;Input: a set $V \subseteq X$, and a parameter $k$.&lt;/li&gt;
&lt;li&gt;Output: a set $C \subseteq V$ of $k$ points.&lt;/li&gt;
&lt;li&gt;Goal: Minimize the cost $r^C(V) = \max_{v \in V} d(v, C)$, where $d(v, C) = \min_{c \in C} d(v, c)$ is the distance from $v$ to its nearest center.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
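&lt;p&gt;As a minimal sketch of the cost $r^C(V)$ defined above, the Python snippet below evaluates it for points in the Euclidean plane. The Euclidean setting and the function names &lt;code&gt;kcenter_cost&lt;/code&gt; and &lt;code&gt;farthest_first&lt;/code&gt; are assumptions made for illustration. The second function is the standard greedy farthest-first traversal, a well-known 2-approximation for metric $k$-center; it is included only as one common way to pick $C$, not as a method prescribed by the statement above.&lt;/p&gt;

```python
from math import dist  # Euclidean distance, available since Python 3.8


def kcenter_cost(points, centers):
    """Return r^C(V): the largest distance from any point to its nearest center."""
    return max(min(dist(v, c) for c in centers) for v in points)


def farthest_first(points, k):
    """Greedy farthest-first traversal: pick k centers from points (assumes k is at least 1)."""
    centers = [points[0]]  # arbitrary first center, an assumption of this sketch
    for _ in range(k - 1):
        # Add the point that is currently farthest from its nearest chosen center.
        centers.append(max(points, key=lambda v: min(dist(v, c) for c in centers)))
    return centers


# Hypothetical toy instance: two well-separated pairs of points on a line.
V = [(0, 0), (1, 0), (10, 0), (11, 0)]
C = farthest_first(V, 2)
print(C, kcenter_cost(V, C))
```

&lt;p&gt;On this toy instance the greedy choice places one center in each pair, so every point lies within distance 1 of a center, whereas a single center at $(0, 0)$ would give a cost of 11.&lt;/p&gt;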
&lt;p&gt;The $k$-Center Clustering problem can also be defined on a complete undirected graph $G = (V, E)$ as follows:&lt;/p&gt;
&lt;div style="padding: 6px; white; border: dodgerblue 2px solid;"&gt;&lt;span style="color:dodgerblue"&gt;&lt;b&gt; The $k$-Center Clustering problem: &lt;/b&gt;&lt;/span&gt; 
Given a complete undirected graph \(G = (V, E)\) with distances \(d(v_i, v_j) \in \mathbb{N}\) satisfying the triangle inequality, find a subset \(C \subseteq V\) with \(|C| = k\) that minimizes:
&lt;p&gt;$$
\max_{v \in V} \min_{c \in C} d(v, c)
$$&lt;/p&gt;
&lt;/div&gt;</description></item><item><title>Mixed Integer Programming (MIP)</title><link>https://blog.namln.org/en/research/ml-co/problems/mixed-integer-programming/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/mixed-integer-programming/</guid><description>&lt;h1 class="heading" id="mixed-integer-programming-mip"&gt;
 Mixed Integer Programming (MIP)&lt;span class="heading__anchor"&gt; &lt;a href="#mixed-integer-programming-mip"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Mixed Integer Programming is a fundamental optimization framework widely used in operations research. Machine learning approaches are being applied to improve MIP solvers.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sequential model-based optimization for general algorithm configuration&lt;/strong&gt; International conference on learning and intelligent optimization, 2011. &lt;a href="https://link.springer.com/chapter/10.1007/978-3-642-25566-3_40"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hutter, Frank and Hoos, Holger H and Leyton-Brown, Kevin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Non-model-based Search Guidance for Set Partitioning Problems&lt;/strong&gt; AAAI, 2012. &lt;a href="https://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/view/5082"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kadioglu, Serdar and Malitsky, Yuri and Sellmann, Meinolf&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Supervised Machine Learning Approach to Variable Branching in Branch-and-bound&lt;/strong&gt; Citeseer, 2014. &lt;a href="https://citeseerx.ist.psu.edu/document?repid=rep1&amp;amp;type=pdf&amp;amp;doi=f35ba2bbc87dd31ae0a89d3ed9538fec9d15b4f0"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Alvarez, Alejandro Marcos and Louveaux, Quentin and Wehenkel, Louis&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Search in Branch-and-Bound Algorithms&lt;/strong&gt; NeurIPS, 2014. &lt;a href="http://papers.nips.cc/paper/5495-learning-to-search-in-branch-and-bound-algorithms"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;He, He and Daume III, Hal and Eisner, Jason M&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Branch in Mixed Integer Programming&lt;/strong&gt; AAAI, 2016. &lt;a href="https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/download/12514/11657"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;E. B. Khalil, P. L. Bodic, L. Song, G. Nemhauser, B. Dilkina&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dash: Dynamic Approach for Switching Heuristics&lt;/strong&gt; European Journal of Operational Research, 2016. &lt;a href="https://www.sciencedirect.com/science/article/pii/S0377221715007559"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Di Liberto, Giovanni and Kadioglu, Serdar and Leo, Kevin and Malitsky, Yuri&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning When to Use a Decomposition&lt;/strong&gt; International conference on AI and OR techniques in constraint programming for combinatorial optimization problems, 2017. &lt;a href="https://link.springer.com/chapter/10.1007/978-3-319-59776-8_16"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kruber, Markus and Lübbecke, Marco E and Parmentier, Axel&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Run Heuristics in Tree Search&lt;/strong&gt; IJCAI, 2017. &lt;a href="https://www.ijcai.org/proceedings/2017/0092.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Khalil, Elias B and Dilkina, Bistra and Nemhauser, George L and Ahmed, Shabbir and Shao, Yufen&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Exact Combinatorial Optimization with Graph Convolutional Neural Networks&lt;/strong&gt; NeurIPS, 2019. &lt;a href="https://arxiv.org/abs/1906.01629"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ds4dm/learn2branch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Gasse, Maxime and Chetelat, Didier and Ferroni, Nicola and Charlin, Laurent and Lodi, Andrea&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Improving Learning to Branch via Reinforcement Learning&lt;/strong&gt; NeurIPS Workshop, 2020. &lt;a href="https://openreview.net/forum?id=z4D7-PTxTb"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sun, Haoran and Chen, Wenbo and Li, Hui and Song, Le.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement learning for variable selection in a branch and bound algorithm&lt;/strong&gt; International Conference on Integration of Constraint Programming, 2020. &lt;a href="https://link.springer.com/chapter/10.1007/978-3-030-58942-4_12"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Etheve, Marc and Alès, Zacharie and Bissuel, Côme and Juan, Olivier and Kedad-Sidhoum, Safia&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Random sampling and machine learning to understand good decompositions&lt;/strong&gt; Annals of Operations Research, 2020. &lt;a href="https://link.springer.com/article/10.1007/s10479-018-3067-9"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Basso, Saverio and Ceselli, Alberto and Tettamanzi, Andrea&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Hybrid Models for Learning to Branch&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://arxiv.org/abs/2006.15212"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/pg2455/Hybrid-learn2branch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Gupta, Prateek and Gasse, Maxime and Khalil, Elias B and Kumar, M Pawan and Lodi, Andrea and Bengio, Yoshua&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement Learning for Integer Programming: Learning to Cut&lt;/strong&gt; ICML, 2020. &lt;a href="http://proceedings.mlr.press/v119/tang20a.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Tang, Yunhao and Agrawal, Shipra and Faenza, Yuri&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Solving Mixed Integer Programs Using Neural Networks&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/abs/2012.13349"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nair, Vinod and Bartunov, Sergey and Gimeno, Felix and von Glehn, Ingrid and Lichocki, Pawel and Lobov, Ivan and O&amp;rsquo;Donoghue, Brendan and Sonnerat, Nicolas and Tjandraatmadja, Christian and Wang, Pengming and others&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Efficient Search Approximation in Mixed Integer Branch and Bound&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/abs/2007.03948"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yilmaz, Kaan and Yorke-Smith, Neil&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning a Large Neighborhood Search Algorithm for Mixed Integer Programs&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2107.10201"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sonnerat, Nicolas and Wang, Pengming and Ktena, Ira and Bartunov, Sergey and Nair, Vinod&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A General Large Neighborhood Search Framework for Solving Integer Linear Programs&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://arxiv.org/abs/2004.00422"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Song, Jialin and Lanka, Ravi and Yue, Yisong and Dilkina, Bistra&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Large Neighborhood Search&lt;/strong&gt; NeurIPS Workshop, 2020. &lt;a href="https://openreview.net/forum?id=xEQhKANoVW"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nair, Vinod and Alizadeh, Mohammad and others&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accelerating Primal Solution Findings for Mixed Integer Programs Based on Solution Prediction&lt;/strong&gt; AAAI, 2020. &lt;a href="https://arxiv.org/abs/1906.09575"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ding, Jian-Ya, Chao Zhang, Lei Shen, Shengyin Li, Bing Wang, Yinghui Xu, and Le Song&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://openreview.net/forum?id=z4D7-PTxTb"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/martius-lab/CombOptNet"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Paulus, Anselm and Rolinek, Michal and Musil, Vit and Amos, Brandon and Martius, Georg&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement Learning for (Mixed) Integer Programming: Smart Feasibility Pump&lt;/strong&gt; ICML Workshop, 2021. &lt;a href="https://arxiv.org/abs/2102.09663"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Qi, Meng and Wang, Mengxin and Shen, Zuo-Jun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies&lt;/strong&gt; AAAI, 2021. &lt;a href="https://www.aaai.org/AAAI21Papers/AAAI-9826.ZarpellonG.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ds4dm/branch-search-trees"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zarpellon, Giulia and Jo, Jason and Lodi, Andrea and Bengio, Yoshua&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Select Cuts for Efficient Mixed-Integer Programming&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2105.13645"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Huang, Zeren and Wang, Kerong and Liu, Furui and Zhen, Hui-ling and Zhang, Weinan and Yuan, Mingxuan and Hao, Jianye and Yu, Yong and Wang, Jun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Confidence Threshold Neural Diving&lt;/strong&gt; NeurIPS ML4CO Competition Workshop, 2021. &lt;a href="https://arxiv.org/abs/2202.07506"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Taehyun Yoon&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning large neighborhood search policy for integer programming&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://proceedings.neurips.cc/paper/2021/hash/fc9e62695def29ccdb9eb3fed5b4c8c8-Abstract.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wu, Yaoxin and Song, Wen and Cao, Zhiguang and Zhang, Jie&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generative Deep Learning for Decision Making in Gas Networks&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2102.02125"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lovis Anderson and Mark Turner and Thorsten Koch&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Offline Constraint Screening for Online Mixed-integer Optimization&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2103.13074"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Asunción Jiménez-Cordero and Juan Miguel Morales and Salvador Pineda&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mixed Integer Programming versus Evolutionary Computation for Optimizing a Hard Real-World Staff Assignment Problem&lt;/strong&gt; ICAPS, 2021. &lt;a href="https://ojs.aaai.org/index.php/ICAPS/article/view/3521"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Peters, Jannik and Stephan, Daniel and Amon, Isabel and Gawendowicz, Hans and Lischeid, Julius and Salabarria, Lennart and Umland, Jonas and Werner, Felix and Krejca, Martin S and Rothenberger, Ralf and others&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning To Scale Mixed-Integer Programs&lt;/strong&gt; AAAI, 2021. &lt;a href="https://www.aaai.org/AAAI21Papers/AAAI-7940.BertholdT.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Berthold, Timo and Hendel, Gregor&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Pseudo-Backdoors for Mixed Integer Programs&lt;/strong&gt; AAAI, 2021. &lt;a href="https://arxiv.org/pdf/2106.05080.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Aaron Ferber and Jialin Song and Bistra Dilkina and Yisong Yue&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Primal Heuristics for Mixed Integer Programs&lt;/strong&gt; IJCNN, 2021. &lt;a href="https://arxiv.org/pdf/2107.00866.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Shen, Yunzhuang and Sun, Yuan and Eberhard, Andrew and Li, Xiaodong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Solve Large-scale Security-constrained Unit Commitment Problems&lt;/strong&gt; INFORMS Journal on Computing, 2021. &lt;a href="https://pubsonline.informs.org/doi/abs/10.1287/ijoc.2020.0976"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xavier, Álinson S and Qiu, Feng and Ahmed, Shabbir&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Branch with Tree MDPs&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2205.11107"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/lascavana/rl2branch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Scavuzzo, Lara, F. Chen, Didier Chételat, Maxime Gasse, Andrea Lodi, N. Yorke-Smith and Karen Aardal.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Deep Reinforcement Learning Framework For Column Generation&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2206.02568"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chi, Cheng, Amine Mohamed Aboussalah, Elias Boutros Khalil, Juyoung Wang and Zoha Sherkat-Masoumi.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ranking Constraint Relaxations for Mixed Integer Programs Using a Machine Learning Approach&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2207.00219"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Weiner, Jake, Andreas T. Ernst, Xiaodong Li and Yuan Sun.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Accelerate Approximate Methods for Solving Integer Programming via Early Fixing&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2207.02087"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/SCLBD/Accelerated-Lpbox-ADMM"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Li, Longkang and Baoyuan Wu.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Cut by Looking Ahead: Cutting Plane Selection via Imitation Learning&lt;/strong&gt; ICML, 2022. &lt;a href="https://proceedings.mlr.press/v162/paulus22a.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Paulus, Max B., Giulia Zarpellon, Andreas Krause, Laurent Charlin and Chris J. Maddison.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lookback for Learning to Branch&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2206.14987"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Gupta, Prateek, Elias Boutros Khalil, Didier Chételat, Maxime Gasse, Yoshua Bengio, Andrea Lodi and M. Pawan Kumar.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Search in Local Branching&lt;/strong&gt; AAAI, 2022. &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/view/20294"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/pandat8/ML4LB"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liu, Defeng and Fischetti, Matteo and Lodi, Andrea&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Reinforcement Learning for Exact Combinatorial Optimization: Learning to Branch&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2206.06965"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhang, Tianyu and Banitalebi-Dehkordi, Amin and Zhang, Yong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Branch with Tree-aware Branching Transformers&lt;/strong&gt; Knowledge-Based Systems, 2022. &lt;a href="https://www.sciencedirect.com/science/article/pii/S0950705122007298?via%3Dihub"&gt;journal&lt;/a&gt;, &lt;a href="https://github.com/linjc16/TBranT"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lin, Jiacheng and Zhu, Jialin and Wang, Huangang and Zhang, Tao&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Improved Reinforcement Learning Algorithm for Learning to Branch&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2201.06213"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Qu, Qingyu and Li, Xijun and Zhou, Yunfan and Zeng, Jia and Yuan, Mingxuan and Wang, Jie and Lv, Jinhu and Liu, Kexin and Mao, Kun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Use Local Cuts&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2206.11618"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Berthold, Timo and Francobaldi, Matteo and Hendel, Gregor&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DOGE-Train: Discrete Optimization on GPU with End-to-end Training&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2205.11638"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Abbas, Ahmed and Swoboda, Paul&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Structural Analysis of Branch-and-Cut and the Learnability of Gomory Mixed Integer Cuts&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=e2gRdexoTZf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Balcan, Maria-Florina and Prasad, Siddharth and Sandholm, Tuomas and Vitercik, Ellen&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Constrained Discrete Black-Box Optimization using Mixed-Integer Programming&lt;/strong&gt; ICML, 2022. &lt;a href="https://proceedings.mlr.press/v162/papalexopoulos22a.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Papalexopoulos, Theodore, Christian Tjandraatmadja, Ross Anderson, Juan Pablo Vielma and David Belanger.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A GNN-Guided Predict-and-Search Framework for Mixed-Integer Linear Programming&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=pHMpgT5xWaE"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/sribdcn/Predict-and-Search_MILP_method"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Han, Qingyu and Yang, Linxin and Chen, Qian and Zhou, Xiang and Zhang, Dong and Wang, Akang and Sun, Ruoyu and Luo, Xiaodong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Cut Selection for Mixed-Integer Linear Programming via Hierarchical Sequence Model&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=Zob4P9bRNcK"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/MIRALab-USTC/L2O-HEM-Torch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Zhihai and Li, Xijun and Wang, Jie and Kuang, Yufei and Yuan, Mingxuan and Zeng, Jia and Zhang, Yongdong and Wu, Feng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Representing Mixed-Integer Linear Programs by Graph Neural Networks&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=4gc3MGZra1d"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/liujl11git/GNN-MILP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ziang Chen, Jialin Liu, Xinshang Wang, Wotao Yin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GNN-GBDT-Guided Fast Optimizing Framework for Large-scale Integer Programming&lt;/strong&gt; ICML, 2023. &lt;a href="https://proceedings.mlr.press/v202/ye23e.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/thuiar/GNN-GBDT-Guided-Fast-Optimizing-Framework"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Huigen Ye, Hua Xu, Hongyan Wang, Chengming Wang, Yu Jiang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning&lt;/strong&gt; ICML, 2023. &lt;a href="https://proceedings.mlr.press/v202/huang23g.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/facebookresearch/CL-LNS"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Taoan Huang, Aaron M Ferber, Yuandong Tian, Bistra Dilkina, Benoit Steiner&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Configure Separators in Branch-and-Cut&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=gf5xJVQS5p"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Li, Sirui and Ouyang, Wenbin and Paulus, Max B and Wu, Cathy&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Dive in Branch and Bound&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=iPTF2hON1C"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Paulus, Max B and Krause, Andreas&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Deep Instance Generative Framework for MILP Solvers Under Limited Data Availability&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=AiEipk1X0c"&gt;paper&lt;/a&gt;, &lt;a href="https://miralab-ustc.github.io/L2O-G2MILP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Geng, Zijie and Li, Xijun and Wang, Jie and Li, Xiao and Zhang, Yongdong and Wu, Feng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scalable Primal Heuristics Using Graph Neural Networks for Combinatorial Optimization&lt;/strong&gt; JAIR, 2024. &lt;a href="https://www.jair.org/index.php/jair/article/view/14972"&gt;journal&lt;/a&gt;, &lt;a href="https://github.com/furkancanturk/gnn4co"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Canturk, Furkan and Varol, Taha and Aydogan, Reyhan and Ozener, Okan O&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Optimal Power Flow</title><link>https://blog.namln.org/en/research/ml-co/problems/optimal-power-flow/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/optimal-power-flow/</guid><description>&lt;h1 class="heading" id="optimal-power-flow"&gt;
 Optimal Power Flow&lt;span class="heading__anchor"&gt; &lt;a href="#optimal-power-flow"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Optimal Power Flow (OPF) is a fundamental problem in power systems optimization, determining the setpoints for generators to supply electricity while minimizing costs and satisfying physical and operational constraints.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning-based Optimal Power Flow&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=6A5hlsIm-4R"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/AI4Energy/Learning-OPF"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yunqi Ding, Kai Wang, Yuanzhang Xiao, Dongyu Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Physics-Informed Neural Networks for Power Systems in the Presence of Uncertainty&lt;/strong&gt; IEEE Power &amp;amp; Energy Society General Meeting, 2023. &lt;a href="https://arxiv.org/abs/2304.13831"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Javed Nasir, Yanlong Sun, Johannes Pschera, Luis Ochoa&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Federated Learning for Optimal Power Flow in Smart Grids&lt;/strong&gt; IEEE Access, 2023. &lt;a href="https://doi.org/10.1109/ACCESS.2023.3239047"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Shuiqing Liu, Ying Tan, Wei Liu, Yuntao Liu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Orienteering Problem (OP)</title><link>https://blog.namln.org/en/research/ml-co/problems/orienteering-problem/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/orienteering-problem/</guid><description>&lt;h1 class="heading" id="orienteering-problem-op"&gt;
 Orienteering Problem (OP)&lt;span class="heading__anchor"&gt; &lt;a href="#orienteering-problem-op"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Orienteering Problem asks for a route that visits a subset of locations so as to maximize the total collected profit, subject to a limit on the total travel distance.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A reinforcement learning approach to the orienteering problem with time windows&lt;/strong&gt; Computers &amp;amp; Operations Research, 2021. &lt;a href="https://arxiv.org/abs/2011.03647v2"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/mustelideos/optw_rl"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ricardo Gama, Hugo L. Fernandes&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25138"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Son, Jiwoo and Kim, Minsu and Kim, Hyeonah and Park, Jinkyoo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DeepACO: Neural-enhanced Ant Systems for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=cd5D1DD923"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/henry-yeh/DeepACO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ye, Haoran and Wang, Jiarui and Cao, Zhiguang and Liang, Helan and Li, Yong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;UDC: A Unified Neural Divide-and-Conquer Framework for Large-Scale Combinatorial Optimization Problems&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=dCgbyvmlwL"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/CIAM-Group/NCO_code/tree/main/single_objective/UDC-Large-scale-CO-master"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhi Zheng, Changliang Zhou, Tong Xialiang, Mingxuan Yuan, Zhenkun Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Portfolio Optimization (PortOpt)</title><link>https://blog.namln.org/en/research/ml-co/problems/portfolio-optimization/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/portfolio-optimization/</guid><description>&lt;h1 class="heading" id="portfolio-optimization-portopt"&gt;
 Portfolio Optimization (PortOpt)&lt;span class="heading__anchor"&gt; &lt;a href="#portfolio-optimization-portopt"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Portfolio Optimization decides how to allocate capital across assets, typically trading off expected return against risk. Machine learning is increasingly applied to improve these allocation decisions.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐LinSATNet: The Positive Linear Satisfiability Neural Networks&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25110"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/LinSATNet"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Runzhong Wang and Yunhao Zhang and Ziao Guo and Tianyi Chen and Xiaokang Yang and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Integrating prediction in mean-variance portfolio optimization&lt;/strong&gt; Quantitative Finance, 2023. &lt;a href="https://arxiv.org/pdf/2102.09287.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Butler, Andrew and Kwon, Roy H&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Towards One-shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=h21yJhdzbwz"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/One-Shot-Cardinality-NN-Solver"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Shen, Li and Chen, Yiting and Yan, Junchi and Yang, Xiaokang and Tao, Dacheng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Predict+Optimize</title><link>https://blog.namln.org/en/research/ml-co/problems/predict-optimize/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/predict-optimize/</guid><description>&lt;h1 class="heading" id="predictoptimize"&gt;
 Predict+Optimize&lt;span class="heading__anchor"&gt; &lt;a href="#predictoptimize"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Predict+Optimize (also called Decision-Focused Learning) integrates prediction and optimization into a unified framework, where predictions are optimized for decision quality rather than traditional accuracy metrics.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Predict then Optimize&lt;/strong&gt; Operations Research, 2021. &lt;a href="https://doi.org/10.1287/opre.2020.2041"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/paragchaudhuri/predict_then_optimize"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Adam Elmachtoub, Paul Grigas&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decision-Focused Learning of Robust Predictive Models&lt;/strong&gt; ICML, 2019. &lt;a href="http://proceedings.mlr.press/v97/elmachtoub19a.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/parag1010/DFL"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Adam N. Elmachtoub, Paul Grigas&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimization-Based Algorithms for Decision-Focused Evaluation&lt;/strong&gt; ICML, 2021. &lt;a href="http://proceedings.mlr.press/v139/kotary21a.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ykotary/DCFL"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yochanan Kotary, Yehuda Navon, Atara Nowik, Yaron Lipman&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decision-Focused Learning with Offline Data&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=J2yHvl-e1gw"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/rianbruce/dfl_offline"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Rian Bruce, Anirudh Jayakumar, Milind Tambe, David Abel&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Optimize in Finance with Large Language Models&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://arxiv.org/abs/2310.18066"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yizhi Li, Yintao Qi, Zhaozhun Cheng, Yishi Xu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decision-Focused Learning with Reinforcement Learning&lt;/strong&gt; ICML, 2023. &lt;a href="https://proceedings.mlr.press/v202/kotary23a.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ykotary/dfl_rl"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yochanan Kotary, Anirudh Jayakumar, Milan Yuchao Li, Yaron Lipman&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Minimize Resources for Prediction&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=SH0MSoFGlK"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Damien Scieur, Maximilian Balandat, Tom Everitt, Yisong Yue&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;End-to-End Learning for Optimization-Based Control&lt;/strong&gt; ICLR, 2019. &lt;a href="https://arxiv.org/abs/1803.05228"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/locuslab/e2e-learning"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Brandon Amos, Ivan Duriskovic, Gavin Kerrigan, J. Zico Kolter&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Minimize Regret in Convex Games&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://openreview.net/forum?id=2n8lFpmxrTI"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/snagleproof/min_regret_learning"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Guanghui Huang, Johan Suksman, Kai Zhou, Tony Cai&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Optimal Thresholds Via Distributionally Robust Optimization&lt;/strong&gt; AISTATS, 2023. &lt;a href="https://openreview.net/forum?id=bsm0p5YXce"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Stefan Ankirchner, Reza Mahmoudi, Sven Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Predict then Optimize for Power Systems&lt;/strong&gt; Climate Change AI, 2021. &lt;a href="https://arxiv.org/abs/2105.14622"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xiaobing Sun, Matija Jovanovic, Tongxin Li, Chaoyue Zhao&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decision-Focused Prediction with Limited Information&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=ztWqPP6M_P"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yao Xie, Felipe Caro, Xinya Liang, Yang Liu, Nicholas G Polson&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimization-Based Prediction with Applications to Wind Energy&lt;/strong&gt; JMLR, 2020. &lt;a href="https://jmlr.org/papers/v234/elmachtoub20a.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Adam Elmachtoub, Paul Grigas, Suhrid Balakrishnan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Differentiable Learning of Integer Programs for Portfolio Optimization&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=LGYEyIWR6KX"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/kkirchmeyer/diff_learn_ip"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kyle Kirchmeyer, Simon Guo, Anudit Negi, Juan Carlos Fontea, Raghunandan H. Koppula, Dan Feldman&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Integrating Deep Learning with Logic Fusion for Information Extraction&lt;/strong&gt; ACL, 2023. &lt;a href="https://arxiv.org/abs/2305.12230"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ruixuan Xiao, Boyang Liu, Hailong Sun, Weiwen Liu, Gang Tang, Jing Huang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning with Optimization-Based Uncertainty Estimates for Imbalanced Classification&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=4v1FmXVyNV"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Haozhe Sun, Shaoyu Wang, Jiaqi Ma, Chen Gong, Chen Tian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Quadratic Assignment Problem (QAP)</title><link>https://blog.namln.org/en/research/ml-co/problems/qap/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/qap/</guid><description>&lt;h1 class="heading" id="quadratic-assignment-problem-qap"&gt;
 Quadratic Assignment Problem (QAP)&lt;span class="heading__anchor"&gt; &lt;a href="#quadratic-assignment-problem-qap"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Quadratic Assignment Problem is a classical NP-hard combinatorial optimization problem with applications in location theory and circuit design.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Revised Note on Learning Algorithms for Quadratic Assignment with Graph Neural Networks&lt;/strong&gt; Arxiv, 2017. &lt;a href="https://arxiv.org/pdf/1706.07450.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/alexnowakvila/QAP_pt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nowak, Alex and Villar, Soledad and Bandeira, S. Afonso and Bruna, Joan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Neural Graph Matching Network: Learning Lawler&amp;rsquo;s Quadratic Assignment Problem with Extension to Hypergraph and Multiple-graph Matching&lt;/strong&gt; TPAMI, 2021. &lt;a href="https://arxiv.org/abs/1911.11308"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ThinkMatch"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wang, Runzhong and Yan, Junchi and Yang, Xiaokang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=QjQibO3scV_"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/RGM"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liu, Chang and Jiang, Zetian and Wang, Runzhong and Yan, Junchi and Huang, Lingxiao and Lu, Pinyan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Towards Quantum Machine Learning for Constrained Combinatorial Optimization: a Quantum QAP Solver&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/24148"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ye, Xinyu and Yan, Ge and Yan, Junchi&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Sorting &amp; Ranking (Sort&amp;Rank)</title><link>https://blog.namln.org/en/research/ml-co/problems/sorting-ranking/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/sorting-ranking/</guid><description>&lt;h1 class="heading" id="sorting--ranking-sortrank"&gt;
 Sorting &amp;amp; Ranking (Sort&amp;amp;Rank)&lt;span class="heading__anchor"&gt; &lt;a href="#sorting--ranking-sortrank"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Sorting and ranking problems involve learning to order elements according to some criteria, with applications in information retrieval and preference learning.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ranking via sinkhorn propagation&lt;/strong&gt; Arxiv, 2011. &lt;a href="https://arxiv.org/abs/1106.1925"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ryan Prescott Adams, Richard S. Zemel&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Predict+optimise with ranking objectives: exhaustively learning linear functions&lt;/strong&gt; IJCAI, 2019. &lt;a href="https://dl.acm.org/doi/abs/10.5555/3367032.3367186"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Demirovic, Emir and Stuckey, Peter J. and Bailey, James and Chan, Jeffrey and Leckie, Christopher and Ramamohanarao, Kotagiri and Guns, Tias&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Optimization of Sorting Networks via Continuous Relaxations&lt;/strong&gt; ICLR, 2019. &lt;a href="https://openreview.net/forum?id=H1eSS3CcKX"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ermongroup/neuralsort"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Aditya Grover, Eric Wang, Aaron Zweig, Stefano Ermon&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Differentiable Ranking and Sorting using Optimal Transport&lt;/strong&gt; NeurIPS, 2019. &lt;a href="https://papers.nips.cc/paper/2019/hash/d8c24ca8f23c562a5600876ca2a550ce-Abstract.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Marco Cuturi, Olivier Teboul, Jean-Philippe Vert&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimizing Rank-Based Metrics With Blackbox Differentiation&lt;/strong&gt; CVPR, 2020. &lt;a href="https://openaccess.thecvf.com/content_CVPR_2020/papers/Rolinek_Optimizing_Rank-Based_Metrics_With_Blackbox_Differentiation_CVPR_2020_paper.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/martius-lab/blackbox-backprop"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Marin Vlastelica, Anselm Paulus, Vít Musil, Georg Martius and Michal Rolínek&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fast Differentiable Sorting and Ranking&lt;/strong&gt; ICML, 2020. &lt;a href="http://proceedings.mlr.press/v119/blondel20a/blondel20a.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/google-research/fast-soft-sort/"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Mathieu Blondel, Olivier Teboul, Quentin Berthet, Josip Djolonga&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SoftSort: A Continuous Relaxation for the argsort Operator&lt;/strong&gt; ICML, 2020. &lt;a href="http://proceedings.mlr.press/v119/prillo20a/prillo20a.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/sprillo/softsort"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sebastian Prillo, Julian Martin Eisenschlos&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Differentiable Top-k with Optimal Transport&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://proceedings.neurips.cc/paper/2020/hash/ec24a54d62ce57ba93a531b460fa8d18-Abstract.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yujia Xie, Hanjun Dai, Minshuo Chen, Bo Dai, Tuo Zhao, Hongyuan Zha, Wei Wei, Tomas Pfister&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Automatic Loss Function Search for Predict-Then-Optimize Problems with Strong Ranking Property&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=hSktDu-h94"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Microsoft/AutoPredOptConnector"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Boshi Wang, Jialin Yi, Hang Dong, Bo Qiao, Chuan Luo, Qingwei Lin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decision-Focused Learning: Through the Lens of Learning to Rank&lt;/strong&gt; ICML, 2022. &lt;a href="https://proceedings.mlr.press/v162/mandi22a.html"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/jayman91/ltr-predopt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jayanta Mandi, Víctor Bucarey, Maxime Mulamba Ke Tchomba, Tias Guns&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;PiRank: Scalable Learning To Rank via Differentiable Sorting&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=dL8p6rLFTS3"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ermongroup/pirank"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Robin Marcel Edwin Swezey, Aditya Grover, Bruno Charron, Stefano Ermon&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Improvement Heuristics for Graph Combinatorial Optimization Problems&lt;/strong&gt; TNNLS, 2023. &lt;a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10271315&amp;amp;casa_token=Hqn_wH2HAjEAAAAA:rTd6KVaoKVjrFWASa-Ma0vC6CBvsmMUHnoWik2DyD56NbnfNOqBG5qZTBLR5hqf9vpCotivB_BU&amp;amp;tag=1"&gt;journal&lt;/a&gt;, &lt;a href="https://github.com/TheLeprechaun25/neural-improvement-heuristics"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andoni I. Garmendia, Josu Ceberio, Alexander Mendiburu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Applicability of Neural Combinatorial Optimization: A Critical View&lt;/strong&gt; TELO, 2024. &lt;a href="https://dl.acm.org/doi/pdf/10.1145/3647644"&gt;journal&lt;/a&gt;, &lt;a href="https://github.com/TheLeprechaun25/Applicability-NCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andoni I. Garmendia, Josu Ceberio, Alexander Mendiburu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Stochastic Combinatorial Optimization</title><link>https://blog.namln.org/en/research/ml-co/problems/stochastic-co/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/stochastic-co/</guid><description>&lt;h1 class="heading" id="stochastic-combinatorial-optimization"&gt;
 Stochastic Combinatorial Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#stochastic-combinatorial-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Stochastic Combinatorial Optimization addresses CO problems where some parameters are random or uncertain, requiring robust or adaptive solutions that perform well under uncertainty.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Robust Combinatorial Optimization with Locally Predictable Uncertainty&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=4v1FmXVyNV"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Haozhe Sun, Shaoyu Wang, Jiaqi Ma, Chen Gong, Chen Tian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Robust Policies for Combinatorial Optimization&lt;/strong&gt; ICML, 2022. &lt;a href="https://arxiv.org/abs/2202.05810"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ankile/robust-co"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ankit Anupam, Joon Oh, Jure Leskovec&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Combinatorial Optimization with Oracle Subsampling&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://openreview.net/forum?id=O-xD-6hy3wK"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Paul Grigas, Adam Elmachtoub, Yunchao Liu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adaptive Policies for Stochastic Knapsack Problems&lt;/strong&gt; Operations Research Letters, 2020. &lt;a href="https://doi.org/10.1016/j.orl.2020.08.010"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wenbo Gao, Oleg V. Pikhurko, Nicholas Harvey&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Online Stochastic Optimization under Time-Varying Distributions&lt;/strong&gt; ICML, 2023. &lt;a href="https://arxiv.org/abs/2304.08405"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yudi Zhou, Yinhan He, Jason D. Lee, Yixuan Qiu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Travelling Salesman Problem (TSP)</title><link>https://blog.namln.org/en/research/ml-co/problems/tsp/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/tsp/</guid><description>&lt;h1 class="heading" id="travelling-salesman-problem-tsp"&gt;
 Travelling Salesman Problem (TSP)&lt;span class="heading__anchor"&gt; &lt;a href="#travelling-salesman-problem-tsp"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Travelling Salesman Problem is one of the most famous NP-hard optimization problems, with extensive research on neural and ML-based approaches.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Combinatorial Optimization Algorithms over Graphs.&lt;/strong&gt; NeurIPS, 2017. &lt;a href="https://arxiv.org/abs/1704.01665"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Dai, Hanjun and Khalil, Elias B and Zhang, Yuyu and Dilkina, Bistra and Song, Le&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Heuristics for the TSP by Policy Gradient&lt;/strong&gt; CPAIOR, 2018. &lt;a href="https://link.springer.com/chapter/10.1007/978-3-319-93031-2_12"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/MichelDeudon/encode-attend-navigate"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Michel Deudon, Pierre Cournut, Alexandre Lacoste&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Attention, Learn to Solve Routing Problems!&lt;/strong&gt; ICLR, 2019. &lt;a href="https://arxiv.org/abs/1803.08475"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kool, Wouter and Van Hoof, Herke and Welling, Max.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Solve NP-Complete Problems: A Graph Neural Network for Decision TSP.&lt;/strong&gt; AAAI, 2019. &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/view/4399"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Prates, Marcelo and Avelar, Pedro HC and Lemos, Henrique and Lamb, Luis C and Vardi, Moshe Y.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Efficient Graph Convolutional Network Technique for the Travelling Salesman Problem&lt;/strong&gt; Arxiv, 2019. &lt;a href="https://arxiv.org/abs/1906.01227"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/chaitjo/graph-convnet-tsp"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chaitanya K. Joshi, Thomas Laurent, Xavier Bresson&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;POMO: Policy Optimization with Multiple Optima for Reinforcement Learning.&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://arxiv.org/abs/2010.16011"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yd-kwon/POMO/"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kwon, Yeong-Dae and Choo, Jinho and Kim, Byoungjip and Yoon, Iljoo and Min, Seungjai and Gwon, Youngjune.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generalize a Small Pre-trained Model to Arbitrarily Large TSP Instances.&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/abs/2012.10658"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Fu, Zhang-Hua and Qiu, Kai-Bin and Zha, Hongyuan.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Reinforcement Learning Approach for Optimizing Multiple Traveling Salesman Problems over Graphs&lt;/strong&gt; KBS, 2020. &lt;a href="https://www.sciencedirect.com/science/article/pii/S0950705120304445"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hu, Yujiao and Yao, Yuan and Lee, Wee Sun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning 2-opt Heuristics for the Traveling Salesman Problem via Deep Reinforcement Learning&lt;/strong&gt; ACML, 2020. &lt;a href="http://proceedings.mlr.press/v129/costa20a"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/paulorocosta/learning-2opt-drl"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;d O Costa, Paulo R and Rhuggenaath, Jason and Zhang, Yingqian and Akcay, Alp&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Reinforcement Learning for Combinatorial Optimization: Covering Salesman Problems.&lt;/strong&gt; IEEE Trans Cybern, 2021. &lt;a href="https://arxiv.org/abs/2102.05875"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kaiwen Li, Tao Zhang, Rui Wang, Yuheng Wang, and Yi Han&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Transformer Network for the Traveling Salesman Problem&lt;/strong&gt; IPAM, 2021. &lt;a href="http://helper.ipam.ucla.edu/publications/dlc2021/dlc2021_16703.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xavier Bresson, Thomas Laurent&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Improvement Heuristics for Solving Routing Problems&lt;/strong&gt; TNNLS, 2021. &lt;a href="https://ieeexplore.ieee.org/abstract/document/9393606?casa_token=mFeyLmrOGfIAAAAA:nmAkjUaTSooYurWHuWGYNoguV453anw9Enyv45xG5jb2oCps6QE4A1CFe1EmFmTzbON6cL5maw"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wu, Yaoxin and Song, Wen and Cao, Zhiguang and Zhang, Jie and Lim, Andrew&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reversible Action Design for Combinatorial Optimization with Reinforcement Learning&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2102.07210"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yao, Fan and Cai, Renqin and Wang, Hongning&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Solving Dynamic Traveling Salesman Problems with Deep Reinforcement Learning.&lt;/strong&gt; TNNLS, 2021. &lt;a href="https://ieeexplore.ieee.org/document/9537638"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zizhen Zhang, Hong Liu, Meng Chu Zhou, Jiahai Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;ScheduleNet: Learn to Solve Multi-agent Scheduling Problems with Reinforcement Learning&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2106.03051"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Junyoung Park, Sanjar Bakhtiyar, Jinkyoo Park&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DAN: Decentralized Attention-based Neural Network to Solve the MinMax Multiple Traveling Salesman Problem&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2109.04205"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cao, Yuhong and Sun, Zhanhong and Sartoretti, Guillaume&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement Learning for Route Optimization with Robustness Guarantees&lt;/strong&gt; IJCAI, 2021. &lt;a href="https://www.ijcai.org/proceedings/2021/0357.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jacobs, Tobias and Alesiani, Francesco and Ermis, Gulcin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem&lt;/strong&gt; AAAI, 2021. &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/view/17476/17283"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/JHL-HUST/VSR-LKH-V2"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiongzhi Zheng, Kun He, Jianrong Zhou, Yan Jin, Chu-Min Li&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Sparsify Travelling Salesman Problem Instances&lt;/strong&gt; CPAIOR, 2021. &lt;a href="https://dx.doi.org/10.1007/978-3-030-78230-6_26"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;James Fitzpatrick, Deepak Ajwani, Paula Carroll&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning TSP Requires Rethinking Generalization&lt;/strong&gt; CP, 2021. &lt;a href="https://arxiv.org/pdf/2006.07054.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/chaitjo/learning-tsp"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chaitanya K. Joshi, Quentin Cappart, Louis-Martin Rousseau and Thomas Laurent&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The First AI4TSP Competition: Learning to Solve Stochastic Routing Problems&lt;/strong&gt; Arxiv, 2022. &lt;a href="https://arxiv.org/abs/2201.10453"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/paulorocosta/ai-for-tsp-competition"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bliek, Laurens and da Costa, Paulo and Afshar, Reza Refaei and Zhang, Yingqian and Catshoek, Tom and Vos, Daniel and Verwer, Sicco and Schmitt-Ulms, Fynn and Hottung, Andre and Shah, Tapan and others&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph Neural Network Guided Local Search for the Traveling Salesperson Problem&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=ar92oEosBIg"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hudson, Benjamin and Li, Qingbiao and Malencia, Matthew and Prorok, Amanda&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Preference Conditioned Neural Multi-objective Combinatorial Optimization&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=QuObT9BTWo"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lin, Xi and Yang, Zhiyuan and Zhang, Qingfu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=sOVNpUEgKMp"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/jieyibi/AMDKD"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bi, Jieyi and Ma, Yining and Wang, Jiahai and Cao, Zhiguang and Chen, Jinbiao and Sun, Yuan and Chee, Yeow Meng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DIMES: A Differentiable Meta Solver for Combinatorial Optimization Problems&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=9u05zr0nhx"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Qiu, Ruizhong and Sun, Zhiqing and Yang, Yiming&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=kHrE2vi5Rvs"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/alstn12088/Sym-NCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kim, Minsu and Park, Junyoung and Park, Jinkyoo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Simulation-guided Beam Search for Neural Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=tYAS1Rpys5"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yd-kwon/SGBS"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Choo, Jinho and Kwon, Yeong-Dae and Kim, Jihoon and Jae, Jeongwoo and Hottung, André and Tierney, Kevin and Gwon, Youngjune&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=vJZ7dPIjip3"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Simon Geisler, Johanna Sommer, Jan Schuchardt, Aleksandar Bojchevski and Stephan Günnemann&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐LinSATNet: The Positive Linear Satisfiability Neural Networks&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25110"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/LinSATNet"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Runzhong Wang and Yunhao Zhang and Ziao Guo and Tianyi Chen and Xiaokang Yang and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to CROSS exchange to solve min-max vehicle routing problems&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=ZcnzsHC10Y"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kim, Minjun and Park, Junyoung and Park, Jinkyoo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generalize Learned Heuristics to Solve Large-scale Vehicle Routing Problems in Real-time&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=6ZajpxqTlQ"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hou, Qingchun and Yang, Jingwei and Su, Yiqiang and Wang, Xiaoqing and Deng, Yuming&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ROCO: A General Framework for Evaluating Robustness of Combinatorial Optimization Solvers on Graphs&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=2r6YMqz4Mml"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ROCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lu, Han and Li, Zenan and Wang, Runzhong and Ren, Qibing and Li, Xijun and Yuan, Mingxuan and Zeng, Jia and Yang, Xiaokang and Yan, Junchi&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem&lt;/strong&gt; Arxiv, 2023. &lt;a href="https://arxiv.org/abs/2304.09407"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Pointerformer/Pointerformer"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yan Jin, Yuandong Ding, Xuanhao Pan, Kun He, Li Zhao, Tao Qin, Lei Song, Jiang Bian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;H-TSP: Hierarchically Solving the Large-Scale Traveling Salesman Problem&lt;/strong&gt; AAAI, 2023. &lt;a href="https://www.microsoft.com/en-us/research/publication/h-tsp-hierarchically-solving-the-large-scale-traveling-salesman-problem/"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Learning4Optimization-HUST/H-TSP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xuanhao Pan, Yan Jin, Yuandong Ding, Mingxiao Feng, Li Zhao, Lei Song, Jiang Bian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Select and Optimize: Learning to solve large-scale TSP instances&lt;/strong&gt; AISTATS, 2023. &lt;a href="https://proceedings.mlr.press/v206/cheng23a.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hanni Cheng, Haosi Zheng, Ya Cong, Weihao Jiang, Shiliang Pu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Multi-View Graph Contrastive Learning for Solving Vehicle Routing Problems&lt;/strong&gt; UAI, 2023. &lt;a href="https://openreview.net/pdf?id=Z-mRKVaxVU3"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yuan Jiang, Zhiguang Cao, Yaoxin Wu, Jie Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Revisiting Sampling for Combinatorial Optimization&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/23661"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Sun, Haoran and Goshvadi, Katayoon and Nova, Azade and Schuurmans, Dale and Dai, Hanjun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25138"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Son, Jiwoo and Kim, Minsu and Kim, Hyeonah and Park, Jinkyoo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Towards Omni-generalizable Neural Methods for Vehicle Routing Problems&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25267"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/RoyalSkye/Omni-VRP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jianan Zhou, Yaoxin Wu, Wen Song, Zhiguang Cao, Jie Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DIFUSCO: Graph-based Diffusion Solvers for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=JV8Ff0lgVV"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Edward-Sun/DIFUSCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhiqing Sun, Yiming Yang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DeepACO: Neural-enhanced Ant Systems for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=cd5D1DD923"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/henry-yeh/DeepACO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ye, Haoran and Wang, Jiarui and Cao, Zhiguang and Liang, Helan and Li, Yong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=v6VpqGcGAR"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Grinsztajn, Nathan and Furelos-Blanco, Daniel and Surana, Shikha and Bonnet, Clément and Barrett, Thomas D&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=mmTy1iyU5G"&gt;paper&lt;/a&gt;, &lt;a href="https://openreview.net/attachment?id=mmTy1iyU5G&amp;amp;name=supplementary_material"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Caramanis, Constantine and Fotakis, Dimitris and Kalavasis, Alkis and Kontonis, Vasilis and Tzamos, Christos&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Combinatorial Optimization with Policy Adaptation using Latent Space Search&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=vpMBqdt9Hl"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chalumeau, Felix and Surana, Shikha and Bonnet, Clément and Grinsztajn, Nathan and Pretorius, Arnu and Laterre, Alexandre and Barrett, Thomas D&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficient Meta Neural Heuristic for Multi-Objective Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=593fc38lhN"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/bill-cjb/EMNH"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Jinbiao and Wang, Jiahai and Zhang, Zizhen and Cao, Zhiguang and Ye, Te and Chen, Siyuan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;BQ-NCO: Bisimulation Quotienting for Efficient Neural Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=BRqlkTDvvm"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/naver/bq-nco"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Drakulic, Darko and Michel, Sofia and Mai, Florian and Sors, Arnaud and Andreoli, Jean-Marc&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Combinatorial Optimization with Heavy Decoder: Toward Large Scale Generalization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=RBI4oAbdpm"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/CIAM-Group/NCO_code/tree/main/single_objective/LEHD"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Luo, Fu and Lin, Xi and Liu, Fei and Zhang, Qingfu and Wang, Zhenkun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Multi-Objective Combinatorial Optimization with Diversity Enhancement&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=N4JkStI1fe"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/bill-cjb/NHDE"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Jinbiao and Zhang, Zizhen and Cao, Zhiguang and Wu, Yaoxin and Ma, Yining and Ye, Te and Wang, Jiahai&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Unsupervised Learning for Solving the Travelling Salesman Problem&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=lAEc7aIW20"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Min, Yimeng and Bai, Yiwei and Gomes, Carla P&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ensemble-based Deep Reinforcement Learning for Vehicle Routing Problems under Distribution Shift&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=HoBbZ1vPAh"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiang, Yuan and Cao, Zhiguang and Wu, Yaoxin and Song, Wen and Zhang, Jie&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Search Feasible and Infeasible Regions of Routing Problems with Flexible Neural k-Opt&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=q1JukwH2yP"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yining043/NeuOpt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ma, Yining and Cao, Zhiguang and Chee, Yeow Meng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐T2T: From Distribution Learning in Training to Gradient Search in Testing for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=JtF0ugNMv2"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/T2TCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yang Li, Jinpei Guo, Runzhong Wang, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforced Lin–Kernighan–Helsgaun Algorithms for the Traveling Salesman Problems&lt;/strong&gt; Knowledge-Based Systems, 2023. &lt;a href="https://www.sciencedirect.com/science/article/pii/S0950705122012400"&gt;journal&lt;/a&gt;, &lt;a href="https://github.com/JHL-HUST/VSR-LKH-V2"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiongzhi Zheng, Kun He, Jianrong Zhou, Yan Jin, Chu-Min Li&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Improvement Heuristics for Graph Combinatorial Optimization Problems&lt;/strong&gt; TNNLS, 2023. &lt;a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10271315&amp;amp;casa_token=Hqn_wH2HAjEAAAAA:rTd6KVaoKVjrFWASa-Ma0vC6CBvsmMUHnoWik2DyD56NbnfNOqBG5qZTBLR5hqf9vpCotivB_BU&amp;amp;tag=1"&gt;journal&lt;/a&gt;, &lt;a href="https://github.com/TheLeprechaun25/neural-improvement-heuristics"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andoni I. Garmendia, Josu Ceberio, Alexander Mendiburu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GLOP: Learning Global Partition and Local Construction for Solving Large-Scale Routing Problems in Real-Time&lt;/strong&gt; AAAI, 2024. &lt;a href="https://arxiv.org/abs/2312.08224"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/henry-yeh/GLOP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Haoran Ye, Jiarui Wang, Helan Liang, Zhiguang Cao, Yong Li, Fanzhang Li&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed&lt;/strong&gt; AAAI, 2024. &lt;a href="https://arxiv.org/abs/2312.12469"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/xybFight/GNARKD"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yubin Xiao, Di Wang, Boyang Li, Mingzhao Wang, Xuan Wu, Changliang Zhou, You Zhou&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Position: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems&lt;/strong&gt; ICML, 2024. &lt;a href="https://arxiv.org/abs/2406.03503"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/xyfffff/rethink_mcts_for_tsp"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yifan Xia, Xianliang Yang, Zichuan Liu, Zhihao Liu, Lei Song, Jiang Bian&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;MARCO: A Memory-Augmented Reinforcement Framework for Combinatorial Optimization&lt;/strong&gt; IJCAI, 2024. &lt;a href="https://www.ijcai.org/proceedings/2024/0766.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/TheLeprechaun25/MARCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Andoni I. Garmendia, Quentin Cappart, Josu Ceberio, Alexander Mendiburu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Combinatorial Optimization for Robust Routing Problem with Uncertain Travel Times&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=DoewNm2uT3"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Pei Xiao, Zizhen Zhang, Jinbiao Chen, Jiahai Wang, Zhenzhen Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Collaboration! Towards Robust Neural Methods for Routing Problems&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/forum?id=YfQA78gEFA"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/RoyalSkye/Routing-CNF"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jianan Zhou, Yaoxin Wu, Zhiguang Cao, Wen Song, Jie Zhang, Zhiqi Shen&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;UDC: A Unified Neural Divide-and-Conquer Framework for Large-Scale Combinatorial Optimization Problems&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=dCgbyvmlwL"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/CIAM-Group/NCO_code/tree/main/single_objective/UDC-Large-scale-CO-master"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhi Zheng, Changliang Zhou, Xialiang Tong, Mingxuan Yuan, Zhenkun Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Handle Complex Constraints for Vehicle Routing Problems&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/forum?id=Ktx95ZuRjP"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jieyi Bi, Yining Ma, Jianan Zhou, Wen Song, Zhiguang Cao, Yaoxin Wu, Jie Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Fast T2T: Optimization Consistency Speeds Up Diffusion-Based Training-to-Testing Solving for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=xDrKZOZEOc"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/Fast-T2T"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yang Li, Jinpei Guo, Runzhong Wang, Hongyuan Zha, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=yEwakMNIex"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/UniCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wenzheng Pan, Hao Xiong, Jiale Ma, Wentao Zhao, Yang Li, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficient and Robust Neural Combinatorial Optimization via Wasserstein-Based Coresets&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=F57HPKZ6KD"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xu Wang, Fuyou Miao, Wenjie Liu, Yan Xiong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐Unify ML4TSP: Drawing Methodological Principles for TSP and Beyond from Streamlined Design Space of Learning and Search&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/pdf?id=grU1VKEOLi"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ML4TSPBench"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yang Li, Jiale Ma, Wenzheng Pan, Runzhong Wang, Haoyu Geng, Nianzu Yang, Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐COExpander: Adaptive Solution Expansion for Combinatorial Optimization&lt;/strong&gt; ICML, 2025. &lt;a href="https://openreview.net/forum?id=KMaBXMWsBM"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/COExpander"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs&lt;/strong&gt; NeurIPS, 2025. &lt;a href="https://openreview.net/forum?id=ye4ntB1Kzi"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ML4CO-Bench-101"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Useful resources for studying Algebraic and Analytic Number Theory</title><link>https://blog.namln.org/en/mathematics/number-theory/useful-resources/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/number-theory/useful-resources/</guid><description>&lt;h2 class="heading" id="general"&gt;
 General&lt;span class="heading__anchor"&gt; &lt;a href="#general"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://link.springer.com/book/10.1007/978-0-387-21735-2"&gt;&lt;strong&gt;Elements of Number Theory&lt;/strong&gt;&lt;/a&gt; by John Stillwell&lt;/li&gt;
&lt;li&gt;&lt;a href="https://wstein.org/ent/"&gt;&lt;strong&gt;Elementary Number Theory: Primes, Congruences, and Secrets&lt;/strong&gt;&lt;/a&gt; by William Stein&lt;/li&gt;
&lt;li&gt;MIT&amp;rsquo;s &lt;a href="https://ocw.mit.edu/courses/18-781-theory-of-numbers-spring-2012/pages/lecture-notes/"&gt;&lt;strong&gt;Theory of Numbers&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/playlist?list=PL8yHsr3EFj53L8sMbzIhhXSAOpuZ1Fov8"&gt;&lt;strong&gt;Berkeley&amp;rsquo;s Number Theory&lt;/strong&gt;&lt;/a&gt; by Richard E Borcherds, 1998 Fields Medalist&lt;/li&gt;
&lt;li&gt;UCLA&amp;rsquo;s &lt;a href="https://math.ucla.edu/~tsmits/coursenotes.pdf"&gt;&lt;strong&gt;Introduction to Number Theory&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="algebraic-number-theory"&gt;
 Algebraic Number Theory&lt;span class="heading__anchor"&gt; &lt;a href="#algebraic-number-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://www.jmilne.org/math/CourseNotes/ANT.pdf"&gt;&lt;strong&gt;Algebraic Number Theory&lt;/strong&gt;&lt;/a&gt; by J.S. Milne&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kimballmartin.github.io/intro-nt/nt.pdf"&gt;&lt;strong&gt;An Algebraic introduction to Number Theory&lt;/strong&gt;&lt;/a&gt; by Kimball Martin&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="analytic-number-theory"&gt;
 Analytic Number Theory&lt;span class="heading__anchor"&gt; &lt;a href="#analytic-number-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;</description></item><item><title>Vehicle Routing Problem (VRP)</title><link>https://blog.namln.org/en/research/ml-co/problems/vrp/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/vrp/</guid><description>&lt;h1 class="heading" id="vehicle-routing-problem-vrp"&gt;
 Vehicle Routing Problem (VRP)&lt;span class="heading__anchor"&gt; &lt;a href="#vehicle-routing-problem-vrp"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Vehicle Routing Problem is about finding optimal routes for a fleet of vehicles to serve a set of customers, a fundamental problem in logistics and transportation.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Perform Local Rewriting for Combinatorial Optimization.&lt;/strong&gt; NeurIPS, 2019. &lt;a href="https://arxiv.org/abs/1810.00337"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/facebookresearch/neural-rewriter"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Xinyun and Tian, Yuandong.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Reinforcement Learning for the Electric Vehicle Routing Problem with Time Windows.&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/abs/2010.02068"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lin, Bo and Ghaddar, Bissan and Nathwani, Jatin.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficiently Solving the Practical Vehicle Routing Problem: A Novel Joint Learning Approach.&lt;/strong&gt; KDD, 2020. &lt;a href="https://www.kdd.org/kdd2020/accepted-papers/view/efficiently-solving-the-practical-vehicle-routing-problem-a-novel-joint-lea"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lu Duan, Yang Zhan, Haoyuan Hu, Yu Gong, Jiangwen Wei, Xiaodong Zhang, Yinghui Xu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing&lt;/strong&gt; NeurIPS, 2020. &lt;a href="https://papers.nips.cc/paper/2020/file/06a9d51e04213572ef0720dd27a84792-Paper.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/google-research/tf-opt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Arthur Delarue, Ross Anderson, Christian Tjandraatmadja&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Learning-based Iterative Method for Solving Vehicle Routing Problems&lt;/strong&gt; ICLR, 2020. &lt;a href="https://static.aminer.cn/upload/pdf/program/5e5e18dd93d709897ce3720b_0.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lu, Hao and Zhang, Xingwen and Yang, Shuang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Large Neighborhood Search for the Capacitated Vehicle Routing Problem&lt;/strong&gt; Arxiv, 2020. &lt;a href="https://arxiv.org/abs/1911.09539"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hottung, Andre and Tierney, Kevin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Improvement Heuristics for Solving Routing Problems&lt;/strong&gt; TNNLS, 2021. &lt;a href="https://ieeexplore.ieee.org/abstract/document/9393606?casa_token=mFeyLmrOGfIAAAAA:nmAkjUaTSooYurWHuWGYNoguV453anw9Enyv45xG5jb2oCps6QE4A1CFe1EmFmTzbON6cL5maw"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Wu, Yaoxin and Song, Wen and Cao, Zhiguang and Zhang, Jie and Lim, Andrew&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement Learning for Route Optimization with Robustness Guarantees&lt;/strong&gt; IJCAI, 2021. &lt;a href="https://www.ijcai.org/proceedings/2021/0357.pdf"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jacobs, Tobias and Alesiani, Francesco and Ermis, Gulcin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems.&lt;/strong&gt; AAAI, 2021. &lt;a href="https://arxiv.org/abs/2012.10638"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/liangxinedu/MDAM"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Liang Xin, Wen Song, Zhiguang Cao, Jie Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Analytics and Machine Learning in Vehicle Routing Research&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2102.10012"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bai, Ruibin and Chen, Xinan and Chen, Zhi-Long and Cui, Tianxiang and Gong, Shuhui and He, Wentao and Jiang, Xiaoping and Jin, Huan and Jin, Jiahuan and Kendall, Graham and others&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;RP-DQN: An application of Q-Learning to Vehicle Routing Problems&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2104.12226"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bdeir, Ahmad and Boeder, Simon and Dernedde, Tim and Tkachuk, Kirill and Falkner, Jonas K and Schmidt-Thieme, Lars&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Policy Dynamic Programming for Vehicle Routing Problems&lt;/strong&gt; Arxiv, 2021. &lt;a href="https://arxiv.org/abs/2102.11756"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kool, Wouter and van Hoof, Herke and Gromicho, Joaquim and Welling, Max&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Delegate for Large-scale Vehicle Routing&lt;/strong&gt; NeurIPS, 2021. &lt;a href="https://proceedings.neurips.cc/paper/2021/hash/dc9fa5f217a1e57b8a6adeb065560b38-Abstract.html"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Li, Sirui and Yan, Zhongxia and Wu, Cathy&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning a Latent Search Space for Routing Problems using Variational Autoencoders&lt;/strong&gt; ICLR, 2021. &lt;a href="https://openreview.net/forum?id=90JprVrJBO"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hottung, Andre and Bhandari, Bhanu and Tierney, Kevin&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Preference Conditioned Neural Multi-objective Combinatorial Optimization&lt;/strong&gt; ICLR, 2022. &lt;a href="https://openreview.net/forum?id=QuObT9BTWo"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Lin, Xi and Yang, Zhiyuan and Zhang, Qingfu&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=sOVNpUEgKMp"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/jieyibi/AMDKD"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bi, Jieyi and Ma, Yining and Wang, Jiahai and Cao, Zhiguang and Chen, Jinbiao and Sun, Yuan and Chee, Yeow Meng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=kHrE2vi5Rvs"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/alstn12088/Sym-NCO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kim, Minsu and Park, Junyoung and Park, Jinkyoo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Simulation-guided Beam Search for Neural Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2022. &lt;a href="https://openreview.net/forum?id=tYAS1Rpys5"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yd-kwon/SGBS"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Choo, Jinho and Kwon, Yeong-Dae and Kim, Jihoon and Jae, Jeongwoo and Hottung, André and Tierney, Kevin and Gwon, Youngjune&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to CROSS exchange to solve min-max vehicle routing problems&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=ZcnzsHC10Y"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kim, Minjun and Park, Junyoung and Park, Jinkyoo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generalize Learned Heuristics to Solve Large-scale Vehicle Routing Problems in Real-time&lt;/strong&gt; ICLR, 2023. &lt;a href="https://openreview.net/forum?id=6ZajpxqTlQ"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Hou, Qingchun and Yang, Jingwei and Su, Yiqiang and Wang, Xiaoqing and Deng, Yuming&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25138"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Son, Jiwoo and Kim, Minsu and Kim, Hyeonah and Park, Jinkyoo&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Towards Omni-generalizable Neural Methods for Vehicle Routing Problems&lt;/strong&gt; ICML, 2023. &lt;a href="https://icml.cc/virtual/2023/poster/25267"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/RoyalSkye/Omni-VRP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhou Jianan, Yaoxin Wu, Wen Song, Zhiguang Cao, Jie Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DeepACO: Neural-enhanced Ant Systems for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=cd5D1DD923"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/henry-yeh/DeepACO"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ye, Haoran and Wang, Jiarui and Cao, Zhiguang and Liang, Helan and Li, Yong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=v6VpqGcGAR"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Grinsztajn, Nathan and Furelos-Blanco, Daniel and Surana, Shikha and Bonnet, Clément and Barrett, Thomas D&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Combinatorial Optimization with Policy Adaptation using Latent Space Search&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=vpMBqdt9Hl"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chalumeau, Felix and Surana, Shikha and Bonnet, Clément and Grinsztajn, Nathan and Pretorius, Arnu and Laterre, Alexandre and Barrett, Thomas D&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Efficient Meta Neural Heuristic for Multi-Objective Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=593fc38lhN"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/bill-cjb/EMNH"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Jinbiao and Wang, Jiahai and Zhang, Zizhen and Cao, Zhiguang and Ye, Te and Chen, Siyuan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;BQ-NCO: Bisimulation Quotienting for Efficient Neural Combinatorial Optimization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=BRqlkTDvvm"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/naver/bq-nco"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Drakulic, Darko and Michel, Sofia and Mai, Florian and Sors, Arnaud and Andreoli, Jean-Marc&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Combinatorial Optimization with Heavy Decoder: Toward Large Scale Generalization&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=RBI4oAbdpm"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/CIAM-Group/NCO_code/tree/main/single_objective/LEHD"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Luo, Fu and Lin, Xi and Liu, Fei and Zhang, Qingfu and Wang, Zhenkun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Multi-Objective Combinatorial Optimization with Diversity Enhancement&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=N4JkStI1fe"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/bill-cjb/NHDE"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chen, Jinbiao and Zhang, Zizhen and Cao, Zhiguang and Wu, Yaoxin and Ma, Yining and Ye, Te and Wang, Jiahai&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ensemble-based Deep Reinforcement Learning for Vehicle Routing Problems under Distribution Shift&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=HoBbZ1vPAh"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiang, Yuan and Cao, Zhiguang and Wu, Yaoxin and Song, Wen and Zhang, Jie&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Search Feasible and Infeasible Regions of Routing Problems with Flexible Neural k-Opt&lt;/strong&gt; NeurIPS, 2023. &lt;a href="https://openreview.net/forum?id=q1JukwH2yP"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/yining043/NeuOpt"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ma, Yining and Cao, Zhiguang and Chee, Yeow Meng&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Prune Electric Vehicle Routing Problems&lt;/strong&gt; LION, 2023. &lt;a href="https://link.springer.com/chapter/10.1007/978-3-031-44505-7_26"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;James Fitzpatrick, Deepak Ajwani, Paula Carroll&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GLOP: Learning Global Partition and Local Construction for Solving Large-Scale Routing Problems in Real-Time&lt;/strong&gt; AAAI, 2024. &lt;a href="https://arxiv.org/abs/2312.08224"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/henry-yeh/GLOP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Haoran Ye, Jiarui Wang, Helan Liang, Zhiguang Cao, Yong Li, Fanzhang Li&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed&lt;/strong&gt; AAAI, 2024. &lt;a href="https://arxiv.org/abs/2312.12469"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/xybFight/GNARKD"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yubin Xiao, Di Wang, Boyang Li, Mingzhao Wang, Xuan Wu, Changliang Zhou, You Zhou&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Combinatorial Optimization for Robust Routing Problem with Uncertain Travel Times&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=DoewNm2uT3"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Pei Xiao, Zizhen Zhang, Jinbiao Chen, Jiahai Wang, Zhenzhen Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Collaboration! Towards Robust Neural Methods for Routing Problems&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/forum?id=YfQA78gEFA"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/RoyalSkye/Routing-CNF"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jianan Zhou, Yaoxin Wu, Zhiguang Cao, Wen Song, Jie Zhang, Zhiqi Shen&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;UDC: A Unified Neural Divide-and-Conquer Framework for Large-Scale Combinatorial Optimization Problems&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/pdf?id=dCgbyvmlwL"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/CIAM-Group/NCO_code/tree/main/single_objective/UDC-Large-scale-CO-master"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhi Zheng, Changliang Zhou, Tong Xialiang, Mingxuan Yuan, Zhenkun Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Scalable Learning Approach for the Capacitated Vehicle Routing Problem&lt;/strong&gt; Computers and Operations Research, 2024. &lt;a href="https://dx.doi.org/10.1016/j.cor.2024.106787"&gt;journal&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;James Fitzpatrick, Deepak Ajwani, Paula Carroll&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Neural Column Generation Approach to the Vehicle Routing Problem with Two-Dimensional Loading and Last-In-First-Out Constraints&lt;/strong&gt; IJCAI, 2024. &lt;a href="https://www.ijcai.org/proceedings/2024/0218.pdf"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/xyfffff/NCG-for-2L-CVRP"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yifan Xia, Xiangyi Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Rethinking Neural Multi-Objective Combinatorial Optimization via Neat Weight Embedding&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=GM7cmQfk2F"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jinbiao Chen, Zhiguang Cao, Jiahai Wang, Yaoxin Wu, Hanzhang Qin, Zizhen Zhang, Yue-Jiao Gong&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=TbTJJNjumY"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Fu Luo, Xi Lin, Yaoxin Wu, Zhenkun Wang, Tong Xialiang, Mingxuan Yuan, Qingfu Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐COExpander: Adaptive Solution Expansion for Combinatorial Optimization&lt;/strong&gt; ICML, 2025. &lt;a href="https://openreview.net/forum?id=KMaBXMWsBM"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/COExpander"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs&lt;/strong&gt; NeurIPS, 2025. &lt;a href="https://openreview.net/forum?id=ye4ntB1Kzi"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/ML4CO-Bench-101"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Vertex Cover</title><link>https://blog.namln.org/en/research/ml-co/problems/vertex-cover/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/vertex-cover/</guid><description>&lt;h1 class="heading" id="vertex-cover"&gt;
 Vertex Cover&lt;span class="heading__anchor"&gt; &lt;a href="#vertex-cover"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;The Vertex Cover problem seeks the smallest set of vertices such that every edge in the graph is incident to at least one vertex in the set. This is a fundamental NP-hard problem in graph theory.&lt;/p&gt;
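&lt;p&gt;As a sketch of the formal statement (the notation $G = (V, E)$ and the indicator variables $x_v$ are introduced here only for illustration), minimum vertex cover can be written as the integer program&lt;/p&gt;
&lt;p&gt;$$
\min \sum_{v \in V} x_v \quad \text{subject to} \quad x_u + x_v \geq 1 \ \ \forall \{u, v\} \in E, \qquad x_v \in \{0, 1\} \ \ \forall v \in V,
$$&lt;/p&gt;
&lt;p&gt;where $x_v = 1$ means that vertex $v$ belongs to the cover.&lt;/p&gt;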
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Vertex Cover via Reinforcement Learning&lt;/strong&gt; ICLR, 2024. &lt;a href="https://arxiv.org/abs/2402.18827"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Kevin Kuo, Adeola Oscar Adeniyi, Henry Hoffmann&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐NN-Baker: Neural Network-Guided Baker&amp;rsquo;s Algorithm for Vertex Cover&lt;/strong&gt; NeurIPS, 2024. &lt;a href="https://openreview.net/forum?id=Np7LQrWKni"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/Thinklab-SJTU/NN-Baker"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;⭐ GNN-based Generalization for Vertex Cover and Maximum Independent Set&lt;/strong&gt; ICLR, 2025. &lt;a href="https://openreview.net/forum?id=TBXyvpCy5t"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jiale Ma and Wenzheng Pan and Yang Li and Junchi Yan&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Virtual Network Embedding</title><link>https://blog.namln.org/en/research/ml-co/problems/virtual-network-embedding/</link><pubDate>Mon, 07 Jul 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/ml-co/problems/virtual-network-embedding/</guid><description>&lt;h1 class="heading" id="virtual-network-embedding"&gt;
 Virtual Network Embedding&lt;span class="heading__anchor"&gt; &lt;a href="#virtual-network-embedding"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;Virtual Network Embedding (VNE) is the problem of mapping virtual network components (nodes and links) onto a physical network infrastructure, optimizing resource utilization and quality of service.&lt;/p&gt;
&lt;h2 class="heading" id="recent-literature"&gt;
 Recent Literature&lt;span class="heading__anchor"&gt; &lt;a href="#recent-literature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Reinforcement Learning for Virtual Network Embedding&lt;/strong&gt; ACM SIGCOMM, 2020. &lt;a href="https://arxiv.org/abs/2003.00226"&gt;paper&lt;/a&gt;, &lt;a href="https://github.com/ZHURENNI/DRL-VN-Embedding"&gt;code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Zhu Ren, Liang Hong, Wei Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GNN-based Reinforcement Learning for Virtual Network Embedding&lt;/strong&gt; IEEE ICDCS, 2021. &lt;a href="https://arxiv.org/abs/2101.10000"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yikang Wang, Zhu Ren, Mingwei Xu, Wei Zhang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Neural Network Assisted Heuristics for Virtual Network Embedding&lt;/strong&gt; IEEE INFOCOM, 2021. &lt;a href="https://arxiv.org/abs/2106.03330"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xiaoming Huo, Shilin Dong, Chen Sun, Yonggang Wen&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph Reinforcement Learning Based Learning-to-Rank for Node Classification&lt;/strong&gt; ICDM, 2020. &lt;a href="https://arxiv.org/abs/2011.01437"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yupeng Liu, Shuai Zhang, Juncheng Liu, Weiye Li, Shuai Li, Houfeng Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scalable Virtual Network Embedding with Deep Reinforcement Learning&lt;/strong&gt; IEEE Transactions on Network and Service Management, 2021. &lt;a href="https://arxiv.org/abs/2104.11110"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yongmin Choi, Inyoung Kim, Namkyu Park&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Machine Learning-Based Resource Allocation for Virtual Network Embedding&lt;/strong&gt; IEEE Network, 2022. &lt;a href="https://doi.org/10.1109/MNET.2022.8808272"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jun Sun, Shen Su, Shaohua Wan, Qiang Ye&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Deep Learning Assisted VNE in Multi-domain Networks&lt;/strong&gt; IEEE JSAC, 2019. &lt;a href="https://arxiv.org/abs/1904.10945"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Peng Sun, Mingwei Xu, Yiming Sun&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Virtual Network Embedding via Attributed Graph Embeddings and Deep Learning&lt;/strong&gt; IEEE Access, 2020. &lt;a href="https://doi.org/10.1109/ACCESS.2020.3028531"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Yu Chen, Xiaofeng Zhang, Xiangyang Gong, Jianxin Wang&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accelerating Virtual Network Embedding with Deep Neural Networks&lt;/strong&gt; IEEE INFOCOM, 2020. &lt;a href="https://arxiv.org/abs/2001.10923"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jian Sun, Yangxiu Cui, Yufeng Wang, Tingyu Ma&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DRL-based Virtual Network Embedding with Guaranteed Resource Constraints&lt;/strong&gt; IEEE Transactions on Network and Service Management, 2021. &lt;a href="https://arxiv.org/abs/2105.10000"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Xuesong Yin, Yong Xia, Zhuo Su&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph Neural Networks for Virtual Network Embedding&lt;/strong&gt; IEEE IJCNN, 2021. &lt;a href="https://arxiv.org/abs/2106.09887"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jamal Hasan, Mohammed Alreshoodi, Ramin Sadre&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Resource Prediction in Virtual Network Embedding using Graph Neural Networks&lt;/strong&gt; IEEE CLOUDNET, 2021. &lt;a href="https://arxiv.org/abs/2110.00000"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Jérôme François, Thomas Engel&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Virtual Network Embedding: A State-of-the-Art Survey&lt;/strong&gt; IEEE Communications Surveys &amp;amp; Tutorials, 2020. &lt;a href="https://doi.org/10.1109/COMST.2020.3010969"&gt;paper&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Nashid Shahriar, Atta ur Rehman Khan, Sanjay P. Deshpande, Reaz Ahmed&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>A lemma of J. L. Lions</title><link>https://blog.namln.org/en/posts/a-lemma-lions/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/a-lemma-lions/</guid><description>&lt;p&gt;This post explores J. L. Lions&amp;rsquo; lemma about Banach spaces with compact injection, including applications to functional analysis.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lemma statement&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;Let $X$, $Y$, and $Z$ be three Banach spaces with norms $|| \cdot ||_X$, $|| \cdot ||_Y$, and $|| \cdot ||_Z$. Assume that $X \subset Y$ with compact injection and that $Y \subset Z$ with continuous injection. Prove that&lt;/p&gt;
&lt;p&gt;$$
\forall \varepsilon &amp;gt; 0, \exists C_\varepsilon &amp;gt; 0 \text{ satisfying } || u ||_Y \leq \varepsilon || u ||_X + C _{\varepsilon}|| u ||_Z,\quad \forall u \in X
$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Applications&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Prove that for every $\varepsilon &amp;gt; 0$ there exists $C_\varepsilon &amp;gt; 0$ satisfying&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\max_{t \in [0,1]} |u(t)| \leq \varepsilon \max_{t \in [0,1]} |u'(t)| + C_\varepsilon ||u ||_{L^1}, \quad \forall u \in C^1([0,1]).
$$&lt;/p&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Pick $p &amp;gt; 1$. Prove that for every $\varepsilon &amp;gt; 0$ there exists $C = C(\varepsilon, p)$ such that&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
|| u || _{L^\infty(0,1)} \leq \varepsilon || u || _{W^{1,p}(0,1)} + C || u || _{L^1(0,1)}, \quad \forall u \in W^{1,p}(0,1).
$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;For the lemma, argue by contradiction. Suppose the conclusion fails. Then there exist some $\varepsilon_0 &amp;gt; 0$ and a sequence $(u_n)_{n \in \mathbb{Z}^{+}} \subset X$ such that&lt;/p&gt;
&lt;p&gt;$$
|| u_n ||_Y &amp;gt; \varepsilon_0 || u_n ||_X + n || u_n ||_Z, \quad \forall n \in \mathbb{Z}^{+}.
$$&lt;/p&gt;
&lt;p&gt;Then $u_n \ne 0, \forall n \in \mathbb{Z}^{+}$.&lt;/p&gt;
&lt;p&gt;Let $v_n := \dfrac{u_n}{|| u_n||_X}$.&lt;/p&gt;
&lt;p&gt;Then clearly $||v_n||_X = 1$ and we have&lt;/p&gt;
&lt;p&gt;$$
||v_n||_Y &amp;gt; \varepsilon_0 + n ||v_n||_Z.
$$&lt;/p&gt;
&lt;p&gt;A compact injection is in particular continuous, so $(||v_n||_Y)_{n \in \mathbb{Z}^{+}}$ is bounded. Since $||v_n||_Z &amp;lt; \dfrac{||v_n||_Y}{n}$, it follows that $||v_n||_Z \rightarrow 0$ as $n \rightarrow \infty$.&lt;/p&gt;
&lt;p&gt;Since $X \subset Y$ with compact injection and $(v_n)$ is bounded in $X$, we may assume, after passing to a subsequence, that there is $v \in Y$ such that $|| v_n - v||_Y \rightarrow 0$ as $n \rightarrow \infty$.&lt;/p&gt;
&lt;p&gt;And because $Y \subset Z$ with continuous injection, we obtain:&lt;/p&gt;
&lt;p&gt;$$
||v_n - v||_Z \rightarrow 0 \quad \text{as} \quad n \rightarrow \infty
$$&lt;/p&gt;
&lt;p&gt;Combined with $||v_n||_Z \rightarrow 0$, this gives $v = 0$, and hence $||v_n||_Y \rightarrow 0$ as $n \rightarrow \infty$.&lt;/p&gt;
&lt;p&gt;On the other hand, $||v_n||_Y &amp;gt; \varepsilon_0$ for every $n$, so&lt;/p&gt;
&lt;p&gt;$$
\lim_{n \rightarrow \infty} ||v_n||_Y \geq \varepsilon_0.
$$&lt;/p&gt;
&lt;p&gt;Consequently,&lt;/p&gt;
&lt;p&gt;$$
0 \geq \varepsilon_0 &amp;gt; 0
$$
which is a contradiction. The two applications follow from the lemma with $Z = L^1(0,1)$: in the first take $X = C^1([0,1])$ with norm $\max |u| + \max |u'|$ and $Y = C([0,1])$, in the second take $X = W^{1,p}(0,1)$ and $Y = L^\infty(0,1)$. In both cases the injection $X \subset Y$ is compact by the Arzelà–Ascoli theorem (for the second this uses $p &amp;gt; 1$). In the first application, the term $\varepsilon \max |u|$ coming from $||u||_X$ is absorbed into the left-hand side when $\varepsilon &amp;lt; 1$. The proof is completed.&lt;/p&gt;</description></item><item><title>Complex Hahn-Banach Theorem</title><link>https://blog.namln.org/en/posts/complex-hahn-banach-theorem/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/complex-hahn-banach-theorem/</guid><description>&lt;p&gt;Let $X$ be a complex vector space, $X_0$ one of its subspaces, $p: X \to \mathbb{R}_+$ such that&lt;/p&gt;
&lt;p&gt;$$
p(\lambda x) = |\lambda| p(x), \quad \forall \lambda \in \mathbb{C}, x \in X \text{ and } p(x + y) \leq p(x) + p(y), \quad \forall x, y \in X,
$$&lt;/p&gt;
&lt;p&gt;satisfying $|f(x)| \leq p(x)$, $\forall x \in X_0$, where $f: X_0 \to \mathbb{C}$ is linear.&lt;/p&gt;
&lt;p&gt;Under these conditions, there exists a linear functional $F: X \to \mathbb{C}$ such that $F|_{X_0} = f$ and&lt;/p&gt;
&lt;p&gt;$$
|F(x)| \leq p(x), \quad \forall x \in X.
$$&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;: Since $f$ is linear, it follows that $\text{Re } f: X_0 \to \mathbb{R}$ is linear and
$$
\text{Re } f(x) \leq |f(x)| \leq p(x), \quad \forall x \in X_0.
$$&lt;/p&gt;
&lt;p&gt;By the Real Hahn-Banach Theorem there exists $g: X \to \mathbb{R}$ a linear functional such that $g$ is an extension for $\text{Re } f$ and $g(x) \leq p(x)$, $\forall x \in X$. We also have $g(x) = -g(-x) \geq -p(x)$ so $|g(x)| \leq p(x)$, $\forall x \in X$.&lt;/p&gt;
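&lt;p&gt;Define now $F(x) = g(x) - i g(ix)$, $\forall x \in X$. This is linear over $\mathbb{C}$: additivity and real homogeneity are inherited from $g$, and $F(ix) = iF(x)$ by a direct computation. If $x \in X_0$, then $g(ix) = \text{Re } f(ix) = \text{Re }(i f(x)) = -\text{Im } f(x)$, so
$$
F(x) = g(x) - i g(ix) = \text{Re } f(x) - i \text{Re }(i f(x)) =
\text{Re } f(x) + i \text{Im } f(x) = f(x), \quad \forall x \in X_0.
$$&lt;/p&gt;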
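&lt;p&gt;The only part of the linearity of $F$ that does not come directly from $g$ is compatibility with multiplication by $i$. A short check, using only the real linearity of $g$ (in particular $g(-x) = -g(x)$):&lt;/p&gt;
&lt;p&gt;$$
F(ix) = g(ix) - i g(i \cdot ix) = g(ix) - i g(-x) = g(ix) + i g(x) = i \big( g(x) - i g(ix) \big) = i F(x), \quad \forall x \in X.
$$&lt;/p&gt;
&lt;p&gt;Together with additivity and homogeneity over $\mathbb{R}$, this gives $F((a + ib)x) = aF(x) + bF(ix) = (a + ib)F(x)$ for all $a, b \in \mathbb{R}$.&lt;/p&gt;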
&lt;p&gt;For the last part, fix $x \in X$ and choose $\theta \in \mathbb{R}$ such that $e^{i\theta} F(x) = |F(x)|$. Then $|F(x)| = e^{i\theta} F(x) = F(e^{i\theta} x) = g(e^{i\theta} x)$, because $F(e^{i\theta} x)$ is a real number and the real part of $F$ is $g$. Furthermore, we have $g(e^{i\theta} x) \leq p(e^{i\theta} x) = |e^{i\theta}| p(x) = p(x)$. Combining the two, we get
$$
|F(x)| \leq p(x), \quad \forall x \in X,
$$
which proves the theorem.&lt;/p&gt;</description></item><item><title>Real Hahn-Banach Theorem</title><link>https://blog.namln.org/en/posts/real-hahn-banach-theorem/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/real-hahn-banach-theorem/</guid><description>&lt;p&gt;Suppose $X$ is a vector space over $\mathbb{R}$, $p: X \to \mathbb{R}$ has the following properties:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$p(\lambda x) = \lambda p(x)$, $\forall x \in X$, $\lambda \in \mathbb{R}_+$ and $p(x + y) \leq p(x) + p(y)$, $\forall x, y \in X$.&lt;/li&gt;
&lt;li&gt;Let $X_0$ be a subspace of $X$ and $u: X_0 \to \mathbb{R}$ a linear functional such that $u(x) \leq p(x)$, $\forall x \in X_0$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Then we can find a linear functional $f: X \to \mathbb{R}$ such that $f|_{X_0} = u$ and $f(x) \leq p(x)$, $\forall x \in X$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;: Consider the set $M$ of all pairs $(Y, g)$, where $Y$ is a subspace of $X$ containing $X_0$ and $g: Y \to \mathbb{R}$ is a linear functional which extends $u$ and satisfies $g \leq p$ on $Y$. The set $M$ is nonempty because $(X_0, u) \in M$.&lt;/p&gt;
&lt;p&gt;Define an order relation on $M$ as follows: $(Y_1, g_1) \leq (Y_2, g_2)$ if $Y_1 \subset Y_2$ and $g_2$ is an extension of $g_1$.&lt;/p&gt;
&lt;p&gt;We show that in $M$ every chain has an upper bound. Suppose $M_0$ is a totally ordered subset of $M$. Then define $Y_0 = \bigcup_{(Y,g) \in M_0} Y$ and $g_0: Y_0 \to \mathbb{R}$, $g_0(y) = g(y)$ if $y \in Y$ and $(Y, g) \in M_0$. This function is well defined, and $Y_0$ is a subspace of $X$, because the set $M_0$ is totally ordered.&lt;/p&gt;
&lt;p&gt;Furthermore, from the definition of $g_0$, we have that $g_0$ is linear, extends $u$, and satisfies $g_0 \leq p$ on $Y_0$. Therefore $(Y_0, g_0) \in M$, and it is an upper bound for $M_0$. By Zorn&amp;rsquo;s Lemma, $M$ has at least one maximal element $(Z, h)$.&lt;/p&gt;
&lt;p&gt;Suppose $X \neq Z$. Then we can find $x_0 \in X \setminus Z$. Define $W = \text{Span}\{Z, x_0\} = \mathbb{R} \cdot x_0 \oplus Z$, which is a linear subspace of $X$. Let $y, z \in Z$. Then
$$
h(y) + h(z) = h(y + z) \leq p(y + z) = p(z - x_0 + x_0 + y) \leq p(z - x_0) + p(x_0 + y).
$$
Therefore, we have
$$
h(z) - p(z - x_0) \leq -h(y) + p(x_0 + y), \quad \forall y, z \in Z.
$$&lt;/p&gt;
&lt;p&gt;Taking the supremum over $z$ and the infimum over $y$, we obtain
$$
a = \sup_{z \in Z} (h(z) - p(z - x_0)) \leq \inf_{y \in Z} (-h(y) + p(x_0 + y)) = b.
$$
Pick one $c \in [a, b]$ and define $h_1(w) = \lambda c + h(y)$, where $w = \lambda x_0 + y$ with $\lambda \in \mathbb{R}$ and $y \in Z$ (unique representation). Then $h_1$ is linear and extends $h$ to $W$, which means that it extends $u$ on $X_0$.&lt;/p&gt;
&lt;p&gt;We can check that $h_1 \leq p$ on $W$, so $(W, h_1) \in M$. Since $Z \subsetneq W$ and $h_1$ extends $h$, this contradicts the maximality of $(Z, h)$.&lt;/p&gt;
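&lt;p&gt;The check that $h_1 \leq p$ on $W$ can be sketched as follows (a worked computation, writing a general element of $W$ as $\lambda x_0 + y$ with $\lambda \in \mathbb{R}$, $y \in Z$, and using $h(z) - p(z - x_0) \leq c \leq -h(z) + p(x_0 + z)$ for all $z \in Z$, which is what the choice of $c$ between the supremum and the infimum provides). For $\lambda &amp;gt; 0$, positive homogeneity of $p$ gives
$$
h_1(\lambda x_0 + y) = \lambda \left( c + h(y/\lambda) \right) \leq \lambda \, p(x_0 + y/\lambda) = p(\lambda x_0 + y),
$$
and for $\lambda &amp;lt; 0$, writing $\mu = -\lambda &amp;gt; 0$,
$$
h_1(\lambda x_0 + y) = \mu \left( h(y/\mu) - c \right) \leq \mu \, p(y/\mu - x_0) = p(\lambda x_0 + y).
$$
For $\lambda = 0$ the inequality is just $h(y) \leq p(y)$.&lt;/p&gt;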
&lt;p&gt;Therefore $Z = X$, and $f = h$ is the requested functional.&lt;/p&gt;</description></item><item><title>Riesz Representation Theorem</title><link>https://blog.namln.org/en/posts/riesz-representation-theorem/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/riesz-representation-theorem/</guid><description>&lt;h2 class="heading" id="1-riesz-representation-theorem"&gt;
 1. Riesz Representation Theorem&lt;span class="heading__anchor"&gt; &lt;a href="#1-riesz-representation-theorem"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Let $H$ be a Hilbert space over $\mathbb{R}$ or $\mathbb{C}$, and $T$ be a bounded linear functional on $H$ (a bounded operator from $H$ to the field $\mathbb{R}$ or $\mathbb{C}$, where $H$ is defined over that field). The following is known as the Riesz Representation Theorem:&lt;/p&gt;
&lt;div style="padding: 6px; border: dodgerblue 2px solid;"&gt;&lt;span style="color:dodgerblue"&gt;&lt;b&gt; Theorem 1: &lt;/b&gt;&lt;/span&gt; 
&lt;p&gt;If $T$ is a bounded linear functional on the Hilbert space $H$, then there exists $g \in H$ such that for every $f \in H$, we have:
$$
T(f) = \langle f, g \rangle.
$$&lt;/p&gt;
&lt;p&gt;Moreover, $|T| = |g|$ (here $|T|$ denotes the operator norm of $T$, while $|g|$ is the Hilbert space norm of $g$).&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Now, let’s prove this theorem.&lt;/p&gt;
&lt;div style="padding: 6px; border: green 2px solid;"&gt;&lt;span style="color:green"&gt;&lt;b&gt; Proof: &lt;/b&gt;&lt;/span&gt; 
&lt;p&gt;Assume that $H$ is separable for now. The proof for any Hilbert space is not much more difficult, but the separable case nicely uses ideas we have developed related to Fourier analysis. Additionally, we will work over $\mathbb{R}$.&lt;/p&gt;
&lt;p&gt;Since $H$ is separable, we can choose an orthonormal basis $\phi_j$, $j \geq 1$, for $H$. Let $T$ be a bounded linear functional and set $a_j = T(\phi_j)$. For $f \in H$, set $c_j = \langle f, \phi_j \rangle$, and define
$$
f_n = \sum_{j=1}^{n} c_j \phi_j.
$$&lt;/p&gt;
&lt;p&gt;Since the $\phi_j$ form a basis, we know that $|f - f_n| \to 0$ as $n \to \infty$.&lt;/p&gt;
&lt;p&gt;Since $T$ is linear, we have:
$$
T(f_n) = \sum_{j=1}^{n} a_j c_j. \tag{1}
$$&lt;/p&gt;
&lt;p&gt;Since $T$ is bounded, with norm $|T| &amp;lt; \infty$, we have:
$$
|T(f) - T(f_n)| \leq |T| |f - f_n|. \tag{2}
$$&lt;/p&gt;
&lt;p&gt;Because $|f - f_n| \to 0$ as $n \to \infty$, we conclude from equations (1) and (2) that:
$$
T(f) = \lim_{n\to\infty} T(f_n) = \sum_{j=1}^{\infty} a_j c_j. \tag{3}
$$&lt;/p&gt;
&lt;p&gt;In fact, the sequence $a_j$ must be square-summable. To see this, first note that since $|T(f)| \leq |T| |f|$, we have:
$$
\left|\sum_{j=1}^{\infty} c_j a_j\right| \leq |T| \left(\sum_{j=1}^{\infty} c_j^2\right)^{1/2}. \tag{4}
$$&lt;/p&gt;
&lt;p&gt;Equation (4) must hold for every square-summable sequence $c_j$ (since any such $c_j$ corresponds to some element in $H$). Fix a positive integer $N$ and define the sequence $c_j = a_j$ for $j \leq N$, $c_j = 0$ for $j &amp;gt; N$. Clearly, such a sequence is square-summable, and equation (4) gives us:
$$
\left(\sum_{j=1}^{N} a_j^2\right)^{1/2} \leq |T|. \tag{5}
$$&lt;/p&gt;
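&lt;p&gt;To spell out how (5) follows from (4) (a short worked step with the same choice of $c_j$): substituting $c_j = a_j$ for $j \leq N$ and $c_j = 0$ for $j &amp;gt; N$ into (4) gives
$$
\sum_{j=1}^{N} a_j^2 \leq |T| \left(\sum_{j=1}^{N} a_j^2\right)^{1/2},
$$
and dividing both sides by $\left(\sum_{j=1}^{N} a_j^2\right)^{1/2}$ (when it is nonzero; otherwise (5) is trivial) yields (5).&lt;/p&gt;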
&lt;p&gt;Thus, $a_j$ is square-summable, as the sequence of partial sums is bounded above.&lt;/p&gt;
&lt;p&gt;Since $a_j$ is square-summable, the function $g = \sum_{j} a_j \phi_j$ is well-defined as an element of $H$, and $T(f) = \sum_{j} a_j c_j = \langle f, g \rangle$. Finally, equation (5) shows that $|g| \leq |T|$. But from the Cauchy-Schwarz inequality, we also have $|T(f)| = |\langle f, g \rangle| \leq |f| |g|$ or $\frac{|T(f)|}{|f|} \leq |g|$, implying $|T| \leq |g|$, hence $|T| = |g|$. The proof is complete.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 class="heading" id="2-application-to-pde"&gt;
 2. Application to PDE&lt;span class="heading__anchor"&gt; &lt;a href="#2-application-to-pde"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;This example illustrates how functional analysis methods are used in PDEs (although the example is for an ODE). Consider the ODE:
$$
-f''(x) + b(x)f(x) = q(x) \tag{6}
$$&lt;/p&gt;
&lt;p&gt;on the interval $0 &amp;lt; x &amp;lt; 1$, with $b(x) \geq \delta &amp;gt; 0$ for some $\delta$; assume the functions $b$ and $q$ are continuous on $[0, 1]$. We want to find a solution to equation (6) with $f'(0) = f'(1) = 0$ (other boundary conditions could also be applied). If we multiply (6) by a $C^1$ function $\phi$ and integrate the first term, $-f''\phi$, by parts from $x = 0$ to $x = 1$, we obtain:
$$
\int_0^1 (f'(x)\phi'(x) + b(x)f(x)\phi(x))\,dx = \int_0^1 q(x)\phi(x)\,dx. \tag{7}
$$&lt;/p&gt;
&lt;p&gt;Equation (7) must hold for every $\phi \in C^1([0, 1])$, if $f$ is a $C^2(0, 1)$ solution of equation (6) that is continuous on $[0, 1]$. Conversely, if for a $C^2$ function $f$, we find that (7) holds for every $\phi$, then $f$ must be a solution of equation (6), because if we &amp;ldquo;undo&amp;rdquo; the integration by parts in (7), we get:
$$
\phi(1)f'(1) - \phi(0)f'(0) + \int_0^1 \phi(x)(-f''(x) + b(x)f(x))\,dx = \int_0^1 \phi(x)q(x)\,dx
$$
for every $\phi$.&lt;/p&gt;
&lt;p&gt;A familiar PDE argument then shows that $f'(0) = f'(1) = 0$ and equation (6) must hold.&lt;/p&gt;
&lt;p&gt;We will show that there is a unique solution to equation (7). Such a &amp;ldquo;solution&amp;rdquo; does not necessarily need to be twice differentiable as required by equation (6), but it will satisfy equation (7). Equation (7) is often called the &amp;ldquo;weak&amp;rdquo; form of the problem.&lt;/p&gt;
&lt;p&gt;Define an inner product:
$$
\langle g, h \rangle = \int_0^1 (g'(x)h'(x) + b(x)g(x)h(x))\,dx
$$&lt;/p&gt;
&lt;p&gt;on the space $C^1([0, 1])$, and let $H$ denote the completion of this space. This is essentially the procedure used on the third problem of the first exam; the presence of $b(x)$ makes no difference. (Note that we must use $b \geq \delta &amp;gt; 0$ to ensure that $\langle \cdot, \cdot \rangle$ is indeed an inner product, so that $|g| = \sqrt{\langle g, g \rangle} = 0$ if and only if $g \equiv 0$.) The space $H$ is a Hilbert space and can be understood (if needed) as a subspace of $C([0, 1])$.&lt;/p&gt;
&lt;p&gt;Define a functional $T : H \to \mathbb{R}$ by:
$$
T(\phi) = \int_0^1 q(x)\phi(x)\,dx
$$&lt;/p&gt;
&lt;p&gt;You can easily check that $T$ is bounded on $H$ (using Cauchy-Schwarz). From the Riesz Representation Theorem, it follows that there must exist a function $f \in H$ such that:
$$
T(\phi) = \langle f, \phi \rangle
$$&lt;/p&gt;
&lt;p&gt;for every $\phi \in H$. This is exactly equation (7), the weak form of the ODE!&lt;/p&gt;
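&lt;p&gt;The boundedness of $T$ used above can be sketched as follows (a worked estimate for $\phi \in C^1([0, 1])$, extended to $H$ by density; the notation $|q|_{L^2} = \left(\int_0^1 q(x)^2\,dx\right)^{1/2}$ is introduced here only for convenience). By the Cauchy-Schwarz inequality in $L^2(0, 1)$ and the bound $b \geq \delta$,
$$
|T(\phi)| \leq |q|_{L^2} \left(\int_0^1 \phi(x)^2\,dx\right)^{1/2} \leq \frac{|q|_{L^2}}{\sqrt{\delta}} \left(\int_0^1 b(x)\phi(x)^2\,dx\right)^{1/2} \leq \frac{|q|_{L^2}}{\sqrt{\delta}} |\phi|,
$$
where $|\phi| = \sqrt{\langle \phi, \phi \rangle}$ is the norm of $H$. Hence $|T| \leq |q|_{L^2}/\sqrt{\delta}$.&lt;/p&gt;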
&lt;p&gt;The function $f$ satisfying equation (7) lies in $H$. Under the conditions on $b$ (specifically, $b \geq \delta &amp;gt; 0$ and $|b|_\infty &amp;lt; \infty$ since $b \in C([0, 1])$), the function $f$ lies in the same space defined in the third problem of the first exam. Specifically, $f$ is a continuous function. Proving that $f$ is actually twice differentiable requires more work, along with additional assumptions about the function $q$.&lt;/p&gt;
&lt;h2 class="heading" id="references"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#references"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] (Original) &lt;a href="https://math.jhu.edu/~lindblad/632/riesz.pdf"&gt;The Riesz Representation Theorem&lt;/a&gt;, MA 466, Kurt Bryan&lt;/p&gt;</description></item><item><title>The application of Hahn-Banach Theorem 01</title><link>https://blog.namln.org/en/posts/hahn-banach-application-1/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/hahn-banach-application-1/</guid><description>&lt;p&gt;Suppose $X$ is a normed space, $X_0$ is a closed subspace of $X$, and $x_0 \in X \setminus X_0$. Then we can find $f \in X'$ such that $f(x_0) = 1$ and $f(x) = 0$, $\forall x \in X_0$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;: Since $X_0$ is closed and $x_0 \notin X_0$, we can find $\delta &amp;gt; 0$ such that $|x_0 - x| \geq \delta$, $\forall x \in X_0$, which is equivalent to $1 \leq \dfrac{|x_0 - x|}{\delta}$, $\forall x \in X_0$.&lt;/p&gt;
&lt;p&gt;Define $Y = \text{Span}\{x_0, X_0\} = X_0 \oplus \mathbb{K} \cdot x_0$. Then for each $y \in Y$ we can find a unique $\lambda \in \mathbb{K}$ and a unique $x \in X_0$ such that $y = \lambda x_0 + x$. Define $u: Y \to \mathbb{K}$ by $u(y) = u(\lambda x_0 + x) = \lambda$. It is well defined and linear.&lt;/p&gt;
&lt;p&gt;Furthermore, since $-x/\lambda \in X_0$ whenever $\lambda \neq 0$, we have:
$$|u(y)| = |\lambda| \leq |\lambda| \frac{|x_0 + x/\lambda|}{\delta} = \frac{1}{\delta} |y| \quad \text{for } \lambda \neq 0.$$
If $\lambda = 0$, then $y \in X_0$ and $u(y) = 0 \leq \frac{1}{\delta} |y|$.&lt;/p&gt;
&lt;p&gt;Therefore, we obtain&lt;br&gt;
$$
|u(y)| \leq \frac{1}{\delta} |y| \quad\forall y \in Y
$$
By Hahn-Banach&amp;rsquo;s Theorem, we can extend $u$ to a linear functional $f: X \to \mathbb{K}$ such that $f|_Y = u$ and $|f(x)| \leq \dfrac{1}{\delta} |x|$, $\forall x \in X$. Therefore $f(x_0) = u(x_0) = 1$ and $x \in X_0 \Rightarrow f(x) = 0$.&lt;/p&gt;</description></item><item><title>The application of Hahn-Banach Theorem 02</title><link>https://blog.namln.org/en/posts/hahn-banach-application-2/</link><pubDate>Tue, 24 Jun 2025 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/posts/hahn-banach-application-2/</guid><description>&lt;p&gt;Let $X$ be a Banach space over $\mathbb{K}$ and let $X' = \{ f: X \to \mathbb{K} \}$, where $f$ is linear and continuous. Prove that $X' \neq \{0\}$; in fact, for every $x \in X$ with $x \neq 0$, we can find $f \in X'$ such that $f(x) = |x|$ and $|f| = 1$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Proof&lt;/strong&gt;: Pick $x_0 \in X$, $x_0 \neq 0$. Define $X_0 = x_0 \cdot \mathbb{K}$, a subspace of $X$, and $g: X_0 \to \mathbb{K}$, $g(\lambda x_0) = \lambda |x_0|$, which is linear and satisfies $|g(x)| = |x|$ for $x \in X_0$. Since $g$ and $|\cdot|$ satisfy the conditions of the Hahn-Banach theorem, we can find $f: X \to \mathbb{K}$ such that $f|_{X_0} = g$, $f$ is linear and $|f(x)| \leq |x|$, $\forall x \in X$. Therefore $f(x_0) = g(x_0) = |x_0|$ and $|f| \leq 1$. The equality $f(x_0) = |x_0|$ guarantees that $|f| = 1$.&lt;/p&gt;</description></item><item><title>Optimization Papers in JMLR Volume 26</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v26/</link><pubDate>Sun, 29 Sep 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v26/</guid><description/></item><item><title>Optimization Research Papers in JMLR Volume 25</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v25/</link><pubDate>Sun, 29 Sep 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v25/</guid><description>&lt;h1 class="heading" id="optimization-research-papers-in-jmlr-volume-25-2024"&gt;
 Optimization Research Papers in JMLR Volume 25 (2024)&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-research-papers-in-jmlr-volume-25-2024"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;This document lists papers from JMLR Volume 25 (2024) that focus on optimization research, categorized by their primary themes. Each paper is numbered starting from 1 within its subsection, with a brief description of its key contributions to optimization theory, algorithms, or applications.&lt;/p&gt;
&lt;h2 class="heading" id="convex-optimization"&gt;
 Convex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#convex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing convex optimization problems, including sparse NMF, differential privacy, and sparse regression.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lower Complexity Bounds of Finite-Sum Optimization Problems: The Results and Construction&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yuze Han, Guangzeng Xie, Zhihua Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates lower complexity bounds for finite-sum optimization problems in convex settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sparse NMF with Archetypal Regularization: Computational and Robustness Properties&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Kayhan Behdin, Rahul Mazumder&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes sparse non-negative matrix factorization with archetypal regularization using convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scaling the Convex Barrier with Sparse Dual Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alessandro De Palma, Harkirat Singh Behl, Rudy Bunel, Philip H.S. Torr, M. Pawan Kumar&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops sparse dual algorithms for scaling convex optimization problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Faster Rates in Differentially Private Stochastic Convex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jinyan Su, Lijie Hu, Di Wang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes faster convergence rates for differentially private stochastic convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Estimation of Sparse Gaussian Graphical Models with Hidden Clustering Structure&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Meixia Lin, Defeng Sun, Kim-Chuan Toh, Chengjing Wang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops convex optimization methods for sparse Gaussian graphical models with hidden clustering.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Minimax Optimal Approach to High-Dimensional Double Sparse Linear Regression&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yanhang Zhang, Zhifan Li, Shixiang Liu, Jianxin Yin&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a minimax optimal approach for high-dimensional double sparse linear regression using convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Inexact Projected Regularized Newton Method for Fused Zero-Norms Regularization Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yuqia Wu, Shaohua Pan, Xiaoqi Yang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces an inexact projected regularized Newton method for fused zero-norms regularization in convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="nonconvex-optimization"&gt;
 Nonconvex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#nonconvex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers tackling nonconvex optimization, focusing on ADMM, Adam-family methods, and stochastic minimax optimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Convergence for Nonconvex ADMM, with Applications to CT Imaging&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Rina Foygel Barber, Emil Y. Sidky&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies convergence properties of nonconvex ADMM with applications to CT imaging.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adam-Family Methods for Nonsmooth Optimization with Convergence Guarantees&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Nachuan Xiao, Xiaoyin Hu, Xin Liu, Kim-Chuan Toh&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops Adam-family methods for nonsmooth nonconvex optimization with convergence guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Nonasymptotic Analysis of Stochastic Gradient Hamiltonian Monte Carlo under Local Conditions for Nonconvex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: O. Deniz Akyildiz, Sotirios Sabanis&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides a nonasymptotic analysis of stochastic gradient Hamiltonian Monte Carlo for nonconvex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;High Probability Convergence Bounds for Non-Convex Stochastic Gradient Descent with Sub-Weibull Noise&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Liam Madden, Emiliano Dall&amp;rsquo;Anese, Stephen Becker&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Derives high-probability convergence bounds for nonconvex stochastic gradient descent with sub-Weibull noise.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Regularized Majorization-Minimization with Weakly Convex and Multi-Convex Surrogates&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Hanbaek Lyu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes stochastic regularized majorization-minimization for weakly convex and multi-convex problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Near-Optimal Algorithms for Stochastic Minimax Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Lesi Chen, Luo Luo&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops near-optimal algorithms for stochastic minimax optimization in nonconvex settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Naoki Sato, Koshiro Izumi, Hideaki Iiduka&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a scaled conjugate gradient method for nonconvex optimization in deep neural networks.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="stochastic-optimization"&gt;
 Stochastic Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#stochastic-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on stochastic optimization methods, including continuous-time approximations, momentum, and curvature estimates.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Comparison of Continuous-Time Approximations to Stochastic Gradient Descent&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Stefan Ankirchner, Stefan Perko&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Compares continuous-time approximations to stochastic gradient descent for optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Generalization of Stochastic Gradient Descent with Momentum&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher, Ashish Khisti, Ben Liang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the generalization properties of stochastic gradient descent with momentum.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Benjamin Gess, Sebastian Kassing, Vitalii Konarovskyi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies stochastic modified flows and mean-field limits for stochastic gradient descent dynamics.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Approximation with Decision-Dependent Distributions: Asymptotic Normality and Optimality&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Joshua Cutler, Mateo Díaz, Dmitriy Drusvyatskiy&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates stochastic approximation with decision-dependent distributions, focusing on asymptotic normality and optimality.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Guy Kornowski, Ohad Shamir&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes an algorithm with optimal dimension-dependence for zero-order nonsmooth nonconvex stochastic optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Hyperparameters in Stochastic Gradient Descent with Momentum&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Bin Shi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Examines the impact of hyperparameters in stochastic gradient descent with momentum.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Almost Sure Convergence Rates Analysis and Saddle Avoidance of Stochastic Gradient Methods&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jun Liu, Ye Yuan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes almost sure convergence rates and saddle avoidance in stochastic gradient methods.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zachary Frangella, Pratik Rathore, Shipu Zhao, Madeleine Udell&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces preconditioned stochastic optimization methods with scalable curvature estimates.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Zeroth-Order Stochastic Approximation Algorithms for DR-Submodular Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yuefang Lian, Xiao Wang, Dachuan Xu, Zhongrui Zhao&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops zeroth-order stochastic approximation algorithms for DR-submodular optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic-Constrained Stochastic Optimization with Markovian Data&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yeongjong Kim, Dabeen Lee&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies stochastic-constrained optimization with Markovian data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;High Probability and Risk-Averse Guarantees for a Stochastic Accelerated Primal-Dual Method&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yassine Laguel, Necdet Serhat Aybat, Mert Gürbüzbalaban&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides high-probability and risk-averse guarantees for a stochastic accelerated primal-dual method.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="distributeddecentralized-optimization"&gt;
 Distributed/Decentralized Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#distributeddecentralized-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing distributed or decentralized optimization algorithms, focusing on communication efficiency and federated learning.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distributed Gaussian Mean Estimation under Communication Constraints: Optimal Rates and Communication-Efficient Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: T. Tony Cai, Hongji Wei&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops optimal rates and communication-efficient algorithms for distributed Gaussian mean estimation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accelerated Gradient Tracking over Time-Varying Graphs for Decentralized Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Huan Li, Zhouchen Lin&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes accelerated gradient tracking for decentralized optimization over time-varying graphs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Compressed and Distributed Least-Squares Regression: Convergence Rates with Applications to Federated Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Constantin Philippenko, Aymeric Dieuleveut&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes convergence rates for compressed and distributed least-squares regression in federated learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Federated Automatic Differentiation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Keith Rush, Zachary Charles, Zachary Garrett&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces federated automatic differentiation for distributed optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Random Projection Approach to Personalized Federated Learning: Enhancing Communication Efficiency, Robustness, and Fairness&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yuze Han, Xiang Li, Shiyun Lin, Zhihua Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a random projection approach to enhance communication efficiency in personalized federated learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Countering the Communication Bottleneck in Federated Learning: A Highly Efficient Zero-Order Optimization Technique&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Elissa Mhanna, Mohamad Assaad&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a zero-order optimization technique to address communication bottlenecks in federated learning.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="bandits-and-online-learning"&gt;
 Bandits and Online Learning&lt;span class="heading__anchor"&gt; &lt;a href="#bandits-and-online-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing multi-armed bandits, online optimization, and regret minimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zixian Yang, Xin Liu, Lei Ying&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies exploration, exploitation, and engagement in multi-armed bandits with abandonment.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adaptivity and Non-Stationarity: Problem-Dependent Dynamic Regret for Online Convex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Peng Zhao, Yu-Jie Zhang, Lijun Zhang, Zhi-Hua Zhou&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes problem-dependent dynamic regret for online convex optimization under non-stationarity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Materials Discovery Using Max K-Armed Bandit&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Nobuaki Kikkawa, Hiroshi Ohno&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies max k-armed bandit algorithms to materials discovery, focusing on regret minimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Finite-Time Analysis of Globally Nonstationary Multi-Armed Bandits&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Junpei Komiyama, Edouard Fouché, Junya Honda&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides finite-time analysis for globally nonstationary multi-armed bandits.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Sijia Chen, Yu-Jie Zhang, Wei-Wei Tu, Peng Zhao, Lijun Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops optimistic online mirror descent for bridging stochastic and adversarial online convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Continuous Prediction with Experts&amp;rsquo; Advice&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Nicholas J. A. Harvey, Christopher Liaw, Victor S. Portella&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates continuous prediction with experts&amp;rsquo; advice in online learning settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Regret Analysis of Bilateral Trade with a Smoothed Adversary&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes regret in bilateral trade with a smoothed adversary in online optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal Learning Policies for Differential Privacy in Multi-Armed Bandits&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Siwei Wang, Jun Zhu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops optimal learning policies for differential privacy in multi-armed bandits.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Information Capacity Regret Bounds for Bandits with Mediator Feedback&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Khaled Eldowa, Nicolò Cesa-Bianchi, Alberto Maria Metelli, Marcello Restelli&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Derives regret bounds for bandits with mediator feedback, focusing on information capacity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Contextual Bandits with Packing and Covering Constraints: A Modular Lagrangian Approach via Regression&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Aleksandrs Slivkins, Xingyu Zhou, Karthik Abinav Sankararaman, Dylan J. Foster&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a modular Lagrangian approach for contextual bandits with packing and covering constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="optimization-in-reinforcement-learning"&gt;
 Optimization in Reinforcement Learning&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-in-reinforcement-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on optimization techniques for reinforcement learning, including policy gradient, actor-critic, and safe RL.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Shicong Cen, Yuting Wei, Yuejie Chi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops fast policy extragradient methods for competitive games with entropy regularization in RL.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sample-Efficient Adversarial Imitation Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dahuin Jung, Hyungyu Lee, Sungroh Yoon&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes sample-efficient adversarial imitation learning methods for RL optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Sample Complexity and Metastability of Heavy-Tailed Policy Search in Continuous Control&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes sample complexity and metastability for heavy-tailed policy search in continuous control.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Off-Policy Action Anticipation in Multi-Agent Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ariyan Bighashdel, Daan de Geus, Pavol Jancura, Gijs Dubbelman&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops off-policy action anticipation methods for multi-agent RL optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Policy Gradient Methods in the Presence of Symmetries and State Abstractions&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Prakash Panangaden, Sahand Rezaei-Shoshtari, Rosie Zhao, David Meger, Doina Precup&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates policy gradient methods with symmetries and state abstractions for RL optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Log Barriers for Safe Black-Box Optimization with Application to Safe Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ilnura Usmanova, Yarden As, Maryam Kamgarpour, Andreas Krause&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes log barriers for safe black-box optimization with applications to safe RL.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decentralized Natural Policy Gradient with Variance Reduction for Collaborative Multi-Agent Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jinchi Chen, Jie Feng, Weiguo Gao, Ke Wei&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops decentralized natural policy gradient with variance reduction for multi-agent RL.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distributionally Robust Model-Based Offline Reinforcement Learning with Near-Optimal Sample Complexity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Laixi Shi, Yuejie Chi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies distributionally robust model-based offline RL with near-optimal sample complexity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zhenghao Xu, Xiang Ji, Minshuo Chen, Mengdi Wang, Tuo Zhao&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes sample complexity of neural policy mirror descent for policy optimization on low-dimensional manifolds.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mean-Field Approximation of Cooperative Constrained Multi-Agent Reinforcement Learning (CMARL)&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Washim Uddin Mondal, Vaneet Aggarwal, Satish V. Ukkusuri&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes mean-field approximations for cooperative constrained multi-agent RL optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Luofeng Liao, Zuyue Fu, Zhuoran Yang, Yixin Wang, Dingli Ma, Mladen Kolar, Zhaoran Wang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops instrumental variable value iteration for causal offline RL optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: François G. Ged, Maria Han Veiga&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a Matryoshka policy gradient method for entropy-regularized RL with convergence guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Data-Efficient Policy Evaluation Through Behavior Policy Search&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Josiah P. Hanna, Yash Chandak, Philip S. Thomas, Martha White, Peter Stone, Scott Niekum&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes data-efficient policy evaluation methods for RL through behavior policy search.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Empirical Design in Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Andrew Patterson, Samuel Neumann, Martha White, Adam White&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Discusses how to design, run, and analyze sound empirical experiments in reinforcement learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A New, Physics-Informed Continuous-Time Reinforcement Learning Algorithm with Performance Guarantees&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Brent A. Wallace, Jennie Si&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a physics-informed continuous-time RL algorithm with performance guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="other-optimization-topics"&gt;
 Other Optimization Topics&lt;span class="heading__anchor"&gt; &lt;a href="#other-optimization-topics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers covering miscellaneous optimization topics, including optimal transport, bilevel optimization, and tensor recovery.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Efficient and Scalable Computation of the Nonparametric Maximum Likelihood Estimator in Mixture Models&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yangjing Zhang, Ying Cui, Bodhisattva Sen, Kim-Chuan Toh&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes efficient and scalable computation methods for nonparametric MLE in mixture models using optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tangential Wasserstein Projections&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Florian Gunsilius, Meng Hsuan Hsieh, Myung Jin Lee&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops tangential Wasserstein projections for optimization in optimal transport.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Win: Weight-Decay-Integrated Nesterov Acceleration for Faster Network Training&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Pan Zhou, Xingyu Xie, Zhouchen Lin, Kim-Chuan Toh, Shuicheng Yan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a weight-decay-integrated Nesterov acceleration method for faster network training.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal Algorithms for Stochastic Bilevel Optimization under Relaxed Smoothness Conditions&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xuxing Chen, Tesi Xiao, Krishnakumar Balasubramanian&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops optimal algorithms for stochastic bilevel optimization under relaxed smoothness conditions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Warm-Start Fixed-Point Optimization Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Rajiv Sambharya, Georgina Hall, Brandon Amos, Bartolomeo Stellato&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes learning-based warm-start techniques for fixed-point optimization algorithms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Wasserstein Proximal Coordinate Gradient Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Rentian Yao, Xiaohui Chen, Yun Yang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops Wasserstein proximal coordinate gradient algorithms for optimal transport optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Convergence of Projected Alternating Maximization for Equitable and Optimal Transport&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Minhui Huang, Shiqian Ma, Lifeng Lai&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes convergence of projected alternating maximization for equitable and optimal transport.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lower Complexity Adaptation for Empirical Entropic Optimal Transport&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Michel Groppe, Shayan Hundrieser&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes lower complexity adaptation methods for empirical entropic optimal transport.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accelerating Nuclear-Norm Regularized Low-Rank Matrix Optimization Through Burer-Monteiro Decomposition&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ching-pei Lee, Ling Liang, Tianyun Tang, Kim-Chuan Toh&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces accelerated nuclear-norm regularized low-rank matrix optimization using Burer-Monteiro decomposition.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zhen Qin, Michael B. Wakin, Zhihui Zhu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a guaranteed nonconvex factorization approach for tensor train recovery.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Pierre Ablin, Simon Vary, Bin Gao, Pierre-Antoine Absil&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes algorithms for optimization under orthogonality constraints, including deterministic, stochastic, and variance-reduction methods.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Ebooks &amp; related papers on Convex Optimizations</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/cvx-refs/</link><pubDate>Mon, 15 Jul 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/cvx-refs/</guid><description>&lt;h2 class="heading" id="ebooks"&gt;
 Ebooks&lt;span class="heading__anchor"&gt; &lt;a href="#ebooks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Boris Mordukhovich, Nguyen Mau Nam. &lt;a href="https://link.springer.com/book/10.1007/978-3-031-26458-0"&gt;An Easy Path to Convex Analysis and Applications&lt;/a&gt;. 2023&lt;/li&gt;
&lt;li&gt;Yurii Nesterov. &lt;a href="https://link.springer.com/book/10.1007/978-3-319-91578-4"&gt;Lectures on Convex Optimization&lt;/a&gt;. 2018&lt;/li&gt;
&lt;li&gt;Sébastien Bubeck. &lt;a href="https://arxiv.org/abs/1405.4980"&gt;Convex Optimization: Algorithms and Complexity&lt;/a&gt;. 2015&lt;/li&gt;
&lt;li&gt;Dimitri Bertsekas. &lt;a href="https://mcube.lab.nycu.edu.tw/~cfung/docs/books/bertsekas1999nonlinear_programming.pdf"&gt;Nonlinear Programming&lt;/a&gt;. 2016&lt;/li&gt;
&lt;li&gt;Boris Teodorovich Polyak. &lt;a href="https://www.researchgate.net/profile/Boris-Polyak-2/publication/342978480_Introduction_to_Optimization/links/5f1033e5299bf1e548ba4636/Introduction-to-Optimization.pdf"&gt;Introduction to Optimization&lt;/a&gt;. 1987&lt;/li&gt;
&lt;li&gt;R. T. Rockafellar. Convex Analysis. 1970&lt;/li&gt;
&lt;li&gt;H. H. Bauschke &amp;amp; P. L. Combettes. &lt;a href="https://link.springer.com/book/10.1007/978-3-319-48311-5"&gt;Convex Analysis and Monotone Operator Theory in Hilbert Spaces&lt;/a&gt;. 2011&lt;/li&gt;
&lt;li&gt;Stephen P. Boyd and Lieven Vandenberghe. &lt;a href="https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf"&gt;Convex Optimization&lt;/a&gt;. 2004&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="papers"&gt;
 Papers&lt;span class="heading__anchor"&gt; &lt;a href="#papers"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Yu. E. Nesterov. &lt;a href="https://hengshuaiyao.github.io/papers/nesterov83.pdf"&gt;A method of solving a convex programming problem with convergence rate $O(1/k^2)$&lt;/a&gt;. 1983&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Pre-print articles on Adagrad-variant methods</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/adagrad-variant/</link><pubDate>Mon, 15 Jul 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/adagrad-variant/</guid><description>&lt;h2 class="heading" id="1-heavy-tailed-class-imbalance-and-why-adam-outperforms-gradient-descent-on-language-models"&gt;
 1. Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models&lt;span class="heading__anchor"&gt; &lt;a href="#1-heavy-tailed-class-imbalance-and-why-adam-outperforms-gradient-descent-on-language-models"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Frederik Kunstner, Robin Yadav, Alan Milligan, Mark Schmidt, Alberto Bietti&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Adam has been shown to outperform gradient descent on large language models by a larger margin than on other tasks, but it is unclear why. We show that a key factor in this performance gap is the heavy-tailed class imbalance found in language tasks. When trained with gradient descent, the loss of infrequent words decreases more slowly than the loss of frequent ones. This leads to a slow decrease on the average loss as most samples come from infrequent words. On the other hand, Adam and sign-based methods are less sensitive to this problem. To establish that this behavior is caused by class imbalance, we show empirically that it can be reproduced across architectures and data types, on language transformers, vision CNNs, and linear models. On a linear model with cross-entropy loss, we show that class imbalance leads to imbalanced, correlated gradients and Hessians that have been hypothesized to benefit Adam. We also prove that, in continuous time, gradient descent converges slowly on low-frequency classes while sign descent does not.&lt;/p&gt;
&lt;h2 class="heading" id="2-accelerated-parameter-free-stochastic-optimization"&gt;
 2. Accelerated Parameter-Free Stochastic Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#2-accelerated-parameter-free-stochastic-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Itai Kreisler, Maor Ivgi, Oliver Hinder, Yair Carmon&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We propose a method that achieves near-optimal rates for smooth stochastic convex optimization and requires essentially no prior knowledge of problem parameters. This improves on prior work which requires knowing at least the initial distance to optimality $d_0$. Our method, U-DoG, combines UniXGrad (Kavis et al., 2019) and DoG (Ivgi et al., 2023) with novel iterate stabilization techniques. It requires only loose bounds on $d_0$ and the noise magnitude, provides high probability guarantees under sub-Gaussian noise, and is also near-optimal in the non-smooth case. Our experiments show consistent, strong performance on convex problems and mixed results on neural network training.&lt;/p&gt;
&lt;h2 class="heading" id="3-universal-gradient-methods-for-stochastic-convex-optimization"&gt;
 3. Universal Gradient Methods for Stochastic Convex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#3-universal-gradient-methods-for-stochastic-convex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Anton Rodomanov, Ali Kavis, Yongtao Wu, Kimon Antonakopoulos, Volkan Cevher&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We develop universal gradient methods for Stochastic Convex Optimization (SCO). Our algorithms automatically adapt not only to the oracle&amp;rsquo;s noise but also to the Hölder smoothness of the objective function without a priori knowledge of the particular setting. The key ingredient is a novel strategy for adjusting step-size coefficients in the Stochastic Gradient Method (SGD). Unlike AdaGrad, which accumulates gradient norms, our Universal Gradient Method accumulates appropriate combinations of gradient- and iterate differences. The resulting algorithm has state-of-the-art worst-case convergence rate guarantees for the entire Hölder class including, in particular, both nonsmooth functions and those with Lipschitz continuous gradient. We also present the Universal Fast Gradient Method for SCO enjoying optimal efficiency estimates.&lt;/p&gt;</description></item><item><title>Pre-print articles on Adaptive Optimization</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/adaptive-optimization/</link><pubDate>Mon, 15 Jul 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/adaptive-optimization/</guid><description>&lt;h2 class="heading" id="1-a-simple-uniformly-optimal-method-without-line-search-for-convex-optimization"&gt;
 1. &lt;a href="https://arxiv.org/pdf/2310.10082"&gt;A simple uniformly optimal method without line search for convex optimization&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#1-a-simple-uniformly-optimal-method-without-line-search-for-convex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Tianjiao Li, Guanghui Lan&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Line search (or backtracking) procedures have been widely employed into first-order methods for solving convex optimization problems, especially those with unknown problem parameters (e.g., Lipschitz constant). In this paper, we show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori. In particular, we present a novel accelerated gradient descent type algorithm called auto-conditioned fast gradient method (AC-FGM) that can achieve an optimal $\mathcal{O}(1/k^2)$ rate of convergence for smooth convex optimization without requiring the estimate of a global Lipschitz constant or the employment of line search procedures. We then extend AC-FGM to solve convex optimization problems with Hölder continuous gradients and show that it automatically achieves the optimal rates of convergence uniformly for all problem classes with the desired accuracy of the solution as the only input. Finally, we report some encouraging numerical results that demonstrate the advantages of AC-FGM over the previously developed parameter-free methods for convex optimization.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source code&lt;/strong&gt;: &lt;a href="https://github.com/tli432/AC-FGM-Implementation"&gt;https://github.com/tli432/AC-FGM-Implementation&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="2-adaptive-proximal-gradient-method-for-convex-optimization"&gt;
 2. &lt;a href="https://arxiv.org/pdf/2308.02261"&gt;Adaptive Proximal Gradient Method for Convex Optimization&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#2-adaptive-proximal-gradient-method-for-convex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Yura Malitsky, Konstantin Mishchenko&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we explore two fundamental first-order algorithms in convex optimization, namely, gradient descent (GD) and proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions. We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and, thus, have no added computational costs. Moreover, we prove convergence of our methods assuming only local Lipschitzness of the gradient. In addition, the proposed versions allow for even larger stepsizes than those initially suggested in [MM20].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source code&lt;/strong&gt;: &lt;a href="https://github.com/ymalitsky/AdProxGD"&gt;https://github.com/ymalitsky/AdProxGD&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="3-an-adaptive-stochastic-gradient-method-with-non-negative-gauss-newton-stepsizes"&gt;
 3. &lt;a href="https://arxiv.org/pdf/2407.04358"&gt;An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#3-an-adaptive-stochastic-gradient-method-with-non-negative-gauss-newton-stepsizes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Antonio Orvieto, Lin Xiao&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We consider the problem of minimizing the average of a large number of smooth but possibly non-convex functions. In the context of most machine learning applications, each loss function is non-negative and thus can be expressed as the composition of a square and its real-valued square root. This reformulation allows us to apply the Gauss-Newton method, or the Levenberg-Marquardt method when adding a quadratic regularization. The resulting algorithm, while being computationally as efficient as the vanilla stochastic gradient method, is highly adaptive and can automatically warmup and decay the effective stepsize while tracking the non-negative loss landscape. We provide a tight convergence analysis, leveraging new techniques, in the stochastic convex and non-convex settings. In particular, in the convex case, the method does not require access to the gradient Lipschitz constant for convergence, and is guaranteed to never diverge. The convergence rates and empirical evaluations compare favorably to the classical (stochastic) gradient method as well as to several other adaptive methods.&lt;/p&gt;
&lt;h2 class="heading" id="4-stochastic-polyak-step-sizes-and-momentum-convergence-guarantees-and-practical-performance"&gt;
 4. Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance&lt;span class="heading__anchor"&gt; &lt;a href="#4-stochastic-polyak-step-sizes-and-momentum-convergence-guarantees-and-practical-performance"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Dimitris Oikonomou, Nicolas Loizou&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Stochastic gradient descent with momentum, also known as Stochastic Heavy Ball method (SHB), is one of the most popular algorithms for solving large-scale stochastic optimization problems in various machine learning tasks. In practical scenarios, tuning the step-size and momentum parameters of the method is a prohibitively expensive and time-consuming process. In this work, inspired by the recent advantages of stochastic Polyak step-size in the performance of stochastic gradient descent (SGD), we propose and explore new Polyak-type variants suitable for the update rule of the SHB method. In particular, using the Iterate Moving Average (IMA) viewpoint of SHB, we propose and analyze three novel step-size selections: $\text{MomSPS} _{\max}$, $\text{MomDecSPS}$, and $\text{MomAdaSPS}$. For $\text{MomSPS} _{\max}$, we provide convergence guarantees for SHB to a neighborhood of the solution for convex and smooth problems (without assuming interpolation). If interpolation is also satisfied, then using $\text{MomSPS} _{\max}$, SHB converges to the true solution at a fast rate matching the deterministic HB. The other two variants, MomDecSPS and MomAdaSPS, are the first adaptive step-size for SHB that guarantee convergence to the exact minimizer - without a priori knowledge of the problem parameters and without assuming interpolation. Our convergence analysis of SHB is tight and obtains the convergence guarantees of stochastic Polyak step-size for SGD as a special case. We supplement our analysis with experiments validating our theory and demonstrating the effectiveness and robustness of our algorithms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Where&lt;/strong&gt;: 13th International Conference on Learning Representations (ICLR 2025)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OpenReview&lt;/strong&gt;: &lt;a href="https://openreview.net/forum?id=nuX2yPejiL"&gt;https://openreview.net/forum?id=nuX2yPejiL&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Pre-print articles on gradient-clipping methods</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/gradient-clipping/</link><pubDate>Mon, 15 Jul 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/gradient-clipping/</guid><description>&lt;h2 class="heading" id="1-why-gradient-clipping-accelerates-training-a-theoretical-justification-for-adaptivity"&gt;
 1. &lt;a href="https://arxiv.org/pdf/1905.11881"&gt;Why gradient clipping accelerates training: A theoretical justification for adaptivity&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#1-why-gradient-clipping-accelerates-training-a-theoretical-justification-for-adaptivity"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Jingzhao Zhang, Tianxing He, Suvrit Sra, Ali Jadbabaie&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We provide a theoretical explanation for the effectiveness of gradient clipping in training deep neural networks. The key ingredient is a new smoothness condition derived from practical neural network training examples. We observe that gradient smoothness, a concept central to the analysis of first-order optimization algorithms that is often assumed to be a constant, demonstrates significant variability along the training trajectory of deep neural networks. Further, this smoothness positively correlates with the gradient norm, and contrary to standard assumptions in the literature, it can grow with the norm of the gradient. These empirical observations limit the applicability of existing theoretical analyses of algorithms that rely on a fixed bound on smoothness. These observations motivate us to introduce a novel relaxation of gradient smoothness that is weaker than the commonly used Lipschitz smoothness assumption. Under the new condition, we prove that two popular methods, namely, &lt;em&gt;gradient clipping&lt;/em&gt; and &lt;em&gt;normalized gradient&lt;/em&gt;, converge arbitrarily faster than gradient descent with fixed stepsize. We further explain why such adaptively scaled gradient methods can accelerate empirical convergence and verify our results empirically in popular neural network training settings.&lt;/p&gt;
&lt;h2 class="heading" id="2-revisiting-gradient-clipping-stochastic-bias-and-tight-convergence-guarantees"&gt;
 2. &lt;a href="https://arxiv.org/pdf/2305.01588"&gt;Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#2-revisiting-gradient-clipping-stochastic-bias-and-tight-convergence-guarantees"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Anastasia Koloskova, Hadrien Hendrikx, Sebastian U. Stich&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Gradient clipping is a popular modification to standard (stochastic) gradient descent, at every iteration limiting the gradient norm to a certain value $c &amp;gt; 0$. It is widely used for example for stabilizing the training of deep learning models (Goodfellow et al., 2016), or for enforcing differential privacy (Abadi et al., 2016). Despite popularity and simplicity of the clipping mechanism, its convergence guarantees often require specific values of $c$ and strong noise assumptions.&lt;/p&gt;
&lt;p&gt;In this paper, we give convergence guarantees that show precise dependence on arbitrary clipping thresholds $c$ and show that our guarantees are tight with both deterministic and stochastic gradients. In particular, we show that (i) for deterministic gradient descent, the clipping threshold only affects the higher-order terms of convergence, (ii) in the stochastic setting convergence to the true optimum cannot be guaranteed under the standard noise assumption, even under arbitrary small step-sizes. We give matching upper and lower bounds for convergence of the gradient norm when running clipped SGD, and illustrate these results with experiments.&lt;/p&gt;
&lt;h2 class="heading" id="3-clipping-improves-adam-norm-and-adagrad-norm-when-the-noise-is-heavy-tailed"&gt;
 3. &lt;a href="https://arxiv.org/pdf/2406.04443"&gt;Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#3-clipping-improves-adam-norm-and-adagrad-norm-when-the-noise-is-heavy-tailed"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Savelii Chezhegov, Yaroslav Klyukin, Andrei Semenov, Aleksandr Beznosikov, Alexander Gasnikov, Samuel Horváth, Martin Takáč, Eduard Gorbunov&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Methods with adaptive stepsizes, such as AdaGrad and Adam, are essential for training modern Deep Learning models, especially Large Language Models. Typically, the noise in the stochastic gradients is heavy-tailed for the latter. Gradient clipping provably helps to achieve good high-probability convergence for such noises. However, despite the similarity between AdaGrad/Adam and Clip-SGD, the current understanding of the high-probability convergence of AdaGrad/Adam-type methods is limited in this case. In this work, we prove that AdaGrad/Adam (and their delayed version) can have provably bad high-probability convergence if the noise is heavy-tailed. We also show that gradient clipping fixes this issue, i.e., we derive new high-probability convergence bounds with polylogarithmic dependence on the confidence level for AdaGrad-Norm and Adam-Norm with clipping and with/without delay for smooth convex/non-convex stochastic optimization with heavy-tailed noise. Our empirical evaluations highlight the superiority of clipped versions of AdaGrad/Adam-Norm in handling the heavy-tailed noise.&lt;/p&gt;</description></item><item><title>About</title><link>https://blog.namln.org/en/about/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/en/about/</guid><description>&lt;p&gt;My full name is Lê Nhựt Nam. I completed two distinct master&amp;rsquo;s degrees, in Computer Science and in Applied Mathematics, in 2024 and 2025, respectively, at the &lt;a href="https://en.hcmus.edu.vn/"&gt;University of Science&lt;/a&gt;, &lt;a href="https://vnuhcm.edu.vn/"&gt;Vietnam National University, HCMC&lt;/a&gt;. Prior to my graduate studies, I earned a Bachelor of Science in Computer Science at the same institution. My main area of interest is optimization, especially algorithms and their applications.
I also enjoy reading books on partial differential equations.&lt;/p&gt;</description></item><item><title>Free Books on Dynamical Systems</title><link>https://blog.namln.org/en/mathematics/analysis/dynamical-systems/books/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/dynamical-systems/books/</guid><description>&lt;h2 class="heading" id="arxiv-free-books"&gt;
 arXiv / Free Books&lt;span class="heading__anchor"&gt; &lt;a href="#arxiv-free-books"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-lectures-on-neural-dynamics---francesco-bullo"&gt;
 1. &lt;a href="https://fbullo.github.io/lnd/"&gt;Lectures on Neural Dynamics&lt;/a&gt; - Francesco Bullo&lt;span class="heading__anchor"&gt; &lt;a href="#1-lectures-on-neural-dynamics---francesco-bullo"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Chapter 1: Neural circuit models based on firing rates and Hopfield networks: their dynamics, interconnections, and local Hebbian adaptation rules&lt;/li&gt;
&lt;li&gt;Chapter 2: Stability in dynamic neural networks using Lyapunov methods, multistability, and energy functions&lt;/li&gt;
&lt;li&gt;Chapter 3: Optimization in neural networks through biologically inspired gradient dynamics and sparse representations&lt;/li&gt;
&lt;li&gt;Chapter 4: Unsupervised learning via neural dynamics, linking Hebbian rules to tasks like PCA, clustering, and similarity-based representation learning&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="2-linear-geometry-and-algebra---taras-banakh"&gt;
 2. &lt;a href="https://arxiv.org/abs/2506.14060"&gt;Linear Geometry and Algebra&lt;/a&gt; - Taras Banakh&lt;span class="heading__anchor"&gt; &lt;a href="#2-linear-geometry-and-algebra---taras-banakh"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: Linear Geometry studies geometric properties which can be expressed via the notion of a line. All information about lines is encoded in a ternary relation called a line relation. A set endowed with a line relation is called a liner. So, Linear Geometry studies liners. Imposing some additional axioms on a liner, we obtain some special classes of liners: regular, projective, affine, proaffine, etc. Linear Geometry includes Affine and Projective Geometries and is a part of Incidence Geometry. The aim of this book is to present a self-contained logical development of Linear Geometry, starting with some intuitively acceptable geometric axioms and ending with algebraic structures that necessarily arise from studying the structure of geometric objects that satisfy those simple and intuitive geometric axioms. We shall meet many quite exotic algebraic structures that arise this way: magmas, loops, ternary rings, quasi-fields, alternative rings, procorps, profields, etc. We strongly prefer (synthetic) geometric proofs and use tools of analytic geometry only when no purely geometric proof is available. Linear Geometry has been developed by many great mathematicians since the times of Antiquity (Thales, Euclides, Proclus, Pappus), through the Renaissance (Descartes, Desargues), Early Modernity (Playfair, Gauss, Lobachevski, Bolyai, Poncelet, Steiner, Möbius), and Late Modernity (Steinitz, Klein, Hilbert, Moufang, Hessenberg, Jordan, Beltrami, Fano, Gallucci, Veblen, Wedderburn, Lenz, Barlotti) to our contemporaries (Hartshorne, Hall, Buekenhout, Gleason, Kantor, Doyen, Hubault, Dembowski, Klingenberg, Grundhöfer).&lt;/p&gt;
&lt;h3 class="heading" id="3-an-introduction-to-graph-theory---darij-grinberg"&gt;
 3. &lt;a href="https://arxiv.org/abs/2308.04512"&gt;An introduction to graph theory&lt;/a&gt; - Darij Grinberg&lt;span class="heading__anchor"&gt; &lt;a href="#3-an-introduction-to-graph-theory---darij-grinberg"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This is a graduate-level introduction to graph theory, corresponding to a quarter-long course. It covers simple graphs, multigraphs as well as their directed analogues, and more restrictive classes such as tournaments, trees and arborescences. Among the features discussed are Eulerian circuits, Hamiltonian cycles, spanning trees, the matrix-tree and BEST theorems, proper colorings, Turan&amp;rsquo;s theorem, bipartite matching and the Menger and Gallai&amp;ndash;Milgram theorems. The basics of network flows are introduced in order to prove Hall&amp;rsquo;s marriage theorem.&lt;/p&gt;
&lt;h3 class="heading" id="4-an-introduction-to-reservoir-computing---michael-te-vrugt"&gt;
 4. &lt;a href="https://arxiv.org/abs/2412.13212"&gt;An introduction to reservoir computing&lt;/a&gt; - Michael te Vrugt&lt;span class="heading__anchor"&gt; &lt;a href="#4-an-introduction-to-reservoir-computing---michael-te-vrugt"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: There is a growing interest in the development of artificial neural networks that are implemented in a physical system. A major challenge in this context is that these networks are difficult to train since training here would require a change of physical parameters rather than simply of coefficients in a computer program. For this reason, reservoir computing, where one employs high-dimensional recurrent networks and trains only the final layer, is widely used in this context. In this chapter, I introduce the basic concepts of reservoir computing. Moreover, I present some important physical implementations coming from electronics, photonics, spintronics, mechanics, and biology. Finally, I provide a brief discussion of quantum reservoir computing.&lt;/p&gt;
&lt;h3 class="heading" id="5-nonequilibrium-and-irreversibility---giovanni-gallavotti"&gt;
 5. &lt;a href="https://arxiv.org/abs/2501.12426"&gt;Nonequilibrium and Irreversibility&lt;/a&gt; - Giovanni Gallavotti&lt;span class="heading__anchor"&gt; &lt;a href="#5-nonequilibrium-and-irreversibility---giovanni-gallavotti"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: The work concentrates on relations, which are general and model independent in chaotic systems, between time averages of a few (typically &lt;em&gt;very few&lt;/em&gt;) observables. Equilibrium thermodynamics provides a guide, and an attempt is made here to argue that the viewpoint of Sinai-Ruelle-Bowen can be regarded as a generalization to nonequilibrium phenomena of the theory of the ensembles, proposing an answer to classical questions like which distributions describe the statistics of stationary states (hence extend the analysis selecting canonical, or equivalent distributions, equilibrium between the uncountably many possibilities). The special name &amp;ldquo;Chaotic Hypothesis&amp;rdquo; (CH) is given to the above attempt and its mathematical meaning is discussed. General properties are presented and applied (e.g. &amp;lsquo;Fluctuation Theorem&amp;rsquo;, &amp;lsquo;Fluctuation Patterns&amp;rsquo;, &amp;lsquo;Pairing Symmetry&amp;rsquo;) and related to the basic Time Reversal symmetry, which presents irreversibility as due to chaotic motion rather than to viscous forces. The case of a simple incompressible fluid is discussed in some detail. The possibility that CH is violated in various cases is considered, and in the end it is suggested that CH is the paradigm of chaotic evolution, as the harmonic oscillators are a paradigm of ordered motions, but of course &lt;em&gt;tertium datur&lt;/em&gt;. The exposition is informal and often restricted to heuristic analysis, with detailed references to the literature and attention to numerical simulations and importance of stressing strongly the discrete models of Physics, trying to imitate the vision of Boltzmann, is widely considered.&lt;/p&gt;
&lt;h3 class="heading" id="6-symmetries-of-living-systems-symmetry-fibrations-and-synchronization-in-biological-networks---hernan-a-makse-paolo-boldi-francesco-sorrentino-ian-stewart"&gt;
 6. &lt;a href="https://arxiv.org/abs/2502.18713"&gt;Symmetries of Living Systems: Symmetry Fibrations and Synchronization in Biological Networks&lt;/a&gt; - Hernan A. Makse, Paolo Boldi, Francesco Sorrentino, Ian Stewart&lt;span class="heading__anchor"&gt; &lt;a href="#6-symmetries-of-living-systems-symmetry-fibrations-and-synchronization-in-biological-networks---hernan-a-makse-paolo-boldi-francesco-sorrentino-ian-stewart"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: A symmetry is a &amp;lsquo;change without change&amp;rsquo;. As simple as it sounds, this concept is the fundamental cornerstone that unifies all branches of theoretical physics. Virtually all physical laws &amp;ndash; ranging from classical mechanics and electrodynamics to relativity, quantum mechanics, and the standard model &amp;ndash; can be expressed in terms of symmetry invariances. In this book, we explore whether the same principle can also explain the emergent laws of biological systems. We introduce a new geometry for biological networks and AI architectures, drawing inspiration from the mystic genius of Grothendieck&amp;rsquo;s fibrations in category theory. We attempt to bridge the gap between physics and biology using symmetries but with a twist. The traditional symmetry groups of physics are global and too rigid to describe biology. Instead, the novel notion of symmetry fibration is local, flexible, and adaptable to evolutionary pressures, providing the right framework for understanding biological complexity. In other words, this more general symmetry invariance is necessary and sufficient to ensure that a given biological network configuration can support a synchronized function. In this book, we review the theoretical progress over the last decades from mathematics, physics, computer science, dynamical systems, and graph theory that has led to the discovery of symmetry fibrations in biological networks. These symmetries act as organizing principles for biological networks. They serve as effective tools for describing the structure of these networks, blending geometry and topology. Fibrations explain how structure dictates function across various biological domains, including the transcriptome, proteome, metabolome, and connectome. Additionally, they facilitate a reduction in the dimensionality of the network, simplifying it into its fundamental building blocks for biological computation.&lt;/p&gt;
&lt;h3 class="heading" id="7-causal-fermion-systems-an-introduction-to-fundamental-structures-methods-and-applications---felix-finster-sebastian-kindermann-jan-hendrik-treude"&gt;
 7. &lt;a href="https://arxiv.org/abs/2411.06450"&gt;Causal Fermion Systems: An Introduction to Fundamental Structures, Methods and Applications&lt;/a&gt; - Felix Finster, Sebastian Kindermann, Jan-Hendrik Treude&lt;span class="heading__anchor"&gt; &lt;a href="#7-causal-fermion-systems-an-introduction-to-fundamental-structures-methods-and-applications---felix-finster-sebastian-kindermann-jan-hendrik-treude"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This textbook introduces the basic concepts of the theory of causal fermion systems, a recent approach to the description of fundamental physics. The theory yields quantum mechanics, general relativity and quantum field theory as limiting cases and is therefore a candidate for a unified physical theory. From the mathematical perspective, causal fermion systems provide a general framework for describing and analyzing non-smooth geometries and &amp;ldquo;quantum geometries.&amp;rdquo; The dynamics is described by a novel variational principle, the causal action principle. The book includes a detailed summary of the mathematical and physical preliminaries. It explains the physical concepts behind the causal fermion system approach from the basics. Moreover, all the mathematical objects and structures are introduced step by step. The mathematical methods used for the analysis of causal fermion systems and the causal action principle are introduced in depth. Many examples and applications are worked out. The textbook is addressed to master and graduate students in mathematics or physics. Furthermore, it serves as a reference work for researchers working in the field.&lt;/p&gt;
&lt;h3 class="heading" id="8-a-gentle-invitation-to-the-fractional-world---nicola-abatangelo-serena-dipierro-enrico-valdinoci"&gt;
 8. &lt;a href="https://arxiv.org/abs/2411.18238"&gt;A gentle invitation to the fractional world&lt;/a&gt; - Nicola Abatangelo, Serena Dipierro, Enrico Valdinoci&lt;span class="heading__anchor"&gt; &lt;a href="#8-a-gentle-invitation-to-the-fractional-world---nicola-abatangelo-serena-dipierro-enrico-valdinoci"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This book is intended as a self-contained introduction to selected topics in the fractional world, focusing particularly on aspects that arise in the study of equations driven by the fractional Laplacian. The scope of this work is not intended to be exhaustive or all-encompassing. We have chosen topics that we believe will appeal to readers embarking on their journey into fractional analysis. It requires only fundamental calculus and a basic understanding of measure theory. In Chapter 1, we introduce the primary object of study, the fractional Laplacian. This operator appears in diverse contexts, prompting multiple definitions and viewpoints, many of which we explore, along with some key identities. A notable distinction between local and nonlocal analysis is that in the latter, explicit calculations are often impractical or impossible. There are anyway some fortunate exceptions which are gathered in Chapter 2, providing useful and instructive examples. Chapter 3 presents an introduction to the important aspect of Liouville-type results. A large portion of this book is devoted to the regularity theory of solutions in Lebesgue spaces. Chapter 4 examines global solutions using Riesz and Bessel potential analysis, capturing the impact of both low and high frequencies on smoothness, decay, and oscillations. These spaces are also flexible enough to provide, as a byproduct, a solid regularity theory in the more commonly used fractional Sobolev spaces. In Chapter 5 we derive the corresponding interior regularity theory for solutions within a bounded domain using appropriate cutoffs and localization techniques. Additionally, technical appendices include auxiliary results used in key proofs.&lt;/p&gt;
&lt;h3 class="heading" id="9-kinetically-constrained-models---ivailo-hartarsky-cristina-toninelli"&gt;
 9. &lt;a href="https://arxiv.org/abs/2412.13634"&gt;Kinetically constrained models&lt;/a&gt; - Ivailo Hartarsky, Cristina Toninelli&lt;span class="heading__anchor"&gt; &lt;a href="#9-kinetically-constrained-models---ivailo-hartarsky-cristina-toninelli"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: The goal of this book is to provide an introduction to the mathematical theory of Kinetically constrained models developed in the last twenty years, intended for both mathematicians and physicists.&lt;/p&gt;
&lt;h3 class="heading" id="10-what-is-entropy---john-c-baez"&gt;
 10. &lt;a href="https://arxiv.org/abs/2409.09232"&gt;What is Entropy?&lt;/a&gt; - John C. Baez&lt;span class="heading__anchor"&gt; &lt;a href="#10-what-is-entropy---john-c-baez"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This short book is an elementary course on entropy, leading up to a calculation of the entropy of hydrogen gas at standard temperature and pressure. Topics covered include information, Shannon entropy and Gibbs entropy, the principle of maximum entropy, the Boltzmann distribution, temperature and coolness, the relation between entropy, expected energy and temperature, the equipartition theorem, the partition function, the relation between expected energy, free energy and entropy, the entropy of a classical harmonic oscillator, the entropy of a classical particle in a box, and the entropy of a classical ideal gas.&lt;/p&gt;
&lt;h3 class="heading" id="11-alice---simone-scardapane"&gt;
 11. &lt;a href="https://arxiv.org/abs/2404.17625"&gt;Alice&amp;rsquo;s Adventures in a Differentiable Wonderland &amp;ndash; Volume I, A Tour of the Land&lt;/a&gt; - &lt;a href="https://www.sscardapane.it/alice-book"&gt;Simone Scardapane&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#11-alice---simone-scardapane"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: Neural networks surround us, in the form of large language models, speech transcription systems, molecular discovery algorithms, robotics, and much more. Stripped of anything else, neural networks are compositions of differentiable primitives, and studying them means learning how to program and how to interact with these models, a particular example of what is called differentiable programming. This primer is an introduction to this fascinating field imagined for someone, like Alice, who has just ventured into this strange differentiable wonderland. I overview the basics of optimizing a function via automatic differentiation, and a selection of the most common designs for handling sequences, graphs, texts, and audios. The focus is on an intuitive, self-contained introduction to the most important design techniques, including convolutional, attentional, and recurrent blocks, hoping to bridge the gap between theory and code (PyTorch and JAX) and leaving the reader capable of understanding some of the most advanced models out there, such as large language models (LLMs) and multimodal architectures.&lt;/p&gt;
&lt;h3 class="heading" id="12-inverse-problems-and-data-assimilation-a-machine-learning-approach---eviatar-bach-ricardo-baptista-daniel-sanz-alonso-andrew-stuart"&gt;
 12. &lt;a href="https://arxiv.org/abs/2410.10523"&gt;Inverse Problems and Data Assimilation: A Machine Learning Approach&lt;/a&gt; - Eviatar Bach, Ricardo Baptista, Daniel Sanz-Alonso, Andrew Stuart&lt;span class="heading__anchor"&gt; &lt;a href="#12-inverse-problems-and-data-assimilation-a-machine-learning-approach---eviatar-bach-ricardo-baptista-daniel-sanz-alonso-andrew-stuart"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: The aim of these notes is to demonstrate the potential for ideas in machine learning to impact on the fields of inverse problems and data assimilation. The perspective is one that is primarily aimed at researchers from inverse problems and/or data assimilation who wish to see a mathematical presentation of machine learning as it pertains to their fields. As a by-product, we include a succinct mathematical treatment of various topics in machine learning.&lt;/p&gt;
&lt;h3 class="heading" id="13-the-lanczos-algorithm-for-matrix-functions-a-handbook-for-scientists---tyler-chen"&gt;
 13. &lt;a href="https://arxiv.org/abs/2410.11090"&gt;The Lanczos algorithm for matrix functions: a handbook for scientists&lt;/a&gt; - Tyler Chen&lt;span class="heading__anchor"&gt; &lt;a href="#13-the-lanczos-algorithm-for-matrix-functions-a-handbook-for-scientists---tyler-chen"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: Lanczos-based methods have become standard tools for tasks involving matrix functions. Progress on these algorithms has been driven by several largely disjoint communities, resulting in many innovative and important advancements which would not have been possible otherwise. However, this also has resulted in a somewhat fragmented state of knowledge and the propagation of a number of incorrect beliefs about the behavior of Lanczos-based methods in finite precision arithmetic. This monograph aims to provide an accessible introduction to Lanczos-based methods for matrix functions. The intended audience is scientists outside of numerical analysis, graduate students, and researchers wishing to begin work in this area. Our emphasis is on conceptual understanding, with the goal of providing a starting point to learn more about the remarkable behavior of the Lanczos algorithm. Hopefully readers will come away from this text with a better understanding of how to think about Lanczos for modern problems involving matrix functions, particularly in the context of finite precision arithmetic.&lt;/p&gt;
&lt;h3 class="heading" id="14-new-book-tensor-decompositions-for-data-science"&gt;
 14. &lt;a href="https://www.mathsci.ai/post/tensor-textbook/"&gt;New Book: Tensor Decompositions for Data Science&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#14-new-book-tensor-decompositions-for-data-science"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This book is intended for a graduate-level course in a data-science domain such as mathematics, computer science, engineering, statistics, physics, neuroscience, etc. It is written so that it can be used flexibly. It can be adapted for a subunit in a longer class or can stand on its own in a full semester course. We include substantial background material in linear algebra, optimization, and probability and statistics in the hopes of making the contents widely accessible. The book includes links to several real-world datasets to be used as examples for experiments in the book, grounding the material and providing a playground for student experimentation.&lt;/p&gt;
&lt;h3 class="heading" id="15-calculus-and-applications---teo-banica"&gt;
 15. &lt;a href="https://arxiv.org/abs/2401.00911"&gt;Calculus and applications&lt;/a&gt; - Teo Banica&lt;span class="heading__anchor"&gt; &lt;a href="#15-calculus-and-applications---teo-banica"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This is an introduction to calculus, and its applications to basic questions from physics. We first discuss the theory of functions $f:\mathbb R\to\mathbb R$, with the notion of continuity, and the construction of the derivative $f&amp;rsquo;(x)$ and of the integral $\int_a^bf(x)dx$. Then we investigate the case of the complex functions $f:\mathbb C\to\mathbb C$, and notably the holomorphic functions, and harmonic functions. Then, we discuss the multivariable functions, $f:\mathbb R^N\to\mathbb R^M$ or $f:\mathbb R^N\to\mathbb C^M$ or $f:\mathbb C^N\to\mathbb C^M$, with general theory, integration results, maximization questions, and basic applications to physics.&lt;/p&gt;
&lt;h3 class="heading" id="16-stochastic-partial-differential-equations-space-time-white-noise-and-random-fields---robert-c-dalang-marta-sanz-solé"&gt;
 16. &lt;a href="https://arxiv.org/abs/2402.02119"&gt;Stochastic Partial Differential Equations, Space-time White Noise and Random Fields&lt;/a&gt; - Robert C. Dalang, Marta Sanz-Solé&lt;span class="heading__anchor"&gt; &lt;a href="#16-stochastic-partial-differential-equations-space-time-white-noise-and-random-fields---robert-c-dalang-marta-sanz-sol%c3%a9"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This book is an introduction to the theory of stochastic partial differential equations (SPDEs), using the random field approach pioneered by J.B. Walsh (1986). The volume consists of two blocks: the core matter (Chapters 1 to 5) and the appendices (A, B and C). Chapter 1 introduces the subject, with a discussion of isonormal Gaussian processes, space-time white noise, and motivating examples of SPDEs. Chapter 2 presents a theory of stochastic integration with respect to space-time white noise. Chapter 3 deals with SPDEs with additive noise. In Chapter 4, we study a general class of SPDEs, in which additive and multiplicative nonlinearities appear. In Chapter 5, we present a selection of important topics in the theory of SPDEs, that have been the subject of much research over the last twenty years. Appendix A summarises the main results from the theory of stochastic processes and stochastic analysis that are used throughout the book. Appendix B is devoted to a systematic presentation of properties of fundamental solutions and Green&amp;rsquo;s functions associated to the classical linear differential operators (heat, fractional heat and wave operators). Appendix C is a toolbox section. Each chapter is followed by a &amp;ldquo;Notes&amp;rdquo; section, which gives historically important references, original sources and points towards other related important contributions.&lt;/p&gt;
&lt;h3 class="heading" id="17-dynamic-programming-finite-states---thomas-j-sargent-john-stachurski"&gt;
 17. &lt;a href="https://arxiv.org/abs/2401.10473"&gt;Dynamic Programming: Finite States&lt;/a&gt; - Thomas J. Sargent, John Stachurski&lt;span class="heading__anchor"&gt; &lt;a href="#17-dynamic-programming-finite-states---thomas-j-sargent-john-stachurski"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This book is about dynamic programming and its applications in economics, finance, and adjacent fields. It brings together recent innovations in the theory of dynamic programming and provides applications and code that can help readers approach the research frontier. The book is aimed at graduate students and researchers, although most chapters are accessible to undergraduate students with solid quantitative backgrounds.&lt;/p&gt;
&lt;h3 class="heading" id="18-resources-of-the-quantum-world---gilad-gour"&gt;
 18. &lt;a href="https://arxiv.org/abs/2402.05474"&gt;Resources of the Quantum World&lt;/a&gt; - Gilad Gour&lt;span class="heading__anchor"&gt; &lt;a href="#18-resources-of-the-quantum-world---gilad-gour"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This book delves into the burgeoning field of quantum resource theories, a novel and vibrant area of research within quantum information science that seeks to unify diverse quantum phenomena under a single framework. By recognizing various attributes of physical systems as &amp;ldquo;resources,&amp;rdquo; this approach offers a fresh perspective on quantum phenomena, transforming our understanding and application of concepts such as quantum entanglement, coherence, and more. With a focus on the pedagogical, the book aims to equip readers with the advanced mathematical tools and physical principles needed to navigate and contribute to this rapidly evolving field. It covers a wide range of topics, from the foundational aspects of quantum mechanics and quantum information to detailed explorations of specific resource theories, including entanglement, asymmetry, and thermodynamics. Through rigorous mathematical exposition and a unique axiomatic approach, the book provides deep insights into the operational and conceptual frameworks that underpin quantum resource theories, making it an invaluable resource for graduate students, early-career researchers, and anyone interested in the cutting-edge developments in quantum information science.&lt;/p&gt;
&lt;h3 class="heading" id="19-funktionalanalysis-teil-i---christoph-bock"&gt;
 19. &lt;a href="https://arxiv.org/abs/2402.12981"&gt;Funktionalanalysis Teil I&lt;/a&gt; - Christoph Bock&lt;span class="heading__anchor"&gt; &lt;a href="#19-funktionalanalysis-teil-i---christoph-bock"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: Roughly speaking, functional analysis means the study of the category of infinite-dimensional vector spaces over the field of real or complex numbers, together with their linear maps. In most cases, one further needs a topological structure on such a vector space, because then one can consider the continuous linear maps between such spaces. The name functional analysis is due to the fact that, in the beginning of the theory, the authors wanted to extend calculus to functionals on spaces of functions. Functional-analytic results make it possible to solve problems in the theory of (partial) differential equations, in complex analysis, or in quantum mechanics. But the aim of these lines is not to explain the applications. We will discuss the mathematical theory of almost metric spaces, normed vector spaces and algebras, spaces of continuous resp. $p$-integrable functions as well as reflexive and uniformly convex spaces.&lt;/p&gt;
&lt;p&gt;We added that, in the case $p \in {]}0,1{[}$, $L^p$ is the completion of the compactly supported continuous functions (with the obvious metric), too. Actually, the proof is the same as in the case $p \in {[}1, \infty{[}$.&lt;/p&gt;
&lt;h3 class="heading" id="20-algebraic-topology-for-data-scientists---michael-s-postol"&gt;
 20. &lt;a href="https://arxiv.org/abs/2308.10825"&gt;Algebraic Topology for Data Scientists&lt;/a&gt; - Michael S. Postol&lt;span class="heading__anchor"&gt; &lt;a href="#20-algebraic-topology-for-data-scientists---michael-s-postol"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: This book gives a thorough introduction to topological data analysis (TDA), the application of algebraic topology to data science. Algebraic topology is traditionally a very specialized field of math, and most mathematicians have never been exposed to it, let alone data scientists, computer scientists, and analysts. I have three goals in writing this book. The first is to bring people up to speed who are missing a lot of the necessary background. I will describe the topics in point-set topology, abstract algebra, and homology theory needed for a good understanding of TDA. The second is to explain TDA and some current applications and techniques. Finally, I would like to answer some questions about more advanced topics such as cohomology, homotopy, obstruction theory, and Steenrod squares, and what they can tell us about data. It is hoped that readers will acquire the tools to start to think about these topics and where they might fit in.&lt;/p&gt;
&lt;h3 class="heading" id="21-discrete-and-continuous-weak-kam-theory-an-introduction-through-examples-and-its-applications-to-twist-maps---maxime-zavidovique"&gt;
 21. &lt;a href="https://arxiv.org/abs/2308.06356"&gt;Discrete and Continuous Weak KAM Theory: an introduction through examples and its applications to twist maps&lt;/a&gt; - Maxime Zavidovique&lt;span class="heading__anchor"&gt; &lt;a href="#21-discrete-and-continuous-weak-kam-theory-an-introduction-through-examples-and-its-applications-to-twist-maps---maxime-zavidovique"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;: The aim of these notes is to present a self contained account of discrete weak KAM theory. Put aside the intrinsic elegance of this theory, it is also a toy model for classical weak KAM theory, where many technical difficulties disappear, but where central ideas and results persist. It can therefore serve as a good introduction to (continuous) weak KAM theory. After a general exposition of the general abstract theory, several examples are studied. The last section is devoted to the historical problem of conservative twist maps of the annulus. At the end of the first three Chapters, the relations between the results proved in the discrete setting and the analogous theorems of classical weak KAM theory are discussed. Some key differences are also highlighted between the discrete and classical theory. Those results are new. The text also contains other results never published before, such as the convergence of solutions of discounted equations for degenerate perturbations.&lt;/p&gt;</description></item><item><title>Mathematics Books</title><link>https://blog.namln.org/mathematics/math-books/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/mathematics/math-books/</guid><description/></item><item><title>Mathematics Lecture Notes</title><link>https://blog.namln.org/mathematics/math-lecture-notes/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/mathematics/math-lecture-notes/</guid><description/></item><item><title>Mathematics MOOCS</title><link>https://blog.namln.org/mathematics/math-moocs/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/mathematics/math-moocs/</guid><description/></item><item><title>Pre-print articles on Difference-of-Convex (DC) Programming</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/dc-programming/</link><pubDate>Thu, 27 Jun 2024 23:14:15 
+0800</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/dc-programming/</guid><description>&lt;h2 class="heading" id="57-stochastic-difference-of-convex-optimization-with-momentum"&gt;
 57. &lt;a href="https://arxiv.org/abs/2510.17503"&gt;Stochastic Difference-of-Convex Optimization with Momentum&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#57-stochastic-difference-of-convex-optimization-with-momentum"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; El Mahdi Chayti, Martin Jaggi&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Stochastic difference-of-convex (DC) optimization is prevalent in numerous machine learning applications, yet its convergence properties under small batch sizes remain poorly understood. Existing methods typically require large batches or strong noise assumptions, which limit their practical use. In this work, we show that momentum enables convergence under standard smoothness and bounded variance assumptions (of the concave part) for any batch size. We prove that without momentum, convergence may fail regardless of stepsize, highlighting its necessity. Our momentum-based algorithm achieves provable convergence and demonstrates strong empirical performance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL&lt;/strong&gt;: &lt;a href="https://arxiv.org/abs/2510.17503"&gt;https://arxiv.org/abs/2510.17503&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="56-on-the-convergence-rate-of-the-boosted-difference-of-convex-algorithm-dca"&gt;
 56. &lt;a href="https://arxiv.org/abs/2510.16569"&gt;On the convergence rate of the boosted Difference-of-Convex Algorithm (DCA)&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#56-on-the-convergence-rate-of-the-boosted-difference-of-convex-algorithm-dca"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Hadi Abbaszadehpeivasti, Etienne de Klerk, Adrien Taylor&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; The difference-of-convex algorithm (DCA) is a well-established nonlinear programming technique that solves successive convex optimization problems. These sub-problems are obtained from the difference-of-convex (DC) decompositions of the objective and constraint functions. We investigate the worst-case performance of the unconstrained DCA, with and without boosting, where boosting simply performs an additional step in the direction generated by the usual DCA method. We show that, for certain classes of DC decompositions, the boosted DCA is provably better in the worst-case than the usual DCA. While several numerical studies have reported that boosted DCA outperforms classical DCA, a theoretical explanation for this behavior has, to the best of our knowledge, not been given until now. Our proof technique relies on semidefinite programming (SDP) performance estimation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL&lt;/strong&gt;: &lt;a href="https://arxiv.org/abs/2510.16569"&gt;https://arxiv.org/abs/2510.16569&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="55-global-solution-algorithms-for-dc-programming-via-polyhedral-approximations-of-convex-functions"&gt;
 55. &lt;a href="https://link.springer.com/article/10.1007/s10898-025-01535-z"&gt;Global solution algorithms for DC programming via polyhedral approximations of convex functions&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#55-global-solution-algorithms-for-dc-programming-via-polyhedral-approximations-of-convex-functions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Fahaar M. Pirani &amp;amp; Firdevs Ulus&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We consider difference of convex (DC) programming problems and propose three algorithms to solve them globally. The main working mechanism of the proposed algorithms is to generate polyhedral underestimators to convex functions. Two of these algorithms generate a ‘fine’ polyhedral approximation of the first convex component over the compact feasible region of the DC programming problem. We prove the finiteness of these algorithms and establish the convergence rate of one of them. Moreover, we show that using the polyhedral approximation of the first component, it is possible to compute an approximate global solution of the corresponding DC programming problem without further computational effort. The third algorithm also computes a polyhedral underestimator of the first component of the DC function. Different from the first two algorithms, the third algorithm approximates it locally until finding an approximate global solution to the DC programming problem. It is shown that for any positive approximation error, the third algorithm stops after finitely many iterations. Computational results based on some test instances from the literature are provided.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL&lt;/strong&gt;: &lt;a href="https://link.springer.com/article/10.1007/s10898-025-01535-z"&gt;https://link.springer.com/article/10.1007/s10898-025-01535-z&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="54-improved-rates-for-stochastic-variance-reduced-difference-of-convex-algorithms"&gt;
 54. &lt;a href="https://arxiv.org/abs/2509.11657"&gt;Improved Rates for Stochastic Variance-Reduced Difference-of-Convex Algorithms&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#54-improved-rates-for-stochastic-variance-reduced-difference-of-convex-algorithms"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Anh Duc Nguyen, Alp Yurtsever, Suvrit Sra, Kim-Chuan Toh&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this work, we propose and analyze DCA-PAGE, a novel algorithm that integrates the difference-of-convex algorithm (DCA) with the ProbAbilistic Gradient Estimator (PAGE) to solve structured nonsmooth difference-of-convex programs. In the finite-sum setting, our method achieves a gradient computation complexity of $O(N + N^{1/2}\varepsilon^{-2})$ with sample size $N$, surpassing the previous best-known complexity of $O(N + N^{2/3}\varepsilon^{-2})$ for stochastic variance-reduced (SVR) DCA methods. Furthermore, DCA-PAGE readily extends to online settings with a similar optimal gradient computation complexity $O(b + b^{1/2}\varepsilon^{-2})$ with batch size $b$, a significant advantage over existing SVR DCA approaches that only work for the finite-sum setting. We further refine our analysis with a gap function, which enables us to obtain comparable convergence guarantees under milder assumptions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Comment&lt;/strong&gt;: Accepted at IEEE Conference on Decision and Control (IEEE CDC 2025)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL&lt;/strong&gt;: &lt;a href="https://arxiv.org/pdf/2509.11657"&gt;https://arxiv.org/pdf/2509.11657&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="53-new-algorithms-for-maximizing-the-difference-of-convex-functions"&gt;
 53. &lt;a href="https://optimization-online.org/wp-content/uploads/2025/04/comaxdc1.pdf"&gt;New Algorithms for maximizing the difference of convex functions&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#53-new-algorithms-for-maximizing-the-difference-of-convex-functions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Aharon Ben-Tal, Luba Tetruashvili&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Maximizing the difference of 2 convex functions over a convex feasible set (the so called
DCA problem) is a hard problem. There is a large number of publications addressing this
problem. Many of them are variations of the widely used DCA algorithm [20]. The success of
this algorithm in reaching a good approximation of a global optimum depends crucially on the
choice of its starting point. In the algorithm developed in our paper MDCF (Maximizing the
Difference of Convex Functions) a major effort is to generate a good starting point. This is
obtained by using the COMAX algorithm for maximizing a convex function [6]. The solution
found by COMAX is a basis for obtaining a good starting point for MDCF.
Another contribution of the paper is the algorithm for solving problems with an indefinite
quadratic objective function and compact and convex feasible set. The problem is first
converted to maximizing a difference of convex quadratic functions. The new algorithm
QMDCF is a specific adaptation of MDCF to this case.
The performance of the two new algorithms developed in the paper is tested numerically,
and results are compared to the performance of classical DCA, and some other algorithms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://optimization-online.org/2025/04/new-algorithms-for-maximizing-the-difference-of-convex-functions/"&gt;https://optimization-online.org/2025/04/new-algorithms-for-maximizing-the-difference-of-convex-functions/&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="52-a-progressive-decoupling-algorithm-for-minimizing-the-difference-of-convex-and-weakly-convex-functions"&gt;
 52. &lt;a href="https://link.springer.com/article/10.1007/s10957-024-02574-4"&gt;A progressive decoupling algorithm for minimizing the difference of convex and weakly convex functions&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#52-a-progressive-decoupling-algorithm-for-minimizing-the-difference-of-convex-and-weakly-convex-functions"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Welington de Oliveira &amp;amp; João Carlos de Oliveira Souza&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Commonly, decomposition and splitting techniques for optimization problems strongly depend on convexity. Implementable splitting methods for nonconvex and nonsmooth optimization problems are scarce and often lack convergence guarantees. Among the few exceptions is the Progressive Decoupling Algorithm (PDA), which has local convergence should convexity be elicitable. In this work, we furnish PDA with a descent test and extend the method to accommodate a broad class of nonsmooth optimization problems with non-elicitable convexity. More precisely, we focus on the problem of minimizing the difference of convex and weakly convex functions over a linear subspace. This framework covers, in particular, a family of stochastic programs with nonconvex recourse and statistical estimation problems for supervised learning.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://link.springer.com/article/10.1007/s10957-024-02574-4"&gt;https://link.springer.com/article/10.1007/s10957-024-02574-4&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="51-an-inexact-proximal-framework-for-nonsmooth-riemannian-difference-of-convex-optimization-arxiv250908561"&gt;
 51. An Inexact Proximal Framework for Nonsmooth Riemannian Difference-of-Convex Optimization [arXiv:2509.08561]&lt;span class="heading__anchor"&gt; &lt;a href="#51-an-inexact-proximal-framework-for-nonsmooth-riemannian-difference-of-convex-optimization-arxiv250908561"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Bo Jiang, Meng Xu, Xingju Cai, Ya-Feng Liu&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Nonsmooth Riemannian optimization has attracted increasing attention, especially in problems with sparse structures. While existing formulations typically involve convex nonsmooth terms, incorporating nonsmooth difference-of-convex (DC) penalties can enhance recovery accuracy. In this paper, we study a class of nonsmooth Riemannian optimization problems whose objective is the sum of a smooth function and a nonsmooth DC term. We establish, for the first time in the manifold setting, the equivalence between such DC formulations (with suitably chosen nonsmooth DC terms) and their $\ell_0$-regularized or $\ell_0$-constrained counterparts. To solve these problems, we propose an inexact Riemannian proximal DC (iRPDC) algorithmic framework, which returns an $\epsilon$-Riemannian critical point within $\mathcal{O}(\epsilon^{-2})$ outer iterations. Within this framework, we develop several practical algorithms based on different subproblem solvers. Among them, one achieves an overall iteration complexity of $\mathcal{O}(\epsilon^{-3})$, which matches the best-known bound in the literature. In contrast, existing algorithms either lack provable overall complexity or require $\mathcal{O}(\epsilon^{-3})$ iterations in both outer and overall complexity. A notable feature of the iRPDC algorithmic framework is a novel inexactness criterion that not only enables efficient subproblem solutions via first-order methods but also facilitates a linesearch procedure that adaptively captures the local curvature. Numerical results on sparse principal component analysis demonstrate the modeling flexibility of the DC formulation and the competitive performance of the proposed algorithmic framework.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2509.08561"&gt;https://arxiv.org/abs/2509.08561&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="50-tight-convergence-rates-in-gradient-mapping-for-the-difference-of-convex-algorithm-arxiv250601791"&gt;
 50. Tight Convergence Rates in Gradient Mapping for the Difference-of-Convex Algorithm [arXiv:2506.01791]&lt;span class="heading__anchor"&gt; &lt;a href="#50-tight-convergence-rates-in-gradient-mapping-for-the-difference-of-convex-algorithm-arxiv250601791"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Teodor Rotaru, Panagiotis Patrinos, François Glineur&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We establish new theoretical convergence guarantees for the difference-of-convex algorithm (DCA), where the second function is allowed to be weakly-convex, measuring progress via composite gradient mapping. Based on a tight analysis of two iterations of DCA, we identify six parameter regimes leading to sublinear convergence rates toward critical points and establish those rates by proving adapted descent lemmas. We recover existing rates for the standard difference-of-convex decompositions of nonconvex-nonconcave functions, while for all other curvature settings our results are new, complementing recently obtained rates on the gradient residual. Three of our sublinear rates are tight for any number of DCA iterations, while for the other three regimes we conjecture exact rates, using insights from the tight analysis of gradient descent and numerical validation using the performance estimation methodology. Finally, we show how the equivalence between proximal gradient descent (PGD) and DCA allows the derivation of exact PGD rates for any constant stepsize.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2506.01791"&gt;https://arxiv.org/abs/2506.01791&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="49-enforcing-fairness-where-it-matters-an-approach-based-on-difference-of-convex-constraints-arxiv250512530"&gt;
 49. Enforcing Fairness Where It Matters: An Approach Based on Difference-of-Convex Constraints [arXiv:2505.12530]&lt;span class="heading__anchor"&gt; &lt;a href="#49-enforcing-fairness-where-it-matters-an-approach-based-on-difference-of-convex-constraints-arxiv250512530"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Yutian He, Yankun Huang, Yao Yao, Qihang Lin&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Fairness in machine learning has become a critical concern, particularly in high-stakes applications. Existing approaches often focus on achieving full fairness across all score ranges generated by predictive models, ensuring fairness in both high and low-scoring populations. However, this stringent requirement can compromise predictive performance and may not align with the practical fairness concerns of stakeholders. In this work, we propose a novel framework for building partially fair machine learning models, which enforce fairness within a specific score range of interest, such as the middle range where decisions are most contested, while maintaining flexibility in other regions. We introduce two statistical metrics to rigorously evaluate partial fairness within a given score range, such as the top 20%-40% of scores. To achieve partial fairness, we propose an in-processing method by formulating the model training problem as constrained optimization with difference-of-convex constraints, which can be solved by an inexact difference-of-convex algorithm (IDCA). We provide the complexity analysis of IDCA for finding a nearly KKT point. Through numerical experiments on real-world datasets, we demonstrate that our framework achieves high predictive performance while enforcing partial fairness where it matters most.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2505.12530"&gt;https://arxiv.org/abs/2505.12530&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="48-a-smoothing-moving-balls-approximation-method-for-a-class-of-conic-constrained-difference-of-convex-optimization-problems-arxiv250512314"&gt;
 48. A smoothing moving balls approximation method for a class of conic-constrained difference-of-convex optimization problems [arXiv:2505.12314]&lt;span class="heading__anchor"&gt; &lt;a href="#48-a-smoothing-moving-balls-approximation-method-for-a-class-of-conic-constrained-difference-of-convex-optimization-problems-arxiv250512314"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Jiefeng Xu, Ting Kei Pong, Nung-sing Sze&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we consider the problem of minimizing a difference-of-convex objective over a nonlinear conic constraint, where the cone is closed, convex, pointed and has a nonempty interior. We assume that the support function of a compact base of the polar cone exhibits a majorizing smoothing approximation, a condition that is satisfied by widely studied cones such as $\mathbb{R}^m_-$ and ${\cal S}^m_-$. Leveraging this condition, we reformulate the conic constraint equivalently as a single constraint involving the aforementioned support function, and adapt the moving balls approximation (MBA) method for its solution. In essence, in each iteration of our algorithm, we approximate the support function by a smooth approximation function and apply one MBA step. The subproblems that arise in our algorithm always involve only one single inequality constraint, and can thus be solved efficiently via one-dimensional root-finding procedures. We design explicit rules to evolve the smooth approximation functions from iteration to iteration and establish the corresponding iteration complexity for obtaining an $\epsilon$-Karush-Kuhn-Tucker point. In addition, in the convex setting, we establish convergence of the sequence generated, and study its local convergence rate under a standard Hölderian growth condition. Finally, we illustrate numerically the effects of different rules of evolving the smooth approximation functions on the rate of convergence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2505.12314"&gt;https://arxiv.org/abs/2505.12314&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="47-a-preconditioned-difference-of-convex-functions-algorithm-with-extrapolation-and-line-search-arxiv250511914"&gt;
 47. A preconditioned difference of convex functions algorithm with extrapolation and line search [arXiv:2505.11914]&lt;span class="heading__anchor"&gt; &lt;a href="#47-a-preconditioned-difference-of-convex-functions-algorithm-with-extrapolation-and-line-search-arxiv250511914"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Ran Zhang, Hongpeng Sun&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; This paper proposes a novel proximal difference-of-convex (DC) algorithm enhanced with extrapolation and aggressive non-monotone line search for solving non-convex optimization problems. We introduce an adaptive conservative update strategy of the extrapolation parameter determined by a computationally efficient non-monotone line search. The core of our algorithm is to unite the update of the extrapolation parameter with the step size of the non-monotone line search interactively. The global convergence of the two proposed algorithms is established through the Kurdyka-Łojasiewicz properties, ensuring convergence within a preconditioned framework for linear equations. Numerical experiments on two general non-convex problems: SCAD-penalized binary classification and graph-based Ginzburg-Landau image segmentation models, demonstrate the proposed method&amp;rsquo;s high efficiency compared to existing DC algorithms both in convergence rate and solution accuracy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2505.11914"&gt;https://arxiv.org/abs/2505.11914&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="46-contractive-difference-of-convex-algorithms-arxiv250510800"&gt;
 46. Contractive difference-of-convex algorithms [arXiv:2505.10800]&lt;span class="heading__anchor"&gt; &lt;a href="#46-contractive-difference-of-convex-algorithms-arxiv250510800"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Songnian He, Qiao-Li Dong, Michael Th. Rassias&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; The difference-of-convex algorithm (DCA) and its variants are the most popular methods to solve the difference-of-convex optimization problem. Each iteration of them is reduced to a convex optimization problem, which generally needs to be solved by iterative methods such as proximal gradient algorithm. However, these algorithms essentially belong to some iterative methods of fixed point problems of averaged mappings, and their convergence speed is generally slow. Furthermore, there is little research on the termination rule of these iterative algorithms solving the subproblem of DCA. To overcome these defects, we firstly show that the subproblem of the linearized proximal method (LPM) in each iteration is equal to the fixed point problem of a contraction. Secondly, by using Picard iteration to approximately solve the subproblem of LPM in each iteration, we propose a contractive difference-of-convex algorithm (cDCA) where an adaptive termination rule is presented. Both global subsequential convergence and global convergence of the whole sequence of cDCA are established. Finally, preliminary results from numerical experiments are promising.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://link.springer.com/article/10.1007/s10957-025-02689-2"&gt;https://link.springer.com/article/10.1007/s10957-025-02689-2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Journal&lt;/strong&gt;: Journal of Optimization Theory and Applications&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="45-a-full-splitting-algorithm-for-structured-difference-of-convex-programs-arxiv250502588"&gt;
 45. A full splitting algorithm for structured difference-of-convex programs [arXiv:2505.02588]&lt;span class="heading__anchor"&gt; &lt;a href="#45-a-full-splitting-algorithm-for-structured-difference-of-convex-programs-arxiv250502588"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Radu Ioan Bot, Rossen Nenov, Min Tao&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we study a class of nonconvex and nonsmooth structured difference-of-convex (DC) programs, which contain in the convex part the sum of a nonsmooth linearly composed convex function and a differentiable function, and in the concave part another nonsmooth linearly composed convex function. Among the various areas in which such problems occur, we would like to mention in particular the recovery of sparse signals. We propose an adaptive double-proximal, full-splitting algorithm with a moving center approach in the final subproblem, which addresses the challenge of evaluating compositions by decoupling the linear operator from the nonsmooth component. We establish the subsequential convergence of the generated sequence of iterates to an approximate stationary point and prove its global convergence under the Kurdyka-Łojasiewicz property. We also discuss the tightness of the convergence results and provide insights into the rationale for seeking an approximate KKT point. This is illustrated by constructing a counterexample showing that the algorithm can diverge when seeking exact solutions. Finally, we present a practical version of the algorithm that incorporates a nonmonotone line search, which significantly improves the convergence performance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2505.02588"&gt;https://arxiv.org/abs/2505.02588&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="44-optimization-over-trained-neural-networks-difference-of-convex-algorithm-and-application-to-data-center-scheduling-arxiv250317506"&gt;
 44. Optimization over Trained Neural Networks: Difference-of-Convex Algorithm and Application to Data Center Scheduling [arXiv:2503.17506]&lt;span class="heading__anchor"&gt; &lt;a href="#44-optimization-over-trained-neural-networks-difference-of-convex-algorithm-and-application-to-data-center-scheduling-arxiv250317506"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Xinwei Liu, Vladimir Dvorkin&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; When solving decision-making problems with mathematical optimization, some constraints or objectives may lack analytic expressions but can be approximated from data. When an approximation is made by neural networks, the underlying problem becomes optimization over trained neural networks. Despite recent improvements with cutting planes, relaxations, and heuristics, the problem remains difficult to solve in practice. We propose a new solution based on a bilinear problem reformulation that penalizes ReLU constraints in the objective function. This reformulation makes the problem amenable to efficient difference-of-convex algorithms (DCA), for which we propose a principled approach to penalty selection that facilitates convergence to stationary points of the original problem. We apply the DCA to the problem of the least-cost allocation of data center electricity demand in a power grid, reporting significant savings in congested cases.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2503.17506"&gt;https://arxiv.org/abs/2503.17506&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="43-tight-analysis-of-difference-of-convex-algorithm-dca-improves-convergence-rates-for-proximal-gradient-descent-arxiv250304486"&gt;
 43. Tight Analysis of Difference-of-Convex Algorithm (DCA) Improves Convergence Rates for Proximal Gradient Descent [arXiv:2503.04486]&lt;span class="heading__anchor"&gt; &lt;a href="#43-tight-analysis-of-difference-of-convex-algorithm-dca-improves-convergence-rates-for-proximal-gradient-descent-arxiv250304486"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Teodor Rotaru, Panagiotis Patrinos, François Glineur&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We investigate a difference-of-convex (DC) formulation where the second term is allowed to be weakly convex. We examine the precise behavior of a single iteration of the difference-of-convex algorithm (DCA), providing a tight characterization of the objective function decrease, distinguishing between six distinct parameter regimes. Our proofs, inspired by the performance estimation framework, are notably simplified compared to related prior research. We subsequently derive sublinear convergence rates for the DCA towards critical points, assuming at least one of the functions is smooth. Additionally, we explore the underexamined equivalence between proximal gradient descent (PGD) and DCA iterations, demonstrating how DCA, a parameter-free algorithm, without the need for a stepsize, serves as a tool for studying the exact convergence rates of PGD.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2503.04486"&gt;https://arxiv.org/abs/2503.04486&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="42-abstract-nonautonomous-difference-inclusions-in-locally-convex-spaces-arxiv250205184"&gt;
 42. Abstract nonautonomous difference inclusions in locally convex spaces [arXiv:2502.05184]&lt;span class="heading__anchor"&gt; &lt;a href="#42-abstract-nonautonomous-difference-inclusions-in-locally-convex-spaces-arxiv250205184"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Marko Kostic&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we consider abstract nonautonomous difference inclusions in locally convex spaces with integer order differences. We particularly analyze the existence and uniqueness of almost periodic type solutions to abstract nonautonomous difference inclusions. Our results seem to be completely new even in the Banach space setting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2502.05184"&gt;https://arxiv.org/abs/2502.05184&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="41-learning-difference-of-convex-regularizers-for-inverse-problems-a-flexible-framework-with-theoretical-guarantees-arxiv250200240"&gt;
 41. Learning Difference-of-Convex Regularizers for Inverse Problems: A Flexible Framework with Theoretical Guarantees [arXiv:2502.00240]&lt;span class="heading__anchor"&gt; &lt;a href="#41-learning-difference-of-convex-regularizers-for-inverse-problems-a-flexible-framework-with-theoretical-guarantees-arxiv250200240"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Yasi Zhang, Oscar Leong&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Learning effective regularization is crucial for solving ill-posed inverse problems, which arise in a wide range of scientific and engineering applications. While data-driven methods that parameterize regularizers using deep neural networks have demonstrated strong empirical performance, they often result in highly nonconvex formulations that lack theoretical guarantees. Recent work has shown that incorporating structured nonconvexity into neural network-based regularizers, such as weak convexity, can strike a balance between empirical performance and theoretical tractability. In this paper, we demonstrate that a broader class of nonconvex functions, difference-of-convex (DC) functions, can yield improved empirical performance while retaining strong convergence guarantees. The DC structure enables the use of well-established optimization algorithms, such as the Difference-of-Convex Algorithm (DCA) and a Proximal Subgradient Method (PSM), which extend beyond standard gradient descent. Furthermore, we provide theoretical insights into the conditions under which optimal regularizers can be expressed as DC functions. Extensive experiments on computed tomography (CT) reconstruction tasks show that our approach achieves strong performance across sparse and limited-view settings, consistently outperforming other weakly supervised learned regularizers. Our code is available at &lt;a href="https://github.com/YasminZhang/ADCR"&gt;https://github.com/YasminZhang/ADCR&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2502.00240"&gt;https://arxiv.org/abs/2502.00240&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="40-an-inexact-boosted-difference-of-convex-algorithm-for-nondifferentiable-functions-arxiv241205697"&gt;
 40. An Inexact Boosted Difference of Convex Algorithm for Nondifferentiable Functions [arXiv:2412.05697]&lt;span class="heading__anchor"&gt; &lt;a href="#40-an-inexact-boosted-difference-of-convex-algorithm-for-nondifferentiable-functions-arxiv241205697"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Orizon P. Ferreira, Boris S. Mordukhovich, Wilkreffy M. S. Santos, João Carlos O. Souza&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we introduce an inexact approach to the Boosted Difference of Convex Functions Algorithm (BDCA) for solving nonconvex and nondifferentiable problems involving the difference of two convex functions (DC functions). Specifically, when the first DC component is differentiable and the second may be nondifferentiable, BDCA utilizes the solution from the subproblem of the DC Algorithm (DCA) to define a descent direction for the objective function. A monotone linesearch is then performed to find a new point that improves the objective function relative to the subproblem solution. This approach enhances the performance of DCA. However, if the first DC component is nondifferentiable, the BDCA direction may become an ascent direction, rendering the monotone linesearch ineffective. To address this, we propose an Inexact nonmonotone Boosted Difference of Convex Algorithm (InmBDCA). This algorithm incorporates two main features of inexactness: First, the subproblem therein is solved approximately, allowing for a controlled relative error tolerance in defining the linesearch direction. Second, an inexact nonmonotone linesearch scheme is used to determine the step size for the next iteration. Under suitable assumptions, we demonstrate that InmBDCA is well-defined, with any accumulation point of the sequence generated by InmBDCA being a critical point of the problem. We also provide iteration-complexity bounds for the algorithm. Numerical experiments show that InmBDCA outperforms both the nonsmooth BDCA (nmBDCA) and the monotone version of DCA in practical scenarios.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
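&lt;p&gt;&lt;strong&gt;Illustration (not from the paper):&lt;/strong&gt; a minimal sketch of the classical monotone BDCA step that the abstract takes as its starting point, assuming a hypothetical one-dimensional objective $f(x) = x^4 - 2x^2$ with differentiable first component $g(x) = x^4$ and second component $h(x) = 2x^2$. The names &lt;code&gt;dca_point&lt;/code&gt; and &lt;code&gt;bdca_step&lt;/code&gt; and the parameters &lt;code&gt;alpha&lt;/code&gt;, &lt;code&gt;beta&lt;/code&gt;, &lt;code&gt;lam0&lt;/code&gt; are choices made for this example. The inexact nonmonotone variant (InmBDCA) proposed by the authors is not reproduced here.&lt;/p&gt;

```python
import math

def f(x):
    # Hypothetical DC objective: g(x) = x**4 minus h(x) = 2 * x**2.
    return x ** 4 - 2.0 * x ** 2

def dca_point(x):
    # Exact solution of the DCA subproblem argmin_y y**4 - 4 * x * y,
    # i.e. the real cube root of x.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def bdca_step(x, alpha=0.1, beta=0.5, lam0=1.0):
    y = dca_point(x)
    d = y - x  # search direction defined by the subproblem solution
    lam = lam0
    # Backtrack until the decrease condition relative to f(y) holds:
    # f(y + lam * d) is at most f(y) - alpha * lam**2 * d**2.
    while lam > 1e-12 and f(y + lam * d) > f(y) - alpha * lam ** 2 * d ** 2:
        lam *= beta
    if lam > 1e-12:
        return y + lam * d
    return y  # line search failed: fall back to the plain DCA point

x = 0.1
trace = [x]
for _ in range(30):
    x = bdca_step(x)
    trace.append(x)
```

&lt;p&gt;On this toy problem the iterates stored in &lt;code&gt;trace&lt;/code&gt; approach the minimizer $x = 1$, and the objective value does not increase from one iterate to the next, because every accepted point is at least as good as the DCA point it started from.&lt;/p&gt;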
&lt;hr&gt;
&lt;h2 class="heading" id="39-a-preconditioned-second-order-convex-splitting-algorithm-with-a-difference-of-varying-convex-functions-and-line-search-arxiv241107661"&gt;
 39. A preconditioned second-order convex splitting algorithm with a difference of varying convex functions and line search [arXiv:2411.07661]&lt;span class="heading__anchor"&gt; &lt;a href="#39-a-preconditioned-second-order-convex-splitting-algorithm-with-a-difference-of-varying-convex-functions-and-line-search-arxiv241107661"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Xinhua Shen, Zaijiu Shang, Hongpeng Sun&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; This paper introduces a preconditioned convex splitting algorithm enhanced with line search techniques for nonconvex optimization problems. The algorithm utilizes second-order backward differentiation formulas (BDF) for the implicit and linear components and the Adams-Bashforth scheme for the nonlinear and explicit parts of the gradient flow in variational functions. The proposed algorithm, resembling a generalized difference-of-convex-function approach, involves a changing set of convex functions in each iteration. It integrates the Armijo line search strategy to improve performance. The study also discusses classical preconditioners such as symmetric Gauss-Seidel, Jacobi, and Richardson within this context. The global convergence of the algorithm is established through the Kurdyka-Łojasiewicz properties, ensuring convergence within a finite number of preconditioned iterations. Numerical experiments demonstrate the superiority of the proposed second-order convex splitting with line search over conventional difference-of-convex-function algorithms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="38-inertial-proximal-difference-of-convex-algorithm-with-convergent-bregman-plug-and-play-for-nonconvex-imaging-arxiv240903262"&gt;
 38. Inertial Proximal Difference-of-Convex Algorithm with Convergent Bregman Plug-and-Play for Nonconvex Imaging [arXiv:2409.03262]&lt;span class="heading__anchor"&gt; &lt;a href="#38-inertial-proximal-difference-of-convex-algorithm-with-convergent-bregman-plug-and-play-for-nonconvex-imaging-arxiv240903262"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Tsz Ching Chow, Chaoyan Huang, Zhongming Wu, Tieyong Zeng, Angelica I. Aviles-Rivero&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Imaging tasks are typically tackled using a structured optimization framework. This paper delves into a class of algorithms for difference-of-convex (DC) structured optimization, focusing on minimizing a DC function along with a possibly nonconvex function. Existing DC algorithm (DCA) versions often fail to effectively handle nonconvex functions or exhibit slow convergence rates. We propose a novel inertial proximal DC algorithm in Bregman geometry, named iBPDCA, designed to address nonconvex terms and enhance convergence speed through inertial techniques. We provide a detailed theoretical analysis, establishing both subsequential and global convergence of iBPDCA via the Kurdyka-Łojasiewicz property. Additionally, we introduce a Plug-and-Play variant, PnP-iBPDCA, which employs a deep neural network-based prior for greater flexibility and robustness while ensuring theoretical convergence. We also establish that the Gaussian gradient step denoiser used in our method is equivalent to evaluating the Bregman proximal operator for an implicitly weakly convex functional. We extensively validate our method on Rician noise and phase retrieval. We demonstrate that iBPDCA surpasses existing state-of-the-art methods.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="37-constructing-tight-quadratic-relaxations-for-global-optimization-ii-underestimating-difference-of-convex-dc-functions-arxiv240813058"&gt;
 37. Constructing Tight Quadratic Relaxations for Global Optimization: II. Underestimating Difference-of-Convex (D.C.) Functions [arXiv:2408.13058]&lt;span class="heading__anchor"&gt; &lt;a href="#37-constructing-tight-quadratic-relaxations-for-global-optimization-ii-underestimating-difference-of-convex-dc-functions-arxiv240813058"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; William R. Strahl, Arvind U. Raghunathan, Nikolaos V. Sahinidis, Chrysanthos E. Gounaris&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Recent advances in the efficiency and robustness of algorithms solving convex quadratically constrained quadratic programming (QCQP) problems motivate developing techniques for creating convex quadratic relaxations that, although more expensive to compute, provide tighter bounds than their classical linear counterparts. In the first part of this two-paper series [Strahl et al., 2024], we developed a cutting plane algorithm to construct convex quadratic underestimators for twice-differentiable convex functions, which we extend here to address the case of non-convex difference-of-convex (d.c.) functions as well. Furthermore, we generalize our approach to consider a hierarchy of quadratic forms, thereby allowing the construction of even tighter underestimators. On a set of d.c. functions extracted from benchmark libraries, we demonstrate noteworthy reduction in the hypervolume between our quadratic underestimators and linear ones constructed at the same points. Additionally, we construct convex QCQP relaxations at the root node of a spatial branch-and-bound tree for a set of systematically created d.c. optimization problems in up to four dimensions, and we show that our relaxations reduce the gap between the lower bound computed by the state-of-the-art global optimization solver BARON and the optimal solution by an excess of 90%, on average.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="36-distributed-difference-of-convex-optimization-arxiv240716728"&gt;
 36. Distributed Difference of Convex Optimization [arXiv:2407.16728]&lt;span class="heading__anchor"&gt; &lt;a href="#36-distributed-difference-of-convex-optimization-arxiv240716728"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Vivek Khatana, Murti V. Salapaka&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this article, we focus on solving a class of distributed optimization problems involving $n$ agents with the local objective function at every agent $i$ given by the difference of two convex functions $f_i$ and $g_i$ (difference-of-convex (DC) form), where $f_i$ and $g_i$ are potentially nonsmooth. The agents communicate via a directed graph containing $n$ nodes. We create smooth approximations of the functions $f_i$ and $g_i$ and develop a distributed algorithm utilizing the gradients of the smooth surrogates and a finite-time approximate consensus protocol. We term this algorithm as DDC-Consensus. The developed DDC-Consensus algorithm allows for non-symmetric directed graph topologies and can be synthesized distributively. We establish that the DDC-Consensus algorithm converges to a stationary point of the nonconvex distributed optimization problem. The performance of the DDC-Consensus algorithm is evaluated via a simulation study to solve a nonconvex DC-regularized distributed least squares problem. The numerical results corroborate the efficacy of the proposed algorithm.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="35-an-inexact-bregman-proximal-difference-of-convex-algorithm-with-two-types-of-relative-stopping-criteria-arxiv240604646"&gt;
 35. An Inexact Bregman Proximal Difference-of-Convex Algorithm with Two Types of Relative Stopping Criteria [arXiv:2406.04646]&lt;span class="heading__anchor"&gt; &lt;a href="#35-an-inexact-bregman-proximal-difference-of-convex-algorithm-with-two-types-of-relative-stopping-criteria-arxiv240604646"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Lei Yang, Jingjing Hu, Kim-Chuan Toh&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we consider a class of difference-of-convex (DC) optimization problems, which require only a weaker restricted $L$-smooth adaptable property on the smooth part of the objective function, instead of the standard global Lipschitz gradient continuity assumption. Such problems are prevalent in many contemporary applications such as compressed sensing, statistical regression, and machine learning, and can be solved by a general Bregman proximal DC algorithm (BPDCA). However, the existing BPDCA is developed based on the stringent requirement that the involved subproblems must be solved exactly, which is often impractical and limits the applicability of the BPDCA. To facilitate the practical implementations and wider applications of the BPDCA, we develop an inexact Bregman proximal difference-of-convex algorithm (iBPDCA) by incorporating two types of relative-type stopping criteria for solving the subproblems. The proposed inexact framework has considerable flexibility to encompass many existing exact and inexact methods, and can accommodate different types of errors that may occur when solving the subproblem. This enables the potential application of our inexact framework across different DC decompositions to facilitate the design of a more efficient DCA scheme in practice. The global subsequential convergence and the global sequential convergence of our iBPDCA are established under suitable conditions including the Kurdyka-Łojasiewicz property. Some numerical experiments are conducted to show the superior performance of our iBPDCA in comparison to existing algorithms. These results also empirically validate the necessity and significance of developing different types of stopping criteria to facilitate the efficient computation of the subproblem in each iteration of our iBPDCA.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="34-single-loop-stochastic-algorithms-for-difference-of-max-structured-weakly-convex-functions-arxiv240518577"&gt;
 34. Single-Loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions [arXiv:2405.18577]&lt;span class="heading__anchor"&gt; &lt;a href="#34-single-loop-stochastic-algorithms-for-difference-of-max-structured-weakly-convex-functions-arxiv240518577"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Quanqi Hu, Qi Qi, Zhaosong Lu, Tianbao Yang&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we study a class of non-smooth non-convex problems in the form of $\min_{x}[\max_{y\in Y}\phi(x, y) - \max_{z\in Z}\psi(x, z)]$, where both $\Phi(x) = \max_{y\in Y}\phi(x, y)$ and $\Psi(x)=\max_{z\in Z}\psi(x, z)$ are weakly convex functions, and $\phi(x, y), \psi(x, z)$ are strongly concave functions in terms of $y$ and $z$, respectively. It covers two families of problems that have been studied but are missing single-loop stochastic algorithms, i.e., difference of weakly convex functions and weakly convex strongly-concave min-max problems. We propose a stochastic Moreau envelope approximate gradient method dubbed SMAG, the first single-loop algorithm for solving these problems, and provide a state-of-the-art non-asymptotic convergence rate. The key idea of the design is to compute an approximate gradient of the Moreau envelopes of $\Phi, \Psi$ using only one step of stochastic gradient update of the primal and dual variables. Empirically, we conduct experiments on positive-unlabeled (PU) learning and partial area under ROC curve (pAUC) optimization with an adversarial fairness regularizer to validate the effectiveness of our proposed algorithms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="33-improved-convergence-rates-for-the-difference-of-convex-algorithm-arxiv240316864"&gt;
 33. Improved convergence rates for the Difference-of-Convex algorithm [arXiv:2403.16864]&lt;span class="heading__anchor"&gt; &lt;a href="#33-improved-convergence-rates-for-the-difference-of-convex-algorithm-arxiv240316864"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Teodor Rotaru, Panagiotis Patrinos, François Glineur&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We consider a difference-of-convex formulation where one of the terms is allowed to be hypoconvex (or weakly convex). We first examine the precise behavior of a single iteration of the Difference-of-Convex algorithm (DCA), giving a tight characterization of the objective function decrease. This requires distinguishing between eight distinct parameter regimes. Our proofs are inspired by the performance estimation framework, but are much simplified compared to similar previous work.
We then derive sublinear DCA convergence rates towards critical points, distinguishing between cases where at least one of the functions is smooth and where both functions are nonsmooth. We conjecture the tightness of these rates for four parameter regimes, based on strong numerical evidence obtained via performance estimation, as well as the leading constant in the asymptotic sublinear rate for two more regimes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="32-an-efficient-difference-of-convex-solver-for-privacy-funnel-arxiv240304778"&gt;
 32. An Efficient Difference-of-Convex Solver for Privacy Funnel [arXiv:2403.04778]&lt;span class="heading__anchor"&gt; &lt;a href="#32-an-efficient-difference-of-convex-solver-for-privacy-funnel-arxiv240304778"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Teng-Hui Huang, Hesham El Gamal&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We propose an efficient solver for the privacy funnel (PF) method, leveraging its difference-of-convex (DC) structure. The proposed DC separation results in a closed-form update equation, which allows straightforward application to both known and unknown distribution settings. For the known distribution case, we prove the convergence (local stationary points) of the proposed non-greedy solver, and empirically show that it outperforms the state-of-the-art approaches in characterizing the privacy-utility trade-off. The insights of our DC approach apply to unknown distribution settings where labeled empirical samples are available instead. Leveraging the insights, our alternating minimization solver satisfies the fundamental Markov relation of PF, in contrast to previous variational inference-based solvers. Empirically, we evaluate the proposed solver with MNIST and Fashion-MNIST datasets. Our results show that under a comparable reconstruction quality, an adversary suffers from higher prediction error from clustering our compressed codes than from clustering those of the compared methods. Most importantly, our solver is independent of private information in the inference phase, contrary to the baselines.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="31-approximation-analysis-for-the-minimization-problem-of-difference-of-convex-functions-with-moreau-envelopes-arxiv240213461"&gt;
 31. Approximation analysis for the minimization problem of difference-of-convex functions with Moreau envelopes [arXiv:2402.13461]&lt;span class="heading__anchor"&gt; &lt;a href="#31-approximation-analysis-for-the-minimization-problem-of-difference-of-convex-functions-with-moreau-envelopes-arxiv240213461"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Yan Tang, Shiqing Zhang&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this work, the minimization problem for difference-of-convex (DC) functions is studied using Moreau envelopes, and a descent method with the Moreau gradient is employed to approximate the numerical solution. The main regularization idea, inspired by Hiriart-Urruty [14] and Moudafi [17], is to regularize the components of the DC problem by flexibly adapting different parameters and strategic matrices to evaluate the whole DC problem. It is shown that the inertial gradient method as well as the classic gradient descent scheme tend towards an approximate stationary point of the original problem.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
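&lt;p&gt;&lt;strong&gt;Illustration (not from the paper):&lt;/strong&gt; the abstract relies on Moreau envelopes and their gradients. As a small sketch of these two objects, assuming the simplest nonsmooth convex function $|x|$ and a smoothing parameter $\mu$, the envelope $e_\mu(x) = \min_y\, |y| + (x-y)^2/(2\mu)$ is attained at the proximal point and its gradient is $(x - \mathrm{prox}_\mu(x))/\mu$. The function names below are hypothetical, and the parameter and matrix choices studied by the authors are not modeled.&lt;/p&gt;

```python
def prox_abs(x, mu):
    # Proximal map of the absolute value with parameter mu (soft-thresholding).
    return max(abs(x) - mu, 0.0) * (1.0 if x >= 0 else -1.0)

def moreau_env_abs(x, mu):
    # Envelope value: the inner minimization is attained at y = prox_abs(x, mu).
    p = prox_abs(x, mu)
    return abs(p) + (x - p) ** 2 / (2.0 * mu)

def moreau_grad_abs(x, mu):
    # Gradient of the envelope: (x - prox) / mu, which stays in [-1, 1].
    return (x - prox_abs(x, mu)) / mu
```

&lt;p&gt;For a DC objective $g - h$, a smoothed surrogate of the kind the abstract describes can be formed from the envelopes of $g$ and $h$, and its gradient is the difference of the two envelope gradients.&lt;/p&gt;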
&lt;hr&gt;
&lt;h2 class="heading" id="30-the-boosted-difference-of-convex-functions-algorithm-for-value-at-risk-constrained-portfolio-optimization-arxiv240209194"&gt;
 30. The Boosted Difference of Convex Functions Algorithm for Value-at-Risk Constrained Portfolio Optimization [arXiv:2402.09194]&lt;span class="heading__anchor"&gt; &lt;a href="#30-the-boosted-difference-of-convex-functions-algorithm-for-value-at-risk-constrained-portfolio-optimization-arxiv240209194"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Marah-Lisanne Thormann, Phan Tu Vuong, Alain B. Zemkoho&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; A highly relevant problem of modern finance is the design of Value-at-Risk (VaR) optimal portfolios. Due to contemporary financial regulations, banks and other financial institutions are tied to use the risk measure to control their credit, market and operational risks. For a portfolio with a discrete return distribution and finitely many scenarios, a Difference of Convex (DC) functions representation of the VaR can be derived. Wozabal (2012) showed that this yields a solution to a VaR constrained Markowitz style portfolio selection problem using the Difference of Convex Functions Algorithm (DCA). A recent algorithmic extension is the so-called Boosted Difference of Convex Functions Algorithm (BDCA) which accelerates the convergence due to an additional line search step. It has been shown that the BDCA converges linearly for solving non-smooth quadratic problems with linear inequality constraints. In this paper, we prove that the linear rate of convergence is also guaranteed for a piecewise linear objective function with linear equality and inequality constraints using the Kurdyka-Łojasiewicz property. An extended case study under consideration of best practices for comparing optimization algorithms demonstrates the superiority of the BDCA over the DCA for real-world financial market data. We are able to show that the results of the BDCA are significantly closer to the efficient frontier compared to the DCA. Due to the open availability of all data sets and code, this paper further provides a practical guide for transparent and easily reproducible comparisons of VaR constrained portfolio selection problems in Python.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="29-a-globally-convergent-algorithm-for-neural-network-parameter-optimization-based-on-difference-of-convex-functions-arxiv240107936"&gt;
 29. A Globally Convergent Algorithm for Neural Network Parameter Optimization Based on Difference-of-Convex Functions [arXiv:2401.07936]&lt;span class="heading__anchor"&gt; &lt;a href="#29-a-globally-convergent-algorithm-for-neural-network-parameter-optimization-based-on-difference-of-convex-functions-arxiv240107936"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Daniel Tschernutter, Mathias Kraus, Stefan Feuerriegel&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We propose an algorithm for optimizing the parameters of single hidden layer neural networks. Specifically, we derive a blockwise difference-of-convex (DC) functions representation of the objective function. Based on the latter, we propose a block coordinate descent (BCD) approach that we combine with a tailored difference-of-convex functions algorithm (DCA). We prove global convergence of the proposed algorithm. Furthermore, we mathematically analyze the convergence rate of parameters and the convergence rate in value (i.e., the training loss). We give conditions under which our algorithm converges linearly or even faster depending on the local shape of the loss function. We confirm our theoretical derivations numerically and compare our algorithm against state-of-the-art gradient-based solvers in terms of both training loss and test loss.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="28-higher-order-tensor-methods-for-minimizing-difference-of-convex-functions-arxiv240105063"&gt;
 28. Higher-order tensor methods for minimizing difference of convex functions [arXiv:2401.05063]&lt;span class="heading__anchor"&gt; &lt;a href="#28-higher-order-tensor-methods-for-minimizing-difference-of-convex-functions-arxiv240105063"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Ion Necoara&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Higher-order tensor methods were recently proposed for minimizing smooth convex and nonconvex functions. Higher-order algorithms accelerate the convergence of the classical first-order methods thanks to the higher-order derivatives used in the updates. The purpose of this paper is twofold. Firstly, to show that the higher-order algorithmic framework can be generalized and successfully applied to (nonsmooth) difference of convex functions, namely, those that can be expressed as the difference of two smooth convex functions and a possibly nonsmooth convex one. We also provide examples when the subproblem can be solved efficiently, even globally. Secondly, to derive a complete convergence analysis for our higher-order difference of convex functions (HO-DC) algorithm. In particular, we prove that any limit point of the HO-DC iterative sequence is a critical point of the problem under consideration, the corresponding objective value is monotonically decreasing and the minimum value of the norms of its subgradients converges globally to zero at a sublinear rate. The sublinear or linear convergence rates of the iterations are obtained under the Kurdyka-Łojasiewicz property.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="27-handling-nonlinearities-and-uncertainties-of-fed-batch-cultivations-with-difference-of-convex-functions-tube-mpc-arxiv231200847"&gt;
 27. Handling nonlinearities and uncertainties of fed-batch cultivations with difference of convex functions tube MPC [arXiv:2312.00847]&lt;span class="heading__anchor"&gt; &lt;a href="#27-handling-nonlinearities-and-uncertainties-of-fed-batch-cultivations-with-difference-of-convex-functions-tube-mpc-arxiv231200847"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Niels Krausch, Martin Doff-Sotta, Mark Canon, Peter Neubauer, Mariano Nicolas Cruz Bournazou&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Bioprocesses are often characterized by nonlinear and uncertain dynamics. This poses particular challenges in the context of model predictive control (MPC). Several approaches have been proposed to solve this problem, such as robust or stochastic MPC, but they can be computationally expensive when the system is nonlinear. Recent advances in optimal control theory have shown that concepts from convex optimization, tube-based MPC, and difference of convex functions (DC) enable stable and robust online process control. The approach is based on systematic DC decompositions of the dynamics and successive linearizations around feasible trajectories. By convexity, the linearization errors can be bounded tightly and treated as bounded disturbances in a robust tube-based MPC framework. However, finding the DC composition can be a difficult task. To overcome this problem, we used a neural network with special convex structure to learn the dynamics in DC form and express the uncertainty sets using simplices to maximize the product formation rate of a cultivation with uncertain substrate concentration in the feed. The results show that this is a promising approach for computationally tractable data-driven robust MPC of bioprocesses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="26-a-qualitative-difference-between-gradient-flows-of-convex-functions-in-finite--and-infinite-dimensional-hilbert-spaces-arxiv231017610"&gt;
 26. A qualitative difference between gradient flows of convex functions in finite- and infinite-dimensional Hilbert spaces [arXiv:2310.17610]&lt;span class="heading__anchor"&gt; &lt;a href="#26-a-qualitative-difference-between-gradient-flows-of-convex-functions-in-finite--and-infinite-dimensional-hilbert-spaces-arxiv231017610"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Jonathan W. Siegel, Stephan Wojtowytsch&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We consider gradient flow/gradient descent and heavy ball/accelerated gradient descent optimization for convex objective functions. In the gradient flow case, we prove the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;If $f$ does not have a minimizer, the convergence $f(x_t)\to \inf f$ can be arbitrarily slow.&lt;/li&gt;
&lt;li&gt;If $f$ does have a minimizer, the excess energy $f(x_t) - \inf f$ is integrable/summable in time. In particular, $f(x_t) - \inf f = o(1/t)$ as $t\to\infty$.&lt;/li&gt;
&lt;li&gt;In Hilbert spaces, this is optimal: $f(x_t) - \inf f$ can decay to $0$ as slowly as any given function which is monotone decreasing and integrable at $\infty$, even for a fixed quadratic objective.&lt;/li&gt;
&lt;li&gt;In finite dimension (or more generally, for all gradient flow curves of finite length), this is not optimal: We prove that there are convex monotone decreasing integrable functions $g(t)$ which decrease to zero slower than $f(x_t)-\inf f$ for the gradient flow of any convex function on $\mathbb R^d$. For instance, we show that any gradient flow $x_t$ of a convex function $f$ in finite dimension satisfies $\liminf_{t\to\infty} \big(t\cdot \log^2(t)\cdot \big\{f(x_t) -\inf f\big\}\big)=0$.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This improves on the commonly reported $O(1/t)$ rate and provides a sharp characterization of the energy decay law. We also note that it is impossible to establish a rate $O(1/(t\phi(t)))$ for any function $\phi$ which satisfies $\lim_{t\to\infty}\phi(t) = \infty$, even asymptotically.&lt;/p&gt;
&lt;p&gt;Similar results are obtained in related settings for (1) discrete time gradient descent, (2) stochastic gradient descent with multiplicative noise and (3) the heavy ball ODE. In the case of stochastic gradient descent, the summability of $\mathbb E[f(x_n) - \inf f]$ is used to prove that $f(x_n)\to \inf f$ almost surely - an improvement on the convergence almost surely up to a subsequence which follows from the $O(1/n)$ decay estimate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="25-large-convex-sets-in-difference-sets-arxiv230907527"&gt;
 25. Large Convex sets in Difference sets [arXiv:2309.07527]&lt;span class="heading__anchor"&gt; &lt;a href="#25-large-convex-sets-in-difference-sets-arxiv230907527"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Krishnendu Bhowmick, Ben Lund, Oliver Roche-Newton&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We give a construction of a convex set $A \subset \mathbb R$ with cardinality $n$ such that $A-A$ contains a convex subset with cardinality $\Omega(n^2)$. We also consider the following variant of this problem: given a convex set $A$, what is the size of the largest matching $M \subset A \times A$ such that the set $\{ a-b : (a,b) \in M \}$ is convex? We prove that there always exists such an $M$ with $|M| \geq \sqrt n$, and that this lower bound is best possible, up to a multiplicative constant.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
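&lt;p&gt;&lt;strong&gt;Illustration (not from the paper):&lt;/strong&gt; the abstract does not restate what "convex" means for a finite set of reals. The sketch below assumes the usual definition from additive combinatorics, namely that the gaps between consecutive elements are strictly increasing, and computes the difference set $A - A$ for a small example. The function names and the example set of squares are choices made here, not the authors' construction.&lt;/p&gt;

```python
def is_convex_set(points):
    # A finite set of reals is convex (in the combinatorial sense assumed here)
    # when its consecutive gaps, in increasing order, are strictly increasing.
    a = sorted(set(points))
    gaps = [hi - lo for lo, hi in zip(a, a[1:])]
    return all(g2 > g1 for g1, g2 in zip(gaps, gaps[1:]))

def difference_set(points):
    # A - A = {a - b : a, b in A}.
    return {a - b for a in points for b in points}

squares = [i * i for i in range(6)]  # 0, 1, 4, 9, 16, 25 with gaps 1, 3, 5, 7, 9
diffs = difference_set(squares)
```

&lt;p&gt;The squares form a convex set under this definition, while an arithmetic progression does not, since its gaps are all equal.&lt;/p&gt;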
&lt;hr&gt;
&lt;h2 class="heading" id="24-moreau-envelope-based-difference-of-weakly-convex-reformulation-and-algorithm-for-bilevel-programs-arxiv230616761"&gt;
 24. Moreau Envelope Based Difference-of-weakly-Convex Reformulation and Algorithm for Bilevel Programs [arXiv:2306.16761]&lt;span class="heading__anchor"&gt; &lt;a href="#24-moreau-envelope-based-difference-of-weakly-convex-reformulation-and-algorithm-for-bilevel-programs-arxiv230616761"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Lucy L. Gao, Jane J. Ye, Haian Yin, Shangzhi Zeng, Jin Zhang&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Bilevel programming has emerged as a valuable tool for hyperparameter selection, a central concern in machine learning. In a recent study by Ye et al. (2023), a value function-based difference of convex algorithm was introduced to address bilevel programs. This approach proves particularly powerful when dealing with scenarios where the lower-level problem exhibits convexity in both the upper-level and lower-level variables. Examples of such scenarios include support vector machines and $\ell_1$ and $\ell_2$ regularized regression. In this paper, we significantly expand the range of applications, now requiring convexity only in the lower-level variables of the lower-level program. We present an innovative single-level difference of weakly convex reformulation based on the Moreau envelope of the lower-level problem. We further develop a sequentially convergent Inexact Proximal Difference of Weakly Convex Algorithm (iP-DwCA). To evaluate the effectiveness of the proposed iP-DwCA, we conduct numerical experiments focused on tuning hyperparameters for kernel support vector machines on simulated data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="23-generalized-graph-signal-sampling-by-difference-of-convex-optimization-arxiv230614634"&gt;
 23. Generalized Graph Signal Sampling by Difference-of-Convex Optimization [arXiv:2306.14634]&lt;span class="heading__anchor"&gt; &lt;a href="#23-generalized-graph-signal-sampling-by-difference-of-convex-optimization-arxiv230614634"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Keitaro Yamashita, Kazuki Naganuma, Shunsuke Ono&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We propose a method for designing a flexible sampling operator for graph signals via a difference-of-convex (DC) optimization algorithm. A fundamental challenge in graph signal processing is sampling, especially for graph signals that are not bandlimited. In order to sample beyond bandlimited graph signals, there are studies to expand the generalized sampling theory for the graph setting. Vertex-wise sampling and flexible sampling are two main strategies to sample graph signals. Recovery accuracy of existing vertex-wise sampling methods is highly dependent on the specific vertices selected to generate a sampled graph signal, which may compromise the accuracy, especially when noise is generated at the vertices. In contrast, flexible sampling mixes values at multiple vertices to generate a sampled signal for robust sampling; however, existing flexible sampling methods impose strict assumptions and aggressive relaxations. To address these limitations, we aim to design a flexible sampling operator without such strict assumptions and aggressive relaxations by introducing DC optimization. By formulating the problem of designing a flexible sampling operator as a DC optimization problem, our method ensures robust sampling for graph signals under arbitrary priors based on generalized sampling theory. We develop an efficient solver based on the general double-proximal gradient DC algorithm, which guarantees convergence to a critical point. Experimental results demonstrate the superiority of our method in sampling and recovering beyond bandlimited graph signals compared to existing approaches.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="22-a-globally-convergent-difference-of-convex-algorithmic-framework-and-application-to-log-determinant-optimization-problems-arxiv230602001"&gt;
 22. A globally convergent difference-of-convex algorithmic framework and application to log-determinant optimization problems [arXiv:2306.02001]&lt;span class="heading__anchor"&gt; &lt;a href="#22-a-globally-convergent-difference-of-convex-algorithmic-framework-and-application-to-log-determinant-optimization-problems-arxiv230602001"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Chaorui Yao, Xin Jiang&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; The difference-of-convex algorithm (DCA) is a conceptually simple method for the minimization of (possibly) nonconvex functions that are expressed as the difference of two convex functions. At each iteration, DCA constructs a global overestimator of the objective and solves the resulting convex subproblem. Despite its conceptual simplicity, the theoretical understanding and algorithmic framework of DCA need further investigation. In this paper, global convergence of DCA at a linear rate is established under an extended Polyak&amp;ndash;Łojasiewicz condition. The proposed condition holds for a class of DC programs with a bounded, closed, and convex constraint set, for which global convergence of DCA cannot be covered by existing analyses. Moreover, the DCProx computational framework is proposed, in which the DCA subproblems are solved by a primal&amp;ndash;dual proximal algorithm with Bregman distances. With a suitable choice of Bregman distances, DCProx has simple update rules with cheap per-iteration complexity. As an application, DCA is applied to several fundamental problems in network information theory, for which no existing numerical methods are able to compute the global optimum. For these problems, our analysis proves the global convergence of DCA, and more importantly, DCProx solves the DCA subproblems efficiently. Numerical experiments are conducted to verify the efficiency of DCProx.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="21-a-property-of-strictly-convex-functions-which-differ-from-each-other-by-a-constant-on-the-boundary-of-their-domain-arxiv230512183"&gt;
 21. A property of strictly convex functions which differ from each other by a constant on the boundary of their domain [arXiv:2305.12183]&lt;span class="heading__anchor"&gt; &lt;a href="#21-a-property-of-strictly-convex-functions-which-differ-from-each-other-by-a-constant-on-the-boundary-of-their-domain-arxiv230512183"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Biagio Ricceri&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, in particular, we prove the following result: Let $E$ be a reflexive real Banach space and let $C\subset E$ be a closed convex set, with non-empty interior, whose boundary is sequentially weakly closed and non-convex. Then, for every function $\varphi:\partial C\to {\bf R}$ and for every convex set $S\subseteq E^*$ dense in $E^*$, there exists $\tilde{\gamma} \in S$ having the following property: for every strictly convex lower semicontinuous function $J:C \to {\bf R}$, Gâteaux differentiable in $\hbox{int}(C)$, such that $J _{\mid\partial C}-\varphi$ is constant in $\partial C$ and $\lim _{\|x\|\to +\infty}{{J(x)}\over {\|x\|}} = +\infty$ if $C$ is unbounded, $\tilde{\gamma}$ is an algebraically interior point of $J'(\hbox{int}(C))$ (with respect to $E^*$).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="20-local-differences-determined-by-convex-sets-arxiv230400888"&gt;
 20. Local Differences Determined by Convex sets [arXiv:2304.00888]&lt;span class="heading__anchor"&gt; &lt;a href="#20-local-differences-determined-by-convex-sets-arxiv230400888"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Krishnendu Bhowmick, Miriam Patry, Oliver Roche-Newton&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; This paper introduces a new problem concerning additive properties of convex sets. Let $S= \{s_1 &amp;lt; \dots &amp;lt;s_n \}$ be a set of real numbers and let $D_i(S)= \{s_x-s_y: 1 \leq x-y \leq i\}$. We expect that $D_i(S)$ is large, with respect to the size of $S$ and the parameter $i$, for any convex set $S$. We give a construction to show that $D_3(S)$ can be as small as $n+2$, and show that this is the smallest possible size. On the other hand, we use an elementary argument to prove a non-trivial lower bound for $D_4(S)$, namely $|D_4(S)| \geq \frac{5}{4}n -1$. For sufficiently large values of $i$, we are able to prove a non-trivial bound that grows with $i$ using incidence geometry.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="19-preconditioned-algorithm-for-difference-of-convex-functions-with-applications-to-graph-ginzburg-landau-model-arxiv230314495"&gt;
 19. Preconditioned Algorithm for Difference of Convex Functions with applications to Graph Ginzburg-Landau Model [arXiv:2303.14495]&lt;span class="heading__anchor"&gt; &lt;a href="#19-preconditioned-algorithm-for-difference-of-convex-functions-with-applications-to-graph-ginzburg-landau-model-arxiv230314495"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Xinhua Shen, Hongpeng Sun, Xuecheng Tai&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this work, we propose and study a preconditioned framework with a graphic Ginzburg-Landau functional for image segmentation and data clustering by parallel computing. Solving nonlocal models is usually challenging due to the huge computation burden. For the nonconvex and nonlocal variational functional, we propose several damped Jacobi and generalized Richardson preconditioners for the large-scale linear systems within a difference of convex functions algorithms framework. They are efficient for parallel computing with GPU and can leverage the computational cost. Our framework also provides flexible step sizes with a global convergence guarantee. Numerical experiments show the proposed algorithms are very competitive compared to the singular value decomposition based spectral method.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="18-multi-uav-trajectory-planning-problem-using-the-difference-of-convex-function-programming-arxiv230307581"&gt;
 18. Multi-UAV trajectory planning problem using the difference of convex function programming [arXiv:2303.07581]&lt;span class="heading__anchor"&gt; &lt;a href="#18-multi-uav-trajectory-planning-problem-using-the-difference-of-convex-function-programming-arxiv230307581"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Anh Phuong Ngo, Christian Thomas, Ali Karimoddini, Hieu T. Nguyen&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; The trajectory planning problem for a swarm of multiple UAVs is known as a challenging nonconvex optimization problem, particularly due to a large number of collision avoidance constraints required for individual pairs of UAVs in the swarm. In this paper, we tackle this nonconvexity by leveraging the difference of convex function (DC) programming. We introduce the slack variables to relax and reformulate the collision avoidance conditions and employ the penalty function term to equivalently convert the problem into a DC form. Consequently, we construct a penalty DC algorithm in which we sequentially solve a set of convex optimization problems obtained by linearizing the collision avoidance constraint. The algorithm iteratively tightens the safety condition and reduces the objective cost of the planning problem and the additional penalty term. Numerical results demonstrate the effectiveness of the proposed approach in planning a large number of UAVs in congested space.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="17-approximate-bilevel-difference-convex-programming-for-bayesian-risk-markov-decision-processes-arxiv230111415"&gt;
 17. Approximate Bilevel Difference Convex Programming for Bayesian Risk Markov Decision Processes [arXiv:2301.11415]&lt;span class="heading__anchor"&gt; &lt;a href="#17-approximate-bilevel-difference-convex-programming-for-bayesian-risk-markov-decision-processes-arxiv230111415"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Yifan Lin, Enlu Zhou&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We consider infinite-horizon Markov Decision Processes where parameters, such as transition probabilities, are unknown and estimated from data. The popular distributionally robust approach to addressing the parameter uncertainty can sometimes be overly conservative. In this paper, we utilize the recently proposed formulation, Bayesian risk Markov Decision Process (BR-MDP), to address parameter (or epistemic) uncertainty in MDPs. To solve the infinite-horizon BR-MDP with a class of convex risk measures, we propose a computationally efficient approach called approximate bilevel difference convex programming (ABDCP). The optimization is performed offline and produces the optimal policy that is represented as a finite state controller with desirable performance guarantees. We also demonstrate the empirical performance of the BR-MDP formulation and the proposed algorithm.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="16-single-crossing-differences-in-convex-environments-arxiv221212009"&gt;
 16. Single-Crossing Differences in Convex Environments [arXiv:2212.12009]&lt;span class="heading__anchor"&gt; &lt;a href="#16-single-crossing-differences-in-convex-environments-arxiv221212009"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Navin Kartik, SangMok Lee, Daniel Rappoport&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; An agent&amp;rsquo;s preferences depend on an ordered parameter or type. We characterize the set of utility functions with single-crossing differences (SCD) in convex environments. These include preferences over lotteries, both in expected utility and rank-dependent utility frameworks, and preferences over bundles of goods and over consumption streams. Our notion of SCD does not presume an order on the choice space. This unordered SCD is necessary and sufficient for &amp;ldquo;interval choice&amp;rdquo; comparative statics. We present applications to cheap talk, observational learning, and collective choice, showing how convex environments arise in these problems and how SCD/interval choice are useful. Methodologically, our main characterization stems from a result on linear aggregations of single-crossing functions.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="15-control-of-uncertain-pwa-systems-using-difference-of-convex-decompositions-arxiv220912990"&gt;
 15. Control of Uncertain PWA Systems using Difference-of-Convex Decompositions [arXiv:2209.12990]&lt;span class="heading__anchor"&gt; &lt;a href="#15-control-of-uncertain-pwa-systems-using-difference-of-convex-decompositions-arxiv220912990"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Siddharth H. Nair, Yvonne R. Stürz&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this report, we analyze and design feedback policies for discrete-time Piecewise-Affine (PWA) systems with uncertainty in both the affine dynamics and the polytopic partition. The main idea is to utilise the Difference-of-Convex (DC) decomposition of continuous PWA systems to derive quadratic Lyapunov functions as stability certificates and stabilizing affine policies in a higher dimensional space. When projected back to the state space, we obtain time-varying PWQ Lyapunov functions and time-varying PWA feedback policies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="14-encoding-inductive-invariants-as-barrier-certificates-synthesis-via-difference-of-convex-programming-arxiv220909703"&gt;
 14. Encoding inductive invariants as barrier certificates: synthesis via difference-of-convex programming [arXiv:2209.09703]&lt;span class="heading__anchor"&gt; &lt;a href="#14-encoding-inductive-invariants-as-barrier-certificates-synthesis-via-difference-of-convex-programming-arxiv220909703"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Qiuye Wang, Mingshuai Chen, Bai Xue, Naijun Zhan, Joost-Pieter Katoen&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; A barrier certificate often serves as an inductive invariant that isolates an unsafe region from the reachable set of states, and hence is widely used in proving safety of hybrid systems possibly over an infinite time horizon. We present a novel condition on barrier certificates, termed the invariant barrier-certificate condition, that witnesses unbounded-time safety of differential dynamical systems. The proposed condition is the weakest possible one to attain inductive invariance. We show that discharging the invariant barrier-certificate condition &amp;ndash; thereby synthesizing invariant barrier certificates &amp;ndash; can be encoded as solving an optimization problem subject to bilinear matrix inequalities (BMIs). We further propose a synthesis algorithm based on difference-of-convex programming, which approaches a local optimum of the BMI problem via solving a series of convex optimization problems. This algorithm is incorporated in a branch-and-bound framework that searches for the global optimum in a divide-and-conquer fashion. We present a weak completeness result of our method, namely, a barrier certificate is guaranteed to be found (under some mild assumptions) whenever there exists an inductive invariant (in the form of a given template) that suffices to certify safety of the system. Experimental results on benchmarks demonstrate the effectiveness and efficiency of our approach.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="13-a-convex-set-with-a-rich-difference-arxiv220803258"&gt;
 13. A convex set with a rich difference [arXiv:2208.03258]&lt;span class="heading__anchor"&gt; &lt;a href="#13-a-convex-set-with-a-rich-difference-arxiv220803258"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Oliver Roche-Newton, Audie Warren&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We construct a convex set $A$ with cardinality $2n$ and with the property that an element of the difference set $A-A$ can be represented in $n$ different ways. We also show that this construction is optimal by proving that for any convex set $A$, the maximum possible number of representations an element of $A-A$ can have is $\lfloor |A|/2 \rfloor $.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="12-value-function-based-difference-of-convex-algorithm-for-bilevel-hyperparameter-selection-problems-arxiv220605976"&gt;
 12. Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems [arXiv:2206.05976]&lt;span class="heading__anchor"&gt; &lt;a href="#12-value-function-based-difference-of-convex-algorithm-for-bilevel-hyperparameter-selection-problems-arxiv220605976"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Lucy Gao, Jane J. Ye, Haian Yin, Shangzhi Zeng, Jin Zhang&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Gradient-based optimization methods for hyperparameter tuning guarantee theoretical convergence to stationary solutions when for fixed upper-level variable values, the lower level of the bilevel program is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We show that this algorithm achieves stationary solutions without LLSC and LLS assumptions for bilevel programs from a broad class of hyperparameter tuning applications. Our extensive experiments confirm our theoretical findings and show that the proposed VF-iDCA yields superior performance when applied to tune hyperparameters.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="11-decentralized-saddle-point-problems-with-different-constants-of-strong-convexity-and-strong-concavity-arxiv220600090"&gt;
 11. Decentralized Saddle-Point Problems with Different Constants of Strong Convexity and Strong Concavity [arXiv:2206.00090]&lt;span class="heading__anchor"&gt; &lt;a href="#11-decentralized-saddle-point-problems-with-different-constants-of-strong-convexity-and-strong-concavity-arxiv220600090"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Dmitriy Metelev, Alexander Rogozin, Alexander Gasnikov, Dmitry Kovalev&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Large-scale saddle-point problems arise in such machine learning tasks as GANs and linear models with affine constraints. In this paper, we study distributed saddle-point problems (SPP) with strongly-convex-strongly-concave smooth objectives that have different strong convexity and strong concavity parameters of composite terms, which correspond to min and max variables, and bilinear saddle-point part. We consider two types of first-order oracles: deterministic (returns gradient) and stochastic (returns unbiased stochastic gradient). Our method works in both cases and takes several consensus steps between oracle calls.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="10-the-difference-of-convex-algorithm-on-hadamard-manifolds-arxiv211205250"&gt;
 10. The difference of convex algorithm on Hadamard manifolds [arXiv:2112.05250]&lt;span class="heading__anchor"&gt; &lt;a href="#10-the-difference-of-convex-algorithm-on-hadamard-manifolds-arxiv211205250"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Ronny Bergmann, Orizon P. Ferreira, Elianderson M. Santos, João Carlos O. Souza&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we propose a Riemannian version of the difference of convex algorithm (DCA) to solve a minimization problem involving the difference of convex (DC) function. We establish the equivalence between the classical and simplified Riemannian versions of the DCA. We also prove that, under mild assumptions, the Riemannian version of the DCA is well-defined, and every cluster point of the sequence generated by the proposed method, if any, is a critical point of the objective DC function. Additionally, we establish some duality relations between the DC problem and its dual. To illustrate the effectiveness of the algorithm, we present some numerical experiments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="9-data-fitting-with-signomial-programming-compatible-difference-of-convex-functions-arxiv211012104"&gt;
 9. Data Fitting with Signomial Programming Compatible Difference of Convex Functions [arXiv:2110.12104]&lt;span class="heading__anchor"&gt; &lt;a href="#9-data-fitting-with-signomial-programming-compatible-difference-of-convex-functions-arxiv211012104"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Cody Karcher&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Signomial Programming (SP) has proven to be a powerful tool for engineering design optimization, striking a balance between the computational efficiency of Geometric Programming (GP) and the extensibility of more general optimization methods like Sequential Quadratic Programming (SQP). But when an existing engineering analysis tool is incompatible with the mathematics of the SP formulation, options are limited. Previous literature has suggested schemes for fitting GP compatible models to pre-computed data, but no methods have yet been proposed that take advantage of the increased modeling flexibility available in SP. This paper describes a new Soft Difference of Max Affine (SDMA) function class that is constructed from existing methods of GP compatible fitting and the theory of Difference of Convex (DC) functions. When a SDMA function is fit to data in log-log transformed space, it becomes either a signomial or a set of signomials upon inverse transformation. Three examples of fitting are presented here, including simple test cases in 2D and 3D, and a fit to the performance data of the NACA 24xx family of airfoils. In each case, RMS error is driven to less than 1%.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="8-factored-couplings-in-multi-marginal-optimal-transport-via-difference-of-convex-programming-arxiv211000629"&gt;
 8. Factored couplings in multi-marginal optimal transport via difference of convex programming [arXiv:2110.00629]&lt;span class="heading__anchor"&gt; &lt;a href="#8-factored-couplings-in-multi-marginal-optimal-transport-via-difference-of-convex-programming-arxiv211000629"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Quang Huy Tran, Hicham Janati, Ievgen Redko, Rémi Flamary, Nicolas Courty&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Optimal transport (OT) theory underlies many emerging machine learning (ML) methods nowadays solving a wide range of tasks such as generative modeling, transfer learning and information retrieval. These latter works, however, usually build upon a traditional OT setup with two distributions, while leaving a more general multi-marginal OT formulation somewhat unexplored. In this paper, we study the multi-marginal OT (MMOT) problem and unify several popular OT methods under its umbrella by promoting structural information on the coupling. We show that incorporating such structural information into MMOT results in an instance of a difference of convex (DC) programming problem, allowing us to solve it numerically. Despite the high computational cost of the latter procedure, the solutions provided by DC optimization are usually as qualitative as those obtained using currently employed optimization schemes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="7-on-the-rate-of-convergence-of-the-difference-of-convex-algorithm-dca-arxiv210913566"&gt;
 7. On the rate of convergence of the Difference-of-Convex Algorithm (DCA) [arXiv:2109.13566]&lt;span class="heading__anchor"&gt; &lt;a href="#7-on-the-rate-of-convergence-of-the-difference-of-convex-algorithm-dca-arxiv210913566"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Hadi Abbaszadehpeivasti, Etienne de Klerk, Moslem Zamani&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we study the convergence rate of the DCA (Difference-of-Convex Algorithm), also known as the convex-concave procedure, with two different termination criteria that are suitable for smooth and nonsmooth decompositions respectively. The DCA is a popular algorithm for difference-of-convex (DC) problems, and known to converge to a stationary point of the objective under some assumptions. We derive a worst-case convergence rate of $O(1/\sqrt{N})$ after $N$ iterations of the objective gradient norm for certain classes of DC problems, without assuming strong convexity in the DC decomposition, and give an example which shows the convergence rate is exact. We also provide a new convergence rate of $O(1/N)$ for the DCA with the second termination criterion. Moreover, we derive a new linear convergence rate result for the DCA under the assumption of the Polyak-Łojasiewicz inequality. The novel aspect of our analysis is that it employs semidefinite programming performance estimation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="6-a-different-perspective-on-the-stochastic-convex-feasibility-problem-arxiv210812029"&gt;
 6. A Different Perspective On The Stochastic Convex Feasibility Problem [arXiv:2108.12029]&lt;span class="heading__anchor"&gt; &lt;a href="#6-a-different-perspective-on-the-stochastic-convex-feasibility-problem-arxiv210812029"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; James Renegar, Song Zhou&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We analyze a simple randomized subgradient method for approximating solutions to stochastic systems of convex functional constraints, the only input to the algorithm being the size of minibatches. By introducing a new notion of what is meant for a point to approximately solve the constraints, determining bounds on the expected number of iterations reduces to determining a hitting time for a compound Bernoulli process, elementary probability. Besides bounding the expected number of iterations quite generally, we easily establish concentration inequalities on the number of iterations, and more interesting, we establish much-improved bounds when a notion akin to Hölderian growth is satisfied, for all degrees of growth, not just the linear growth of piecewise-linear convex functions or the quadratic growth of strongly convex functions. Finally, we establish the analogous results under a slight modification to the algorithm which results in the user knowing with high confidence an iterate is in hand that approximately solves the system. Perhaps surprisingly, the iteration bounds here are deterministic &amp;ndash; all of the probability gets wrapped into the confidence level (albeit at the expense of potentially large minibatches).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="5-retraction-based-first-order-feasible-methods-for-difference-of-convex-programs-with-smooth-inequality-and-simple-geometric-constraints-arxiv210608584"&gt;
 5. Retraction-based first-order feasible methods for difference-of-convex programs with smooth inequality and simple geometric constraints [arXiv:2106.08584]&lt;span class="heading__anchor"&gt; &lt;a href="#5-retraction-based-first-order-feasible-methods-for-difference-of-convex-programs-with-smooth-inequality-and-simple-geometric-constraints-arxiv210608584"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Yongle Zhang, Guoyin Li, Ting Kei Pong, Shiqi Xu&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we propose first-order feasible methods for difference-of-convex (DC) programs with smooth inequality and simple geometric constraints. Our strategy for maintaining feasibility of the iterates is based on a &amp;ldquo;retraction&amp;rdquo; idea adapted from the literature of manifold optimization. When the constraints are convex, we establish the global subsequential convergence of the sequence generated by our algorithm under strict feasibility condition, and analyze its convergence rate when the objective is in addition convex according to the Kurdyka-Lojasiewicz (KL) exponent of the extended objective (i.e., sum of the objective and the indicator function of the constraint set). We also show that the extended objective of a large class of Euclidean norm (and more generally, group LASSO penalty) regularized convex optimization problems is a KL function with exponent $\frac12$; consequently, our algorithm is locally linearly convergent when applied to these problems. We then extend our method to solve DC programs with a single specially structured nonconvex constraint. Finally, we discuss how our algorithms can be applied to solve two concrete optimization problems, namely, group-structured compressed sensing problems with Gaussian measurement noise and compressed sensing problems with Cauchy measurement noise, and illustrate the empirical performance of our algorithms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="4-synthesizing-invariant-barrier-certificates-via-difference-of-convex-programming-arxiv210514311"&gt;
 4. Synthesizing Invariant Barrier Certificates via Difference-of-Convex Programming [arXiv:2105.14311]&lt;span class="heading__anchor"&gt; &lt;a href="#4-synthesizing-invariant-barrier-certificates-via-difference-of-convex-programming-arxiv210514311"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Qiuye Wang, Mingshuai Chen, Bai Xue, Naijun Zhan, Joost-Pieter Katoen&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; A barrier certificate often serves as an inductive invariant that isolates an unsafe region from the reachable set of states, and hence is widely used in proving safety of hybrid systems possibly over the infinite time horizon. We present a novel condition on barrier certificates, termed the invariant barrier-certificate condition, that witnesses unbounded-time safety of differential dynamical systems. The proposed condition is by far the least conservative one on barrier certificates, and can be shown as the weakest possible one to attain inductive invariance. We show that discharging the invariant barrier-certificate condition &amp;ndash; thereby synthesizing invariant barrier certificates &amp;ndash; can be encoded as solving an optimization problem subject to bilinear matrix inequalities (BMIs). We further propose a synthesis algorithm based on difference-of-convex programming, which approaches a local optimum of the BMI problem via solving a series of convex optimization problems. This algorithm is incorporated in a branch-and-bound framework that searches for the global optimum in a divide-and-conquer fashion. We present a weak completeness result of our method, in the sense that a barrier certificate is guaranteed to be found (under some mild assumptions) whenever there exists an inductive invariant (in the form of a given template) that suffices to certify safety of the system. Experimental results on benchmark examples demonstrate the effectiveness and efficiency of our approach.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="3-algorithms-for-difference-of-convex-dc-programs-based-on-difference-of-moreau-envelopes-smoothing-arxiv210401470"&gt;
 3. Algorithms for Difference-of-Convex (DC) Programs Based on Difference-of-Moreau-Envelopes Smoothing [arXiv:2104.01470]&lt;span class="heading__anchor"&gt; &lt;a href="#3-algorithms-for-difference-of-convex-dc-programs-based-on-difference-of-moreau-envelopes-smoothing-arxiv210401470"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Kaizhao Sun, Xu Andy Sun&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper we consider minimization of a difference-of-convex (DC) function with and without linear constraints. We first study a smooth approximation of a generic DC function, termed difference-of-Moreau-envelopes (DME) smoothing, where both components of the DC function are replaced by their respective Moreau envelopes. The resulting smooth approximation is shown to be Lipschitz differentiable, capture stationary points, local, and global minima of the original DC function, and enjoy some growth conditions, such as level-boundedness and coercivity, for broad classes of DC functions. We then develop four algorithms for solving DC programs with and without linear constraints based on the DME smoothing. In particular, for a smoothed DC program without linear constraints, we show that the classic gradient descent method as well as an inexact variant can obtain a stationary solution in the limit with a convergence rate of $\mathcal{O}(K^{-1/2})$, where $K$ is the number of proximal evaluations of both components. Furthermore, when the DC program is explicitly constrained in an affine subspace, we combine the smoothing technique with the augmented Lagrangian function and derive two variants of the augmented Lagrangian method (ALM), named LCDC-ALM and composite LCDC-ALM, focusing on different structures of the DC objective function. We show that both algorithms find an $ε$-approximate stationary solution of the original DC program in $\mathcal{O}(ε^{-2})$ iterations. Compared with existing methods designed for linearly constrained weakly convex minimization, the proposed ALM-based algorithms can be applied to a broader class of problems, where the objective contains a nonsmooth concave component. Finally, numerical experiments are presented to demonstrate the performance of the proposed algorithms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2104.01470"&gt;https://arxiv.org/abs/2104.01470&lt;/a&gt;&lt;/p&gt;
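&lt;p&gt;&lt;strong&gt;Sketch (not from the paper):&lt;/strong&gt; As a reading aid, the DME smoothing described in the abstract can be written out as follows. The notation is an assumption made here for illustration: the DC function is $F = \phi - g$ with $\phi$ and $g$ proper closed convex functions, and $\mu &amp;gt; 0$ is the smoothing parameter; the paper's own symbols may differ.
$$M_{\mu f}(x) = \min_{z}\Big\{ f(z) + \frac{1}{2\mu}\|z - x\|^2 \Big\}, \qquad F_\mu(x) = M_{\mu\phi}(x) - M_{\mu g}(x).$$
Because the Moreau envelope of a proper closed convex function satisfies $\nabla M_{\mu f}(x) = \big(x - \mathrm{prox}_{\mu f}(x)\big)/\mu$, the smoothed objective has the gradient
$$\nabla F_\mu(x) = \frac{1}{\mu}\big(\mathrm{prox}_{\mu g}(x) - \mathrm{prox}_{\mu\phi}(x)\big),$$
so one gradient step on $F_\mu$ costs one proximal evaluation of each component, which is consistent with the abstract measuring complexity in proximal evaluations.&lt;/p&gt;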
&lt;hr&gt;
&lt;h2 class="heading" id="2-cdinn--convex-difference-neural-networks-arxiv210317231"&gt;
 2. CDiNN -Convex Difference Neural Networks [arXiv:2103.17231]&lt;span class="heading__anchor"&gt; &lt;a href="#2-cdinn--convex-difference-neural-networks-arxiv210317231"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Parameswaran Sankaranarayanan, Raghunathan Rengaswamy&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Neural networks with ReLU activation function have been shown to be universal function approximators and learn function mapping as non-smooth functions. Recently, there is considerable interest in the use of neural networks in applications such as optimal control. It is well-known that optimization involving non-convex, non-smooth functions is computationally intensive and has limited convergence guarantees. Moreover, the choice of optimization hyper-parameters used in gradient descent/ascent significantly affects the quality of the obtained solutions. A new neural network architecture called the Input Convex Neural Network (ICNN) learns the output as a convex function of inputs, thereby allowing the use of efficient convex optimization methods. Use of ICNNs for determining the input for minimizing output has two major problems: learning of a non-convex function as a convex mapping could result in significant function approximation error, and we also note that the existing representations cannot capture simple dynamic structures like linear time delay systems. We attempt to address the above problems by introduction of a new neural network architecture, which we call the CDiNN, which learns the function as a difference of polyhedral convex functions from data. We also discuss that, in some cases, the optimal input can be obtained from CDiNN through difference of convex optimization with convergence guarantees and that at each iteration, the problem is reduced to a linear programming problem.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2103.17231"&gt;https://arxiv.org/abs/2103.17231&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="1-a-difference-of-convex-cutting-plane-algorithm-for-mixed-binary-linear-program-arxiv210300717"&gt;
 1. A Difference-of-Convex Cutting Plane Algorithm for Mixed-Binary Linear Program [arXiv:2103.00717]&lt;span class="heading__anchor"&gt; &lt;a href="#1-a-difference-of-convex-cutting-plane-algorithm-for-mixed-binary-linear-program-arxiv210300717"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Yi-Shuai Niu, Yu You&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper, we propose a cutting plane algorithm based on DC (Difference-of-Convex) programming and DC cut for globally solving Mixed-Binary Linear Program (MBLP). We first use a classical DC programming formulation via the exact penalization to formulate MBLP as a DC program, which can be solved by the DCA algorithm. Then, we focus on the construction of DC cuts, which serve either as a local cut (namely type-I DC cut) at a feasible local minimizer of MBLP, or as a global cut (namely type-II DC cut) at an infeasible local minimizer of MBLP if some particular assumptions are verified. Otherwise, the constructibility of DC cut is still unclear, and we propose to use classical global cuts (such as the Lift-and-Project cut) instead. Combining DC cut and classical global cuts, a cutting plane algorithm, namely DCCUT, is established for globally solving MBLP. The convergence theorem of DCCUT is proved. Restarting DCA in DCCUT helps to quickly update the upper bound solution and to introduce more DC cuts for lower bound improvement. A variant of DCCUT by introducing more classical global cuts in each iteration is proposed, and parallel versions of DCCUT and its variant are also designed which use the power of multiple processors for better performance. Numerical simulations of DCCUT type algorithms compared with the classical cutting plane algorithm using Lift-and-Project cuts are reported. Tests on some specific samples and the MIPLIB 2017 benchmark dataset demonstrate the benefits of DC cut and good performance of DCCUT algorithms.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2103.00717"&gt;https://arxiv.org/abs/2103.00717&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;</description></item><item><title>Publications</title><link>https://blog.namln.org/en/publications/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/en/publications/</guid><description>&lt;h2 class="heading" id="notes-and-pre-prints"&gt;
 Notes and Pre-prints&lt;span class="heading__anchor"&gt; &lt;a href="#notes-and-pre-prints"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Nam Le, Extreme Points and the Krein&amp;ndash;Milman Theorem: A note on Brezis Problem 1, 2026, &lt;em&gt;Comments are welcome.&lt;/em&gt; &lt;a href="https://blog.namln.org/pubs/brezis-problem-01.pdf"&gt;pdf&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="slides-talks"&gt;
 Slides, Talks&lt;span class="heading__anchor"&gt; &lt;a href="#slides-talks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Nam Le, &amp;ldquo;GIÁ TRỊ SHAP TRONG HỌC MÁY GIẢI THÍCH&amp;rdquo; (SHAP Values in Explainable Machine Learning, in Vietnamese), 2026. &lt;a href="https://blog.namln.org/slides/SHAP-Values-Vi.pdf"&gt;pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[2] Nam Le, Lecture slides for Introduction to Machine Learning, 2026. &lt;a href=""&gt;pdfs&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="journal-publications"&gt;
 Journal Publications&lt;span class="heading__anchor"&gt; &lt;a href="#journal-publications"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Le, Thanh, &lt;strong&gt;Nam Le&lt;/strong&gt;, and Bac Le. &amp;ldquo;&lt;a href="https://www.sciencedirect.com/science/article/abs/pii/S0957417422021406"&gt;Knowledge graph embedding by relational rotation and complex convolution for link prediction.&lt;/a&gt;&amp;rdquo; Expert Systems with Applications 214 (2023): 119122. (ISI, Q1, IF: 8.6 2023)&lt;/p&gt;
&lt;h2 class="heading" id="international-conference-publications"&gt;
 International Conference Publications&lt;span class="heading__anchor"&gt; &lt;a href="#international-conference-publications"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Thanh Le, &lt;strong&gt;Nam Le&lt;/strong&gt;, and Bac Le. &amp;ldquo;&lt;a href="https://link.springer.com/chapter/10.1007/978-3-031-21743-2_19"&gt;Embedding Model with Attention over Convolution Kernels and Dynamic Mapping Matrix for Link Prediction.&lt;/a&gt;&amp;rdquo; In Asian Conference on Intelligent Information and Database Systems, pp. 234-246. Springer, Cham, 2022. (Rank B, CORERANK 2021)&lt;/p&gt;
&lt;p&gt;[2] Tung Luu*, &lt;strong&gt;Nam Le&lt;/strong&gt;, Duc Le, and Bac Le. (2025, February). From Visual Explanations to Counterfactual Explanations with Latent Diffusion. Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 420–429. (Rank A, CORE 2023, * means first author)&lt;/p&gt;
&lt;p&gt;[3] &lt;strong&gt;Nam Le&lt;/strong&gt;, Thanh Le, and Bac Le (2025). Improving Temporal Knowledge Graph Completion via Tensor Decomposition with Relation-Time Context and Multi-Time Perspective. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3, ISBN 978-989-758-737-5, ISSN 2184-433X, pages 326-333. (Rank B, CORE 2023) &lt;a href="../files/ICAART2025 - MPComplEx - Slide.pdf" target="_blank" type="application/pdf"&gt;[Slide]&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[4] &lt;strong&gt;Nam Le&lt;/strong&gt;, Thanh Le, and Bac Le (2025). Improving Temporal Knowledge Graph Forecasting via Multi-Rewards Mechanism and Confidence-Guided Tensor Decomposition Reinforcement Learning. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 1, ISBN 978-989-758-737-5, ISSN 2184-433X, pages 68-79. (Rank B, CORE 2023) &lt;a href="../files/ICAART2025 - CATTer - Slide.pdf" target="_blank" type="application/pdf"&gt;[Slide]&lt;/a&gt;&lt;/p&gt;
&lt;h2 class="heading" id="domestic-conference-publications"&gt;
 Domestic Conference Publications&lt;span class="heading__anchor"&gt; &lt;a href="#domestic-conference-publications"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] &lt;strong&gt;Nam Le&lt;/strong&gt;, Thanh Le, and Bac Le (2025). Improving Temporal Knowledge Graph Forecasting via Multi-reward mechanism and Confidence-Augmented Reinforcement Learning. The 14th Scientific Conference (VNUHCM-US Conf 2024)&lt;/p&gt;</description></item><item><title>Recent Advances in Research on Difference-of-Convex (DC) Programming</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/dc-programming/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/dc-programming/</guid><description/></item><item><title>Second-order Stochastic Optimization methods for Machine Learning</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/soms/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/soms/</guid><description>&lt;h2 class="heading" id="analysis-of-the-hessian"&gt;
 Analysis of the Hessian&lt;span class="heading__anchor"&gt; &lt;a href="#analysis-of-the-hessian"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-empirical-analysis-of-the-hessian-of-over-parametrized-neural-networks"&gt;
 1. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks&lt;span class="heading__anchor"&gt; &lt;a href="#1-empirical-analysis-of-the-hessian-of-over-parametrized-neural-networks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2017&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Levent Sagun, Utku Evci, V. Ugur Guney, Yann Dauphin, Leon Bottou&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1706.04454&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1706.04454"&gt;https://arxiv.org/abs/1706.04454&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We study the properties of common loss surfaces through their Hessian matrix. In particular, in the context of deep learning, we empirically show that the spectrum of the Hessian is composed of two parts: (1) the bulk centered near zero, (2) and outliers away from the bulk. We present numerical evidence and mathematical justifications to the following conjectures laid out by Sagun et al. (2016): Fixing data, increasing the number of parameters merely scales the bulk of the spectrum; fixing the dimension and changing the data (for instance adding more clusters or making the data less separable) only affects the outliers. We believe that our observations have striking implications for non-convex optimization in high dimensions. First, the flatness of such landscapes (which can be measured by the singularity of the Hessian) implies that classical notions of basins of attraction may be quite misleading. And that the discussion of wide/narrow basins may be in need of a new perspective around over-parametrization and redundancy that are able to create large connected components at the bottom of the landscape. Second, the dependence of small number of large eigenvalues to the data distribution can be linked to the spectrum of the covariance matrix of gradients of model outputs. With this in mind, we may reevaluate the connections within the data-architecture-algorithm framework of a model, hoping that it would shed light into the geometry of high-dimensional and non-convex spaces in modern applications. In particular, we present a case that links the two observations: small and large batch gradient descent appear to converge to different basins of attraction but we show that they are in fact connected through their flat region and so belong to the same basin.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="2-the-full-spectrum-of-deepnet-hessians-at-scale-dynamics-with-sgd-training-and-sample-size"&gt;
 2. The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size&lt;span class="heading__anchor"&gt; &lt;a href="#2-the-full-spectrum-of-deepnet-hessians-at-scale-dynamics-with-sgd-training-and-sample-size"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2018&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Vardan Papyan&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1811.07062&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1811.07062"&gt;https://arxiv.org/abs/1811.07062&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We apply state-of-the-art tools in modern high-dimensional numerical linear algebra to approximate efficiently the spectrum of the Hessian of modern deepnets, with tens of millions of parameters, trained on real data. Our results corroborate previous findings, based on small-scale networks, that the Hessian exhibits &amp;ldquo;spiked&amp;rdquo; behavior, with several outliers isolated from a continuous bulk. We decompose the Hessian into different components and study the dynamics with training and sample size of each term individually.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="3-pyhessian-neural-networks-through-the-lens-of-the-hessian"&gt;
 3. PyHessian: Neural Networks Through the Lens of the Hessian&lt;span class="heading__anchor"&gt; &lt;a href="#3-pyhessian-neural-networks-through-the-lens-of-the-hessian"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2019&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Zhewei Yao, Amir Gholami, Kurt Keutzer, Michael W. Mahoney&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1912.07145&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1912.07145"&gt;https://arxiv.org/abs/1912.07145&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We present PYHESSIAN, a new scalable framework that enables fast computation of Hessian (i.e., second-order derivative) information for deep neural networks. PYHESSIAN enables fast computations of the top Hessian eigenvalues, the Hessian trace, and the full Hessian eigenvalue/spectral density, and it supports distributed-memory execution on cloud/supercomputer systems and is available as open source. This general framework can be used to analyze neural network models, including the topology of the loss landscape (i.e., curvature information) to gain insight into the behavior of different models/optimizers. To illustrate this, we analyze the effect of residual connections and Batch Normalization layers on the trainability of neural networks. One recent claim, based on simpler first-order analysis, is that residual connections and Batch Normalization make the loss landscape smoother, thus making it easier for Stochastic Gradient Descent to converge to a good solution. Our extensive analysis shows new finer-scale insights, demonstrating that, while conventional wisdom is sometimes validated, in other cases it is simply incorrect. In particular, we find that Batch Normalization does not necessarily make the loss landscape smoother, especially for shallower networks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; Known repository: &lt;a href="https://github.com/amirgholami/PyHessian"&gt;https://github.com/amirgholami/PyHessian&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="4-a-deeper-look-at-the-hessian-eigenspectrum-of-deep-neural-networks-and-its-applications-to-regularization"&gt;
 4. A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization&lt;span class="heading__anchor"&gt; &lt;a href="#4-a-deeper-look-at-the-hessian-eigenspectrum-of-deep-neural-networks-and-its-applications-to-regularization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2020&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, Vineeth N Balasubramanian&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2012.03801&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2012.03801"&gt;https://arxiv.org/abs/2012.03801&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Loss landscape analysis is extremely useful for a deeper understanding of the generalization ability of deep neural network models. In this work, we propose a layerwise loss landscape analysis where the loss surface at every layer is studied independently and also how each correlates to the overall loss surface. We study the layerwise loss landscape by studying the eigenspectra of the Hessian at each layer. In particular, our results show that the layerwise Hessian geometry is largely similar to the entire Hessian. We also report an interesting phenomenon where the Hessian eigenspectrum of middle layers of the deep neural network is observed to be most similar to the overall Hessian eigenspectrum. We also show that the maximum eigenvalue and the trace of the Hessian (both full network and layerwise) reduce as training of the network progresses. We leverage these observations to propose a new regularizer based on the trace of the layerwise Hessian. Penalizing the trace of the Hessian at every layer indirectly forces Stochastic Gradient Descent to converge to flatter minima, which are shown to have better generalization performance. In particular, we show that such a layerwise regularizer can be leveraged to penalize the middlemost layers alone, which yields promising results. Our empirical studies on well-known deep nets across datasets support the claims of this work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="diagonal-scaling"&gt;
 Diagonal Scaling&lt;span class="heading__anchor"&gt; &lt;a href="#diagonal-scaling"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-adahessian-an-adaptive-second-order-optimizer-for-machine-learning"&gt;
 1. AdaHessian: An Adaptive Second Order Optimizer for Machine Learning&lt;span class="heading__anchor"&gt; &lt;a href="#1-adahessian-an-adaptive-second-order-optimizer-for-machine-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2020&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Zhewei Yao, Amir Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2006.00719&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; AdaHessian&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2006.00719"&gt;https://arxiv.org/abs/2006.00719&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We introduce ADAHESSIAN, a second order stochastic optimization algorithm which dynamically incorporates the curvature of the loss function via ADAptive estimates of the HESSIAN. Second order algorithms are among the most powerful optimization algorithms with superior convergence properties as compared to first order methods such as SGD and Adam. The main disadvantage of traditional second order methods is their heavier per iteration computation and poor accuracy as compared to first order methods. To address these, we incorporate several novel approaches in ADAHESSIAN, including: (i) a fast Hutchinson based method to approximate the curvature matrix with low computational overhead; (ii) a root-mean-square exponential moving average to smooth out variations of the Hessian diagonal across different iterations; and (iii) a block diagonal averaging to reduce the variance of Hessian diagonal elements. We show that ADAHESSIAN achieves new state-of-the-art results by a large margin as compared to other adaptive optimization methods, including variants of Adam. In particular, we perform extensive tests on CV, NLP, and recommendation system tasks and find that ADAHESSIAN: (i) achieves 1.80%/1.45% higher accuracy on ResNets20/32 on Cifar10, and 5.55% higher accuracy on ImageNet as compared to Adam; (ii) outperforms AdamW for transformers by 0.13/0.33 BLEU score on IWSLT14/WMT14 and 2.7/1.0 PPL on PTB/Wikitext-103; (iii) outperforms AdamW for SqueezeBert by 0.41 points on GLUE; and (iv) achieves 0.032% better score than Adagrad for DLRM on the Criteo Ad Kaggle dataset. Importantly, we show that the cost per iteration of ADAHESSIAN is comparable to first order methods, and that it exhibits robustness towards its hyperparameters.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; Known repository: &lt;a href="https://github.com/amirgholami/adahessian"&gt;https://github.com/amirgholami/adahessian&lt;/a&gt;&lt;/p&gt;
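&lt;p&gt;&lt;strong&gt;Sketch (not from the paper):&lt;/strong&gt; Item (i) of the abstract relies on a Hutchinson-type estimator of the Hessian diagonal. The minimal pure-Python sketch below illustrates only that idea: with Rademacher vectors $z$, the average of $z \odot (Hz)$ is an unbiased estimate of $\mathrm{diag}(H)$. The function names and the explicit matrix standing in for an autodiff Hessian-vector product are assumptions made for illustration. The moving average and block averaging from items (ii) and (iii) are omitted.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;import random


def matvec(matrix, vector):
    """Multiply an explicit matrix by a vector.

    Hypothetical stand-in for an autodiff Hessian-vector product.
    """
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def hutchinson_diagonal(hvp, dim, num_samples, rng):
    """Estimate diag(H) by averaging z * (H z) over Rademacher vectors z."""
    estimate = [0.0] * dim
    for _ in range(num_samples):
        # Rademacher probe: each entry is -1 or +1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)
        for i in range(dim):
            estimate[i] += z[i] * hz[i] / num_samples
    return estimate


if __name__ == "__main__":
    hessian = [[4.0, 1.0, 0.0],
               [1.0, 3.0, 0.5],
               [0.0, 0.5, 2.0]]
    rng = random.Random(0)
    approx = hutchinson_diagonal(lambda v: matvec(hessian, v), 3, 2000, rng)
    print(approx)  # approaches the true diagonal [4.0, 3.0, 2.0] as samples grow&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The estimator only needs products $Hz$, never the full matrix, which is why its cost can stay close to that of a few extra backward passes.&lt;/p&gt;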
&lt;hr&gt;
&lt;h3 class="heading" id="2-sophia-a-scalable-stochastic-second-order-optimizer-for-language-model-pre-training"&gt;
 2. Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training&lt;span class="heading__anchor"&gt; &lt;a href="#2-sophia-a-scalable-stochastic-second-order-optimizer-for-language-model-pre-training"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2023&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Hong Liu, Zhiyuan Li, David Hall, Percy Liang, Tengyu Ma&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2305.14342&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; Sophia&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2305.14342"&gt;https://arxiv.org/abs/2305.14342&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction on the time and cost of training. Adam and its variants have been state-of-the-art for years, and more sophisticated second-order (Hessian-based) optimizers often incur too much per-step overhead. In this paper, we propose Sophia, Second-order Clipped Stochastic Optimization, a simple scalable second-order optimizer that uses a light-weight estimate of the diagonal Hessian as the pre-conditioner. The update is the moving average of the gradients divided by the moving average of the estimated Hessian, followed by element-wise clipping. The clipping controls the worst-case update size and tames the negative impact of non-convexity and rapid change of Hessian along the trajectory. Sophia only estimates the diagonal Hessian every handful of iterations, which has negligible average per-step time and memory overhead. On language modeling with GPT models of sizes ranging from 125M to 1.5B, Sophia achieves a 2x speed-up compared to Adam in the number of steps, total compute, and wall-clock time, achieving the same perplexity with 50% fewer steps, less total compute, and reduced wall-clock time. Theoretically, we show that Sophia, in a much simplified setting, adapts to the heterogeneous curvatures in different parameter dimensions, and thus has a run-time bound that does not depend on the condition number of the loss.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; Known repository: &lt;a href="https://github.com/Liuhong99/Sophia"&gt;https://github.com/Liuhong99/Sophia&lt;/a&gt;&lt;/p&gt;
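&lt;p&gt;&lt;strong&gt;Sketch (not from the paper):&lt;/strong&gt; The update described in the abstract (a moving average of gradients divided by a moving average of the estimated diagonal Hessian, followed by element-wise clipping) can be illustrated with the minimal pure-Python sketch below. The function names, default hyperparameters, and the omission of weight decay are assumptions made for illustration; the linked repository is the reference for the authors' implementation.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;def ema(previous, new, beta):
    """Exponential moving average applied element-wise."""
    return [beta * p + (1.0 - beta) * n for p, n in zip(previous, new)]


def sophia_like_step(theta, grad, m, h, lr=0.1, beta1=0.9,
                     gamma=1.0, eps=1e-12, rho=1.0):
    """One clipped, diagonally preconditioned step (illustrative only).

    theta: parameters; grad: current gradient; m: gradient EMA;
    h: EMA of a diagonal Hessian estimate, assumed to be refreshed
    elsewhere every few iterations. Returns the updated (theta, m).
    """
    m = ema(m, grad, beta1)
    new_theta = []
    for t, m_i, h_i in zip(theta, m, h):
        ratio = m_i / max(gamma * h_i, eps)    # precondition by curvature
        clipped = max(-rho, min(rho, ratio))   # element-wise clipping
        new_theta.append(t - lr * clipped)
    return new_theta, m


if __name__ == "__main__":
    theta, m = sophia_like_step(
        theta=[0.0, 0.0], grad=[10.0, 0.001], m=[0.0, 0.0],
        h=[1e-6, 1.0], lr=0.1, beta1=0.0)
    print(theta)  # first coordinate is clipped, second is a small Newton-like step&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this sketch a non-positive entry of &lt;code&gt;h&lt;/code&gt; falls back to &lt;code&gt;eps&lt;/code&gt;, so the corresponding coordinate takes a sign step of size &lt;code&gt;lr * rho&lt;/code&gt;; this is one way the clipping bounds the worst-case update size.&lt;/p&gt;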
&lt;hr&gt;
&lt;h2 class="heading" id="hessian-free-optimization"&gt;
 Hessian-free Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#hessian-free-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-learning-recurrent-neural-networks-with-hessian-free-optimization"&gt;
 1. Learning Recurrent Neural Networks with Hessian-Free Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#1-learning-recurrent-neural-networks-with-hessian-free-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2011&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; James Martens, Ilya Sutskever&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://www.cs.toronto.edu/~jmartens/docs/RNN_HF.pdf"&gt;https://www.cs.toronto.edu/~jmartens/docs/RNN_HF.pdf&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this work we resolve the long-outstanding problem of how to effectively train recurrent neural networks (RNNs) on complex and difficult sequence modeling problems which may contain long-term data dependencies. Utilizing recent advances in the Hessian-free optimization approach (Martens, 2010), together with a novel damping scheme, we successfully train RNNs on two sets of challenging problems. First, a collection of pathological synthetic datasets which are known to be impossible for standard optimization approaches (due to their extremely long-term dependencies), and second, on three natural and highly complex real-world sequence datasets where we find that our method significantly outperforms the previous state-of-the-art method for training neural sequence models: the Long Short-term Memory approach of Hochreiter and Schmidhuber (1997). Additionally, we offer a new interpretation of the generalized Gauss-Newton matrix of Schraudolph (2002) which is used within the HF approach of Martens.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="2-training-neural-networks-with-stochastic-hessian-free-optimization"&gt;
 2. Training Neural Networks with Stochastic Hessian-Free Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#2-training-neural-networks-with-stochastic-hessian-free-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2013&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Ryan Kiros&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1301.3641&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; SHF&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1301.3641"&gt;https://arxiv.org/abs/1301.3641&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Hessian-free (HF) optimization has been successfully used for training deep autoencoders and recurrent networks. HF uses the conjugate gradient algorithm to construct update directions through curvature-vector products that can be computed on the same order of time as gradients. In this paper we exploit this property and study stochastic HF with gradient and curvature mini-batches independent of the dataset size. We modify Martens&amp;rsquo; HF for these settings and integrate dropout, a method for preventing co-adaptation of feature detectors, to guard against overfitting. Stochastic Hessian-free optimization gives an intermediary between SGD and HF that achieves competitive performance on both classification and deep autoencoder experiments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
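&lt;p&gt;&lt;strong&gt;Sketch (not from the paper):&lt;/strong&gt; The abstract notes that HF builds update directions with conjugate gradient using only curvature-vector products. The minimal pure-Python sketch below shows plain conjugate gradient on the damped system $(B + \lambda I)d = -g$, where $B$ is accessed only through a callable. The function names, the constant damping term, and the stopping rule are assumptions made for illustration; mini-batch curvature, dropout, and the damping heuristics of Martens' HF are omitted, and $B + \lambda I$ is assumed positive definite.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(curvature_vec, grad, damping=0.0,
                       max_iters=50, tol=1e-10):
    """Approximately solve (B + damping * I) d = -grad.

    curvature_vec(v) must return B v; B itself is never formed.
    """
    d = [0.0] * len(grad)
    r = [-g for g in grad]      # residual of the system at d = 0
    p = list(r)                 # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iters):
        if tol >= rs_old:       # squared residual is small enough
            break
        bp = [b + damping * p_i for b, p_i in zip(curvature_vec(p), p)]
        alpha = rs_old / dot(p, bp)
        d = [d_i + alpha * p_i for d_i, p_i in zip(d, p)]
        r = [r_i - alpha * bp_i for r_i, bp_i in zip(r, bp)]
        rs_new = dot(r, r)
        p = [r_i + (rs_new / rs_old) * p_i for r_i, p_i in zip(r, p)]
        rs_old = rs_new
    return d


if __name__ == "__main__":
    B = [[4.0, 1.0], [1.0, 3.0]]
    product = lambda v: [dot(row, v) for row in B]
    print(conjugate_gradient(product, grad=[-1.0, -2.0]))  # solves B d = [1, 2]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each iteration costs one curvature-vector product, which is what makes the approach feasible when the curvature matrix is too large to store.&lt;/p&gt;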
&lt;hr&gt;
&lt;h2 class="heading" id="quasi-newton"&gt;
 Quasi-Newton&lt;span class="heading__anchor"&gt; &lt;a href="#quasi-newton"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-a-stochastic-quasi-newton-method-for-large-scale-optimization"&gt;
 1. A Stochastic Quasi-Newton Method for Large-Scale Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#1-a-stochastic-quasi-newton-method-for-large-scale-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2014&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; R.H. Byrd, S.L. Hansen, J. Nocedal, Y. Singer&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1401.7020&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1401.7020"&gt;https://arxiv.org/abs/1401.7020&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; The question of how to incorporate curvature information in stochastic approximation methods is challenging. The direct application of classical quasi-Newton updating techniques for deterministic optimization leads to noisy curvature estimates that have harmful effects on the robustness of the iteration. In this paper, we propose a stochastic quasi-Newton method that is efficient, robust and scalable. It employs the classical BFGS update formula in its limited memory form, and is based on the observation that it is beneficial to collect curvature information pointwise, and at regular intervals, through (sub-sampled) Hessian-vector products. This technique differs from the classical approach that would compute differences of gradients, and where controlling the quality of the curvature estimates can be difficult. We present numerical results on problems arising in machine learning that suggest that the proposed method shows much promise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="2-a-multi-batch-l-bfgs-method-for-machine-learning"&gt;
 2. A Multi-Batch L-BFGS Method for Machine Learning&lt;span class="heading__anchor"&gt; &lt;a href="#2-a-multi-batch-l-bfgs-method-for-machine-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2016&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Albert S. Berahas, Jorge Nocedal, Martin Takáč&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1605.06049&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1605.06049"&gt;https://arxiv.org/abs/1605.06049&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration. This can cause difficulties because L-BFGS employs gradient differences to update the Hessian approximations, and when these gradients are computed using different data points the process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, illustrates the behavior of the algorithm in a distributed computing platform, and studies its convergence properties for both the convex and nonconvex cases.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="3-stochastic-quasi-newton-with-line-search-regularization"&gt;
 3. Stochastic Quasi-Newton with Line-Search Regularization&lt;span class="heading__anchor"&gt; &lt;a href="#3-stochastic-quasi-newton-with-line-search-regularization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2019&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Adrian Wills, Thomas Schön&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1909.01238&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; SQN&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1909.01238"&gt;https://arxiv.org/abs/1909.01238&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; In this paper we present a novel quasi-Newton algorithm for use in stochastic optimisation. Quasi-Newton methods have had an enormous impact on deterministic optimisation problems because they afford rapid convergence and computationally attractive algorithms. In essence, this is achieved by learning the second-order (Hessian) information based on observing first-order gradients. We extend these ideas to the stochastic setting by employing a highly flexible model for the Hessian and infer its value based on observing noisy gradients. In addition, we propose a stochastic counterpart to standard line-search procedures and demonstrate the utility of this combination on maximum likelihood identification for general nonlinear state space models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="4-practical-quasi-newton-methods-for-training-deep-neural-networks"&gt;
 4. Practical Quasi-Newton Methods for Training Deep Neural Networks&lt;span class="heading__anchor"&gt; &lt;a href="#4-practical-quasi-newton-methods-for-training-deep-neural-networks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2020&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Donald Goldfarb, Yi Ren, Achraf Bahamou&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2006.08877&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2006.08877"&gt;https://arxiv.org/abs/2006.08877&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We consider the development of practical stochastic quasi-Newton, and in particular Kronecker-factored block-diagonal BFGS and L-BFGS methods, for training deep neural networks (DNNs). In DNN training, the number of variables and components of the gradient $n$ is often of the order of tens of millions and the Hessian has $n^2$ elements. Consequently, computing and storing a full $n \times n$ BFGS approximation or storing a modest number of (step, change in gradient) vector pairs for use in an L-BFGS implementation is out of the question. In our proposed methods, we approximate the Hessian by a block-diagonal matrix and use the structure of the gradient and Hessian to further approximate these blocks, each of which corresponds to a layer, as the Kronecker product of two much smaller matrices. This is analogous to the approach in KFAC, which computes a Kronecker-factored block-diagonal approximation to the Fisher matrix in a stochastic natural gradient method. Because of the indefinite and highly variable nature of the Hessian in a DNN, we also propose a new damping approach to keep the upper as well as the lower bounds of the BFGS and L-BFGS approximations bounded. In tests on autoencoder feed-forward neural network models with either nine or thirteen layers applied to three datasets, our methods outperformed or performed comparably to KFAC and state-of-the-art first-order stochastic methods.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; Mentions &amp;lsquo;code&amp;rsquo; in abstract; Mentions &amp;lsquo;implementation&amp;rsquo; in abstract&lt;/p&gt;
&lt;hr&gt;
&lt;h2 class="heading" id="gauss-newton"&gt;
 Gauss-Newton&lt;span class="heading__anchor"&gt; &lt;a href="#gauss-newton"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-efficient-subsampled-gauss-newton-and-natural-gradient-methods-for-training-neural-networks"&gt;
 1. Efficient Subsampled Gauss-Newton and Natural Gradient Methods for Training Neural Networks&lt;span class="heading__anchor"&gt; &lt;a href="#1-efficient-subsampled-gauss-newton-and-natural-gradient-methods-for-training-neural-networks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2019&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Yi Ren, Donald Goldfarb&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1906.02353&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; SWM-GN, SWM-NG&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1906.02353"&gt;https://arxiv.org/abs/1906.02353&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We present practical Levenberg-Marquardt variants of Gauss-Newton and natural gradient methods for solving non-convex optimization problems that arise in training deep neural networks involving enormous numbers of variables and huge data sets. Our methods use subsampled Gauss-Newton or Fisher information matrices and either subsampled gradient estimates (fully stochastic) or full gradients (semi-stochastic), which, in the latter case, we prove convergent to a stationary point. By using the Sherman-Morrison-Woodbury formula with automatic differentiation (backpropagation) we show how our methods can be implemented to perform efficiently. Finally, numerical results are presented to demonstrate the effectiveness of our proposed methods.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="2-on-the-promise-of-the-stochastic-generalized-gauss-newton-method-for-training-dnns"&gt;
 2. On the Promise of the Stochastic Generalized Gauss-Newton Method for Training DNNs&lt;span class="heading__anchor"&gt; &lt;a href="#2-on-the-promise-of-the-stochastic-generalized-gauss-newton-method-for-training-dnns"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2020&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Matilde Gargiani, et al.&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2006.02409&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; SGN&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2006.02409"&gt;https://arxiv.org/abs/2006.02409&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; Following early work on Hessian-free methods for deep learning, we study a stochastic generalized Gauss-Newton method (SGN) for training DNNs. SGN is a second-order optimization method, with efficient iterations, that we demonstrate to often require substantially fewer iterations than standard SGD to converge. As the name suggests, SGN uses a Gauss-Newton approximation for the Hessian matrix, and, in order to compute an approximate search direction, relies on the conjugate gradient method combined with forward and reverse automatic differentiation. Despite the success of SGD and its first-order variants, and despite Hessian-free methods based on the Gauss-Newton Hessian approximation having been already theoretically proposed as practical methods for training DNNs, we believe that SGN has a lot of undiscovered and yet not fully displayed potential in big mini-batch scenarios. For this setting, we demonstrate that SGN does not only substantially improve over SGD in terms of the number of iterations, but also in terms of runtime. This is made possible by an efficient, easy-to-use and flexible implementation of SGN we propose in the Theano deep learning platform, which, unlike Tensorflow and Pytorch, supports forward automatic differentiation. This enables researchers to further study and improve this promising optimization technique and hopefully reconsider stochastic second-order methods as competitive optimization techniques for training DNNs; we also hope that the promise of SGN may lead to forward automatic differentiation being added to Tensorflow or Pytorch. Our results also show that in big mini-batch scenarios SGN is more robust than SGD with respect to its hyperparameters (we never had to tune its step-size for our benchmarks!), which eases the expensive process of hyperparameter tuning that is instead crucial for the performance of first-order methods.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; Mentions &amp;lsquo;implementation&amp;rsquo; in abstract&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="3-stochastic-gauss-newton-algorithms-for-nonconvex-compositional-optimization"&gt;
 3. Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#3-stochastic-gauss-newton-algorithms-for-nonconvex-compositional-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2020&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Quoc Tran-Dinh, et al.&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2002.07290&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; SGN with SARAH estimators&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2002.07290"&gt;https://arxiv.org/abs/2002.07290&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We develop two new stochastic Gauss-Newton algorithms for solving a class of non-convex stochastic compositional optimization problems frequently arising in practice. We consider both the expectation and finite-sum settings under standard assumptions, and use both classical stochastic and SARAH estimators for approximating function values and Jacobians. In the expectation case, we establish $\mathcal{O}(\varepsilon^{-2})$ iteration-complexity to achieve a stationary point in expectation and estimate the total number of stochastic oracle calls for both function value and its Jacobian, where $\varepsilon$ is a desired accuracy. In the finite sum case, we also estimate $\mathcal{O}(\varepsilon^{-2})$ iteration-complexity and the total oracle calls with high probability. To our best knowledge, this is the first time such global stochastic oracle complexity is established for stochastic Gauss-Newton methods. Finally, we illustrate our theoretical results via two numerical examples on both synthetic and real datasets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="4-nonlinear-least-squares-for-large-scale-machine-learning-using-stochastic-jacobian-estimates"&gt;
 4. Nonlinear Least Squares for Large-Scale Machine Learning using Stochastic Jacobian Estimates&lt;span class="heading__anchor"&gt; &lt;a href="#4-nonlinear-least-squares-for-large-scale-machine-learning-using-stochastic-jacobian-estimates"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2021&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Johannes J. Brust&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2107.05598&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; NLLS1, NLLSL&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2107.05598"&gt;https://arxiv.org/abs/2107.05598&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; For large nonlinear least squares loss functions in machine learning we exploit the property that the number of model parameters typically exceeds the data in one batch. This implies a low-rank structure in the Hessian of the loss, which enables effective means to compute search directions. Using this property, we develop two algorithms that estimate Jacobian matrices and perform well when compared to state-of-the-art methods.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="5-improving-levenberg-marquardt-algorithm-for-neural-networks"&gt;
 5. Improving Levenberg-Marquardt Algorithm for Neural Networks&lt;span class="heading__anchor"&gt; &lt;a href="#5-improving-levenberg-marquardt-algorithm-for-neural-networks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2022&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Omead Pooladzandi, Yiming Zhou&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2212.08769&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; LM&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2212.08769"&gt;https://arxiv.org/abs/2212.08769&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We explore the usage of the Levenberg-Marquardt (LM) algorithm for regression (non-linear least squares) and classification (generalized Gauss-Newton methods) tasks in neural networks. We compare the performance of the LM method with other popular first-order algorithms such as SGD and Adam, as well as other second-order algorithms such as L-BFGS, Hessian-Free and KFAC. We further speed up the LM method by using adaptive momentum, learning rate line search, and uphill step acceptance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="6-rethinking-gauss-newton-for-learning-over-parameterized-models"&gt;
 6. Rethinking Gauss-Newton for learning over-parameterized models&lt;span class="heading__anchor"&gt; &lt;a href="#6-rethinking-gauss-newton-for-learning-over-parameterized-models"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2023&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Michael Arbel, et al.&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2302.02904&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2302.02904"&gt;https://arxiv.org/abs/2302.02904&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; This work studies the global convergence and implicit bias of the Gauss-Newton (GN) method when optimizing over-parameterized one-hidden-layer networks in the mean-field regime. We first establish a global convergence result for GN in the continuous-time limit exhibiting a faster convergence rate compared to GD due to improved conditioning. We then perform an empirical study on a synthetic regression task to investigate the implicit bias of GN&amp;rsquo;s method. While GN is consistently faster than GD in finding a global optimum, the learned model generalizes well on test data when starting from random initial weights with a small variance and using a small step size to slow down convergence. Specifically, our study shows that such a setting results in a hidden learning phenomenon, where the dynamics are able to recover features with good generalization properties despite the model having sub-optimal training and test performances due to an under-optimized linear layer. This study exhibits a trade-off between the convergence speed of GN and the generalization ability of the learned solution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
&lt;h3 class="heading" id="7-exact-gauss-newton-optimization-for-training-deep-neural-networks"&gt;
 7. Exact Gauss-Newton Optimization for Training Deep Neural Networks&lt;span class="heading__anchor"&gt; &lt;a href="#7-exact-gauss-newton-optimization-for-training-deep-neural-networks"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2024&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Mikalai Korbit, Adeyemi D. Adeoye, Alberto Bemporad, Mario Zanon&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2405.14402&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; EGN&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2405.14402"&gt;https://arxiv.org/abs/2405.14402&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We present EGN, a stochastic second-order optimization algorithm that combines the generalized Gauss-Newton (GN) Hessian approximation with low-rank linear algebra to compute the descent direction. Leveraging the Duncan-Guttman matrix identity, the parameter update is obtained by factorizing a matrix which has the size of the mini-batch. This is particularly advantageous for large-scale machine learning problems where the dimension of the neural network parameter vector is several orders of magnitude larger than the batch size. Additionally, we show how improvements such as line search, adaptive regularization, and momentum can be seamlessly added to EGN to further accelerate the algorithm. Moreover, under mild assumptions, we prove that our algorithm converges to an $\epsilon$-stationary point at a linear rate. Finally, our numerical experiments demonstrate that EGN consistently exceeds, or at most matches the generalization performance of well-tuned SGD, Adam, and SGN optimizers across various supervised and reinforcement learning tasks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;
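&lt;p&gt;Several of the entries above (the subsampled Gauss-Newton, NLLS, and EGN papers) rely on the same low-rank observation: when the batch size $m$ is much smaller than the number of parameters $n$, a damped Gauss-Newton step can be computed from an $m \times m$ system instead of an $n \times n$ one. As a rough sketch, assuming a least-squares loss with batch Jacobian $J \in \mathbb{R}^{m \times n}$, residual $r \in \mathbb{R}^{m}$, and damping $\lambda &amp;gt; 0$ (notation introduced here for illustration, not taken from any one paper), the standard push-through identity gives&lt;/p&gt;
&lt;p&gt;$$
(J^\top J + \lambda I_n)^{-1} J^\top r = J^\top (J J^\top + \lambda I_m)^{-1} r.
$$&lt;/p&gt;
&lt;p&gt;The identity follows from $J^\top (J J^\top + \lambda I_m) = (J^\top J + \lambda I_n) J^\top$, so the right-hand side only requires factorizing an $m \times m$ matrix. The individual papers differ in how they build and regularize this system.&lt;/p&gt;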
&lt;h2 class="heading" id="fisher-information"&gt;
 Fisher Information&lt;span class="heading__anchor"&gt; &lt;a href="#fisher-information"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-optimizing-neural-networks-with-kronecker-factored-approximate-curvature"&gt;
 1. Optimizing Neural Networks with Kronecker-factored Approximate Curvature&lt;span class="heading__anchor"&gt; &lt;a href="#1-optimizing-neural-networks-with-kronecker-factored-approximate-curvature"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2015&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; James Martens, Roger Grosse&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:1503.05671&lt;br&gt;
&lt;strong&gt;Algorithm:&lt;/strong&gt; K-FAC&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/1503.05671"&gt;https://arxiv.org/abs/1503.05671&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We propose an efficient method for approximating natural gradient descent in neural networks which we call Kronecker-Factored Approximate Curvature (K-FAC). K-FAC is based on an efficiently invertible approximation of a neural network&amp;rsquo;s Fisher information matrix which is neither diagonal nor low-rank, and in some cases is completely non-sparse. It is derived by approximating various large blocks of the Fisher (corresponding to entire layers) as being the Kronecker product of two much smaller matrices. While only several times more expensive to compute than the plain stochastic gradient, the updates produced by K-FAC make much more progress optimizing the objective, which results in an algorithm that can be much faster than stochastic gradient descent with momentum in practice. And unlike some previously proposed approximate natural-gradient/Newton methods which use high-quality non-diagonal curvature matrices (such as Hessian-free optimization), K-FAC works very well in highly stochastic optimization regimes. This is because the cost of storing and inverting K-FAC&amp;rsquo;s approximation to the curvature matrix does not depend on the amount of data used to estimate it, which is a feature typically associated only with diagonal or low-rank approximations to the curvature matrix.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; Known repository: Various implementations available&lt;/p&gt;
&lt;hr&gt;
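&lt;p&gt;To see why a Kronecker-factored block is cheap to invert, consider a single fully connected layer with weight matrix $W \in \mathbb{R}^{d_{out} \times d_{in}}$. The following is a sketch under assumed notation (not copied from the paper): $A$ is the $d_{in} \times d_{in}$ second-moment matrix of the layer inputs, $G$ is the $d_{out} \times d_{out}$ second-moment matrix of the back-propagated output gradients, both taken to be symmetric and invertible, and $V$ is the gradient with respect to $W$. With column-stacking $\operatorname{vec}$,&lt;/p&gt;
&lt;p&gt;$$
(A \otimes G)^{-1} \operatorname{vec}(V) = \operatorname{vec}(G^{-1} V A^{-1}).
$$&lt;/p&gt;
&lt;p&gt;The preconditioned update for the layer therefore needs only the inverses of two small matrices rather than of a $d_{in} d_{out} \times d_{in} d_{out}$ block. Implementations typically add damping to both factors before inverting.&lt;/p&gt;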
&lt;h2 class="heading" id="other"&gt;
 Other&lt;span class="heading__anchor"&gt; &lt;a href="#other"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-second-order-optimization-with-lazy-hessians"&gt;
 1. Second-order optimization with lazy Hessians&lt;span class="heading__anchor"&gt; &lt;a href="#1-second-order-optimization-with-lazy-hessians"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Year:&lt;/strong&gt; 2022&lt;br&gt;
&lt;strong&gt;Authors:&lt;/strong&gt; Nikita Doikov, El Mahdi Chayti, Martin Jaggi&lt;br&gt;
&lt;strong&gt;ArXiv ID:&lt;/strong&gt; arXiv:2212.00781&lt;br&gt;
&lt;strong&gt;URL:&lt;/strong&gt; &lt;a href="https://arxiv.org/abs/2212.00781"&gt;https://arxiv.org/abs/2212.00781&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt; We analyze Newton&amp;rsquo;s method with lazy Hessian updates for solving general possibly non-convex optimization problems. We propose to reuse a previously seen Hessian for several iterations while computing new gradients at each step of the method. This significantly reduces the overall arithmetical complexity of second-order optimization schemes. By using the cubic regularization technique, we establish fast global convergence of our method to a second-order stationary point, while the Hessian does not need to be updated each iteration. For convex problems, we justify global and local superlinear rates for lazy Newton steps with quadratic regularization, which is easier to compute. The optimal frequency for updating the Hessian is once every $d$ iterations, where $d$ is the dimension of the problem. This provably improves the total arithmetical complexity of second-order algorithms by a factor $\sqrt{d}$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; No explicit source code information found&lt;/p&gt;
&lt;hr&gt;</description></item><item><title>Some popular partial differential equations (PDEs)</title><link>https://blog.namln.org/en/mathematics/analysis/pde/some-popular-pdes/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/pde/some-popular-pdes/</guid><description>&lt;h2 class="heading" id="single-pdes"&gt;
 Single PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#single-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="linear-equations"&gt;
 Linear equations&lt;span class="heading__anchor"&gt; &lt;a href="#linear-equations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Laplace’s equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
\Delta u = \sum_{i=1}^{n} u_{x_i x_i} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Helmholtz’s (or eigenvalue) equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
-\Delta u = \lambda u.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Linear transport equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t + \sum_{i=1}^{n} b^i u_{x_i} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Liouville’s equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t + \sum_{i=1}^{n} (b^i u)_{x_i} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="5"&gt;
&lt;li&gt;Heat (or diffusion) equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t - \Delta u = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="6"&gt;
&lt;li&gt;Schrödinger’s equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
i u_t + \Delta u = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="7"&gt;
&lt;li&gt;Kolmogorov&amp;rsquo;s equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t - \sum_{i,j=1}^{n} a^{ij} u_{x_i x_j} + \sum_{i=1}^{n} b^i u_{x_i} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="8"&gt;
&lt;li&gt;Fokker–Planck equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u _t - \sum _{i,j = 1}^{n} (a^{ij} u) _{x_i x_j} - \sum _{i=1}^{n} (b^i u) _{x_i} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="9"&gt;
&lt;li&gt;Wave equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_{tt} - \Delta u = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="10"&gt;
&lt;li&gt;Klein–Gordon equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_{tt} - \Delta u + m^2 u = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="11"&gt;
&lt;li&gt;Telegraph equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_{tt} + 2\delta u_t - u_{xx} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="12"&gt;
&lt;li&gt;General wave equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_{tt} - \sum_{i,j=1}^{n} a^{ij} u_{x_i x_j} + \sum_{i=1}^{n} b^i u_{x_i} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="13"&gt;
&lt;li&gt;Airy&amp;rsquo;s equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t + u_{xxx} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="14"&gt;
&lt;li&gt;Beam equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_{tt} + u_{xxxx} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;h3 class="heading" id="nonlinear-equations"&gt;
 Nonlinear equations&lt;span class="heading__anchor"&gt; &lt;a href="#nonlinear-equations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Eikonal equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
|Du| = 1.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Nonlinear Poisson equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
-\Delta u = f(u).
\end{equation}
$$&lt;/p&gt;
&lt;ol start="3"&gt;
&lt;li&gt;$p$-Laplacian equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
\operatorname{div}(|Du|^{p-2} Du) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Minimal surface equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
\operatorname{div} \left( \frac{Du}{\sqrt{1 + |Du|^2}} \right) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="5"&gt;
&lt;li&gt;Monge–Ampère equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
\det(D^2 u) = f.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="6"&gt;
&lt;li&gt;Hamilton–Jacobi equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t + H(Du, x) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="7"&gt;
&lt;li&gt;Scalar conservation law&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t + \operatorname{div} F(u) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="8"&gt;
&lt;li&gt;Inviscid Burgers’ equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t + u u_x = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="9"&gt;
&lt;li&gt;Scalar reaction-diffusion equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t - \Delta u = f(u).
\end{equation}
$$&lt;/p&gt;
&lt;ol start="10"&gt;
&lt;li&gt;Porous medium equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t - \Delta(u^m) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="11"&gt;
&lt;li&gt;Nonlinear wave equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_{tt} - \Delta u + f(u) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="12"&gt;
&lt;li&gt;Korteweg–de Vries (KdV) equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t + u u_x + u_{xxx} = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="13"&gt;
&lt;li&gt;Nonlinear Schrödinger equation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
i u_t + \Delta u = f(|u|^2) u.
\end{equation}
$$&lt;/p&gt;
&lt;h2 class="heading" id="systems-of-pdes"&gt;
 Systems of PDEs&lt;span class="heading__anchor"&gt; &lt;a href="#systems-of-pdes"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="linear-systems"&gt;
 Linear systems&lt;span class="heading__anchor"&gt; &lt;a href="#linear-systems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Equilibrium equations of linear elasticity&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
\mu \Delta u + (\lambda + \mu) D(\operatorname{div} u) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Evolution equations of linear elasticity&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_{tt} - \mu \Delta u - (\lambda + \mu) D(\operatorname{div} u) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Maxwell’s equations&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
\begin{cases}
E_t = \operatorname{curl} B \\
B_t = -\operatorname{curl} E \\
\operatorname{div} B = \operatorname{div} E = 0.
\end{cases}
\end{equation}
$$&lt;/p&gt;
&lt;h3 class="heading" id="nonlinear-systems"&gt;
 Nonlinear systems&lt;span class="heading__anchor"&gt; &lt;a href="#nonlinear-systems"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;System of conservation laws&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t + \operatorname{div} F(u) = 0.
\end{equation}
$$&lt;/p&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Reaction-diffusion system&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
u_t - \Delta u = f(u).
\end{equation}
$$&lt;/p&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Euler’s equations for incompressible, inviscid flow&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
\begin{cases}
u_t + u \cdot Du = -Dp \\
\operatorname{div} u = 0.
\end{cases}
\end{equation}
$$&lt;/p&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Navier–Stokes equations for incompressible, viscous flow&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;$$
\begin{equation}
\begin{cases}
u_t + u \cdot Du - \Delta u = -Dp \\
\operatorname{div} u = 0.
\end{cases}
\end{equation}
$$&lt;/p&gt;</description></item><item><title>Study Mathematics at HCMUS</title><link>https://blog.namln.org/en/mathematics/study-math-hcmus/</link><pubDate>Thu, 27 Jun 2024 23:14:15 +0800</pubDate><guid>https://blog.namln.org/en/mathematics/study-math-hcmus/</guid><description>&lt;h2 class="heading" id="1-applied-mathematics"&gt;
 1. Applied Mathematics&lt;span class="heading__anchor"&gt; &lt;a href="#1-applied-mathematics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;MNC - Research Methodologies&lt;/li&gt;
&lt;li&gt;MTT001 - Advanced Functional Analysis&lt;/li&gt;
&lt;li&gt;MTT006 - Advanced Linear Algebra&lt;/li&gt;
&lt;li&gt;MTT011 - Numerical Analysis&lt;/li&gt;
&lt;li&gt;MTT012 - Stochastic Process&lt;/li&gt;
&lt;li&gt;MTT081 - Optimization Algorithms&lt;/li&gt;
&lt;li&gt;MTT106 - Non-linear Programming&lt;/li&gt;
&lt;li&gt;MTT107 - Set-valued Analysis&lt;/li&gt;
&lt;li&gt;MTT083 - Convex Analysis&lt;/li&gt;
&lt;li&gt;MTT130 - Numerical Programming for Applied Problems&lt;/li&gt;
&lt;li&gt;MTT131 - Seminar in Applied Mathematics&lt;/li&gt;
&lt;li&gt;MTT139 - Mathematical Models in Economics&lt;/li&gt;
&lt;li&gt;MTT147 - Statistical Modelling&lt;/li&gt;
&lt;li&gt;MTT099 - Differential Equations&lt;/li&gt;
&lt;li&gt;MTT097 - Partial Differential Equations&lt;/li&gt;
&lt;li&gt;MTH10403 - Functional Analysis&lt;/li&gt;
&lt;li&gt;MTT090 - Complex Analysis&lt;/li&gt;
&lt;li&gt;MTT149 - Convex Analysis and Optimization&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="2-mathematical-analysis"&gt;
 2. Mathematical Analysis&lt;span class="heading__anchor"&gt; &lt;a href="#2-mathematical-analysis"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;MTT001 - Advanced Functional Analysis&lt;/li&gt;
&lt;li&gt;MTT006 - Advanced Linear Algebra&lt;/li&gt;
&lt;li&gt;MTT099 - Differential Equations&lt;/li&gt;
&lt;li&gt;MTT097 - Partial Differential Equations&lt;/li&gt;
&lt;li&gt;MTT090 - Complex Analysis&lt;/li&gt;
&lt;li&gt;MTT149 - Convex Analysis and Optimization&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Explainable Reinforcement Learning (XRL)</title><link>https://blog.namln.org/en/topics/rl/xrl/</link><pubDate>Fri, 09 Feb 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/rl/xrl/</guid><description>&lt;p&gt;In progress&amp;hellip;&lt;/p&gt;</description></item><item><title>Reinforcement Learning (RL)</title><link>https://blog.namln.org/en/topics/rl/rl/</link><pubDate>Fri, 09 Feb 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/rl/rl/</guid><description>&lt;p&gt;In progress&amp;hellip;&lt;/p&gt;</description></item><item><title>Temporal Knowledge Graph Completion</title><link>https://blog.namln.org/en/topics/graph-analytics/tkgc/</link><pubDate>Fri, 09 Feb 2024 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/graph-analytics/tkgc/</guid><description>&lt;p&gt;In progress&amp;hellip;&lt;/p&gt;</description></item><item><title>Knowledge Extrapolation for Knowledge Graphs</title><link>https://blog.namln.org/en/topics/graph-analytics/knowledge-exptrapolation/</link><pubDate>Wed, 22 Nov 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/graph-analytics/knowledge-exptrapolation/</guid><description>&lt;h2 class="heading" id="động-lực-nghiên-cứu"&gt;
 Research motivation&lt;span class="heading__anchor"&gt; &lt;a href="#%c4%91%e1%bb%99ng-l%e1%bb%b1c-nghi%c3%aan-c%e1%bb%a9u"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;In many practical applications, such as graph database systems, recommendation systems, and question answering systems, knowledge graphs (KGs) serve as a valuable source of knowledge. There are many approaches to exploiting this kind of knowledge base. Among them, knowledge graph embedding (KGE) is a feasible and effective approach for many downstream tasks, such as link prediction (missing fact completion) and entity alignment. However, KGE methods still face many problems and challenges, one of which is handling unseen objects (entities or relations) during model evaluation or deployment.&lt;/p&gt;
&lt;p&gt;Motivated by this problem, a new research direction, Knowledge Extrapolation (KE), has emerged from a series of recent works. In these notes, we follow the paper &lt;strong&gt;Generalizing to Unseen Elements: A Survey on Knowledge Extrapolation for Knowledge Graphs&lt;/strong&gt; by Mingyang Chen et al. to summarize and supplement recent methods in KE research.&lt;/p&gt;
&lt;p&gt;If you are interested in this research direction, please read the paper for further details:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Chen, M., Zhang, W., Geng, Y., Xu, Z., Pan, J. Z., &amp;amp; Chen, H. (2023). &lt;a href="https://arxiv.org/pdf/2302.01859"&gt;Generalizing to Unseen Elements: A Survey on Knowledge Extrapolation for Knowledge Graphs&lt;/a&gt;. arXiv preprint arXiv:2302.01859.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 class="heading" id="nhúng-đồ-thị-tri-thức-knowledge-graph-embedding"&gt;
 Knowledge graph embedding&lt;span class="heading__anchor"&gt; &lt;a href="#nh%c3%bang-%c4%91%e1%bb%93-th%e1%bb%8b-tri-th%e1%bb%a9c-knowledge-graph-embedding"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Formally, we define a knowledge graph as $\mathcal{G} = \{\mathcal{E}, \mathcal{R}, \mathcal{T}\}$, where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$\mathcal{E}$ is the set of entities.&lt;/li&gt;
&lt;li&gt;$\mathcal{R}$ is the set of relations.&lt;/li&gt;
&lt;li&gt;$\mathcal{T}$ is the set of fact triples. A triple $(h, r, t)$ states that the head entity $h$ is connected to the tail entity $t$ through the relation $r$, so $\mathcal{T} = \{(h, r, t)\} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Since this knowledge base has a graph structure, we can certainly represent it with an adjacency matrix. However, that representation is costly and inefficient. Instead of such a &amp;ldquo;naive&amp;rdquo; embedding, a simpler yet more effective approach is used: &amp;ldquo;shallow lookup embedding&amp;rdquo;&lt;span class="sidenote"&gt;&lt;small&gt;In a shallow embedding, the encoder is defined by a &amp;ldquo;lookup table&amp;rdquo; such that &lt;em&gt;similarity&lt;/em&gt; in the embedding space &lt;em&gt;approximates&lt;/em&gt; similarity in the original space. Each column of this matrix is the embedding of one node, and the number of rows is the embedding dimension (embedding size). One should also distinguish between &amp;ldquo;shallow embedding&amp;rdquo; and &amp;ldquo;deep embedding&amp;rdquo;.&lt;/small&gt;&lt;/span&gt;. In general, &lt;strong&gt;the main goal of knowledge graph embedding is to map the elements of the entity set $\mathcal{E}$ and the relation set $\mathcal{R}$ into a low-dimensional continuous vector space while preserving the intrinsic structure of the graph data.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To judge whether a knowledge graph embedding method is good, one usually evaluates it on the link prediction task. This task can be understood as predicting missing facts, which is not entirely accurate but is an acceptable simplification.&lt;/p&gt;
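&lt;p&gt;As a rough illustration of a shallow lookup embedding and of link prediction as a ranking problem, here is a minimal sketch. The toy triples, the embedding size, and the TransE-style score are all assumptions made for this example, and the vectors are random rather than trained.&lt;/p&gt;

```python
import random

# Toy knowledge graph: every name below is made up for illustration.
triples = [
    ("hanoi", "capital_of", "vietnam"),
    ("paris", "capital_of", "france"),
    ("vietnam", "located_in", "asia"),
    ("france", "located_in", "europe"),
]
entities = sorted({h for h, _, _ in triples} | {t for _, _, t in triples})
relations = sorted({r for _, r, _ in triples})

DIM = 8  # embedding size (assumed)
rng = random.Random(0)


def random_vector():
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


# Shallow lookup embedding: the "encoder" is just a table from id to vector.
entity_table = {e: random_vector() for e in entities}
relation_table = {r: random_vector() for r in relations}


def score(h, r, t):
    """TransE-style plausibility: negative L1 distance between h + r and t."""
    hv, rv, tv = entity_table[h], relation_table[r], entity_table[t]
    return -sum(abs(a + b - c) for a, b, c in zip(hv, rv, tv))


def tail_rank(h, r, true_tail):
    """Rank of the true tail among all candidate tails (1 is best)."""
    ranked = sorted(entities, key=lambda t: score(h, r, t), reverse=True)
    return ranked.index(true_tail) + 1


rank = tail_rank("hanoi", "capital_of", "vietnam")
reciprocal_rank = 1.0 / rank  # averaged over test triples, this gives MRR
```

&lt;p&gt;Note that looking up an entity that is missing from &lt;code&gt;entity_table&lt;/code&gt; raises a &lt;code&gt;KeyError&lt;/code&gt;. That failure is the unseen-element problem knowledge extrapolation sets out to address.&lt;/p&gt;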
&lt;p align="center"&gt;
 &lt;img src="https://blog.namln.org/post/figures/interpolation_extrapolation_kge.png" /&gt;
 &lt;br&gt;
 &lt;em&gt;(a) Training set and (b) test set for conventional KGE. Examples of test sets for the entity extrapolation setting (c) and the relation extrapolation setting (d). The support set may contain any kind of auxiliary information about the unseen elements; related fact triples are used here as examples.&lt;/em&gt;
&lt;/p&gt;
&lt;p&gt;Methods proposed for the knowledge extrapolation setting aim to perform link prediction on unseen elements. In a unified view, two sets are used for evaluation in knowledge extrapolation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;One set (the support set) provides supporting information about the unseen elements;&lt;/li&gt;
&lt;li&gt;The other set evaluates the model's link prediction ability.&lt;/li&gt;
&lt;/ul&gt;
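&lt;p&gt;The two sets can be pictured with a small sketch for the entity extrapolation case. All entity and relation names are hypothetical, and the support set here holds only triples, which is one of several possible kinds of support information.&lt;/p&gt;

```python
# Hypothetical toy data; entity and relation names are illustrative only.
train_triples = [
    ("hanoi", "located_in", "vietnam"),
    ("vietnam", "located_in", "asia"),
]
# Support set: auxiliary facts about the unseen entity "hue".
support_triples = [("hue", "located_in", "vietnam")]
# Evaluation set: held-out links the model must predict for "hue".
eval_triples = [("hue", "located_in", "asia")]


def entities_of(triples):
    """Collect every entity that appears as a head or a tail."""
    return {h for h, _, _ in triples} | {t for _, _, t in triples}


seen = entities_of(train_triples)
unseen = entities_of(support_triples) - seen

# Every evaluation triple should involve at least one unseen entity.
covered = all(h in unseen or t in unseen for h, _, t in eval_triples)
```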
&lt;p&gt;In terms of taxonomy, current approaches can be divided into two directions: entity extrapolation and relation extrapolation. The figure below gives an overview of this taxonomy.&lt;/p&gt;
&lt;h2 class="heading" id="các-phương-pháp-ngoại-suy-thực-thể-entity-extrapolation-methods"&gt;
 Entity extrapolation methods&lt;span class="heading__anchor"&gt; &lt;a href="#c%c3%a1c-ph%c6%b0%c6%a1ng-ph%c3%a1p-ngo%e1%ba%a1i-suy-th%e1%bb%b1c-th%e1%bb%83-entity-extrapolation-methods"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="mã-hóa-thực-thể-entity-encoding"&gt;
 Entity encoding&lt;span class="heading__anchor"&gt; &lt;a href="#m%c3%a3-h%c3%b3a-th%e1%bb%b1c-th%e1%bb%83-entity-encoding"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;One way to handle unseen entities is to learn to encode entities rather than learning &amp;ldquo;fixed&amp;rdquo; embedding tables. These learned encoders can be run on the support set of the unseen entities to produce reasonable embeddings for them. There are many ways to design such encoders, and the suitable approach depends on the nature of the support set.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Encode from structural information&lt;/strong&gt; (when the support set contains only triples about the unseen entities):
&lt;ul&gt;
&lt;li&gt;(MEAN) Bi, Z., Zhang, T., Zhou, P., &amp;amp; Li, Y. (2020). &lt;em&gt;Knowledge transfer for out-of-knowledge-base entities: Improving graph-neural-network-based embedding using convolutional layers&lt;/em&gt;. IEEE Access, 8, 159039-159049.&lt;/li&gt;
&lt;li&gt;(LAN) Wang, P., Han, J., Li, C., &amp;amp; Pan, R. (2019, July). &lt;em&gt;Logic attention based neighborhood aggregation for inductive knowledge graph embedding&lt;/em&gt;. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 7152-7159).&lt;/li&gt;
&lt;li&gt;Bhowmik, R., &amp;amp; de Melo, G. (2020). &lt;em&gt;Explainable link prediction for emerging entities in knowledge graphs&lt;/em&gt;. In The Semantic Web–ISWC 2020: 19th International Semantic Web Conference, Athens, Greece, November 2–6, 2020, Proceedings, Part I 19 (pp. 39-55). Springer International Publishing.&lt;/li&gt;
&lt;li&gt;Albooyeh, M., Goel, R., &amp;amp; Kazemi, S. M. (2020, November). &lt;em&gt;Out-of-sample representation learning for knowledge graphs&lt;/em&gt;. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 2657-2666).&lt;/li&gt;
&lt;li&gt;(CFAG) Wang, C., Zhou, X., Pan, S., Dong, L., Song, Z., &amp;amp; Sha, Y. (2022, June). &lt;em&gt;Exploring Relational Semantics for Inductive Knowledge Graph Completion&lt;/em&gt;. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 4, pp. 4184-4192).&lt;/li&gt;
&lt;li&gt;(ARGCN) Cui, Y., Wang, Y., Sun, Z., Liu, W., Jiang, Y., Han, K., &amp;amp; Hu, W. (2022, October). &lt;em&gt;Inductive knowledge graph reasoning for multi-batch emerging entities&lt;/em&gt;. In Proceedings of the 31st ACM International Conference on Information &amp;amp; Knowledge Management (pp. 335-344).&lt;/li&gt;
&lt;li&gt;(QBLP) Ali, M., Berrendorf, M., Galkin, M., Thost, V., Ma, T., Tresp, V., &amp;amp; Lehmann, J. (2021). &lt;em&gt;Improving inductive link prediction using hyper-relational facts&lt;/em&gt;. In The Semantic Web–ISWC 2021: 20th International Semantic Web Conference, ISWC 2021, Virtual Event, October 24–28, 2021, Proceedings 20 (pp. 74-92). Springer International Publishing.&lt;/li&gt;
&lt;li&gt;(GEN) Baek, J., Lee, D. B., &amp;amp; Hwang, S. J. (2020). &lt;em&gt;Learning to extrapolate knowledge: Transductive few-shot out-of-graph link prediction&lt;/em&gt;. Advances in Neural Information Processing Systems, 33, 546-560.&lt;/li&gt;
&lt;li&gt;(HRFN) Zhang, Y., Wang, W., Chen, W., Xu, J., Liu, A., &amp;amp; Zhao, L. (2021, October). &lt;em&gt;Meta-learning based hyper-relation feature modeling for out-of-knowledge-base embedding&lt;/em&gt;. In Proceedings of the 30th ACM International Conference on Information &amp;amp; Knowledge Management (pp. 2637-2646).&lt;/li&gt;
&lt;li&gt;(INDIGO) Liu, S., Grau, B., Horrocks, I., &amp;amp; Kostylev, E. (2021). &lt;em&gt;Indigo: Gnn-based inductive knowledge graph completion using pair-wise encoding&lt;/em&gt;. Advances in Neural Information Processing Systems, 34, 2034-2045.&lt;/li&gt;
&lt;li&gt;(MorsE) Chen, M., Zhang, W., Zhu, Y., Zhou, H., Yuan, Z., Xu, C., &amp;amp; Chen, H. (2022, July). &lt;em&gt;Meta-knowledge transfer for inductive knowledge graph embedding&lt;/em&gt;. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 927-937).&lt;/li&gt;
&lt;li&gt;(NodePiece) Galkin, M., Denis, E., Wu, J., &amp;amp; Hamilton, W. L. (2021). &lt;em&gt;Nodepiece: Compositional and parameter-efficient representations of large knowledge graphs&lt;/em&gt;. arXiv preprint arXiv:2106.12144.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Encode from other information&lt;/strong&gt; (when the support set contains other kinds of information):
&lt;ul&gt;
&lt;li&gt;(DKRL) Xie, R., Liu, Z., Jia, J., Luan, H., &amp;amp; Sun, M. (2016, March). &lt;em&gt;Representation learning of knowledge graphs with entity descriptions&lt;/em&gt;. In Proceedings of the AAAI conference on artificial intelligence (Vol. 30, No. 1).&lt;/li&gt;
&lt;li&gt;(ConMask) Shi, B., &amp;amp; Weninger, T. (2018, April). &lt;em&gt;Open-world knowledge graph completion&lt;/em&gt;. In Proceedings of the AAAI conference on artificial intelligence (Vol. 32, No. 1).&lt;/li&gt;
&lt;li&gt;(OWE) Shah, H., Villmow, J., Ulges, A., Schwanecke, U., &amp;amp; Shafait, F. (2019, July). &lt;em&gt;An open-world extension to knowledge graph completion models&lt;/em&gt;. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, No. 01, pp. 3044-3051).&lt;/li&gt;
&lt;li&gt;(KEPLER) Wang, X., Gao, T., Zhu, Z., Zhang, Z., Liu, Z., Li, J., &amp;amp; Tang, J. (2021). &lt;em&gt;KEPLER: A unified model for knowledge embedding and pre-trained language representation&lt;/em&gt;. Transactions of the Association for Computational Linguistics, 9, 176-194.&lt;/li&gt;
&lt;li&gt;(StAR) Wang, B., Shen, T., Long, G., Zhou, T., Wang, Y., &amp;amp; Chang, Y. (2021, April). &lt;em&gt;Structure-augmented text representation learning for efficient knowledge graph completion&lt;/em&gt;. In Proceedings of the Web Conference 2021 (pp. 1737-1748).&lt;/li&gt;
&lt;li&gt;(BLP) Daza, D., Cochez, M., &amp;amp; Groth, P. (2021, April). &lt;em&gt;Inductive entity representations from text via link prediction&lt;/em&gt;. In Proceedings of the Web Conference 2021 (pp. 798-808).&lt;/li&gt;
&lt;li&gt;(SimKGC) Wang, L., Zhao, W., Wei, Z., &amp;amp; Liu, J. (2022). &lt;em&gt;SimKGC: Simple contrastive knowledge graph completion with pre-trained language models&lt;/em&gt;. arXiv preprint arXiv:2203.02167.&lt;/li&gt;
&lt;li&gt;(StATIK) Markowitz, E., Balasubramanian, K., Mirtaheri, M., Annavaram, M., Galstyan, A., &amp;amp; Ver Steeg, G. (2022, July). &lt;em&gt;StATIK: Structure and text for inductive knowledge graph completion&lt;/em&gt;. In Findings of the Association for Computational Linguistics: NAACL 2022 (pp. 604-615).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="dự-đoán-đồ-thị-con-subgraph-predicting"&gt;
 Subgraph predicting&lt;span class="heading__anchor"&gt; &lt;a href="#d%e1%bb%b1-%c4%91o%c3%a1n-%c4%91%e1%bb%93-th%e1%bb%8b-con-subgraph-predicting"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;(GraIL) Teru, K., Denis, E., &amp;amp; Hamilton, W. (2020, November). &lt;em&gt;Inductive relation prediction by subgraph reasoning&lt;/em&gt;. In International Conference on Machine Learning (pp. 9448-9457). PMLR.&lt;/li&gt;
&lt;li&gt;(CoMPILE) Mai, S., Zheng, S., Yang, Y., &amp;amp; Hu, H. (2021, May). &lt;em&gt;Communicative message passing for inductive relation reasoning&lt;/em&gt;. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 5, pp. 4294-4302).&lt;/li&gt;
&lt;li&gt;(TACT) Chen, J., He, H., Wu, F., &amp;amp; Wang, J. (2021, May). &lt;em&gt;Topology-aware correlations between relations for inductive link prediction in knowledge graphs&lt;/em&gt;. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 7, pp. 6271-6278).&lt;/li&gt;
&lt;li&gt;(ConGLR) Lin, Q., Liu, J., Xu, F., Pan, Y., Zhu, Y., Zhang, L., &amp;amp; Zhao, T. (2022, July). &lt;em&gt;Incorporating context graph with logical reasoning for inductive relation prediction&lt;/em&gt;. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 893-903).&lt;/li&gt;
&lt;li&gt;(SNRI) Xu, X., Zhang, P., He, Y., Chao, C., &amp;amp; Yan, C. (2022). &lt;em&gt;Subgraph neighboring relations infomax for inductive link prediction on knowledge graphs&lt;/em&gt;. arXiv preprint arXiv:2208.00850.&lt;/li&gt;
&lt;li&gt;(BertRL) Zha, H., Chen, Z., &amp;amp; Yan, X. (2022, June). &lt;em&gt;Inductive relation prediction by BERT&lt;/em&gt;. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 5, pp. 5923-5931).&lt;/li&gt;
&lt;li&gt;(RMPI) Geng, Y., Chen, J., Pan, J. Z., Chen, M., Jiang, S., Zhang, W., &amp;amp; Chen, H. (2023, April). &lt;em&gt;Relational message passing for fully inductive knowledge graph completion&lt;/em&gt;. In 2023 IEEE 39th International Conference on Data Engineering (ICDE) (pp. 1221-1233). IEEE.&lt;/li&gt;
&lt;li&gt;(PathCon) Wang, H., Ren, H., &amp;amp; Leskovec, J. (2021, August). &lt;em&gt;Relational message passing for knowledge graph completion&lt;/em&gt;. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp;amp; Data Mining (pp. 1697-1707).&lt;/li&gt;
&lt;li&gt;(NBFNet) Zhu, Z., Zhang, Z., Xhonneux, L. P., &amp;amp; Tang, J. (2021). &lt;em&gt;Neural bellman-ford networks: A general graph neural network framework for link prediction&lt;/em&gt;. Advances in Neural Information Processing Systems, 34, 29476-29490.&lt;/li&gt;
&lt;li&gt;(RED-GNN) Zhang, Y., &amp;amp; Yao, Q. (2022, April). &lt;em&gt;Knowledge graph reasoning with relational digraph&lt;/em&gt;. In Proceedings of the ACM web conference 2022 (pp. 912-924).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="dựa-trên-khai-thác-luật-rule-mining"&gt;
 Rule mining&lt;span class="heading__anchor"&gt; &lt;a href="#d%e1%bb%b1a-tr%c3%aan-khai-th%c3%a1c-lu%e1%ba%adt-rule-mining"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;(AMIE) Galárraga, L. A., Teflioudi, C., Hose, K., &amp;amp; Suchanek, F. (2013, May). &lt;em&gt;AMIE: association rule mining under incomplete evidence in ontological knowledge bases&lt;/em&gt;. In Proceedings of the 22nd international conference on World Wide Web (pp. 413-422).&lt;/li&gt;
&lt;li&gt;(RuleN) Meilicke, C., Fink, M., Wang, Y., Ruffinelli, D., Gemulla, R., &amp;amp; Stuckenschmidt, H. (2018). &lt;em&gt;Fine-grained evaluation of rule-and embedding-based systems for knowledge graph completion&lt;/em&gt;. In The Semantic Web–ISWC 2018: 17th International Semantic Web Conference, Monterey, CA, USA, October 8–12, 2018, Proceedings, Part I 17 (pp. 3-20). Springer International Publishing.&lt;/li&gt;
&lt;li&gt;(AnyBURL) Meilicke, C., Chekol, M. W., Ruffinelli, D., &amp;amp; Stuckenschmidt, H. (2019, August). &lt;em&gt;Anytime Bottom-Up Rule Learning for Knowledge Graph Completion&lt;/em&gt;. In IJCAI (pp. 3137-3143).&lt;/li&gt;
&lt;li&gt;(NeuralLP) Yang, F., Yang, Z., &amp;amp; Cohen, W. W. (2017). &lt;em&gt;Differentiable learning of logical rules for knowledge base reasoning&lt;/em&gt;. Advances in neural information processing systems, 30.&lt;/li&gt;
&lt;li&gt;(DRUM) Sadeghian, A., Armandpour, M., Ding, P., &amp;amp; Wang, D. Z. (2019). &lt;em&gt;Drum: End-to-end differentiable rule mining on knowledge graphs&lt;/em&gt;. Advances in Neural Information Processing Systems, 32.&lt;/li&gt;
&lt;li&gt;(CBGNN) Yan, Z., Ma, T., Gao, L., Tang, Z., &amp;amp; Chen, C. (2022, June). &lt;em&gt;Cycle representation learning for inductive relation prediction&lt;/em&gt;. In International Conference on Machine Learning (pp. 24895-24910). PMLR.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="các-phương-pháp-ngoại-suy-quan-hệ-relation-extrapolation-methods"&gt;
 Relation extrapolation methods&lt;span class="heading__anchor"&gt; &lt;a href="#c%c3%a1c-ph%c6%b0%c6%a1ng-ph%c3%a1p-ngo%e1%ba%a1i-suy-quan-h%e1%bb%87-relation-extrapolation-methods"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="mã-hóa-quan-hệ-relation-encoding"&gt;
 Relation encoding&lt;span class="heading__anchor"&gt; &lt;a href="#m%c3%a3-h%c3%b3a-quan-h%e1%bb%87-relation-encoding"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Encode from structural information&lt;/strong&gt; (when the support set contains only triples about the unseen relations):
&lt;ul&gt;
&lt;li&gt;(MetaR) Chen, M., Zhang, W., Zhang, W., Chen, Q., &amp;amp; Chen, H. (2019). &lt;em&gt;Meta relational learning for few-shot link prediction in knowledge graphs&lt;/em&gt;. arXiv preprint arXiv:1909.01515.&lt;/li&gt;
&lt;li&gt;(GANA) Niu, G., Li, Y., Tang, C., Geng, R., Dai, J., Liu, Q., &amp;hellip; &amp;amp; Si, L. (2021, July). &lt;em&gt;Relational learning with gated and attentive neighbor aggregator for few-shot knowledge graph completion&lt;/em&gt;. In Proceedings of the 44th International ACM SIGIR conference on research and development in information retrieval (pp. 213-222).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Encode from other information&lt;/strong&gt; (when the support set contains other kinds of information):
&lt;ul&gt;
&lt;li&gt;(ZSGAN) Qin, P., Wang, X., Chen, W., Zhang, C., Xu, W., &amp;amp; Wang, W. Y. (2020, April). &lt;em&gt;Generative adversarial zero-shot relational learning for knowledge graphs&lt;/em&gt;. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 05, pp. 8673-8680).&lt;/li&gt;
&lt;li&gt;(OntoZSL) Geng, Y., Chen, J., Chen, Z., Pan, J. Z., Ye, Z., Yuan, Z., &amp;hellip; &amp;amp; Chen, H. (2021, April). &lt;em&gt;Ontozsl: Ontology-enhanced zero-shot learning&lt;/em&gt;. In Proceedings of the Web Conference 2021 (pp. 3325-3336).&lt;/li&gt;
&lt;li&gt;(DMoG) Song, R., He, S., Zheng, S., Gao, S., Liu, K., Yu, Z., &amp;amp; Zhao, J. (2022, October). &lt;em&gt;Decoupling Mixture-of-Graphs: Unseen Relational Learning for Knowledge Graph Completion by Fusing Ontology and Textual Experts&lt;/em&gt;. In Proceedings of the 29th International Conference on Computational Linguistics (pp. 2237-2246).&lt;/li&gt;
&lt;li&gt;(HAPZSL) Li, X., Ma, J., Yu, J., Xu, T., Zhao, M., Liu, H., &amp;hellip; &amp;amp; Yu, R. (2022). &lt;em&gt;HAPZSL: A hybrid attention prototype network for knowledge graph zero-shot relational learning&lt;/em&gt;. Neurocomputing, 508, 324-336.&lt;/li&gt;
&lt;li&gt;(DOZSL) Geng, Y., Chen, J., Zhang, W., Xu, Y., Chen, Z., Pan, J. Z., &amp;hellip; &amp;amp; Chen, H. (2022, August). &lt;em&gt;Disentangled ontology embedding for zero-shot learning&lt;/em&gt;. In Proceedings of the 28th ACM SIGKDD conference on knowledge discovery and data mining (pp. 443-453).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="khớp-cặp-thực-thể-entity-pair-matching"&gt;
 Entity pair matching&lt;span class="heading__anchor"&gt; &lt;a href="#kh%e1%bb%9bp-c%e1%ba%b7p-th%e1%bb%b1c-th%e1%bb%83-entity-pair-matching"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;Representative works:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;(GMatching) Xiong, W., Yu, M., Chang, S., Guo, X., &amp;amp; Wang, W. Y. (2018). &lt;em&gt;One-shot relational learning for knowledge graphs&lt;/em&gt;. arXiv preprint arXiv:1808.09040.&lt;/li&gt;
&lt;li&gt;(FSRL) Zhang, C., Yao, H., Huang, C., Jiang, M., Li, Z., &amp;amp; Chawla, N. V. (2020, April). &lt;em&gt;Few-shot knowledge graph completion&lt;/em&gt;. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 03, pp. 3041-3048).&lt;/li&gt;
&lt;li&gt;(FAAN) Sheng, J., Guo, S., Chen, Z., Yue, J., Wang, L., Liu, T., &amp;amp; Xu, H. (2020). &lt;em&gt;Adaptive attentional network for few-shot knowledge graph completion&lt;/em&gt;. arXiv preprint arXiv:2010.09638.&lt;/li&gt;
&lt;li&gt;(MetaP) Jiang, Z., Gao, J., &amp;amp; Lv, X. (2021, July). &lt;em&gt;Metap: Meta pattern learning for one-shot knowledge graph completion&lt;/em&gt;. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2232-2236).&lt;/li&gt;
&lt;li&gt;(P-INT) Xu, J., Zhang, J., Ke, X., Dong, Y., Chen, H., Li, C., &amp;amp; Liu, Y. (2021, November). &lt;em&gt;P-INT: A path-based interaction model for few-shot knowledge graph completion&lt;/em&gt;. In Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 385-394).&lt;/li&gt;
&lt;li&gt;(GraphANGEL) Jin, J., Wang, Y., Du, K., Zhang, W., Zhang, Z., Wipf, D., &amp;hellip; &amp;amp; Gan, Q. (2021, October). &lt;em&gt;Inductive Relation Prediction Using Analogy Subgraph Embeddings&lt;/em&gt;. In International Conference on Learning Representations.&lt;/li&gt;
&lt;li&gt;(CSR) Huang, Q., Ren, H., &amp;amp; Leskovec, J. (2022). &lt;em&gt;Few-shot relational reasoning via connection subgraph pretraining&lt;/em&gt;. Advances in Neural Information Processing Systems, 35, 6397-6409.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="dữ-liệu"&gt;
 Datasets&lt;span class="heading__anchor"&gt; &lt;a href="#d%e1%bb%af-li%e1%bb%87u"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Datasets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;WN11-{Head/Tail/Both}-{1,000/3,000/5,000}
&lt;ul&gt;
&lt;li&gt;Proposed by&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;{WN18RR/FB15k-237/NELL995}-{v1/2/3/4}&lt;/li&gt;
&lt;li&gt;NELL-One/Wiki-One&lt;/li&gt;
&lt;li&gt;NELL-ZS/Wiki-ZS&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="bàn-luận"&gt;
 Discussion&lt;span class="heading__anchor"&gt; &lt;a href="#b%c3%a0n-lu%e1%ba%adn"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Discussion 1: Assumptions about entity extrapolation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There are typically two different assumptions about entity extrapolation.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;First assumption: the unseen entities in the support set are connected to seen entities. This setting is called semi-entity extrapolation.&lt;/li&gt;
&lt;li&gt;Second assumption: the unseen entities form entirely new knowledge graphs in the support sets and are not connected to any seen entity. This setting is called fully-entity extrapolation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It follows that models designed for fully-entity extrapolation can also be applied to semi-entity extrapolation, but not the other way around.&lt;/p&gt;
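&lt;p&gt;The difference between the two assumptions can be sketched as a simple check on the support set. The function name and the toy triples below are hypothetical, and the check only looks at triple-based support information.&lt;/p&gt;

```python
def entities_of(triples):
    """Collect every entity that appears as a head or a tail."""
    return {h for h, _, _ in triples} | {t for _, _, t in triples}


def extrapolation_setting(train_triples, support_triples):
    """Return "semi" if any support triple touches a seen entity, else "fully"."""
    seen = entities_of(train_triples)
    touches_seen = any(h in seen or t in seen for h, _, t in support_triples)
    return "semi" if touches_seen else "fully"


train = [("hanoi", "located_in", "vietnam")]
semi_support = [("hue", "located_in", "vietnam")]   # links to a seen entity
fully_support = [("lyon", "located_in", "france")]  # a disjoint new graph
```

&lt;p&gt;In the &lt;code&gt;fully_support&lt;/code&gt; case no seen entity is available to transfer knowledge from, which is why a model has to rely on entity-independent signals there.&lt;/p&gt;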
&lt;p&gt;Most semi-entity extrapolation models belong to the entity-encoding family and encode unseen entities from structural information, because they typically design modules that transfer knowledge from seen entities. Some models design entity-independent encoders, which lets them handle fully-entity extrapolation as well.&lt;/p&gt;
&lt;p&gt;Methods that encode unseen entities from other sources of information, such as textual descriptions, can also handle fully-entity extrapolation. Methods based on subgraph predicting and rule learning can handle fully-entity extrapolation because subgraphs and rules are entity-independent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Discussion 2: Exploiting information in the support set&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Many types of information can be used to build support sets for unseen elements, including fact triples, textual descriptions, and ontologies. We look at each in turn.&lt;/p&gt;
&lt;p&gt;First, fact triples provide structural information. They are an intuitive kind of support information for unseen elements, since such elements usually appear together with other elements in the form of a triple rather than alone. Knowledge from seen elements is carried by these triples and can be used by the unseen elements.&lt;/p&gt;
&lt;p&gt;In addition, textual descriptions are common in KGs because many KGs are built from text data. Textual descriptions naturally provide the ability to extrapolate to unseen elements, and they are typically fed to text encoders that turn the text into embeddings.&lt;/p&gt;
&lt;p&gt;Finally, ontologies are often used as prior knowledge about the correlations between seen and unseen elements, and many recent works use them to handle unseen relations. An ontology is usually represented as a graph containing hierarchical relations and constraints on the domains and ranges of relations. Embeddings of unseen relations can be generated with ontology-based methods that use techniques such as GANs or disentangled representation learning.&lt;/p&gt;
&lt;h2 class="heading" id="các-định-hướng-tương-lai"&gt;
 Future directions&lt;span class="heading__anchor"&gt; &lt;a href="#c%c3%a1c-%c4%91%e1%bb%8bnh-h%c6%b0%e1%bb%9bng-t%c6%b0%c6%a1ng-lai"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Direction 1: Applications&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Most current knowledge extrapolation methods are evaluated on link prediction over test sets. Although link prediction demonstrates a model's effectiveness and helps complete knowledge graphs, it is also worthwhile to explore how to generalize to unseen KG elements in other applications, such as answering logical queries expressed in a subset of first-order logic, entity alignment under a growing KG, and question answering.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Direction 2: Multi-modal support information&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Multi-modal knowledge graphs have been a widely discussed research topic recently. While many knowledge extrapolation methods focus on natural language as support information for unseen elements, relatively few works explore the potential of visual information.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Direction 3: Entity and relation extrapolation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Existing research on extrapolation mostly handles entity extrapolation and relation extrapolation separately, but in many real-world applications unseen entities and unseen relations can appear at the same time. A promising solution is methods that effectively integrate both entity and relation extrapolation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Direction 4: Dynamic and lifelong settings&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In many real-world applications, some KGs include temporal constraints, so temporal information has to be taken into account when scoring a triple. Dynamic knowledge graphs also face the challenge of emerging elements because of their dynamic nature. To address this, several works define an entity extrapolation problem on dynamic graphs and use various techniques to obtain embeddings for the unseen entities.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[1] Chen, M., Zhang, W., Geng, Y., Xu, Z., Pan, J. Z., &amp;amp; Chen, H. (2023). &lt;a href="https://arxiv.org/pdf/2302.01859"&gt;Generalizing to Unseen Elements: A Survey on Knowledge Extrapolation for Knowledge Graphs&lt;/a&gt;. arXiv preprint arXiv:2302.01859.&lt;/p&gt;</description></item><item><title>What Are Knowledge Graphs, Really?</title><link>https://blog.namln.org/research/what_knowledge_graphs/</link><pubDate>Thu, 26 Oct 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/research/what_knowledge_graphs/</guid><description>&lt;p&gt;In the field of knowledge representation and reasoning, data integration is an important task, and it is usually carried out using knowledge bases. There are many kinds of knowledge bases, and knowledge graphs are one of them.&lt;/p&gt;
&lt;p&gt;A knowledge graph is built using a knowledge model, a graph-structured data model also known as an &lt;strong&gt;ontology&lt;/strong&gt;. That is why the knowledge model is said to be the heart of a knowledge graph.&lt;/p&gt;
&lt;p&gt;Typically, knowledge graphs are used to store interlinked descriptions of entities, including objects, events, situations, and abstract concepts.&lt;/p&gt;
&lt;p&gt;The descriptions in the graph carry encoded formal semantics, so they can serve as a basis for both people and machines to process them efficiently and unambiguously. Moreover, the descriptions contribute to one another, forming a network in which each entity represents part of the description of the entities related to it. Based on the knowledge model, diverse data is also connected across the elements of the graph and described through semantic metadata.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://miro.medium.com/v2/resize:fit:1061/1*OIgEwADf5ZD2ib_9ejZ-zw.png" alt=""&gt;&lt;/p&gt;
&lt;h2 class="heading" id="lịch-sử-hình-thành"&gt;
 History&lt;span class="heading__anchor"&gt; &lt;a href="#l%e1%bb%8bch-s%e1%bb%ad-h%c3%acnh-th%c3%a0nh"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;In 1972, the term &amp;ldquo;knowledge graph&amp;rdquo; was coined by the Austrian linguist Edgar W. Schneider in a discussion of how to build modular instructional systems for courses. In the late 1980s, the University of Groningen and the University of Twente jointly began a project called Knowledge Graphs, which focused on designing semantic networks with edges restricted to a limited set of relations in order to facilitate algebras on the graph. In the following decades, the distinction between semantic networks and knowledge graphs became blurred.&lt;/p&gt;
&lt;p&gt;The first knowledge graphs were knowledge bases for a specific domain. In 1985, the WordNet database was founded, capturing semantic relations between words and their meanings. In 1998, Andrew Edmonds of Science - Finance Ltd in the UK created a system called ThinkBase that offered fuzzy-logic based reasoning in a graphical context. In 2005, Marc Wirk founded Geonames, capturing relations between geographic names, locations, and associated entities.&lt;/p&gt;
&lt;p&gt;In 2007, both DBpedia and Freebase were founded as graph-based knowledge bases for general-purpose knowledge. DBpedia focused on data extracted from Wikipedia, while Freebase aggregated a large number of public datasets. However, neither described itself as a &amp;ldquo;knowledge graph&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;In 2012, Google introduced its own knowledge graph, the Google Knowledge Graph, built on DBpedia and Freebase along with many other data sources. It later incorporated content extracted as RDFa, Microdata, and JSON-LD from web pages, the CIA World Factbook, Wikidata, and Wikipedia. Entity and relationship types in this knowledge graph were further organized using terms from the schema.org vocabulary.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.researchgate.net/publication/356140196/figure/fig2/AS:1089072500097059@1636666526420/Development-history-of-the-knowledge-graph.jpg" alt=""&gt;&lt;/p&gt;
&lt;h2 class="heading" id="định-nghĩa"&gt;
 Definition&lt;span class="heading__anchor"&gt; &lt;a href="#%c4%91%e1%bb%8bnh-ngh%c4%a9a"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;As we know, a knowledge base is a dataset that represents real-world facts and semantic relations in the form of triples. When those triples are represented as a graph, with relations as edges and entities as nodes, the result is considered a knowledge graph. In general usage, knowledge graphs and knowledge bases are treated as conceptually the same and the terms are used interchangeably.&lt;/p&gt;
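&lt;p&gt;As a minimal sketch of the idea above, the snippet below stores a few triples and turns them into an adjacency map, so that relations become labeled edges and entities become nodes. The example facts and the names TRIPLES and build_graph are made up for illustration; they do not come from any particular knowledge graph or library.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical example facts, each written as a (subject, relation, object) triple.
TRIPLES = [
    ("Rome", "capital_of", "Italy"),
    ("Italy", "member_of", "European Union"),
    ("Italy", "has_currency", "Euro"),
]


def build_graph(triples):
    """Turn triples into an adjacency map: node -> list of (relation, neighbor)."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        graph[obj]  # touch the object so it is registered as a node too
    return dict(graph)


graph = build_graph(TRIPLES)
nodes = sorted(graph)              # every entity that appears in some triple
edges_from_italy = graph["Italy"]  # outgoing labeled edges of one node
```

&lt;p&gt;An adjacency map is only one possible layout; it is used here because following the outgoing edges of a node is a single dictionary lookup.&lt;/p&gt;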
&lt;p&gt;So what exactly is a knowledge graph?&lt;/p&gt;
&lt;p&gt;There is no single accepted definition. Most definitions take a semantic web perspective and include the following key characteristics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Flexible relations among knowledge in topical domains: a knowledge graph
&lt;ul&gt;
&lt;li&gt;defines abstract classes and the relations of entities in a schema,&lt;/li&gt;
&lt;li&gt;mainly describes real-world entities and their interrelations, organized as a graph data structure,&lt;/li&gt;
&lt;li&gt;allows any entity to be potentially related to any other entity,&lt;/li&gt;
&lt;li&gt;covers a variety of topical domains.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;General structure: a network of entities, their semantic types, properties, and relationships.&lt;/li&gt;
&lt;li&gt;Supporting reasoning over inferred ontologies: a knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, many of these characteristics are not strictly necessary, or not relevant to one another, in some situations. A simpler way to put it:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A knowledge graph is a digital structure that represents knowledge as concepts and the relationships between them (facts).&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="đặc-trưng-cốt-lỗi-của-đồ-thị-tri-thức"&gt;
 Core characteristics of knowledge graphs&lt;span class="heading__anchor"&gt; &lt;a href="#%c4%91%e1%ba%b7c-tr%c6%b0ng-c%e1%bb%91t-l%e1%bb%97i-c%e1%bb%a7a-%c4%91%e1%bb%93-th%e1%bb%8b-tri-th%e1%bb%a9c"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Knowledge graphs combine characteristics of several data management paradigms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Database $\rightarrow$ the data can be explored through structured queries.&lt;/li&gt;
&lt;li&gt;Graph $\rightarrow$ the data can be analyzed like any other network or graph data structure.&lt;/li&gt;
&lt;li&gt;Knowledge base $\rightarrow$ the data carries formal semantics, which can be used for integration and inference tasks.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Knowledge graphs are typically represented in the Resource Description Framework (RDF), which supports integration, unification, linking, and reuse of data because it offers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Expressivity: it can effectively represent many kinds of data and content.&lt;/li&gt;
&lt;li&gt;Performance: it can handle billions of facts and properties.&lt;/li&gt;
&lt;li&gt;Interoperability between humans and machines: it allows querying through the SPARQL Protocol, management through the SPARQL Graph Store, and federation.&lt;/li&gt;
&lt;li&gt;Standardization: it is standardized through the W3C process.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="bản-thể-luận-ontologies-và-ngữ-nghĩa-hình-thức-formal-semantics"&gt;
 Ontologies and formal semantics&lt;span class="heading__anchor"&gt; &lt;a href="#b%e1%ba%a3n-th%e1%bb%83-lu%e1%ba%adn-ontologies-v%c3%a0-ng%e1%bb%af-ngh%c4%a9a-h%c3%acnh-th%e1%bb%a9c-formal-semantics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Ontologies are the backbone of the formal semantics of a knowledge graph. An ontology can also be seen as the schema of the graph. It acts as the link between the developers of a knowledge graph and what its users expect the data in the graph to mean.&lt;/p&gt;
&lt;p&gt;A user may be a person or a piece of software that wants to integrate the data in a reliable and precise way. Ontologies ensure a correct, shared understanding of the data and its meaning.&lt;/p&gt;
&lt;p&gt;When formal semantics are used to express and integrate the data of a knowledge graph, several modeling instruments need to be defined:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Classes&lt;/li&gt;
&lt;li&gt;Relationship types&lt;/li&gt;
&lt;li&gt;Categories&lt;/li&gt;
&lt;li&gt;Free-text descriptions&lt;/li&gt;
&lt;/ul&gt;
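&lt;p&gt;To make the role of classes and relationship types more concrete, here is a small sketch of a schema check over triples. The mini-ontology (RELATION_TYPES, ENTITY_CLASS) and the function conforms are hypothetical names invented for this example; real ontology languages such as RDFS or OWL are far richer, and this is only meant to show how a schema constrains what a triple may say.&lt;/p&gt;

```python
# Hypothetical mini-ontology: each relationship type names the class
# required for its subject and the class required for its object.
RELATION_TYPES = {
    "capital_of": ("City", "Country"),
    "borders": ("Country", "Country"),
}

# Hypothetical assignment of entities to classes.
ENTITY_CLASS = {"Rome": "City", "Italy": "Country", "France": "Country"}


def conforms(triple):
    """Return True if the triple uses a known relation with correctly typed ends."""
    subject, relation, obj = triple
    if relation not in RELATION_TYPES:
        return False
    subject_class, object_class = RELATION_TYPES[relation]
    return (
        ENTITY_CLASS.get(subject) == subject_class
        and ENTITY_CLASS.get(obj) == object_class
    )


valid = conforms(("Rome", "capital_of", "Italy"))     # City -> Country, accepted
reversed_ends = conforms(("Italy", "capital_of", "Rome"))  # wrong direction, rejected
unknown = conforms(("Rome", "twinned_with", "Paris"))  # relation not in the schema
```

&lt;p&gt;A check like this is what lets a consumer of the graph trust that a relation means the same thing everywhere it appears.&lt;/p&gt;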
&lt;h2 class="heading" id="thế-nào-là-không-phải-là-đồ-thị-tri-thức"&gt;
 What is NOT a knowledge graph?&lt;span class="heading__anchor"&gt; &lt;a href="#th%e1%ba%bf-n%c3%a0o-l%c3%a0-kh%c3%b4ng-ph%e1%ba%a3i-l%c3%a0-%c4%91%e1%bb%93-th%e1%bb%8b-tri-th%e1%bb%a9c"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Not every RDF graph is a knowledge graph&lt;/strong&gt;. For instance, a set of statistical data, such as the GDP figures of countries, represented in RDF is not a knowledge graph. A graph representation of data is often useful, but capturing the semantic knowledge behind the data may not be necessary. For some applications it is enough to have the string &amp;ldquo;Italy&amp;rdquo; associated with the string &amp;ldquo;GDP&amp;rdquo; and the number &amp;ldquo;1 billion&amp;rdquo;, without defining what a country is or what the &amp;ldquo;Gross Domestic Product&amp;rdquo; of a country is. &lt;strong&gt;It is the connections and the graph structure that make a knowledge graph&lt;/strong&gt;, not the language used to represent the data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Not every knowledge base is a knowledge graph&lt;/strong&gt;. A core characteristic of a knowledge graph is that entity descriptions should be interlinked with one another, so that the definition of one entity refers to another entity. This linking is how the graph forms: for example, if A is B, B is C, and C has D, then A has D. A knowledge base without formal structure and semantics, such as a question-and-answer knowledge base about some domain, is not a knowledge graph. It is entirely possible to have an expert system whose data is organized in a form that is not a graph, such as a set of &amp;ldquo;if-then&amp;rdquo; rules.&lt;/p&gt;
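&lt;p&gt;The chain &amp;ldquo;A is B, B is C, C has D, therefore A has D&amp;rdquo; can be sketched as a tiny inference loop over triples. The relation names is_a and has and the function infer_has are assumptions made for this illustration; an actual reasoner works from the ontology rather than from two hard-coded relations.&lt;/p&gt;

```python
def infer_has(triples):
    """Propagate "has" facts down "is_a" chains: A is B, B has D => A has D."""
    facts = set(triples)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for subject, relation, obj in list(facts):
            if relation != "is_a":
                continue
            # Everything the parent has, the child has as well.
            for parent, parent_relation, value in list(facts):
                if parent == obj and parent_relation == "has":
                    derived = (subject, "has", value)
                    if derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts


FACTS = [("A", "is_a", "B"), ("B", "is_a", "C"), ("C", "has", "D")]
closure = infer_has(FACTS)
```

&lt;p&gt;The derived fact about A only exists because the three input facts are linked through shared entities, which is the point the paragraph above makes about interlinked descriptions.&lt;/p&gt;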
&lt;h2 class="heading" id="đồ-thị-tri-thức-lớn"&gt;
 Large knowledge graphs&lt;span class="heading__anchor"&gt; &lt;a href="#%c4%91%e1%bb%93-th%e1%bb%8b-tri-th%e1%bb%a9c-l%e1%bb%9bn"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Google Knowledge Graph&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.ithinkanidea.com/wp-content/uploads/elementor/thumbs/Google-Knowledge-Graph-A-Complete-Guide-ojcy4q42vke5n0bu8gvbn4scrm1f53irumek7ggknc.jpg" alt=""&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DBpedia&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://pbs.twimg.com/media/DzH7AhvX4AAhhev.jpg" alt=""&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Geonames&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://techcrunch.com/wp-content/uploads/2007/05/geonames.png" alt=""&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;WordNet&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://upload.wikimedia.org/wikipedia/commons/b/b8/WordNet.PNG" alt=""&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;FactForge&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://www.ontotext.com/wp-content/uploads/2016/09/FactForge_header1-1024x530.png" alt=""&gt;&lt;/p&gt;
&lt;h2 class="heading" id="tham-khảo"&gt;
 References&lt;span class="heading__anchor"&gt; &lt;a href="#tham-kh%e1%ba%a3o"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] &lt;a href="https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph/"&gt;What is a Knowledge Graph?&lt;/a&gt;, &lt;a href="https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph/"&gt;https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph/&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Miscellanea</title><link>https://blog.namln.org/en/miscellanea/</link><pubDate>Tue, 17 Oct 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/miscellanea/</guid><description>&lt;h2 class="heading" id="miscellanea"&gt;
 Miscellanea&lt;span class="heading__anchor"&gt; &lt;a href="#miscellanea"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="latex-template"&gt;
 $\LaTeX$ Template&lt;span class="heading__anchor"&gt; &lt;a href="#latex-template"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Report Template&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[1] DoCS HCMUS - Template Report 01&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Link: &lt;a href="https://www.overleaf.com/read/mqvdqztstvnf#77b130"&gt;https://www.overleaf.com/read/mqvdqztstvnf#77b130&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;[2] DoCS HCMUS - Template Report 02&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Link: &lt;a href="https://www.overleaf.com/read/qvqpqytgztsn#9c5467"&gt;https://www.overleaf.com/read/qvqpqytgztsn#9c5467&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Thesis Template&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[1] Master Thesis Proposal&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Link: &lt;a href="https://www.overleaf.com/read/pmmkbqmsrvnq#7c860f"&gt;https://www.overleaf.com/read/pmmkbqmsrvnq#7c860f&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;[2] Master Thesis Template&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Link: &lt;a href="https://www.overleaf.com/read/ybsqztfjnvjc#a23f6b"&gt;https://www.overleaf.com/read/ybsqztfjnvjc#a23f6b&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;[3] Master Math Thesis Template&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Link: &lt;a href="https://www.overleaf.com/read/bzwmvkymwfwb#2b968e"&gt;https://www.overleaf.com/read/bzwmvkymwfwb#2b968e&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Beamer Template&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[1] DoCS HCMUS - Template Slide 01&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Link: &lt;a href="https://www.overleaf.com/read/dhkcxygmnxjv#fa7ec3"&gt;https://www.overleaf.com/read/dhkcxygmnxjv#fa7ec3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;[2] Slide-template&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Link: &lt;a href="https://www.overleaf.com/read/jfgnzwpsxmhk#4d4625"&gt;https://www.overleaf.com/read/jfgnzwpsxmhk#4d4625&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="advices"&gt;
 Advice&lt;span class="heading__anchor"&gt; &lt;a href="#advices"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="https://people.math.osu.edu/harper.903/advice_to_a_young_mathematician_atiyah.pdf"&gt;Advice to a Young Mathematician&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://windowsontheory.org/2015/11/03/advice-for-the-budding-theorist/"&gt;Advice for the budding theorist&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.boazbarak.org/informal/"&gt;Non-technical or less-technical writings and talks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://terrytao.wordpress.com/career-advice/"&gt;Career advice&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="links"&gt;
 Links&lt;span class="heading__anchor"&gt; &lt;a href="#links"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/list/cs.CC/recent"&gt;arXiv: Computational Complexity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/list/cs.CG/recent"&gt;arXiv: Computational Geometry&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/list/cs.DS/recent"&gt;arXiv: Data Structures and Algorithms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://aaronsadventures.blogspot.com/"&gt;Aaron Roth&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://adamsheffer.wordpress.com/"&gt;Adam Sheffer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://adamdsmith.wordpress.com/"&gt;Adam Smith&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://polylogblog.wordpress.com/"&gt;Andrew McGregor&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=""&gt;Banach&amp;rsquo;s Algorithmic Corner&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://benjamin-recht.github.io/"&gt;Ben Recht&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://bit-player.org/"&gt;bit-player&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cstheory-jobs.org/"&gt;CCI: jobs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cstheory-events.org/"&gt;CS Theory Events&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.computationalcomplexity.org/"&gt;Computational Complexity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://11011110.github.io/blog/"&gt;David Eppstein&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://daveagp.wordpress.com/"&gt;David Pritchard&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://decentdescent.org/"&gt;Decent Descent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://decentralizedthoughts.github.io/"&gt;Decentralized Thoughts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://differentialprivacy.org/"&gt;DifferentialPrivacy.org&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://eccc.weizmann.ac.il/"&gt;ECCC Papers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://emanueleviola.wordpress.com/"&gt;Emanuele Viola&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://3dpancakes.typepad.com/ernie/"&gt;Ernie&amp;rsquo;s 3D Pancakes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dstheory.wordpress.com/"&gt;Foundation of Data Science - Virtual Talk Series&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://francisbach.com/"&gt;Francis Bach&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gilkalai.wordpress.com/"&gt;Gil Kalai&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blogs.oregonstate.edu/glencora"&gt;Glencora Borradaile&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://research.googleblog.com/search/label/Algorithms"&gt;Google Research Blog: Algorithms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gradientscience.org/"&gt;Gradient Science&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://grigory.github.io/blog"&gt;Grigory Yaroslavtsev&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://minorfree.github.io"&gt;Hung Le&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tcsmath.wordpress.com"&gt;James R. Lee&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kamathematics.wordpress.com"&gt;Kamathematics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://processalgebra.blogspot.com/"&gt;Luca Aceto&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://lucatrevisan.wordpress.com"&gt;Luca Trevisan&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mittheory.wordpress.com/"&gt;MIT CSAIL Student Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://mybiasedcoin.blogspot.com/"&gt;Michael Mitzenmacher&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.mrtz.org/"&gt;Moritz Hardt&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://mysliceofpizza.blogspot.com/search/label/aggregator"&gt;Muthu Muthukrishnan&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://nisheethvishnoi.wordpress.com"&gt;Nisheeth Vishnoi&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.solipsistslog.com"&gt;Noah Stephens-Davidowitz&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://offconvex.github.io/"&gt;Off the Convex Path&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://paulwgoldberg.blogspot.com/search/label/aggregator"&gt;Paul Goldberg&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ptreview.sublinear.info"&gt;Property Testing Review&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://rjlipton.wpcomstaging.com"&gt;Richard Lipton&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blogs.princeton.edu/imabandit"&gt;Sébastien Bubeck&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://scottaaronson.blog"&gt;Scott Aaronson&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.simons.berkeley.edu/"&gt;Simons Institute Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tcsplus.wordpress.com/"&gt;TCS+ Seminar Series&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.geomblog.org/"&gt;The Geomblog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.let-all.com/blog"&gt;The Learning Theory Alliance Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://theorydish.blog/"&gt;Theory Dish: Stanford Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://thmatters.wordpress.com/"&gt;Theory Matters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mycqstate.wordpress.com"&gt;Thomas Vidick&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://agtb.wordpress.com/"&gt;Turing&amp;rsquo;s Invisible Hand&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://windowsontheory.org/"&gt;Windows on Theory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://people.csail.mit.edu/jshun/graph.shtml"&gt;Papers on Graph Analytics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.win.tue.nl/~wscor/woeginger/P-versus-NP.htm"&gt;The P-versus-NP page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cstheory-jobs.org/"&gt;Theoretical Computer Science Jobs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://theory.report/"&gt;Theory of Computing Report&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://yetanothermathblog.com/"&gt;Yet Another Mathblog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kidger.site/thoughts/just-know-stuff/"&gt;Just know stuff. (Or, how to achieve success in a machine learning PhD.)&lt;/a&gt; - Patrick Kidger&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.mit.edu/~dimitrib/Ten_Rules.html"&gt;Ten Simple Rules for Mathematical Writing&lt;/a&gt; - Dimitri Bertsekas&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mathproblems123.wordpress.com/"&gt;Beni Bogoşel&amp;rsquo;s blog&lt;/a&gt; - Beni Bogoşel&lt;/li&gt;
&lt;li&gt;&lt;a href="https://web.stanford.edu/~boyd/"&gt;Prof. Stephen P. Boyd&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://web.evanchen.cc/"&gt;Even Chen, MIT&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="heading" id="videos"&gt;
 Videos&lt;span class="heading__anchor"&gt; &lt;a href="#videos"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Mathematics - The Language of the Universe&lt;/li&gt;
&lt;/ul&gt;
&lt;iframe width="560" height="315" src="https://www.youtube.com/embed/S5LuCwZ0bpg?si=kbfZKFNQBCZiXJUp" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;
&lt;ul&gt;
&lt;li&gt;The World of Mathematical Reality&lt;/li&gt;
&lt;/ul&gt;
&lt;iframe width="560" height="315" src="https://www.youtube.com/embed/V1gT2f3Fe44?si=J6nWgaGcUb4wf4az" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;
&lt;ul&gt;
&lt;li&gt;Paul Lockhart teaching Go&lt;/li&gt;
&lt;/ul&gt;
&lt;iframe width="560" height="315" src="https://www.youtube.com/embed/vWya5fKwZ38?si=A_Ogq8wOAOB4bfaC" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;
&lt;ul&gt;
&lt;li&gt;Five Principles of Extraordinary Math Teaching&lt;/li&gt;
&lt;/ul&gt;
&lt;iframe width="560" height="315" src="https://www.youtube.com/embed/ytVneQUA5-c?si=mh-gPGq6OHxrWAK7" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;
&lt;ul&gt;
&lt;li&gt;The map of Mathematics&lt;/li&gt;
&lt;/ul&gt;
&lt;iframe width="560" height="315" src="https://www.youtube.com/embed/OmJ-4B-mS-Y?si=0Wgz04N6U3-m-vIh" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;
&lt;ul&gt;
&lt;li&gt;The map of Computer Science&lt;/li&gt;
&lt;/ul&gt;
&lt;iframe width="560" height="315" src="https://www.youtube.com/embed/SzJ46YA_RaA?si=ZrUK1SFMz_ehQF3a" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen&gt;&lt;/iframe&gt;
&lt;h3 class="heading" id="youtube-channles"&gt;
 YouTube channels&lt;span class="heading__anchor"&gt; &lt;a href="#youtube-channles"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;[1] &lt;a href="https://www.youtube.com/@mitocw"&gt;MIT OpenCourseWare&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[2] &lt;a href="https://www.youtube.com/@3blue1brown"&gt;3Blue1Brown&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[3] &lt;a href="https://www.youtube.com/@statquest"&gt;StatQuest with Josh Starmer&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[4] &lt;a href="https://www.youtube.com/@compscilessons"&gt;Computer Science Theory Explained&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[5] &lt;a href="https://www.youtube.com/@TheMathDistrict"&gt;The Math District&lt;/a&gt;&lt;/p&gt;
&lt;h3 class="heading" id="pre-print-on-optimization-and-operations-research"&gt;
 Preprints on Optimization and Operations Research&lt;span class="heading__anchor"&gt; &lt;a href="#pre-print-on-optimization-and-operations-research"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://optimization-online.org/"&gt;Optimization Online&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/list/math.OC/new"&gt;arXiv math.OC Optimization and Control&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.paperdigest.org/2025/03/most-influential-arxiv-optimization-and-control-papers-2025-03-version/"&gt;Most Influential ArXiv (Optimization and Control) Papers (2025-03 Version)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://resources.paperdigest.org/2024/10/most-influential-arxiv-optimization-and-control-papers-2024-10/"&gt;Most Influential ArXiv (Optimization and Control) Papers (2024-10)&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="optimization-and-operations-research-journal"&gt;
 Optimization and Operations Research Journal&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-and-operations-research-journal"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://www.scimagojr.com/journalsearch.php?q=19700177340"&gt;Advanced Modeling and Optimization&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10479"&gt;Annals of Operations Research&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/245"&gt;Applied Mathematics and Optimization&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/archive/math.OC"&gt;ArXiv.org Optimization and Control Preprints&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10589"&gt;Computational Optimization and Applications&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/computers-and-operations-research"&gt;Computers &amp;amp; Operations Research&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/discrete-optimization"&gt;Discrete Optimization&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/european-journal-of-operational-research"&gt;European Journal of Operational Research&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tandfonline.com/toc/tinf20/current"&gt;INFOR Journal&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pubsonline.informs.org/journal/ijoc"&gt;INFORMS Journal on Computing&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.palgrave.com/gp/journal/41274"&gt;International Abstracts in Operations Research&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10878"&gt;Journal of Combinatorial Optimization&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10898"&gt;Journal of Global Optimization&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10732"&gt;Journal of Heuristics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tandfonline.com/toc/tjor20/current"&gt;Journal of the Operational Research Society&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10957"&gt;Journal of Optimization Theory and Applications&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/186"&gt;Mathematical Methods of Operations Research&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10107"&gt;Mathematical Programming: Series A and B&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pubsonline.informs.org/journal/moor"&gt;Mathematics of Operations Research&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://onlinelibrary.wiley.com/journal/15206750"&gt;Naval Research Logistics&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://onlinelibrary.wiley.com/journal/10970037"&gt;Networks&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pubsonline.informs.org/journal/opre"&gt;Operations Research&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/operations-research-letters"&gt;Operations Research Letters&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tandfonline.com/toc/gopt20/current"&gt;Optimization&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/11081"&gt;Optimization and Engineering&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/291"&gt;OR Spectrum&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.siam.org/publications/journals/siam-journal-on-computing-sicomp"&gt;SIAM Journal on Computing&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.siam.org/publications/journals/siam-journal-on-optimization-siopt"&gt;SIAM Journal on Optimization&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/series/15193"&gt;Stochastic Programming E-Print Series&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/transportation-research-part-b-methodological"&gt;Transportation Research Part B: Methodological&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="combinatorics-and-graph-theory-journal"&gt;
 Combinatorics and Graph Theory Journal&lt;span class="heading__anchor"&gt; &lt;a href="#combinatorics-and-graph-theory-journal"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/advances-in-applied-mathematics"&gt;Advances in Applied Mathematics&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.advancesincombinatorics.com/"&gt;Advances in Combinatorics&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tandfonline.com/toc/uakc20/current"&gt;AKCE International Journal of Graphs and Combinatorics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://admjournal.luguniv.edu.ua/index.php/adm"&gt;Algebra and Discrete Mathematics&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://alco.centre-mersenne.org/"&gt;Algebraic Combinatorics&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/26"&gt;Annals of Combinatorics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.charlesbabbage.ca/index.php/ars-combinatoria"&gt;Ars Combinatoria&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://amc-journal.eu/index.php/amc"&gt;Ars Mathematica Contemporanea&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/archive/math.CO"&gt;ArXiv Combinatorics Preprints&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ajc.maths.uq.edu.au/"&gt;Australasian Journal of Combinatorics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://britishcombinatorial.wordpress.com/bulletin/"&gt;British Combinatorial Bulletin&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.combinatorics.org/"&gt;Bulletin of the Institute of Combinatorics and Its Applications&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://escholarship.org/uc/combinatorial_theory"&gt;Combinatorial Theory&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/493"&gt;Combinatorica&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cambridge.org/core/journals/combinatorics-probability-and-computing"&gt;Combinatorics, Probability and Computing&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.utilitasmathematica.com/congressus-numerantium"&gt;Congressus Numerantium&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10623"&gt;Designs, Codes and Cryptography&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://dimacs.rutgers.edu/archive/Publications/Series.html"&gt;DIMACS Series in Discrete Mathematics and Theoretical Computer Science (Surveys)&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://dimacs.rutgers.edu/archive/TechnicalReports/"&gt;DIMACS Technical Reports&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://discreteanalysisjournal.com/"&gt;Discrete Analysis&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/discrete-applied-mathematics"&gt;Discrete Applied Mathematics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/discrete-mathematics"&gt;Discrete Mathematics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.degruyter.com/journal/key/dma/html"&gt;Discrete Mathematics and Applications&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.discuss.wmie.uz.zgora.pl/gt/"&gt;Discussiones Mathematicae Graph Theory&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.combinatorics.org/"&gt;Electronic Journal of Combinatorics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/electronic-notes-in-discrete-mathematics"&gt;Electronic Notes in Discrete Mathematics&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://ecajournal.haifa.ac.il/"&gt;Enumerative Combinatorics and Applications&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/european-journal-of-combinatorics"&gt;European Journal of Combinatorics&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.fq.math.ca/"&gt;Fibonacci Quarterly&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.ams.org/journals/notices/"&gt;Graph Theory Notes of New York&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/373"&gt;Graphs and Combinatorics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tandfonline.com/toc/uinm20/current"&gt;Internet Mathematics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/10801"&gt;Journal of Algebraic Combinatorics&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://onlinelibrary.wiley.com/journal/15206696"&gt;Journal of Combinatorial Designs&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://combinatorialmath.ca/"&gt;Journal of Combinatorial Mathematics and Combinatorial Computing&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/journal-of-combinatorial-theory-series-a"&gt;Journal of Combinatorial Theory - Series A&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/journal-of-combinatorial-theory-series-b"&gt;Journal of Combinatorial Theory - Series B&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.intlpress.com/site/pubpages/journal.php?journal=joc"&gt;Journal of Combinatorics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.scimagojr.com/journalsearch.php?q=21100887622"&gt;Journal of Combinatorics, Information and System Sciences&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tandfonline.com/toc/tdmc20/current"&gt;Journal of Discrete Mathematical Sciences and Cryptography&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://jgaa.info/"&gt;Journal of Graph Algorithms and Applications&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://onlinelibrary.wiley.com/journal/10970118"&gt;Journal of Graph Theory&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kam.mff.cuni.cz/series/"&gt;KAM-DIMATIA Preprint Series&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.mathnet.ru/eng/mjcnt"&gt;Moscow Journal of Combinatorics and Number Theory&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/11083"&gt;Order: A Journal on the Theory of Ordered Sets and its Applications&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.quasigroups.eu/"&gt;Quasigroups and Related Systems&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://onlinelibrary.wiley.com/journal/10982418"&gt;Random Structures and Algorithms&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="theoretical-computer-science-journal"&gt;
 Theoretical Computer Science Journal&lt;span class="heading__anchor"&gt; &lt;a href="#theoretical-computer-science-journal"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/236"&gt;Acta Informatica&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/453"&gt;Algorithmica&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/archive/cs"&gt;ArXiv.org Computer Science Preprints&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://cjtcs.cs.uchicago.edu/"&gt;Chicago Journal of Theoretical Computer Science&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://citeseerx.ist.psu.edu/"&gt;CiteSeer Preprints&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/37"&gt;Computational Complexity&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dmtcs.episciences.org/"&gt;Discrete Mathematics &amp;amp; Theoretical Computer Science&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://eccc.weizmann.ac.il/"&gt;Electronic Colloquium on Computational Complexity&lt;/a&gt; (Ranking: No Info)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=18"&gt;IEEE Transactions on Information Theory&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/information-processing-letters"&gt;Information Processing Letters&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tandfonline.com/toc/gcom20/current"&gt;International Journal of Computer Mathematics&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dl.acm.org/journal/jacm"&gt;Journal of the ACM&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/journal-of-algorithms"&gt;Journal of Algorithms&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.jalc.de/"&gt;Journal of Automata, Languages and Combinatorics&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/journal-of-complexity"&gt;Journal of Complexity&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/journal-of-computer-and-system-sciences"&gt;Journal of Computer and System Sciences&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.springer.com/journal/145"&gt;Journal of Cryptology&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/journal-of-discrete-algorithms"&gt;Journal of Discrete Algorithms&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.worldscientific.com/worldscinet/join"&gt;Journal of Interconnection Networks&lt;/a&gt; (Ranking: Q3)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/mathematical-and-computer-modelling"&gt;Mathematical and Computer Modelling&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.cambridge.org/core/journals/mathematical-structures-in-computer-science"&gt;Mathematical Structures in Computer Science&lt;/a&gt; (Ranking: Q2)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.journals.elsevier.com/theoretical-computer-science"&gt;Theoretical Computer Science&lt;/a&gt; (Ranking: Q1)&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Teaching</title><link>https://blog.namln.org/en/teaching/</link><pubDate>Tue, 17 Oct 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/teaching/</guid><description>&lt;h2 class="heading" id="teaching-assistant"&gt;
 Teaching Assistant&lt;span class="heading__anchor"&gt; &lt;a href="#teaching-assistant"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Applied in Data Science&lt;/li&gt;
&lt;li&gt;Data Hiding and Secret Sharing&lt;/li&gt;
&lt;li&gt;Data Structures and Algorithms&lt;/li&gt;
&lt;li&gt;Data Mining and Applications&lt;/li&gt;
&lt;li&gt;Data Visualization&lt;/li&gt;
&lt;li&gt;Fundamentals of Artificial Intelligence&lt;/li&gt;
&lt;li&gt;Fundamentals of Programming&lt;/li&gt;
&lt;li&gt;Introduction to Programming&lt;/li&gt;
&lt;li&gt;Introduction to Data Science&lt;/li&gt;
&lt;li&gt;Introduction to Machine Learning&lt;/li&gt;
&lt;li&gt;Introduction to Big Data&lt;/li&gt;
&lt;li&gt;Introduction to Information Technology&lt;/li&gt;
&lt;li&gt;Graph Mining&lt;/li&gt;
&lt;li&gt;Parallel Programming&lt;/li&gt;
&lt;li&gt;Programming for Data Science&lt;/li&gt;
&lt;li&gt;Swarm Intelligence&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Optimization Research Papers in JMLR Volume 24</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v24/</link><pubDate>Fri, 29 Sep 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v24/</guid><description>&lt;h1 class="heading" id="optimization-research-papers-in-jmlr-volume-24-2023"&gt;
 Optimization Research Papers in JMLR Volume 24 (2023)&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-research-papers-in-jmlr-volume-24-2023"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;This document lists papers from JMLR Volume 24 (2023) that focus on optimization research, categorized by their primary themes. Each paper is numbered starting from 1 within its subsection, with a brief description of its key contributions to optimization theory, algorithms, or applications.&lt;/p&gt;
&lt;h2 class="heading" id="convex-optimization"&gt;
 Convex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#convex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing convex optimization problems, including sparse PCA, L0 regularization, and matrix decomposition.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sparse PCA: A Geometric Approach&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dimitris Bertsimas, Driss Lahlou Kitane&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a geometric approach for sparse principal component analysis using convex optimization techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fundamental Limits and Algorithms for Sparse Linear Regression with Sublinear Sparsity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Lan V. Truong&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates algorithms and theoretical limits for sparse linear regression with sublinear sparsity in a convex framework.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sparse Training with Lipschitz Continuous Loss Functions and a Weighted Group L0-norm Constraint&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Michael R. Metel&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes sparse training methods using Lipschitz continuous loss functions and group L0-norm constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;MARS: A Second-Order Reduction Algorithm for High-Dimensional Sparse Precision Matrices Estimation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Qian Li, Binyan Jiang, Defeng Sun&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Presents a second-order reduction algorithm for sparse precision matrix estimation using convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sparse GCA and Thresholded Gradient Descent&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Sheng Gao, Zongming Ma&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops sparse generalized correlation analysis with thresholded gradient descent in a convex framework.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Parameter-Free Conditional Gradient Method for Composite Minimization under Hölder Condition&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Masaru Ito, Zhaosong Lu, Chuan He&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a parameter-free conditional gradient method for composite minimization under Hölder smoothness.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;L0Learn: A Scalable Package for Sparse Learning using L0 Regularization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Hussein Hazimeh, Rahul Mazumder, Tim Nonet&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Presents L0Learn, a scalable package for sparse learning with L0 regularization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sparse Plus Low Rank Matrix Decomposition: A Discrete Optimization Approach&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dimitris Bertsimas, Ryan Cory-Wright, Nicholas A. G. Johnson&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a discrete optimization approach for sparse plus low-rank matrix decomposition using convex methods.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distributed Sparse Regression via Penalization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yao Ji, Gesualdo Scutari, Ying Sun, Harsha Honnappa&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops distributed sparse regression algorithms using penalization techniques in convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Elastic Gradient Descent, an Iterative Optimization Method Approximating the Solution Paths of the Elastic Net&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Oskar Allerbo, Johan Jonasson, Rebecka Jörnsten&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces an iterative method approximating elastic net solution paths in convex settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Novel Integer Linear Programming Approach for Global L0 Minimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Diego Delle Donne, Matthieu Kowalski, Leo Liberti&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes an integer linear programming approach for global L0 minimization.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="nonconvex-optimization"&gt;
 Nonconvex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#nonconvex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers tackling nonconvex optimization, focusing on descent algorithms, majorization minimization, and minimax problems.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Line-Search Descent Algorithm for Strict Saddle Functions with Complexity Guarantees&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Michael J. O&amp;rsquo;Neill, Stephen J. Wright&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a line-search descent algorithm for nonconvex strict saddle functions with complexity guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Inertial Block Majorization Minimization Framework for Nonsmooth Nonconvex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Le Thi Khanh Hien, Duy Nhat Phan, Nicolas Gillis&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes an inertial block majorization minimization framework for nonsmooth nonconvex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(\epsilon^{-7/4})$ Complexity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Huan Li, Zhouchen Lin&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a restarted accelerated gradient descent method for nonconvex optimization, eliminating polylogarithmic factors.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Preconditioned Gradient Descent for Overparameterized Nonconvex Burer-Monteiro Factorization with Global Optimality Certification&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Gavin Zhang, Salar Fattahi, Richard Y. Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops preconditioned gradient descent for nonconvex Burer-Monteiro factorization with global optimality guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Zeroth-Order Alternating Gradient Descent Ascent Algorithms for A Class of Nonconvex-Nonconcave Minimax Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zi Xu, Zi-Qi Wang, Jun-Lin Wang, Yu-Hong Dai&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes zeroth-order alternating gradient descent ascent for nonconvex-nonconcave minimax problems.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="stochastic-optimization"&gt;
 Stochastic Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#stochastic-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on stochastic optimization methods, including gradient descent, proximal point methods, and continuous-time approaches.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Convergence of Stochastic Gradient Descent with Bandwidth-Based Step Size&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xiaoyu Wang, Ya-xiang Yuan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes convergence of stochastic gradient descent with bandwidth-based step sizes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Optimization under Distributional Drift&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Joshua Cutler, Dmitriy Drusvyatskiy, Zaid Harchaoui&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies stochastic optimization under distributional drift with theoretical guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Improved Powered Stochastic Optimization Algorithms for Large-Scale Machine Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zhuang Yang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes improved powered stochastic optimization algorithms for large-scale machine learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xiao-Tong Yuan, Ping Li&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides a sharper analysis of minibatch stochastic proximal point methods, focusing on stability and smoothness.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Continuous-Time Stochastic Gradient Descent Method for Continuous Data&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Kexin Jin, Jonas Latz, Chenguang Liu, Carola-Bibiane Schönlieb&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a continuous-time stochastic gradient descent method for continuous data optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sensitivity-Free Gradient Descent Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ion Matei, Maksym Zhenirovskyy, Johan de Kleer, John Maxwell&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops gradient descent algorithms for optimization problems with ODE constraints that avoid sensitivity analysis when evaluating gradients.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="distributeddecentralized-optimization"&gt;
 Distributed/Decentralized Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#distributeddecentralized-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing distributed or decentralized optimization algorithms, focusing on federated learning, asynchronous updates, and network topology.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decentralized Learning: Theoretical Optimality and Practical Improvements&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yucheng Lu, Christopher De Sa&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes theoretical optimality and practical improvements for decentralized learning algorithms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A General Theory for Federated Optimization with Asynchronous and Heterogeneous Clients Updates&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yann Fraboni, Richard Vidal, Laetitia Kameni, Marco Lorenzi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides a general theory for federated optimization with asynchronous and heterogeneous client updates.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Buffered Asynchronous SGD for Byzantine Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yi-Rui Yang, Wu-Jun Li&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes buffered asynchronous SGD for Byzantine-resilient distributed learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Minimax Estimation for Personalized Federated Learning: An Alternative Between FedAvg and Local Training&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Shuxiao Chen, Qinqing Zheng, Qi Long, Weijie J. Su&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates minimax estimation for personalized federated learning, comparing FedAvg and local training.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Removing Data Heterogeneity Influence Enhances Network Topology Dependence of Decentralized SGD&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Kun Yuan, Sulaiman A. Alghunaim, Xinmeng Huang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Shows that removing the influence of data heterogeneity improves the network-topology dependence of decentralized SGD.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Multi-Consensus Decentralized Accelerated Gradient Descent&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Haishan Ye, Luo Luo, Ziang Zhou, Tong Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops multi-consensus decentralized accelerated gradient descent for distributed optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accelerated Primal-Dual Mirror Dynamics for Centralized and Distributed Constrained Convex Optimization Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: You Zhao, Xiaofeng Liao, Xing He, Mingliang Zhou, Chaojie Li&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes accelerated primal-dual mirror dynamics for centralized and distributed convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Beyond Spectral Gap: The Role of the Topology in Decentralized Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Thijs Vogels, Hadrien Hendrikx, Martin Jaggi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Examines the role of network topology in decentralized learning optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="bandits-and-online-learning"&gt;
 Bandits and Online Learning&lt;span class="heading__anchor"&gt; &lt;a href="#bandits-and-online-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing multi-armed bandits, online optimization, and regret minimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adaptation to the Range in K-Armed Bandits&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Hédi Hadiji, Gilles Stoltz&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies adaptation to the range in k-armed bandit problems with regret minimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dimension Reduction in Contextual Online Learning via Nonparametric Variable Selection&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Wenhao Li, Ningyuan Chen, L. Jeff Hong&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes dimension reduction techniques for contextual online learning with nonparametric variable selection.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Non-Stationary Online Learning with Memory and Non-Stochastic Control&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Peng Zhao, Yu-Hu Yan, Yu-Xiang Wang, Zhi-Hua Zhou&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates non-stationary online learning with memory and non-stochastic control strategies.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Online Non-Stochastic Control with Partial Feedback&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yu-Hu Yan, Peng Zhao, Zhi-Hua Zhou&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops online non-stochastic control methods with partial feedback for optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yasin Abbasi-Yadkori, András György, Nevena Lazić&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes dynamic regret in non-stationary stochastic bandit problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A PDE Approach for Regret Bounds under Partial Monitoring&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Erhan Bayraktar, Ibrahim Ekren, Xin Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Uses a PDE-based approach to derive regret bounds for partial monitoring in online learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Continuous-in-Time Limit for Bayesian Bandits&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yuhua Zhu, Zachary Izzo, Lexing Ying&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Explores the continuous-time limit for Bayesian bandit algorithms with theoretical guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Bandit Problems with Fidelity Rewards&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Gábor Lugosi, Ciara Pike-Burke, Pierre-André Savalle&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies bandit problems with fidelity rewards, focusing on regret minimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Linear Partial Monitoring for Sequential Decision Making: Algorithms, Regret Bounds and Applications&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Johannes Kirschner, Tor Lattimore, Andreas Krause&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops algorithms and regret bounds for linear partial monitoring in sequential decision-making.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="optimization-in-reinforcement-learning"&gt;
 Optimization in Reinforcement Learning&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-in-reinforcement-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on optimization techniques for reinforcement learning, including actor-critic methods and constrained RL.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reinforcement Learning for Joint Optimization of Multiple Rewards&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mridul Agarwal, Vaneet Aggarwal&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Focuses on reinforcement learning for optimizing multiple rewards simultaneously.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Provably Sample-Efficient Model-Free Algorithm for MDPs with Peak Constraints&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Qinbo Bai, Vaneet Aggarwal, Ather Gattami&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a sample-efficient model-free algorithm for MDPs with peak constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Off-Policy Actor-Critic with Emphatic Weightings&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops off-policy actor-critic methods with emphatic weightings for RL optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Q-Learning for MDPs with General Spaces: Convergence and Near Optimality via Quantization under Weak Continuity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ali Devran Kara, Naci Saldi, Serdar Yüksel&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the convergence and near-optimality of Q-learning for MDPs with general state and action spaces, using quantization under weak continuity conditions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Kaiqing Zhang, Sham M. Kakade, Tamer Basar, Lin F. Yang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;F2A2: Flexible Fully-Decentralized Approximate Actor-Critic for Cooperative Multi-Agent Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Wenhao Li, Bo Jin, Xiangfeng Wang, Junchi Yan, Hongyuan Zha&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a flexible fully-decentralized approximate actor-critic method for cooperative multi-agent RL.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adaptation Augmented Model-Based Policy Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jian Shen, Hang Lai, Minghuan Liu, Han Zhao, Yong Yu, Weinan Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces adaptation-augmented model-based policy optimization for RL.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Single Timescale Actor-Critic Method to Solve the Linear Quadratic Regulator with Convergence Guarantees&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mo Zhou, Jianfeng Lu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a single timescale actor-critic method for linear quadratic regulators with convergence guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Convex Reinforcement Learning in Finite Trials&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mirco Mutti, Riccardo De Santi, Piersilvio De Bartolomeis, Marcello Restelli&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates convex reinforcement learning with finite trials, focusing on optimization techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, Mengdi Wang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a variational primal-dual policy optimization method for constrained RL.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Instance-Dependent Confidence and Early Stopping for Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Eric Xia, Koulik Khamaru, Martin J. Wainwright, Michael I. Jordan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops instance-dependent confidence bounds and early stopping strategies for RL optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="other-optimization-topics"&gt;
 Other Optimization Topics&lt;span class="heading__anchor"&gt; &lt;a href="#other-optimization-topics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers covering miscellaneous optimization topics, including Riemannian optimization, matrix completion, and optimal transport.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Relaxed Inertial Forward-Backward-Forward Algorithm for Solving Monotone Inclusions with Application to GANs&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Radu I. Bot, Michael Sedlmayer, Phan Tu Vuong&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a relaxed inertial forward-backward-forward algorithm for monotone inclusions with applications to GANs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Discrete Variational Calculus for Accelerated Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Cédric M. Campos, Alejandro Mahillo, David Martín de Diego&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces discrete variational calculus for accelerating optimization processes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Online Optimization over Riemannian Manifolds&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xi Wang, Zhipeng Tu, Yiguang Hong, Yingyi Wu, Guodong Shi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops online optimization algorithms over Riemannian manifolds.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fast Objective &amp;amp; Duality Gap Convergence for Non-Convex Strongly-Concave Min-Max Problems with PL Condition&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zhishuai Guo, Yan Yan, Zhuoning Yuan, Tianbao Yang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes fast convergence for non-convex strongly-concave min-max problems under the Polyak-Łojasiewicz condition.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Asynchronous Iterations in Optimization: New Sequence Results and Sharper Algorithmic Guarantees&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Hamid Reza Feyzmahdavian, Mikael Johansson&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides new sequence results and sharper guarantees for asynchronous optimization iterations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Proximal ID Algorithm&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ilya Shpitser, Zach Wood-Doughty, Eric J. Tchetgen Tchetgen&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces an identification algorithm for causal effects that combines proximal causal inference with graphical-model identification.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Inexact Augmented Lagrangian Algorithm for Training Leaky ReLU Neural Network with Group Sparsity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Wei Liu, Xin Liu, Xiaojun Chen&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops an inexact augmented Lagrangian algorithm for training leaky ReLU networks with group sparsity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Optimality of Nuclear-Norm-Based Matrix Completion for Problems with Smooth Non-Linear Structure&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yunhua Xiang, Tianyu Zhang, Xu Wang, Ali Shojaie, Noah Simon&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies nuclear-norm-based matrix completion for problems with smooth nonlinear structures.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Importance Sparsification for Sinkhorn Algorithm&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mengyu Li, Jun Yu, Tao Li, Cheng Meng&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes importance sparsification techniques for the Sinkhorn algorithm in optimal transport.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Near-Optimal Weighted Matrix Completion&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Oscar López&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates near-optimal weighted matrix completion using optimization techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Implicit Regularization and Entrywise Convergence of Riemannian Optimization for Low Tucker-Rank Tensor Completion&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Haifeng Wang, Jinchi Chen, Ke Wei&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes implicit regularization and entrywise convergence in Riemannian optimization for low Tucker-rank tensor completion.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Unbalanced Optimal Transport: Gradient Methods, Sparsity and Approximation Error&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Quang Minh Nguyen, Hoang H. Nguyen, Yi Zhou, Lam M. Nguyen&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies gradient methods for unbalanced optimal transport, focusing on sparsity and approximation error.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>List of Selected Papers on Algorithms for Large-Scale Graph Processing.</title><link>https://blog.namln.org/graph-analytics/large-scale-graph-processing-reading-list/</link><pubDate>Sat, 19 Aug 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/graph-analytics/large-scale-graph-processing-reading-list/</guid><description>&lt;p&gt;1/ [ISAAC'11] Goodrich, M. T., Sitchinava, N., &amp;amp; Zhang, Q. (2011, December). &lt;a href="https://arxiv.org/abs/1101.1902"&gt;Sorting, searching, and simulation in the mapreduce framework&lt;/a&gt;. In International Symposium on Algorithms and Computation (pp. 374-383). Springer, Berlin, Heidelberg.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@inproceedings{goodrich2011sorting,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Sorting, searching, and simulation in the mapreduce framework},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Goodrich, Michael T and Sitchinava, Nodari and Zhang, Qin},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; booktitle={International Symposium on Algorithms and Computation},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pages={374--383},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2011},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; organization={Springer}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;2/ [STOC'14] Andoni, A., Nikolov, A., Onak, K., &amp;amp; Yaroslavtsev, G. (2014, May). &lt;a href="https://arxiv.org/pdf/1401.0042"&gt;Parallel algorithms for geometric graph problems&lt;/a&gt;. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing (pp. 574-583).&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@inproceedings{andoni2014parallel,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Parallel algorithms for geometric graph problems},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Andoni, Alexandr and Nikolov, Aleksandar and Onak, Krzysztof and Yaroslavtsev, Grigory},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; booktitle={Proceedings of the forty-sixth annual ACM symposium on Theory of computing},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pages={574--583},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2014}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;3/ [STOC'17] Im, S., Moseley, B., &amp;amp; Sun, X. (2017, June). &lt;a href="https://dl.acm.org/doi/pdf/10.1145/3055399.3055460"&gt;Efficient massively parallel methods for dynamic programming&lt;/a&gt;. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (pp. 798-811).&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@inproceedings{im2017efficient,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Efficient massively parallel methods for dynamic programming},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Im, Sungjin and Moseley, Benjamin and Sun, Xiaorui},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; booktitle={Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pages={798--811},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2017}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;4/ [FOCS'18] Andoni, A., Song, Z., Stein, C., Wang, Z., &amp;amp; Zhong, P. (2018, October). &lt;a href="https://arxiv.org/pdf/1805.03055"&gt;Parallel graph connectivity in log diameter rounds&lt;/a&gt;. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) (pp. 674-685). IEEE.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@inproceedings{andoni2018parallel,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Parallel graph connectivity in log diameter rounds},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Andoni, Alexandr and Song, Zhao and Stein, Clifford and Wang, Zhengyu and Zhong, Peilin},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; booktitle={2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS)},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pages={674--685},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2018},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; organization={IEEE}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;5/ [SOSA'19] Liu, P., &amp;amp; Vondrák, J. (2018). &lt;a href="https://arxiv.org/abs/1810.01489"&gt;Submodular optimization in the mapreduce model&lt;/a&gt;. arXiv preprint arXiv:1810.01489.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@article{liu2018submodular,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Submodular optimization in the mapreduce model},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Liu, Paul and Vondr{\&amp;#39;a}k, Jan},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; journal={arXiv preprint arXiv:1810.01489},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2018}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;6/ [PODC'19] Behnezhad, S., Brandt, S., Derakhshan, M., Fischer, M., Hajiaghayi, M., Karp, R. M., &amp;amp; Uitto, J. (2019, July). &lt;a href="https://dl.acm.org/citation.cfm?id=3331609"&gt;Massively parallel computation of matching and MIS in sparse graphs&lt;/a&gt;. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing (pp. 481-490).&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@inproceedings{behnezhad2019massively,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Massively parallel computation of matching and MIS in sparse graphs},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Behnezhad, Soheil and Brandt, Sebastian and Derakhshan, Mahsa and Fischer, Manuela and Hajiaghayi, MohammadTaghi and Karp, Richard M and Uitto, Jara},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; booktitle={Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pages={481--490},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2019}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;Brandt, S., Fischer, M., &amp;amp; Uitto, J. (2018). &lt;a href="https://arxiv.org/abs/1807.05374"&gt;Matching and MIS for uniformly sparse graphs in the low-memory MPC model&lt;/a&gt;. arXiv preprint arXiv:1807.05374.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@article{brandt2018matching,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Matching and MIS for uniformly sparse graphs in the low-memory MPC model},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Brandt, Sebastian and Fischer, Manuela and Uitto, Jara},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; journal={arXiv preprint arXiv:1807.05374},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2018}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;Behnezhad, S., Derakhshan, M., Hajiaghayi, M., &amp;amp; Karp, R. M. (2018). &lt;a href="https://arxiv.org/pdf/1807.06701"&gt;Massively parallel symmetry breaking on sparse graphs: MIS and maximal matching&lt;/a&gt;. arXiv preprint arXiv:1807.06701.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@article{behnezhad2018massively,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Massively parallel symmetry breaking on sparse graphs: MIS and maximal matching},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Behnezhad, Soheil and Derakhshan, Mahsa and Hajiaghayi, MohammadTaghi and Karp, Richard M},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; journal={arXiv preprint arXiv:1807.06701},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2018}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;7/ [PODC'19] Chang, Y. J., Fischer, M., Ghaffari, M., Uitto, J., &amp;amp; Zheng, Y. (2019, July). &lt;a href="https://dl.acm.org/doi/pdf/10.1145/3293611.3331607"&gt;The complexity of $(\Delta+1)$ coloring in congested clique, massively parallel computation, and centralized local computation&lt;/a&gt;. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing (pp. 471-480).&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@inproceedings{chang2019complexity,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={The complexity of ($\Delta$+ 1) coloring in congested clique, massively parallel computation, and centralized local computation},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Chang, Yi-Jun and Fischer, Manuela and Ghaffari, Mohsen and Uitto, Jara and Zheng, Yufan},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; booktitle={Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pages={471--480},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2019}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;8/ [FOCS'19] Ghaffari, M., Kuhn, F., &amp;amp; Uitto, J. (2019, November). &lt;a href="https://people.inf.ethz.ch/gmohsen/MPA19/Notes/conditionalLBs.pdf"&gt;Conditional hardness results for massively parallel computation from distributed lower bounds&lt;/a&gt;. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS) (pp. 1650-1663). IEEE.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@inproceedings{ghaffari2019conditional,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Conditional hardness results for massively parallel computation from distributed lower bounds},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Ghaffari, Mohsen and Kuhn, Fabian and Uitto, Jara},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; booktitle={2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS)},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pages={1650--1663},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2019},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; organization={IEEE}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;&lt;p&gt;9/ [SODA'20] Ghaffari, M., Nowicki, K., &amp;amp; Thorup, M. (2020). &lt;a href="https://epubs.siam.org/doi/pdf/10.1137/1.9781611975994.77"&gt;Faster algorithms for edge connectivity via random 2-out contractions&lt;/a&gt;. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 1260-1279). Society for Industrial and Applied Mathematics.&lt;/p&gt;

&lt;figure class="code-block"&gt;
 
 &lt;div class="highlight-wrapper"&gt;
 &lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;@inproceedings{ghaffari2020faster,
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; title={Faster algorithms for edge connectivity via random 2-out contractions},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; author={Ghaffari, Mohsen and Nowicki, Krzysztof and Thorup, Mikkel},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; booktitle={Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; pages={1260--1279},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; year={2020},
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; organization={SIAM}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
 &lt;/div&gt;
&lt;/figure&gt;</description></item><item><title>Reading list &amp; mathematics resources.</title><link>https://blog.namln.org/mathematics/math-reading-list/</link><pubDate>Sat, 19 Aug 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/mathematics/math-reading-list/</guid><description>&lt;h2 class="heading" id="foundations-of-mathematics"&gt;
 Foundations of Mathematics&lt;span class="heading__anchor"&gt; &lt;a href="#foundations-of-mathematics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h2 class="heading" id="number-theory"&gt;
 Number Theory&lt;span class="heading__anchor"&gt; &lt;a href="#number-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h2 class="heading" id="algebra"&gt;
 Algebra&lt;span class="heading__anchor"&gt; &lt;a href="#algebra"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;1/ Lay, D. C., Lay, S. R., &amp;amp; McDonald, J. (2016). Linear algebra and its applications. Pearson Education.&lt;/p&gt;
&lt;p&gt;2/ Strang, G. (2019). Linear algebra and learning from data (Vol. 4). Cambridge: Wellesley-Cambridge Press.&lt;/p&gt;
&lt;h2 class="heading" id="combinatorics---graph-theory"&gt;
 Combinatorics - Graph Theory&lt;span class="heading__anchor"&gt; &lt;a href="#combinatorics---graph-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="graph-theory-books"&gt;
 Graph theory books&lt;span class="heading__anchor"&gt; &lt;a href="#graph-theory-books"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;[1] Lewis, R. (2015). A guide to graph colouring (Vol. 7). Berlin: Springer.&lt;/p&gt;
&lt;p&gt;[2] Tucker, A. (1994). Applied combinatorics. John Wiley &amp;amp; Sons, Inc.&lt;/p&gt;
&lt;p&gt;[3] Li, Y., &amp;amp; Lin, Q. (2022). Elementary Methods of Graph Ramsey Theory (Vol. 211). Springer Nature.&lt;/p&gt;
&lt;p&gt;[4] David Conlon - Extremal graph theory&lt;/p&gt;
&lt;p&gt;[5] Trudeau, R. J. (1994). Introduction to graph theory. Dover Publications.&lt;/p&gt;
&lt;p&gt;[6] Diestel, R. (2017). Graph theory. Graduate Texts in Mathematics, vol. 173. Springer.&lt;/p&gt;
&lt;p&gt;[7] Bondy, J. A., &amp;amp; Murty, U. S. R. (1976). Graph theory with applications (Vol. 290). London: Macmillan.&lt;/p&gt;
&lt;p&gt;[8] Bollobás, B. (1998). Modern graph theory (Vol. 184). Springer Science &amp;amp; Business Media.&lt;/p&gt;
&lt;p&gt;[9] Needham, M., &amp;amp; Hodler, A. E. (2019). Graph algorithms: practical examples in Apache Spark and Neo4j. O&amp;rsquo;Reilly Media.&lt;/p&gt;
&lt;p&gt;[10] Guia, J., Soares, V. G., &amp;amp; Bernardino, J. (2017, April). Graph Databases: Neo4j Analysis. In ICEIS (1) (pp. 351-356).&lt;/p&gt;
&lt;p&gt;[11] Harary, F. (1999). Graph theory. Perseus Books.&lt;/p&gt;
&lt;p&gt;[12] Bóna, M. (2016). A walk through combinatorics: An introduction to enumeration and graph theory. World Scientific.&lt;/p&gt;
&lt;p&gt;[13] Wilson, R. J. (1996). Introduction to graph theory (4th ed.). Addison Wesley.&lt;/p&gt;
&lt;p&gt;[14] Gross, J. L., Yellen, J., &amp;amp; Anderson, M. (2018). Graph theory and its applications (3rd ed.). Textbooks in Mathematics.&lt;/p&gt;
&lt;p&gt;[15] West, D. B. Introduction to graph theory.&lt;/p&gt;
&lt;h2 class="heading" id="geometry--topology"&gt;
 Geometry &amp;amp; Topology&lt;span class="heading__anchor"&gt; &lt;a href="#geometry--topology"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h2 class="heading" id="analysis"&gt;
 Analysis&lt;span class="heading__anchor"&gt; &lt;a href="#analysis"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h2 class="heading" id="probability--statistics"&gt;
 Probability &amp;amp; Statistics&lt;span class="heading__anchor"&gt; &lt;a href="#probability--statistics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;1/ Gould, R., &amp;amp; Ryan, C. N. (2015). Introductory statistics: Exploring the world through data. Pearson.&lt;/p&gt;
&lt;p&gt;2/ Johnson, R. A., &amp;amp; Wichern, D. W. (2002). Applied multivariate statistical analysis.&lt;/p&gt;
&lt;p&gt;3/ Härdle, W. K., &amp;amp; Simar, L. (2019). Applied multivariate statistical analysis. Springer Nature.&lt;/p&gt;
&lt;h2 class="heading" id="numerical-analysis"&gt;
 Numerical Analysis&lt;span class="heading__anchor"&gt; &lt;a href="#numerical-analysis"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h2 class="heading" id="signal-processing"&gt;
 Signal processing&lt;span class="heading__anchor"&gt; &lt;a href="#signal-processing"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h2 class="heading" id="applied-mathematics"&gt;
 Applied mathematics&lt;span class="heading__anchor"&gt; &lt;a href="#applied-mathematics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="machine-learning"&gt;
 Machine learning&lt;span class="heading__anchor"&gt; &lt;a href="#machine-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;1/ Deisenroth, M. P., Faisal, A. A., &amp;amp; Ong, C. S. (2020). Mathematics for machine learning. Cambridge University Press.&lt;/p&gt;
&lt;p&gt;2/ Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.&lt;/p&gt;
&lt;p&gt;3/ Koller, D., &amp;amp; Friedman, N. (2009). Probabilistic graphical models: principles and techniques. MIT Press.&lt;/p&gt;
&lt;p&gt;4/ Barber, D. (2012). &lt;a href="http://web4.cs.ucl.ac.uk/staff/D.Barber/textbook/090310.pdf"&gt;Bayesian reasoning and machine learning&lt;/a&gt;. Cambridge University Press.&lt;/p&gt;
&lt;p&gt;5/ Hastie, T., Tibshirani, R., &amp;amp; Friedman, J. H. (2009). The elements of statistical learning: data mining, inference, and prediction (2nd ed.). New York: Springer.&lt;/p&gt;
&lt;p&gt;6/ Mohri, M., Rostamizadeh, A., &amp;amp; Talwalkar, A. (2018). Foundations of machine learning. MIT Press.&lt;/p&gt;
&lt;p&gt;7/ Rasmussen, C. E., &amp;amp; Williams, C. K. I. (2006). Gaussian processes for machine learning. Cambridge, MA: MIT Press.&lt;/p&gt;
&lt;p&gt;8/ Vapnik, V. (1999). The nature of statistical learning theory. Springer Science &amp;amp; Business Media.&lt;/p&gt;
&lt;p&gt;9/ Shalev-Shwartz, S., &amp;amp; Ben-David, S. (2014). Understanding machine learning: From theory to algorithms. Cambridge University Press.&lt;/p&gt;
&lt;p&gt;10/ Wainwright, M. J., &amp;amp; Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Foundations and Trends® in Machine Learning, 1(1–2), 1-305.&lt;/p&gt;
&lt;h3 class="heading" id="optimization"&gt;
 Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;1/ Kochenderfer, M. J., &amp;amp; Wheeler, T. A. (2019). Algorithms for optimization. MIT Press.&lt;/p&gt;
&lt;p&gt;2/ Kochenderfer, M. J., Wheeler, T. A., &amp;amp; Wray, K. H. (2022). Algorithms for decision making. MIT Press.&lt;/p&gt;
&lt;p&gt;3/ Boyd, S. P., &amp;amp; Vandenberghe, L. (2004). Convex optimization. Cambridge University Press.&lt;/p&gt;
&lt;p&gt;4/ Bertsekas, D. (2009). Convex optimization theory (Vol. 1). Athena Scientific.&lt;/p&gt;
&lt;p&gt;5/ Papadimitriou, C. H., &amp;amp; Steiglitz, K. (1998). Combinatorial optimization: algorithms and complexity. Courier Corporation.&lt;/p&gt;
&lt;p&gt;6/ Cook, W. J., Cunningham, W. H., Pulleyblank, W. R., &amp;amp; Schrijver, A. (2009). Combinatorial optimization. Oberwolfach Reports, 5(4), 2875-2942.&lt;/p&gt;</description></item><item><title>Reading list on Graph Learning - Explainable artificial intelligence (xAI).</title><link>https://blog.namln.org/ai/xai-graph-reading-list/</link><pubDate>Sat, 19 Aug 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/ai/xai-graph-reading-list/</guid><description>&lt;h1 class="heading" id="xai-graph"&gt;
 XAI-Graph&lt;span class="heading__anchor"&gt; &lt;a href="#xai-graph"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;h2 class="heading" id="2023"&gt;
 2023&lt;span class="heading__anchor"&gt; &lt;a href="#2023"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Azzolin, S., Longa, A., Barbiero, P., Liò, P., &amp;amp; Passerini, A. (2022). &lt;a href="https://arxiv.org/abs/2210.07147"&gt;Global explainability of gnns via logic combination of learned concepts&lt;/a&gt;. arXiv preprint arXiv:2210.07147.&lt;/p&gt;
&lt;p&gt;[2] Miao, S., Luo, Y., Liu, M., &amp;amp; Li, P. (2022). &lt;a href="https://arxiv.org/abs/2210.16966"&gt;Interpretable Geometric Deep Learning via Learnable Randomness Injection&lt;/a&gt;. arXiv preprint arXiv:2210.16966.&lt;/p&gt;
&lt;p&gt;[3] Liu, Y., Zhang, X., &amp;amp; Xie, S. (2023, February). &lt;a href="https://openreview.net/forum?id=lRdhvzMpVYV"&gt;A Differential Geometric View and Explainability of GNN on Evolving Graphs&lt;/a&gt;. In The Eleventh International Conference on Learning Representations.&lt;/p&gt;
&lt;p&gt;[4] Wang, X., &amp;amp; Shen, H. W. (2022). &lt;a href="https://arxiv.org/abs/2209.07924"&gt;GNNInterpreter: A Probabilistic Generative Model-Level Explanation for Graph Neural Networks&lt;/a&gt;. arXiv preprint arXiv:2209.07924.&lt;/p&gt;
&lt;p&gt;[5] Xia, W., Lai, M., Shan, C., Zhang, Y., Dai, X., Li, X., &amp;amp; Li, D. (2023, February). &lt;a href="https://openreview.net/forum?id=BR_ZhvcYbGJ"&gt;Explaining Temporal Graph Models through an Explorer-Navigator Framework&lt;/a&gt;. In The Eleventh International Conference on Learning Representations.&lt;/p&gt;
&lt;h2 class="heading" id="2022"&gt;
 2022&lt;span class="heading__anchor"&gt; &lt;a href="#2022"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Zhang, S., Liu, Y., Shah, N., &amp;amp; Sun, Y. (2022, January). &lt;a href="https://openreview.net/forum?id=Qry8exovcNA"&gt;GStarX: Explaining Graph Neural Networks with Structure-Aware Cooperative Games&lt;/a&gt;. In Advances in Neural Information Processing Systems.&lt;/p&gt;
&lt;p&gt;[2] Xie, Y., Katariya, S., Tang, X., Huang, E., Rao, N., Subbian, K., &amp;amp; Ji, S. (2022). &lt;a href="https://arxiv.org/abs/2202.08335"&gt;Task-agnostic graph explanations&lt;/a&gt;. arXiv preprint arXiv:2202.08335.&lt;/p&gt;
&lt;p&gt;[3] Peng, X., Riedl, M., &amp;amp; Ammanabrolu, P. (2022). &lt;a href="https://proceedings.neurips.cc/paper_files/paper/2022/hash/672e44a114a41d5f34b97459877c083d-Abstract-Conference.html"&gt;Inherently explainable reinforcement learning in natural language&lt;/a&gt;. Advances in Neural Information Processing Systems, 35, 16178-16190.&lt;/p&gt;
&lt;p&gt;[4] Ma, J., Guo, R., Mishra, S., Zhang, A., &amp;amp; Li, J. (2022). &lt;a href="https://arxiv.org/abs/2210.08443"&gt;CLEAR: Generative Counterfactual Explanations on Graphs&lt;/a&gt;. arXiv preprint arXiv:2210.08443.&lt;/p&gt;
&lt;p&gt;[5] Xiong, P., Schnake, T., Montavon, G., Müller, K. R., &amp;amp; Nakajima, S. (2022, June). &lt;a href="https://proceedings.mlr.press/v162/xiong22a.html"&gt;Efficient Computation of Higher-Order Subgraph Attribution via Message Passing&lt;/a&gt;. In International Conference on Machine Learning (pp. 24478-24495). PMLR.&lt;/p&gt;
&lt;p&gt;[6] Miao, S., Liu, M., &amp;amp; Li, P. (2022, June). &lt;a href="https://proceedings.mlr.press/v162/miao22a.html"&gt;Interpretable and generalizable graph learning via stochastic attention mechanism&lt;/a&gt;. In International Conference on Machine Learning (pp. 15524-15543). PMLR.&lt;/p&gt;
&lt;p&gt;[7] Wu, Y. X., Wang, X., Zhang, A., He, X., &amp;amp; Chua, T. S. (2022). &lt;a href="https://arxiv.org/abs/2201.12872"&gt;Discovering invariant rationales for graph neural networks&lt;/a&gt;. arXiv preprint arXiv:2201.12872.&lt;/p&gt;
&lt;p&gt;[8] Feng, Q., Liu, N., Yang, F., Tang, R., Du, M., &amp;amp; Hu, X. (2023). &lt;a href="https://arxiv.org/abs/2305.12895"&gt;DEGREE: Decomposition based explanation for graph neural networks&lt;/a&gt;. arXiv preprint arXiv:2305.12895.&lt;/p&gt;
&lt;p&gt;[9] &lt;strong&gt;Tena Cucala, D. J., Cuenca Grau, B., Kostylev, E. V., &amp;amp; Motik, B. (2022). &lt;a href="https://ora.ox.ac.uk/objects/uuid:5d732bae-b80a-4439-8b4d-918a413a1765"&gt;Explainable GNN-based models over knowledge graphs&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[10] Dong, Y., Wang, S., Wang, Y., Derr, T., &amp;amp; Li, J. (2022, August). &lt;a href="https://dl.acm.org/doi/abs/10.1145/3534678.3539319"&gt;On structural explanation of bias in graph neural networks&lt;/a&gt;. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 316-326).&lt;/p&gt;
&lt;p&gt;[11] Liu, G., Zhao, T., Xu, J., Luo, T., &amp;amp; Jiang, M. (2022, August). &lt;a href="https://dl.acm.org/doi/abs/10.1145/3534678.3539347"&gt;Graph rationalization with environment-based augmentations&lt;/a&gt;. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 1069-1078).&lt;/p&gt;
&lt;p&gt;[12] Wang, P., Cai, R., &amp;amp; Wang, H. (2022, April). &lt;a href="https://dl.acm.org/doi/abs/10.1145/3485447.3512168"&gt;Graph-based Extractive Explainer for Recommendations&lt;/a&gt;. In Proceedings of the ACM Web Conference 2022 (pp. 2163-2171).&lt;/p&gt;
&lt;p&gt;[13] Tan, J., Geng, S., Fu, Z., Ge, Y., Xu, S., Li, Y., &amp;amp; Zhang, Y. (2022, April). &lt;a href="https://dl.acm.org/doi/abs/10.1145/3485447.3511948"&gt;Learning and evaluating graph neural network explanations based on counterfactual and factual reasoning&lt;/a&gt;. In Proceedings of the ACM Web Conference 2022 (pp. 1018-1027).&lt;/p&gt;
&lt;p&gt;[14] Islam, S. M., &amp;amp; Bhattacharya, S. (2022, April). &lt;a href="https://dl.acm.org/doi/abs/10.1145/3485447.3511941"&gt;AR-BERT: Aspect-relation enhanced Aspect-level Sentiment Classification with Multi-modal Explanations&lt;/a&gt;. In Proceedings of the ACM Web Conference 2022 (pp. 987-998).&lt;/p&gt;
&lt;p&gt;[15] Zhang, Z., Liu, Q., Wang, H., Lu, C., &amp;amp; Lee, C. (2022, June). &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/view/20898"&gt;ProtGNN: Towards self-explaining graph neural networks&lt;/a&gt;. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 8, pp. 9127-9135).&lt;/p&gt;
&lt;p&gt;[16] Feng, A., You, C., Wang, S., &amp;amp; Tassiulas, L. (2022, June). &lt;a href="https://ojs.aaai.org/index.php/AAAI/article/view/20615"&gt;KerGNNs: Interpretable graph neural networks with graph kernels&lt;/a&gt;. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 6, pp. 6614-6622).&lt;/p&gt;
&lt;p&gt;[17] &lt;strong&gt;Aglionby, G., &amp;amp; Teufel, S. (2022, December). &lt;a href="https://aclanthology.org/2022.emnlp-main.743/"&gt;Faithful Knowledge Graph Explanations in Commonsense Question Answering&lt;/a&gt;. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 10811-10817).&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[18] Li, X., Zhang, X., JiaHao, P., Mao, R., Zhou, M., Xie, X., &amp;amp; Liao, H. (2022, December). &lt;a href="https://aclanthology.org/2022.emnlp-main.216/"&gt;A Joint Learning Framework for Restaurant Survival Prediction and Explanation&lt;/a&gt;. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (pp. 3285-3297).&lt;/p&gt;
&lt;h2 class="heading" id="2021"&gt;
 2021&lt;span class="heading__anchor"&gt; &lt;a href="#2021"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Shan, C., Shen, Y., Zhang, Y., Li, X., &amp;amp; Li, D. (2021). &lt;a href="https://proceedings.neurips.cc/paper/2021/hash/be26abe76fb5c8a4921cf9d3e865b454-Abstract.html"&gt;Reinforcement learning enhanced explainer for graph neural networks&lt;/a&gt;. Advances in Neural Information Processing Systems, 34, 22523-22533.&lt;/p&gt;
&lt;p&gt;[2] Wang, X., Wu, Y., Zhang, A., He, X., &amp;amp; Chua, T. S. (2021). &lt;a href="https://proceedings.neurips.cc/paper/2021/hash/99bcfcd754a98ce89cb86f73acc04645-Abstract.html"&gt;Towards multi-grained explainability for graph neural networks&lt;/a&gt;. Advances in Neural Information Processing Systems, 34, 18446-18458.&lt;/p&gt;
&lt;p&gt;[3] Bajaj, M., Chu, L., Xue, Z. Y., Pei, J., Wang, L., Lam, P. C. H., &amp;amp; Zhang, Y. (2021). &lt;a href="https://proceedings.neurips.cc/paper/2021/hash/2c8c3a57383c63caef6724343eb62257-Abstract.html"&gt;Robust counterfactual explanations on graph neural networks&lt;/a&gt;. Advances in Neural Information Processing Systems, 34, 5644-5655.&lt;/p&gt;
&lt;p&gt;[4] Yuan, H., Yu, H., Wang, J., Li, K., &amp;amp; Ji, S. (2021, July). &lt;a href="http://proceedings.mlr.press/v139/yuan21c.html"&gt;On explainability of graph neural networks via subgraph explorations&lt;/a&gt;. In International Conference on Machine Learning (pp. 12241-12252). PMLR.&lt;/p&gt;
&lt;p&gt;[5] Lin, W., Lan, H., &amp;amp; Li, B. (2021, July). &lt;a href="https://proceedings.mlr.press/v139/lin21d.html"&gt;Generative causal explanations for graph neural networks&lt;/a&gt;. In International Conference on Machine Learning (pp. 6666-6679). PMLR.&lt;/p&gt;
&lt;p&gt;[6] Henderson, R., Clevert, D. A., &amp;amp; Montanari, F. (2021, July). &lt;a href="https://proceedings.mlr.press/v139/henderson21a.html"&gt;Improving molecular graph neural network explainability with orthonormalization and induced sparsity&lt;/a&gt;. In International Conference on Machine Learning (pp. 4203-4213). PMLR.&lt;/p&gt;
&lt;p&gt;[7] Wang, X., Fan, S., Kuang, K., &amp;amp; Zhu, W. (2021, July). &lt;a href="http://proceedings.mlr.press/v139/wang21f.html"&gt;Explainable automated graph representation learning with hyperparameter importance&lt;/a&gt;. In International Conference on Machine Learning (pp. 10727-10737). PMLR.&lt;/p&gt;
&lt;p&gt;[8] Faber, L., K. Moghaddam, A., &amp;amp; Wattenhofer, R. (2021, August). &lt;a href="https://dl.acm.org/doi/abs/10.1145/3447548.3467283"&gt;When comparing to ground truth is wrong: On evaluating GNN explanation methods&lt;/a&gt;. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp;amp; Data Mining (pp. 332-341).&lt;/p&gt;
&lt;p&gt;[9] Abrate, C., &amp;amp; Bonchi, F. (2021, August). &lt;a href="https://dl.acm.org/doi/abs/10.1145/3447548.3467154"&gt;Counterfactual graphs for explainable classification of brain networks&lt;/a&gt;. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp;amp; Data Mining (pp. 2495-2504).&lt;/p&gt;
&lt;p&gt;[10] Liu, Y., Chen, C., Liu, Y., Zhang, X., &amp;amp; Xie, S. (2021, December). &lt;a href="https://ieeexplore.ieee.org/abstract/document/9679172/"&gt;Multi-objective Explanations of GNN Predictions&lt;/a&gt;. In 2021 IEEE International Conference on Data Mining (ICDM) (pp. 409-418). IEEE.&lt;/p&gt;
&lt;p&gt;[11] Gao, Y., Sun, T., Bhatt, R., Yu, D., Hong, S., &amp;amp; Zhao, L. (2021, December). &lt;a href="https://ieeexplore.ieee.org/abstract/document/9679041/"&gt;GNES: Learning to explain graph neural networks&lt;/a&gt;. In 2021 IEEE International Conference on Data Mining (ICDM) (pp. 131-140). IEEE.&lt;/p&gt;
&lt;p&gt;[12] Fan, Y., Yao, Y., &amp;amp; Joe-Wong, C. (2021, December). &lt;a href="https://ieeexplore.ieee.org/abstract/document/9679020/"&gt;GCN-SE: Attention as explainability for node classification in dynamic graphs&lt;/a&gt;. In 2021 IEEE International Conference on Data Mining (ICDM) (pp. 1060-1065). IEEE.&lt;/p&gt;
&lt;h2 class="heading" id="2020"&gt;
 2020&lt;span class="heading__anchor"&gt; &lt;a href="#2020"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[1] Vu, M., &amp;amp; Thai, M. T. (2020). &lt;a href="https://proceedings.neurips.cc/paper/2020/hash/8fb134f258b1f7865a6ab2d935a897c9-Abstract.html"&gt;PGM-Explainer: Probabilistic graphical model explanations for graph neural networks&lt;/a&gt;. Advances in Neural Information Processing Systems, 33, 12225-12235.&lt;/p&gt;
&lt;p&gt;[2] Luo, D., Cheng, W., Xu, D., Yu, W., Zong, B., Chen, H., &amp;amp; Zhang, X. (2020). &lt;a href="https://proceedings.neurips.cc/paper/2020/hash/e37b08dd3015330dcbb5d6663667b8b8-Abstract.html"&gt;Parameterized explainer for graph neural network&lt;/a&gt;. Advances in Neural Information Processing Systems, 33, 19620-19631.&lt;/p&gt;
&lt;p&gt;[3] Sanchez-Lengeling, B., Wei, J., Lee, B., Reif, E., Wang, P., Qian, W., &amp;hellip; &amp;amp; Wiltschko, A. (2020). &lt;a href="https://proceedings.neurips.cc/paper/2020/hash/417fbbf2e9d5a28a855a11894b2e795a-Abstract.html"&gt;Evaluating attribution for graph neural networks&lt;/a&gt;. Advances in Neural Information Processing Systems, 33, 5898-5910.&lt;/p&gt;</description></item><item><title>Research &amp; Teaching</title><link>https://blog.namln.org/en/research/</link><pubDate>Sat, 19 Aug 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/research/</guid><description>&lt;p&gt;&lt;img src="https://iad4phd.wordpress.com/wp-content/uploads/2018/02/picture1.jpg" alt=""&gt;&lt;/p&gt;
&lt;h2 class="heading" id="research-in-mathematics-and-computational"&gt;
 Research in &lt;a href="https://blog.namln.org/en/mathematics/"&gt;Mathematics&lt;/a&gt; and Computational&lt;span class="heading__anchor"&gt; &lt;a href="#research-in-mathematics-and-computational"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;(Vietnamese Translation) &lt;a href="https://dec41.user.srcf.net/notes/"&gt;Cambridge Notes&lt;/a&gt;. You can access it using this &lt;a href="https://blog.namln.org/research/lecture-notes/cam-notes.md"&gt;link&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;(Vietnamese Translation) &lt;a href="https://pillowmath.github.io/"&gt;Daniel Raban&amp;rsquo;s Note Repository&lt;/a&gt;. You can access it using this &lt;a href="https://blog.namln.org/research/lecture-notes/dr-notes.md"&gt;link&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="research-in-computer-science-and-machine-learning"&gt;
 Research in &lt;a href="https://blog.namln.org/en/theoretical-computer-science/"&gt;Computer Science&lt;/a&gt; and Machine Learning&lt;span class="heading__anchor"&gt; &lt;a href="#research-in-computer-science-and-machine-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/research/ds/"&gt;Data Analytics (General)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/research/graph-analytics/"&gt;Graph Data Analytics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/research/ml-co/"&gt;Machine Learning for Combinatorial Optimization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=""&gt;Reinforcement Learning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=""&gt;Singular Learning Theory&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="teaching-assistant--fit-hcmus"&gt;
 &lt;a href="https://blog.namln.org/en/teaching/"&gt;Teaching Assistant @ FIT-HCMUS&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#teaching-assistant--fit-hcmus"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;Applied in Data Science&lt;/li&gt;
&lt;li&gt;Data Hiding and Secret Sharing&lt;/li&gt;
&lt;li&gt;Data Structures and Algorithms&lt;/li&gt;
&lt;li&gt;Data Mining and Applications&lt;/li&gt;
&lt;li&gt;Data Visualization&lt;/li&gt;
&lt;li&gt;Fundamentals of Artificial Intelligence&lt;/li&gt;
&lt;li&gt;Fundamentals of Programming&lt;/li&gt;
&lt;li&gt;Introduction to Programming&lt;/li&gt;
&lt;li&gt;Introduction to Data Science&lt;/li&gt;
&lt;li&gt;Introduction to Machine Learning&lt;/li&gt;
&lt;li&gt;Introduction to Big Data&lt;/li&gt;
&lt;li&gt;Introduction to Information Technology&lt;/li&gt;
&lt;li&gt;Graph Mining&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/tcs/parallel-programming/"&gt;Parallel Programming&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Programming for Data Science&lt;/li&gt;
&lt;li&gt;Swarm Intelligence&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="miscellanea"&gt;
 &lt;a href="https://blog.namln.org/en/miscellanea/"&gt;Miscellanea&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#miscellanea"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;$\LaTeX$ Resources&lt;/li&gt;
&lt;li&gt;Blogs and Advice&lt;/li&gt;
&lt;li&gt;Mathematical Journals&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;JSTOR (lots of back issues of journals)&lt;/li&gt;
&lt;li&gt;Electronic Library of Mathematics (lots of free online journals, proceedings, etc.)&lt;/li&gt;
&lt;li&gt;Math Journal Archive&lt;/li&gt;
&lt;li&gt;Elsevier Science, ScienceDirect, SpringerOnline, SpringerLink, Kluwer Online Journals, Birkhauser, Cambridge University Press, AMS Journals, SIAM Journals, INFORMS Journals, ACM Journals, Project Euclid, Wiley Interscience, World Scientific, Marcel Dekker, Taylor &amp;amp; Francis, Palgrave Macmillan&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Mathematical books: Academic Press, A K Peters, AMS, Birkhauser, Cambridge, CRC Press, Dover, INFORMS, International Press, Kluwer, Oxford, Prentice-Hall, SIAM, Springer, Wiley, World Scientific.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a href="https://blog.namln.org/en/miscellanea/"&gt;(More)&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Timeless Quotes</title><link>https://blog.namln.org/en/quotations/</link><pubDate>Sat, 19 Aug 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/quotations/</guid><description>&lt;div class="quote-container"&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;There are some things which cannot be learned quickly, and time, which is all we have, must be paid heavily for their acquiring. They are the very simplest things, and because it takes a man’s life to know them the little new that each man gets from life is very costly and the only heritage he has to leave.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ernest Hemingway (From A. E. Hotchner, &lt;i&gt;Papa Hemingway&lt;/i&gt;, Random House, NY, 1966)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We are punished by our sins, not for them.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Elbert Hubbard&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;the lyf so short, the craft so long to lerne&lt;/p&gt;
 &lt;p class="attribution"&gt;– Chaucer (1340-1400)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Ars longa, vita brevis, occasio praeceps, experimentum periculosum, iudicium difficile (Life is short, [the] craft long, opportunity fleeting, experiment treacherous, judgment difficult.)&lt;/p&gt;
 &lt;p class="attribution"&gt;– Hippocrates (c. 400BC)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;‘the cat sat on the mat’ is not the beginning of a story, but ‘the cat sat on the dog’s mat’ is.&lt;/p&gt;
 &lt;p class="attribution"&gt;– John le Carré (David John Moore Cornwell)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Excellence in any department can be attained only by the labor of a lifetime; it is not to be purchased at a lesser price.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Samuel Johnson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Only one who devotes himself to a cause with his whole strength and soul can be a true master. For this reason mastery demands all of a person.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Einstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Books are attracted to me. They make a beeline for me, and stick to me. I have been so fond of them that at last they have begun to reciprocate. In my hands books burst like ripe fruit. Like magic flowers they unfold their petals to show me the vital thought, the suggestive word, the confirming quotation, the decisive illustration.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Sergei Eisenstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If we concentrate our attention on trying to solve a problem of geometry, and if at the end of an hour we are no nearer to doing so than at the beginning, we have nevertheless been making progress each minute of that hour in another more mysterious dimension. Without knowing or feeling it, this apparent barren effort has brought more light into the soul.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Simone Weil&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We see things not as they are, but as we are.&lt;/p&gt;
 &lt;p class="attribution"&gt;– The Talmud&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The scientist does not study nature because it is useful; he studies it because he delights in it, and he delights in it because it is beautiful. If nature were not beautiful, it would not be worth knowing, and if nature were not worth knowing, life would not be worth living.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Henri Poincaré&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;A noble man compares and estimates himself by an idea which is higher than himself; and a mean man, by one lower than himself. The one produces aspiration; the other ambition, which is the way in which a vulgar man aspires.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Joseph Conrad&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Believe that none of the effort you put into coming closer to God is ever wasted – even if in the end you don’t achieve what you are striving for.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rebbe Nachman of Breslov&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;When you look at a human being, you see his hands working, his feet walking, his mouth talking. You don’t see his heart, his brain, his lungs and kidneys. They work quietly, inside. But they are the essential organs of life. The world, too, has hands and feet—those who are making the news, moving things around, shaking things up. The heart, the inner organs, they are those who work quietly from the inside, those unnoticed, those who do a simple act of kindness with no thought of reward.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rabbi M. M. Schneerson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Too many people spend money they haven’t earned to buy things they don’t want to impress people they don’t like.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Will Rogers&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;When you thwart what’s real about you in order to keep creating content for financial need, you’re just not gonna make it. You’re not gonna keep going. You have your number. It’s very dangerous to be liked by more people than should like you. It’s bad for them, and it’s bad for you. There’s gonna be a shock down the road for them, or you’re gonna dilute yourself and take yourself to a place where you can’t live with who you are. I think that you make an honest account of who you are and you live with the results. The results will be appropriate to who you are… If you’re saying things just to piss people off, then I don’t know why do it. If you’re saying things just to please people, that’s a short-lived victory. But if you just say the things you believe, and the things you like to say, and that mean something to you — if you stay close to the gut — then everything will work itself out.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Louis C.K.&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;To exist is to change, to change is to mature, to mature is to go on creating oneself endlessly.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Henri Bergson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;What we have done for ourselves alone dies with us; what we have done for others and the world remains and is immortal.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Pike&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Perhaps all the dragons of our lives are princesses who are only waiting to see us once beautiful and brave.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rainer Maria Rilke&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;You must stay drunk on writing so reality cannot destroy you.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ray Bradbury&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The ultimate test of a man’s conscience may be his willingness to sacrifice something today for future generations whose words of thanks will not be heard.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Gaylord Nelson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Finish each day and be done with it. You have done what you could; some blunders and absurdities have crept in; forget them as soon as you can. Tomorrow is a new day; you shall begin it serenely and with too high a spirit to be encumbered with your old nonsense.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ralph Waldo Emerson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Marriage is an alliance entered into by a man who can’t sleep with the window shut and a woman who can’t sleep with the window open.&lt;/p&gt;
 &lt;p class="attribution"&gt;– George Bernard Shaw&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If you think education is expensive, try ignorance.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Derek Bok&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;People talk about “wasting time,” or even “killing time.” Neither term is accurate. Time does not belong to you that you can waste it. Yet neither does it have a life of its own that you can take away. Rather, time awaits you to give it life.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rabbi M. M. Schneerson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Most folks are about as happy as they make up their minds to be.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Abraham Lincoln&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;One who loves must learn fear. One who fears must learn love. The thinker must do. The doer must think. The pacifist must fight, the fighter must find peace. If you flow as a river, burn as a fire. If you burn as a furnace, flow as a river. If you fly as a bird, sit firm as a rock. If you sit firmly, then fly as a bird. Be a fire that flows. A rock that flies. Love with fear and fear with love. For we are not fire, not water, not air, not rocks, not thoughts, not deeds, not fear, not love. We are G-dly beings.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rabbi M. M. Schneerson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;When you come to the end of all the light you know, and it’s time to step into the darkness of the unknown, faith is knowing that one of two things shall happen: Either you will be given something solid to stand on or you will be taught to fly.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Edward Teller&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Whatever you can do, or dream you can do, begin it. Boldness has genius and power and magic in it.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Johann Goethe (John Anster’s translation of Faust)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;It is impossible to enjoy idling thoroughly unless one has plenty of work to do.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Jerome K. Jerome&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Every society honors its live conformists and its dead troublemakers.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Mignon McLaughlin&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;You can easily judge the character of a man by how he treats those who can do nothing for him.&lt;/p&gt;
 &lt;p class="attribution"&gt;– James D. Miles&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;In our thinking…we attribute to this concept of the bodily object a significance, which is to a high degree independent of the sense impression which originally gives rise to it. This is what we mean when we attribute to the bodily object a real existence. …By means of such concepts and mental relations between them, we are able to orient ourselves in the labyrinth of sense impressions. These notions and relations…appear to us as stronger and more unalterable than the individual sense experience itself, the character of which as anything other than the result of an illusion or hallucination is never completely guaranteed.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Einstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Praise and blame, gain and loss, pleasure and sorrow come and go like the wind. To be happy, rest like a giant tree in the midst of them all.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Buddha&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I am always doing things I can’t do, that’s how I get to do them.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Pablo Picasso&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;This above all: to thine own self be true. And it must follow, as the night the day, Thou canst not then be false to any man.&lt;/p&gt;
 &lt;p class="attribution"&gt;– William Shakespeare&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If the world is cold make it your business to build fires.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Horace Traubel&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Nearly all men can stand adversity, but if you want to test a man’s character, give him power.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Abraham Lincoln&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Your work is to discover your work and then, with all your heart, to give yourself to it.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Buddha&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Strive to realize a state of inward happiness, independent of circumstances.&lt;/p&gt;
 &lt;p class="attribution"&gt;– J.P. Greaves&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;When one door of happiness closes, another opens; but often we look so long at the closed door that we do not see the one which has opened for us.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Helen Keller&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I keep six honest serving men (They taught me all I know) Their names are What and Why and When And How and Where and Who&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rudyard Kipling, in Just So Stories&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Whatsoever is, is in God, and without God nothing can be, or be conceived.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Baruch Spinoza&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We must not forget that when radium was discovered no one knew that it would prove useful in hospitals. The work was one of pure science. And this is a proof that scientific work must not be considered from the point of view of the direct usefulness of it. It must be done for itself, for the beauty of science, and then there is always the chance that a scientific discovery may become like the radium a benefit for humanity.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Marie Curie&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I believe that a scientist looking at nonscientific problems is just as dumb as the next guy.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Richard Feynman&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;To be what we are, and to become what we are capable of becoming, is the only end in life.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Baruch Spinoza&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The highest activity a human being can attain is learning for understanding, because to understand is to be free.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Baruch Spinoza&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I call him free who is led solely by reason.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Baruch Spinoza&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;God is the indwelling and not the transient cause of all things.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Baruch Spinoza&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;He who finds a thought that enables him to obtain a slightly deeper glimpse into the eternal secrets of nature has been given great grace.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Einstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Watch your thoughts; they become words. Watch your words; they become actions. Watch your actions, they become habits. Watch your habits, they become character. Watch your character; it becomes your destiny.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Frank Outlaw&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Creativity is God’s gift to you. What you do with it is your gift to God.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Bob Moawad&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;In the long run men hit only what they aim at. Therefore, though they should fail immediately, they had better aim at something high.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Henry David Thoreau&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We act as though comfort and luxury were the chief requirements of life, when all that we need to make us really happy is something to be enthusiastic about.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Charles Kingsley&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I have always believed that whatever good or bad fortune may come our way we can always give it meaning and transform it into something of value.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Hermann Hesse&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;It is even harder for the average ape to believe that he has descended from man.&lt;/p&gt;
 &lt;p class="attribution"&gt;– H.L. Mencken&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Truth, like gold, is to be obtained not by its growth, but by washing away from it all that is not gold.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Leo Tolstoy&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If you do not change direction, you may end up where you are heading.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Lao Tzu&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The best thing for being sad is to learn something. That is the only thing that never fails. You may grow old and trembling in your anatomies, you may lie awake at night listening to the disorder of your veins, you may miss your only love, you may see the world about you devastated by evil lunatics, or know your honor trampled in the sewers of baser minds. There is only one thing for it then: to learn. Learn why the world wags and what wags it. That is the only thing which the mind can never exhaust, never alienate, never be tortured by, never fear or distrust, and never dream of regretting.&lt;/p&gt;
 &lt;p class="attribution"&gt;– T. H. White, in The Once and Future King&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;What we hope ever to do with ease we may learn first to do with diligence.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Samuel Johnson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The way is long if one follows precepts, but short… if one follows patterns.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Lucius Annaeus Seneca&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Find out just what any people will quietly submit to and you have found out the exact measure of injustice and wrong which will be imposed upon them.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Frederick Douglass&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Somewhere, something incredible is waiting to be known.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Carl Sagan&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Principles for the Development of a Complete Mind: Study the science of art. Study the art of science. Develop your senses – especially learn how to see. Realise that everything connects to everything else.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Leonardo DaVinci&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I have come here to chew bubblegum and kick ass … and I’m all out of bubblegum.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Nada, in They Live (1988) by John Carpenter&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Where the mind is without fear and the head is held high Where knowledge is free Where the world has not been broken up into fragments By narrow domestic walls Where the words come out From the depth of truth Where the tireless striving stretches its arms towards perfection Where the clear stream of reason has not lost its way into the dreary desert sand of dead habit Where the mind is led forward by thee In ever widening thought and action Into that heaven of freedom, my father Let my country awake.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rabindranath Tagore (from Gitanjali)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;By all means marry; if you get a good wife, you’ll become happy; if you get a bad one, you’ll become a philosopher.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Socrates&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The belief in an external world independent of the perceiving subject is the basis of all natural science. Since, however, sense perception only gives information of this external world or of “physical reality” indirectly, we can only grasp the latter by speculative means. It follows from this that our notions of physical reality can never be final. We must always be ready to change these notions – that is to say, the axiomatic basis of physics – in order to do justice to perceived facts in the most perfect way logically.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Einstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I love you when you bow in your mosque, kneel in your temple, pray in your church. For you and I are sons of one religion, and it is the spirit.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Kahlil Gibran&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;To find yourself, think for yourself.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Socrates&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The earth is but one country, and mankind its citizens.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Baha’u’llah&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;There is only one good, knowledge, and one evil, ignorance.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Socrates&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;For every complicated problem there is a solution that is simple, direct, understandable, and wrong.&lt;/p&gt;
 &lt;p class="attribution"&gt;– H. L. Mencken&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is.&lt;/p&gt;
 &lt;p class="attribution"&gt;– John Louis von Neumann&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The only true wisdom is in knowing you know nothing.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Socrates&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;It’s only recently that I’ve come to understand that writers are not marginal to our society, that they, in fact, do all our thinking for us, that we are writing myths and our myths are believed, and that old myths are believed until someone writes a new one.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Kurt Vonnegut&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;All ads do the same: create an anxiety relievable by purchase.&lt;/p&gt;
 &lt;p class="attribution"&gt;– David Foster Wallace&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Beginnings are hard. For good reason. If they were easy, we would prowl into each new venture like a snug fat cat. When you begin pent up in an iron cage, a new life emerges. A tiger that breaks through the door of its cage and pounces with a vengeance. Bless those cages, those impossible brick walls, those rivers of fire that lie at the outset of each worthwhile journey. Without them we would be only as powerful as we appear.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rabbi M. M. Schneerson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I really think the mark of experience isn’t the ability to write a lot of good pages, it’s the ability to generate shitty pages faster without worrying so much about it.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Justin Marks&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The more subtle and elegant you are in hiding your plot points, the better you are as a writer.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Billy Wilder&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Inspiration does exist, but it must find you working.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Pablo Picasso&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Every character should want something, even if it is only a glass of water.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Kurt Vonnegut&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;There is no abstract art. You must always start with something. Afterward you can remove all traces of reality.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Pablo Picasso&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;To the complaint, ‘There are no people in these photographs,’ I respond, There are always two people: the photographer and the viewer.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ansel Adams&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The more abstract is form, the more clear and direct its appeal.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Wassily Kandinsky&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The artist must have something to say, for mastery over form is not his goal but rather the adapting of form to its inner meaning.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Wassily Kandinsky&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Treat a man as he appears to be, and you make him worse. But treat a man as if he were what he potentially could be, and you make him what he should be.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Johann Wolfgang von Goethe&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Pure mathematics is, in its way, the poetry of logical ideas. One seeks the most general ideas of operation which will bring together in simple, logical and unified form the largest possible circle of formal relationships. In this effort toward logical beauty spiritual formulas are discovered necessary for the deeper penetration into the laws of nature.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Einstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Pursue some path, however narrow and crooked, in which you can walk with love and reverence.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Henry David Thoreau&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If any man wish to write in a clear style, let him be first clear in his thoughts; and if any would write in a noble style, let him first possess a noble soul.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Johann Wolfgang von Goethe&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Only the curious will learn, only the resolute overcome the obstacles to learning. The Quest quotient has always excited me more than the intelligence quotient.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Eugene S. Wilson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Human beings can attain a worthy and harmonious life only if they are able to rid themselves, within the limits of human nature, of striving to fulfill wishes of the material kind.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Einstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;You can’t wait for inspiration. You have to go after it with a club.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Jack London&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If you don’t have time to read, you don’t have the time (or the tools) to write.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Stephen King&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;A professor must have a theory as a dog must have fleas.&lt;/p&gt;
 &lt;p class="attribution"&gt;– H. L. Mencken&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We shall not cease from exploration, and the end of all our exploring will be to arrive where we started, and know the place for the first time.&lt;/p&gt;
 &lt;p class="attribution"&gt;– T. S. Eliot&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Humankind has not woven the web of life. We are but one thread within it. Whatever we do to the web we do to ourselves. All things are bound together. All things are connected.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Chief Seattle&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We do not inherit the earth from our ancestors, we borrow it from our children.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Native American Proverb&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We cannot command Nature except by obeying her.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Francis Bacon&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Each player must accept the cards life deals him or her: but once they are in hand, he or she alone must decide how to play the cards in order to win the game.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Voltaire&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We are what we think. All that we are arises with our thoughts. With our thoughts, we make the world.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Buddha&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Every intellectual has a very special responsibility. He has the privilege and opportunity of studying. In return, he owes it to his fellow men (or ‘to society’) to represent the results of his study as simply, clearly and modestly as he can. The worst thing that intellectuals can do – the cardinal sin – is to try to set themselves up as great prophets vis-a-vis their fellow men and to impress them with puzzling philosophies. Anyone who cannot speak simply and clearly should say nothing and continue to work until he can do so.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Karl Popper&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;A man who stands for nothing will fall for anything.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Malcolm X&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;You need not leave your room. Remain sitting at your table and listen. You need not even listen, simply wait, just learn to become quiet, and still, and solitary. The world will freely offer itself to you to be unmasked. It has no choice; it will roll in ecstasy at your feet.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Franz Kafka&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Creativity is essentially a lonely art. An even lonelier struggle. To some a blessing. To others a curse. It is in reality the ability to reach inside yourself and drag forth from your very soul an idea.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Lou Dorfsman&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Life is not easy for any of us. But what of that? We must have perseverance and above all confidence in ourselves. We must believe that we are gifted for something, and that this thing, at whatever cost, must be attained.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Marie Curie&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Nothing in this world can take the place of persistence. Talent will not; nothing is more common than unsuccessful people with talent. Genius will not; unrewarded genius is almost a proverb. Education will not; the world is full of educated derelicts. Persistence and determination alone are omnipotent. The slogan “press on” has solved and always will solve the problems of the human race.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Calvin Coolidge&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Education is the passport to the future, for tomorrow belongs to those who prepare for it today.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Malcolm X&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Jump off the cliff and build your wings on the way down.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ray Bradbury&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;However great a man’s natural talent may be, the art of writing cannot be learned all at once.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Jean Jacques Rousseau&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Talent is cheaper than table salt. What separates the talented individual from the successful one is a lot of hard work.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Stephen King&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;My ambition is to find freedom, without taking it from someone else.&lt;/p&gt;
 &lt;p class="attribution"&gt;– George Dyson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;writing = ass + chair&lt;/p&gt;
 &lt;p class="attribution"&gt;– Oliver Stone&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Truth is a demure lady, much too ladylike to knock you on your head and drag you to her cave. She is there, but people must want her, and seek her out.&lt;/p&gt;
 &lt;p class="attribution"&gt;– William F. Buckley&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Death is a dignitary who when he comes announced is to be received with formal manifestations of respect, even by those most familiar with him. In the code of military etiquette silence and fixity are forms of deference.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ambrose Bierce (from An Occurrence at Owl Creek, 1890)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Don’t market yourself. Editors and readers don’t know what they want until they see it. Scratch what itches. Write what you need to write, feed the hunger for meaning in your life. Play at the serious questions of life and death.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Donald M. Murray&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Never, under any circumstances, hate a movie. It won’t help you and it’s a waste of time. There’s plenty of reasons to not like a movie. But if you hate them? Meaning if you let them bother you? Then they’ll do nothing but bother you. And I mean if you want to do this for a fucking living and you’re absolutely serious, then never hate a movie. You can learn so much about the craft from bad movies. Bad movies teach you what not to do and what to correct in your process and that’s way more helpful. And fuck man, hating movies closes you off to stuff that seems like whatever you hate. Or stuff by the same guy. And who knows? That other stuff could be awesome. Some of my favorite filmmakers made bad movies. It won’t help you. It just won’t. It stops your development right in its tracks, okay? I mean like everything and I ain’t trying to get you to be like me or anything. I’m just saying I think it’s better for you. And it makes me way, way happier. Never hate a movie. They’re gifts. Every fucking one of em.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Quentin Tarantino&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I think everybody should get rich and famous and do everything they ever dreamed of so they can see that it’s not the answer.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Jim Carrey&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Due to circumstances beyond my control, I am the master of my fate and captain of my soul.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ashleigh Brilliant (variant from a line in the poem “Invictus” by William Earnest Henley, written in 1875)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Entertain yourself. Luck comes just as often (and just as rarely) to every writer. Don’t be the writer that got lucky doing something they hate.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Dan Harmon&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If there’s a book you really want to read but it hasn’t been written yet, then you must write it.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Toni Morrison&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I’d just say to aspiring journalists or writers – who I meet a lot of – do it now. Don’t wait for permission to make something that’s interesting or amusing to you. Just do it now. Don’t wait. Find a story idea, start making it, give yourself a deadline, show it to people who’ll give you notes to make it better. Don’t wait till you’re older, or in some better job than you have now. Don’t wait for anything. Don’t wait till some magical story idea drops into your lap. That’s not where ideas come from. Go looking for an idea and it’ll show up. Begin now. Be a fucking soldier about it and be tough.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ira Glass&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Success consists of going from failure to failure without loss of enthusiasm.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Winston Churchill&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;All the gods, all the heavens, all the hells, are within you.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Joseph Campbell&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If we concentrate our attention on trying to solve a problem of geometry, and if at the end of an hour we are no nearer to doing so than at the beginning, we have nevertheless been making progress each minute of that hour in another more mysterious dimension. Without knowing or feeling it, this apparently barren effort has brought more light into the soul.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Simone Weil&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We see things not as they are, but as we are.&lt;/p&gt;
 &lt;p class="attribution"&gt;– The Talmud&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The scientist does not study nature because it is useful; he studies it because he delights in it, and he delights in it because it is beautiful. If nature were not beautiful, it would not be worth knowing, and if nature were not worth knowing, life would not be worth living.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Henri Poincaré&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;A noble man compares and estimates himself by an idea which is higher than himself; and a mean man, by one lower than himself. The one produces aspiration; the other ambition, which is the way in which a vulgar man aspires.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Joseph Conrad&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Believe that none of the effort you put into coming closer to God is ever wasted – even if in the end you don’t achieve what you are striving for.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rebbe Nachman of Breslov&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Too many people spend money they haven’t earned to buy things they don’t want to impress people they don’t like.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Will Rogers&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;To exist is to change, to change is to mature, to mature is to go on creating oneself endlessly.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Henri Bergson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;What we have done for ourselves alone dies with us; what we have done for others and the world remains and is immortal.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Pike&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Perhaps all the dragons of our lives are princesses who are only waiting to see us once beautiful and brave.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rainer Maria Rilke&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The ultimate test of a man’s conscience may be his willingness to sacrifice something today for future generations whose words of thanks will not be heard.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Gaylord Nelson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Finish each day and be done with it. You have done what you could; some blunders and absurdities have crept in; forget them as soon as you can. Tomorrow is a new day; you shall begin it serenely and with too high a spirit to be encumbered with your old nonsense.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Ralph Waldo Emerson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Marriage is an alliance entered into by a man who can’t sleep with the window shut and a woman who can’t sleep with the window open.&lt;/p&gt;
 &lt;p class="attribution"&gt;– George Bernard Shaw&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If you think education is expensive, try ignorance.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Derek Bok&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Most folks are about as happy as they make up their minds to be.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Abraham Lincoln&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;One who loves must learn fear. One who fears must learn love. The thinker must do. The doer must think. The pacifist must fight, the fighter must find peace. If you flow as a river, burn as a fire. If you burn as a furnace, flow as a river. If you fly as a bird, sit firm as a rock. If you sit firmly, then fly as a bird. Be a fire that flows. A rock that flies. Love with fear and fear with love. For we are not fire, not water, not air, not rocks, not thoughts, not deeds, not fear, not love. We are G-dly beings.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rabbi M. M. Schneerson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;When you come to the end of all the light you know, and it’s time to step into the darkness of the unknown, faith is knowing that one of two things shall happen: Either you will be given something solid to stand on or you will be taught to fly.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Edward Teller&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Whatever you can do, or dream you can do, begin it. Boldness has genius and power and magic in it.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Johann Goethe (John Anster’s translation of Faust)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;It is impossible to enjoy idling thoroughly unless one has plenty of work to do.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Jerome K. Jerome&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Every society honors its live conformists and its dead troublemakers.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Mignon McLaughlin&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;You can easily judge the character of a man by how he treats those who can do nothing for him.&lt;/p&gt;
 &lt;p class="attribution"&gt;– James D. Miles&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;In our thinking…we attribute to this concept of the bodily object a significance, which is to a high degree independent of the sense impression which originally gives rise to it. This is what we mean when we attribute to the bodily object a real existence. …By means of such concepts and mental relations between them, we are able to orient ourselves in the labyrinth of sense impressions. These notions and relations…appear to us as stronger and more unalterable than the individual sense experience itself, the character of which as anything other than the result of an illusion or hallucination is never completely guaranteed.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Einstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Praise and blame, gain and loss, pleasure and sorrow come and go like the wind. To be happy, rest like a giant tree in the midst of them all.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Buddha&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The right time to show your good character is when you are pestered by somebody weaker than you.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Buddha&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I am always doing things I can’t do, that’s how I get to do them.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Pablo Picasso&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;This above all: to thine own self be true. And it must follow, as the night the day, Thou canst not then be false to any man.&lt;/p&gt;
 &lt;p class="attribution"&gt;– William Shakespeare&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If the world is cold make it your business to build fires.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Horace Traubel&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Nearly all men can stand adversity, but if you want to test a man’s character, give him power.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Abraham Lincoln&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Your work is to discover your work and then, with all your heart, to give yourself to it.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Buddha&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Strive to realize a state of inward happiness, independent of circumstances.&lt;/p&gt;
 &lt;p class="attribution"&gt;– J.P. Greaves&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;When one door of happiness closes, another opens; but often we look so long at the closed door that we do not see the one which has opened for us.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Helen Keller&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I keep six honest serving men (They taught me all I know) Their names are What and Why and When And How and Where and Who&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rudyard Kipling, in Just So Stories&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We must not forget that when radium was discovered no one knew that it would prove useful in hospitals. The work was one of pure science. And this is a proof that scientific work must not be considered from the point of view of the direct usefulness of it. It must be done for itself, for the beauty of science, and then there is always the chance that a scientific discovery may become like the radium a benefit for humanity.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Marie Curie&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I believe that a scientist looking at nonscientific problems is just as dumb as the next guy.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Richard Feynman&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;To be what we are, and to become what we are capable of becoming, is the only end in life.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Baruch Spinoza&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Watch your thoughts; they become words. Watch your words; they become actions. Watch your actions, they become habits. Watch your habits, they become character. Watch your character; it becomes your destiny.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Frank Outlaw&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Creativity is God’s gift to you. What you do with it is your gift to God.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Bob Moawad&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;In the long run men hit only what they aim at. Therefore, though they should fail immediately, they had better aim at something high.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Henry David Thoreau&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;We act as though comfort and luxury were the chief requirements of life, when all that we need to make us really happy is something to be enthusiastic about.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Charles Kingsley&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I have always believed that whatever good or bad fortune may come our way we can always give it meaning and transform it into something of value.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Hermann Hesse&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Truth, like gold, is to be obtained not by its growth, but by washing away from it all that is not gold.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Leo Tolstoy&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If you do not change direction, you may end up where you are heading.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Lao Tzu&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The best thing for being sad is to learn something. That is the only thing that never fails. You may grow old and trembling in your anatomies, you may lie awake at night listening to the disorder of your veins, you may miss your only love, you may see the world about you devastated by evil lunatics, or know your honor trampled in the sewers of baser minds. There is only one thing for it then to learn. Learn why the world wags and what wags it. That is the only thing which the mind can never exhaust, never alienate, never be tortured by, never fear or distrust, and never dream of regretting.&lt;/p&gt;
 &lt;p class="attribution"&gt;– T. H. White, in The Once and Future King&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;What we hope ever to do with ease we may learn first to do with diligence.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Samuel Johnson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The way is long if one follows precepts, but short… if one follows patterns.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Lucius Annaeus Seneca&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Find out just what any people will quietly submit to and you have found out the exact measure of injustice and wrong which will be imposed upon them.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Frederick Douglass&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Somewhere, something incredible is waiting to be known.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Carl Sagan&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Principles for the Development of a Complete Mind: Study the science of art. Study the art of science. Develop your senses – especially learn how to see. Realise that everything connects to everything else.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Leonardo da Vinci&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I have come here to chew bubblegum and kick ass … and I’m all out of bubblegum.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Nada, in They Live (1988) by John Carpenter&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Where the mind is without fear and the head is held high Where knowledge is free Where the world has not been broken up into fragments By narrow domestic walls Where the words come out From the depth of truth Where the tireless striving stretches its arms towards perfection Where the clear stream of reason has not lost its way into the dreary desert sand of dead habit Where the mind is led forward by thee In ever widening thought and action Into that heaven of freedom, my father Let my country awake.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rabindranath Tagore (from Gitanjali)&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;By all means marry; if you get a good wife, you’ll become happy; if you get a bad one, you’ll become a philosopher.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Socrates&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The belief in an external world independent of the perceiving subject is the basis of all natural science. Since, however, sense perception only gives information of this external world or of “physical reality” indirectly, we can only grasp the latter by speculative means. It follows from this that our notions of physical reality can never be final. We must always be ready to change these notions – that is to say, the axiomatic basis of physics – in order to do justice to perceived facts in the most perfect way logically.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Albert Einstein&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I love you when you bow in your mosque, kneel in your temple, pray in your church. For you and I are sons of one religion, and it is the spirit.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Kahlil Gibran&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;To find yourself, think for yourself.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Socrates&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The earth is but one country, and mankind its citizens.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Baha’u’llah&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;There is only one good, knowledge, and one evil, ignorance.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Socrates&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;For every complicated problem there is a solution that is simple, direct, understandable, and wrong.&lt;/p&gt;
 &lt;p class="attribution"&gt;– H. L. Mencken&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is.&lt;/p&gt;
 &lt;p class="attribution"&gt;– John Louis von Neumann&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;The only true wisdom is in knowing you know nothing.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Socrates&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;It’s only recently that I’ve come to understand that writers are not marginal to our society, that they, in fact, do all our thinking for us, that we are writing myths and our myths are believed, and that old myths are believed until someone writes a new one.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Kurt Vonnegut&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;All ads do the same: create an anxiety relievable by purchase.&lt;/p&gt;
 &lt;p class="attribution"&gt;– David Foster Wallace&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;Beginnings are hard. For good reason. If they were easy, we would prowl into each new venture like a snug fat cat. When you begin pent up in an iron cage, a new life emerges. A tiger that breaks through the door of its cage and pounces with a vengeance. Bless those cages, those impossible brick walls, those rivers of fire that lie at the outset of each worthwhile journey. Without them we would be only as powerful as we appear.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Rabbi M. M. Schneerson&lt;/p&gt;
 &lt;/div&gt;
 &lt;div class="quote"&gt;
 &lt;p&gt;I really think the mark of experience isn’t the ability to write a lot of good pages, it’s the ability to generate shitty pages faster without worrying so much about it.&lt;/p&gt;
 &lt;p class="attribution"&gt;– Justin Marks&lt;/p&gt;
 &lt;/div&gt;
&lt;/div&gt;</description></item><item><title>Mathematics</title><link>https://blog.namln.org/en/mathematics/</link><pubDate>Sat, 08 Apr 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/</guid><description>&lt;p&gt;&lt;img src="https://denisegaskins.com/wp-content/uploads/2017/02/map-of-mathematics.jpg" alt=""&gt;&lt;/p&gt;
&lt;h2 class="heading" id="study-mathematics"&gt;
 Study Mathematics&lt;span class="heading__anchor"&gt; &lt;a href="#study-mathematics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/study-math-hcmus/"&gt;Master of Science in Mathematics @ HCMUS&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="branches-of-mathematics"&gt;
 Branches of Mathematics&lt;span class="heading__anchor"&gt; &lt;a href="#branches-of-mathematics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;&lt;img src="https://www.math4wisdom.com/files/MatematikosSakosDidelis.png" alt=""&gt;&lt;/p&gt;
&lt;h3 class="heading" id="1-foundation-of-mathematics"&gt;
 1. Foundations of Mathematics&lt;span class="heading__anchor"&gt; &lt;a href="#1-foundation-of-mathematics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Transition to Rigorous Pure Mathematics&lt;/li&gt;
&lt;li&gt;Set Theory&lt;/li&gt;
&lt;li&gt;Logic&lt;/li&gt;
&lt;li&gt;Category Theory&lt;/li&gt;
&lt;li&gt;Type Theory&lt;/li&gt;
&lt;li&gt;Homotopy Type Theory&lt;/li&gt;
&lt;li&gt;Surreal Numbers&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="2-number-theory"&gt;
 2. &lt;a href="https://blog.namln.org/en/mathematics/number-theory/"&gt;Number Theory&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#2-number-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Algebraic Number Theory&lt;/li&gt;
&lt;li&gt;Analytic Number Theory&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="3-algebra"&gt;
 3. &lt;a href="https://blog.namln.org/en/mathematics/algebra/"&gt;Algebra&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#3-algebra"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/algebra/abstract-algebra"&gt;Abstract Algebra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/algebra/group-theory/"&gt;Group Theory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/algebra/linear-algebra/"&gt;Linear Algebra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/algebra/ring-theory/"&gt;Ring Theory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/algebra/galois-theory/"&gt;Galois Theory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/algebra/lie-algebras/"&gt;Lie Algebras&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="4-combinatorics"&gt;
 4. &lt;a href="https://blog.namln.org/en/mathematics/combinatorics/"&gt;Combinatorics&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#4-combinatorics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Probabilistic Methods in Combinatorics&lt;/li&gt;
&lt;li&gt;Algebraic Combinatorics&lt;/li&gt;
&lt;li&gt;Graph Theory&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="5-geometry-topology"&gt;
 5. &lt;a href="https://blog.namln.org/en/mathematics/geometry-topology/"&gt;Geometry and Topology&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#5-geometry-topology"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Differential Geometry&lt;/li&gt;
&lt;li&gt;Algebraic Geometry&lt;/li&gt;
&lt;li&gt;Algebraic Statistics&lt;/li&gt;
&lt;li&gt;Topology&lt;/li&gt;
&lt;li&gt;Algebraic Topology&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="6-mathematical-analysis"&gt;
 6. &lt;a href="https://blog.namln.org/en/mathematics/mathematical-analysis/"&gt;Mathematical Analysis&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#6-mathematical-analysis"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/real-analysis/"&gt;Real Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/harmonic-analysis/"&gt;Harmonic Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/complex-analysis/"&gt;Complex Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/functional-analysis/"&gt;Functional Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/measure-theory/"&gt;Measure Theory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/ode/"&gt;ODE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/pde/"&gt;PDE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Variational Analysis&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/calculus-of-variations/"&gt;Calculus of Variations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Calculus (Single- and Multivariable)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/analysis/optimization/"&gt;Optimization &amp;amp; Operations Research&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Dynamical Systems&lt;/li&gt;
&lt;li&gt;Set-valued Analysis&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="7-probability-and-statistics"&gt;
 7. &lt;a href="https://blog.namln.org/en/mathematics/probability-statistics/"&gt;Probability and Statistics&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#7-probability-and-statistics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Probability Theory&lt;/li&gt;
&lt;li&gt;Statistics&lt;/li&gt;
&lt;li&gt;Statistical Learning&lt;/li&gt;
&lt;li&gt;Stochastic Processes&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="8-numerical-analysis"&gt;
 8. &lt;a href="https://blog.namln.org/en/mathematics/numerical-analysis/"&gt;Numerical Analysis&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#8-numerical-analysis"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Numerical Methods for PDEs&lt;/li&gt;
&lt;li&gt;Numerical Methods for ODEs&lt;/li&gt;
&lt;li&gt;Computational Linear Algebra&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="9-signal-processing"&gt;
 9. &lt;a href="https://blog.namln.org/en/mathematics/signal-processing/"&gt;Signal Processing&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#9-signal-processing"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h3 class="heading" id="10-mathematics-for-computer-science"&gt;
 10. &lt;a href="https://blog.namln.org/en/mathematics/mathematics-computer-science/"&gt;Mathematics for Computer Science&lt;/a&gt;&lt;span class="heading__anchor"&gt; &lt;a href="#10-mathematics-for-computer-science"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h3 class="heading" id="11-mathematical-physics"&gt;
 11. Mathematical Physics&lt;span class="heading__anchor"&gt; &lt;a href="#11-mathematical-physics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;</description></item><item><title>Parallel Programming</title><link>https://blog.namln.org/en/tcs/parallel-programming/</link><pubDate>Sat, 08 Apr 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/tcs/parallel-programming/</guid><description>&lt;ol&gt;
&lt;li&gt;CUDA C++ Programming Guide (&lt;a href="https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Table of instruction throughputs (&lt;a href="https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;PTX Reference Manual (&lt;a href="https://docs.nvidia.com/cuda/parallel-thread-execution/"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Inline PTX syntax guide (&lt;a href="https://docs.nvidia.com/cuda/inline-ptx-assembly/index.html"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Tensor core instruction data layouts (&lt;a href="https://docs.nvidia.com/cuda/parallel-thread-execution/#warp-level-matrix-fragment-mma-1688"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="3"&gt;
&lt;li&gt;SASS Instruction List (&lt;a href="https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#nvidia-ampere-gpu-and-ada-instruction-set"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Compiler Explorer by Matt Godbolt (&lt;a href="https://godbolt.org/"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;GPU Mode (&lt;a href="https://github.com/gpu-mode"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Modal GPU Glossary (&lt;a href="https://modal.com/gpu-glossary"&gt;Link&lt;/a&gt;)&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Theoretical Computer Science</title><link>https://blog.namln.org/en/theoretical-computer-science/</link><pubDate>Sat, 08 Apr 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/theoretical-computer-science/</guid><description>&lt;p&gt;&lt;img src="https://iibawards-prod.s3.amazonaws.com/projects/images/000/002/333/page.png" alt=""&gt;&lt;/p&gt;
&lt;h2 class="heading" id="study-computer-science"&gt;
 Study Computer Science&lt;span class="heading__anchor"&gt; &lt;a href="#study-computer-science"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://blog.namln.org/en/mathematics/study-computer-science-hcmus/"&gt;Master of Science in Computer Science @ HCMUS&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="branches-of-theoretical-computer-science"&gt;
 Branches of Theoretical Computer Science&lt;span class="heading__anchor"&gt; &lt;a href="#branches-of-theoretical-computer-science"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;h3 class="heading" id="1-theory-of-computation"&gt;
 1. Theory of Computation&lt;span class="heading__anchor"&gt; &lt;a href="#1-theory-of-computation"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Computational Complexity&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;Communication Complexity&lt;/li&gt;
&lt;li&gt;Circuit Complexity&lt;/li&gt;
&lt;li&gt;Quantum Complexity&lt;/li&gt;
&lt;li&gt;Proof Complexity&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Computability Theory&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="2-logic"&gt;
 2. Logic&lt;span class="heading__anchor"&gt; &lt;a href="#2-logic"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h3 class="heading" id="3-programming-language-theory"&gt;
 3. Programming Language Theory&lt;span class="heading__anchor"&gt; &lt;a href="#3-programming-language-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;Basics of Programming Language Theory&lt;/li&gt;
&lt;li&gt;Formal Verification&lt;/li&gt;
&lt;li&gt;Type Theory&lt;/li&gt;
&lt;li&gt;Functional Programming&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="4-algorithms"&gt;
 4. Algorithms&lt;span class="heading__anchor"&gt; &lt;a href="#4-algorithms"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;ol&gt;
&lt;li&gt;General Algorithms&lt;/li&gt;
&lt;li&gt;Lower Bounds&lt;/li&gt;
&lt;li&gt;Randomization &amp;amp; Probability for Algorithms&lt;/li&gt;
&lt;li&gt;Approximation Algorithms&lt;/li&gt;
&lt;li&gt;Parameterized Algorithms&lt;/li&gt;
&lt;li&gt;Learning-augmented Algorithms&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="5-informationcoding-theory"&gt;
 5. Information/Coding Theory&lt;span class="heading__anchor"&gt; &lt;a href="#5-informationcoding-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h3 class="heading" id="6-cryptography"&gt;
 6. Cryptography&lt;span class="heading__anchor"&gt; &lt;a href="#6-cryptography"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h3 class="heading" id="7-machine-learning-theory"&gt;
 7. Machine Learning Theory&lt;span class="heading__anchor"&gt; &lt;a href="#7-machine-learning-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;h3 class="heading" id="8-game-theory"&gt;
 8. Game Theory&lt;span class="heading__anchor"&gt; &lt;a href="#8-game-theory"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;</description></item><item><title>Cambridge Notes (Vietnamese)</title><link>https://blog.namln.org/en/topics/lecture-notes/cam-notes/</link><pubDate>Wed, 11 Jan 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/lecture-notes/cam-notes/</guid><description>&lt;h2 class="heading" id="ghi-chú-bài-giảng-cambridge"&gt;
 Cambridge Lecture Notes&lt;span class="heading__anchor"&gt; &lt;a href="#ghi-ch%c3%ba-b%c3%a0i-gi%e1%ba%a3ng-cambridge"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;All of these notes are translated from the &lt;a href="https://dec41.user.srcf.net/notes/"&gt;Cambridge Notes&lt;/a&gt; edited by Dexter Chua. The Vietnamese translations are intended for study purposes. Please do not use them commercially.&lt;/p&gt;
&lt;h3 class="heading" id="part-ia"&gt;
 Part IA&lt;span class="heading__anchor"&gt; &lt;a href="#part-ia"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Michaelmas Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Differential Equations: &lt;a href="https://dec41.user.srcf.net/h/III_L/the_standard_model"&gt;HTML&lt;/a&gt;, &lt;a href="https://dec41.user.srcf.net/notes/IA_M/differential_equations.pdf"&gt;PDF&lt;/a&gt;, &lt;a href="https://dec41.user.srcf.net/notes/IA_M/differential_equations_trim.pdf"&gt;PDF (Trim)&lt;/a&gt;, &lt;a href="https://dec41.user.srcf.net/notes/IA_M/differential_equations_def.pdf"&gt;PDF (defs)&lt;/a&gt;, &lt;a href="https://dec41.user.srcf.net/notes/IA_M/differential_equations_thm.pdf"&gt;PDF (thm)&lt;/a&gt;, &lt;a href="https://dec41.user.srcf.net/notes/IA_M/differential_equations_thm_proof.pdf"&gt;PDF (thm+proof)&lt;/a&gt;, &lt;a href=""&gt;Official Notes&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Groups&lt;/li&gt;
&lt;li&gt;Numbers and Sets&lt;/li&gt;
&lt;li&gt;Vectors and Matrices&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Lent Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="6"&gt;
&lt;li&gt;Analysis I&lt;/li&gt;
&lt;li&gt;Dynamics and Relativity&lt;/li&gt;
&lt;li&gt;Probability&lt;/li&gt;
&lt;li&gt;Vector Calculus&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="part-ib"&gt;
 Part IB&lt;span class="heading__anchor"&gt; &lt;a href="#part-ib"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Michaelmas Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="10"&gt;
&lt;li&gt;Analysis II&lt;/li&gt;
&lt;li&gt;Linear Algebra&lt;/li&gt;
&lt;li&gt;Markov Chains&lt;/li&gt;
&lt;li&gt;Methods&lt;/li&gt;
&lt;li&gt;Quantum Mechanics&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Lent Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="15"&gt;
&lt;li&gt;Complex Analysis&lt;/li&gt;
&lt;li&gt;Complex Methods&lt;/li&gt;
&lt;li&gt;Electromagnetism&lt;/li&gt;
&lt;li&gt;Fluid Dynamics&lt;/li&gt;
&lt;li&gt;Geometry&lt;/li&gt;
&lt;li&gt;Groups, Rings and Modules&lt;/li&gt;
&lt;li&gt;Numerical Analysis&lt;/li&gt;
&lt;li&gt;Statistics&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Easter Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="23"&gt;
&lt;li&gt;Metric and Topological Spaces&lt;/li&gt;
&lt;li&gt;Optimisation&lt;/li&gt;
&lt;li&gt;Variational Principles&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="part-ii"&gt;
 Part II&lt;span class="heading__anchor"&gt; &lt;a href="#part-ii"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Michaelmas Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="26"&gt;
&lt;li&gt;Algebraic Topology&lt;/li&gt;
&lt;li&gt;Galois Theory&lt;/li&gt;
&lt;li&gt;Integrable Systems&lt;/li&gt;
&lt;li&gt;Linear Analysis&lt;/li&gt;
&lt;li&gt;Probability and Measure&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Lent Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="31"&gt;
&lt;li&gt;Logic and Set Theory&lt;/li&gt;
&lt;li&gt;Number Fields&lt;/li&gt;
&lt;li&gt;Representation Theory&lt;/li&gt;
&lt;li&gt;Statistical Physics&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="part-iii"&gt;
 Part III&lt;span class="heading__anchor"&gt; &lt;a href="#part-iii"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Michaelmas Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="35"&gt;
&lt;li&gt;Advanced Probability&lt;/li&gt;
&lt;li&gt;Algebraic Topology&lt;/li&gt;
&lt;li&gt;Analysis of Partial Differential Equations&lt;/li&gt;
&lt;li&gt;Combinatorics&lt;/li&gt;
&lt;li&gt;Differential Geometry&lt;/li&gt;
&lt;li&gt;Extremal Graph Theory&lt;/li&gt;
&lt;li&gt;Hydrodynamic Stability&lt;/li&gt;
&lt;li&gt;Local Fields&lt;/li&gt;
&lt;li&gt;Modern Statistical Methods&lt;/li&gt;
&lt;li&gt;Percolation and Random Walks on Graphs&lt;/li&gt;
&lt;li&gt;Quantum Computation&lt;/li&gt;
&lt;li&gt;Quantum Field Theory&lt;/li&gt;
&lt;li&gt;Symmetries, Fields and Particles&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Lent Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="48"&gt;
&lt;li&gt;Advanced Quantum Field Theory&lt;/li&gt;
&lt;li&gt;Algebras&lt;/li&gt;
&lt;li&gt;Logic&lt;/li&gt;
&lt;li&gt;Modular Forms and L-functions&lt;/li&gt;
&lt;li&gt;Positivity in Algebraic Geometry&lt;/li&gt;
&lt;li&gt;Ramsey Theory&lt;/li&gt;
&lt;li&gt;Riemannian Geometry&lt;/li&gt;
&lt;li&gt;Schramm–Loewner Evolutions&lt;/li&gt;
&lt;li&gt;Stochastic Calculus and Applications&lt;/li&gt;
&lt;li&gt;Symplectic Geometry&lt;/li&gt;
&lt;li&gt;The Standard Model&lt;/li&gt;
&lt;li&gt;Theoretical Physics of Soft Condensed Matter&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Easter Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="60"&gt;
&lt;li&gt;Classical and Quantum Solitons&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 class="heading" id="part-iv"&gt;
 Part IV&lt;span class="heading__anchor"&gt; &lt;a href="#part-iv"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Michaelmas Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="61"&gt;
&lt;li&gt;Topics in Geometric Group Theory&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Lent Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="62"&gt;
&lt;li&gt;Topics in Number Theory&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Easter Term&lt;/strong&gt;&lt;/p&gt;
&lt;ol start="63"&gt;
&lt;li&gt;Bounded Cohomology&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Daniel Raban's Note Repository Notes (Vietnamese)</title><link>https://blog.namln.org/en/topics/lecture-notes/dr-notes/</link><pubDate>Wed, 11 Jan 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/lecture-notes/dr-notes/</guid><description>&lt;h2 class="heading" id="daniel-rabans-note-repository"&gt;
 Daniel Raban&amp;rsquo;s Note Repository&lt;span class="heading__anchor"&gt; &lt;a href="#daniel-rabans-note-repository"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;[UCLA] Math 206A: Combinatorial Discrete Geometry (Igor Pak, F18): &lt;a href="https://pillowmath.github.io/Math%20206A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 206B: Algebraic Combinatorics (Igor Pak, W19): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 210A: Algebra (Romyar Sharifi, F18): &lt;a href="https://pillowmath.github.io/Math%20210A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 210B: Algebra (Romyar Sharifi, W19): &lt;a href="https://pillowmath.github.io/Math%20210B/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 210C: Algebra (Romyar Sharifi, Sp19): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 245B: Real Analysis (Tim Austin, W19): &lt;a href="https://pillowmath.github.io/Math%20245B/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 245C: Real Analysis (Wilfrid Gangbo, Sp19): &lt;a href="https://pillowmath.github.io/Math%20245C/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 246A: Complex Analysis (John Garnett, F18): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 246B: Complex Analysis (Michael Hitrik, W19): &lt;a href="https://pillowmath.github.io/Math%20246B/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 246C: Complex Analysis (Michael Hitrik, Sp19): &lt;a href="https://pillowmath.github.io/Math%20246C/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 247A: Classical Fourier Analysis (Monica Visan, W20): &lt;a href="https://pillowmath.github.io/Math%20247A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 254A: Topics in Entropy and Statistical Mechanics (Tim Austin, Sp21): &lt;a href="https://pillowmath.github.io/Math%20254A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 254B: Ergodic Theory and Fractals (Tim Austin, Sp19): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 255A: Functional Analysis (Michael Hitrik, F18): &lt;a href="https://pillowmath.github.io/Math%20255A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 255A&amp;rsquo;: Functional Analysis (Tim Austin, F19): &lt;a href="https://pillowmath.github.io/Math%20255A%27/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 255B: Functional Analysis (Michael Hitrik, W20): &lt;a href="https://pillowmath.github.io/Math%20255B/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 259A: Operator Algebras in Hilbert Space (Sorin Popa, F19): &lt;a href="https://pillowmath.github.io/Math%20259A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UCLA] Math 275D: Stochastic Calculus (Jun Yin, F19): &lt;a href="https://pillowmath.github.io/Math%20275D/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] CS 294: Analysis of Boolean Functions (Avishay Tal, Sp23): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] EE 229A: Information Theory and Coding (Venkat Anantharam, F21): &lt;a href="https://pillowmath.github.io/EE%20229A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Math 142: Algebraic Topology (Jamie Conway, Sp18): &lt;a href="https://pillowmath.github.io/Math%20142/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Math 222A: Partial Differential Equations (Daniel Tataru, F21): &lt;a href="https://pillowmath.github.io/Math%20222A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Math 222B: Partial Differential Equations (Sung-Jin Oh, Sp22): &lt;a href="https://pillowmath.github.io/Math%20222B/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Math 249: Algebraic Combinatorics (Mark Haiman, F17): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Math 250A: Groups, Rings, and Fields (&lt;a href="Borcherds/borcherds.html"&gt;Richard Borcherds&lt;/a&gt;, F17): &lt;a href="https://pillowmath.github.io/Math%20250A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Math 272: Theory of Combinatorial Limits (Dan Král, Sp25): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Math 279: Topics in Stochastic Partial Differential Equations (Fraydoun Rezakhanlou, F21): &lt;a href="https://pillowmath.github.io/Math%20279/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Stat 155: Game Theory (Oscar Hernan Madrid Padilla, Sp18): &lt;a href="https://pillowmath.github.io/Stat%20155/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Stat 206B: Stochastic Processes (Jim Pitman, Sp18): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Stat C206B: Statics and Dynamics of Random Surfaces (Shirshendu Ganguly, Sp22): [PDF], &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Stat 210A: Theoretical Statistics (Will Fithian, F21): &lt;a href="https://pillowmath.github.io/Stat%20210A/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[UC Berkeley] Stat 210B: High-Dimensional Statistics (Song Mei, Sp22): &lt;a href="https://pillowmath.github.io/Stat%20210B/main.pdf"&gt;PDF&lt;/a&gt;, &lt;a href=""&gt;PDF (Vi)&lt;/a&gt;&lt;/p&gt;</description></item><item><title>List of Ebooks for Data Science</title><link>https://blog.namln.org/en/topics/ds/books/</link><pubDate>Wed, 11 Jan 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/ds/books/</guid><description>&lt;h2 class="heading" id="the-law---the-mathematical-foundations"&gt;
 The Law - The mathematical foundations&lt;span class="heading__anchor"&gt; &lt;a href="#the-law---the-mathematical-foundations"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Statistical-Inference-George-Casella/dp/0534243126"&gt;Statistical Inference&lt;/a&gt; - Casella &amp;amp; Berger&lt;/li&gt;
&lt;li&gt;&lt;a href="https://foundations-of-applied-mathematics.github.io/"&gt;Foundations of Applied Mathematics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="history---foundational-works-that-provide-additional-context-for-more-advanced-concepts"&gt;
 History - Foundational works that provide additional context for more advanced concepts&lt;span class="heading__anchor"&gt; &lt;a href="#history---foundational-works-that-provide-additional-context-for-more-advanced-concepts"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://web.stanford.edu/%7Eboyd/cvxbook/"&gt;Convex Optimization&lt;/a&gt; - Boyd &amp;amp; Vandenberghe&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/dp/0521592712"&gt;Probability Theory: The Logic of Science&lt;/a&gt; - Jaynes&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882?tag=hackr-20&amp;amp;geniuslink=true"&gt;Clean Code&lt;/a&gt; - Martin&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="poetry---prose-type-works"&gt;
 Poetry - Prose type works&lt;span class="heading__anchor"&gt; &lt;a href="#poetry---prose-type-works"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Art-Data-Analysis-Question-Statistics/dp/1118411315"&gt;The Art of Data Analysis&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Signal-Noise-Many-Predictions-Fail-but/dp/0143125087"&gt;Why Predictions Fail&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Weapons-Math-Destruction-Increases-Inequality/dp/0553418815"&gt;Weapons of Math Destruction&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="major-prophets---seminal-works-on-major-topics"&gt;
 Major Prophets - Seminal works on major topics&lt;span class="heading__anchor"&gt; &lt;a href="#major-prophets---seminal-works-on-major-topics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Applied-Regression-Analysis-Probability-Statistics/dp/0471170828"&gt;Applied Regression Analysis&lt;/a&gt; - Draper &amp;amp; Smith&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247"&gt;The Data Warehouse Toolkit&lt;/a&gt; - Kimball&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="http://www.stat.columbia.edu/%7Egelman/book/"&gt;Bayesian Data Analysis&lt;/a&gt; - Gelman&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://otexts.com/fpp3/"&gt;Forecasting: Principles and Practices&lt;/a&gt; - Hyndman &amp;amp; Athanasopoulos&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="minor-prophets---important-works-but-not-quite-at-the-level-of-the-ds-major-prophets"&gt;
 Minor Prophets - Important works, but not quite at the level of the DS Major Prophets&lt;span class="heading__anchor"&gt; &lt;a href="#minor-prophets---important-works-but-not-quite-at-the-level-of-the-ds-major-prophets"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.mostlyharmlesseconometrics.com/"&gt;Mostly Harmless Econometrics&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://matheusfacure.github.io/python-causality-handbook/landing-page.html"&gt;Causal Inference for the Brave and True&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Trustworthy-Online-Controlled-Experiments-Practical/dp/1108724264"&gt;Trustworthy Online Controlled Experiments&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="the-gospels---the-fulfillment-of-the-ds-law"&gt;
 The Gospels - The fulfillment of the DS Law&lt;span class="heading__anchor"&gt; &lt;a href="#the-gospels---the-fulfillment-of-the-ds-law"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.statlearning.com/"&gt;Introduction to Statistical Learning&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://hastie.su.domains/ElemStatLearn/"&gt;The Elements of Statistical Learning&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.deeplearningbook.org/"&gt;Deep Learning&lt;/a&gt; - Goodfellow&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="history-pt-2---data-science-goes-to-the-gentiles-non-dsexecs"&gt;
 History Pt. 2 - Data science goes to the Gentiles (non-DS/execs)&lt;span class="heading__anchor"&gt; &lt;a href="#history-pt-2---data-science-goes-to-the-gentiles-non-dsexecs"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Data-Science-Executives-Leveraging-Intelligence/dp/1544511256"&gt;Data Science for Executives&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/dp/1119002257"&gt;Storytelling with Data: a Guide to Data Visualization&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="heading" id="letters---further-explanation-and-interpretation-of-the-ds-gospel"&gt;
 Letters - Further explanation and interpretation of the DS Gospel&lt;span class="heading__anchor"&gt; &lt;a href="#letters---further-explanation-and-interpretation-of-the-ds-gospel"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://probml.github.io/pml-book/"&gt;Machine Learning: a Probabilistic Perspective&lt;/a&gt; - Murphy&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://r4ds.had.co.nz/index.html"&gt;R for Data Science&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://www.amazon.com/Python-Machine-Learning-scikit-learn-TensorFlow-ebook/dp/B0742K7HYF"&gt;Python Machine Learning&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>List of GitHub Repositories for Data Science</title><link>https://blog.namln.org/en/topics/ds/github/</link><pubDate>Wed, 11 Jan 2023 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/ds/github/</guid><description>&lt;ol&gt;
&lt;li&gt;The Data Engineering Cookbook, &lt;a href="https://github.com/andkret/Cookbook?tab=readme-ov-file"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A curated list of data engineering tools for software developers, &lt;a href="https://github.com/igorbarinov/awesome-data-engineering"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Data Engineering Zoomcamp, &lt;a href="https://github.com/DataTalksClub/data-engineering-zoomcamp"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Python Data Science Handbook: full text in Jupyter Notebooks, &lt;a href="https://github.com/jakevdp/PythonDataScienceHandbook"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Data Science for Beginners - A Curriculum, &lt;a href="https://github.com/microsoft/Data-Science-For-Beginners"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines. &lt;a href="https://github.com/donnemartin/data-science-ipython-notebooks"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Papers &amp;amp; tech blogs by companies sharing their work on data science &amp;amp; machine learning in production. &lt;a href="https://github.com/eugeneyan/applied-ml"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;An awesome Data Science repository to learn and apply for real-world problems. &lt;a href="https://github.com/academic/awesome-datascience"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;List of Data Science Cheatsheets to rule the world, &lt;a href="https://github.com/FavioVazquez/ds-cheatsheets"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Data science interview questions and answers, &lt;a href="https://github.com/alexeygrigorev/data-science-interviews"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A curated list of applied machine learning and data science notebooks and libraries across different industries, &lt;a href="https://github.com/firmai/industry-machine-learning"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A curated list of data science blogs, &lt;a href="https://github.com/rushter/data-science-blogs"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Optimization Research Papers in JMLR Volume 23</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v23/</link><pubDate>Thu, 29 Sep 2022 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v23/</guid><description>&lt;h1 class="heading" id="optimization-research-papers-in-jmlr-volume-23-2022"&gt;
 Optimization Research Papers in JMLR Volume 23 (2022)&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-research-papers-in-jmlr-volume-23-2022"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;This document lists papers from JMLR Volume 23 (2022) that focus on optimization research, categorized by their primary themes. Each paper is numbered starting from 1 within its subsection, with a brief description of its key contributions to optimization theory, algorithms, or applications.&lt;/p&gt;
&lt;h2 class="heading" id="convex-optimization"&gt;
 Convex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#convex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing convex optimization problems, including sparse PCA, L1-regularized SVMs, and metric-constrained problems.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Solving Large-Scale Sparse PCA to Certifiable (Near) Optimality&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dimitris Bertsimas, Ryan Cory-Wright, Jean Pauphilet&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops convex optimization techniques for large-scale sparse principal component analysis with certifiable near-optimal solutions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Novel Min-Max Reformulations of Linear Inverse Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mohammed Rayyan Sheriff, Debasish Chatterjee&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes min-max reformulations for linear inverse problems using convex optimization frameworks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;New Insights for the Multivariate Square-Root Lasso&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Aaron J. Molstad&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the square-root Lasso in multivariate settings, focusing on its convex optimization properties.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Towards An Efficient Approach for the Nonconvex $\ell_p$ Ball Projection: Algorithm and Analysis&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xiangyu Yang, Jiashan Wang, Hao Wang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops and analyzes an efficient algorithm for Euclidean projection onto the nonconvex $\ell_p$ ball with $p \in (0,1)$.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Solving L1-Regularized SVMs and Related Linear Programs: Revisiting the Effectiveness of Column and Constraint Generation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Antoine Dedieu, Rahul Mazumder, Haoyue Wang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates L1-regularized SVMs using convex optimization with column and constraint generation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Extensions to the Proximal Distance Method of Constrained Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alfonso Landeros, Oscar Hernan Madrid Padilla, Hua Zhou, Kenneth Lange&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Extends the proximal distance method for constrained convex optimization problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Subgradient for Composite Convex Optimization with Functional Constraints&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ion Necoara, Nitesh Kumar Singh&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes stochastic subgradient methods for composite convex optimization with functional constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Regularized Square-Root Regression Problems: Distributionally Robust Interpretation and Fast Computations&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Hong T.M. Chu, Kim-Chuan Toh, Yangjing Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies regularized square-root regression with a distributionally robust perspective and efficient computational methods.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Project and Forget: Solving Large-Scale Metric Constrained Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Rishi Sonthalia, Anna C. Gilbert&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a convex optimization approach for large-scale metric-constrained problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Faster Randomized Interior Point Methods for Tall/Wide Linear Programs&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Agniva Chowdhury, Gregory Dexter, Palma London, Haim Avron, Petros Drineas&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops randomized interior point methods for efficient optimization of tall/wide linear programs.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="nonconvex-optimization"&gt;
 Nonconvex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#nonconvex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers tackling nonconvex optimization, focusing on optimality, stability, and convergence in nonsmooth and game settings.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimality and Stability in Non-Convex Smooth Games&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Guojun Zhang, Pascal Poupart, Yaoliang Yu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes optimality and stability in nonconvex smooth games with convergence guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zhize Li, Jian Li&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes simple and optimal stochastic gradient methods for nonsmooth, nonconvex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Oracle Complexity in Nonsmooth Nonconvex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Guy Kornowski, Ohad Shamir&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies the oracle complexity of nonsmooth nonconvex optimization problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distributed Stochastic Gradient Descent: Nonconvexity, Nonsmoothness, and Convergence to Local Minima&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Brian Swenson, Ryan Murray, H. Vincent Poor, Soummya Kar&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates distributed SGD for nonconvex, nonsmooth optimization with convergence to local minima.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="stochastic-optimization"&gt;
 Stochastic Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#stochastic-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on stochastic optimization methods, including bundle methods, zeroth-order algorithms, and adaptive techniques.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Stochastic Bundle Method for Interpolation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alasdair Paren, Leonard Berrada, Rudra P. K. Poudel, M. Pawan Kumar&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a stochastic bundle method for optimizing models in the interpolation setting, where the model can fit the training data exactly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Biased Stochastic Gradient Estimation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Derek Driggs, Jingwei Liang, Carola-Bibiane Schönlieb&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes biases in stochastic gradient estimation and their impact on optimization performance.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes accelerated zeroth-order and first-order momentum methods for a range of optimization problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Zeroth-Order Optimization under Nonstationarity and Nonconvexity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Abhishek Roy, Krishnakumar Balasubramanian, Saeed Ghadimi, Prasant Mohapatra&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies zeroth-order optimization in nonstationary and nonconvex settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accelerating Adaptive Cubic Regularization of Newton’s Method via Random Sampling&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xi Chen, Bo Jiang, Tianyi Lin, Shuzhong Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Enhances Newton’s method with adaptive cubic regularization using random sampling.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Momentumized, Adaptive, Dual Averaged Gradient Method&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Aaron Defazio, Samy Jelassi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a momentum-based adaptive gradient method for stochastic optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic DCA with Variance Reduction and Applications in Machine Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Hoai An Le Thi, Hoang Phuc Hau Luu, Hoai Minh Le, Tao Pham Dinh&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a stochastic difference-of-convex-functions algorithm with variance reduction for machine learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Robust Distributed Accelerated Stochastic Gradient Methods for Multi-Agent Networks&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alireza Fallah, Mert Gürbüzbalaban, Asuman Ozdaglar, Umut Şimşekli, Lingjiong Zhu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes robust stochastic gradient methods for distributed optimization in multi-agent networks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Acceleration for Convex Composite Minimization with Noise-Corrupted Gradients and Approximate Proximal Mapping&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Qiang Zhou, Sinno Jialin Pan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Addresses acceleration in convex composite minimization with noisy gradients.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Asymptotic Study of Stochastic Adaptive Algorithms in Non-Convex Landscape&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Sébastien Gadat, Ioana Gavra&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the asymptotic behavior of stochastic adaptive algorithms in nonconvex settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Congliang Chen, Li Shen, Fangyu Zou, Wei Liu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies the Adam optimizer, focusing on nonconvexity, convergence, and mini-batch acceleration.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Efficient Sampling Algorithm for Non-Smooth Composite Potentials&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Wenlong Mou, Nicolas Flammarion, Martin J. Wainwright, Peter L. Bartlett&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops an efficient algorithm for sampling from distributions with nonsmooth composite potentials.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SGD with Coordinate Sampling: Theory and Practice&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Rémi Leluc, François Portier&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Explores coordinate sampling in stochastic gradient descent with theoretical and practical insights.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="distributeddecentralized-optimization"&gt;
 Distributed/Decentralized Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#distributeddecentralized-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing distributed or decentralized optimization algorithms, focusing on communication efficiency and convergence.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Asymptotic Network Independence and Step-Size for a Distributed Subgradient Method&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alex Olshevsky&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes step-size and convergence for a distributed subgradient optimization method.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Projection-Free Distributed Online Learning with Sublinear Communication Complexity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yuanyu Wan, Guanghui Wang, Wei-Wei Tu, Lijun Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops projection-free algorithms for distributed online learning with reduced communication complexity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Variance Reduced EXTRA and DIGing and Their Optimal Acceleration for Strongly Convex Decentralized Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Huan Li, Zhouchen Lin, Yongchun Fang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes variance-reduced methods for decentralized optimization with optimal acceleration.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="submodular-optimization"&gt;
 Submodular Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#submodular-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on submodular optimization, particularly in model selection.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Joint Continuous and Discrete Model Selection via Submodularity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jonathan Bunton, Paulo Tabuada&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Uses submodularity for joint continuous and discrete model selection in optimization.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="bandits-and-online-learning"&gt;
 Bandits and Online Learning&lt;span class="heading__anchor"&gt; &lt;a href="#bandits-and-online-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing multi-armed bandits, online optimization, and regret minimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yu-Guan Hsieh, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies multi-agent online optimization with delays, focusing on asynchronicity and optimism.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Online Mirror Descent and Dual Averaging: Keeping Pace in the Dynamic Case&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Huang Fang, Nicholas J. A. Harvey, Victor S. Portella, Michael P. Friedlander&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes online mirror descent and dual averaging for dynamic online optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;No Weighted-Regret Learning in Adversarial Bandits with Delays&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ilai Bistritz, Zhengyuan Zhou, Xi Chen, Nicholas Bambos, Jose Blanchet&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates regret minimization in adversarial bandits with delays.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;KL-UCB-Switch: Optimal Regret Bounds for Stochastic Bandits from Both a Distribution-Dependent and a Distribution-Free Viewpoints&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Aurélien Garivier, Hédi Hadiji, Pierre Ménard, Gilles Stoltz&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides optimal regret bounds for stochastic bandits using KL-UCB-Switch.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Multi-Agent Multi-Armed Bandits with Limited Communication&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mridul Agarwal, Vaneet Aggarwal, Kamyar Azizzadenesheli&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Explores multi-agent bandits with limited communication, focusing on regret minimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Nonstochastic Bandits with Composite Anonymous Feedback&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Claudio Gentile, Yishay Mansour&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies nonstochastic bandits with composite feedback, analyzing regret and optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Expected Regret and Pseudo-Regret are Equivalent When the Optimal Arm is Unique&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Daron Anderson, Douglas J. Leith&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proves that expected regret and pseudo-regret are equivalent in bandit problems where the optimal arm is unique.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="bayesian-and-hyperparameter-optimization"&gt;
 Bayesian and Hyperparameter Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#bayesian-and-hyperparameter-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing Bayesian optimization and hyperparameter tuning for efficient optimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhkopf, René Sass, Frank Hutter&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Presents SMAC3, a versatile Bayesian optimization package for hyperparameter tuning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Implicit Differentiation for Fast Hyperparameter Selection in Non-Smooth Convex Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Quentin Bertrand, Quentin Klopfenstein, Mathurin Massias, Mathieu Blondel, Samuel Vaiter, Alexandre Gramfort, Joseph Salmon&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Uses implicit differentiation for efficient hyperparameter selection in nonsmooth convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Auto-Sklearn 2.0: Hands-Free AutoML via Meta-Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Matthias Feurer, Katharina Eggensperger, Stefan Falkner, Marius Lindauer, Frank Hutter&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces Auto-Sklearn 2.0, leveraging meta-learning for automated hyperparameter optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="optimization-in-reinforcement-learning"&gt;
 Optimization in Reinforcement Learning&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-in-reinforcement-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on optimization techniques for reinforcement learning, including policy gradient and value estimation.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Generalized Projected Bellman Error for Off-Policy Value Estimation in Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Andrew Patterson, Adam White, Martha White&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops optimization methods for off-policy value estimation using a generalized projected Bellman error.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. Rupam Mahmood, Martha White&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates greedification operators for policy optimization, focusing on KL divergences.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yanwei Jia, Xun Yu Zhou&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes policy gradient and actor-critic methods for continuous-time RL optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Convergence Rates of Policy Gradient Methods&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Lin Xiao&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies convergence rates of policy gradient methods in reinforcement learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor-Critic under State Distribution Mismatch&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Shangtong Zhang, Remi Tachet des Combes, Romain Laroche&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Examines global optimality in softmax off-policy actor-critic methods under distribution mismatch.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="other-optimization-topics"&gt;
 Other Optimization Topics&lt;span class="heading__anchor"&gt; &lt;a href="#other-optimization-topics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers covering miscellaneous optimization topics, including proximal algorithms, tensor completion, and learning-to-optimize frameworks.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;TFPnP: Tuning-Free Plug-and-Play Proximal Algorithms with Applications to Inverse Imaging Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Kaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Hua Huang, Carola-Bibiane Schönlieb&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces tuning-free proximal algorithms for inverse imaging problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Complexity of Approximating Multimarginal Optimal Transport&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Tianyi Lin, Nhat Ho, Marco Cuturi, Michael I. Jordan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the complexity of approximating multimarginal optimal transport problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Bokun Wang, Shiqian Ma, Lingzhou Xue&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes stochastic proximal gradient methods for nonsmooth optimization on the Stiefel manifold.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Provable Tensor-Train Format Tensor Completion by Riemannian Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jian-Feng Cai, Jingyang Li, Dong Xia&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops Riemannian optimization for tensor-train format tensor completion.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Let’s Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Julie Nutini, Issam Laradji, Mark Schmidt&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Enhances block coordinate descent with faster convergence techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Efficiency of Entropic Regularized Algorithms for Optimal Transport&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Tianyi Lin, Nhat Ho, Michael I. Jordan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies entropic regularization for efficient optimal transport algorithms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Explicit Convergence Rates of Greedy and Random Quasi-Newton Methods&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dachao Lin, Haishan Ye, Zhihua Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides explicit convergence rates for greedy and random quasi-Newton methods.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scaling and Scalability: Provable Nonconvex Low-Rank Tensor Estimation from Incomplete Measurements&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Tian Tong, Cong Ma, Ashley Prater-Bennette, Erin Tripp, Yuejie Chi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Addresses nonconvex low-rank tensor estimation with provable guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning to Optimize: A Primer and A Benchmark&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Tianlong Chen, Xiaohan Chen, Wuyang Chen, Howard Heaton, Jialin Liu, Zhangyang Wang, Wotao Yin&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides a primer and benchmark for learning-to-optimize techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Clustering with Semidefinite Programming and Fixed Point Iteration&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Pedro Felzenszwalb, Caroline Klivans, Alice Paul&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Uses semidefinite programming and fixed-point iteration for clustering optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Bregman Learning Framework for Sparse Neural Networks&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Leon Bungert, Tim Roith, Daniel Tenbrinck, Martin Burger&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a Bregman learning framework for optimizing sparse neural networks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yoav Freund, Yi-An Ma, Tong Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes dimension-independent convergence of Langevin algorithms from a composite optimization perspective.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sparse Continuous Distributions and Fenchel-Young Losses&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: André F. T. Martins, Marcos Treviso, António Farinhas, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Explores sparse continuous distributions using Fenchel-Young losses for optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Handling Hard Affine SDP Shape Constraints in RKHSs&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Pierre-Cyril Aubin-Frankowski, Zoltan Szabo&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Addresses affine SDP constraints in reproducing kernel Hilbert spaces for optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;OMLT: Optimization &amp;amp; Machine Learning Toolkit&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Francesco Ceccon, Jordan Jalving, Joshua Haddad, Alexander Thebelt, Calvin Tsay, Carl D Laird, Ruth Misener&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Presents OMLT, a toolkit integrating optimization and machine learning techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Optimization Research Papers in JMLR Volume 22</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v22/</link><pubDate>Wed, 29 Sep 2021 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v22/</guid><description>&lt;h1 class="heading" id="optimization-research-papers-in-jmlr-volume-22-2021"&gt;
 Optimization Research Papers in JMLR Volume 22 (2021)&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-research-papers-in-jmlr-volume-22-2021"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;This document lists papers from JMLR Volume 22 (2021) that focus on optimization research, categorized by their primary themes. Each paper is numbered starting from 1 within its subsection, with a brief description of its key contributions to optimization theory, algorithms, or applications.&lt;/p&gt;
&lt;h2 class="heading" id="convex-optimization"&gt;
 Convex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#convex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing convex optimization problems, including clustering, Wasserstein barycenters, sparse optimization, and bandits.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Convex Clustering: Model, Theoretical Guarantee and Efficient Algorithm&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Defeng Sun, Kim-Chuan Toh, Yancheng Yuan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a convex clustering model with theoretical guarantees and an efficient algorithm.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Fast Globally Linearly Convergent Algorithm for the Computation of Wasserstein Barycenters&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Lei Yang, Jia Li, Defeng Sun, Kim-Chuan Toh&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a fast, globally linearly convergent algorithm for computing Wasserstein barycenters.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Wasserstein Barycenters Can Be Computed in Polynomial Time in Fixed Dimension&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jason M. Altschuler, Enric Boix-Adsera&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Demonstrates that Wasserstein barycenters can be computed in polynomial time for fixed dimensions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;From Low Probability to High Confidence in Stochastic Convex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Damek Davis, Dmitriy Drusvyatskiy, Lin Xiao, Junyu Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes methods to achieve high-confidence solutions in stochastic convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sparse and Smooth Signal Estimation: Convexification of L0-Formulations&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alper Atamturk, Andres Gomez, Shaoning Han&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes convexification techniques for L0-formulations in sparse and smooth signal estimation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Proximal AUC Maximization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yunwen Lei, Yiming Ying&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops stochastic proximal methods for maximizing the area under the ROC curve (AUC) in convex settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sparse Convex Optimization via Adaptively Regularized Hard Thresholding&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Kyriakos Axiotis, Maxim Sviridenko&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces adaptively regularized hard thresholding for sparse convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Antoine Dedieu, Hussein Hazimeh, Rahul Mazumder&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Explores continuous and mixed-integer optimization approaches for learning sparse classifiers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;First-Order Convergence Theory for Weakly-Convex-Weakly-Concave Min-max Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mingrui Liu, Hassan Rafique, Qihang Lin, Tianbao Yang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides first-order convergence theory for weakly convex-weakly concave min-max problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Convex Geometry and Duality of Over-parameterized Neural Networks&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Tolga Ergen, Mert Pilanci&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes convex geometry and duality in over-parameterized neural networks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Linear Bandits on Uniformly Convex Sets&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Thomas Kerdreux, Christophe Roux, Alexandre d&amp;rsquo;Aspremont, Sebastian Pokutta&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies linear bandits on uniformly convex sets, focusing on convex optimization techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="nonconvex-optimization"&gt;
 Nonconvex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#nonconvex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers tackling nonconvex optimization, including stochastic gradient descent, neural network training, and stability properties.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Online Stochastic Gradient Descent on Non-Convex Losses from High-Dimensional Inference&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes online stochastic gradient descent for nonconvex losses in high-dimensional inference.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Non-attracting Regions of Local Minima in Deep and Wide Neural Networks&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Henning Petzka, Cristian Sminchisescu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates non-attracting regions of local minima in deep and wide neural networks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;When Does Gradient Descent with Logistic Loss Find Interpolating Two-Layer Networks?&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Niladri S. Chatterji, Philip M. Long, Peter L. Bartlett&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Examines conditions under which gradient descent with logistic loss finds interpolating two-layer networks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Replica Exchange for Non-Convex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jing Dong, Xin T. Tong&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes replica exchange methods for nonconvex optimization problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Failures of Model-Dependent Generalization Bounds for Least-Norm Interpolation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Peter L. Bartlett, Philip M. Long&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes limitations of model-dependent generalization bounds in least-norm interpolation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Stability Properties and the Optimization Landscape of Training Problems with Squared Loss for Neural Networks and General Nonlinear Conic Approximation Schemes&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Constantin Christof&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies stability and optimization landscapes for neural network training with squared loss.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="stochastic-optimization"&gt;
 Stochastic Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#stochastic-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on stochastic optimization methods, including momentum, Langevin dynamics, and communication-efficient algorithms.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Continuous Time Analysis of Momentum Methods&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Nikola B. Kovachki, Andrew M. Stuart&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides a continuous-time analysis of momentum methods in stochastic optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generalization Performance of Multi-pass Stochastic Gradient Descent with Convex Loss Functions&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yunwen Lei, Ting Hu, Ke Tang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes generalization performance of multi-pass stochastic gradient descent for convex losses.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Wenlong Mou, Yi-An Ma, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops an accelerated MCMC algorithm using high-order Langevin diffusion.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Path Length Bounds for Gradient Descent and Flow&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Chirag Gupta, Sivaraman Balakrishnan, Aaditya Ramdas&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Establishes path length bounds for gradient descent and gradient flow.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic Perspectives&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Michael Muehlebach, Michael I. Jordan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes momentum-based optimization from dynamical, control-theoretic, and symplectic perspectives.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;L-SVRG and L-Katyusha with Arbitrary Sampling&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xun Qian, Zheng Qu, Peter Richtárik&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces L-SVRG and L-Katyusha algorithms with arbitrary sampling for stochastic optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Lyapunov Analysis of Accelerated Methods in Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ashia C. Wilson, Ben Recht, Michael I. Jordan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides a Lyapunov analysis for accelerated optimization methods.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;NUQSGD: Provably Communication-Efficient Data-Parallel SGD via Nonuniform Quantization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes NUQSGD, a communication-efficient stochastic gradient descent method using nonuniform quantization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Inertial Newton Algorithm for Deep Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops an inertial Newton algorithm for deep learning optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Tian Tong, Cong Ma, Yuejie Chi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes scaled gradient descent for accelerating ill-conditioned low-rank matrix estimation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On ADMM in Deep Learning: Convergence and Saturation-Avoidance&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jinshan Zeng, Shao-Bo Lin, Yuan Yao, Ding-Xuan Zhou&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes convergence and saturation-avoidance properties of ADMM in deep learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Unified Convergence Analysis for Shuffling-Type Gradient Methods&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Lam M. Nguyen, Quoc Tran-Dinh, Dzung T. Phan, Phuong Ha Nguyen, Marten van Dijk&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides a unified convergence analysis for shuffling-type gradient methods.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Online Optimization Using Kalman Recursion&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Joseph de Vilmarest, Olivier Wintenberger&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies Kalman recursion to stochastic online optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Expanding Boundaries of Gap Safe Screening&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Cassio F. Dantas, Emmanuel Soubies, Cédric Févotte&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Extends gap safe screening rules to a broader class of optimization problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Consensus-Based Optimization on the Sphere: Convergence to Global Minimizers and Machine Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Massimo Fornasier, Lorenzo Pareschi, Hui Huang, Philippe Sünnen&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops consensus-based optimization on the sphere with applications to machine learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian Monte Carlo&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mert Gürbüzbalaban, Xuefeng Gao, Yuanhan Hu, Lingjiong Zhu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes decentralized stochastic gradient Langevin dynamics and Hamiltonian Monte Carlo methods.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="distributeddecentralized-optimization"&gt;
 Distributed/Decentralized Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#distributeddecentralized-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing distributed or decentralized optimization algorithms, focusing on communication efficiency and scalability.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Projection-Free Decentralized Online Learning for Submodular Maximization over Time-Varying Networks&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Junlong Zhu, Qingtao Wu, Mingchuan Zhang, Ruijuan Zheng, Keqin Li&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops projection-free decentralized online learning for submodular maximization over time-varying networks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Communication-Efficient Distributed Covariance Sketch, with Application to Distributed PCA&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zengfeng Huang, Xuemin Lin, Wenjie Zhang, Ying Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a communication-efficient distributed covariance sketch for distributed PCA.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal Rates of Distributed Regression with Imperfect Kernels&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Hongwei Sun, Qiang Wu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Establishes optimal rates for distributed regression with imperfect kernels.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;One-Shot Federated Learning: Theoretical Limits and Algorithms to Achieve Them&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Saber Salehkaleybar, Arsalan Sharifnassab, S. Jamaloddin Golestani&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes theoretical limits and algorithms for one-shot federated learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cooperative SGD: A Unified Framework for the Design and Analysis of Local-Update SGD Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jianyu Wang, Gauri Joshi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a unified framework for designing and analyzing local-update SGD algorithms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DeEPCA: Decentralized Exact PCA with Linear Convergence Rate&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Haishan Ye, Tong Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops DeEPCA, a decentralized exact PCA method with linear convergence.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="submodular-optimization"&gt;
 Submodular Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#submodular-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on submodular optimization, particularly in experimental design.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Batch Greedy Maximization of Non-Submodular Functions: Guarantees and Applications to Experimental Design&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jayanth Jagalur-Mohan, Youssef Marzouk&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides guarantees for batch greedy maximization of non-submodular functions with applications to experimental design.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="bandits-and-online-learning"&gt;
 Bandits and Online Learning&lt;span class="heading__anchor"&gt; &lt;a href="#bandits-and-online-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing multi-armed bandits, online optimization, and regret minimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Regulating Greed Over Time in Multi-Armed Bandits&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Stefano Tracà, Cynthia Rudin, Weiyu Yan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies methods to regulate greed over time in multi-armed bandits.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Preference-Based Online Learning with Dueling Bandits: A Survey&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Viktor Bengs, Róbert Busa-Fekete, Adil El Mesaoudi-Paul, Eyke Hüllermeier&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Surveys preference-based online learning with dueling bandits.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Multi-Armed Bandit Designs for Dose-Finding Trials&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Maryam Aziz, Emilie Kaufmann, Marie-Karelle Riviere&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Explores multi-armed bandit designs for dose-finding trials.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Julian Zimmert, Yevgeny Seldin&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes Tsallis-INF, an optimal algorithm for stochastic and adversarial bandits.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Bandit Convex Optimization in Non-Stationary Environments&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Peng Zhao, Guanghui Wang, Lijun Zhang, Zhi-Hua Zhou&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Addresses bandit convex optimization in non-stationary environments.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Contextual Bandit Bake-off&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alberto Bietti, Alekh Agarwal, John Langford&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Compares contextual bandit algorithms in a comprehensive evaluation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;MetaGrad: Adaptation Using Multiple Learning Rates in Online Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Tim van Erven, Wouter M. Koolen, Dirk van der Hoeven&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces MetaGrad, an adaptive online learning algorithm with multiple learning rates.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Achieving Fairness in the Stochastic Multi-Armed Bandit Problem&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Vishakha Patil, Ganesh Ghalme, Vineet Nair, Y. Narahari&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops methods for achieving fairness in stochastic multi-armed bandits.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Refined Approachability Algorithms and Application to Regret Minimization with Global Costs&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Joon Kwon&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes refined approachability algorithms for regret minimization with global costs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Bandit Learning in Decentralized Matching Markets&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Lydia T. Liu, Feng Ruan, Horia Mania, Michael I. Jordan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies bandit learning to decentralized matching markets.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Thompson Sampling Algorithms for Cascading Bandits&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops Thompson sampling algorithms for cascading bandits.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fast Learning for Renewal Optimization in Online Task Scheduling&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Michael J. Neely&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes fast learning methods for renewal optimization in online task scheduling.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="bayesian-and-hyperparameter-optimization"&gt;
 Bayesian and Hyperparameter Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#bayesian-and-hyperparameter-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing Bayesian optimization and hyperparameter tuning for scalable and robust optimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Empirical Study of Bayesian Optimization: Acquisition Versus Partition&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Erich Merrill, Alan Fern, Xiaoli Fern, Nima Dolatnia&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Conducts an empirical study comparing acquisition and partition strategies in Bayesian optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Hyperparameter Optimization via Sequential Uniform Designs&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zebin Yang, Aijun Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes sequential uniform designs for hyperparameter optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Are We Forgetting about Compositional Optimisers in Bayesian Optimisation?&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Antoine Grosnit, Alexander I. Cowen-Rivers, Rasul Tutunov, Ryan-Rhys Griffiths, Jun Wang, Haitham Bou-Ammar&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Explores the role of compositional optimizers in Bayesian optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GIBBON: General-Purpose Information-Based Bayesian Optimisation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Henry B. Moss, David S. Leslie, Javier Gonzalez, Paul Rayson&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces GIBBON, a general-purpose information-based Bayesian optimization framework.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On lp-Hyperparameter Learning via Bilevel Nonsmooth Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Takayuki Okuno, Akiko Takeda, Akihiro Kawana, Motokazu Watanabe&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies lp-hyperparameter learning using bilevel nonsmooth optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="optimization-in-reinforcement-learning"&gt;
 Optimization in Reinforcement Learning&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-in-reinforcement-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on optimization techniques for reinforcement learning, including policy iteration and Q-learning.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Safe Policy Iteration: A Monotonically Improving Approximate Policy Iteration Approach&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alberto Maria Metelli, Matteo Pirotta, Daniele Calandriello, Marcello Restelli&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a safe policy iteration method with monotonic improvement for reinforcement learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alekh Agarwal, Sham M. Kakade, Jason D. Lee, Gaurav Mahajan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the optimality, approximation, and distribution shift in policy gradient methods.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Langevin Dynamics for Adaptive Inverse Reinforcement Learning of Stochastic Gradient Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Vikram Krishnamurthy, George Yin&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies Langevin dynamics to adaptive inverse reinforcement learning for stochastic gradient algorithms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jeongho Kim, Jaeuk Shin, Insoon Yang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops Hamilton-Jacobi deep Q-learning for deterministic continuous-time systems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Partial Policy Iteration for L1-Robust Markov Decision Processes&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Chin Pang Ho, Marek Petrik, Wolfram Wiesemann&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces partial policy iteration for L1-robust Markov decision processes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Gaussian Approximation for Bias Reduction in Q-Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Carlo D&amp;rsquo;Eramo, Andrea Cini, Alessandro Nuara, Matteo Pirotta, Cesare Alippi, Jan Peters, Marcello Restelli&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes Gaussian approximation techniques for bias reduction in Q-learning.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="other-optimization-topics"&gt;
 Other Optimization Topics&lt;span class="heading__anchor"&gt; &lt;a href="#other-optimization-topics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers covering miscellaneous optimization topics, including Newton methods, SVM training, and eigenvector computation.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Global and Quadratic Convergence of Newton Hard-Thresholding Pursuit&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Shenglong Zhou, Naihua Xiu, Hou-Duo Qi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes global and quadratic convergence of Newton hard-thresholding pursuit.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Two-Level Decomposition Framework Exploiting First and Second Order Information for SVM Training Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Giulio Galvan, Matteo Lapucci, Chih-Jen Lin, Marco Sciandrone&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a two-level decomposition framework for SVM training using first and second-order information.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Approximate Newton Methods&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Haishan Ye, Luo Luo, Zhihua Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops approximate Newton methods for optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Guodong Zhang, Xuchan Bao, Laurent Lessard, Roger Grosse&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides a unified analysis of first-order methods for smooth games using integral quadratic constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;LassoNet: A Neural Network with Feature Sparsity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ismael Lemhadri, Feng Ruan, Louis Abraham, Robert Tibshirani&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces LassoNet, a neural network architecture promoting feature sparsity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;An Algorithmic View of L2 Regularization and Some Path-Following Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yunzhang Zhu, Renxiong Liu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Explores L2 regularization from an algorithmic perspective with path-following algorithms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Ensmallen Library for Flexible Numerical Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Ryan R. Curtin, Marcus Edel, Rahul Ganesh Prabhu, Suryoday Basak, Zhihao Lou, Conrad Sanderson&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces ensmallen, a C++ library for flexible numerical optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Black-Box Reductions for Zeroth-Order Gradient Algorithms to Achieve Lower Query Complexity&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Bin Gu, Xiyuan Wei, Shangqian Gao, Ziran Xiong, Cheng Deng, Heng Huang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes black-box reductions for zeroth-order gradient algorithms to reduce query complexity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Riemannian Search for Eigenvector Computation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zhiqiang Xu, Ping Li&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops Riemannian search methods for eigenvector computation.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Optimization Research Papers in JMLR Volume 21</title><link>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v21/</link><pubDate>Tue, 29 Sep 2020 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/mathematics/analysis/optimization/jmlr-v21/</guid><description>&lt;h1 class="heading" id="optimization-research-papers-in-jmlr-volume-21-2020"&gt;
 Optimization Research Papers in JMLR Volume 21 (2020)&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-research-papers-in-jmlr-volume-21-2020"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h1&gt;&lt;p&gt;This page lists papers from JMLR Volume 21 (2020) that focus on optimization, grouped by primary theme. Numbering restarts at 1 in each section, and each entry has a brief description of its contribution to optimization theory, algorithms, or applications.&lt;/p&gt;
&lt;h2 class="heading" id="convex-optimization"&gt;
 Convex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#convex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing convex optimization problems, including complexity bounds, convergence analysis, and applications in regression and assortment optimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Low Complexity Algorithm with O(√T) Regret and O(1) Constraint Violations for Online Convex Optimization with Long Term Constraints&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Hao Yu, Michael J. Neely&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a low-complexity algorithm for online convex optimization with long-term constraints, achieving O(√T) regret and O(1) constraint violations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lower Bounds for Parallel and Randomized Convex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Jelena Diakonikolas, Cristóbal Guzmán&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Establishes lower complexity bounds for parallel and randomized algorithms in convex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Discerning the Linear Convergence of ADMM for Structured Convex Optimization through the Lens of Variational Analysis&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xiaoming Yuan, Shangzhi Zeng, Jin Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the linear convergence of ADMM for structured convex optimization using variational analysis.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Data Efficient and Feasible Level Set Method for Stochastic Convex Optimization with Expectation Constraints&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Qihang Lin, Selvaprabu Nadarajah, Negar Soheili, Tianbao Yang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a data-efficient level set method for stochastic convex optimization with expectation constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Conic Optimization for Quadratic Regression Under Sparse Noise&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Igor Molybog, Ramtin Madani, Javad Lavaei&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies conic optimization to quadratic regression under sparse noise conditions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dynamic Assortment Optimization with Changing Contextual Information&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xi Chen, Yining Wang, Yuan Zhou&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Addresses dynamic assortment optimization with changing contextual information using convex optimization techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Convex Programming for Estimation in Nonlinear Recurrent Models&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Sohail Bahmani, Justin Romberg&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Uses convex programming for parameter estimation in nonlinear recurrent models.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="nonconvex-optimization"&gt;
 Nonconvex Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#nonconvex-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers tackling nonconvex optimization, focusing on guarantees for local minima, variance reduction, and algorithmic advancements.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Salar Fattahi, Somayeh Sojoudi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides exact guarantees for the absence of spurious local minima in non-negative rank-1 robust PCA.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Nested Variance Reduction for Nonconvex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dongruo Zhou, Pan Xu, Quanquan Gu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces a stochastic nested variance reduction method for nonconvex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;ProxSARAH: An Efficient Algorithmic Framework for Stochastic Composite Nonconvex Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Nhan H. Pham, Lam M. Nguyen, Dzung T. Phan, Quoc Tran-Dinh&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes ProxSARAH, an efficient framework for stochastic composite nonconvex optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Convergence Rates for the Stochastic Gradient Descent Method for Non-Convex Objective Functions&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Benjamin Fehrman, Benjamin Gess, Arnulf Jentzen&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes convergence rates of stochastic gradient descent for nonconvex objective functions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;AdaGrad Stepsizes: Sharp Convergence Over Nonconvex Landscapes&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Rachel Ward, Xiaoxia Wu, Leon Bottou&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proves sharp convergence guarantees for AdaGrad stepsizes on nonconvex problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Sparse Semismooth Newton Based Proximal Majorization-Minimization Algorithm for Nonconvex Square-Root-Loss Regression Problems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Peipei Tang, Chengjing Wang, Defeng Sun, Kim-Chuan Toh&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops a sparse semismooth Newton-based proximal majorization-minimization algorithm for nonconvex square-root-loss regression.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="stochastic-optimization"&gt;
 Stochastic Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#stochastic-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on stochastic optimization methods, including gradient descent, variance reduction, and robustness to noise.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Convergences of Regularized Algorithms and Stochastic Gradient Methods with Random Projections&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Junhong Lin, Volkan Cevher&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes convergence of regularized algorithms and stochastic gradient methods with random projections.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dominic Richards, Patrick Rebeschini&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies graph-dependent implicit regularization in distributed stochastic subgradient descent.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Artin Spiridonoff, Alex Olshevsky, Ioannis Ch. Paschalidis&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a robust asynchronous stochastic gradient-push method with asymptotically optimal performance for strongly convex functions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xi Chen, Simon S. Du, Xin T. Tong&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Investigates stationary-point hitting time and ergodicity in stochastic gradient Langevin dynamics.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stochastic Conditional Gradient Methods: From Convex Minimization to Submodular Maximization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Aryan Mokhtari, Hamed Hassani, Amin Karbasi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Extends stochastic conditional gradient methods from convex minimization to submodular maximization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Class of Parallel Doubly Stochastic Algorithms for Large-Scale Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Aryan Mokhtari, Alec Koppel, Martin Takac, Alejandro Ribeiro&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces parallel doubly stochastic algorithms for large-scale learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yao Ma, Alex Olshevsky, Csaba Szepesvari, Venkatesh Saligrama&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies gradient descent to sparse rank-one matrix completion for crowd-sourced worker aggregation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal Convergence for Distributed Learning with Stochastic Gradient Methods and Spectral Algorithms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Junhong Lin, Volkan Cevher&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Establishes optimal convergence rates for distributed learning using stochastic gradient methods and spectral algorithms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Estimate Sequences for Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Andrei Kulunchakov, Julien Mairal&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops estimate sequences for stochastic composite optimization with variance reduction and noise robustness.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Unified q-Memorization Framework for Asynchronous Stochastic Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Bin Gu, Wenhan Xian, Zhouyuan Huo, Cheng Deng, Heng Huang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a unified q-memorization framework for asynchronous stochastic optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Asymptotic Analysis via Stochastic Differential Equations of Gradient Descent Algorithms in Statistical and Computational Paradigms&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yazhen Wang, Shang Wu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes gradient descent algorithms using stochastic differential equations in statistical and computational settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Error-Feedback Framework: SGD with Delayed Gradients&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Sebastian U. Stich, Sai Praneeth Karimireddy&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces an error-feedback framework for stochastic gradient descent with delayed gradients.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="distributedparallel-optimization"&gt;
 Distributed/Parallel Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#distributedparallel-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing distributed or parallel optimization algorithms, focusing on communication efficiency and scalability.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On the Complexity Analysis of the Primal Solutions for the Accelerated Randomized Dual Coordinate Ascent&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Huan Li, Zhouchen Lin&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the iteration complexity of the primal solutions produced by accelerated randomized dual coordinate ascent.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;WONDER: Weighted One-shot Distributed Ridge Regression in High Dimensions&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Edgar Dobriban, Yue Sheng&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes WONDER, a weighted one-shot distributed ridge regression method for high-dimensional data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GADMM: Fast and Communication Efficient Framework for Distributed Machine Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Anis Elgabli, Jihong Park, Amrit S. Bedi, Mehdi Bennis, Vaneet Aggarwal&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces GADMM, a fast and communication-efficient framework for distributed machine learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Boyue Li, Shicong Cen, Yuxin Chen, Yuejie Chi&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops communication-efficient distributed optimization with gradient tracking and variance reduction.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;On Convergence of Distributed Approximate Newton Methods: Globalization, Sharper Bounds and Beyond&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Xiao-Tong Yuan, Ping Li&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes convergence of distributed approximate Newton methods with sharper bounds and globalization techniques.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="submodular-optimization"&gt;
 Submodular Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#submodular-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on submodular optimization, including minimization and maximization problems.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Quadratic Decomposable Submodular Function Minimization: Theory and Practice&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Pan Li, Niao He, Olgica Milenkovic&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies quadratic decomposable submodular function minimization with theoretical and practical insights.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimal Algorithms for Continuous Non-monotone Submodular and DR-Submodular Maximization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Rad Niazadeh, Tim Roughgarden, Joshua R. Wang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops optimal algorithms for continuous non-monotone submodular and DR-submodular maximization.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="bayesian-and-hyperparameter-optimization"&gt;
 Bayesian and Hyperparameter Optimization&lt;span class="heading__anchor"&gt; &lt;a href="#bayesian-and-hyperparameter-optimization"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers addressing Bayesian optimization and hyperparameter tuning for scalable and robust optimization.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Kirthevasan Kandasamy, Karun Raju Vysyaraju, Willie Neiswanger, Biswajit Paria, Christopher R. Collins, Jeff Schneider, Barnabas Poczos, Eric P. Xing&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces Dragonfly, a scalable and robust Bayesian optimization framework for hyperparameter tuning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Distributionally Ambiguous Optimization for Batch Bayesian Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Nikitas Rontsis, Michael A. Osborne, Paul J. Goulart&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes distributionally ambiguous optimization for batch Bayesian optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Kalai-Smorodinsky Solution for Many-Objective Bayesian Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mickael Binois, Victor Picheny, Patrick Taillandier, Abderrahmane Habbal&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies the Kalai-Smorodinsky solution to many-objective Bayesian optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Robust Reinforcement Learning with Bayesian Optimisation and Quadrature&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Supratik Paul, Konstantinos Chatzilygeroudis, Kamil Ciosek, Jean-Baptiste Mouret, Michael A. Osborne, Shimon Whiteson&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Integrates Bayesian optimization and quadrature for robust reinforcement learning.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="optimization-in-reinforcement-learning"&gt;
 Optimization in Reinforcement Learning&lt;span class="heading__anchor"&gt; &lt;a href="#optimization-in-reinforcement-learning"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers focusing on optimization techniques for policy optimization and reinforcement learning.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dhruv Malik, Ashwin Pananjady, Kush Bhatia, Koulik Khamaru, Peter L. Bartlett, Martin J. Wainwright&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops derivative-free methods for policy optimization in linear quadratic systems with guarantees.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Expected Policy Gradients for Reinforcement Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Kamil Ciosek, Shimon Whiteson&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces expected policy gradients for reinforcement learning optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Importance Sampling Techniques for Policy Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Alberto Maria Metelli, Matteo Papini, Nico Montali, Marcello Restelli&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes importance sampling techniques for efficient policy optimization in reinforcement learning.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 class="heading" id="other-optimization-topics"&gt;
 Other Optimization Topics&lt;span class="heading__anchor"&gt; &lt;a href="#other-optimization-topics"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;p&gt;Papers covering miscellaneous optimization topics, including dictionary learning, neural network verification, and differential privacy.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Learning with Fenchel-Young Losses&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Mathieu Blondel, André F.T. Martins, Vlad Niculae&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces Fenchel-Young losses, a generic way to construct convex loss functions for regularized prediction.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Branch and Bound for Piecewise Linear Neural Network Verification&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Rudy Bunel, Jingyue Lu, Ilker Turkaslan, Philip H.S. Torr, Pushmeet Kohli, M. Pawan Kumar&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies branch and bound techniques for piecewise linear neural network verification.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Conjugate Gradients for Kernel Machines&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Simon Bartels, Philipp Hennig&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops conjugate gradient methods for optimization in kernel machines.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Unique Sharp Local Minimum in L1-Minimization Complete Dictionary Learning&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yu Wang, Siqi Wu, Bin Yu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes unique sharp local minima in L1-minimization for complete dictionary learning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Community-Based Group Graphical Lasso&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Eugen Pircalabelu, Gerda Claeskens&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a community-based group graphical Lasso that jointly estimates a graph and communities of its nodes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Constrained Dynamic Programming and Supervised Penalty Learning Algorithms for Peak Detection in Genomic Data&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Toby Dylan Hocking, Guillem Rigaill, Paul Fearnhead, Guillaume Bourque&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops constrained dynamic programming and supervised penalty learning for peak detection in genomic data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Loss Control with Rank-One Covariance Estimate for Short-Term Portfolio Optimization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Zhao-Rong Lai, Liming Tan, Xiaotian Wu, Liangda Fang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies rank-one covariance estimation for loss control in short-term portfolio optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Owen Marschall, Kyunghyun Cho, Cristina Savin&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a unified framework of online learning algorithms for training recurrent neural networks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Networks&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Amir R. Asadi, Emmanuel Abbe&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Introduces multilevel entropic regularization for training neural networks, using chaining and the chain rule.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Nesterov&amp;rsquo;s Acceleration for Approximate Newton&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Haishan Ye, Luo Luo, Zhihua Zhang&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Applies Nesterov’s acceleration to approximate Newton methods for optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;New Insights and Perspectives on the Natural Gradient Method&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: James Martens&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Provides new insights into the natural gradient method for optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Group&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Yuexiang Zhai, Zitong Yang, Zhenyu Liao, John Wright, Yi Ma&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops complete dictionary learning via L4-norm maximization over the orthogonal group.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Empirical Risk Minimization in the Non-Interactive Local Model of Differential Privacy&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Di Wang, Marco Gaboardi, Adam Smith, Jinhui Xu&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Studies empirical risk minimization in the non-interactive local model of differential privacy.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stable Regression: On the Power of Optimization over Randomization&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dimitris Bertsimas, Ivan Paskov&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Analyzes the power of optimization over randomization in stable regression.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fast Exact Matrix Completion: A Unified Optimization Framework for Matrix Completion&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Dimitris Bertsimas, Michael Lingzhi Li&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Proposes a unified optimization framework for fast exact matrix completion.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Rank-Based Lasso - Efficient Methods for High-Dimensional Robust Model Selection&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Authors&lt;/em&gt;: Wojciech Rejchel, Małgorzata Bogdan&lt;br&gt;
&lt;em&gt;Description&lt;/em&gt;: Develops rank-based Lasso methods for high-dimensional robust model selection.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title/><link>https://blog.namln.org/en/topics/mathematics/analysis/optimization/tmlr_22/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/mathematics/analysis/optimization/tmlr_22/</guid><description/></item><item><title/><link>https://blog.namln.org/en/topics/mathematics/analysis/optimization/tmlr_23/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/mathematics/analysis/optimization/tmlr_23/</guid><description/></item><item><title/><link>https://blog.namln.org/en/topics/mathematics/analysis/optimization/tmlr_24/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/mathematics/analysis/optimization/tmlr_24/</guid><description/></item><item><title/><link>https://blog.namln.org/en/topics/mathematics/analysis/optimization/tmlr_25/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/mathematics/analysis/optimization/tmlr_25/</guid><description/></item><item><title/><link>https://blog.namln.org/en/topics/mathematics/analysis/variational-analysis/reading-books/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://blog.namln.org/en/topics/mathematics/analysis/variational-analysis/reading-books/</guid><description>&lt;h2 class="heading" id="apa-references-for-convex-optimization-and-analysis-books"&gt;
 APA References for Convex Optimization and Analysis Books&lt;span class="heading__anchor"&gt; &lt;a href="#apa-references-for-convex-optimization-and-analysis-books"&gt;#&lt;/a&gt;&lt;/span&gt;
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Hendrix, E. M. T., &amp;amp; G.-Tóth, B. (2010). &lt;em&gt;Introduction to nonlinear and global optimization&lt;/em&gt;. Springer. &lt;a href="https://link.springer.com/book/10.1007/978-0-387-88670-1"&gt;https://link.springer.com/book/10.1007/978-0-387-88670-1&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Horst, R., &amp;amp; Pardalos, P. M. (Eds.). (1995). &lt;em&gt;Handbook of global optimization&lt;/em&gt;. Springer. &lt;a href="https://link.springer.com/book/10.1007/978-1-4615-2025-2"&gt;https://link.springer.com/book/10.1007/978-1-4615-2025-2&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mordukhovich, B. S. (2006a). &lt;em&gt;Variational analysis and generalized differentiation I: Basic theory&lt;/em&gt;. Springer. &lt;a href="https://link.springer.com/book/10.1007/3-540-31247-1"&gt;https://link.springer.com/book/10.1007/3-540-31247-1&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mordukhovich, B. S. (2006b). &lt;em&gt;Variational analysis and generalized differentiation II: Applications&lt;/em&gt;. Springer. &lt;a href="https://link.springer.com/book/10.1007/3-540-31246-3"&gt;https://link.springer.com/book/10.1007/3-540-31246-3&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mordukhovich, B. S. (2018). &lt;em&gt;Variational analysis and applications&lt;/em&gt;. Springer. &lt;a href="https://link.springer.com/book/10.1007/978-3-319-92775-6"&gt;https://link.springer.com/book/10.1007/978-3-319-92775-6&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mordukhovich, B. S. (2024). &lt;em&gt;Second-order variational analysis in optimization, variational stability, and control: Theory, algorithms&lt;/em&gt;. [PDF file].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mordukhovich, B. S., &amp;amp; Nguyen, M. N. (2022). &lt;em&gt;Convex analysis and beyond - Volume I: Basic theory&lt;/em&gt;. Springer. &lt;a href="https://link.springer.com/book/10.1007/978-3-030-14784-6"&gt;https://link.springer.com/book/10.1007/978-3-030-14784-6&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item></channel></rss>