Optimization methods that make use of derivatives of the objective function up to order $p > 2$ are called tensor methods. Among them, those that minimize a regularized $p$th-order Taylor expansion at each step have been shown to possess optimal global complexity, which improves as $p$ increases. The local convergence of such optimization algorithms on functions that have Lipschitz continuous $p$th derivatives and are uniformly convex of order $q$ has been studied by Doikov and Nesterov [Math. Program., 193 (2022), pp. 315–336]. We extend these local convergence results to locally uniformly convex functions and fully adaptive methods, which do not need knowledge of the Lipschitz constant, thus providing the first sharp local rates for AR$p$. We discuss the surprising new challenges arising from nonconvex local models and non-unique model minimizers. For $p > 2$ our examples show that, in particular when the global minimizer of the subproblem is used, even asymptotically not all iterations need to be successful. Only if the "right" local model minimizer is used is the local convergence of order $p/(q-1)$ from the non-adaptive case preserved for $p > q-1$; otherwise the superlinear rate can degrade. We thus confirm that adaptive higher-order methods achieve superlinear convergence for certain degenerate problems as long as $p$ is large enough, and we provide sharp bounds on the order of convergence one can expect in the limit.
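As background, a minimal sketch of the model minimized at each AR$p$ step, under one standard convention (the exact scaling of the regularization term varies between papers, so this is an assumption, not a quotation from the paper):

\[
m_k(s) = \sum_{j=0}^{p} \frac{1}{j!} \nabla^j f(x_k)[s]^j + \frac{\sigma_k}{p+1} \|s\|^{p+1},
\]

where $x_k$ is the current iterate and $\sigma_k > 0$ is the regularization parameter that adaptive variants adjust at run time in place of the unknown Lipschitz constant.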
@misc{welzel_local_2025,title={Local {{Convergence}} of {{Adaptively Regularized Tensor Methods}}},author={Welzel, Karl and Liu, Yang and Hauser, Raphael A. and Cartis, Coralia},year={2025},month=oct,number={arXiv:2510.25643},eprint={2510.25643},primaryclass={math},publisher={arXiv},doi={10.48550/arXiv.2510.25643},archiveprefix={arXiv},keywords={Mathematics - Optimization and Control},}
Efficient Implementation of Third-order Tensor Methods with Adaptive Regularization for Unconstrained Optimization
High-order tensor methods that employ local Taylor models of degree $p$ within adaptive regularization frameworks (AR$p$) have recently received significant attention, due to their optimal global and local rates of convergence for both convex and nonconvex optimization problems. However, their numerical performance for general unconstrained optimization problems remains insufficiently explored, which we address by showcasing the numerical performance of standard second- and third-order variants ($p = 2, 3$) and proposing novel techniques for key algorithmic aspects when $p \geq 3$ to improve numerical efficiency. To improve the adaptive choice of the regularization parameter, we extend the interpolation-based updating strategy introduced in (Gould, Porcelli, and Toint, 2012) for $p = 2$ to $p \geq 3$. We identify fundamental differences between the local minima of regularized subproblems for $p = 2$ and $p \geq 3$ and their effect on performance. Then, for $p \geq 3$, we introduce a novel pre-rejection technique that rejects poor subproblem minimizers (referred to as ‘transient’) before any function evaluation, reducing cost and selecting useful (‘persistent’) ones. Numerical studies confirm efficiency improvements in our modified AR$3$ algorithm. We also assess the effect of different subproblem termination conditions and of the choice of the initial regularization parameter on overall performance. Finally, we benchmark our best-performing AR$3$ variants, along with those in (Birgin et al., 2020), against second-order ones (AR$2$). Encouraging results on standard test problems confirm that AR$3$ variants can outperform AR$2$ in terms of objective evaluations, derivative evaluations, and subproblem solves. We provide an efficient, extensive, and modular MATLAB software package including various AR$2$ and AR$3$ variants, allowing ease of use and experimentation for interested users.
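To make the adaptive mechanism concrete, here is a generic sketch in Python of one outer AR$p$ iteration with the classical ratio test (all names and constants are illustrative assumptions; the interpolation-based $\sigma$ update and the pre-rejection technique from the paper refine exactly these steps):

def arp_step(f, solve_subproblem, x, sigma, eta1=0.01, eta2=0.95, gamma=2.0):
    """One generic ARp outer iteration: minimize the regularized Taylor
    model, then accept/reject the step and update sigma via a ratio test."""
    s, model_decrease = solve_subproblem(x, sigma)  # step and f(x) - m(s) > 0
    rho = (f(x) - f(x + s)) / model_decrease        # actual vs. predicted decrease
    if rho >= eta1:                                 # successful: accept the step
        x = x + s
        if rho >= eta2:                             # very successful: shrink sigma
            sigma = sigma / gamma
    else:                                           # unsuccessful: reject, grow sigma
        sigma = gamma * sigma
    return x, sigma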
@misc{cartis_efficient_2025,title={Efficient {{Implementation}} of {{Third-order Tensor Methods}} with {{Adaptive Regularization}} for {{Unconstrained Optimization}}},author={Cartis, Coralia and Hauser, Raphael and Liu, Yang and Welzel, Karl and Zhu, Wenqi},year={2025},month=feb,number={arXiv:2501.00404v2},eprint={2501.00404v2},primaryclass={math},publisher={arXiv},doi={10.48550/arXiv.2501.00404},archiveprefix={arXiv},keywords={Mathematics - Optimization and Control},}
2024
Approximating Higher-Order Derivative Tensors Using Secant Updates
Quasi-Newton methods employ an update rule that gradually improves the Hessian approximation using the already available gradient evaluations. We propose higher-order secant updates which generalize this idea to higher-order derivatives, approximating, for example, third derivatives (which are tensors) from given Hessian evaluations. Our generalization is based on the observation that quasi-Newton updates are least-change updates satisfying the secant equation, with different methods using different norms to measure the size of the change. We present a full characterization of least-change updates in weighted Frobenius norms (satisfying an analogue of the secant equation) for derivatives of arbitrary order. Moreover, we establish convergence of the approximations to the true derivative under standard assumptions and explore the quality of the generated approximations in numerical experiments.
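For reference, the classical base case that the paper generalizes to higher order: the least-change symmetric update of a Hessian approximation in the unweighted Frobenius norm, subject to the secant equation $B_{+} s = y$, is the Powell symmetric Broyden (PSB) update. A minimal NumPy sketch (our own illustration, not code from the paper):

import numpy as np

def psb_update(B, s, y):
    """Powell symmetric Broyden update: the least-change symmetric
    correction of B in the Frobenius norm with B_new @ s == y."""
    s = s.reshape(-1, 1)
    r = y.reshape(-1, 1) - B @ s                  # residual of the secant equation
    ss = float(s.T @ s)
    return B + (r @ s.T + s @ r.T) / ss - float(r.T @ s) * (s @ s.T) / ss**2

# Quick check that the secant equation holds after the update.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4)); B = (B + B.T) / 2
s, y = rng.standard_normal(4), rng.standard_normal(4)
assert np.allclose(psb_update(B, s, y) @ s, y)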
@article{welzel_approximating_2024,title={Approximating {Higher}-{Order} {Derivative} {Tensors} {Using} {Secant} {Updates}},volume={34},issn={1052-6234, 1095-7189},url={https://epubs.siam.org/doi/10.1137/23M1549687},doi={10.1137/23M1549687},language={en},number={1},journal={SIAM Journal on Optimization},author={Welzel, Karl and Hauser, Raphael A.},month=mar,year={2024},pages={893--917},}