
Direct search methods: then and now

Robert Michael Lewis (a,1), Virginia Torczon (b,*,2), Michael W. Trosset (c)

(a) ICASE, Mail Stop 132C, NASA Langley Research Center, Hampton, VA 23681-2199, USA
(b) Department of Computer Science, College of William & Mary, P.O. Box 8795, Williamsburg, VA 23187-8795, USA
(c) Department of Mathematics, College of William & Mary, P.O. Box 8795, Williamsburg, VA 23187-8795, USA

Received 1 July 1999; received in revised form 23 February 2000

Abstract

We discuss direct search methods for unconstrained optimization. We give a modern perspective on this classical family of derivative-free algorithms, focusing on the development of direct search methods during their golden age from 1960 to 1971. We discuss how direct search methods are characterized by the absence of the construction of a model of the objective. We then consider a number of the classical direct search methods and discuss what research in the intervening years has uncovered about these algorithms. In particular, while the original direct search methods were consciously based on straightforward heuristics, more recent analysis has shown that in most – but not all – cases these heuristics actually suffice to ensure global convergence of at least one subsequence of the sequence of iterates to a first-order stationary point of the objective function. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Derivative-free optimization; Direct search methods; Pattern search methods

1. Introduction

Robert Hooke and T.A.
Jeeves coined the phrase "direct search" in a paper that appeared in 1961 in the Journal of the Association for Computing Machinery [12]. They provided the following description of direct search in the introduction to their paper:

* Corresponding author. E-mail addresses: bucharoo@icase.edu (R.M. Lewis), va@cs.wm.edu (V. Torczon), trosset@math.wm.edu (M.W. Trosset).
1 This research was supported by the National Aeronautics and Space Administration under NASA Contract No. NAS1-97046.
2 This research was supported by the National Science Foundation under Grant CCR-9734044 and by the National Aeronautics and Space Administration under NASA Contract No. NAS1-97046, while the author was in residence at the Institute for Computer Applications in Science and Engineering (ICASE).
0377-0427/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0377-0427(00)00423-4

We use the phrase "direct search" to describe sequential examination of trial solutions involving comparison of each trial solution with the "best" obtained up to that time together with a strategy for determining (as a function of earlier results) what the next trial solution will be. The phrase implies our preference, based on experience, for straightforward search strategies which employ no techniques of classical analysis except where there is a demonstrable advantage in doing so.

To a modern reader, this preference for avoiding techniques of classical analysis "except where there is a demonstrable advantage in doing so" quite likely sounds odd. After all, the success of quasi-Newton methods, when applicable, is now undisputed.
But consider the historical context of the remark by Hooke and Jeeves. Their paper appeared five years before what we now refer to as the Armijo-Goldstein-Wolfe conditions were introduced and used to show how the method of steepest descent could be modified to ensure global convergence [1,11,29]. The paper appeared only two years after Davidon's unpublished report on using secant updates to derive quasi-Newton methods [8], and two years before Fletcher and Powell published a similar idea in The Computer Journal [10]. So in 1961, this preference on the part of Hooke and Jeeves was not without justification.

Forty years later, the question we now ask is: why are direct search methods still in use? Surely this seemingly hodge-podge collection of methods based on heuristics, which generally appeared without any attempt at a theoretical justification, should have been superseded by more "modern" approaches to numerical optimization.

To a large extent, direct search methods have been replaced by more sophisticated techniques. As the field of numerical optimization has matured, and as software has appeared that eases the ability of consumers to make use of these more sophisticated numerical techniques, many users now routinely rely on some variant of a globalized quasi-Newton method.

Yet direct search methods persist for several good reasons. First and foremost, direct search methods have remained popular because they work well in practice.
In fact, many of the direct search methods are based on surprisingly sound heuristics; fairly recent analysis demonstrates that these heuristics guarantee global convergence behavior analogous to the results known for globalized quasi-Newton techniques. Direct search methods succeed because many of them – including the direct search method of Hooke and Jeeves – can be shown to rely on techniques of classical analysis in ways that are not readily apparent from their original specifications.

Second, quasi-Newton methods are not applicable to all nonlinear optimization problems. Direct search methods have succeeded when more elaborate approaches failed. Features unique to direct search methods often avoid the pitfalls that can plague more sophisticated approaches.

Third, direct search methods can be the methods of first recourse, even among well-informed users. The reason is simple enough: direct search methods are reasonably straightforward to implement and can be applied almost immediately to many nonlinear optimization problems. The requirements placed on a user are minimal, and the algorithms themselves require the setting of only a few parameters. It is not unusual for complex optimization problems to require further software development before quasi-Newton methods can be applied (e.g., the development of procedures to compute derivatives, or the proper choice of perturbation for finite-difference approximations to gradients). For such problems, it can make sense to begin the search for a minimizer using a direct search method with known global convergence properties, while undertaking the preparations for the quasi-Newton method.
When the preparations for the quasi-Newton method have been completed, the best known result from the direct search calculation can be used as a "hot start" for one of the quasi-Newton approaches, which enjoy superior local convergence properties. Such hybrid optimization strategies are as old as the direct search methods themselves [21].

We have three goals in this review. First, we want to outline the features of direct search that distinguish these methods from other approaches to nonlinear optimization. Understanding these features will go a long way toward explaining their continued success. Second, as part of our categorization of direct search, we suggest three basic approaches to devising direct search methods and explain what has since become known about the convergence properties of direct search methods. The heuristics that first motivated the development of these techniques have proven, with time, to embody enough structure to allow – in most instances – analysis based on now-standard techniques. We are never quite sure if the original authors appreciated just how reliable their techniques would prove to be; we would like to believe they did. In any event, we are always impressed by the new insights to be gleaned from the discussions to be found in the original papers. We enjoy the perspective of forty intervening years of optimization research. Our intent is to use this hindsight to place direct search methods on a firm standing as one of many useful classes of techniques available for solving nonlinear optimization problems.
Our discussion of direct search algorithms is by no means exhaustive, focusing on those developed during the dozen years from 1960 to 1971. Space also does not permit an exhaustive bibliography. Consequently, we apologize in advance for omitting reference to a great deal of interesting work.

2. What is "direct search"?

For simplicity, we restrict our attention in this paper to unconstrained minimization:

    minimize f(x),

where f: ℝⁿ → ℝ. We assume that f is continuously differentiable, but that information about the gradient of f is either unavailable or unreliable.

Because direct search methods neither compute nor approximate derivatives, they are often described as "derivative-free". However, as argued in [27], this description does not fully characterize what constitutes "direct search".

Historically, most approaches to optimization have appealed to a familiar "technique of classical analysis", the Taylor series expansion of the objective function. In fact, one can classify most methods for numerical optimization according to how many terms of the expansion are exploited. Newton's method, which assumes the availability of first and second derivatives and uses the second-order Taylor polynomial to construct local quadratic approximations of f, is a second-order method. Steepest descent, which assumes the availability of first derivatives and uses the first-order Taylor polynomial to construct local linear approximations of f, is a first-order method.
In this taxonomy, "zero-order methods" do not require derivative information and do not construct approximations of f. They are direct search methods, which indeed are often called zero-order methods in the engineering optimization community.

Direct search methods rely exclusively on values of the objective function, but even this property is not enough to distinguish them from other optimization methods. For example, suppose that one would like to use steepest descent, but that gradients are not available. In this case, it is customary to replace the actual gradient with an estimated gradient. If it is possible to observe exact values of the objective function, then the gradient is usually estimated by finite differencing. This is the case in numerical optimization, with which we are concerned herein. If function evaluation is uncertain, then the gradient is usually estimated by designing an appropriate experiment and performing a regression analysis. This occurs, for instance, in response surface methodology in stochastic optimization. Response surface methodology played a crucial role in the pre-history of direct search methods, a point to which we return shortly. Both approaches rely exclusively on values of the objective function, yet each is properly classified as a first-order method. What, then, is a direct search method? What exactly does it mean to say that direct search methods neither compute nor approximate derivatives?
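The contrast can be made concrete in a few lines. The sketch below is our own illustration (the names `fd_gradient` and `zero_order_compare`, and the quadratic `f`, are invented for the example): a finite-difference scheme consumes numerical function values to build an explicit first-order object, while a direct search consumes only the outcome of comparisons between function values.

```python
import numpy as np

def f(x):
    # A made-up smooth objective, used only for illustration.
    return (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 0.5) ** 2

def fd_gradient(f, x, h=1e-6):
    """First-order use of function values: n extra evaluations
    turn numerical values of f into an explicit gradient estimate."""
    x = np.asarray(x, dtype=float)
    g = np.empty_like(x)
    fx = f(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - fx) / h   # forward difference in coordinate i
    return g

def zero_order_compare(f, x, trial):
    """Zero-order use of function values: only the *rank* of the two
    values matters ('is the trial better?'), never their magnitudes."""
    return f(trial) < f(x)
```

Both routines call only f, yet the first is properly viewed as first-order and only the second is in the spirit of direct search.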
Although instructive, we believe that a taxonomy based on Taylor expansions diverts attention from the basic issue. As in [27], we prefer here to emphasize the construction of approximations, not the mechanism by which they are constructed. The optimization literature contains numerous examples of methods that do not require derivative information and that approximate the objective function without recourse to Taylor expansions. Such methods are "derivative-free", but they are not direct searches. What is the distinction?

Hooke and Jeeves considered that direct search involves the comparison of each trial solution with the best previous solution. Thus, a distinguishing characterization of direct search methods (at least in the case of unconstrained optimization) is that they do not require numerical function values: the relative rank of objective values is sufficient. That is, direct search methods for unconstrained optimization depend on the objective function only through the relative ranks of a countable set of function values. This means that direct search methods can accept new iterates that produce simple decrease in the objective. This is in contrast to the Armijo-Goldstein-Wolfe conditions for quasi-Newton line search algorithms, which require that a sufficient decrease condition be satisfied. Another consequence of this characterization of direct search is that it precludes the usual ways of approximating f, since access to numerical function values is not presumed.
There are other reasons to distinguish direct search methods within the larger class of derivative-free methods. We have already remarked that response surface methodology constructs local approximations of f by regression. Response surface methodology was proposed in 1951, in the seminal paper [4], as a variant of steepest descent (actually steepest ascent, since the authors were maximizing). In 1957, concerned with the problem of improving industrial processes and the shortage of technical personnel, Box [3] outlined a less sophisticated procedure called evolutionary operation. Response surface methodology relied on esoteric experimental designs, regression, and steepest ascent; evolutionary operation relied on simple designs and the direct comparison of observed function values. Spendley et al. [21] subsequently observed that the designs in [3] could be replaced with simplex designs and suggested that evolutionary operation could be automated and used for numerical optimization. As discussed in Section 3.2, their algorithm is still in use and is the progenitor of the simplex algorithm of Nelder and Mead [17], the most famous of all direct search methods. Thus the distinction that G.E.P. Box drew in the 1950s, between response surface methodology and evolutionary operation, between approximating f and comparing values of f, played a crucial role in the development of direct search methods.
We organize the popular direct search methods for unconstrained minimization into three basic categories. For a variety of reasons, we focus on the classical direct search methods, those developed during the period 1960-1971. The restriction is part practical, part historical.

On the practical side, we will make the distinction between pattern search methods, simplex methods (and here we do not mean the simplex method for linear programming), and methods with adaptive sets of search directions. The direct search methods that one finds described most often in texts can be partitioned relatively neatly into these three categories. Furthermore, the early developments in direct search methods more or less set the stage for subsequent algorithmic developments. While a wealth of variants on these three basic approaches to designing direct search methods have appeared in subsequent years – largely in the applications literature – these newer methods are modifications of the basic principles behind each of the three approaches; it is a relatively straightforward matter to devise variations on these three themes.

There are also historical reasons for restricting our attention to the algorithmic developments of the 1960s. Throughout those years, direct search methods enjoyed attention in the numerical optimization community. The algorithms proposed were then (and are now) of considerable practical importance.
As their discipline matured, however, numerical optimizers became less interested in heuristics and more interested in formal theories of convergence. At a joint IMA/NPL conference that took place at the National Physical Laboratory in England in January 1971, Swann [23] surveyed the status of direct search methods and concluded with this apologia:

Although the methods described above have been developed heuristically and no proofs of convergence have been derived for them, in practice they have generally proved to be robust and reliable in that only rarely do they fail to locate at least a local minimum of a given function, although sometimes the rate of convergence can be very slow.

Swann's remarks express an unfortunate perception that would dominate the research community for years to come: that whatever successes they enjoy in practice, direct search methods are theoretically suspect. Ironically, in the same year as Swann's survey, convergence results for direct search methods began to appear, though they seem not to have been widely known, as we discuss shortly. Only recently, in the late 1990s, as computational experience has evolved and further analysis has been developed, has this perception changed [30].

3.1.
Pattern search

In his belated preface for ANL 5990 [8], Davidon described one of the most basic of pattern search algorithms, one so simple that it goes without attribution:

Enrico Fermi and Nicholas Metropolis used one of the first digital computers, the Los Alamos Maniac, to determine which values of certain theoretical parameters (phase shifts) best fit experimental data (scattering cross sections). They varied one theoretical parameter at a time by steps of the same magnitude, and when no such increase or decrease in any one parameter further improved the fit to the experimental data, they halved the step size and repeated the process until the steps were deemed sufficiently small. Their simple procedure was slow but sure, ...

Pattern search methods are characterized by a series of exploratory moves that consider the behavior of the objective function at a pattern of points, all of which lie on a rational lattice. In the example described above, the unit coordinate vectors form a basis for the lattice, and the current magnitude of the steps (it is convenient to refer to this quantity as Δ_k) dictates the resolution of the lattice. The exploratory moves consist of a systematic strategy for visiting the points in the lattice in the immediate vicinity of the current iterate.
It is instructive to note several features of the procedure used by Fermi and Metropolis. First, it does not model the underlying objective function. Each time that a parameter was varied, the scientists asked: was there improvement in the fit to the experimental data? A simple "yes" or "no" answer determined which move would be made. Thus, the procedure is a direct search. Second, the parameters were varied by steps of predetermined magnitude. When the step size was reduced, it was multiplied by one-half, thereby ensuring that all iterates remained on a rational lattice. This is the key feature that makes this direct search a pattern search. Third, the step size was reduced only when no increase or decrease in any one parameter further improved the fit, thus ensuring that the step sizes were not decreased prematurely. This feature is another part of the formal definition of pattern search in [26] and is crucial to the convergence analysis presented therein.

3.1.1. Early analysis

By 1971, a proof of global convergence for this simple algorithm existed in the optimization text [18], where the technique goes by the name of the method of local variations. Specifically, Polak proved the following result:

Theorem 3.1. If {x_k} is a sequence constructed by the method of local variations, then any accumulation point x̂ of {x_k} satisfies ∇f(x̂) = 0. (By assumption, f is at least once continuously differentiable.)
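The procedure of Fermi and Metropolis quoted above (vary one parameter at a time; halve the step only when no single-coordinate move improves the objective) can be sketched in a few lines. This is our own illustrative rendering under assumed names and defaults (`coordinate_search`, `delta`, `tol`), not code from any of the papers cited:

```python
import numpy as np

def coordinate_search(f, x0, delta=1.0, tol=1e-6, max_iter=10000):
    """Minimal sketch of the Fermi-Metropolis procedure (the 'method of
    local variations'): all iterates stay on the lattice x0 + (delta/2^m)*Zⁿ."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for i in range(x.size):            # vary one parameter at a time
            for sign in (+1.0, -1.0):
                trial = x.copy()
                trial[i] += sign * delta
                ft = f(trial)
                if ft < fx:                # simple decrease is enough
                    x, fx = trial, ft
                    improved = True
        if not improved:                   # f(x) <= f(x +/- delta*e_i) for all i
            if delta <= tol:               # steps deemed sufficiently small
                break
            delta /= 2.0                   # halving keeps iterates on a rational lattice
    return x, fx
```

Note that the step is halved only after every coordinate move has failed, which is exactly the safeguard against premature step reduction discussed in the third feature above.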
Polak's result is as strong as any of the contemporaneous global convergence results for either steepest descent or a globalized quasi-Newton method. However, to establish global convergence for these latter methods, one must enforce either sufficient decrease conditions (the Armijo-Goldstein-Wolfe conditions) or a fraction of Cauchy decrease condition – all of which rely on explicit numerical function values, as well as explicit approximations to the directional derivative at the current iterate. What is remarkable is that we have neither for direct search methods, yet can prove convergence.

What Polak clearly realized, though his proof does not make explicit use of this fact, is that all of the iterates for the method of local variations lie on a rational lattice (one glance at the figure on p. 43 of his text confirms his insight). The effect, as he notes, is that the method can construct only a finite number of intermediate points before reducing the step size by one-half. Thus the algorithm "cannot jam up at a point" – precisely the pathology of premature convergence that the Armijo-Goldstein-Wolfe conditions are designed to preclude.

Polak was not alone in recognizing that pattern search methods contain sufficient structure to support a global convergence result. In the same year, Céa also published an optimization text [7] in which he provided a proof of global convergence for the pattern search algorithm of Hooke and Jeeves [12].
The assumptions used to establish convergence were stronger (in addition to the assumption that f ∈ C¹, it is assumed that f is strictly convex and that f(x) → +∞ as ‖x‖ → ∞). Nevertheless, it is established that the sequence of iterates produced by the method of Hooke and Jeeves converges to the unique minimizer of f – again with an algorithm that has no explicit recourse to the directional derivative and for which ranking information is sufficient.

Both Polak's and Céa's results rely on the fact that when either of these two algorithms reaches the stage where the decision is made to reduce Δ_k, which controls the length of the steps, sufficient information about the local behavior of the objective has been acquired to ensure that the reduction is not premature. Specifically, neither the method of local variations nor the pattern search algorithm of Hooke and Jeeves allows Δ_k to be reduced until it has been verified that

    f(x_k) ≤ f(x_k ± Δ_k e_i), i ∈ {1, …, n},

where e_i denotes the ith unit coordinate vector. This plays a critical role in both analyses. As long as x_k is not a stationary point of f, at least one of the 2n directions defined by ±e_i, i ∈ {1, …, n}, must be a direction of descent. Thus, once Δ_k is sufficiently small, we are guaranteed that either f(x_k + Δ_k e_i) < f(x_k) or f(x_k − Δ_k e_i) < f(x_k) for at least one i ∈ {1, …, n}.

The other early analysis worth noting is that of Berman [2].
In light of later developments, Berman's work is interesting precisely because he realized that if he made explicit use of a rational lattice structure, he could construct algorithms that produce minimizers of continuous nonlinear functions that might not be differentiable. For example, if f is continuous and strongly unimodal, he argues that convergence to a minimizer is guaranteed.

In the algorithms formulated and analyzed by Berman, the rational lattice plays an explicit role. The lattice L determined by x_0 (the initial iterate) and Δ_0 (the initial resolution of the lattice) is defined by L(x_0, Δ_0) = {x : x = x_0 + Δ_0 ζ, ζ ∈ Λ}, where Λ is the lattice of integral points of ℝⁿ. Particularly important is the fact that the lattices used successively to approximate the minimizer have the following property: if L_k = L(x_k, Δ_k), where Δ_{k+1} = Δ_k/ν and ν > 1 denotes a positive integer, then L_k ⊆ L_{k+1}. The important ramification of this fact is that {x_0, x_1, x_2, …, x_{k+1}} ⊂ L_{k+1} for any choice of k, thus ensuring the finiteness property to which Polak alludes, and which also plays an important role in the more recent analysis for pattern search.

Before moving on to the more recent results, however, we close with some observations about this early work. First, it is with no small degree of irony that we note that all three results [2,7,18] are contemporaneous with Swann's remark that no proofs of convergence had been derived for direct search methods.
However, each of these results was developed in isolation. None of the three authors appears to have been aware of the work of the others; none of the works contains citations of the other two, and there is nothing in the discussion surrounding each result to suggest that any one of the authors was aware of the more-or-less simultaneous developments by the other two. Furthermore, these results have passed largely unknown and unreferenced in the nonlinear optimization literature. They have not become part of the "common wisdom", and so it was not unusual, until quite recently, to still hear claims that direct search methods had "been developed heuristically and no proofs of convergence have been derived for them".

Yet all the critical pieces needed for a more general convergence theory of pattern search had been identified by 1971. The work of Polak and Céa was more modest in scope, in that each was proving convergence for a single, extant algorithm already widely in use. Berman's work was more ambitious in scope, in that he was defining a general principle with the intent of deriving any number of new algorithms tailored to particular assumptions about the problems to be solved. What remained to be realized was that all this work could be unified under one analysis – and generalized even further to allow additional algorithmic variations.

3.1.2. Recent analysis

Recently, a general theory for pattern search [26] extended a global convergence analysis [25] of the multidirectional search algorithm [24]. Like the simplex algorithms of Section 3.2, multidirectional search proceeds by reflecting a simplex (n+1 points in ℝⁿ) through the centroid of one of the faces.
However, unlike the simplex methods discussed in Section 3.2, multidirectional search is also a pattern search.

In fact, the essential ingredients of the general theory had already been identified in [2,7,18]. First, the pattern of points from which one selects trial points at which to evaluate the objective function must be sufficiently rich to ensure at least one direction of descent if x_k is not a stationary point of f. For Céa and Polak, this meant a pattern that included points of the form x_k' = x_k ± Δ_k e_i, i ∈ {1, …, n}, where the e_i are the unit coordinate vectors. For Berman, it meant requiring the lattice of integral points of ℝⁿ, i.e., requiring that the basis for the lattice be the identity matrix I ∈ ℝ^{n×n}.

In [26], these conditions were relaxed to allow any nonsingular matrix B ∈ ℝ^{n×n} to be the basis for the lattice. In fact, we can allow patterns of the form x_k' = x_k + Δ_k B ζ_k, where ζ_k is an integral vector, so that the direction of the step is determined by forming an integral combination of the columns of B. The special cases studied by Céa and Polak are easily recovered by choosing B = I and ζ_k ∈ {±e_i, i ∈ {1, …, n}}.

Second, an essential ingredient of each of the analyses is the requirement that Δ_k not be reduced if the objective function can be decreased by moving to one of the x_k'. Generalizations of this requirement were considered in [26,15]. This restriction acts to prevent premature convergence to a nonstationary point.
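Putting these two ingredients together, a pattern search step with a general lattice basis can be sketched as follows. This is our own illustration in the spirit of the framework just described, not code from [26]; the function name `pattern_search` and its defaults are assumptions. Trial steps are the integral combinations ±Δ_k B e_i of the columns of a nonsingular B, and Δ_k is reduced (here by the conventional factor of two) only after every trial point has failed to produce simple decrease:

```python
import numpy as np

def pattern_search(f, x0, B=None, delta=1.0, tol=1e-6, max_iter=10000):
    """Sketch of a generalized pattern search: poll the 2n trial points
    x + sign * delta * B e_i; reduce delta only when the whole poll fails."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    if B is None:
        B = np.eye(n)                  # Cea/Polak special case: B = I
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for i in range(n):
            for sign in (+1.0, -1.0):
                trial = x + sign * delta * B[:, i]   # integral combination of columns of B
                ft = f(trial)
                if ft < fx:            # accept the first simple decrease
                    x, fx, improved = trial, ft, True
                    break
            if improved:
                break
        if not improved:               # f(x) <= f(x +/- delta * B e_i) for all i:
            if delta <= tol:           # only now may delta be reduced
                break
            delta /= 2.0
    return x, fx
```

Any nonsingular B gives the same convergence guarantees in theory; in this sketch a non-identity B simply polls along a different fixed set of lattice directions.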
最后，我们限制 Δ_k 重新调节的方式。Céa和Polak使用的传统选择是将 Δ_k 减半，因此 Δ_k = Δ_0/2^k。稍微更一般地，Berman允许除以任意整数 θ > 1，因此（比如）可以有 Δ_k = Δ_0/3^k。事实上，进一步的一般化也是可能的。对于 Δ_{k+1}，我们允许 Δ_{k+1} = θ^w Δ_k，这里 w 是指定有限集中的任意整数。那么，有三种可能：
Finally, we restrict the manner by which Δ_k is rescaled. The conventional choice, used by both Céa and Polak, is to divide Δ_k by two, so that Δ_k = Δ_0/2^k. Somewhat more generally, Berman allowed dividing by any integer θ > 1, so that (for example) one could have Δ_k = Δ_0/3^k. In fact even greater generality is possible. For Δ_{k+1}, we allow Δ_{k+1} = θ^w Δ_k, where w is any integer in a designated finite set. Then there are three possibilities:
1. w<0. 这会减小 Δ_k，仅在特定条件下允许（见上文）。当允许时，L_k ⊆ L_{k+1}，即Berman考虑的关系。
2. w=0. 这时 Δ_k 不变，因此 L_{k+1} = L_k。
3. w>0. 这会增大 Δ_k，因此 L_{k+1} ⊆ L_k。
1. w<0. This decreases Δ_k, which is only permitted under certain conditions (see above). When it is permitted, then L_k ⊆ L_{k+1}, the relation considered by Berman.
2. w=0. This leaves Δ_k unchanged, so that L_{k+1} = L_k.
3. w>0. This increases Δ_k, so that L_{k+1} ⊆ L_k.
结果表明，重要的不是 L_{k+1} 与 L_k 的关系，而是要保证存在单个网格 L_i ∈ {L_0, L_1, …, L_k, L_{k+1}}，使得对所有 j = 0,…,k+1 都有 L_j ⊆ L_i。这意味着 {x_0,…,x_k} ⊂ L_i，这又在收敛性分析中起到关键作用。
It turns out that what matters is not the relation of L_{k+1} to L_k, but the assurance that there exists a single lattice L_i ∈ {L_0, L_1, …, L_k, L_{k+1}} for which L_j ⊆ L_i for all j = 0,…,k+1. This implies that {x_0,…,x_k} ⊂ L_i, which in turn plays a crucial role in the convergence analysis.
利用我们刚才确定的本质要素，可以得出全局收敛的一般理论。下面的结果表明，叠代序列至少有一个子序列收敛到目标函数的稳定点。
Exploiting the essential ingredients that we have identified, one can derive a general theory of global convergence. The following result says that at least one subsequence of iterates converges to a stationary point of the objective function.
定理 3.2. 假设 L(x_0) = {x | f(x) ≤ f(x_0)} 是紧的，并且 f 在 L(x_0) 的一个邻域中连续可微。则对于一般化模式搜索算法产生的叠代序列 {x_k}，有 lim inf_{k→+∞} ‖∇f(x_k)‖ = 0。
在稍强一点的假设下，可以证明 {x_k} 的每一个极限点都是 f 的稳定点，这推广了Polak的收敛结果。详细的分析可在[26,15]中找到；[14]给出了对基本论证的说明性讨论。
Theorem 3.2. Assume that L(x_0) = {x | f(x) ≤ f(x_0)} is compact and that f is continuously differentiable on a neighborhood of L(x_0).
Then for the sequence of iterates {x_k} produced by a generalized pattern search algorithm, lim inf_{k→+∞} ‖∇f(x_k)‖ = 0.
Under only slightly stronger hypotheses, one can show that every limit point of {x_k} is a stationary point of f, generalizing Polak's convergence result. Details of the analysis can be found in [26,15]; [14] provides an expository discussion of the basic argument.
3.2. 单纯形搜索
3.2. Simplex search
单纯形搜索法的特点是用一个简单的装置来指导搜索。
Simplex search methods are characterized by the simple device that they use to guide the search.
第一个单纯形方法是Spendley et al. [21]在1962年发表的一篇论文中提出的。促使他们研究的事实是：早期的直接搜索法需要2n到4n次目标函数计算才能完成对叠代点改进的搜索。他们的观察是：确定一个下降（或上升）方向应该不需要多于 n+1 个目标函数值。这是有道理的，因为 f(x) 图像上的 n+1 个点确定一个平面，而用有限差分估计 ∇f(x) 也需要 n+1 个 f(x) 的值。同时，n+1 个点确定一个单纯形。这引出了单纯形搜索的基本思想：在 ℝⁿ 中构造一个非退化单纯形，并用这个单纯形驱动搜索。
The first of the simplex methods is due to Spendley et al. [21] in a paper that appeared in 1962. They were motivated by the fact that earlier direct search methods required anywhere from 2n to 4n objective evaluations to complete the search for improvement on the iterate. Their observation was that it should take no more than n+1 values of the objective to identify a downhill (or uphill) direction. This makes sense, since n+1 points in the graph of f(x) determine a plane, and n+1 values of f(x) would be needed to estimate ∇f(x) via finite differences. At the same time, n+1 points determine a simplex. This leads to the basic idea of simplex search: construct a nondegenerate simplex in ℝⁿ and use the simplex to drive the search.
一个单纯形是 ℝⁿ 中 n+1 个点的集合。因此在 ℝ² 中是一个三角形，在 ℝ³ 中是一个四面体，等等。一个非退化单纯形是这样的单纯形：与其任意一个顶点相邻的边的集合构成整个空间的基。
A simplex is a set of n+1 points in ℝⁿ. Thus one has a triangle in ℝ², a tetrahedron in ℝ³, etc. A nondegenerate simplex is one for which the set of edges adjacent to any vertex in the simplex forms a basis for the space.
In other words, we want to be sure that any point in the domain of the search can be constructed by taking linear combinations of the edges adjacent to any given vertex.
单纯形不仅为空间采样提供了一个节省的设计，它还有一个附加特性：如果把某个顶点替换为它关于对面质心的反射点，得到的仍然是一个单纯形，见图1。这同样是一个节省的特性，因为这意味着在搜索最优解时可以每次只反射一个顶点，从而节俭地推进。
Not only does the simplex provide a frugal design for sampling the space, it has the added feature that if one replaces a vertex by reflecting it through the centroid of the opposite face, then the result is also a simplex, as shown in Fig. 1. This, too, is a frugal feature because it means that one can proceed parsimoniously, reflecting one vertex at a time, in the search for an optimizer.
Fig. 1. The original simplex, the reflection of one vertex through the centroid of the opposite face, and the resulting reflection simplex.
图1. 原始单纯形，一个顶点关于对面质心的反射，以及得到的反射单纯形。
一旦初始单纯形构造好，原始的Spendley et al.单纯形算法规定的唯一移动就是反射。这个移动首先确定单纯形中的"最差"顶点（即目标值最不理想的顶点），然后把这个最差顶点关于对面的质心作反射。如果反射后的顶点仍然是最差的，则接着选择"次差"顶点并重复这个过程。（快速回顾一下图1可以确认：如果反射点不比次差点好，那么当"最差"顶点再次被选中作反射时，它会被简单地反射回原来的位置，从而形成一个无限循环。）
Once an initial simplex is constructed, the single move specified in the original Spendley et al. simplex algorithm is that of reflection. This move first identifies the "worst" vertex in the simplex (i.e., the one with the least desirable objective value) and then reflects the worst vertex through the centroid of the opposite face. If the reflected vertex is still the worst vertex, then next choose the "next worst" vertex and repeat the process. (A quick review of Fig. 1 should confirm that if the reflected vertex is not better than the next worst vertex, then if the "worst" vertex is once again chosen for reflection, it will simply be reflected back to where it started, thus creating an infinite cycle.)
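The reflection move itself is a one-line computation. Here is a small sketch (ours, not from the paper) of reflecting the worst vertex of a simplex through the centroid of the opposite face:

```python
import numpy as np

def reflect_worst(simplex, fvals):
    """Reflect the worst vertex (largest f, for minimization) through
    the centroid of the face formed by the remaining n vertices.
    simplex is an (n+1) x n array of vertices."""
    worst = int(np.argmax(fvals))
    centroid = (simplex.sum(axis=0) - simplex[worst]) / (len(simplex) - 1)
    reflected = 2.0 * centroid - simplex[worst]   # r = c + (c - x_worst)
    return worst, reflected

# Reflecting the worst vertex of a triangle for f(x) = ||x||^2:
f = lambda v: float(v @ v)
tri = np.array([[2.0, 2.0], [1.0, 0.0], [0.0, 1.0]])
w, r = reflect_worst(tri, [f(v) for v in tri])   # reflects (2, 2) to (-1, -1)
```

As the text notes, the reflected simplex is again nondegenerate, so the move can be repeated one vertex at a time.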
最终的目标或者是取代"最好"顶点（即目标值最理想的顶点），或者是确认最好顶点是最优解的一个候选。在此之前，算法一直通过把某个顶点（而不是最好顶点）关于对面质心翻转来移动单纯形。
The ultimate goals are either to replace the "best" vertex (i.e., the one with the most desirable objective value) or to ascertain that the best vertex is a candidate for a minimizer. Until then, the algorithm keeps moving the simplex by flipping some vertex (other than the best vertex) through the centroid of the opposite face.
基本的启发式思想极其直接：我们把一个"较差"顶点沿其余顶点的大致方向（以其余顶点的质心为代表）移动，期望最终能改进最好顶点处的目标值。于是问题变成：我们是否有了最优解的新候选？我们是否已处于或接近最优解？
The basic heuristic is straightforward in the extreme: we move a "worse" vertex in the general direction of the remaining vertices (as represented by the centroid of the remaining vertices), with the expectation of eventual improvement in the value of the objective at the best vertex. The questions then become: do we have a new candidate for a minimizer and are we at or near a minimizer?
第一个问题很容易回答。当一个反射顶点使最好顶点处的目标值严格减少时，我们就有了最优解的一个新候选；这里起作用的又是简单减少规则。
The first question is easy to answer. When a reflected vertex produces strict decrease on the value of the objective at the best vertex, we have a new candidate for a minimizer; once again the simple decrease rule is in effect.
第二个问题的答案就不那么明确了。在原始论文中，Spendley、Hext和Himsworth用二维例子演示了一个单纯形的循环序列，它可以被解释为表明最优解的邻域已被确定。我们在图2中看到一个相似的例子：5次反射使搜索回到开始的地方，且没有取代 x_k，因此提示 x_k 可能在一个稳定点的邻域内。
The answer to the second question is decidedly more ambiguous. In the original paper, Spendley, Hext, and Himsworth illustrate – in two dimensions – a circling sequence of simplices that could be interpreted as indicating that the neighborhood of a minimizer has been identified. We see a similar example in Fig. 2, where a sequence of five reflections brings the search back to where it started, without replacing x_k, thus suggesting that x_k may be in the neighborhood of a stationary point.
图2. 一个反射序列 {r_1, r_2, r_3, r_4, r_5}，每一次都没有成功取代最好顶点 x_k，最终回到这个序列开始时的单纯形。
Fig. 2.
A sequence of reflections {r_1, r_2, r_3, r_4, r_5}, each of which fails to replace the best vertex x_k, which brings the search back to the simplex from which this sequence started.
二维的图示多少有些误导性，因为第五次反射正好映射回原始单纯形的最差顶点——这种情况只在一维或二维中发生。因此Spendley、Hext和Himsworth给出了一个启发式公式，用于判断单纯形何时已经绕当前最好顶点翻转了足够长的时间，从而可以断定最优解的邻域已被确定。当检测到这种情况时，他们建议两种选择：或者缩短与"最好"顶点相邻的边并继续搜索，或者改用更高阶的方法以获得更快的局部收敛。
The picture in two dimensions is somewhat misleading since the fifth reflection maps back onto the worst vertex in the original simplex – a situation that only occurs in either one or two dimensions. So Spendley, Hext, and Himsworth give a heuristic formula for when the simplex has flipped around the current best vertex long enough to conclude that the neighborhood of a minimizer has been identified. When this situation has been detected, they suggest two alternatives: either reduce the lengths of the edges adjacent to the "best" vertex and resume the search, or resort to a higher-order method to obtain faster local convergence.
Nelder和Mead [17]的贡献是通过增加旨在加速搜索的附加移动，把单纯形搜索变成一个优化算法。特别地，人们当时已经很清楚，无论维数是多少，反射移动都保持单纯形的原始形状。Nelder和Mead的提议是，在基本反射移动之外补充一些附加选项，通过使单纯形变形来加速搜索，他们认为这种变形能更好地适应目标函数的特征。为此，他们增加了所谓的扩张和收缩移动，见图3。
The contribution of Nelder and Mead [17] was to turn simplex search into an optimization algorithm with additional moves designed to accelerate the search. In particular, it was already well understood that the reflection move preserved the original shape of the simplex – regardless of the dimension. What Nelder and Mead proposed was to supplement the basic reflection move with additional options designed to accelerate the search by deforming the simplex in a way that they suggested would better adapt to the features of the objective function. To this end, they added what are known as expansion and contraction moves, as shown in Fig. 3.
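For orientation, the candidate points Nelder and Mead consider for replacing the worst vertex can be written down directly. This sketch is our illustration, using the standard coefficients (reflection 1, expansion 2, contraction 1/2):

```python
import numpy as np

def nelder_mead_candidates(simplex, worst):
    """Candidate replacements for the worst vertex in Nelder-Mead:
    the reflection point, the expansion point (double the step from
    the centroid), and the two contractions (half the step, toward
    the reflection point or back toward the worst vertex)."""
    c = (simplex.sum(axis=0) - simplex[worst]) / (len(simplex) - 1)
    step = c - simplex[worst]
    return (c + step,          # reflection
            c + 2.0 * step,    # expansion
            c + 0.5 * step,    # outside contraction
            c - 0.5 * step)    # inside contraction

tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
refl, expa, c_out, c_in = nelder_mead_candidates(tri, worst=0)
```

Which candidate is accepted depends on how its objective value compares with those at the other vertices; see [13] for the full acceptance logic, including the shrink toward the best vertex when every candidate fails.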
我们把算法逻辑的完整细节留给其它文献；一个特别清晰细致的、使用现代算法记号的描述可以在[13]中找到。对我们而言，重要的是注意到：扩张步把从质心到反射点的步长加倍，从而允许更激进的移动；而收缩步把从质心到反射点或到最差顶点的步长减半，从而允许更保守的移动。更进一步，除了允许在单次叠代内做这些调整之外，这些新的可能性还会影响将来的叠代，因为它们使原始单纯形的形状变形（或者按其设计思路来说，是自适应）。
We leave the full details of the logic of the algorithm to others; a particularly clear and careful description, using modern algorithmic notation, can be found in [13]. For our purposes, what is important to note is that the expansion step allows for a more aggressive move by doubling the length of the step from the centroid to the reflection point, whereas the contraction steps allow for more conservative moves by halving the length of the step from the centroid to either the reflection point or the worst vertex. Furthermore, in addition to allowing these adaptations within a single iteration, these new possibilities have repercussions for future iterations as they deform (or, as the rationale goes, adapt) the shape of the original simplex.
Nelder和Mead还解决了当所有尝试的步都没有带来可接受的改进时该怎么办的问题，办法是增加一个收缩（shrink）步：当其它一切都失败时，把与当前最好顶点相邻的各边长度减半，这同样在图3中示出。
Nelder and Mead also resolved the question of what to do if none of the steps tried bring acceptable improvement by adding a shrink step: when all else fails, reduce the lengths of the edges adjacent to the current best vertex by half, as is also illustrated in Fig. 3.
Nelder-Mead单纯形算法一直广受欢迎。在所有直接搜索法中，Nelder-Mead单纯形算法是数值软件包中最常见的。Nelder和Mead的原始论文是科学引文索引的经典，在从Acta Anaesthesiologica Scandinavica到Zhurnal Fizicheskoi Khimii的各种科学期刊中被引用了数千次。事实上，化学工程界有一整本书[28]专门介绍用于优化的单纯形搜索。
The Nelder-Mead simplex algorithm has enjoyed enduring popularity. Of all the direct search methods, the Nelder-Mead simplex algorithm is the one most often found in numerical software packages. The original paper by Nelder and Mead is a Science Citation Index classic, with several thousand references across the scientific literature in journals ranging from Acta Anaesthesiologica
Fig. 3.
The original simplex, with the reflection, expansion, and two possible contraction simplices, along with the shrink step toward the best vertex x_k, when all else fails.
图3. 原始单纯形，带有反射、扩张和两种可能的收缩单纯形，以及当其它一切都失败时向最好顶点 x_k 收缩的步骤。
Scandinavica to Zhurnal Fizicheskoi Khimii. In fact, there is an entire book from the chemical engineering community devoted to simplex search for optimization [28].
但为什么还要费心继续研究呢？如果要采用直接搜索法，为什么不只依赖Nelder-Mead单纯形法呢？答案是：关于Nelder-Mead单纯形法稳健性的悬而未决的问题长期困扰着数值优化工作者。该方法起作用时，确实能工作得很好，常常只需远少于其它直接搜索法的目标函数计算次数就能找到解。但它也可能失败。在应用文献中相当早就能看到这一点，通常只被报告为"缓慢"收敛。一项把Nelder-Mead应用于一套标准优化测试问题的系统研究[24]还报告了偶尔收敛到函数的非稳定点的情况。一个一致的观察是：在这些情形中，单纯形的变形使得搜索方向（即沿最差顶点指向其余顶点质心的方向）在数值上与梯度正交。
So why bother with looking any further? Why not rely exclusively on the Nelder-Mead simplex method if one is going to employ a direct search method? The answer: there is the outstanding question regarding the robustness of the Nelder-Mead simplex method that has long troubled numerical optimizers. When the method works, it can work very well indeed, often finding a solution in far fewer evaluations of the objective function than other direct search methods. But it can also fail. One can see this in the applications literature, fairly early on, frequently reported as no more than "slow" convergence. A systematic study of Nelder-Mead, when applied to a suite of standard optimization test problems, also reported occasional convergence to a nonstationary point of the function [24]. The one consistent observation to be made was that in these instances the deformation of the simplex meant that the search direction (i.e., the direction defined along the worst vertex toward the centroid of the remaining vertices) became numerically orthogonal to the gradient.
这些关于Nelder-Mead实际表现的观察在相对近期引发了两项研究。第一项[13]致力于研究关于Nelder-Mead的渐近行为能够证明些什么。结果表明在 ℝ¹ 中算法是稳健的：在标准假设下可以保证收敛到稳定点。高维中的一些一般性质也可以被证明，但没有一个能保证高维问题的全局收敛。
These observations about the behavior of Nelder-Mead in practice led to two, relatively recent, investigations.
The first [13] strives to investigate what can be proven about the asymptotic behavior of Nelder-Mead. The results show that in ℝ¹, the algorithm is robust; under standard assumptions, convergence to a stationary point is guaranteed. Some general properties in higher dimensions can also be proven, but none that guarantee global convergence for problems in higher dimensions.
鉴于McKinnon最近的第二项结果[16]，这并不令人惊讶。他用几个例子说明，对Nelder-Mead证明全局收敛存在根本限制：该算法在二维的光滑（C²）凸目标函数上也可能失败。
This is not surprising in light of a second recent result by McKinnon [16]. He shows with several examples that limits exist on proving global convergence for Nelder-Mead: to wit, the algorithm can fail on smooth (C²) convex objectives in two dimensions.
这使我们处于一个不能令人满意的境地：尽管Spendley et al.和Nelder-Mead的单纯形方法是最流行、使用最广的两种直接搜索法，但它们都没有一般的收敛性结果。进一步，McKinnon的例子表明，证明Nelder-Mead单纯形算法在更高维的全局收敛是不可能的。另一方面，导致McKinnon反例失败的机制看起来与Nelder-Mead在实践中典型失败的机制并不相同。这使得Nelder-Mead在实践中为何失败的问题仍未解决。
This leaves us in the unsatisfactory situation of reporting that no general convergence results exist for the simplex methods of either Spendley et al. or Nelder and Mead – despite the fact that they are two of the most popular and widely used of the direct search methods. Further, McKinnon's examples indicate that it will not be possible to prove global convergence for the Nelder-Mead simplex algorithm in higher dimensions. On the other hand, the mechanism that leads to failure in McKinnon's counterexample does not seem to be the mechanism by which Nelder-Mead typically fails in practice. This leaves the question of why Nelder-Mead fails in practice unresolved.
3.3. 搜索方向自适应集的方法
3.3. Methods with adaptive sets of search directions
我们考虑的最后一族经典方法包括Rosenbrock的方法和Powell的方法。这些算法试图通过构造利用搜索过程中获得的目标曲率信息的方向来加速搜索。
The last family of classical methods we consider includes Rosenbrock's and Powell's methods.
These algorithms attempt to accelerate the search by constructing directions designed to use information about the curvature of the objective obtained during the course of the search.
3.3.1. Rosenbrock的方法
3.3.1. Rosenbrock's method
这些方法中的第一个出自Rosenbrock [20]。Rosenbrock的方法是相当有意识地为应对其著名的"香蕉函数"的特殊性质而设计的，该函数的最小值位于一条狭窄的弯曲山谷之内。Rosenbrock的方法按一系列阶段进行，每个阶段由沿一组方向的若干次探索性搜索组成；这组方向在给定阶段内固定不变，但在阶段之间更新，以利用已获得的关于目标的信息。
Of these methods, the first was due to Rosenbrock [20]. Rosenbrock's method was quite consciously derived to cope with the peculiar features of Rosenbrock's famous "banana function", the minimizer of which lies inside a narrow, curved valley. Rosenbrock's method proceeds by a series of stages, each of which consists of a number of exploratory searches along a set of directions that are fixed for the given stage, but which are updated from stage to stage to make use of information acquired about the objective.
Rosenbrock方法的初始阶段以坐标方向作为搜索方向。然后它沿这些方向进行搜索，依次轮流遍历每个方向，在步成功时移动到新的叠代点（不成功的步是指导致目标值变差的步）。这一直持续到每个搜索方向上至少各有一次成功步和一次不成功步为止。一旦如此，当前阶段就结束了。正如直接搜索法的一贯特点，这个过程并不需要目标的具体数值。只要认为某一步处的目标比当前最好点处的目标有所改进，我们就移动到新点。
The initial stage of Rosenbrock's method begins with the coordinate directions as the search directions. It then conducts searches along these directions, cycling over each in turn, moving to new iterates that yield successful steps (an unsuccessful step being one that leads to a less desirable value of the objective). This continues until there has been at least one successful and one unsuccessful step in each search direction. Once this occurs, the current stage terminates. As is the case for direct search methods, numerical values of the objective are not necessary in this process. If the objective at any of these steps is perceived as being an improvement over the objective at the current best point, we move to the new point.
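One exploratory stage of this kind can be sketched as follows. This is our illustration: the step multipliers α = 3 after a success and β = −1/2 after a failure are the commonly cited choices for Rosenbrock's scheme, assumed here rather than taken from the text, and a step that does not worsen the objective counts as a success.

```python
import numpy as np

def rosenbrock_stage(f, x, dirs, steps, alpha=3.0, beta=-0.5):
    """One Rosenbrock-style exploratory stage: cycle over the fixed
    direction set; after a successful step (objective not worsened)
    keep the point and expand that direction's step by alpha; after a
    failure, reverse and shrink it by beta.  The stage ends once every
    direction has seen at least one success and one failure."""
    steps = list(steps)                 # work on a copy
    fx = f(x)
    success = [False] * len(dirs)
    failure = [False] * len(dirs)
    while not all(s and t for s, t in zip(success, failure)):
        for i, d in enumerate(dirs):
            trial = x + steps[i] * d
            ft = f(trial)
            if ft <= fx:                # success: non-worsening step
                x, fx = trial, ft
                steps[i] *= alpha
                success[i] = True
            else:                       # failure: reverse and shrink
                steps[i] *= beta
                failure[i] = True
    return x, fx

x, fx = rosenbrock_stage(lambda v: float(v @ v), np.array([1.0, 1.0]),
                         [np.array([1.0, 0.0]), np.array([0.0, 1.0])],
                         [0.1, 0.1])
```

At the end of such a stage, the direction of total progress is what Rosenbrock feeds into the Gram-Schmidt update that builds the next stage's orthogonal direction set.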
在下一阶段，Rosenbrock不是像局部变分法那样用同一组正交向量重复搜索过程，而是旋转方向集，以捕捉在先前移动过程中查明的关于目标的信息。具体地说，他利用了这样一个事实：从前一阶段开始时的叠代点到新阶段开始时的叠代点的非零步提示了一个好的下降方向——或者至少是一个有希望的方向——所以在新阶段，他保证这个特定方向包含在搜索将沿之进行的方向集中。（这个启发式方法特别适合沿着通向香蕉函数最小值的谷底前进。）Rosenbrock施加了搜索方向集始终是 n 个向量的正交集这一条件，使向量集很好地保持线性无关。新的标准正交向量集用Gram-Schmidt正交化过程产生，其中把刚结束阶段的"有希望的"方向作为正交化过程的第一个向量。
At the next stage, rather than repeating the search process with the same set of orthogonal vectors, as is done for the method of local variations, Rosenbrock rotates the set of directions to capture information about the objective ascertained during the course of the earlier moves. Specifically, he takes advantage of the fact that a nonzero step from the iterate at the beginning of the previous stage to the iterate at the start of the new stage suggests a good direction of descent – or, at the very least, a promising direction – so in the new stage, he makes sure that this particular direction is included in the set of directions along which the search will be conducted. (This heuristic is particularly apt for following the bottom of the valley that leads to the minimizer of the banana function.) Rosenbrock imposes the condition that the set of search directions always be an orthogonal set of n vectors so that the set of vectors remains nicely linearly independent. The new set of orthonormal vectors is generated using the Gram-Schmidt orthonormalization procedure, with the "promising" direction from the just-completed stage used as the first vector in the orthonormalization process.
Rosenbrock方法应用于他的香蕉函数的情形在图4中描绘。每个阶段开始时的叠代点用方块标出。叠加在这些叠代点上的是新阶段的搜索方向。注意搜索适应狭窄山谷的速度有多快：在三个阶段之内搜索方向就反映出了这一特征。
Rosenbrock's method as applied to his banana function is depicted in Fig. 4. The iterate at the beginning of each stage is indicated with a square. Superimposed on these iterates are the search directions for the new stage. Note how quickly the search adapts to the narrow valley; within three stages the search directions reflect this feature.
Also notice how the search directions change to allow the algorithm to turn the corner in the valley and continue to the solution.
与我们前面考察的另外两族直接搜索法相比，Rosenbrock方法的搜索方向集更新稍微复杂一些。另一方面，香蕉函数的例子清楚地说明了这些额外工作的动机：调整整个搜索方向集可以利用搜索过程中已经了解到的关于目标的信息。
Updating the set of search directions for Rosenbrock's method entails slightly more complexity than that which appears in any of the other two families of direct search methods we have surveyed. On the other hand, the example of the banana function makes the motivation for this additional work clear: adapting the entire set of search directions takes advantage of what has been learned about the objective during the course of the search.
3.3.2. Davies、Swann和Campey的变体
3.3.2. The variant of Davies, Swann, and Campey
[22]³中提出了对Rosenbrock算法的一个改进。Davies et al.指出，沿每个搜索方向执行一系列比Rosenbrock原始算法中更精细的一维搜索是有好处的。
A refinement to Rosenbrock's algorithm was proposed in [22].³ Davies et al. noted that there was merit to carrying out a sequence of more sophisticated one-dimensional searches along each of the search directions than those performed in Rosenbrock's original algorithm.
正如[23]中所述，Davies et al.的更精细的线搜索首先沿给定集合中的一个方向，以某个固定值 Δ 的逐次增大的倍数取步，直到得到（一维）最小值的一个夹逼区间。这仍然符合我们对直接搜索法的定义。
As described in [23], the more elaborate line search of Davies et al. first takes steps of increasing multiples of some fixed value Δ along a direction from the prescribed set until a bracket for the (one-dimensional) minimizer is obtained. This still corresponds to our definition of a direct search method.
但是，一旦找到一维最小值的夹逼区间，就"进行一次二次插值来更精确地预测最小值的位置"[23]。
However, once a bracket for the one-dimensional minimizer has been found, a "single quadratic interpolation is made to predict the position of the minimum more closely" [23].
This is the construction of a model of the objective, and to do this, numerical values for the objective must be in hand. Thus, this final move within an iteration disqualifies the method of Davies, Swann, and Campey as a direct search method by our characterization. Nonetheless, this strategy is undeniably appealing, and its authors aver that this variant of Rosenbrock's method is more generally efficient than the original [6].
这是在构造目标的模型，而要做到这一点，必须掌握目标的具体数值。因此，叠代中的这最后一步使Davies、Swann和Campey的方法按我们的刻画不能算作直接搜索法。尽管如此，这个策略无疑很有吸引力，其作者们断言这个Rosenbrock方法的变体总体上比原始方法更有效[6]。
3.3.3. Powell的方法
3.3.3. Powell's method
在与[22]的报告同年发表的一篇论文中，Powell [19]概述了一个不计算导数寻找最小值的方法。按照我们的定义，它是求导无关方法而不是直接搜索法，因为建模是该方法的核心。其明确的目标是保证：如果把该方法应用于凸二次函数，所选出的是共轭方向，以达到加速收敛的目的。在这个意义上，Powell的算法可以被看作非线性共轭梯度法的求导无关版本。
In a paper appearing the same year as the report in [22], Powell [19] outlined a method for finding minimizers without calculating derivatives. By our definition, it is a derivative-free, rather than a direct search, method, for modeling is at the heart of the approach. The explicit goal is to ensure that if the method is applied to a convex quadratic function, conjugate directions are chosen with the goal of accelerating convergence. In this sense, Powell's algorithm may be viewed as a derivative-free version of nonlinear conjugate gradients.
和Rosenbrock的方法一样，Powell的方法也按阶段进行。每个阶段由一系列 n+1 次一维搜索组成。一维搜索通过求为每个方向计算的二次插值函数的精确最小值来进行（因此我们把该方法归类为求导无关方法而非直接搜索法）。前 n 次搜索分别沿一组线性无关方向中的每一个进行。最后一次搜索沿着连接前 n 次搜索结束时所得的点与本阶段起点的方向进行。
Like Rosenbrock's method, Powell's method proceeds in stages. Each stage consists of a sequence of n+1 one-dimensional searches. The one-dimensional searches are conducted by finding the exact minimizer of a quadratic interpolant computed for each direction (hence our classification of the method as a derivative-free, but not direct search, method). The first n searches are along each of a set of linearly independent directions. The last search is along the direction connecting the point obtained at the end of the first n searches with the starting point of the stage.
At the end of the stage, one of the first n search directions is replaced by the last search direction. The process then repeats at the next stage.
Powell证明了：如果目标是凸二次函数，则在每个阶段最后一步加入的方向构成一组共轭方向（只要它们保持线性无关）。
Powell showed that if the objective is a convex quadratic, then the set of directions added at the last step of each stage forms a set of conjugate directions (provided they remain linearly independent).
³一篇作者们未能找到的论文。若有读者持有原始报告并愿意寄给我们一份复印件，作者们将不胜感激。
³A paper the authors have been unable to locate. The authors would be very much obliged to any reader who has a copy of the original report and would forward a photocopy to us.
Powell进而用这一点证明他的方法具有当时所称的"Q-性质"。如果一个算法能在有限次叠代内找到凸二次函数的最小值，就说它具有Q-性质。也就是说，Q-性质就是共轭梯度算法在凸二次函数上表现出的那种有限终止性质。对于Powell的方法，在凸二次函数上 n 个阶段内即有限终止。Zangwill [31]给出了Powell方法的一个修正，避免了搜索方向线性相关的可能性。Zangwill进一步证明了对严格凸函数收敛到最小值（虽然不是在有限步内）。
Powell used this, in turn, to show that his method possessed what was known then as the "Q-property". An algorithm has the Q-property if it will find the minimizer of a convex quadratic in a finite number of iterations. That is, the Q-property is the finite termination property for convex quadratics such as that exhibited by the conjugate gradient algorithm. In the case of Powell's method, one obtains finite termination in n stages for convex quadratics. Zangwill [31] gave a modification of Powell's method that avoids the possibility of linearly dependent search directions. Zangwill further proved convergence to minimizers of strictly convex functions (though not in a finite number of steps).
据我们所知，Powell的方法标志着直接搜索法或求导无关方法第一次带着相应的收敛性分析出现。
To the best of our knowledge, Powell's method marks the first time that either a direct search or a derivative-free method appeared with any attendant convergence analysis.
The appeal of the explicit modeling of the objective such as that used in the line-searches in Powell's method is clear: it makes possible strong statements about the behavior of the optimization method. We can expect the algorithm to quickly converge to a minimizer once in a neighborhood of a solution on which the objective is essentially quadratic.
二次目标上的有限终止是20世纪60和70年代优化界经常表达的一个关注点。优化界当时给出的数值结果（应当指出，是针对解析的、封闭形式的目标函数的）印证了这种关注。当时的大多数报告[5,9]证实了以Powell的求导无关共轭方向算法为代表的、带有有限终止保证的基于建模的方法被认为具有的优越性。
Finite termination on quadratic objectives was a frequently expressed concern within the optimization community during the 1960s and 1970s. The contemporary numerical results produced by the optimization community (for analytical, closed-form objective functions, it should be noted) evidence this concern. Most reports of the time [5,9] confirm the supposed superiority of the modeling-based approach with guaranteed finite termination as embodied in Powell's derivative-free conjugate directions algorithm.
然而四十年后，"除非有可证明的好处，否则不使用任何分析技术"的直接搜索法仍然很流行，这从许多指标都可以看出：满意的用户、文献引用和可用的软件。怎样解释这个看似矛盾的历史发展呢？
Yet forty years later, direct search methods, "which employ no techniques of analysis except where there is a demonstrable advantage in doing so", remain popular, as indicated by any number of measures: satisfied users, literature citations, and available software. What explains this apparently contradictory historical development?
4. 结论
4. Conclusion
直接搜索法之所以仍然流行，是因为它们简单、灵活、可靠。从四十年后的今天回顾直接搜索法的最初发展，我们可以把现在对这些算法的认识和理解放到更宽广的背景中。
Direct search methods remain popular because of their simplicity, flexibility, and reliability. Looking back at the initial development of direct search methods from a remove of forty years, we can firmly place what is now known and understood about these algorithms in a broader context.
除了3.2节专门讨论的基于单纯形的方法之外，直接搜索法是稳健的。现有的分析结果表明：在与通常用于分析无约束非线性优化算法全局行为的假设相当的条件下，可以证明直接搜索法满足最小值的一阶必要条件（即收敛到稳定点）。考虑到直接搜索法既不需要也不显式地估计导数信息，这一点显得颇为不凡；事实上，即使只使用排序信息也能得到这些保证。大多数直接搜索法要求一组张成搜索空间的方向，这一事实足以保证存在关于函数局部行为的充分信息，使得在整组方向都被查询之后可以安全地缩短步长。
With the exception of the simplex-based methods specifically discussed in Section 3.2, direct search methods are robust. Analytical results now exist to demonstrate that under assumptions comparable to those commonly used to analyze the global behavior of algorithms for solving unconstrained nonlinear optimization problems, direct search methods can be shown to satisfy the first-order necessary conditions for a minimizer (i.e., convergence to a stationary point). This seems remarkable given that direct search methods neither require nor explicitly estimate derivative information; in fact, one obtains these guarantees even when using only ranking information. The fact that most of the direct search methods require a set of directions that span the search space is enough to guarantee that sufficient information about the local behavior of the function exists to safely reduce the step length after the full set of directions has been queried.
按照Spendley et al. [21]的思路，我们愿意把直接搜索法看作"陡下降法"（methods of steep descent）。这些作者说得很清楚，他们的算法在设计上与最速下降法相关（实际上是最速上升，因为这些作者求的是最大值）。虽然没有显式构造梯度，但通过采样获得了足够的局部信息，可以确保识别出一个下降方向（虽然不一定是最速下降方向）。Spendley et al.也凭直觉认识到，需要陡下降来确保我们现在所说的全局收敛；更进一步，他们认识到需要换用更高阶的方法来获得快速的局部收敛。
Following the lead of Spendley et al. [21], we like to think of direct search methods as "methods of steep descent". These authors made it quite clear that their algorithm was designed to be related to the method of steepest descent (actually steepest ascent, since the authors were maximizing). Although no explicit representation of the gradient is formed, enough local information is obtained by sampling to ensure that a downhill direction (though not necessarily the steepest downhill direction) can be identified. Spendley et al.
also intuited that steep descent would be needed to ensure what we now call global convergence; furthermore, they recognized the need to switch to higher-order methods to obtain fast local convergence.
这把我们带到关于经典直接搜索法要说的第二点。它们既不具有二次目标上的有限终止性，也不具有快速的局部收敛性。要做到这一点，需要捕捉目标的局部曲率，而这必然要求某种方式的建模——因此基于建模的方法具有不可否认的吸引力。但是，建模引入了额外的限制，而这些限制在使用直接搜索法的场合未必总是合适：具体地说，需要有足够可靠的显式目标函数数值，以便进行插值或其它形式的逼近。说实话，对于通过增加这一额外的信息层来设计同时逼近曲率（二阶）信息的求导无关方法是否有效，目前尚无定论。几组研究者目前正在寻找拟牛顿法的优雅的信赖域全局化技术的求导无关类似物，该技术能在有助于确保全局收敛的Cauchy（最速下降）方向和确保快速局部收敛的牛顿方向之间无缝切换。
This brings us to the second point to be made about the classical direct search methods. They do not enjoy finite termination on quadratic objectives or rapid local convergence. For this, one needs to capture the local curvature of the objective, and this necessarily requires some manner of modeling – hence, the undeniable appeal of modeling-based approaches. However, modeling introduces additional restrictions that may not always be appropriate in the settings in which direct search methods are used: specifically, the need to have explicit numerical function values of sufficient reliability to allow interpolation or some other form of approximation. In truth, the jury is still out on the effectiveness of adding this additional layer of information to devise derivative-free methods that also approximate curvature (second-order) information. Several groups of researchers are currently looking for a derivative-free analog of the elegant trust region globalization techniques for quasi-Newton methods that switch seamlessly between favoring the Cauchy (steepest-descent) direction to ensure global convergence and the Newton direction to ensure fast local convergence.
我们以如下观察作结：由于非线性优化问题形形色色，不存在能成功求解所有问题的"万能"算法。直接搜索法有时被不恰当地用作首选方法，而其它优化技术本来会更合适。但当其它途径都已尝试并失败之后，直接搜索法也被恰当地用作最后的手段。任何务实的优化工作者都应当把直接搜索法纳入自己的众多工具之中。分析如今证实了许多不同领域的实践者早已认识到的事实：一个经过仔细选择、仔细实现的直接搜索法可以成为求解许多非线性优化问题的有效工具。
We close with the observation that, since nonlinear optimization problems come in all forms, there is no "one-size-fits-all" algorithm that can successfully solve all problems. Direct search methods are sometimes used – inappropriately – as the method of first recourse when other optimization techniques would be more suitable. But direct search methods are also used – appropriately – as the methods of last recourse, when other approaches have been tried and failed. Any practical optimizer would be well-advised to include direct search methods among their many tools of the trade. Analysis now confirms what practitioners in many different fields have long recognized: a carefully chosen, carefully implemented direct search method can be an effective tool for solving many nonlinear optimization problems.
References 引文
[1] L. Armijo, Minimization of functions having Lipschitz continuous first partial derivatives, Pacific J. Math. 16 (1966) 1-3.
[2] G. Berman, Lattice approximations to the minima of functions of several variables, J. Assoc. Comput. Mach. 16 (1969) 286-294.
[3] G.E.P. Box, Evolutionary operation: a method for increasing industrial productivity, Appl. Statist. 6 (1957) 81-101.
[4] G.E.P. Box, K.B. Wilson, On the experimental attainment of optimum conditions, J. Roy. Statist. Soc. Ser. B XII (1951) 1-45.
[5] M.J. Box, A comparison of several current optimization methods, and the use of transformations in constrained problems, Comput. J. 9 (1966) 67-77.
[6] M.J. Box, D. Davies, W.H. Swann, Non-Linear Optimization Techniques, ICI Monograph, No. 5, Oliver & Boyd, Edinburgh, 1969.
[7] J. Céa, Optimisation: Théorie et Algorithmes, Dunod, Paris, 1971.
[8] W.C. Davidon, Variable metric method for minimization, SIAM J. Optim.
1 (1991) 1-17 (the article was originally published as Argonne National Laboratory Research and Development Report 5990, May 1959 (revised November 1959)).
[9] R. Fletcher, Function minimization without evaluating derivatives – a review, Comput. J. 8 (1965) 33-41.
[10] R. Fletcher, M.J.D. Powell, A rapidly convergent descent method for minimization, Comput. J. 6 (1963) 163-168.
[11] A.A. Goldstein, Constructive Real Analysis, Harper & Row, New York, 1967.
[12] R. Hooke, T.A. Jeeves, Direct search solution of numerical and statistical problems, J. Assoc. Comput. Mach. 8 (1961) 212-229.
[13] J.C. Lagarias, J.A. Reeds, M.H. Wright, P.E. Wright, Convergence properties of the Nelder-Mead simplex method in low dimensions, SIAM J. Optim. 9 (1998) 112-147.
[14] R.M. Lewis, V. Torczon, M.W. Trosset, Why pattern search works, Optima (1998) 1-7.
[15] R.M. Lewis, V.J. Torczon, Rank ordering and positive bases in pattern search algorithms, Technical Report 96-71, Institute for Computer Applications in Science and Engineering, Mail Stop 132C, NASA Langley Research Center, Hampton, VA 23681-2199, 1996.
[16] K.I.M. McKinnon, Convergence of the Nelder-Mead simplex method to a nonstationary point, SIAM J. Optim. 9 (1998) 148-158.
[17] J.A. Nelder, R. Mead, A simplex method for function minimization, Comput. J. 7 (1965) 308-313.
[18] E. Polak, Computational Methods in Optimization: A Unified Approach, Academic Press, New York, 1971.
[19] M.J.D. Powell, An efficient method for finding the minimum of a function of several variables without calculating derivatives, Comput. J. 7 (1964) 155-162.
[20] H.H. Rosenbrock, An automatic method for finding the greatest or least value of a function, Comput. J. 3 (1960) 175-184.
[21] W. Spendley, G.R. Hext, F.R. Himsworth, Sequential application of simplex designs in optimization and evolutionary operation, Technometrics 4 (1962) 441-461.
[22] W.H.
Swann, Report on the development of a new direct search method of optimization, Research Note 64/3, I.C.I. Central Instrument Lab., 1964.
[23] W.H. Swann, Direct search methods, in: W. Murray (Ed.), Numerical Methods for Unconstrained Optimization, Academic Press, New York, 1972, pp. 13-28.
[24] V. Torczon, Multi-directional search: a direct search algorithm for parallel machines, Ph.D. Thesis, Department of Mathematical Sciences, Rice University, Houston, TX, 1989.
[25] V. Torczon, On the convergence of the multidirectional search algorithm, SIAM J. Optim. 1 (1991) 123-145.
[26] V. Torczon, On the convergence of pattern search algorithms, SIAM J. Optim. 7 (1997) 1-25.
[27] M.W. Trosset, I know it when I see it: toward a definition of direct search methods, SIAG/OPT Views-and-News 9 (1997) 7-10.
[28] F.H. Walters, L.R. Parker Jr., S.L. Morgan, S.N. Deming, Sequential Simplex Optimization, Chemometrics Series, CRC Press, Boca Raton, FL, 1991.
[29] P. Wolfe, Convergence conditions for ascent methods, SIAM Rev. 11 (1969) 226-235.
[30] M.H. Wright, Direct search methods: once scorned, now respectable, in: D.F. Griffiths, G.A. Watson (Eds.), Numerical Analysis 1995, Addison-Wesley Longman, Reading, MA, 1996, pp. 191-208.
[31] W.I. Zangwill, Minimizing a function without calculating derivatives, Comput. J. 10 (1967) 293-296.