
🌈 Finite Element Analysis Study Notes (2): The Steepest Descent Method and the Conjugate Gradient Method

qiql
2022-07-14

The Steepest Descent Method

Theorem 1

Consider the linear system

$$\bold A \bold x = \bold b,\quad \bold A \in \bold R^{n\times n},\quad \bold A \text{ symmetric positive definite}$$

and the quadratic functional

$$\varphi(\bold x) = \bold x^T\bold A \bold x-2\bold b^T\bold x$$

Theorem 1: Let $\bold A$ be symmetric positive definite. Then solving the system $\bold A \bold x = \bold b$ is equivalent to finding the minimizer of the quadratic functional $\varphi(\bold x)$.

That is, find $\bold x^* \in \bold R^n$ such that $\varphi(\bold x^*) = \mathop{\min}\limits_{\bold x \in \bold R^n} \varphi(\bold x)$.

Proof:

Necessity

x\bold x^*φ(x)\varphi(\bold x^*)RnR^n 上的极小值点,则 grad(φ(x))=φ(x)=0grad(\varphi (\bold x^*)) = \nabla \varphi (\bold x)=0

$$\mathrm{grad}(\varphi(\bold x^*)) = \begin{pmatrix} \frac{\partial \varphi}{\partial x_1}\\ \frac{\partial \varphi}{\partial x_2}\\ \vdots\\ \frac{\partial \varphi}{\partial x_n} \end{pmatrix}$$

We now expand $\varphi(\bold x)$ explicitly:

$$\begin{aligned} \varphi(\bold x) & = \bold x^T\bold A \bold x -2\bold b^T \bold x \\&= \begin{bmatrix} x_1,\;x_2,\;\cdots,\;x_n \end{bmatrix} \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&&\vdots\\ a_{n1}&a_{n2}&\cdots&a_{nn} \end{bmatrix} \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix} -2 \begin{bmatrix} b_1,\;b_2,\;\cdots,\;b_n \end{bmatrix} \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}\\ & = \begin{bmatrix} \sum\limits_{i=1}^na_{i1}x_i,\;\sum\limits_{i=1}^na_{i2}x_i,\;\cdots,\;\sum\limits_{i=1}^na_{in}x_i \end{bmatrix} \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix} -2\sum\limits_{i=1}^nb_ix_i\\ & = \sum\limits_{i=1}^na_{i1}x_ix_1+\sum\limits_{i=1}^na_{i2}x_ix_2+\cdots+\sum\limits_{i=1}^na_{in}x_ix_n-2\sum\limits_{i=1}^nb_ix_i\\ & = \sum\limits_{j=1}^n\sum\limits_{i=1}^na_{ij}x_ix_j-2\sum\limits_{i=1}^nb_ix_i \end{aligned}$$

Take the partial derivative with respect to $x_k$:

$$\frac{\partial \varphi(\bold x)}{\partial x_k}=a_{k1}x_1 + a_{k2}x_2 + \cdots + a_{k,k-1}x_{k-1} + \frac{\partial \varphi(x_k)}{\partial x_k} +a_{k,k+1}x_{k+1} + \cdots + a_{k,n}x_n - 2b_k$$

In the expression above, single out the term $\frac{\partial \varphi(x_k)}{\partial x_k}$ and compute it separately:

$$\frac{\partial \varphi(x_k)}{\partial x_k} = \frac{\partial\left(\sum\limits_{i=1}^na_{ik}x_ix_k\right)}{\partial x_k}= \sum\limits_{i=1,i\neq k}^na_{ik}x_i + 2a_{kk}x_k$$

Moreover, since the matrix $\bold A$ is symmetric, i.e. $\bold A = \bold A^T$, we have $a_{ki} = a_{ik}$.
Collecting terms, this gives:

$$\frac{\partial \varphi(\bold x)}{\partial x_k}=2\sum\limits_{i=1}^na_{ki}x_i - 2b_k$$

Therefore,

$$\mathrm{grad}(\varphi(\bold x)) = \begin{pmatrix} \frac{\partial \varphi}{\partial x_1}\\ \frac{\partial \varphi}{\partial x_2}\\ \vdots\\ \frac{\partial \varphi}{\partial x_n} \end{pmatrix} = 2\begin{pmatrix} \sum\limits_{i=1}^na_{1i}x_i\\ \sum\limits_{i=1}^na_{2i}x_i\\ \vdots\\ \sum\limits_{i=1}^na_{ni}x_i \end{pmatrix} -2 \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{pmatrix} =2(\bold A\bold x - \bold b) = -2\bold r\;, \quad (\bold r = \bold b - \bold A\bold x)$$

Hence $\mathrm{grad}(\varphi(\bold x^*)) = -2(\bold b - \bold A\bold x^*) = \bold 0$ gives $\bold A\bold x^* = \bold b$, i.e. $\bold x^*$ is a solution of $\bold A \bold x = \bold b$.

Sufficiency

Ax=b\bold A\bold x^* = \bold b ,则 yRn\forall \bold y\in \bold R^n ,有

$$\begin{aligned} \varphi(\bold x^*+\bold y) &=(\bold x^*+\bold y)^T\bold A (\bold x^*+\bold y)-2\bold b^T(\bold x^*+\bold y)\\ & = \bold {x^*}^T\bold A \bold x^* + \bold {x^*}^T\bold A\bold y + \bold y^T\bold A \bold x^*+\bold y^T\bold A\bold y-2\bold b^T\bold x^* - 2\bold b^T\bold y\\ & = \bold {x^*}^T\bold A \bold x^* + \bold b^T\bold y + \bold y^T\bold b+\bold y^T\bold A\bold y-2\bold b^T\bold x^* - 2\bold b^T\bold y \\ &\quad(\text{since } \bold A^T = \bold A \text{ and } \bold A\bold x^* = \bold b,\ \bold {x^*}^T\bold A\bold y = \bold {x^*}^T\bold A^T\bold y = (\bold A\bold x^*)^T\bold y =\bold b^T\bold y)\\ & = \bold {x^*}^T\bold A \bold x^* + \bold y^T\bold A\bold y-2\bold b^T\bold x^* \\ &\quad(\bold b^T\bold y \text{ and } \bold y^T\bold b \text{ are the same scalar, so together they cancel the } -2\bold b^T\bold y \text{ term})\\ & = \varphi(\bold x^*) \underbrace{ + \bold y^T\bold A\bold y }_{\text{nonnegative}} \geq \varphi(\bold x^*) \end{aligned}$$

$\therefore \bold x^*$ minimizes $\varphi(\bold x)$.

This completes the proof.
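The identity $\mathrm{grad}(\varphi(\bold x)) = 2(\bold A\bold x - \bold b) = -2\bold r$ used in the proof can be sanity-checked numerically against a finite-difference gradient. A minimal sketch (the matrix and vectors are arbitrary illustrative choices, not from the original notes):

```python
import numpy as np

def phi(A, b, x):
    """Quadratic functional phi(x) = x^T A x - 2 b^T x."""
    return x @ A @ x - 2 * b @ x

# A small symmetric positive definite example (illustrative values only)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.array([0.5, -1.0])

# Analytic gradient from the derivation above: grad(phi) = 2(Ax - b)
grad_analytic = 2 * (A @ x - b)

# Central finite differences for comparison
eps = 1e-6
grad_fd = np.array([(phi(A, b, x + eps * e) - phi(A, b, x - eps * e)) / (2 * eps)
                    for e in np.eye(len(x))])

print(grad_analytic)  # [ 0. -9.]
print(grad_fd)        # agrees with the analytic gradient up to rounding
```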

Idea of the Computation

By the theorem above, solving the linear system with the steepest descent method amounts to finding the minimizer of the quadratic functional, and the way this minimizer is located resembles a blind person walking down a hill:

(1) For a given initial vector $\bold x_0$, choose a descent direction $\bold P_0$ and search along the line $\bold x = \bold x_0 + \alpha \bold P_0$ for the point $\bold x_1 = \bold x_0 + \alpha_0 \bold P_0$ such that $\varphi(\bold x_0+\alpha_0\bold P_0) \leq \varphi(\bold x_0+\alpha \bold P_0),\;\alpha \in \bold R$
(2) Similarly, find the point $\bold x_2 = \bold x_1+\alpha_1 \bold P_1$ such that $\varphi(\bold x_2)=\varphi(\bold x_1+\alpha_1\bold P_1) \leq \varphi(\bold x_1+\alpha \bold P_1),\;\alpha \in \bold R$
     …
(k) Find the point $\bold x_k = \bold x_{k-1} + \alpha_{k-1}\bold P_{k-1}$ such that $\varphi(\bold x_k)=\varphi(\bold x_{k-1}+\alpha_{k-1}\bold P_{k-1}) \leq \varphi(\bold x_{k-1}+\alpha \bold P_{k-1}),\;\alpha \in \bold R$
     …
  Continue until $||\bold r_k|| = ||\bold b - \bold A \bold x_k || \leq tol$, where $tol$ is the acceptable error tolerance, usually a very small value.

During the computation, the step length $\alpha_k$ and the search direction $\bold P_k$ must be determined.
First the search direction: it should be the direction in which $\varphi(\bold x)$ decreases fastest, namely the negative gradient direction: $\bold P_k = \bold r_k = \bold b - \bold A \bold x_k$
Then the step length $\alpha_k$, chosen so that $f(\alpha_k) = \mathop{\min}\limits_{\alpha}f(\alpha)=\mathop{\min}\limits_{\alpha}\varphi(\bold x_k+\alpha \bold P_k)=\varphi(\bold x_k+\alpha_k \bold P_k)$
Hence $\alpha_k$ is the solution of $f'(\alpha) = 0$.
We now determine $\alpha_k$ and $\bold P_k$:

$$\begin{aligned} f(\alpha) & = \varphi(\bold x_k + \alpha \bold P_k) = (\bold x_k + \alpha\bold P_k)^T\bold A (\bold x_k+\alpha \bold P_k)-2\bold b^T(\bold x_k+\alpha \bold P_k)\\ & = \bold x_k^T \bold A \bold x_k \underbrace{ +\alpha \bold x_k^T\bold A \bold P_k + \alpha \bold P_k^T\bold A \bold x_k }_{\text{these two terms are equal (take the transpose of either scalar)}} +\alpha^2\bold P_k^T\bold A \bold P_k -2\bold b^T\bold x_k - 2\alpha\bold b^T\bold P_k\\ & = \bold x_k^T\bold A\bold x_k \underline{+ 2\alpha \bold x_k^T\bold A \bold P_k} + \alpha^2\bold P_k^T\bold A \bold P_k -2\bold b^T\bold x_k\underline{- 2\alpha\bold b^T\bold P_k}\\ & = \bold x_k^T\bold A\bold x_k -2\alpha\bold r_k^T\bold P_k+\alpha^2\bold P_k^T\bold A \bold P_k-2\bold b^T\bold x_k\\ &\quad(\bold r_k = \bold b - \bold A \bold x_k,\ \text{so}\ 2\alpha \bold x_k^T\bold A \bold P_k-2\alpha\bold b^T\bold P_k = 2\alpha((\bold A\bold x_k)^T-\bold b^T)\bold P_k=-2\alpha \bold r_k^T\bold P_k)\\ & =\alpha^2\bold P_k^T\bold A \bold P_k-2\alpha \bold r_k^T\bold P_k+\varphi(\bold x_k) \end{aligned}$$

$$\begin{aligned} \text{Solving}& :\;\;f'(\alpha) = 2\alpha \bold P_k^T\bold A \bold P_k - 2\bold r_k^T\bold P_k= 0\\ \text{gives}& : \;\;\alpha_k=\frac{\bold r_k^T\bold P_k}{\bold P_k^T\bold A\bold P_k}\\ \text{and since}&: \;\bold P_k = \bold r_k = \bold b-\bold A\bold x_k\\ \therefore&\;\;\alpha_k = \frac{\bold r_k^T\bold r_k}{\bold P_k^T\bold A\bold P_k} \end{aligned}$$

Now verify that the inequality $\varphi(\bold x_{k+1}) = \varphi(\bold x_k+\alpha_k\bold P_k) < \varphi(\bold x_k)$ holds:

$$\begin{aligned} \varphi(\bold x_{k+1})-\varphi(\bold x_k) & = \varphi(\bold x_k+\alpha_k\bold P_k) - \varphi(\bold x_k)\\ & = (\bold x_k+\alpha_k\bold P_k)^T\bold A(\bold x_k+\alpha_k\bold P_k)-2\bold b^T(\bold x_k+\alpha_k\bold P_k) - (\bold x_k^T\bold A\bold x_k-2\bold b^T\bold x_k)\\ & = \bold x_k^T \bold A \bold x_k +\alpha_k \bold x_k^T\bold A \bold P_k + \alpha_k \bold P_k^T\bold A \bold x_k +\alpha_k^2\bold P_k^T\bold A \bold P_k -2\bold b^T\bold x_k - 2\alpha_k\bold b^T\bold P_k - \bold x_k^T\bold A\bold x_k + 2\bold b^T\bold x_k\\ & = \alpha_k \bold x_k^T\bold A \bold P_k + \alpha_k \bold P_k^T\bold A \bold x_k +\alpha_k^2\bold P_k^T\bold A \bold P_k - 2\alpha_k\bold b^T\bold P_k\\ & = 2\alpha_k \bold x_k^T\bold A \bold P_k+\alpha_k^2\bold P_k^T\bold A \bold P_k - 2\alpha_k\bold b^T\bold P_k\\ & = \alpha_k^2\bold P_k^T\bold A \bold P_k - 2\alpha_k\bold r_k^T\bold P_k\\ & = \frac{(\bold r_k^T\bold r_k)^2}{(\bold P_k^T\bold A \bold P_k)^2}\bold P_k^T\bold A\bold P_k - 2\frac{\bold r_k^T\bold r_k}{\bold P_k^T\bold A \bold P_k}\bold r_k^T\bold P_k\\ & = \frac{(\bold r_k^T\bold r_k)^2}{\bold P_k^T\bold A \bold P_k} - 2\frac{(\bold r_k^T\bold r_k)^2}{\bold P_k^T\bold A \bold P_k}\\ & =- \frac{(\bold r_k^T\bold r_k)^2}{\bold P_k^T\bold A \bold P_k} < 0\;\;(\text{provided } \bold r_k^T\bold P_k \neq 0) \end{aligned}$$

If $\bold r_k = \bold 0$, then $\varphi(\bold x_{k+1}) = \varphi(\bold x_k)$, i.e. the solution is $\bold x_k$.
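The derived step length can be checked numerically: along the search direction, $\varphi$ attains its smallest value at $\alpha_k = \frac{\bold r_k^T\bold r_k}{\bold P_k^T\bold A\bold P_k}$, and taking that step strictly decreases $\varphi$ as long as $\bold r_k \neq \bold 0$. A minimal sketch, with an assumed example system and iterate:

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite (example values)
b = np.array([1.0, 2.0])
x = np.array([2.0, 1.0])                  # an arbitrary current iterate x_k

phi = lambda v: v @ A @ v - 2 * b @ v     # the quadratic functional
r = b - A @ x                             # residual; here also the search direction P_k
alpha_opt = (r @ r) / (r @ A @ r)         # the derived step length alpha_k

# Sample phi along the line x + alpha * r: no sampled alpha does better than alpha_opt,
# and the optimal step strictly decreases phi because r != 0
alphas = np.linspace(0.0, 2.0 * alpha_opt, 401)
sampled_best = min(phi(x + a * r) for a in alphas)
print(sampled_best >= phi(x + alpha_opt * r) - 1e-12)  # True
print(phi(x + alpha_opt * r) < phi(x))                  # True
```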

Some Properties of the Vectors

$$\bold P_k^T \bold P_{k+1} = 0$$

where $\bold P_{k+1} = \bold b-\bold A\bold x_{k+1}=\bold b-\bold A(\bold x_k+\alpha_k\bold P_k) = \bold b-\bold A\bold x_k-\alpha_k\bold A\bold P_k = \bold r_k-\alpha_k\bold A\bold P_k$
Then: $\bold P_k^T \bold P_{k+1} = \bold P_k^T\bold r_k - \bold P_k^T\bold A\bold P_k\frac{\bold r_k^T\bold r_k}{\bold r_k^T\bold A \bold r_k}=\bold r_k^T\bold r_k-\bold r_k^T\bold r_k = 0$
The geometric meaning of this property: in two dimensions, two consecutive descent directions are perpendicular (orthogonal) to each other.

Concrete Computation Steps

     Step 0: choose any initial point $\bold x_0$ and set $k = 0$
     Step 1: compute $\bold r_k = \bold b-\bold A \bold x_k$ (the descent direction is the negative gradient direction)
     Step 2: if $\bold r_k = \bold 0$, stop
     Step 3: compute $\alpha_k = \frac{\bold r_k^T \bold r_k}{\bold r_k^T \bold A \bold r_k}$ (the step length)
     Step 4: compute $\bold x_{k+1} = \bold x_k + \alpha_k \bold P_k$ (the new iterate)
     Step 5: set $k = k+1$ and go to Step 1
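The steps above translate directly into code. Below is a minimal sketch in Python/NumPy, assuming a dense symmetric positive definite matrix; the function name `steepest_descent` and the defaults `tol` and `max_iter` are illustrative choices, not from the original notes.

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Solve A x = b (A symmetric positive definite) by steepest descent."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        r = b - A @ x                    # Step 1: residual = negative gradient direction
        if np.linalg.norm(r) <= tol:     # Step 2: stop when the residual is small enough
            break
        alpha = (r @ r) / (r @ A @ r)    # Step 3: optimal step length
        x = x + alpha * r                # Step 4: new iterate (P_k = r_k)
    return x

# Example usage with an arbitrary SPD matrix
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = steepest_descent(A, b, np.zeros(2))
print(np.allclose(A @ x, b))  # True
```

The helper mirrors Steps 0–5 one-to-one; in exact arithmetic it would stop when the residual vanishes, while in floating point the norm test plays the role of Step 2.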

Supplement

Theorem 2: Let $\bold A \in \bold R^{n\times n}$ be symmetric positive definite, and let the eigenvalues of $\bold A$ be

$$\lambda_n\geq\lambda_{n-1}\geq\cdots \geq \lambda_1 > 0$$

{xk}\{x_k\} 为由求解对称正定方程组 Ax=b\bold A \bold x = \bold b 的最速下降法得到的向量序列,则有:

$$||\bold x_k-\bold x^*||_A\leq\left( \frac{\lambda_n-\lambda_1}{\lambda_n+\lambda_1} \right)^k||\bold x_0-\bold x^*||_A$$

     where $\bold A\bold x^* = \bold b$
     Definition: let $\bold A \in \bold R^{n\times n}$ be symmetric positive definite; then $||\bold x||_A = (\bold x^T \bold A \bold x)^{\frac{1}{2}}$ is called the $\bold A$-norm on $\bold R^n$
     Note: one can verify from the definition of a norm that $||\cdot||_A$ is indeed a norm
     The theorem shows that when $\frac{\lambda_n}{\lambda_1} \gg 1$, the iteration converges very slowly
     Drawback: each individual step of the steepest descent method is optimal locally, but the overall iteration need not be; to obtain a globally better iteration one turns to the conjugate gradient method
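Theorem 2 predicts slow convergence when $\frac{\lambda_n}{\lambda_1} \gg 1$. The following small experiment (arbitrary diagonal test matrices and a plain iteration counter written only for illustration) shows the effect of that ratio on the number of steepest descent iterations:

```python
import numpy as np

def sd_iterations(A, b, tol=1e-8, max_iter=100_000):
    """Number of steepest descent iterations needed to reach ||r|| <= tol."""
    x = np.zeros_like(b)
    for k in range(max_iter):
        r = b - A @ x
        if np.linalg.norm(r) <= tol:
            return k
        x = x + (r @ r) / (r @ A @ r) * r
    return max_iter

b = np.array([1.0, 1.0])
well = np.diag([1.0, 2.0])     # lambda_n / lambda_1 = 2
ill  = np.diag([1.0, 200.0])   # lambda_n / lambda_1 = 200

print(sd_iterations(well, b), sd_iterations(ill, b))
# the ill-conditioned system needs far more iterations, matching Theorem 2
```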

The Conjugate Gradient Method

Derivation

(1) For any $\bold x_0\in\bold R^n$, let $\bold r_0=\bold b-\bold A\bold x_0$; as before, take $\bold P_0 = \bold r_0$ and search along the line through $\bold x_0$ in the direction $\bold P_0$,
     so that $\varphi(\bold x_0+\alpha_0\bold P_0) = \mathop{\min}\limits_{\alpha}\varphi(\bold x_0+\alpha \bold P_0),\;\alpha \in \bold R$ (the argument is the same as for the steepest descent method)
     Solving gives $\alpha_0 = \frac{\bold r_0^T\bold r_0}{\bold r_0^T\bold A \bold r_0} > 0$ (obtained, as before, as the root $\alpha$ of $f'(\alpha) = 0$)
     $\therefore \bold x_1=\bold x_0 + \alpha_0\bold P_0,\;\bold r_1=\bold b - \bold A\bold x_1$
     Note: $\bold P_0^T \bold r_1 = \bold P_0^T(\bold b - \bold A\bold x_0 - \alpha_0\bold A\bold P_0) = \bold P_0^T\bold r_0 - \alpha_0\bold P_0^T\bold A\bold P_0 \xlongequal{\bold r_0 = \bold P_0} 0$
          $\bold r_0^T\bold r_1 = \bold P_0^T\bold r_1 = 0 \;(*)$

(2) From the second step on, the descent direction is no longer the negative gradient. Instead, within the two-dimensional plane $\pi_2 = \{\bold x=\bold x_1+\xi\bold r_1 + \eta\bold P_0\;|\;\xi,\eta\in \bold R\}$ we look for the direction along which the functional decreases fastest, which gives a new descent direction $\bold P_1$ and a new step length $\alpha_1$.

  • First find $\bold P_1$

$$\begin{aligned} \varphi(\xi,\eta) & = \varphi(\bold x_1+\xi\bold r_1+\eta\bold P_0) \\ & = (\bold x_1 + \xi\bold r_1 + \eta\bold P_0)^T\bold A(\bold x_1 + \xi\bold r_1 + \eta\bold P_0) - 2\bold b^T(\bold x_1 + \xi\bold r_1 + \eta\bold P_0)\\ & = \bold x_1^T\bold A\bold x_1 + \xi\bold x_1^T \bold A \bold r_1 + \xi\bold r_1^T\bold A\bold x_1 + \eta\bold x_1^T\bold A\bold P_0 + \eta\bold P_0^T\bold A\bold x_1 + \xi^2\bold r_1^T\bold A \bold r_1 \\ &\quad + \xi\eta\,\bold r_1^T\bold A \bold P_0 + \xi\eta\,\bold P_0^T\bold A\bold r_1 + \eta^2\bold P_0^T\bold A\bold P_0 - 2\bold b^T\bold x_1 - 2\xi\bold b^T\bold r_1 - 2\eta\bold b^T\bold P_0 \end{aligned}$$

     Taking partial derivatives gives

$$\begin{aligned} \frac{\partial\varphi(\xi,\eta)}{\partial\xi} & = 2(\xi\bold r_1^T\bold A\bold r_1+\eta\bold r_1^T\bold A \bold P_0-\bold r_1^T\bold r_1)\\ \frac{\partial\varphi(\xi,\eta)}{\partial\eta} & = 2(\xi\bold r_1^T\bold A\bold P_0+\eta\bold P_0^T\bold A \bold P_0) \end{aligned}$$

     (In the second derivative the constant term $-\bold r_1^T\bold P_0$ inside the parentheses has already been dropped, since $\bold r_1^T\bold P_0 = 0$ by $(*)$.)

     Setting both partial derivatives to $0$ yields the unique minimizer of $\varphi$ within $\pi_2$:

$$\widetilde{\bold x} = \bold x_1 + \xi_0\bold r_1 + \eta_0\bold P_0$$

     where $\xi_0,\;\eta_0$ are chosen so that both partial derivatives vanish,
     i.e. they solve the system $\begin{cases} \;\xi\bold r_1^T\bold A\bold r_1+\eta\bold r_1^T\bold A \bold P_0=\bold r_1^T\bold r_1\\ \;\xi\bold r_1^T\bold A\bold P_0+\eta\bold P_0^T\bold A \bold P_0 = 0 \end{cases}$ (If solving directly for $\xi_0,\;\eta_0$ is inconvenient, one can first compute $\beta_0=\frac{\eta_0}{\xi_0} = -\frac{\bold r_1^T\bold A \bold P_0}{\bold P_0^T \bold A \bold P_0}$ to fix the direction, and then determine the coefficient along that direction.)

     In the system above, if $\bold r_1 \neq 0$ then $\xi_0\neq 0$, so we may take $\bold x_2 = \bold x_1 + \alpha_1 \bold P_1$ (with $\bold P_1 = \bold r_1 + \beta_0\bold P_0$ as the new direction)

  • Then find $\alpha_1$

     Since $\alpha_1$ likewise satisfies $\varphi(\bold x_1+\alpha_1\bold P_1) = \mathop{\min}\limits_{\alpha}\varphi(\bold x_1+\alpha \bold P_1),\;\alpha \in \bold R$,
     we obtain $\alpha_1 = \frac{\bold r_1^T\bold P_1}{\bold P_1^T\bold A \bold P_1}$
     With $\bold P_1 = \bold r_1 + \beta_0 \bold P_0$ this becomes $\alpha_1 = \frac{\bold r_1^T(\bold r_1+\beta_0\bold P_0)}{\bold P_1^T\bold A \bold P_1} = \frac{\bold r_1^T\bold r_1}{\bold P_1^T \bold A \bold P_1}$ (by $(*)$, $\bold r_1$ is orthogonal to $\bold P_0$, i.e. $\bold r_1^T \bold P_0 = (\bold P_0^T\bold r_1)^T = 0$)

     In summary: $\bold x_2 = \bold x_1 +\alpha_1\bold P_1,\;\bold P_1 = \bold r_1+\beta_0\bold P_0,\;\bold r_2 = \bold b - \bold A\bold x_2$
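The formulas just derived for $\beta_0$, $\bold P_1$, and $\alpha_1$ can be exercised on a small example. Since $\bold x_2$ minimizes $\varphi$ over the whole plane $\pi_2$, it can do no worse than a second steepest descent step taken from the same $\bold x_1$; the sketch below (with an assumed test system) confirms this:

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
phi = lambda v: v @ A @ v - 2 * b @ v

# First step (identical for steepest descent and CG)
x0 = np.zeros(2)
r0 = b - A @ x0
p0 = r0
a0 = (r0 @ r0) / (p0 @ A @ p0)
x1 = x0 + a0 * p0
r1 = b - A @ x1

# Second CG step: direction from beta_0, step length alpha_1
beta0 = -(r1 @ A @ p0) / (p0 @ A @ p0)
p1 = r1 + beta0 * p0
a1 = (r1 @ r1) / (p1 @ A @ p1)
x2_cg = x1 + a1 * p1

# Second steepest descent step for comparison (direction r1)
x2_sd = x1 + (r1 @ r1) / (r1 @ A @ r1) * r1

print(phi(x2_cg) <= phi(x2_sd) + 1e-12)  # True: the plane search can only do better
```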

Some Properties of the Vectors

$$\bold P_1^T\bold A \bold P_0 = (\bold r_1 + \beta_0 \bold P_0)^T \bold A \bold P_0 = \bold r_1^T\bold A \bold P_0 + \beta_0\bold P_0^T\bold A \bold P_0 = \bold r_1^T\bold A \bold P_0 - \frac{\bold r_1^T\bold A \bold P_0}{\bold P_0^T\bold A \bold P_0}\bold P_0^T\bold A\bold P_0 = 0$$

$$\bold P_0^T \bold r_2 = \bold P_0^T(\underbrace{\bold b -\bold A \bold x_1}_{\bold r_1} - \alpha_1 \bold A \bold P_1) = \bold P_0^T \bold r_1 - \alpha_1\bold P_0^T\bold A \bold P_1 = 0$$

$$\begin{aligned} \bold P_1^T \bold r_2 = \bold P_1^T(\bold b-\bold A\bold x_1-\alpha_1\bold A\bold P_1) & = \bold P_1^T \bold r_1 - \alpha_1 \bold P_1^T\bold A\bold P_1\\& = \bold P_1^T\bold r_1 - \frac{\bold r_1^T\bold r_1}{\bold P_1^T\bold A\bold P_1}\bold P_1^T\bold A \bold P_1 \\& =\bold P_1^T\bold r_1 - \bold r_1^T\bold r_1 \\ & =(\bold r_1+\beta_0\bold P_0)^T\bold r_1 - \bold r_1^T \bold r_1\\& =\beta_0\bold P_0^T \bold r_1 = \beta_0\cdot 0 = 0\end{aligned}$$

$$\bold r_0^T \bold r_2 = \bold P_0^T\bold r_2 = 0$$

$$\begin{aligned} \bold r_1^T\bold r_2 = \bold r_1^T(\bold b - \bold A \bold x_1 - \alpha_1\bold A \bold P_1) & = \bold r_1^T(\bold r_1-\alpha_1\bold A \bold P_1) \\&= \bold r_1^T \bold r_1-\bold r_1^T\alpha_1\bold A\bold P_1 \\ &= \bold r_1^T \bold r_1 - \alpha_1\bold r_1^T\bold A \bold P_1\\& =\bold r_1^T \bold r_1-\frac{\bold r_1^T\bold P_1}{\bold P_1^T\bold A\bold P_1}\bold r_1^T\bold A\bold P_1\\&= \bold r_1^T \bold r_1-\frac{\bold r_1^T\bold P_1}{(\bold r_1+\beta_0\bold P_0)^T\bold A\bold P_1}\bold r_1^T\bold A\bold P_1\\&= \bold r_1^T \bold r_1-\frac{\bold r_1^T\bold P_1}{\bold r_1^T\bold A\bold P_1 + \beta_0\bold P_0^T\bold A\bold P_1}\bold r_1^T\bold A\bold P_1 \\&= \bold r_1^T \bold r_1-\frac{\bold r_1^T\bold r_1}{\bold r_1^T\bold A\bold P_1}\bold r_1^T\bold A\bold P_1 \\& = 0 \end{aligned}$$

(3) If $\bold r_2 \neq 0$, repeat the second step. Summarizing:

$$\begin{aligned} \bold x_{k+1} & = \bold x_k+\alpha_k \bold P_k\\ \bold P_k & = \bold r_k + \beta_{k-1}\bold P_{k-1}\\ \beta_{k-1} & = -\frac{\bold r_k^T\bold A\bold P_{k-1}}{\bold P_{k-1}^T\bold A\bold P_{k-1}} = \frac{\bold r_k^T\bold r_k}{\bold r_{k-1}^T\bold r_{k-1}}\\ \alpha_k & = \frac{\bold r_k^T\bold r_k}{\bold P_{k}^T\bold A\bold P_k} \\ \bold r_{k+1} & = \bold b - \bold A\bold x_{k+1} = \bold r_k - \alpha_k\bold A \bold P_k \end{aligned}$$

This yields the computational procedure of the conjugate gradient method.

Computation Steps

The procedure as derived is:

$\bold x_0 = $ initial guess
$\bold r_0 = \bold b - \bold A\bold x_0 ;\;k = 0$
while $\bold r_k \neq 0$ (in practice, test whether the residual norm is below a preset tolerance)
     $k = k + 1$
     if $k = 1$:  $\bold P_0 = \bold r_0$
     else
         $\beta_{k-2} = \frac{\bold r_{k-1}^T\bold r_{k-1}}{\bold r_{k-2}^T\bold r_{k-2}}$
         $\bold P_{k-1} = \bold r_{k-1}+\beta_{k-2}\bold P_{k-2}$
     end
     $\alpha_{k-1} = \frac{\bold r_{k-1}^T\bold r_{k-1}}{\bold P_{k-1}^T\bold A \bold P_{k-1}}$
     $\bold x_k = \bold x_{k-1}+\alpha_{k-1}\bold P_{k-1}$
     $\bold r_k = \bold r_{k-1} - \alpha_{k-1}\bold A\bold P_{k-1}$
end
$\bold x = \bold x_k$

When writing an actual program, however, it should be organized as follows, so that the product $\bold A\bold P_{k-1}$ is formed only once per iteration:

$\bold x_0 = $ initial guess
$\bold r_0 = \bold b - \bold A\bold x_0 ;\;k = 0$
while $||\bold r_k||\geq tol$ ($tol$ is a preset small value)
     $k = k + 1$
     if $k = 1$:  $\bold P_0 = \bold r_0$
     else
         $\beta_{k-2} = \frac{\bold r_{k-1}^T\bold r_{k-1}}{\bold r_{k-2}^T\bold r_{k-2}}$
         $\bold P_{k-1} = \bold r_{k-1}+\beta_{k-2}\bold P_{k-2}$
     end
     $temp = \bold A \bold P_{k-1}$
     $\alpha_{k-1} = \frac{\bold r_{k-1}^T\bold r_{k-1}}{\bold P_{k-1}^T\, temp}$
     $\bold x_k = \bold x_{k-1}+\alpha_{k-1}\bold P_{k-1}$
     $\bold r_k = \bold r_{k-1} - \alpha_{k-1}\, temp$
end
$\bold x = \bold x_k$
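A direct Python/NumPy transcription of the pseudocode above might look like the following sketch (the function name `conjugate_gradient` and the default `tol` are illustrative choices):

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Solve A x = b (A symmetric positive definite) by the conjugate gradient method."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    max_iter = max_iter or n   # in exact arithmetic CG converges in at most n steps
    r = b - A @ x
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ap = A @ p                 # temp = A P_{k-1}, computed once per iteration
        alpha = rs_old / (p @ Ap)  # alpha_{k-1}
        x = x + alpha * p          # new iterate
        r = r - alpha * Ap         # updated residual
        rs_new = r @ r
        beta = rs_new / rs_old     # beta_{k-2} = r_{k-1}^T r_{k-1} / (r_{k-2}^T r_{k-2})
        p = r + beta * p           # new search direction
        rs_old = rs_new
    return x

# Example usage
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(np.allclose(A @ conjugate_gradient(A, b), b))  # True
```

As in the pseudocode, the matrix–vector product is reused for both the step length and the residual update.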

Supplement

Theorem 3: The vectors $\{\bold r_i\}$ and $\{\bold P_i\}$ generated by the conjugate gradient method have the following properties:

(1) $\bold P_i^T\bold r_j = 0,\;0\leq i < j \leq k$
(2) $\bold r_i^T\bold r_j = 0,\;i \neq j,\; 0 \leq i,j\leq k$
(3) $\bold P_i^T\bold A \bold P_j = 0,\;i \neq j,\;0\leq i,j\leq k$
(4) $\mathrm{span}\{\bold r_0,\;\bold r_1,\;\cdots,\;\bold r_k\} = \mathrm{span}\{\bold P_0,\;\bold P_1,\;\cdots,\;\bold P_k\} = \widetilde{K}(\bold A,\;\bold r_0,\;k+1)$
$\widetilde{K}(\bold A,\;\bold r_0,\;k+1) = \mathrm{span}\{\bold r_0,\;\bold A\bold r_0,\;\cdots,\;\bold A^k\bold r_0\}$ is usually called the Krylov subspace.
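Properties (2) and (3) can be spot-checked numerically by storing every residual and search direction produced by the CG iteration. The sketch below uses a random symmetric positive definite matrix built only for illustration; the construction and tolerances are assumptions, not from the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # a random symmetric positive definite matrix
b = rng.standard_normal(n)

# Run CG, recording every residual r_i and direction P_i
x = np.zeros(n)
r = b - A @ x
p = r.copy()
residuals, directions = [r.copy()], [p.copy()]
for _ in range(n):
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)
    x = x + alpha * p
    r_new = r - alpha * Ap
    beta = (r_new @ r_new) / (r @ r)
    p = r_new + beta * p
    r = r_new
    residuals.append(r.copy())
    directions.append(p.copy())

R = np.array(residuals)              # rows r_0 ... r_n
P = np.array(directions)             # rows P_0 ... P_n
off = lambda G: G - np.diag(np.diag(G))
print(np.allclose(off(R @ R.T), 0, atol=1e-6))      # property (2): r_i^T r_j = 0 for i != j
print(np.allclose(off(P @ A @ P.T), 0, atol=1e-6))  # property (3): P_i^T A P_j = 0 for i != j
```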