1*59599516SKenneth E. Jansen subroutine SolGMRe (y, ac, yold, acold, 2*59599516SKenneth E. Jansen & x, iBC, BC, EGmass, 3*59599516SKenneth E. Jansen & res, BDiag, HBrg, eBrg, 4*59599516SKenneth E. Jansen & yBrg, Rcos, Rsin, iper, 5*59599516SKenneth E. Jansen & ilwork, shp, shgl, shpb, 6*59599516SKenneth E. Jansen & shglb, Dy, rerr) 7*59599516SKenneth E. Jansenc 8*59599516SKenneth E. Jansenc---------------------------------------------------------------------- 9*59599516SKenneth E. Jansenc 10*59599516SKenneth E. Jansenc This is the preconditioned GMRES driver routine. 11*59599516SKenneth E. Jansenc 12*59599516SKenneth E. Jansenc input: 13*59599516SKenneth E. Jansenc y (nshg,ndof) : Y-variables at n+alpha_v 14*59599516SKenneth E. Jansenc ac (nshg,ndof) : Primvar. accel. variable n+alpha_m 15*59599516SKenneth E. Jansenc yold (nshg,ndof) : Y-variables at beginning of step 16*59599516SKenneth E. Jansenc acold (nshg,ndof) : Primvar. accel. variable at begng step 17*59599516SKenneth E. Jansenc x (numnp,nsd) : node coordinates 18*59599516SKenneth E. Jansenc iBC (nshg) : BC codes 19*59599516SKenneth E. Jansenc BC (nshg,ndofBC) : BC constraint parameters 20*59599516SKenneth E. Jansenc HBrg (Kspace+1,Kspace) : Hessenberg matrix (LHS matrix) 21*59599516SKenneth E. Jansenc eBrg (Kspace+1) : RHS of Hessenberg minim. problem 22*59599516SKenneth E. Jansenc yBrg (Kspace) : solution of Hessenberg minim. problem 23*59599516SKenneth E. Jansenc Rcos (Kspace) : Rotational cosine of QR algorithm 24*59599516SKenneth E. Jansenc Rsin (Kspace) : Rotational sine of QR algorithm 25*59599516SKenneth E. Jansenc shp(b) (nen,maxsh,melCat) : element shape functions (boundary) 26*59599516SKenneth E. Jansenc shgl(b)(nsd,nen,maxsh,melCat) : local gradients of shape functions 27*59599516SKenneth E. Jansenc 28*59599516SKenneth E. Jansenc output: 29*59599516SKenneth E. Jansenc res (nshg,nflow) : preconditioned residual 30*59599516SKenneth E. Jansenc BDiag (nshg,nflow,nflow) : block-diagonal preconditioner 31*59599516SKenneth E. Jansenc 32*59599516SKenneth E. Jansenc 33*59599516SKenneth E. Jansenc Zdenek Johan, Winter 1991. (Fortran 90) 34*59599516SKenneth E. Jansenc---------------------------------------------------------------------- 35*59599516SKenneth E. Jansenc 36*59599516SKenneth E. Jansen use pointer_data 37*59599516SKenneth E. Jansen 38*59599516SKenneth E. Jansen include "common.h" 39*59599516SKenneth E. Jansen include "mpif.h" 40*59599516SKenneth E. Jansen include "auxmpi.h" 41*59599516SKenneth E. Jansenc 42*59599516SKenneth E. Jansen dimension y(nshg,ndof), ac(nshg,ndof), 43*59599516SKenneth E. Jansen & yold(nshg,ndof), acold(nshg,ndof), 44*59599516SKenneth E. Jansen & x(numnp,nsd), 45*59599516SKenneth E. Jansen & iBC(nshg), BC(nshg,ndofBC), 46*59599516SKenneth E. Jansen & res(nshg,nflow), 47*59599516SKenneth E. Jansen & BDiag(nshg,nflow,nflow), 48*59599516SKenneth E. Jansen & HBrg(Kspace+1,*), eBrg(*), 49*59599516SKenneth E. Jansen & yBrg(*), Rcos(*), 50*59599516SKenneth E. Jansen & Rsin(*), ilwork(nlwork), 51*59599516SKenneth E. Jansen & iper(nshg), EGmass(numel,nedof,nedof)!, 52*59599516SKenneth E. Jansenctoomuchmem & Binv(numel,nedof,nedof) 53*59599516SKenneth E. Jansenc 54*59599516SKenneth E. Jansen dimension Dy(nshg,nflow), rmes(nshg,nflow), 55*59599516SKenneth E. Jansen & temp(nshg,nflow), 56*59599516SKenneth E. Jansen & uBrg(nshg,nflow,Kspace+1) 57*59599516SKenneth E. Jansenc 58*59599516SKenneth E. Jansen dimension shp(MAXTOP,maxsh,MAXQPT), 59*59599516SKenneth E. Jansen & shgl(MAXTOP,nsd,maxsh,MAXQPT), 60*59599516SKenneth E. Jansen & shpb(MAXTOP,maxsh,MAXQPT), 61*59599516SKenneth E. Jansen & shglb(MAXTOP,nsd,maxsh,MAXQPT) 62*59599516SKenneth E. Jansen real*8 rerr(nshg,10) 63*59599516SKenneth E. Jansenc 64*59599516SKenneth E. Jansenc.... *******************>> Element Data Formation <<****************** 65*59599516SKenneth E. Jansenc 66*59599516SKenneth E. Jansenc 67*59599516SKenneth E. Jansenc.... set the parameters for flux and surface tension calculations 68*59599516SKenneth E. Jansenc 69*59599516SKenneth E. Jansenc 70*59599516SKenneth E. Jansen idflx = 0 71*59599516SKenneth E. Jansen if(idiff >= 1) idflx= idflx + (nflow-1) * nsd 72*59599516SKenneth E. Jansen if (isurf == 1) idflx=idflx + nsd 73*59599516SKenneth E. Jansenc 74*59599516SKenneth E. Jansenc.... form the LHS matrices, the residual vector, and the block 75*59599516SKenneth E. Jansenc diagonal preconditioner 76*59599516SKenneth E. Jansenc 77*59599516SKenneth E. Jansen call ElmGMRe(y, ac, x, 78*59599516SKenneth E. Jansen & shp, shgl, iBC, 79*59599516SKenneth E. Jansen & BC, shpb, 80*59599516SKenneth E. Jansen & shglb, res, 81*59599516SKenneth E. Jansen & rmes, BDiag, iper, 82*59599516SKenneth E. Jansen & ilwork, EGmass, rerr ) 83*59599516SKenneth E. Jansenc 84*59599516SKenneth E. Jansenc.... **********************>> EBE - GMRES <<******************** 85*59599516SKenneth E. Jansenc 86*59599516SKenneth E. Jansen call timer ('Solver ') 87*59599516SKenneth E. Jansenc 88*59599516SKenneth E. Jansenc.... ------------------------> Initialization <----------------------- 89*59599516SKenneth E. Jansenc 90*59599516SKenneth E. Jansenc 91*59599516SKenneth E. Jansenc.... LU decompose the block diagonals 92*59599516SKenneth E. Jansenc 93*59599516SKenneth E. Jansen if (iprec .ne. 0) 94*59599516SKenneth E. Jansen & call i3LU (BDiag, res, 'LU_Fact ') 95*59599516SKenneth E. Jansen 96*59599516SKenneth E. Jansenc 97*59599516SKenneth E. Jansenc.... block diagonal precondition residual 98*59599516SKenneth E. Jansenc 99*59599516SKenneth E. Jansen call i3LU (BDiag, res, 'forward ') 100*59599516SKenneth E. Jansenc 101*59599516SKenneth E. Jansenc.... initialize Dy 102*59599516SKenneth E. Jansenc 103*59599516SKenneth E. Jansen Dy = zero 104*59599516SKenneth E. Jansenc 105*59599516SKenneth E. Jansenc.... Pre-precondition the LHS mass matrix and set up the element 106*59599516SKenneth E. Jansenc by element preconditioners 107*59599516SKenneth E. Jansenc 108*59599516SKenneth E. Jansenctoomuchmemory note that Binv is demoted from huge array to just one 109*59599516SKenneth E. Jansenc real*8 in i3pre because it takes too much memory 110*59599516SKenneth E. Jansen 111*59599516SKenneth E. Jansen call i3pre (BDiag, Binv, EGmass, ilwork) 112*59599516SKenneth E. Jansenc 113*59599516SKenneth E. Jansenc.... left EBE precondition the residual 114*59599516SKenneth E. Jansenc 115*59599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, res, ilwork, 'L_Pcond ') 116*59599516SKenneth E. Jansenc 117*59599516SKenneth E. Jansenc.... copy res in uBrg(1) 118*59599516SKenneth E. Jansenc 119*59599516SKenneth E. Jansen uBrg(:,:,1) = res 120*59599516SKenneth E. Jansenc 121*59599516SKenneth E. Jansenc.... calculate norm of residual 122*59599516SKenneth E. Jansenc 123*59599516SKenneth E. Jansen temp = res**2 124*59599516SKenneth E. Jansen 125*59599516SKenneth E. Jansen call sumgat (temp, nflow, summed, ilwork) 126*59599516SKenneth E. Jansen unorm = sqrt(summed) 127*59599516SKenneth E. Jansenc 128*59599516SKenneth E. Jansenc.... check if GMRES iterations are required 129*59599516SKenneth E. Jansenc 130*59599516SKenneth E. Jansen iKs = 0 131*59599516SKenneth E. Jansen lGMRES = 0 132*59599516SKenneth E. Jansenc 133*59599516SKenneth E. Jansenc.... if we are down to machine precision, don't bother solving 134*59599516SKenneth E. Jansenc 135*59599516SKenneth E. Jansen if (unorm .lt. 100.*epsM**2) goto 3000 136*59599516SKenneth E. Jansenc 137*59599516SKenneth E. Jansenc.... set up tolerance of the Hessenberg's problem 138*59599516SKenneth E. Jansenc 139*59599516SKenneth E. Jansen epsnrm = etol * unorm 140*59599516SKenneth E. Jansenc 141*59599516SKenneth E. Jansenc.... ------------------------> GMRES Loop <------------------------- 142*59599516SKenneth E. Jansenc 143*59599516SKenneth E. Jansenc.... loop through GMRES cycles 144*59599516SKenneth E. Jansenc 145*59599516SKenneth E. Jansen do 2000 mGMRES = 1, nGMRES 146*59599516SKenneth E. Jansen lGMRES = mGMRES - 1 147*59599516SKenneth E. Jansenc 148*59599516SKenneth E. Jansen if (lGMRES .gt. 0) then 149*59599516SKenneth E. Jansenc 150*59599516SKenneth E. Jansenc.... if GMRES restarts are necessary, calculate R - A x 151*59599516SKenneth E. Jansenc 152*59599516SKenneth E. Jansenc 153*59599516SKenneth E. Jansenc.... right precondition Dy 154*59599516SKenneth E. Jansenc 155*59599516SKenneth E. Jansen temp = Dy 156*59599516SKenneth E. Jansen 157*59599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, temp, ilwork, 'R_Pcond ') 158*59599516SKenneth E. Jansenc 159*59599516SKenneth E. Jansenc.... perform the A x product 160*59599516SKenneth E. Jansenc 161*59599516SKenneth E. Jansen call Au1GMR (EGmass, temp, ilwork, iBC,iper) 162*59599516SKenneth E. Jansenc 163*59599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 164*59599516SKenneth E. Jansenc 165*59599516SKenneth E. Jansen call bc3per (iBC, temp, iper, ilwork, nflow) 166*59599516SKenneth E. Jansenc 167*59599516SKenneth E. Jansenc.... left preconditioning 168*59599516SKenneth E. Jansenc 169*59599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, temp, ilwork, 'L_Pcond ') 170*59599516SKenneth E. Jansenc 171*59599516SKenneth E. Jansenc.... subtract A x from residual and calculate the norm 172*59599516SKenneth E. Jansenc 173*59599516SKenneth E. Jansen temp = res - temp 174*59599516SKenneth E. Jansen uBrg(:,:,1) = temp 175*59599516SKenneth E. Jansenc 176*59599516SKenneth E. Jansenc.... calculate the norm 177*59599516SKenneth E. Jansenc 178*59599516SKenneth E. Jansen temp = temp**2 179*59599516SKenneth E. Jansen call sumgat (temp, nflow, summed, ilwork) 180*59599516SKenneth E. Jansen unorm = sqrt(summed) 181*59599516SKenneth E. Jansenc 182*59599516SKenneth E. Jansenc.... flop count 183*59599516SKenneth E. Jansenc 184*59599516SKenneth E. Jansen flops = flops + nflow*nshg+nshg 185*59599516SKenneth E. Jansenc 186*59599516SKenneth E. Jansen endif 187*59599516SKenneth E. Jansenc 188*59599516SKenneth E. Jansenc.... set up RHS of the Hessenberg's problem 189*59599516SKenneth E. Jansenc 190*59599516SKenneth E. Jansen call clear (eBrg, Kspace+1) 191*59599516SKenneth E. Jansen eBrg(1) = unorm 192*59599516SKenneth E. Jansenc 193*59599516SKenneth E. Jansenc.... normalize the first Krylov vector 194*59599516SKenneth E. Jansenc 195*59599516SKenneth E. Jansen uBrg(:,:,1) = uBrg(:,:,1) / unorm 196*59599516SKenneth E. Jansenc 197*59599516SKenneth E. Jansenc.... loop through GMRES iterations 198*59599516SKenneth E. Jansenc 199*59599516SKenneth E. Jansen do 1000 iK = 1, Kspace 200*59599516SKenneth E. Jansen iKs = iK 201*59599516SKenneth E. Jansen 202*59599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs) 203*59599516SKenneth E. Jansenc 204*59599516SKenneth E. Jansenc.... right EBE precondition the LHS ( u_{i+1} <-- inverse(U) u_i ) 205*59599516SKenneth E. Jansenc 206*59599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, uBrg(:,:,iKs+1), ilwork, 'R_Pcond ') 207*59599516SKenneth E. Jansenc 208*59599516SKenneth E. Jansenc.... Au product ( u_{i+1} <-- EGmass u_{i+1} ) 209*59599516SKenneth E. Jansenc 210*59599516SKenneth E. Jansen call Au1GMR ( EGmass, uBrg(:,:,iKs+1), ilwork, iBC,iper) 211*59599516SKenneth E. Jansenc 212*59599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 213*59599516SKenneth E. Jansenc 214*59599516SKenneth E. Jansen call bc3per (iBC, uBrg(:,:,iKs+1), iper, ilwork, nflow) 215*59599516SKenneth E. Jansen 216*59599516SKenneth E. Jansenc 217*59599516SKenneth E. Jansenc.... left EBE precondition the LHS ( u_{i+1} <-- inverse(L) u_{i+1} ) 218*59599516SKenneth E. Jansenc 219*59599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, uBrg(:,:,iKs+1), ilwork, 'L_Pcond ') 220*59599516SKenneth E. Jansenc 221*59599516SKenneth E. Jansenc.... orthogonalize and get the norm 222*59599516SKenneth E. Jansenc 223*59599516SKenneth E. Jansen do jK = 1, iKs+1 224*59599516SKenneth E. Jansenc 225*59599516SKenneth E. Jansen if (jK .eq. 1) then 226*59599516SKenneth E. Jansenc 227*59599516SKenneth E. Jansen temp = uBrg(:,:,iKs+1) * uBrg(:,:,1) ! {u_{i+1}*u_1} vector 228*59599516SKenneth E. Jansen call sumgat (temp, nflow, beta, ilwork) ! sum vector=(u_{i+1},u_1) 229*59599516SKenneth E. Jansenc 230*59599516SKenneth E. Jansen else 231*59599516SKenneth E. Jansenc 232*59599516SKenneth E. Jansenc project off jK-1 vector 233*59599516SKenneth E. Jansenc 234*59599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs+1) - beta * uBrg(:,:,jK-1) 235*59599516SKenneth E. Jansenc 236*59599516SKenneth E. Jansen temp = uBrg(:,:,iKs+1) * uBrg(:,:,jK) !{u_{i+1}*u_j} vector 237*59599516SKenneth E. Jansen call sumgat (temp, nflow, beta, ilwork) ! sum vector=(u_{i+1},u_j) 238*59599516SKenneth E. Jansenc 239*59599516SKenneth E. Jansen endif 240*59599516SKenneth E. Jansenc 241*59599516SKenneth E. Jansen HBrg(jK,iKs) = beta ! put this in the Hessenberg Matrix 242*59599516SKenneth E. Jansenc 243*59599516SKenneth E. Jansen enddo 244*59599516SKenneth E. Jansenc 245*59599516SKenneth E. Jansen flops = flops + (3*iKs+1)*nflow*numnp+(iKs+1)*numnp 246*59599516SKenneth E. Jansenc 247*59599516SKenneth E. Jansenc the last inner product was with what was left of the vector (after 248*59599516SKenneth E. Jansenc projecting off all of the previous vectors 249*59599516SKenneth E. Jansenc 250*59599516SKenneth E. Jansen unorm = sqrt(beta) 251*59599516SKenneth E. Jansen HBrg(iKs+1,iKs) = unorm ! this fills the 1 sub diagonal band 252*59599516SKenneth E. Jansenc 253*59599516SKenneth E. Jansenc.... normalize the Krylov vector 254*59599516SKenneth E. Jansenc 255*59599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs+1) / unorm ! normalize the next Krylov 256*59599516SKenneth E. Jansenc vector 257*59599516SKenneth E. Jansenc 258*59599516SKenneth E. Jansenc.... construct and reduce the Hessenberg Matrix 259*59599516SKenneth E. Jansenc since there is only one subdiagonal we can use a Givens rotation to 260*59599516SKenneth E. Jansenc rotate off each subdiagonal AS IT IS FORMED. We do this because it 261*59599516SKenneth E. Jansenc allows us to check progress of solution and quit when satisfied. Note 262*59599516SKenneth E. Jansenc that all future K vects will put a subdiagonal in the next column so 263*59599516SKenneth E. Jansenc there is no penalty to work ahead as the rotation for the next vector 264*59599516SKenneth E. Jansenc will be unaffected by this rotation. 265*59599516SKenneth E. Jansen 266*59599516SKenneth E. Jansenc 267*59599516SKenneth E. Jansenc H Y = E ========> R_i H Y = R_i E 268*59599516SKenneth E. Jansenc 269*59599516SKenneth E. Jansen do jK = 1, iKs-1 270*59599516SKenneth E. Jansen tmp = Rcos(jK) * HBrg(jK, iKs) + 271*59599516SKenneth E. Jansen & Rsin(jK) * HBrg(jK+1,iKs) 272*59599516SKenneth E. Jansen HBrg(jK+1,iKs) = -Rsin(jK) * HBrg(jK, iKs) + 273*59599516SKenneth E. Jansen & Rcos(jK) * HBrg(jK+1,iKs) 274*59599516SKenneth E. Jansen HBrg(jK, iKs) = tmp 275*59599516SKenneth E. Jansen enddo 276*59599516SKenneth E. Jansenc 277*59599516SKenneth E. Jansen tmp = sqrt(HBrg(iKs,iKs)**2 + HBrg(iKs+1,iKs)**2) 278*59599516SKenneth E. Jansen Rcos(iKs) = HBrg(iKs, iKs) / tmp 279*59599516SKenneth E. Jansen Rsin(iKs) = HBrg(iKs+1,iKs) / tmp 280*59599516SKenneth E. Jansen HBrg(iKs, iKs) = tmp 281*59599516SKenneth E. Jansen HBrg(iKs+1,iKs) = zero 282*59599516SKenneth E. Jansenc 283*59599516SKenneth E. Jansenc.... rotate eBrg R_i E 284*59599516SKenneth E. Jansenc 285*59599516SKenneth E. Jansen tmp = Rcos(iKs) * eBrg(iKs) + Rsin(iKs) * eBrg(iKs+1) 286*59599516SKenneth E. Jansen eBrg(iKs+1) =-Rsin(iKs) * eBrg(iKs) + Rcos(iKs) * eBrg(iKs+1) 287*59599516SKenneth E. Jansen eBrg(iKs) = tmp 288*59599516SKenneth E. Jansenc 289*59599516SKenneth E. Jansenc.... check for convergence 290*59599516SKenneth E. Jansenc 291*59599516SKenneth E. Jansen ntotGM = ntotGM + 1 292*59599516SKenneth E. Jansen echeck=abs(eBrg(iKs+1)) 293*59599516SKenneth E. Jansen if (echeck .le. epsnrm) exit 294*59599516SKenneth E. Jansenc 295*59599516SKenneth E. Jansenc.... end of GMRES iteration loop 296*59599516SKenneth E. Jansenc 297*59599516SKenneth E. Jansen 1000 continue 298*59599516SKenneth E. Jansenc 299*59599516SKenneth E. Jansenc.... -------------------------> Solution <------------------------ 300*59599516SKenneth E. Jansenc 301*59599516SKenneth E. Jansenc.... if converged or end of Krylov space 302*59599516SKenneth E. Jansenc 303*59599516SKenneth E. Jansenc.... solve for yBrg 304*59599516SKenneth E. Jansenc 305*59599516SKenneth E. Jansen do jK = iKs, 1, -1 306*59599516SKenneth E. Jansen yBrg(jK) = eBrg(jK) / HBrg(jK,jK) 307*59599516SKenneth E. Jansen do lK = 1, jK-1 308*59599516SKenneth E. Jansen eBrg(lK) = eBrg(lK) - yBrg(jK) * HBrg(lK,jK) 309*59599516SKenneth E. Jansen enddo 310*59599516SKenneth E. Jansen enddo 311*59599516SKenneth E. Jansenc 312*59599516SKenneth E. Jansenc.... update Dy 313*59599516SKenneth E. Jansenc 314*59599516SKenneth E. Jansen do jK = 1, iKs 315*59599516SKenneth E. Jansen Dy = Dy + yBrg(jK) * uBrg(:,:,jK) 316*59599516SKenneth E. Jansen enddo 317*59599516SKenneth E. Jansenc 318*59599516SKenneth E. Jansenc.... flop count 319*59599516SKenneth E. Jansenc 320*59599516SKenneth E. Jansen flops = flops + (3*iKs+1)*nflow*nshg 321*59599516SKenneth E. Jansenc 322*59599516SKenneth E. Jansenc.... check for convergence 323*59599516SKenneth E. Jansenc 324*59599516SKenneth E. Jansen 325*59599516SKenneth E. Jansen echeck=abs(eBrg(iKs+1)) 326*59599516SKenneth E. Jansen if (echeck .le. epsnrm) exit 327*59599516SKenneth E. Jansen if(myrank.eq.master) write(*,*)'solver tolerance %satisfaction', 328*59599516SKenneth E. Jansen & (one-echeck/unorm)/(one-etol)*100 329*59599516SKenneth E. Jansenc 330*59599516SKenneth E. Jansenc.... end of mGMRES loop 331*59599516SKenneth E. Jansenc 332*59599516SKenneth E. Jansen 2000 continue 333*59599516SKenneth E. Jansenc 334*59599516SKenneth E. Jansenc.... ------------------------> Converged <------------------------ 335*59599516SKenneth E. Jansenc 336*59599516SKenneth E. Jansenc.... if converged 337*59599516SKenneth E. Jansenc 338*59599516SKenneth E. Jansen 3000 continue 339*59599516SKenneth E. Jansenc 340*59599516SKenneth E. Jansenc.... back EBE precondition the results 341*59599516SKenneth E. Jansenc 342*59599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, Dy, ilwork, 'R_Pcond ') 343*59599516SKenneth E. Jansenc 344*59599516SKenneth E. Jansenc.... back block-diagonal precondition the results 345*59599516SKenneth E. Jansenc 346*59599516SKenneth E. Jansen call i3LU (BDiag, Dy, 'backward') 347*59599516SKenneth E. Jansenc 348*59599516SKenneth E. Jansenc 349*59599516SKenneth E. Jansenc.... output the statistics 350*59599516SKenneth E. Jansenc 351*59599516SKenneth E. Jansen call rstat (res, ilwork) 352*59599516SKenneth E. Jansenc 353*59599516SKenneth E. Jansenc.... stop the timer 354*59599516SKenneth E. Jansenc 355*59599516SKenneth E. Jansen 3002 continue ! no solve just res. 356*59599516SKenneth E. Jansen call timer ('Back ') 357*59599516SKenneth E. Jansenc 358*59599516SKenneth E. Jansenc.... end 359*59599516SKenneth E. Jansenc 360*59599516SKenneth E. Jansen return 361*59599516SKenneth E. Jansen end 362*59599516SKenneth E. Jansen 363*59599516SKenneth E. Jansen 364*59599516SKenneth E. Jansen 365*59599516SKenneth E. Jansen 366*59599516SKenneth E. Jansen 367*59599516SKenneth E. Jansen subroutine SolGMRs(y, ac, yold, acold, 368*59599516SKenneth E. Jansen & x, iBC, BC, 369*59599516SKenneth E. Jansen & col, row, lhsk, 370*59599516SKenneth E. Jansen & res, BDiag, HBrg, eBrg, 371*59599516SKenneth E. Jansen & yBrg, Rcos, Rsin, iper, 372*59599516SKenneth E. Jansen & ilwork, shp, shgl, shpb, 373*59599516SKenneth E. Jansen & shglb, Dy, rerr) 374*59599516SKenneth E. Jansenc 375*59599516SKenneth E. Jansenc---------------------------------------------------------------------- 376*59599516SKenneth E. Jansenc 377*59599516SKenneth E. Jansenc This is the preconditioned GMRES driver routine. 378*59599516SKenneth E. Jansenc 379*59599516SKenneth E. Jansenc input: 380*59599516SKenneth E. Jansenc y (nshg,ndof) : Y-variables at n+alpha_v 381*59599516SKenneth E. Jansenc ac (nshg,ndof) : Primvar. accel. variable n+alpha_m 382*59599516SKenneth E. Jansenc yold (nshg,ndof) : Y-variables at beginning of step 383*59599516SKenneth E. Jansenc acold (nshg,ndof) : Primvar. accel. variable at begng step 384*59599516SKenneth E. Jansenc x (numnp,nsd) : node coordinates 385*59599516SKenneth E. Jansenc iBC (nshg) : BC codes 386*59599516SKenneth E. Jansenc BC (nshg,ndofBC) : BC constraint parameters 387*59599516SKenneth E. Jansenc HBrg (Kspace+1,Kspace) : Hessenberg matrix (LHS matrix) 388*59599516SKenneth E. Jansenc eBrg (Kspace+1) : RHS of Hessenberg minim. problem 389*59599516SKenneth E. Jansenc yBrg (Kspace) : solution of Hessenberg minim. problem 390*59599516SKenneth E. Jansenc Rcos (Kspace) : Rotational cosine of QR algorithm 391*59599516SKenneth E. Jansenc Rsin (Kspace) : Rotational sine of QR algorithm 392*59599516SKenneth E. Jansenc shp(b) (nen,maxsh,melCat) : element shape functions (boundary) 393*59599516SKenneth E. Jansenc shgl(b)(nsd,nen,maxsh,melCat) : local gradients of shape functions 394*59599516SKenneth E. Jansenc 395*59599516SKenneth E. Jansenc output: 396*59599516SKenneth E. Jansenc res (nshg,nflow) : preconditioned residual 397*59599516SKenneth E. Jansenc BDiag (nshg,nflow,nflow) : block-diagonal preconditioner 398*59599516SKenneth E. Jansenc 399*59599516SKenneth E. Jansenc 400*59599516SKenneth E. Jansenc Zdenek Johan, Winter 1991. (Fortran 90) 401*59599516SKenneth E. Jansenc---------------------------------------------------------------------- 402*59599516SKenneth E. Jansenc 403*59599516SKenneth E. Jansen use pointer_data 404*59599516SKenneth E. Jansen 405*59599516SKenneth E. Jansen include "common.h" 406*59599516SKenneth E. Jansen include "mpif.h" 407*59599516SKenneth E. Jansen include "auxmpi.h" 408*59599516SKenneth E. Jansenc 409*59599516SKenneth E. Jansen integer col(nshg+1), row(nnz*nshg) 410*59599516SKenneth E. Jansen real*8 lhsK(nflow*nflow,nnz_tot) 411*59599516SKenneth E. Jansen 412*59599516SKenneth E. Jansen 413*59599516SKenneth E. Jansen dimension y(nshg,ndof), ac(nshg,ndof), 414*59599516SKenneth E. Jansen & yold(nshg,ndof), acold(nshg,ndof), 415*59599516SKenneth E. Jansen & x(numnp,nsd), 416*59599516SKenneth E. Jansen & iBC(nshg), BC(nshg,ndofBC), 417*59599516SKenneth E. Jansen & res(nshg,nflow), 418*59599516SKenneth E. Jansen & BDiag(nshg,nflow,nflow), 419*59599516SKenneth E. Jansen & HBrg(Kspace+1,Kspace), eBrg(Kspace+1), 420*59599516SKenneth E. Jansen & yBrg(Kspace), Rcos(Kspace), 421*59599516SKenneth E. Jansen & Rsin(Kspace), ilwork(nlwork), 422*59599516SKenneth E. Jansen & iper(nshg) 423*59599516SKenneth E. Jansenc 424*59599516SKenneth E. Jansen dimension Dy(nshg,nflow), rmes(nshg,nflow), 425*59599516SKenneth E. Jansen & temp(nshg,nflow), 426*59599516SKenneth E. Jansen & uBrg(nshg,nflow,Kspace+1) 427*59599516SKenneth E. Jansenc 428*59599516SKenneth E. Jansen dimension shp(MAXTOP,maxsh,MAXQPT), 429*59599516SKenneth E. Jansen & shgl(MAXTOP,nsd,maxsh,MAXQPT), 430*59599516SKenneth E. Jansen & shpb(MAXTOP,maxsh,MAXQPT), 431*59599516SKenneth E. Jansen & shglb(MAXTOP,nsd,maxsh,MAXQPT) 432*59599516SKenneth E. Jansen real*8 rerr(nshg,10) 433*59599516SKenneth E. Jansenc 434*59599516SKenneth E. Jansenc 435*59599516SKenneth E. Jansenc.... *******************>> Element Data Formation <<****************** 436*59599516SKenneth E. Jansenc 437*59599516SKenneth E. Jansenc 438*59599516SKenneth E. Jansenc.... set the parameters for flux and surface tension calculations 439*59599516SKenneth E. Jansenc 440*59599516SKenneth E. Jansenc 441*59599516SKenneth E. Jansen idflx = 0 442*59599516SKenneth E. Jansen if(idiff >= 1) idflx= idflx + (nflow-1) * nsd 443*59599516SKenneth E. Jansen if (isurf == 1) idflx=idflx + nsd 444*59599516SKenneth E. Jansenc 445*59599516SKenneth E. Jansenc.... form the LHS matrices, the residual vector, and the block 446*59599516SKenneth E. Jansenc diagonal preconditioner 447*59599516SKenneth E. Jansenc 448*59599516SKenneth E. Jansen call ElmGMRs(y, ac, x, 449*59599516SKenneth E. Jansen & shp, shgl, iBC, 450*59599516SKenneth E. Jansen & BC, shpb, 451*59599516SKenneth E. Jansen & shglb, res, 452*59599516SKenneth E. Jansen & rmes, BDiag, iper, 453*59599516SKenneth E. Jansen & ilwork, lhsK, col, 454*59599516SKenneth E. Jansen & row, rerr ) 455*59599516SKenneth E. Jansen 456*59599516SKenneth E. Jansen call tnanq(res,5, 'res_egmr') 457*59599516SKenneth E. Jansen call tnanq(BDiag,25, 'bdg_egmr') 458*59599516SKenneth E. Jansenc 459*59599516SKenneth E. Jansenc.... **********************>> EBE - GMRES <<******************** 460*59599516SKenneth E. Jansenc 461*59599516SKenneth E. Jansen call timer ('Solver ') 462*59599516SKenneth E. Jansenc 463*59599516SKenneth E. Jansenc.... ------------------------> Initialization <----------------------- 464*59599516SKenneth E. Jansenc 465*59599516SKenneth E. Jansenc 466*59599516SKenneth E. Jansenc.... LU decompose the block diagonals 467*59599516SKenneth E. Jansenc 468*59599516SKenneth E. Jansen if (iprec .ne. 0) then 469*59599516SKenneth E. Jansen call i3LU (BDiag, res, 'LU_Fact ') 470*59599516SKenneth E. Jansen if (numpe > 1) then 471*59599516SKenneth E. Jansen call commu (BDiag , ilwork, nflow*nflow , 'out') 472*59599516SKenneth E. Jansen endif 473*59599516SKenneth E. Jansen endif 474*59599516SKenneth E. Jansenc 475*59599516SKenneth E. Jansenc.... block diagonal precondition residual 476*59599516SKenneth E. Jansenc 477*59599516SKenneth E. Jansen call i3LU (BDiag, res, 'forward ') 478*59599516SKenneth E. Jansenc 479*59599516SKenneth E. Jansenc Check the residual for divering trend 480*59599516SKenneth E. Jansenc 481*59599516SKenneth E. Jansen call rstatCheck(res,ilwork,y,ac) 482*59599516SKenneth E. Jansenc 483*59599516SKenneth E. Jansenc.... initialize Dy 484*59599516SKenneth E. Jansenc 485*59599516SKenneth E. Jansen Dy = zero 486*59599516SKenneth E. Jansenc 487*59599516SKenneth E. Jansenc.... Pre-precondition the LHS mass matrix and set up the sparse 488*59599516SKenneth E. Jansenc preconditioners 489*59599516SKenneth E. Jansenc 490*59599516SKenneth E. Jansen 491*59599516SKenneth E. Jansen if(lhs.eq.1) call Spsi3pre (BDiag, lhsK, col, row) 492*59599516SKenneth E. Jansenc 493*59599516SKenneth E. Jansenc.... copy res in uBrg(1) 494*59599516SKenneth E. Jansenc 495*59599516SKenneth E. Jansen uBrg(:,:,1) = res 496*59599516SKenneth E. Jansenc 497*59599516SKenneth E. Jansenc.... calculate norm of residual 498*59599516SKenneth E. Jansenc 499*59599516SKenneth E. Jansen temp = res**2 500*59599516SKenneth E. Jansen 501*59599516SKenneth E. Jansen call sumgat (temp, nflow, summed, ilwork) 502*59599516SKenneth E. Jansen unorm = sqrt(summed) 503*59599516SKenneth E. Jansenc 504*59599516SKenneth E. Jansenc.... check if GMRES iterations are required 505*59599516SKenneth E. Jansenc 506*59599516SKenneth E. Jansen iKs = 0 507*59599516SKenneth E. Jansen lGMRES = 0 508*59599516SKenneth E. Jansenc 509*59599516SKenneth E. Jansenc.... if we are down to machine precision, don't bother solving 510*59599516SKenneth E. Jansenc 511*59599516SKenneth E. Jansen if (unorm .lt. 100.*epsM**2) goto 3000 512*59599516SKenneth E. Jansenc 513*59599516SKenneth E. Jansenc.... set up tolerance of the Hessenberg's problem 514*59599516SKenneth E. Jansenc 515*59599516SKenneth E. Jansen epsnrm = etol * unorm 516*59599516SKenneth E. Jansenc 517*59599516SKenneth E. Jansenc.... ------------------------> GMRES Loop <------------------------- 518*59599516SKenneth E. Jansenc 519*59599516SKenneth E. Jansenc.... loop through GMRES cycles 520*59599516SKenneth E. Jansenc 521*59599516SKenneth E. Jansen do 2000 mGMRES = 1, nGMRES 522*59599516SKenneth E. Jansen lGMRES = mGMRES - 1 523*59599516SKenneth E. Jansenc 524*59599516SKenneth E. Jansen if (lGMRES .gt. 0) then 525*59599516SKenneth E. Jansenc 526*59599516SKenneth E. Jansenc.... if GMRES restarts are necessary, calculate R - A x 527*59599516SKenneth E. Jansenc 528*59599516SKenneth E. Jansenc 529*59599516SKenneth E. Jansenc.... right precondition Dy 530*59599516SKenneth E. Jansenc 531*59599516SKenneth E. Jansen temp = Dy 532*59599516SKenneth E. Jansen 533*59599516SKenneth E. Jansenc 534*59599516SKenneth E. Jansenc.... perform the A x product 535*59599516SKenneth E. Jansenc 536*59599516SKenneth E. Jansen call SparseAp (iper,ilwork,iBC, col, row, lhsK, temp) 537*59599516SKenneth E. Jansen call tnanq(temp,5, 'q_spAPrs') 538*59599516SKenneth E. Jansen 539*59599516SKenneth E. Jansenc 540*59599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 541*59599516SKenneth E. Jansenc 542*59599516SKenneth E. Jansen call bc3per (iBC, temp, iper, ilwork, nflow) 543*59599516SKenneth E. Jansen call tnanq(temp,5, 'q_BCprs') 544*59599516SKenneth E. Jansenc 545*59599516SKenneth E. Jansenc.... subtract A x from residual and calculate the norm 546*59599516SKenneth E. Jansenc 547*59599516SKenneth E. Jansen temp = res - temp 548*59599516SKenneth E. Jansen uBrg(:,:,1) = temp 549*59599516SKenneth E. Jansenc 550*59599516SKenneth E. Jansenc.... calculate the norm 551*59599516SKenneth E. Jansenc 552*59599516SKenneth E. Jansen temp = temp**2 553*59599516SKenneth E. Jansen call sumgat (temp, nflow, summed, ilwork) 554*59599516SKenneth E. Jansen unorm = sqrt(summed) 555*59599516SKenneth E. Jansenc 556*59599516SKenneth E. Jansenc.... flop count 557*59599516SKenneth E. Jansenc 558*59599516SKenneth E. Jansen flops = flops + nflow*nshg+nshg 559*59599516SKenneth E. Jansenc 560*59599516SKenneth E. Jansen endif 561*59599516SKenneth E. Jansenc 562*59599516SKenneth E. Jansenc.... set up RHS of the Hessenberg's problem 563*59599516SKenneth E. Jansenc 564*59599516SKenneth E. Jansen call clear (eBrg, Kspace+1) 565*59599516SKenneth E. Jansen eBrg(1) = unorm 566*59599516SKenneth E. Jansenc 567*59599516SKenneth E. Jansenc.... normalize the first Krylov vector 568*59599516SKenneth E. Jansenc 569*59599516SKenneth E. Jansen uBrg(:,:,1) = uBrg(:,:,1) / unorm 570*59599516SKenneth E. Jansenc 571*59599516SKenneth E. Jansenc.... loop through GMRES iterations 572*59599516SKenneth E. Jansenc 573*59599516SKenneth E. Jansen do 1000 iK = 1, Kspace 574*59599516SKenneth E. Jansen iKs = iK 575*59599516SKenneth E. Jansen 576*59599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs) 577*59599516SKenneth E. Jansenc 578*59599516SKenneth E. Jansenc.... Au product ( u_{i+1} <-- EGmass u_{i+1} ) 579*59599516SKenneth E. Jansenc 580*59599516SKenneth E. Jansen call SparseAp (iper, ilwork, iBC, 581*59599516SKenneth E. Jansen & col, row, lhsK, 582*59599516SKenneth E. Jansen & uBrg(:,:,iKs+1) ) 583*59599516SKenneth E. Jansen call tnanq(uBrg(:,:,iKS+1),5, 'q_spAP') 584*59599516SKenneth E. Jansen 585*59599516SKenneth E. Jansenc 586*59599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 587*59599516SKenneth E. Jansenc 588*59599516SKenneth E. Jansen call bc3per (iBC, uBrg(:,:,iKs+1), iper, ilwork, nflow) 589*59599516SKenneth E. Jansen call tnanq(uBrg(:,:,iKS+1),5, 'q_bc') 590*59599516SKenneth E. Jansen 591*59599516SKenneth E. Jansenc 592*59599516SKenneth E. Jansenc.... orthogonalize and get the norm 593*59599516SKenneth E. Jansenc 594*59599516SKenneth E. Jansen do jK = 1, iKs+1 595*59599516SKenneth E. Jansenc 596*59599516SKenneth E. Jansen if (jK .eq. 1) then 597*59599516SKenneth E. Jansenc 598*59599516SKenneth E. Jansen temp = uBrg(:,:,iKs+1) * uBrg(:,:,1) ! {u_{i+1}*u_1} vector 599*59599516SKenneth E. Jansen call sumgat (temp, nflow, beta, ilwork) ! sum vector=(u_{i+1},u_1) 600*59599516SKenneth E. Jansenc 601*59599516SKenneth E. Jansen else 602*59599516SKenneth E. Jansenc 603*59599516SKenneth E. Jansenc project off jK-1 vector 604*59599516SKenneth E. Jansenc 605*59599516SKenneth E. Jansen uBrg(:,:,iKs+1)=uBrg(:,:,iKs+1)-beta * uBrg(:,:,jK-1) 606*59599516SKenneth E. Jansenc 607*59599516SKenneth E. Jansen temp = uBrg(:,:,iKs+1) * uBrg(:,:,jK) !{u_{i+1}*u_j} vector 608*59599516SKenneth E. Jansen call sumgat (temp, nflow, beta, ilwork) ! sum vector=(u_{i+1},u_j) 609*59599516SKenneth E. Jansenc 610*59599516SKenneth E. Jansen endif 611*59599516SKenneth E. Jansenc 612*59599516SKenneth E. Jansen HBrg(jK,iKs) = beta ! put this in the Hessenberg Matrix 613*59599516SKenneth E. Jansenc 614*59599516SKenneth E. Jansen enddo 615*59599516SKenneth E. Jansenc 616*59599516SKenneth E. Jansen flops = flops + (3*iKs+1)*nflow*numnp+(iKs+1)*numnp 617*59599516SKenneth E. Jansenc 618*59599516SKenneth E. Jansenc the last inner product was with what was left of the vector (after 619*59599516SKenneth E. Jansenc projecting off all of the previous vectors 620*59599516SKenneth E. Jansenc 621*59599516SKenneth E. Jansen if(beta.le.0) write(*,*) 'beta in solgrm non-positive' 622*59599516SKenneth E. Jansen unorm = sqrt(beta) 623*59599516SKenneth E. Jansen HBrg(iKs+1,iKs) = unorm ! this fills the 1 sub diagonal band 624*59599516SKenneth E. Jansenc 625*59599516SKenneth E. Jansenc.... normalize the Krylov vector 626*59599516SKenneth E. Jansenc 627*59599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs+1) / unorm ! normalize the next Krylov 628*59599516SKenneth E. Jansenc vector 629*59599516SKenneth E. Jansenc 630*59599516SKenneth E. Jansenc.... construct and reduce the Hessenberg Matrix 631*59599516SKenneth E. Jansenc since there is only one subdiagonal we can use a Givens rotation to 632*59599516SKenneth E. Jansenc rotate off each subdiagonal AS IT IS FORMED. We do this because it 633*59599516SKenneth E. Jansenc allows us to check progress of solution and quit when satisfied. Note 634*59599516SKenneth E. Jansenc that all future K vects will put a subdiagonal in the next column so 635*59599516SKenneth E. Jansenc there is no penalty to work ahead as the rotation for the next vector 636*59599516SKenneth E. Jansenc will be unaffected by this rotation. 637*59599516SKenneth E. Jansen 638*59599516SKenneth E. Jansenc 639*59599516SKenneth E. Jansenc H Y = E ========> R_i H Y = R_i E 640*59599516SKenneth E. Jansenc 641*59599516SKenneth E. Jansen do jK = 1, iKs-1 642*59599516SKenneth E. Jansen tmp = Rcos(jK) * HBrg(jK, iKs) + 643*59599516SKenneth E. Jansen & Rsin(jK) * HBrg(jK+1,iKs) 644*59599516SKenneth E. Jansen HBrg(jK+1,iKs) = -Rsin(jK) * HBrg(jK, iKs) + 645*59599516SKenneth E. Jansen & Rcos(jK) * HBrg(jK+1,iKs) 646*59599516SKenneth E. Jansen HBrg(jK, iKs) = tmp 647*59599516SKenneth E. Jansen enddo 648*59599516SKenneth E. Jansenc 649*59599516SKenneth E. Jansen tmp = sqrt(HBrg(iKs,iKs)**2 + HBrg(iKs+1,iKs)**2) 650*59599516SKenneth E. Jansen Rcos(iKs) = HBrg(iKs, iKs) / tmp 651*59599516SKenneth E. Jansen Rsin(iKs) = HBrg(iKs+1,iKs) / tmp 652*59599516SKenneth E. Jansen HBrg(iKs, iKs)= tmp 653*59599516SKenneth E. Jansen HBrg(iKs+1,iKs)= zero 654*59599516SKenneth E. Jansenc 655*59599516SKenneth E. Jansenc.... rotate eBrg R_i E 656*59599516SKenneth E. Jansenc 657*59599516SKenneth E. Jansen tmp = Rcos(iKs) * eBrg(iKs) + Rsin(iKs) * eBrg(iKs+1) 658*59599516SKenneth E. Jansen eBrg(iKs+1)=-Rsin(iKs) * eBrg(iKs) + Rcos(iKs) * eBrg(iKs+1) 659*59599516SKenneth E. Jansen eBrg(iKs) = tmp 660*59599516SKenneth E. Jansenc 661*59599516SKenneth E. Jansenc.... check for convergence 662*59599516SKenneth E. Jansenc 663*59599516SKenneth E. Jansen ntotGM = ntotGM + 1 664*59599516SKenneth E. Jansen echeck=abs(eBrg(iKs+1)) 665*59599516SKenneth E. Jansen if (echeck .le. epsnrm.and. iKs .ge. minIters) exit 666*59599516SKenneth E. Jansenc 667*59599516SKenneth E. Jansenc.... end of GMRES iteration loop 668*59599516SKenneth E. Jansenc 669*59599516SKenneth E. Jansen 1000 continue 670*59599516SKenneth E. Jansenc 671*59599516SKenneth E. Jansenc.... -------------------------> Solution <------------------------ 672*59599516SKenneth E. Jansenc 673*59599516SKenneth E. Jansenc.... if converged or end of Krylov space 674*59599516SKenneth E. Jansenc 675*59599516SKenneth E. Jansenc.... solve for yBrg 676*59599516SKenneth E. Jansenc 677*59599516SKenneth E. Jansen do jK = iKs, 1, -1 678*59599516SKenneth E. Jansen yBrg(jK) = eBrg(jK) / HBrg(jK,jK) 679*59599516SKenneth E. Jansen do lK = 1, jK-1 680*59599516SKenneth E. Jansen eBrg(lK) = eBrg(lK) - yBrg(jK) * HBrg(lK,jK) 681*59599516SKenneth E. Jansen enddo 682*59599516SKenneth E. Jansen enddo 683*59599516SKenneth E. Jansenc 684*59599516SKenneth E. Jansenc.... update Dy 685*59599516SKenneth E. Jansenc 686*59599516SKenneth E. Jansen do jK = 1, iKs 687*59599516SKenneth E. Jansen Dy = Dy + yBrg(jK) * uBrg(:,:,jK) 688*59599516SKenneth E. Jansen enddo 689*59599516SKenneth E. Jansenc 690*59599516SKenneth E. Jansenc.... flop count 691*59599516SKenneth E. Jansenc 692*59599516SKenneth E. Jansen flops = flops + (3*iKs+1)*nflow*nshg 693*59599516SKenneth E. Jansenc 694*59599516SKenneth E. Jansenc.... check for convergence 695*59599516SKenneth E. Jansenc 696*59599516SKenneth E. Jansen echeck=abs(eBrg(iKs+1)) 697*59599516SKenneth E. Jansen if (echeck .le. epsnrm) exit 698*59599516SKenneth E. Jansen if(myrank.eq.master) write(*,*)'solver tolerance %satisfaction', 699*59599516SKenneth E. Jansen & (one-echeck*etol/epsnrm)/(one-etol)*100 700*59599516SKenneth E. Jansen 701*59599516SKenneth E. Jansenc 702*59599516SKenneth E. Jansenc.... end of mGMRES loop 703*59599516SKenneth E. Jansenc 704*59599516SKenneth E. Jansen 2000 continue 705*59599516SKenneth E. Jansenc 706*59599516SKenneth E. Jansenc.... ------------------------> Converged <------------------------ 707*59599516SKenneth E. Jansenc 708*59599516SKenneth E. Jansenc.... if converged 709*59599516SKenneth E. Jansenc 710*59599516SKenneth E. Jansen 3000 continue 711*59599516SKenneth E. Jansen 712*59599516SKenneth E. Jansenc 713*59599516SKenneth E. Jansenc 714*59599516SKenneth E. Jansenc.... back block-diagonal precondition the results 715*59599516SKenneth E. Jansenc 716*59599516SKenneth E. Jansen call i3LU (BDiag, Dy, 'backward') 717*59599516SKenneth E. Jansenc 718*59599516SKenneth E. Jansenc 719*59599516SKenneth E. Jansenc.... output the statistics 720*59599516SKenneth E. Jansenc 721*59599516SKenneth E. Jansen call rstat (res, ilwork) 722*59599516SKenneth E. Jansenc 723*59599516SKenneth E. Jansenc.... stop the timer 724*59599516SKenneth E. Jansenc 725*59599516SKenneth E. Jansen 3002 continue ! no solve just res. 726*59599516SKenneth E. Jansen call timer ('Back ') 727*59599516SKenneth E. Jansenc 728*59599516SKenneth E. Jansenc.... end 729*59599516SKenneth E. Jansenc 730*59599516SKenneth E. Jansen return 731*59599516SKenneth E. Jansen end 732*59599516SKenneth E. Jansen 733*59599516SKenneth E. Jansen subroutine SolGMRSclr(y, ac, yold, 734*59599516SKenneth E. Jansen & acold, EGmasst, 735*59599516SKenneth E. Jansen & x, elDw, 736*59599516SKenneth E. Jansen & iBC, BC, 737*59599516SKenneth E. Jansen & rest, HBrg, eBrg, 738*59599516SKenneth E. Jansen & yBrg, Rcos, Rsin, iper, 739*59599516SKenneth E. Jansen & ilwork, 740*59599516SKenneth E. Jansen & shp, shgl, 741*59599516SKenneth E. Jansen & shpb, shglb, Dyt) 742*59599516SKenneth E. Jansenc 743*59599516SKenneth E. Jansenc---------------------------------------------------------------------- 744*59599516SKenneth E. Jansenc 745*59599516SKenneth E. Jansenc This is the preconditioned GMRES driver routine. 746*59599516SKenneth E. Jansenc 747*59599516SKenneth E. Jansenc input: 748*59599516SKenneth E. Jansenc y (nshg,ndof) : Y-variables at n+alpha_f 749*59599516SKenneth E. Jansenc ac (nshg,ndof) : Primvar. accel. variable n+alpha_m 750*59599516SKenneth E. Jansenc yold (nshg,ndof) : Y-variables at beginning of step 751*59599516SKenneth E. Jansenc acold (nshg,ndof) : Primvar. accel. variable at begng step 752*59599516SKenneth E. Jansenc x (numnp,nsd) : node coordinates 753*59599516SKenneth E. Jansenc iBC (nshg) : BC codes 754*59599516SKenneth E. Jansenc BC (nshg,ndofBC) : BC constraint parameters 755*59599516SKenneth E. Jansenc iper (nshg) : periodic nodal information 756*59599516SKenneth E. Jansenc 757*59599516SKenneth E. Jansenc output: 758*59599516SKenneth E. Jansenc y (nshg,ndof) : Y-variables at n+alpha_f 759*59599516SKenneth E. Jansenc ac (nshg,ndof) : Primvar. accel. variable n+alpha_m 760*59599516SKenneth E. Jansenc HBrg (Kspace+1,Kspace) : Hessenberg matrix (LHS matrix) 761*59599516SKenneth E. Jansenc eBrg (Kspace+1) : RHS of Hessenberg minim. problem 762*59599516SKenneth E. Jansenc yBrg (Kspace) : solution of Hessenberg minim. problem 763*59599516SKenneth E. Jansenc Rcos (Kspace) : Rotational cosine of QR algorithm 764*59599516SKenneth E. Jansenc Rsin (Kspace) : Rotational sine of QR algorithm 765*59599516SKenneth E. Jansenc output: 766*59599516SKenneth E. Jansenc rest (numnp) : preconditioned residual 767*59599516SKenneth E. Jansenc 768*59599516SKenneth E. Jansenc 769*59599516SKenneth E. Jansenc Zdenek Johan, Winter 1991. (Fortran 90) 770*59599516SKenneth E. Jansenc---------------------------------------------------------------------- 771*59599516SKenneth E. Jansenc 772*59599516SKenneth E. Jansen use pointer_data 773*59599516SKenneth E. Jansen 774*59599516SKenneth E. Jansen include "common.h" 775*59599516SKenneth E. Jansen include "mpif.h" 776*59599516SKenneth E. Jansen include "auxmpi.h" 777*59599516SKenneth E. Jansenc 778*59599516SKenneth E. Jansen dimension y(nshg,ndof), ac(nshg,ndof), 779*59599516SKenneth E. Jansen & yold(nshg,ndof), acold(nshg,ndof), 780*59599516SKenneth E. Jansen & x(numnp,nsd), 781*59599516SKenneth E. Jansen & iBC(nshg), BC(nshg,ndofBC), 782*59599516SKenneth E. Jansen & rest(nshg), 783*59599516SKenneth E. Jansen & Diag(nshg), 784*59599516SKenneth E. Jansen & HBrg(Kspace+1,*), eBrg(*), 785*59599516SKenneth E. Jansen & yBrg(*), Rcos(*), 786*59599516SKenneth E. Jansen & Rsin(*), ilwork(nlwork), 787*59599516SKenneth E. Jansen & iper(nshg), EGmasst(numel,nshape,nshape) 788*59599516SKenneth E. Jansenc 789*59599516SKenneth E. Jansen dimension Dyt(nshg), rmest(nshg), 790*59599516SKenneth E. Jansen & tempt(nshg), Dinv(nshg), 791*59599516SKenneth E. Jansen & uBrgt(nshg,Kspace+1) 792*59599516SKenneth E. Jansenc 793*59599516SKenneth E. Jansen dimension shp(MAXTOP,maxsh,MAXQPT), 794*59599516SKenneth E. Jansen & shgl(MAXTOP,nsd,maxsh,MAXQPT), 795*59599516SKenneth E. Jansen & shpb(MAXTOP,maxsh,MAXQPT), 796*59599516SKenneth E. Jansen & shglb(MAXTOP,nsd,maxsh,MAXQPT) 797*59599516SKenneth E. Jansen real*8 elDw(numel) 798*59599516SKenneth E. Jansenc 799*59599516SKenneth E. Jansenc.... *******************>> Element Data Formation <<****************** 800*59599516SKenneth E. Jansenc 801*59599516SKenneth E. Jansenc 802*59599516SKenneth E. Jansenc.... form the LHS matrices, the residual vector, and the block 803*59599516SKenneth E. Jansenc diagonal preconditioner 804*59599516SKenneth E. Jansenc 805*59599516SKenneth E. Jansen call ElmGMRSclr(y, ac, 806*59599516SKenneth E. Jansen & x, elDw, shp, shgl, 807*59599516SKenneth E. Jansen & iBC, BC, 808*59599516SKenneth E. Jansen & shpb, shglb, 809*59599516SKenneth E. Jansen & rest, 810*59599516SKenneth E. Jansen & rmest, Diag, iper, 811*59599516SKenneth E. Jansen & ilwork, EGmasst) 812*59599516SKenneth E. Jansenc 813*59599516SKenneth E. Jansenc.... **********************>> EBE - GMRES <<******************** 814*59599516SKenneth E. Jansenc 815*59599516SKenneth E. Jansen call timer ('Solver ') 816*59599516SKenneth E. Jansenc 817*59599516SKenneth E. Jansenc.... ------------------------> Initialization <----------------------- 818*59599516SKenneth E. Jansenc 819*59599516SKenneth E. Jansenc 820*59599516SKenneth E. Jansen id = isclr+5 821*59599516SKenneth E. Jansenc.... initialize Dy 822*59599516SKenneth E. Jansenc 823*59599516SKenneth E. Jansen Dyt = zero 824*59599516SKenneth E. Jansenc 825*59599516SKenneth E. Jansenc.... Right preconditioner 826*59599516SKenneth E. Jansenc 827*59599516SKenneth E. Jansen call i3preSclr(Diag, Dinv, EGmassT, ilwork) 828*59599516SKenneth E. Jansenc 829*59599516SKenneth E. Jansenc Check the residual for divering trend 830*59599516SKenneth E. Jansenc 831*59599516SKenneth E. Jansen call rstatCheckSclr(rest,ilwork,y,ac) 832*59599516SKenneth E. Jansen 833*59599516SKenneth E. Jansenc 834*59599516SKenneth E. Jansenc.... copy rest in uBrgt(1) 835*59599516SKenneth E. Jansenc 836*59599516SKenneth E. Jansen uBrgt(:,1) = rest 837*59599516SKenneth E. Jansenc 838*59599516SKenneth E. Jansenc.... calculate norm of residual 839*59599516SKenneth E. Jansenc 840*59599516SKenneth E. Jansen tempt = rest**2 841*59599516SKenneth E. Jansen 842*59599516SKenneth E. Jansen call sumgat (tempt, 1, summed, ilwork) 843*59599516SKenneth E. Jansen unorm = sqrt(summed) 844*59599516SKenneth E. Jansenc 845*59599516SKenneth E. Jansenc.... check if GMRES iterations are required 846*59599516SKenneth E. Jansenc 847*59599516SKenneth E. Jansen iKst = 0 848*59599516SKenneth E. Jansen lGMRESt = 0 849*59599516SKenneth E. Jansenc 850*59599516SKenneth E. Jansenc.... if we are down to machine precision, don't bother solving 851*59599516SKenneth E. Jansenc 852*59599516SKenneth E. Jansen if (unorm .lt. 100.*epsM**2) goto 3000 853*59599516SKenneth E. Jansenc 854*59599516SKenneth E. Jansenc.... set up tolerance of the Hessenberg's problem 855*59599516SKenneth E. Jansenc 856*59599516SKenneth E. Jansen epsnrm = etol * unorm 857*59599516SKenneth E. Jansenc 858*59599516SKenneth E. Jansenc.... ------------------------> GMRES Loop <------------------------- 859*59599516SKenneth E. Jansenc 860*59599516SKenneth E. Jansenc.... loop through GMRES cycles 861*59599516SKenneth E. Jansenc 862*59599516SKenneth E. Jansen do 2000 mGMRES = 1, nGMRES 863*59599516SKenneth E. Jansen lGMRESt = mGMRES - 1 864*59599516SKenneth E. Jansenc 865*59599516SKenneth E. Jansen if (lGMRESt .gt. 0) then 866*59599516SKenneth E. Jansenc 867*59599516SKenneth E. Jansenc.... if GMRES restarts are necessary, calculate R - A x 868*59599516SKenneth E. Jansenc 869*59599516SKenneth E. Jansenc 870*59599516SKenneth E. Jansen 871*59599516SKenneth E. Jansenc.... perform the A x product 872*59599516SKenneth E. Jansenc 873*59599516SKenneth E. Jansen call Au1GMRSclr (EGmasst, tempt, ilwork, iper) 874*59599516SKenneth E. Jansenc 875*59599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 876*59599516SKenneth E. Jansenc 877*59599516SKenneth E. Jansenc call bc3perSclr (iBC, tempt, iper) 878*59599516SKenneth E. Jansenc 879*59599516SKenneth E. Jansenc.... subtract A x from residual and calculate the norm 880*59599516SKenneth E. Jansenc 881*59599516SKenneth E. Jansen tempt = rest - tempt 882*59599516SKenneth E. Jansen uBrgt(:,1) = tempt 883*59599516SKenneth E. Jansenc 884*59599516SKenneth E. Jansenc.... calculate the norm 885*59599516SKenneth E. Jansenc 886*59599516SKenneth E. Jansen tempt = tempt**2 887*59599516SKenneth E. Jansen call sumgat (tempt, 1, summed, ilwork) 888*59599516SKenneth E. Jansen unorm = sqrt(summed) 889*59599516SKenneth E. Jansenc 890*59599516SKenneth E. Jansenc.... flop count 891*59599516SKenneth E. Jansenc 892*59599516SKenneth E. Jansen flops = flops + ndof*numnp+numnp 893*59599516SKenneth E. Jansenc 894*59599516SKenneth E. Jansen endif 895*59599516SKenneth E. Jansenc 896*59599516SKenneth E. Jansenc.... set up RHS of the Hessenberg's problem 897*59599516SKenneth E. Jansenc 898*59599516SKenneth E. Jansen call clear (eBrg, Kspace+1) 899*59599516SKenneth E. Jansen call clear (HBrg, Kspace+1) 900*59599516SKenneth E. Jansen eBrg(1) = unorm 901*59599516SKenneth E. Jansenc 902*59599516SKenneth E. Jansenc.... normalize the first Krylov vector 903*59599516SKenneth E. Jansenc 904*59599516SKenneth E. Jansen uBrgt(:,1) = uBrgt(:,1) / unorm 905*59599516SKenneth E. Jansenc 906*59599516SKenneth E. Jansenc.... loop through GMRES iterations 907*59599516SKenneth E. Jansenc 908*59599516SKenneth E. Jansen do 1000 iK = 1, Kspace 909*59599516SKenneth E. Jansen iKst = iK 910*59599516SKenneth E. Jansen 911*59599516SKenneth E. Jansen uBrgt(:,iKst+1) = uBrgt(:,iKst) 912*59599516SKenneth E. Jansen 913*59599516SKenneth E. Jansenc.... Au product ( u_{i+1} <-- EGmass u_{i+1} ) 914*59599516SKenneth E. Jansenc 915*59599516SKenneth E. Jansen call Au1GMRSclr ( EGmasst, uBrgt(:,iKst+1), ilwork, iper ) 916*59599516SKenneth E. Jansen 917*59599516SKenneth E. Jansenc 918*59599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 919*59599516SKenneth E. Jansenc 920*59599516SKenneth E. Jansen call bc3perSclr (iBC, uBrgt(:,iKst+1), iper) 921*59599516SKenneth E. Jansenc 922*59599516SKenneth E. Jansenc.... orthogonalize and get the norm 923*59599516SKenneth E. Jansenc 924*59599516SKenneth E. Jansen do jK = 1, iKst+1 925*59599516SKenneth E. Jansenc 926*59599516SKenneth E. Jansen if (jK .eq. 1) then 927*59599516SKenneth E. Jansenc 928*59599516SKenneth E. Jansen tempt = uBrgt(:,iKst+1) * uBrgt(:,1) ! {u_{i+1}*u_1} vector 929*59599516SKenneth E. Jansen call sumgat (tempt, 1, beta, ilwork) ! sum vector=(u_{i+1},u_1) 930*59599516SKenneth E. Jansenc 931*59599516SKenneth E. Jansen else 932*59599516SKenneth E. Jansenc 933*59599516SKenneth E. Jansenc project off jK-1 vector 934*59599516SKenneth E. Jansenc 935*59599516SKenneth E. Jansen uBrgt(:,iKst+1) = uBrgt(:,iKst+1) - beta * uBrgt(:,jK-1) 936*59599516SKenneth E. Jansenc 937*59599516SKenneth E. Jansen tempt = uBrgt(:,iKst+1) * uBrgt(:,jK) !{u_{i+1}*u_j} vector 938*59599516SKenneth E. Jansen call sumgat (tempt, 1, beta, ilwork) ! sum vector=(u_{i+1},u_j) 939*59599516SKenneth E. Jansenc 940*59599516SKenneth E. Jansen endif 941*59599516SKenneth E. Jansenc 942*59599516SKenneth E. Jansen HBrg(jK,iKst) = beta ! put this in the Hessenberg Matrix 943*59599516SKenneth E. Jansenc 944*59599516SKenneth E. Jansen enddo 945*59599516SKenneth E. Jansenc 946*59599516SKenneth E. Jansen flops = flops + (3*iKst+1)*ndof*numnp+(iKst+1)*numnp 947*59599516SKenneth E. Jansenc 948*59599516SKenneth E. Jansenc the last inner product was with what was left of the vector (after 949*59599516SKenneth E. Jansenc projecting off all of the previous vectors 950*59599516SKenneth E. Jansenc 951*59599516SKenneth E. Jansen unorm = sqrt(beta) 952*59599516SKenneth E. Jansen HBrg(iKst+1,iKst) = unorm ! this fills the 1 sub diagonal band 953*59599516SKenneth E. Jansenc 954*59599516SKenneth E. Jansenc.... normalize the Krylov vector 955*59599516SKenneth E. Jansenc 956*59599516SKenneth E. Jansen uBrgt(:,iKst+1) = uBrgt(:,iKst+1) / unorm ! normalize the next Krylov 957*59599516SKenneth E. Jansenc vector 958*59599516SKenneth E. Jansenc 959*59599516SKenneth E. Jansenc.... construct and reduce the Hessenberg Matrix 960*59599516SKenneth E. Jansenc since there is only one subdiagonal we can use a Givens rotation to 961*59599516SKenneth E. Jansenc rotate off each subdiagonal AS IT IS FORMED. We do this because it 962*59599516SKenneth E. Jansenc allows us to check progress of solution and quit when satisfied. Note 963*59599516SKenneth E. Jansenc that all future K vects will put a subdiagonal in the next column so 964*59599516SKenneth E. Jansenc there is no penalty to work ahead as the rotation for the next vector 965*59599516SKenneth E. Jansenc will be unaffected by this rotation. 966*59599516SKenneth E. Jansen 967*59599516SKenneth E. Jansenc 968*59599516SKenneth E. Jansenc H Y = E ========> R_i H Y = R_i E 969*59599516SKenneth E. Jansenc 970*59599516SKenneth E. Jansen do jK = 1, iKst-1 971*59599516SKenneth E. Jansen tmp = Rcos(jK) * HBrg(jK, iKst) + 972*59599516SKenneth E. Jansen & Rsin(jK) * HBrg(jK+1,iKst) 973*59599516SKenneth E. Jansen HBrg(jK+1,iKst) = -Rsin(jK) * HBrg(jK, iKst) + 974*59599516SKenneth E. Jansen & Rcos(jK) * HBrg(jK+1,iKst) 975*59599516SKenneth E. Jansen HBrg(jK, iKst) = tmp 976*59599516SKenneth E. Jansen enddo 977*59599516SKenneth E. Jansenc 978*59599516SKenneth E. Jansen tmp = sqrt(HBrg(iKst,iKst)**2 + HBrg(iKst+1,iKst)**2) 979*59599516SKenneth E. Jansen Rcos(iKst) = HBrg(iKst, iKst) / tmp 980*59599516SKenneth E. Jansen Rsin(iKst) = HBrg(iKst+1,iKst) / tmp 981*59599516SKenneth E. Jansen HBrg(iKst, iKst) = tmp 982*59599516SKenneth E. Jansen HBrg(iKst+1,iKst) = zero 983*59599516SKenneth E. Jansenc 984*59599516SKenneth E. Jansenc.... rotate eBrg R_i E 985*59599516SKenneth E. Jansenc 986*59599516SKenneth E. Jansen tmp = Rcos(iKst)*eBrg(iKst) + Rsin(iKst)*eBrg(iKst+1) 987*59599516SKenneth E. Jansen eBrg(iKst+1)=-Rsin(iKst)*eBrg(iKst) + Rcos(iKst)*eBrg(iKst+1) 988*59599516SKenneth E. Jansen eBrg(iKst) = tmp 989*59599516SKenneth E. Jansenc 990*59599516SKenneth E. Jansenc.... check for convergence 991*59599516SKenneth E. Jansenc 992*59599516SKenneth E. Jansen ercheck=eBrg(iKst+1) 993*59599516SKenneth E. Jansen ntotGMt = ntotGMt + 1 994*59599516SKenneth E. Jansen if (abs(eBrg(iKst+1)) .le. epsnrm) exit 995*59599516SKenneth E. Jansenc 996*59599516SKenneth E. Jansenc.... end of GMRES iteration loop 997*59599516SKenneth E. Jansenc 998*59599516SKenneth E. Jansen 1000 continue 999*59599516SKenneth E. Jansenc 1000*59599516SKenneth E. Jansenc.... -------------------------> Solution <------------------------ 1001*59599516SKenneth E. Jansenc 1002*59599516SKenneth E. Jansenc.... if converged or end of Krylov space 1003*59599516SKenneth E. Jansenc 1004*59599516SKenneth E. Jansenc.... solve for yBrg 1005*59599516SKenneth E. Jansenc 1006*59599516SKenneth E. Jansen do jK = iKst, 1, -1 1007*59599516SKenneth E. Jansen yBrg(jK) = eBrg(jK) / HBrg(jK,jK) 1008*59599516SKenneth E. Jansen do lK = 1, jK-1 1009*59599516SKenneth E. Jansen eBrg(lK) = eBrg(lK) - yBrg(jK) * HBrg(lK,jK) 1010*59599516SKenneth E. Jansen enddo 1011*59599516SKenneth E. Jansen enddo 1012*59599516SKenneth E. Jansenc 1013*59599516SKenneth E. Jansenc.... update Dy 1014*59599516SKenneth E. Jansenc 1015*59599516SKenneth E. Jansen do jK = 1, iKst 1016*59599516SKenneth E. Jansen Dyt = Dyt + yBrg(jK) * uBrgt(:,jK) 1017*59599516SKenneth E. Jansen enddo 1018*59599516SKenneth E. Jansenc 1019*59599516SKenneth E. Jansenc.... flop count 1020*59599516SKenneth E. Jansenc 1021*59599516SKenneth E. Jansen flops = flops + (3*iKst+1)*ndof*numnp 1022*59599516SKenneth E. Jansenc 1023*59599516SKenneth E. Jansenc.... check for convergence 1024*59599516SKenneth E. Jansenc 1025*59599516SKenneth E. Jansen if (abs(eBrg(iKst+1)) .le. epsnrm) exit 1026*59599516SKenneth E. Jansenc 1027*59599516SKenneth E. Jansenc.... end of mGMRES loop 1028*59599516SKenneth E. Jansenc 1029*59599516SKenneth E. Jansen 2000 continue 1030*59599516SKenneth E. Jansenc 1031*59599516SKenneth E. Jansenc.... ------------------------> Converged <------------------------ 1032*59599516SKenneth E. Jansenc 1033*59599516SKenneth E. Jansenc.... if converged 1034*59599516SKenneth E. Jansenc 1035*59599516SKenneth E. Jansen 3000 continue 1036*59599516SKenneth E. Jansenc 1037*59599516SKenneth E. Jansenc.... back precondition the result 1038*59599516SKenneth E. Jansenc 1039*59599516SKenneth E. Jansen Dyt(:) = Dyt(:) * Dinv(:) 1040*59599516SKenneth E. Jansenc 1041*59599516SKenneth E. Jansenc.... output the statistics 1042*59599516SKenneth E. Jansenc 1043*59599516SKenneth E. Jansen call rstatSclr( rest, ilwork,lgmrest,iKst) 1044*59599516SKenneth E. Jansenc.... stop the timer 1045*59599516SKenneth E. Jansenc 1046*59599516SKenneth E. Jansen 3002 continue ! no solve just res. 1047*59599516SKenneth E. Jansen call timer ('Back ') 1048*59599516SKenneth E. Jansenc 1049*59599516SKenneth E. Jansenc.... end 1050*59599516SKenneth E. Jansenc 1051*59599516SKenneth E. Jansen return 1052*59599516SKenneth E. Jansen end 1053*59599516SKenneth E. Jansen 1054*59599516SKenneth E. Jansen 1055