159599516SKenneth E. Jansen subroutine SolGMRe (y, ac, yold, acold, 259599516SKenneth E. Jansen & x, iBC, BC, EGmass, 359599516SKenneth E. Jansen & res, BDiag, HBrg, eBrg, 459599516SKenneth E. Jansen & yBrg, Rcos, Rsin, iper, 559599516SKenneth E. Jansen & ilwork, shp, shgl, shpb, 659599516SKenneth E. Jansen & shglb, Dy, rerr) 759599516SKenneth E. Jansenc 859599516SKenneth E. Jansenc---------------------------------------------------------------------- 959599516SKenneth E. Jansenc 1059599516SKenneth E. Jansenc This is the preconditioned GMRES driver routine. 1159599516SKenneth E. Jansenc 1259599516SKenneth E. Jansenc input: 1359599516SKenneth E. Jansenc y (nshg,ndof) : Y-variables at n+alpha_v 1459599516SKenneth E. Jansenc ac (nshg,ndof) : Primvar. accel. variable n+alpha_m 1559599516SKenneth E. Jansenc yold (nshg,ndof) : Y-variables at beginning of step 1659599516SKenneth E. Jansenc acold (nshg,ndof) : Primvar. accel. variable at begng step 1759599516SKenneth E. Jansenc x (numnp,nsd) : node coordinates 1859599516SKenneth E. Jansenc iBC (nshg) : BC codes 1959599516SKenneth E. Jansenc BC (nshg,ndofBC) : BC constraint parameters 2059599516SKenneth E. Jansenc HBrg (Kspace+1,Kspace) : Hessenberg matrix (LHS matrix) 2159599516SKenneth E. Jansenc eBrg (Kspace+1) : RHS of Hessenberg minim. problem 2259599516SKenneth E. Jansenc yBrg (Kspace) : solution of Hessenberg minim. problem 2359599516SKenneth E. Jansenc Rcos (Kspace) : Rotational cosine of QR algorithm 2459599516SKenneth E. Jansenc Rsin (Kspace) : Rotational sine of QR algorithm 2559599516SKenneth E. Jansenc shp(b) (nen,maxsh,melCat) : element shape functions (boundary) 2659599516SKenneth E. Jansenc shgl(b)(nsd,nen,maxsh,melCat) : local gradients of shape functions 2759599516SKenneth E. Jansenc 2859599516SKenneth E. Jansenc output: 2959599516SKenneth E. Jansenc res (nshg,nflow) : preconditioned residual 3059599516SKenneth E. Jansenc BDiag (nshg,nflow,nflow) : block-diagonal preconditioner 3159599516SKenneth E. Jansenc 3259599516SKenneth E. Jansenc 3359599516SKenneth E. Jansenc Zdenek Johan, Winter 1991. (Fortran 90) 3459599516SKenneth E. Jansenc---------------------------------------------------------------------- 3559599516SKenneth E. Jansenc 3659599516SKenneth E. Jansen use pointer_data 3759599516SKenneth E. Jansen 3859599516SKenneth E. Jansen include "common.h" 3959599516SKenneth E. Jansen include "mpif.h" 4059599516SKenneth E. Jansen include "auxmpi.h" 4159599516SKenneth E. Jansenc 4259599516SKenneth E. Jansen dimension y(nshg,ndof), ac(nshg,ndof), 4359599516SKenneth E. Jansen & yold(nshg,ndof), acold(nshg,ndof), 4459599516SKenneth E. Jansen & x(numnp,nsd), 4559599516SKenneth E. Jansen & iBC(nshg), BC(nshg,ndofBC), 4659599516SKenneth E. Jansen & res(nshg,nflow), 4759599516SKenneth E. Jansen & BDiag(nshg,nflow,nflow), 4859599516SKenneth E. Jansen & HBrg(Kspace+1,*), eBrg(*), 4959599516SKenneth E. Jansen & yBrg(*), Rcos(*), 5059599516SKenneth E. Jansen & Rsin(*), ilwork(nlwork), 5159599516SKenneth E. Jansen & iper(nshg), EGmass(numel,nedof,nedof)!, 5259599516SKenneth E. Jansenctoomuchmem & Binv(numel,nedof,nedof) 5359599516SKenneth E. Jansenc 5459599516SKenneth E. Jansen dimension Dy(nshg,nflow), rmes(nshg,nflow), 5559599516SKenneth E. Jansen & temp(nshg,nflow), 5659599516SKenneth E. Jansen & uBrg(nshg,nflow,Kspace+1) 5759599516SKenneth E. Jansenc 5859599516SKenneth E. Jansen dimension shp(MAXTOP,maxsh,MAXQPT), 5959599516SKenneth E. Jansen & shgl(MAXTOP,nsd,maxsh,MAXQPT), 6059599516SKenneth E. Jansen & shpb(MAXTOP,maxsh,MAXQPT), 6159599516SKenneth E. Jansen & shglb(MAXTOP,nsd,maxsh,MAXQPT) 6259599516SKenneth E. Jansen real*8 rerr(nshg,10) 6359599516SKenneth E. Jansenc 6459599516SKenneth E. Jansenc.... *******************>> Element Data Formation <<****************** 6559599516SKenneth E. Jansenc 6659599516SKenneth E. Jansenc 6759599516SKenneth E. Jansenc.... set the parameters for flux and surface tension calculations 6859599516SKenneth E. Jansenc 6959599516SKenneth E. Jansenc 7059599516SKenneth E. Jansen idflx = 0 7159599516SKenneth E. Jansen if(idiff >= 1) idflx= idflx + (nflow-1) * nsd 7259599516SKenneth E. Jansen if (isurf == 1) idflx=idflx + nsd 7359599516SKenneth E. Jansenc 7459599516SKenneth E. Jansenc.... form the LHS matrices, the residual vector, and the block 7559599516SKenneth E. Jansenc diagonal preconditioner 7659599516SKenneth E. Jansenc 7759599516SKenneth E. Jansen call ElmGMRe(y, ac, x, 7859599516SKenneth E. Jansen & shp, shgl, iBC, 7959599516SKenneth E. Jansen & BC, shpb, 8059599516SKenneth E. Jansen & shglb, res, 8159599516SKenneth E. Jansen & rmes, BDiag, iper, 8259599516SKenneth E. Jansen & ilwork, EGmass, rerr ) 8359599516SKenneth E. Jansenc 8459599516SKenneth E. Jansenc.... **********************>> EBE - GMRES <<******************** 8559599516SKenneth E. Jansenc 8659599516SKenneth E. Jansen call timer ('Solver ') 8759599516SKenneth E. Jansenc 8859599516SKenneth E. Jansenc.... ------------------------> Initialization <----------------------- 8959599516SKenneth E. Jansenc 9059599516SKenneth E. Jansenc 9159599516SKenneth E. Jansenc.... LU decompose the block diagonals 9259599516SKenneth E. Jansenc 9359599516SKenneth E. Jansen if (iprec .ne. 0) 9459599516SKenneth E. Jansen & call i3LU (BDiag, res, 'LU_Fact ') 9559599516SKenneth E. Jansen 9659599516SKenneth E. Jansenc 9759599516SKenneth E. Jansenc.... block diagonal precondition residual 9859599516SKenneth E. Jansenc 9959599516SKenneth E. Jansen call i3LU (BDiag, res, 'forward ') 10059599516SKenneth E. Jansenc 10159599516SKenneth E. Jansenc.... initialize Dy 10259599516SKenneth E. Jansenc 10359599516SKenneth E. Jansen Dy = zero 10459599516SKenneth E. Jansenc 10559599516SKenneth E. Jansenc.... Pre-precondition the LHS mass matrix and set up the element 10659599516SKenneth E. Jansenc by element preconditioners 10759599516SKenneth E. Jansenc 10859599516SKenneth E. Jansenctoomuchmemory note that Binv is demoted from huge array to just one 10959599516SKenneth E. Jansenc real*8 in i3pre because it takes too much memory 11059599516SKenneth E. Jansen 11159599516SKenneth E. Jansen call i3pre (BDiag, Binv, EGmass, ilwork) 11259599516SKenneth E. Jansenc 11359599516SKenneth E. Jansenc.... left EBE precondition the residual 11459599516SKenneth E. Jansenc 11559599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, res, ilwork, 'L_Pcond ') 11659599516SKenneth E. Jansenc 11759599516SKenneth E. Jansenc.... copy res in uBrg(1) 11859599516SKenneth E. Jansenc 11959599516SKenneth E. Jansen uBrg(:,:,1) = res 12059599516SKenneth E. Jansenc 12159599516SKenneth E. Jansenc.... calculate norm of residual 12259599516SKenneth E. Jansenc 12359599516SKenneth E. Jansen temp = res**2 12459599516SKenneth E. Jansen 12559599516SKenneth E. Jansen call sumgat (temp, nflow, summed, ilwork) 12659599516SKenneth E. Jansen unorm = sqrt(summed) 12759599516SKenneth E. Jansenc 12859599516SKenneth E. Jansenc.... check if GMRES iterations are required 12959599516SKenneth E. Jansenc 13059599516SKenneth E. Jansen iKs = 0 13159599516SKenneth E. Jansen lGMRES = 0 13259599516SKenneth E. Jansenc 13359599516SKenneth E. Jansenc.... if we are down to machine precision, don't bother solving 13459599516SKenneth E. Jansenc 13559599516SKenneth E. Jansen if (unorm .lt. 100.*epsM**2) goto 3000 13659599516SKenneth E. Jansenc 13759599516SKenneth E. Jansenc.... set up tolerance of the Hessenberg's problem 13859599516SKenneth E. Jansenc 13959599516SKenneth E. Jansen epsnrm = etol * unorm 14059599516SKenneth E. Jansenc 14159599516SKenneth E. Jansenc.... ------------------------> GMRES Loop <------------------------- 14259599516SKenneth E. Jansenc 14359599516SKenneth E. Jansenc.... loop through GMRES cycles 14459599516SKenneth E. Jansenc 14559599516SKenneth E. Jansen do 2000 mGMRES = 1, nGMRES 14659599516SKenneth E. Jansen lGMRES = mGMRES - 1 14759599516SKenneth E. Jansenc 14859599516SKenneth E. Jansen if (lGMRES .gt. 0) then 14959599516SKenneth E. Jansenc 15059599516SKenneth E. Jansenc.... if GMRES restarts are necessary, calculate R - A x 15159599516SKenneth E. Jansenc 15259599516SKenneth E. Jansenc 15359599516SKenneth E. Jansenc.... right precondition Dy 15459599516SKenneth E. Jansenc 15559599516SKenneth E. Jansen temp = Dy 15659599516SKenneth E. Jansen 15759599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, temp, ilwork, 'R_Pcond ') 15859599516SKenneth E. Jansenc 15959599516SKenneth E. Jansenc.... perform the A x product 16059599516SKenneth E. Jansenc 16159599516SKenneth E. Jansen call Au1GMR (EGmass, temp, ilwork, iBC,iper) 16259599516SKenneth E. Jansenc 16359599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 16459599516SKenneth E. Jansenc 16559599516SKenneth E. Jansen call bc3per (iBC, temp, iper, ilwork, nflow) 16659599516SKenneth E. Jansenc 16759599516SKenneth E. Jansenc.... left preconditioning 16859599516SKenneth E. Jansenc 16959599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, temp, ilwork, 'L_Pcond ') 17059599516SKenneth E. Jansenc 17159599516SKenneth E. Jansenc.... subtract A x from residual and calculate the norm 17259599516SKenneth E. Jansenc 17359599516SKenneth E. Jansen temp = res - temp 17459599516SKenneth E. Jansen uBrg(:,:,1) = temp 17559599516SKenneth E. Jansenc 17659599516SKenneth E. Jansenc.... calculate the norm 17759599516SKenneth E. Jansenc 17859599516SKenneth E. Jansen temp = temp**2 17959599516SKenneth E. Jansen call sumgat (temp, nflow, summed, ilwork) 18059599516SKenneth E. Jansen unorm = sqrt(summed) 18159599516SKenneth E. Jansenc 18259599516SKenneth E. Jansenc.... flop count 18359599516SKenneth E. Jansenc 18459599516SKenneth E. Jansen flops = flops + nflow*nshg+nshg 18559599516SKenneth E. Jansenc 18659599516SKenneth E. Jansen endif 18759599516SKenneth E. Jansenc 18859599516SKenneth E. Jansenc.... set up RHS of the Hessenberg's problem 18959599516SKenneth E. Jansenc 19059599516SKenneth E. Jansen call clear (eBrg, Kspace+1) 19159599516SKenneth E. Jansen eBrg(1) = unorm 19259599516SKenneth E. Jansenc 19359599516SKenneth E. Jansenc.... normalize the first Krylov vector 19459599516SKenneth E. Jansenc 19559599516SKenneth E. Jansen uBrg(:,:,1) = uBrg(:,:,1) / unorm 19659599516SKenneth E. Jansenc 19759599516SKenneth E. Jansenc.... loop through GMRES iterations 19859599516SKenneth E. Jansenc 19959599516SKenneth E. Jansen do 1000 iK = 1, Kspace 20059599516SKenneth E. Jansen iKs = iK 20159599516SKenneth E. Jansen 20259599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs) 20359599516SKenneth E. Jansenc 20459599516SKenneth E. Jansenc.... right EBE precondition the LHS ( u_{i+1} <-- inverse(U) u_i ) 20559599516SKenneth E. Jansenc 20659599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, uBrg(:,:,iKs+1), ilwork, 'R_Pcond ') 20759599516SKenneth E. Jansenc 20859599516SKenneth E. Jansenc.... Au product ( u_{i+1} <-- EGmass u_{i+1} ) 20959599516SKenneth E. Jansenc 21059599516SKenneth E. Jansen call Au1GMR ( EGmass, uBrg(:,:,iKs+1), ilwork, iBC,iper) 21159599516SKenneth E. Jansenc 21259599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 21359599516SKenneth E. Jansenc 21459599516SKenneth E. Jansen call bc3per (iBC, uBrg(:,:,iKs+1), iper, ilwork, nflow) 21559599516SKenneth E. Jansen 21659599516SKenneth E. Jansenc 21759599516SKenneth E. Jansenc.... left EBE precondition the LHS ( u_{i+1} <-- inverse(L) u_{i+1} ) 21859599516SKenneth E. Jansenc 21959599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, uBrg(:,:,iKs+1), ilwork, 'L_Pcond ') 22059599516SKenneth E. Jansenc 22159599516SKenneth E. Jansenc.... orthogonalize and get the norm 22259599516SKenneth E. Jansenc 22359599516SKenneth E. Jansen do jK = 1, iKs+1 22459599516SKenneth E. Jansenc 22559599516SKenneth E. Jansen if (jK .eq. 1) then 22659599516SKenneth E. Jansenc 22759599516SKenneth E. Jansen temp = uBrg(:,:,iKs+1) * uBrg(:,:,1) ! {u_{i+1}*u_1} vector 22859599516SKenneth E. Jansen call sumgat (temp, nflow, beta, ilwork) ! sum vector=(u_{i+1},u_1) 22959599516SKenneth E. Jansenc 23059599516SKenneth E. Jansen else 23159599516SKenneth E. Jansenc 23259599516SKenneth E. Jansenc project off jK-1 vector 23359599516SKenneth E. Jansenc 23459599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs+1) - beta * uBrg(:,:,jK-1) 23559599516SKenneth E. Jansenc 23659599516SKenneth E. Jansen temp = uBrg(:,:,iKs+1) * uBrg(:,:,jK) !{u_{i+1}*u_j} vector 23759599516SKenneth E. Jansen call sumgat (temp, nflow, beta, ilwork) ! sum vector=(u_{i+1},u_j) 23859599516SKenneth E. Jansenc 23959599516SKenneth E. Jansen endif 24059599516SKenneth E. Jansenc 24159599516SKenneth E. Jansen HBrg(jK,iKs) = beta ! put this in the Hessenberg Matrix 24259599516SKenneth E. Jansenc 24359599516SKenneth E. Jansen enddo 24459599516SKenneth E. Jansenc 24559599516SKenneth E. Jansen flops = flops + (3*iKs+1)*nflow*numnp+(iKs+1)*numnp 24659599516SKenneth E. Jansenc 24759599516SKenneth E. Jansenc the last inner product was with what was left of the vector (after 24859599516SKenneth E. Jansenc projecting off all of the previous vectors 24959599516SKenneth E. Jansenc 25059599516SKenneth E. Jansen unorm = sqrt(beta) 25159599516SKenneth E. Jansen HBrg(iKs+1,iKs) = unorm ! this fills the 1 sub diagonal band 25259599516SKenneth E. Jansenc 25359599516SKenneth E. Jansenc.... normalize the Krylov vector 25459599516SKenneth E. Jansenc 25559599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs+1) / unorm ! normalize the next Krylov 25659599516SKenneth E. Jansenc vector 25759599516SKenneth E. Jansenc 25859599516SKenneth E. Jansenc.... construct and reduce the Hessenberg Matrix 25959599516SKenneth E. Jansenc since there is only one subdiagonal we can use a Givens rotation to 26059599516SKenneth E. Jansenc rotate off each subdiagonal AS IT IS FORMED. We do this because it 26159599516SKenneth E. Jansenc allows us to check progress of solution and quit when satisfied. Note 26259599516SKenneth E. Jansenc that all future K vects will put a subdiagonal in the next column so 26359599516SKenneth E. Jansenc there is no penalty to work ahead as the rotation for the next vector 26459599516SKenneth E. Jansenc will be unaffected by this rotation. 26559599516SKenneth E. Jansen 26659599516SKenneth E. Jansenc 26759599516SKenneth E. Jansenc H Y = E ========> R_i H Y = R_i E 26859599516SKenneth E. Jansenc 26959599516SKenneth E. Jansen do jK = 1, iKs-1 27059599516SKenneth E. Jansen tmp = Rcos(jK) * HBrg(jK, iKs) + 27159599516SKenneth E. Jansen & Rsin(jK) * HBrg(jK+1,iKs) 27259599516SKenneth E. Jansen HBrg(jK+1,iKs) = -Rsin(jK) * HBrg(jK, iKs) + 27359599516SKenneth E. Jansen & Rcos(jK) * HBrg(jK+1,iKs) 27459599516SKenneth E. Jansen HBrg(jK, iKs) = tmp 27559599516SKenneth E. Jansen enddo 27659599516SKenneth E. Jansenc 27759599516SKenneth E. Jansen tmp = sqrt(HBrg(iKs,iKs)**2 + HBrg(iKs+1,iKs)**2) 27859599516SKenneth E. Jansen Rcos(iKs) = HBrg(iKs, iKs) / tmp 27959599516SKenneth E. Jansen Rsin(iKs) = HBrg(iKs+1,iKs) / tmp 28059599516SKenneth E. Jansen HBrg(iKs, iKs) = tmp 28159599516SKenneth E. Jansen HBrg(iKs+1,iKs) = zero 28259599516SKenneth E. Jansenc 28359599516SKenneth E. Jansenc.... rotate eBrg R_i E 28459599516SKenneth E. Jansenc 28559599516SKenneth E. Jansen tmp = Rcos(iKs) * eBrg(iKs) + Rsin(iKs) * eBrg(iKs+1) 28659599516SKenneth E. Jansen eBrg(iKs+1) =-Rsin(iKs) * eBrg(iKs) + Rcos(iKs) * eBrg(iKs+1) 28759599516SKenneth E. Jansen eBrg(iKs) = tmp 28859599516SKenneth E. Jansenc 28959599516SKenneth E. Jansenc.... check for convergence 29059599516SKenneth E. Jansenc 29159599516SKenneth E. Jansen ntotGM = ntotGM + 1 29259599516SKenneth E. Jansen echeck=abs(eBrg(iKs+1)) 29359599516SKenneth E. Jansen if (echeck .le. epsnrm) exit 29459599516SKenneth E. Jansenc 29559599516SKenneth E. Jansenc.... end of GMRES iteration loop 29659599516SKenneth E. Jansenc 29759599516SKenneth E. Jansen 1000 continue 29859599516SKenneth E. Jansenc 29959599516SKenneth E. Jansenc.... -------------------------> Solution <------------------------ 30059599516SKenneth E. Jansenc 30159599516SKenneth E. Jansenc.... if converged or end of Krylov space 30259599516SKenneth E. Jansenc 30359599516SKenneth E. Jansenc.... solve for yBrg 30459599516SKenneth E. Jansenc 30559599516SKenneth E. Jansen do jK = iKs, 1, -1 30659599516SKenneth E. Jansen yBrg(jK) = eBrg(jK) / HBrg(jK,jK) 30759599516SKenneth E. Jansen do lK = 1, jK-1 30859599516SKenneth E. Jansen eBrg(lK) = eBrg(lK) - yBrg(jK) * HBrg(lK,jK) 30959599516SKenneth E. Jansen enddo 31059599516SKenneth E. Jansen enddo 31159599516SKenneth E. Jansenc 31259599516SKenneth E. Jansenc.... update Dy 31359599516SKenneth E. Jansenc 31459599516SKenneth E. Jansen do jK = 1, iKs 31559599516SKenneth E. Jansen Dy = Dy + yBrg(jK) * uBrg(:,:,jK) 31659599516SKenneth E. Jansen enddo 31759599516SKenneth E. Jansenc 31859599516SKenneth E. Jansenc.... flop count 31959599516SKenneth E. Jansenc 32059599516SKenneth E. Jansen flops = flops + (3*iKs+1)*nflow*nshg 32159599516SKenneth E. Jansenc 32259599516SKenneth E. Jansenc.... check for convergence 32359599516SKenneth E. Jansenc 32459599516SKenneth E. Jansen 32559599516SKenneth E. Jansen echeck=abs(eBrg(iKs+1)) 32659599516SKenneth E. Jansen if (echeck .le. epsnrm) exit 32759599516SKenneth E. Jansen if(myrank.eq.master) write(*,*)'solver tolerance %satisfaction', 32859599516SKenneth E. Jansen & (one-echeck/unorm)/(one-etol)*100 32959599516SKenneth E. Jansenc 33059599516SKenneth E. Jansenc.... end of mGMRES loop 33159599516SKenneth E. Jansenc 33259599516SKenneth E. Jansen 2000 continue 33359599516SKenneth E. Jansenc 33459599516SKenneth E. Jansenc.... ------------------------> Converged <------------------------ 33559599516SKenneth E. Jansenc 33659599516SKenneth E. Jansenc.... if converged 33759599516SKenneth E. Jansenc 33859599516SKenneth E. Jansen 3000 continue 33959599516SKenneth E. Jansenc 34059599516SKenneth E. Jansenc.... back EBE precondition the results 34159599516SKenneth E. Jansenc 34259599516SKenneth E. Jansenctoomuchmem call i3Pcond (Binv, Dy, ilwork, 'R_Pcond ') 34359599516SKenneth E. Jansenc 34459599516SKenneth E. Jansenc.... back block-diagonal precondition the results 34559599516SKenneth E. Jansenc 34659599516SKenneth E. Jansen call i3LU (BDiag, Dy, 'backward') 34759599516SKenneth E. Jansenc 34859599516SKenneth E. Jansenc 34959599516SKenneth E. Jansenc.... output the statistics 35059599516SKenneth E. Jansenc 35159599516SKenneth E. Jansen call rstat (res, ilwork) 35259599516SKenneth E. Jansenc 35359599516SKenneth E. Jansenc.... stop the timer 35459599516SKenneth E. Jansenc 35559599516SKenneth E. Jansen 3002 continue ! no solve just res. 35659599516SKenneth E. Jansen call timer ('Back ') 35759599516SKenneth E. Jansenc 35859599516SKenneth E. Jansenc.... end 35959599516SKenneth E. Jansenc 36059599516SKenneth E. Jansen return 36159599516SKenneth E. Jansen end 36259599516SKenneth E. Jansen 36359599516SKenneth E. Jansen 36459599516SKenneth E. Jansen 36559599516SKenneth E. Jansen 36659599516SKenneth E. Jansen 36759599516SKenneth E. Jansen subroutine SolGMRs(y, ac, yold, acold, 36859599516SKenneth E. Jansen & x, iBC, BC, 36959599516SKenneth E. Jansen & col, row, lhsk, 37059599516SKenneth E. Jansen & res, BDiag, HBrg, eBrg, 37159599516SKenneth E. Jansen & yBrg, Rcos, Rsin, iper, 37259599516SKenneth E. Jansen & ilwork, shp, shgl, shpb, 37359599516SKenneth E. Jansen & shglb, Dy, rerr) 37459599516SKenneth E. Jansenc 37559599516SKenneth E. Jansenc---------------------------------------------------------------------- 37659599516SKenneth E. Jansenc 37759599516SKenneth E. Jansenc This is the preconditioned GMRES driver routine. 37859599516SKenneth E. Jansenc 37959599516SKenneth E. Jansenc input: 38059599516SKenneth E. Jansenc y (nshg,ndof) : Y-variables at n+alpha_v 38159599516SKenneth E. Jansenc ac (nshg,ndof) : Primvar. accel. variable n+alpha_m 38259599516SKenneth E. Jansenc yold (nshg,ndof) : Y-variables at beginning of step 38359599516SKenneth E. Jansenc acold (nshg,ndof) : Primvar. accel. variable at begng step 38459599516SKenneth E. Jansenc x (numnp,nsd) : node coordinates 38559599516SKenneth E. Jansenc iBC (nshg) : BC codes 38659599516SKenneth E. Jansenc BC (nshg,ndofBC) : BC constraint parameters 38759599516SKenneth E. Jansenc HBrg (Kspace+1,Kspace) : Hessenberg matrix (LHS matrix) 38859599516SKenneth E. Jansenc eBrg (Kspace+1) : RHS of Hessenberg minim. problem 38959599516SKenneth E. Jansenc yBrg (Kspace) : solution of Hessenberg minim. problem 39059599516SKenneth E. Jansenc Rcos (Kspace) : Rotational cosine of QR algorithm 39159599516SKenneth E. Jansenc Rsin (Kspace) : Rotational sine of QR algorithm 39259599516SKenneth E. Jansenc shp(b) (nen,maxsh,melCat) : element shape functions (boundary) 39359599516SKenneth E. Jansenc shgl(b)(nsd,nen,maxsh,melCat) : local gradients of shape functions 39459599516SKenneth E. Jansenc 39559599516SKenneth E. Jansenc output: 39659599516SKenneth E. Jansenc res (nshg,nflow) : preconditioned residual 39759599516SKenneth E. Jansenc BDiag (nshg,nflow,nflow) : block-diagonal preconditioner 39859599516SKenneth E. Jansenc 39959599516SKenneth E. Jansenc 40059599516SKenneth E. Jansenc Zdenek Johan, Winter 1991. (Fortran 90) 40159599516SKenneth E. Jansenc---------------------------------------------------------------------- 40259599516SKenneth E. Jansenc 40359599516SKenneth E. Jansen use pointer_data 40459599516SKenneth E. Jansen 40559599516SKenneth E. Jansen include "common.h" 40659599516SKenneth E. Jansen include "mpif.h" 40759599516SKenneth E. Jansen include "auxmpi.h" 40859599516SKenneth E. Jansenc 40959599516SKenneth E. Jansen integer col(nshg+1), row(nnz*nshg) 41059599516SKenneth E. Jansen real*8 lhsK(nflow*nflow,nnz_tot) 41159599516SKenneth E. Jansen 41259599516SKenneth E. Jansen 41359599516SKenneth E. Jansen dimension y(nshg,ndof), ac(nshg,ndof), 41459599516SKenneth E. Jansen & yold(nshg,ndof), acold(nshg,ndof), 41559599516SKenneth E. Jansen & x(numnp,nsd), 41659599516SKenneth E. Jansen & iBC(nshg), BC(nshg,ndofBC), 41759599516SKenneth E. Jansen & res(nshg,nflow), 41859599516SKenneth E. Jansen & BDiag(nshg,nflow,nflow), 41959599516SKenneth E. Jansen & HBrg(Kspace+1,Kspace), eBrg(Kspace+1), 42059599516SKenneth E. Jansen & yBrg(Kspace), Rcos(Kspace), 42159599516SKenneth E. Jansen & Rsin(Kspace), ilwork(nlwork), 42259599516SKenneth E. Jansen & iper(nshg) 42359599516SKenneth E. Jansenc 42459599516SKenneth E. Jansen dimension Dy(nshg,nflow), rmes(nshg,nflow), 42559599516SKenneth E. Jansen & temp(nshg,nflow), 42659599516SKenneth E. Jansen & uBrg(nshg,nflow,Kspace+1) 42759599516SKenneth E. Jansenc 42859599516SKenneth E. Jansen dimension shp(MAXTOP,maxsh,MAXQPT), 42959599516SKenneth E. Jansen & shgl(MAXTOP,nsd,maxsh,MAXQPT), 43059599516SKenneth E. Jansen & shpb(MAXTOP,maxsh,MAXQPT), 43159599516SKenneth E. Jansen & shglb(MAXTOP,nsd,maxsh,MAXQPT) 43259599516SKenneth E. Jansen real*8 rerr(nshg,10) 43359599516SKenneth E. Jansenc 43459599516SKenneth E. Jansenc 43559599516SKenneth E. Jansenc.... *******************>> Element Data Formation <<****************** 43659599516SKenneth E. Jansenc 43759599516SKenneth E. Jansenc 43859599516SKenneth E. Jansenc.... set the parameters for flux and surface tension calculations 43959599516SKenneth E. Jansenc 44059599516SKenneth E. Jansenc 44159599516SKenneth E. Jansen idflx = 0 44259599516SKenneth E. Jansen if(idiff >= 1) idflx= idflx + (nflow-1) * nsd 44359599516SKenneth E. Jansen if (isurf == 1) idflx=idflx + nsd 44459599516SKenneth E. Jansenc 44559599516SKenneth E. Jansenc.... form the LHS matrices, the residual vector, and the block 44659599516SKenneth E. Jansenc diagonal preconditioner 44759599516SKenneth E. Jansenc 44859599516SKenneth E. Jansen call ElmGMRs(y, ac, x, 44959599516SKenneth E. Jansen & shp, shgl, iBC, 45059599516SKenneth E. Jansen & BC, shpb, 45159599516SKenneth E. Jansen & shglb, res, 45259599516SKenneth E. Jansen & rmes, BDiag, iper, 45359599516SKenneth E. Jansen & ilwork, lhsK, col, 45459599516SKenneth E. Jansen & row, rerr ) 45559599516SKenneth E. Jansen 45659599516SKenneth E. Jansen call tnanq(res,5, 'res_egmr') 45759599516SKenneth E. Jansen call tnanq(BDiag,25, 'bdg_egmr') 45859599516SKenneth E. Jansenc 45959599516SKenneth E. Jansenc.... **********************>> EBE - GMRES <<******************** 46059599516SKenneth E. Jansenc 46159599516SKenneth E. Jansen call timer ('Solver ') 46259599516SKenneth E. Jansenc 46359599516SKenneth E. Jansenc.... ------------------------> Initialization <----------------------- 46459599516SKenneth E. Jansenc 46559599516SKenneth E. Jansenc 46659599516SKenneth E. Jansenc.... LU decompose the block diagonals 46759599516SKenneth E. Jansenc 46859599516SKenneth E. Jansen if (iprec .ne. 0) then 46959599516SKenneth E. Jansen call i3LU (BDiag, res, 'LU_Fact ') 47059599516SKenneth E. Jansen if (numpe > 1) then 47159599516SKenneth E. Jansen call commu (BDiag , ilwork, nflow*nflow , 'out') 47259599516SKenneth E. Jansen endif 47359599516SKenneth E. Jansen endif 47459599516SKenneth E. Jansenc 47559599516SKenneth E. Jansenc.... block diagonal precondition residual 47659599516SKenneth E. Jansenc 47759599516SKenneth E. Jansen call i3LU (BDiag, res, 'forward ') 47859599516SKenneth E. Jansenc 47959599516SKenneth E. Jansenc Check the residual for divering trend 48059599516SKenneth E. Jansenc 48159599516SKenneth E. Jansen call rstatCheck(res,ilwork,y,ac) 48259599516SKenneth E. Jansenc 48359599516SKenneth E. Jansenc.... initialize Dy 48459599516SKenneth E. Jansenc 48559599516SKenneth E. Jansen Dy = zero 48659599516SKenneth E. Jansenc 48759599516SKenneth E. Jansenc.... Pre-precondition the LHS mass matrix and set up the sparse 48859599516SKenneth E. Jansenc preconditioners 48959599516SKenneth E. Jansenc 49059599516SKenneth E. Jansen 49159599516SKenneth E. Jansen if(lhs.eq.1) call Spsi3pre (BDiag, lhsK, col, row) 49259599516SKenneth E. Jansenc 49359599516SKenneth E. Jansenc.... copy res in uBrg(1) 49459599516SKenneth E. Jansenc 49559599516SKenneth E. Jansen uBrg(:,:,1) = res 49659599516SKenneth E. Jansenc 49759599516SKenneth E. Jansenc.... calculate norm of residual 49859599516SKenneth E. Jansenc 49959599516SKenneth E. Jansen temp = res**2 50059599516SKenneth E. Jansen 50159599516SKenneth E. Jansen call sumgat (temp, nflow, summed, ilwork) 50259599516SKenneth E. Jansen unorm = sqrt(summed) 50359599516SKenneth E. Jansenc 50459599516SKenneth E. Jansenc.... check if GMRES iterations are required 50559599516SKenneth E. Jansenc 50659599516SKenneth E. Jansen iKs = 0 507*513954efSKenneth E. Jansen lGMRESs = 0 50859599516SKenneth E. Jansenc 50959599516SKenneth E. Jansenc.... if we are down to machine precision, don't bother solving 51059599516SKenneth E. Jansenc 51159599516SKenneth E. Jansen if (unorm .lt. 100.*epsM**2) goto 3000 51259599516SKenneth E. Jansenc 51359599516SKenneth E. Jansenc.... set up tolerance of the Hessenberg's problem 51459599516SKenneth E. Jansenc 51559599516SKenneth E. Jansen epsnrm = etol * unorm 51659599516SKenneth E. Jansenc 51759599516SKenneth E. Jansenc.... ------------------------> GMRES Loop <------------------------- 51859599516SKenneth E. Jansenc 51959599516SKenneth E. Jansenc.... loop through GMRES cycles 52059599516SKenneth E. Jansenc 52159599516SKenneth E. Jansen do 2000 mGMRES = 1, nGMRES 522*513954efSKenneth E. Jansen lGMRESs = mGMRES - 1 52359599516SKenneth E. Jansenc 52459599516SKenneth E. Jansen if (lGMRES .gt. 0) then 52559599516SKenneth E. Jansenc 52659599516SKenneth E. Jansenc.... if GMRES restarts are necessary, calculate R - A x 52759599516SKenneth E. Jansenc 52859599516SKenneth E. Jansenc 52959599516SKenneth E. Jansenc.... right precondition Dy 53059599516SKenneth E. Jansenc 53159599516SKenneth E. Jansen temp = Dy 53259599516SKenneth E. Jansen 53359599516SKenneth E. Jansenc 53459599516SKenneth E. Jansenc.... perform the A x product 53559599516SKenneth E. Jansenc 53659599516SKenneth E. Jansen call SparseAp (iper,ilwork,iBC, col, row, lhsK, temp) 537*513954efSKenneth E. Jansenc call tnanq(temp,5, 'q_spAPrs') 53859599516SKenneth E. Jansen 53959599516SKenneth E. Jansenc 54059599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 54159599516SKenneth E. Jansenc 54259599516SKenneth E. Jansen call bc3per (iBC, temp, iper, ilwork, nflow) 543*513954efSKenneth E. Jansenc call tnanq(temp,5, 'q_BCprs') 54459599516SKenneth E. Jansenc 54559599516SKenneth E. Jansenc.... subtract A x from residual and calculate the norm 54659599516SKenneth E. Jansenc 54759599516SKenneth E. Jansen temp = res - temp 54859599516SKenneth E. Jansen uBrg(:,:,1) = temp 54959599516SKenneth E. Jansenc 55059599516SKenneth E. Jansenc.... calculate the norm 55159599516SKenneth E. Jansenc 55259599516SKenneth E. Jansen temp = temp**2 55359599516SKenneth E. Jansen call sumgat (temp, nflow, summed, ilwork) 55459599516SKenneth E. Jansen unorm = sqrt(summed) 55559599516SKenneth E. Jansenc 55659599516SKenneth E. Jansenc.... flop count 55759599516SKenneth E. Jansenc 55859599516SKenneth E. Jansen flops = flops + nflow*nshg+nshg 55959599516SKenneth E. Jansenc 56059599516SKenneth E. Jansen endif 56159599516SKenneth E. Jansenc 56259599516SKenneth E. Jansenc.... set up RHS of the Hessenberg's problem 56359599516SKenneth E. Jansenc 56459599516SKenneth E. Jansen call clear (eBrg, Kspace+1) 56559599516SKenneth E. Jansen eBrg(1) = unorm 56659599516SKenneth E. Jansenc 56759599516SKenneth E. Jansenc.... normalize the first Krylov vector 56859599516SKenneth E. Jansenc 56959599516SKenneth E. Jansen uBrg(:,:,1) = uBrg(:,:,1) / unorm 57059599516SKenneth E. Jansenc 57159599516SKenneth E. Jansenc.... loop through GMRES iterations 57259599516SKenneth E. Jansenc 57359599516SKenneth E. Jansen do 1000 iK = 1, Kspace 57459599516SKenneth E. Jansen iKs = iK 57559599516SKenneth E. Jansen 57659599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs) 57759599516SKenneth E. Jansenc 57859599516SKenneth E. Jansenc.... Au product ( u_{i+1} <-- EGmass u_{i+1} ) 57959599516SKenneth E. Jansenc 58059599516SKenneth E. Jansen call SparseAp (iper, ilwork, iBC, 58159599516SKenneth E. Jansen & col, row, lhsK, 58259599516SKenneth E. Jansen & uBrg(:,:,iKs+1) ) 583*513954efSKenneth E. Jansenc call tnanq(uBrg(:,:,iKS+1),5, 'q_spAP') 58459599516SKenneth E. Jansen 58559599516SKenneth E. Jansenc 58659599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 58759599516SKenneth E. Jansenc 58859599516SKenneth E. Jansen call bc3per (iBC, uBrg(:,:,iKs+1), iper, ilwork, nflow) 589*513954efSKenneth E. Jansenc call tnanq(uBrg(:,:,iKS+1),5, 'q_bc') 59059599516SKenneth E. Jansen 59159599516SKenneth E. Jansenc 59259599516SKenneth E. Jansenc.... orthogonalize and get the norm 59359599516SKenneth E. Jansenc 59459599516SKenneth E. Jansen do jK = 1, iKs+1 59559599516SKenneth E. Jansenc 59659599516SKenneth E. Jansen if (jK .eq. 1) then 59759599516SKenneth E. Jansenc 59859599516SKenneth E. Jansen temp = uBrg(:,:,iKs+1) * uBrg(:,:,1) ! {u_{i+1}*u_1} vector 59959599516SKenneth E. Jansen call sumgat (temp, nflow, beta, ilwork) ! sum vector=(u_{i+1},u_1) 60059599516SKenneth E. Jansenc 60159599516SKenneth E. Jansen else 60259599516SKenneth E. Jansenc 60359599516SKenneth E. Jansenc project off jK-1 vector 60459599516SKenneth E. Jansenc 60559599516SKenneth E. Jansen uBrg(:,:,iKs+1)=uBrg(:,:,iKs+1)-beta * uBrg(:,:,jK-1) 60659599516SKenneth E. Jansenc 60759599516SKenneth E. Jansen temp = uBrg(:,:,iKs+1) * uBrg(:,:,jK) !{u_{i+1}*u_j} vector 60859599516SKenneth E. Jansen call sumgat (temp, nflow, beta, ilwork) ! sum vector=(u_{i+1},u_j) 60959599516SKenneth E. Jansenc 61059599516SKenneth E. Jansen endif 61159599516SKenneth E. Jansenc 61259599516SKenneth E. Jansen HBrg(jK,iKs) = beta ! put this in the Hessenberg Matrix 61359599516SKenneth E. Jansenc 61459599516SKenneth E. Jansen enddo 61559599516SKenneth E. Jansenc 61659599516SKenneth E. Jansen flops = flops + (3*iKs+1)*nflow*numnp+(iKs+1)*numnp 61759599516SKenneth E. Jansenc 61859599516SKenneth E. Jansenc the last inner product was with what was left of the vector (after 61959599516SKenneth E. Jansenc projecting off all of the previous vectors 62059599516SKenneth E. Jansenc 621*513954efSKenneth E. Jansen if(beta.le.0) write(*,*) 'beta in solgmr non-positive' 62259599516SKenneth E. Jansen unorm = sqrt(beta) 62359599516SKenneth E. Jansen HBrg(iKs+1,iKs) = unorm ! this fills the 1 sub diagonal band 62459599516SKenneth E. Jansenc 62559599516SKenneth E. Jansenc.... normalize the Krylov vector 62659599516SKenneth E. Jansenc 62759599516SKenneth E. Jansen uBrg(:,:,iKs+1) = uBrg(:,:,iKs+1) / unorm ! normalize the next Krylov 62859599516SKenneth E. Jansenc vector 62959599516SKenneth E. Jansenc 63059599516SKenneth E. Jansenc.... construct and reduce the Hessenberg Matrix 63159599516SKenneth E. Jansenc since there is only one subdiagonal we can use a Givens rotation to 63259599516SKenneth E. Jansenc rotate off each subdiagonal AS IT IS FORMED. We do this because it 63359599516SKenneth E. Jansenc allows us to check progress of solution and quit when satisfied. Note 63459599516SKenneth E. Jansenc that all future K vects will put a subdiagonal in the next column so 63559599516SKenneth E. Jansenc there is no penalty to work ahead as the rotation for the next vector 63659599516SKenneth E. Jansenc will be unaffected by this rotation. 63759599516SKenneth E. Jansen 63859599516SKenneth E. Jansenc 63959599516SKenneth E. Jansenc H Y = E ========> R_i H Y = R_i E 64059599516SKenneth E. Jansenc 64159599516SKenneth E. Jansen do jK = 1, iKs-1 64259599516SKenneth E. Jansen tmp = Rcos(jK) * HBrg(jK, iKs) + 64359599516SKenneth E. Jansen & Rsin(jK) * HBrg(jK+1,iKs) 64459599516SKenneth E. Jansen HBrg(jK+1,iKs) = -Rsin(jK) * HBrg(jK, iKs) + 64559599516SKenneth E. Jansen & Rcos(jK) * HBrg(jK+1,iKs) 64659599516SKenneth E. Jansen HBrg(jK, iKs) = tmp 64759599516SKenneth E. Jansen enddo 64859599516SKenneth E. Jansenc 64959599516SKenneth E. Jansen tmp = sqrt(HBrg(iKs,iKs)**2 + HBrg(iKs+1,iKs)**2) 65059599516SKenneth E. Jansen Rcos(iKs) = HBrg(iKs, iKs) / tmp 65159599516SKenneth E. Jansen Rsin(iKs) = HBrg(iKs+1,iKs) / tmp 65259599516SKenneth E. Jansen HBrg(iKs, iKs)= tmp 65359599516SKenneth E. Jansen HBrg(iKs+1,iKs)= zero 65459599516SKenneth E. Jansenc 65559599516SKenneth E. Jansenc.... rotate eBrg R_i E 65659599516SKenneth E. Jansenc 65759599516SKenneth E. Jansen tmp = Rcos(iKs) * eBrg(iKs) + Rsin(iKs) * eBrg(iKs+1) 65859599516SKenneth E. Jansen eBrg(iKs+1)=-Rsin(iKs) * eBrg(iKs) + Rcos(iKs) * eBrg(iKs+1) 65959599516SKenneth E. Jansen eBrg(iKs) = tmp 66059599516SKenneth E. Jansenc 66159599516SKenneth E. Jansenc.... check for convergence 66259599516SKenneth E. Jansenc 66359599516SKenneth E. Jansen ntotGM = ntotGM + 1 66459599516SKenneth E. Jansen echeck=abs(eBrg(iKs+1)) 66559599516SKenneth E. Jansen if (echeck .le. epsnrm.and. iKs .ge. minIters) exit 66659599516SKenneth E. Jansenc 66759599516SKenneth E. Jansenc.... end of GMRES iteration loop 66859599516SKenneth E. Jansenc 66959599516SKenneth E. Jansen 1000 continue 67059599516SKenneth E. Jansenc 67159599516SKenneth E. Jansenc.... -------------------------> Solution <------------------------ 67259599516SKenneth E. Jansenc 67359599516SKenneth E. Jansenc.... if converged or end of Krylov space 67459599516SKenneth E. Jansenc 67559599516SKenneth E. Jansenc.... solve for yBrg 67659599516SKenneth E. Jansenc 67759599516SKenneth E. Jansen do jK = iKs, 1, -1 67859599516SKenneth E. Jansen yBrg(jK) = eBrg(jK) / HBrg(jK,jK) 67959599516SKenneth E. Jansen do lK = 1, jK-1 68059599516SKenneth E. Jansen eBrg(lK) = eBrg(lK) - yBrg(jK) * HBrg(lK,jK) 68159599516SKenneth E. Jansen enddo 68259599516SKenneth E. Jansen enddo 68359599516SKenneth E. Jansenc 68459599516SKenneth E. Jansenc.... update Dy 68559599516SKenneth E. Jansenc 68659599516SKenneth E. Jansen do jK = 1, iKs 68759599516SKenneth E. Jansen Dy = Dy + yBrg(jK) * uBrg(:,:,jK) 68859599516SKenneth E. Jansen enddo 68959599516SKenneth E. Jansenc 69059599516SKenneth E. Jansenc.... flop count 69159599516SKenneth E. Jansenc 69259599516SKenneth E. Jansen flops = flops + (3*iKs+1)*nflow*nshg 69359599516SKenneth E. Jansenc 69459599516SKenneth E. Jansenc.... check for convergence 69559599516SKenneth E. Jansenc 69659599516SKenneth E. Jansen echeck=abs(eBrg(iKs+1)) 69759599516SKenneth E. Jansen if (echeck .le. epsnrm) exit 698*513954efSKenneth E. Jansen! if(myrank.eq.master) write(*,*)'solver tolerance %satisfaction', 699*513954efSKenneth E. Jansen! & (one-echeck*etol/epsnrm)/(one-etol)*100 70059599516SKenneth E. Jansen 70159599516SKenneth E. Jansenc 70259599516SKenneth E. Jansenc.... end of mGMRES loop 70359599516SKenneth E. Jansenc 70459599516SKenneth E. Jansen 2000 continue 70559599516SKenneth E. Jansenc 70659599516SKenneth E. Jansenc.... ------------------------> Converged <------------------------ 70759599516SKenneth E. Jansenc 70859599516SKenneth E. Jansenc.... if converged 70959599516SKenneth E. Jansenc 71059599516SKenneth E. Jansen 3000 continue 71159599516SKenneth E. Jansen 71259599516SKenneth E. Jansenc 71359599516SKenneth E. Jansenc 71459599516SKenneth E. Jansenc.... back block-diagonal precondition the results 71559599516SKenneth E. Jansenc 71659599516SKenneth E. Jansen call i3LU (BDiag, Dy, 'backward') 71759599516SKenneth E. Jansenc 71859599516SKenneth E. Jansenc 71959599516SKenneth E. Jansenc.... output the statistics 72059599516SKenneth E. Jansenc 72159599516SKenneth E. Jansen call rstat (res, ilwork) 722*513954efSKenneth E. Jansen 723*513954efSKenneth E. Jansen if(myrank.eq.master) then 724*513954efSKenneth E. Jansen if (echeck .le. epsnrm) then 725*513954efSKenneth E. Jansen write(*,*) 726*513954efSKenneth E. Jansen else 727*513954efSKenneth E. Jansen write(*,*)'solver tolerance %satisfaction', 728*513954efSKenneth E. Jansen & (one-echeck*etol/epsnrm)/(one-etol)*100 729*513954efSKenneth E. Jansen endif 730*513954efSKenneth E. Jansen endif 73159599516SKenneth E. Jansenc 73259599516SKenneth E. Jansenc.... stop the timer 73359599516SKenneth E. Jansenc 73459599516SKenneth E. Jansen 3002 continue ! no solve just res. 73559599516SKenneth E. Jansen call timer ('Back ') 73659599516SKenneth E. Jansenc 73759599516SKenneth E. Jansenc.... end 73859599516SKenneth E. Jansenc 73959599516SKenneth E. Jansen return 74059599516SKenneth E. Jansen end 74159599516SKenneth E. Jansen 74259599516SKenneth E. Jansen subroutine SolGMRSclr(y, ac, yold, 74359599516SKenneth E. Jansen & acold, EGmasst, 74459599516SKenneth E. Jansen & x, elDw, 74559599516SKenneth E. Jansen & iBC, BC, 74659599516SKenneth E. Jansen & rest, HBrg, eBrg, 74759599516SKenneth E. Jansen & yBrg, Rcos, Rsin, iper, 74859599516SKenneth E. Jansen & ilwork, 74959599516SKenneth E. Jansen & shp, shgl, 75059599516SKenneth E. Jansen & shpb, shglb, Dyt) 75159599516SKenneth E. Jansenc 75259599516SKenneth E. Jansenc---------------------------------------------------------------------- 75359599516SKenneth E. Jansenc 75459599516SKenneth E. Jansenc This is the preconditioned GMRES driver routine. 75559599516SKenneth E. Jansenc 75659599516SKenneth E. Jansenc input: 75759599516SKenneth E. Jansenc y (nshg,ndof) : Y-variables at n+alpha_f 75859599516SKenneth E. Jansenc ac (nshg,ndof) : Primvar. accel. variable n+alpha_m 75959599516SKenneth E. Jansenc yold (nshg,ndof) : Y-variables at beginning of step 76059599516SKenneth E. Jansenc acold (nshg,ndof) : Primvar. accel. variable at begng step 76159599516SKenneth E. Jansenc x (numnp,nsd) : node coordinates 76259599516SKenneth E. Jansenc iBC (nshg) : BC codes 76359599516SKenneth E. Jansenc BC (nshg,ndofBC) : BC constraint parameters 76459599516SKenneth E. Jansenc iper (nshg) : periodic nodal information 76559599516SKenneth E. Jansenc 76659599516SKenneth E. Jansenc output: 76759599516SKenneth E. Jansenc y (nshg,ndof) : Y-variables at n+alpha_f 76859599516SKenneth E. Jansenc ac (nshg,ndof) : Primvar. accel. variable n+alpha_m 76959599516SKenneth E. Jansenc HBrg (Kspace+1,Kspace) : Hessenberg matrix (LHS matrix) 77059599516SKenneth E. Jansenc eBrg (Kspace+1) : RHS of Hessenberg minim. problem 77159599516SKenneth E. Jansenc yBrg (Kspace) : solution of Hessenberg minim. problem 77259599516SKenneth E. Jansenc Rcos (Kspace) : Rotational cosine of QR algorithm 77359599516SKenneth E. Jansenc Rsin (Kspace) : Rotational sine of QR algorithm 77459599516SKenneth E. Jansenc output: 77559599516SKenneth E. Jansenc rest (numnp) : preconditioned residual 77659599516SKenneth E. Jansenc 77759599516SKenneth E. Jansenc 77859599516SKenneth E. Jansenc Zdenek Johan, Winter 1991. (Fortran 90) 77959599516SKenneth E. Jansenc---------------------------------------------------------------------- 78059599516SKenneth E. Jansenc 78159599516SKenneth E. Jansen use pointer_data 78259599516SKenneth E. Jansen 78359599516SKenneth E. Jansen include "common.h" 78459599516SKenneth E. Jansen include "mpif.h" 78559599516SKenneth E. Jansen include "auxmpi.h" 78659599516SKenneth E. Jansenc 78759599516SKenneth E. Jansen dimension y(nshg,ndof), ac(nshg,ndof), 78859599516SKenneth E. Jansen & yold(nshg,ndof), acold(nshg,ndof), 78959599516SKenneth E. Jansen & x(numnp,nsd), 79059599516SKenneth E. Jansen & iBC(nshg), BC(nshg,ndofBC), 79159599516SKenneth E. Jansen & rest(nshg), 79259599516SKenneth E. Jansen & Diag(nshg), 79359599516SKenneth E. Jansen & HBrg(Kspace+1,*), eBrg(*), 79459599516SKenneth E. Jansen & yBrg(*), Rcos(*), 79559599516SKenneth E. Jansen & Rsin(*), ilwork(nlwork), 79659599516SKenneth E. Jansen & iper(nshg), EGmasst(numel,nshape,nshape) 79759599516SKenneth E. Jansenc 79859599516SKenneth E. Jansen dimension Dyt(nshg), rmest(nshg), 79959599516SKenneth E. Jansen & tempt(nshg), Dinv(nshg), 80059599516SKenneth E. Jansen & uBrgt(nshg,Kspace+1) 80159599516SKenneth E. Jansenc 80259599516SKenneth E. Jansen dimension shp(MAXTOP,maxsh,MAXQPT), 80359599516SKenneth E. Jansen & shgl(MAXTOP,nsd,maxsh,MAXQPT), 80459599516SKenneth E. Jansen & shpb(MAXTOP,maxsh,MAXQPT), 80559599516SKenneth E. Jansen & shglb(MAXTOP,nsd,maxsh,MAXQPT) 80659599516SKenneth E. Jansen real*8 elDw(numel) 80759599516SKenneth E. Jansenc 80859599516SKenneth E. Jansenc.... *******************>> Element Data Formation <<****************** 80959599516SKenneth E. Jansenc 81059599516SKenneth E. Jansenc 81159599516SKenneth E. Jansenc.... form the LHS matrices, the residual vector, and the block 81259599516SKenneth E. Jansenc diagonal preconditioner 81359599516SKenneth E. Jansenc 81459599516SKenneth E. Jansen call ElmGMRSclr(y, ac, 81559599516SKenneth E. Jansen & x, elDw, shp, shgl, 81659599516SKenneth E. Jansen & iBC, BC, 81759599516SKenneth E. Jansen & shpb, shglb, 81859599516SKenneth E. Jansen & rest, 81959599516SKenneth E. Jansen & rmest, Diag, iper, 82059599516SKenneth E. Jansen & ilwork, EGmasst) 82159599516SKenneth E. Jansenc 82259599516SKenneth E. Jansenc.... **********************>> EBE - GMRES <<******************** 82359599516SKenneth E. Jansenc 82459599516SKenneth E. Jansen call timer ('Solver ') 82559599516SKenneth E. Jansenc 82659599516SKenneth E. Jansenc.... ------------------------> Initialization <----------------------- 82759599516SKenneth E. Jansenc 82859599516SKenneth E. Jansenc 82959599516SKenneth E. Jansen id = isclr+5 83059599516SKenneth E. Jansenc.... initialize Dy 83159599516SKenneth E. Jansenc 83259599516SKenneth E. Jansen Dyt = zero 83359599516SKenneth E. Jansenc 83459599516SKenneth E. Jansenc.... Right preconditioner 83559599516SKenneth E. Jansenc 83659599516SKenneth E. Jansen call i3preSclr(Diag, Dinv, EGmassT, ilwork) 83759599516SKenneth E. Jansenc 83859599516SKenneth E. Jansenc Check the residual for divering trend 83959599516SKenneth E. Jansenc 840*513954efSKenneth E. Jansen 84159599516SKenneth E. Jansen call rstatCheckSclr(rest,ilwork,y,ac) 84259599516SKenneth E. Jansen 84359599516SKenneth E. Jansenc 84459599516SKenneth E. Jansenc.... copy rest in uBrgt(1) 84559599516SKenneth E. Jansenc 84659599516SKenneth E. Jansen uBrgt(:,1) = rest 84759599516SKenneth E. Jansenc 84859599516SKenneth E. Jansenc.... calculate norm of residual 84959599516SKenneth E. Jansenc 85059599516SKenneth E. Jansen tempt = rest**2 85159599516SKenneth E. Jansen 85259599516SKenneth E. Jansen call sumgat (tempt, 1, summed, ilwork) 85359599516SKenneth E. Jansen unorm = sqrt(summed) 85459599516SKenneth E. Jansenc 85559599516SKenneth E. Jansenc.... check if GMRES iterations are required 85659599516SKenneth E. Jansenc 857*513954efSKenneth E. Jansen iKss = 0 85859599516SKenneth E. Jansen lGMRESt = 0 85959599516SKenneth E. Jansenc 86059599516SKenneth E. Jansenc.... if we are down to machine precision, don't bother solving 86159599516SKenneth E. Jansenc 86259599516SKenneth E. Jansen if (unorm .lt. 100.*epsM**2) goto 3000 86359599516SKenneth E. Jansenc 86459599516SKenneth E. Jansenc.... set up tolerance of the Hessenberg's problem 86559599516SKenneth E. Jansenc 86659599516SKenneth E. Jansen epsnrm = etol * unorm 86759599516SKenneth E. Jansenc 86859599516SKenneth E. Jansenc.... ------------------------> GMRES Loop <------------------------- 86959599516SKenneth E. Jansenc 87059599516SKenneth E. Jansenc.... loop through GMRES cycles 87159599516SKenneth E. Jansenc 87259599516SKenneth E. Jansen do 2000 mGMRES = 1, nGMRES 87359599516SKenneth E. Jansen lGMRESt = mGMRES - 1 87459599516SKenneth E. Jansenc 87559599516SKenneth E. Jansen if (lGMRESt .gt. 0) then 87659599516SKenneth E. Jansenc 87759599516SKenneth E. Jansenc.... if GMRES restarts are necessary, calculate R - A x 87859599516SKenneth E. Jansenc 87959599516SKenneth E. Jansenc 88059599516SKenneth E. Jansen 88159599516SKenneth E. Jansenc.... perform the A x product 88259599516SKenneth E. Jansenc 88359599516SKenneth E. Jansen call Au1GMRSclr (EGmasst, tempt, ilwork, iper) 88459599516SKenneth E. Jansenc 88559599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 88659599516SKenneth E. Jansenc 88759599516SKenneth E. Jansenc call bc3perSclr (iBC, tempt, iper) 88859599516SKenneth E. Jansenc 88959599516SKenneth E. Jansenc.... subtract A x from residual and calculate the norm 89059599516SKenneth E. Jansenc 89159599516SKenneth E. Jansen tempt = rest - tempt 89259599516SKenneth E. Jansen uBrgt(:,1) = tempt 89359599516SKenneth E. Jansenc 89459599516SKenneth E. Jansenc.... calculate the norm 89559599516SKenneth E. Jansenc 89659599516SKenneth E. Jansen tempt = tempt**2 89759599516SKenneth E. Jansen call sumgat (tempt, 1, summed, ilwork) 89859599516SKenneth E. Jansen unorm = sqrt(summed) 89959599516SKenneth E. Jansenc 90059599516SKenneth E. Jansenc.... flop count 90159599516SKenneth E. Jansenc 90259599516SKenneth E. Jansen flops = flops + ndof*numnp+numnp 90359599516SKenneth E. Jansenc 90459599516SKenneth E. Jansen endif 90559599516SKenneth E. Jansenc 90659599516SKenneth E. Jansenc.... set up RHS of the Hessenberg's problem 90759599516SKenneth E. Jansenc 90859599516SKenneth E. Jansen call clear (eBrg, Kspace+1) 90959599516SKenneth E. Jansen call clear (HBrg, Kspace+1) 91059599516SKenneth E. Jansen eBrg(1) = unorm 91159599516SKenneth E. Jansenc 91259599516SKenneth E. Jansenc.... normalize the first Krylov vector 91359599516SKenneth E. Jansenc 91459599516SKenneth E. Jansen uBrgt(:,1) = uBrgt(:,1) / unorm 91559599516SKenneth E. Jansenc 91659599516SKenneth E. Jansenc.... loop through GMRES iterations 91759599516SKenneth E. Jansenc 91859599516SKenneth E. Jansen do 1000 iK = 1, Kspace 919*513954efSKenneth E. Jansen iKss = iK 92059599516SKenneth E. Jansen 921*513954efSKenneth E. Jansen uBrgt(:,iKss+1) = uBrgt(:,iKss) 92259599516SKenneth E. Jansen 92359599516SKenneth E. Jansenc.... Au product ( u_{i+1} <-- EGmass u_{i+1} ) 92459599516SKenneth E. Jansenc 925*513954efSKenneth E. Jansen call Au1GMRSclr ( EGmasst, uBrgt(:,iKss+1), ilwork, iper ) 92659599516SKenneth E. Jansen 92759599516SKenneth E. Jansenc 92859599516SKenneth E. Jansenc.... periodic nodes have to assemble results to their partners 92959599516SKenneth E. Jansenc 930*513954efSKenneth E. Jansen call bc3perSclr (iBC, uBrgt(:,iKss+1), iper) 93159599516SKenneth E. Jansenc 93259599516SKenneth E. Jansenc.... orthogonalize and get the norm 93359599516SKenneth E. Jansenc 934*513954efSKenneth E. Jansen do jK = 1, iKss+1 93559599516SKenneth E. Jansenc 93659599516SKenneth E. Jansen if (jK .eq. 1) then 93759599516SKenneth E. Jansenc 938*513954efSKenneth E. Jansen tempt = uBrgt(:,iKss+1) * uBrgt(:,1) ! {u_{i+1}*u_1} vector 93959599516SKenneth E. Jansen call sumgat (tempt, 1, beta, ilwork) ! sum vector=(u_{i+1},u_1) 94059599516SKenneth E. Jansenc 94159599516SKenneth E. Jansen else 94259599516SKenneth E. Jansenc 94359599516SKenneth E. Jansenc project off jK-1 vector 94459599516SKenneth E. Jansenc 945*513954efSKenneth E. Jansen uBrgt(:,iKss+1) = uBrgt(:,iKss+1) - beta * uBrgt(:,jK-1) 94659599516SKenneth E. Jansenc 947*513954efSKenneth E. Jansen tempt = uBrgt(:,iKss+1) * uBrgt(:,jK) !{u_{i+1}*u_j} vector 94859599516SKenneth E. Jansen call sumgat (tempt, 1, beta, ilwork) ! sum vector=(u_{i+1},u_j) 94959599516SKenneth E. Jansenc 95059599516SKenneth E. Jansen endif 95159599516SKenneth E. Jansenc 952*513954efSKenneth E. Jansen HBrg(jK,iKss) = beta ! put this in the Hessenberg Matrix 95359599516SKenneth E. Jansenc 95459599516SKenneth E. Jansen enddo 95559599516SKenneth E. Jansenc 956*513954efSKenneth E. Jansen flops = flops + (3*iKss+1)*ndof*numnp+(iKss+1)*numnp 95759599516SKenneth E. Jansenc 95859599516SKenneth E. Jansenc the last inner product was with what was left of the vector (after 95959599516SKenneth E. Jansenc projecting off all of the previous vectors 96059599516SKenneth E. Jansenc 96159599516SKenneth E. Jansen unorm = sqrt(beta) 962*513954efSKenneth E. Jansen HBrg(iKss+1,iKss) = unorm ! this fills the 1 sub diagonal band 96359599516SKenneth E. Jansenc 96459599516SKenneth E. Jansenc.... normalize the Krylov vector 96559599516SKenneth E. Jansenc 966*513954efSKenneth E. Jansen uBrgt(:,iKss+1) = uBrgt(:,iKss+1) / unorm ! normalize the next Krylov 96759599516SKenneth E. Jansenc vector 96859599516SKenneth E. Jansenc 96959599516SKenneth E. Jansenc.... construct and reduce the Hessenberg Matrix 97059599516SKenneth E. Jansenc since there is only one subdiagonal we can use a Givens rotation to 97159599516SKenneth E. Jansenc rotate off each subdiagonal AS IT IS FORMED. We do this because it 97259599516SKenneth E. Jansenc allows us to check progress of solution and quit when satisfied. Note 97359599516SKenneth E. Jansenc that all future K vects will put a subdiagonal in the next column so 97459599516SKenneth E. Jansenc there is no penalty to work ahead as the rotation for the next vector 97559599516SKenneth E. Jansenc will be unaffected by this rotation. 97659599516SKenneth E. Jansen 97759599516SKenneth E. Jansenc 97859599516SKenneth E. Jansenc H Y = E ========> R_i H Y = R_i E 97959599516SKenneth E. Jansenc 980*513954efSKenneth E. Jansen do jK = 1, iKss-1 981*513954efSKenneth E. Jansen tmp = Rcos(jK) * HBrg(jK, iKss) + 982*513954efSKenneth E. Jansen & Rsin(jK) * HBrg(jK+1,iKss) 983*513954efSKenneth E. Jansen HBrg(jK+1,iKss) = -Rsin(jK) * HBrg(jK, iKss) + 984*513954efSKenneth E. Jansen & Rcos(jK) * HBrg(jK+1,iKss) 985*513954efSKenneth E. Jansen HBrg(jK, iKss) = tmp 98659599516SKenneth E. Jansen enddo 98759599516SKenneth E. Jansenc 988*513954efSKenneth E. Jansen tmp = sqrt(HBrg(iKss,iKss)**2 + HBrg(iKss+1,iKss)**2) 989*513954efSKenneth E. Jansen Rcos(iKss) = HBrg(iKss, iKss) / tmp 990*513954efSKenneth E. Jansen Rsin(iKss) = HBrg(iKss+1,iKss) / tmp 991*513954efSKenneth E. Jansen HBrg(iKss, iKss) = tmp 992*513954efSKenneth E. Jansen HBrg(iKss+1,iKss) = zero 99359599516SKenneth E. Jansenc 99459599516SKenneth E. Jansenc.... rotate eBrg R_i E 99559599516SKenneth E. Jansenc 996*513954efSKenneth E. Jansen tmp = Rcos(iKss)*eBrg(iKss) + Rsin(iKss)*eBrg(iKss+1) 997*513954efSKenneth E. Jansen eBrg(iKss+1)=-Rsin(iKss)*eBrg(iKss) + Rcos(iKss)*eBrg(iKss+1) 998*513954efSKenneth E. Jansen eBrg(iKss) = tmp 99959599516SKenneth E. Jansenc 100059599516SKenneth E. Jansenc.... check for convergence 100159599516SKenneth E. Jansenc 1002*513954efSKenneth E. Jansen ercheck=eBrg(iKss+1) 1003*513954efSKenneth E. Jansen ntotGMs = ntotGMs + 1 1004*513954efSKenneth E. Jansen if (abs(eBrg(iKss+1)) .le. epsnrm) exit 100559599516SKenneth E. Jansenc 100659599516SKenneth E. Jansenc.... end of GMRES iteration loop 100759599516SKenneth E. Jansenc 100859599516SKenneth E. Jansen 1000 continue 100959599516SKenneth E. Jansenc 101059599516SKenneth E. Jansenc.... -------------------------> Solution <------------------------ 101159599516SKenneth E. Jansenc 101259599516SKenneth E. Jansenc.... if converged or end of Krylov space 101359599516SKenneth E. Jansenc 101459599516SKenneth E. Jansenc.... solve for yBrg 101559599516SKenneth E. Jansenc 1016*513954efSKenneth E. Jansen do jK = iKss, 1, -1 101759599516SKenneth E. Jansen yBrg(jK) = eBrg(jK) / HBrg(jK,jK) 101859599516SKenneth E. Jansen do lK = 1, jK-1 101959599516SKenneth E. Jansen eBrg(lK) = eBrg(lK) - yBrg(jK) * HBrg(lK,jK) 102059599516SKenneth E. Jansen enddo 102159599516SKenneth E. Jansen enddo 102259599516SKenneth E. Jansenc 102359599516SKenneth E. Jansenc.... update Dy 102459599516SKenneth E. Jansenc 1025*513954efSKenneth E. Jansen do jK = 1, iKss 102659599516SKenneth E. Jansen Dyt = Dyt + yBrg(jK) * uBrgt(:,jK) 102759599516SKenneth E. Jansen enddo 102859599516SKenneth E. Jansenc 102959599516SKenneth E. Jansenc.... flop count 103059599516SKenneth E. Jansenc 1031*513954efSKenneth E. Jansen flops = flops + (3*iKss+1)*ndof*numnp 103259599516SKenneth E. Jansenc 103359599516SKenneth E. Jansenc.... check for convergence 103459599516SKenneth E. Jansenc 1035*513954efSKenneth E. Jansen if (abs(eBrg(iKss+1)) .le. epsnrm) exit 103659599516SKenneth E. Jansenc 103759599516SKenneth E. Jansenc.... end of mGMRES loop 103859599516SKenneth E. Jansenc 103959599516SKenneth E. Jansen 2000 continue 104059599516SKenneth E. Jansenc 104159599516SKenneth E. Jansenc.... ------------------------> Converged <------------------------ 104259599516SKenneth E. Jansenc 104359599516SKenneth E. Jansenc.... if converged 104459599516SKenneth E. Jansenc 104559599516SKenneth E. Jansen 3000 continue 104659599516SKenneth E. Jansenc 104759599516SKenneth E. Jansenc.... back precondition the result 104859599516SKenneth E. Jansenc 104959599516SKenneth E. Jansen Dyt(:) = Dyt(:) * Dinv(:) 105059599516SKenneth E. Jansenc 105159599516SKenneth E. Jansenc.... output the statistics 105259599516SKenneth E. Jansenc 1053*513954efSKenneth E. Jansen call rstatSclr(rest, ilwork) 105459599516SKenneth E. Jansenc.... stop the timer 105559599516SKenneth E. Jansenc 105659599516SKenneth E. Jansen 3002 continue ! no solve just res. 105759599516SKenneth E. Jansen call timer ('Back ') 105859599516SKenneth E. Jansenc 105959599516SKenneth E. Jansenc.... end 106059599516SKenneth E. Jansenc 106159599516SKenneth E. Jansen return 106259599516SKenneth E. Jansen end 106359599516SKenneth E. Jansen 106459599516SKenneth E. Jansen 1065