Implicit time stepping scheme setup/config for compressible flows

Hi,

I recently noticed that implicit time stepping can speed up a simulation considerably, and I have read several forum posts on the topic. I tried the setups recommended by Niki, but my solver kept blowing up after a while. I am running a 3D flow around an aerofoil with the compressible solver. I restarted my simulation using sdirk43 with rk45 as the pseudo scheme. The standard rk45 scheme works well for the same case, even with a slightly larger dt. My configuration file is pasted below. Can you see what the problem might be? And what is the best way to tune these parameters, since there are a lot of them? Thanks in advance.

Best wishes,
Zhenyang

[solver]
system = navier-stokes
order = 2
anti-alias = flux
viscosity-correction = sutherland

;[solver-time-integrator]
;formulation = std
;scheme = rk45
;controller = pi
;tstart = 0.0
;tend = 5000.001
;dt = 0.03

;atol = 0.000001
;rtol = 0.000001
;errest-norm = l2
;safety-fact = 0.9
;min-fact = 0.15   ; 0.1 - 0.5
;max-fact = 6   ; 2.0 - 6.0

[solver-time-integrator]
formulation = dual
scheme = sdirk43
pseudo-scheme = rk45
tstart = 0.0
tend = 5000.001
dt = 0.02
controller = none
pseudo-dt = 0.0005
pseudo-niters-max = 100
pseudo-resid-tol = 1e-6
pseudo-resid-norm = uniform  ; l2
pseudo-controller = local-pi  ; only works for rk34 and rk45

atol = 0.000001
safety-fact = 0.9
min-fact = 0.1   ; 0.1 - 0.5
max-fact = 6   ; 2.0 - 6.0
pseudo-dt-max-mult = 3  ; 2 - 5

[solver-dual-time-integrator-multip]
pseudo-dt-fact = 2.3
cycle = [(2, 1), (1, 1), (0, 3), (1, 1), (2, 5)]

[solver-interfaces]
riemann-solver = hllc
ldg-beta = 0.5
ldg-tau = 0.1

[solver-elements-hex]
soln-pts = gauss-legendre
quad-deg = 7
quad-pts = gauss-legendre

[solver-elements-hex-mg-p2]
soln-pts = gauss-legendre
quad-deg = 7
quad-pts = gauss-legendre

[solver-elements-hex-mg-p1]
soln-pts = gauss-legendre
quad-deg = 5
quad-pts = gauss-legendre

[solver-elements-hex-mg-p0]
soln-pts = gauss-legendre
quad-deg = 3
quad-pts = gauss-legendre

... more element types

At a quick glance I don’t see anything egregious in the config you sent, although the ratio of dt to pseudo-dt is potentially quite large. This can mean that a large number of pseudo steps are needed to converge the system.
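For reference, the ratio in question can be read straight off the config. This is only a minimal sketch using Python's standard configparser on a fragment of the posted file; the helper name dt_ratio is made up for illustration and is not part of PyFR. For the values posted (dt = 0.02, pseudo-dt = 0.0005) the ratio is 40.

```python
import configparser

# Fragment copied from the [solver-time-integrator] section posted above.
CONFIG_TEXT = """
[solver-time-integrator]
formulation = dual
scheme = sdirk43
pseudo-scheme = rk45
dt = 0.02
pseudo-dt = 0.0005
pseudo-niters-max = 100
"""


def dt_ratio(config_text: str) -> float:
    """Return dt / pseudo-dt for a dual time stepping config (hypothetical helper)."""
    # inline_comment_prefixes makes the parser ignore trailing '; ...' notes.
    cfg = configparser.ConfigParser(
        inline_comment_prefixes=(";",), interpolation=None
    )
    cfg.read_string(config_text)
    sect = cfg["solver-time-integrator"]
    return sect.getfloat("dt") / sect.getfloat("pseudo-dt")


ratio = dt_ratio(CONFIG_TEXT)
print(f"dt / pseudo-dt = {ratio:.0f}")
```

The larger this number, the further each pseudo-time march has to travel per physical step, so it is worth keeping an eye on whenever dt or pseudo-dt is changed.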

To debug this you probably want to look at two things: what the solution looks like before it blows up, and what the convergence statistics from the pseudostats plugin look like. The latter will tell you whether your current dual time stepping setup is able to solve the system adequately. Looking at the solution is really just to make sure it isn’t something simple like an incorrect boundary condition.

Thanks Will. I changed the dt/pseudo-dt ratio to 10. Since this test is restarted from a simulation with standard rk45 stepping, I can get an ‘averaged dt’ from that simulation. I read in a post that pseudo-dt is bounded by the CFL number, so I set pseudo-dt to around this value. My pseudostats output is pasted below. I checked the solution at t = 700.2 and it has already blown up. The boundary conditions do not seem to be the cause, since this case was restarted from another simulation that ran without any problem. How do you normally choose dt and pseudo-dt compared with the std schemes? And do you have a recommended pseudo-scheme combination to start with?

Best wishes,
Zhenyang

1,700.0,1,1.2218154695052905,2.638529360375996,0.0538187114890543,0.9027911639907784,3.3847396753547523
2,700.0,2,2.0258986270648144e-05,4.4654266874929216e-05,6.178426603316394e-05,1.3965275864375278e-05,5.365047094724519e-05
3,700.0,3,2.4004105701328312e-06,1.4930912620039756e-06,2.596450495429102e-06,3.3542831556458283e-07,6.351482449416711e-06
4,700.0,4,7.993729381925195e-08,7.712767559724717e-08,4.805977970313394e-08,1.0046092417267826e-08,2.1729345041717294e-07
5,700.0,1,0.0,0.0,0.0,0.0,0.0
6,700.0,1,0.0,0.0,0.0,0.0,0.0
7,700.02,1,0.0,0.0,0.0,0.0,0.0
8,700.02,1,0.0,0.0,0.0,0.0,0.0
9,700.02,1,0.0,0.0,0.0,0.0,0.0
10,700.04,1,0.0,0.0,0.0,0.0,0.0
11,700.04,1,0.0,0.0,0.0,0.0,0.0
12,700.04,1,0.0,0.0,0.0,0.0,0.0
13,700.06,1,0.0,0.0,0.0,0.0,0.0
14,700.06,1,0.0,0.0,0.0,0.0,0.0
15,700.06,1,0.0,0.0,0.0,0.0,0.0
16,700.0799999999999,1,0.0,0.0,0.0,0.0,0.0
17,700.0799999999999,1,0.0,0.0,0.0,0.0,0.0
18,700.0799999999999,1,0.0,0.0,0.0,0.0,0.0
19,700.0999999999999,1,0.0,0.0,0.0,0.0,0.0
20,700.0999999999999,1,0.0,0.0,0.0,0.0,0.0
21,700.0999999999999,1,0.0,0.0,0.0,0.0,0.0
22,700.1199999999999,1,0.0,0.0,0.0,0.0,0.0
23,700.1199999999999,1,0.0,0.0,0.0,0.0,0.0
24,700.1199999999999,1,0.0,0.0,0.0,0.0,0.0
25,700.1399999999999,1,0.0,0.0,0.0,0.0,0.0
26,700.1399999999999,1,0.0,0.0,0.0,0.0,0.0
27,700.1399999999999,1,0.0,0.0,0.0,0.0,0.0
28,700.1599999999999,1,0.0,0.0,0.0,0.0,0.0
29,700.1599999999999,1,0.0,0.0,0.0,0.0,0.0
30,700.1599999999999,1,0.0,0.0,0.0,0.0,0.0

What’s very strange is that your residual has hit zero.
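A small script along these lines can flag such rows automatically. It is only a sketch: the four data rows are copied from the log above, the header line is assumed to match the one shown in the later pseudostats listing in this thread (n,t,i,rho,rhou,rhov,rhow,E), and suspicious_rows is a made-up helper, not a PyFR function.

```python
import csv
import io
import math

# Rows 3 to 6 of the log above, with the header assumed from the later listing.
LOG = """
n,t,i,rho,rhou,rhov,rhow,E
3,700.0,3,2.4004105701328312e-06,1.4930912620039756e-06,2.596450495429102e-06,3.3542831556458283e-07,6.351482449416711e-06
4,700.0,4,7.993729381925195e-08,7.712767559724717e-08,4.805977970313394e-08,1.0046092417267826e-08,2.1729345041717294e-07
5,700.0,1,0.0,0.0,0.0,0.0,0.0
6,700.0,1,0.0,0.0,0.0,0.0,0.0
""".strip()

FIELDS = ("rho", "rhou", "rhov", "rhow", "E")


def suspicious_rows(text):
    """Yield (n, reason) for rows whose residuals are non-finite or all zero."""
    for row in csv.DictReader(io.StringIO(text)):
        resid = [float(row[k]) for k in FIELDS]
        if any(not math.isfinite(r) for r in resid):
            yield int(row["n"]), "non-finite"
        elif all(r == 0.0 for r in resid):
            yield int(row["n"]), "all zero"


for n, reason in suspicious_rows(LOG):
    print(f"row {n}: {reason}")
```

A residual of exactly zero in every field from the second stage onwards is not what a converging pseudo-time march normally looks like, which is why it is worth treating as a warning sign rather than as convergence.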

Can you try running the case in the most basic configuration, i.e. turn off anti-aliasing, Sutherland’s law, p-multigrid, and the pseudo-controller? Then try turning on p-multigrid, followed by the pseudo-controller. This should help narrow down the problem.
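One way to keep the three test configurations consistent is to generate them from the full config. The sketch below uses Python's configparser on a shortened copy of the posted file. It assumes that `none` is an accepted value for anti-alias and viscosity-correction and that leaving out the [solver-dual-time-integrator-multip] section disables p-multigrid; please check both against the user guide for your PyFR version. make_stage is a hypothetical helper.

```python
import configparser
import io

# Shortened copy of the config posted at the top of the thread.
FULL = """
[solver]
system = navier-stokes
order = 2
anti-alias = flux
viscosity-correction = sutherland

[solver-time-integrator]
formulation = dual
scheme = sdirk43
pseudo-scheme = rk45
pseudo-controller = local-pi

[solver-dual-time-integrator-multip]
pseudo-dt-fact = 2.3
cycle = [(2, 1), (1, 1), (0, 3), (1, 1), (2, 5)]
"""


def make_stage(text, *, multigrid, local_pi):
    """Return config text for one debugging stage (hypothetical helper)."""
    cfg = configparser.ConfigParser(
        inline_comment_prefixes=(";",), interpolation=None
    )
    cfg.read_string(text)
    # Assumed option values: 'none' switches each feature off.
    cfg["solver"]["anti-alias"] = "none"
    cfg["solver"]["viscosity-correction"] = "none"
    cfg["solver-time-integrator"]["pseudo-controller"] = (
        "local-pi" if local_pi else "none"
    )
    if not multigrid:
        # Assumption: no multip section means no p-multigrid.
        cfg.remove_section("solver-dual-time-integrator-multip")
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


# Stage 0: everything off; stage 1: add p-multigrid; stage 2: add local-pi.
stages = [
    make_stage(FULL, multigrid=False, local_pi=False),
    make_stage(FULL, multigrid=True, local_pi=False),
    make_stage(FULL, multigrid=True, local_pi=True),
]
print(stages[0])
```

Running the stages in order means only one feature changes at a time, so the first stage that misbehaves points at the feature to look into.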

Hi Will,

Yes, I got it running properly, with pseudostats showing reasonable numbers. However, the implicit solver is now much slower than the standard solver. This time I used the following configuration:

[solver-time-integrator]
formulation = dual
scheme = sdirk43
pseudo-scheme = tvd-rk3
tstart = 0.0
tend = 7101
dt = 0.02
controller = none
pseudo-dt = 0.001
pseudo-niters-max = 100
pseudo-resid-tol = 1e-5
pseudo-resid-norm = l2  ; uniform  ; l2
pseudo-controller = none

The implicit solver takes 6 times longer than standard tvd-rk3 time stepping with dt = 0.001. The pseudostats output shows that it converges very slowly:

n,t,i,rho,rhou,rhov,rhow,E
1,7100.0,1,0.020818976696622266,0.09829219478842562,0.15880090729213642,0.1275615687909794,0.0364952687113483
2,7100.0,2,0.02038821831057994,0.09630663972790712,0.1555448043241856,0.12497511004683985,0.03573871604796547
3,7100.0,3,0.019967408928588594,0.09436164888068814,0.15235494842667807,0.12244036717619758,0.03776019937591393
4,7100.0,4,0.023811975141891095,0.0924563666739131,0.1492300032370752,0.11995632084215314,0.0452839608799286
5,7100.0,5,0.026640256381996266,0.09058995677655872,0.14616865905797063,0.1175219714978203,0.05146594686373992
6,7100.0,6,0.02835000606165764,0.08876160162565093,0.14316963233474628,0.11513633900768827,0.05637239184456599
7,7100.0,7,0.029198826351387652,0.0869705019647499,0.14023166514319188,0.11279846227636686,0.05905973069757229
8,7100.0,8,0.02939360393666993,0.08521587639414757,0.1373535246866902,0.11050739888433984,0.06006403714292553
9,7100.0,9,0.029099343603971015,0.08349696093330551,0.13453400280292901,0.10826222473057379,0.05981146213467663
10,7100.0,10,0.02844671078717531,0.08181300859431136,0.13177191548003256,0.10606203368200343,0.058638339008428395
11,7100.0,11,0.02753840561441731,0.0801632889671315,0.12906610238171004,0.10390593722969896,0.05680809945754917
12,7100.0,12,0.026454500983750746,0.07854708781568817,0.12642038488065505,0.10179306415140331,0.05452538480787372
13,7100.0,13,0.02525687717933011,0.0769637066843726,0.12389928560889846,0.09972256018052934,0.051947726980330636
14,7100.0,14,0.023992879151731636,0.07541246251585458,0.12142803009993375,0.09769358768138126,0.04919514942912287
15,7100.0,15,0.022698312499099343,0.07389268727807899,0.11900564180650039,0.09570532533050292,0.04635800648422455

... Till 100 

96,7100.0,96,0.003075801322687942,0.014205068520142708,0.023776865430908303,0.018470797842253323,0.005349950645748315
97,7100.0,97,0.0030143219166299576,0.013917903672437299,0.023309533267101782,0.018097702142398087,0.005244378183767142
98,7100.0,98,0.0029540455717379822,0.013636512123625004,0.022851304457696696,0.01773204282342924,0.005140855227173257
99,7100.0,99,0.0028949491236622948,0.013360778353851598,0.022402003330512562,0.017373673520762675,0.005039342677633385
100,7100.0,100,0.002837009859021258,0.0130905891535306,0.021961457571489353,0.017022450716657856,0.004939802169332077
101,7100.0,1,0.008286476932685303,0.03912452846929907,0.06309858394842324,0.0505308015857718,0.014295995873011145
102,7100.0,2,0.008120678910347148,0.038336859164928794,0.06183718471377943,0.04949848158481686,0.013997313440326856
103,7100.0,3,0.00795835867426229,0.03756508132554957,0.06060079694860311,0.04848698383915667,0.013704801896351008
104,7100.0,4,0.007799431564110369,0.03680886960644404,0.05938892894889686,0.04749589338944672,0.013418334655608367

How should I tune this?

Best wishes,
Zhenyang

This is to be expected. In order to get good performance you’ll need to enable polynomial multigrid and variable local time stepping. The former can require some tuning, but is essential for getting good convergence.
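To put a rough number on how slow the convergence in the log above is, one can estimate the average reduction per pseudo-iteration and extrapolate to the tolerance. This is a back-of-the-envelope sketch only: it uses the rhou residuals at i = 1 and i = 100 from the log, assumes the decay stays geometric, and assumes the logged value is the quantity compared against pseudo-resid-tol. The function names are made up for illustration.

```python
import math

# rhou residuals at pseudo-iterations i = 1 and i = 100, copied from the log above.
R_FIRST, I_FIRST = 0.09829219478842562, 1
R_LAST, I_LAST = 0.0130905891535306, 100
TOL = 1e-5  # pseudo-resid-tol from the posted config


def decay_factor(r0, i0, r1, i1):
    """Geometric mean reduction per pseudo-iteration between two samples."""
    return (r1 / r0) ** (1.0 / (i1 - i0))


def iters_to_tol(r0, factor, tol):
    """Iterations needed to go from r0 down to tol at a constant decay factor."""
    return math.ceil(math.log(tol / r0) / math.log(factor))


factor = decay_factor(R_FIRST, I_FIRST, R_LAST, I_LAST)
needed = iters_to_tol(R_FIRST, factor, TOL)
print(f"reduction per iteration: {100 * (1 - factor):.1f}%")
print(f"estimated iterations to reach tol: {needed}")
```

With a reduction of only about 2% per iteration, the estimate comes out at several hundred pseudo-iterations per stage, well beyond pseudo-niters-max = 100, so each stage stops at the iteration cap without reaching the tolerance.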

Regards, Freddie.

Hi Freddie,

I actually enabled polynomial multigrid with:

[solver-dual-time-integrator-multip]
pseudo-dt-fact = 1.6    ;  2.3
cycle = [(2, 1), (1, 1), (0, 3), (1, 1), (2, 5)]

[solver-elements-hex]
soln-pts = gauss-legendre
quad-deg = 7  ;  11  ; 7   ; 13
quad-pts = gauss-legendre

[solver-elements-hex-mg-p2]
soln-pts = gauss-legendre
quad-deg = 7 
quad-pts = gauss-legendre

[solver-elements-hex-mg-p1]
soln-pts = gauss-legendre
quad-deg = 5  
quad-pts = gauss-legendre

[solver-elements-hex-mg-p0]
soln-pts = gauss-legendre
quad-deg = 3  
quad-pts = gauss-legendre

... more element types

What kind of tuning is useful for good convergence? How do you choose the number of cycles for each polynomial order? I will try variable local time stepping on the next run.

Best wishes,
Zhenyang

It is often not obvious what the optimal cycle is. (We are working on some machine learning technology to automatically pick a cycle, but this is not yet ready for production.) However, the right choice of cycle can easily have an order of magnitude impact on performance.
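When experimenting with different cycles it can help to sanity-check and summarise each candidate first. The sketch below does that for the cycle posted earlier. It rests on my reading of the cycle format, namely a list of (order, nsteps) pairs that starts and ends at the solver order and changes order by at most one between entries, and on the assumption that pseudo-dt is multiplied by pseudo-dt-fact for each level below the solver order. All function names are hypothetical.

```python
import ast

CYCLE = "[(2, 1), (1, 1), (0, 3), (1, 1), (2, 5)]"  # from the posted config
ORDER = 2            # [solver] order
PSEUDO_DT = 0.001    # pseudo-dt from the later config
PSEUDO_DT_FACT = 1.6


def check_cycle(cycle, order):
    """Raise ValueError if the cycle breaks the assumed rules."""
    levels = [lvl for lvl, _ in cycle]
    if levels[0] != order or levels[-1] != order:
        raise ValueError("cycle must start and end at the solver order")
    if any(abs(a - b) > 1 for a, b in zip(levels, levels[1:])):
        raise ValueError("order may change by at most one between entries")


def steps_per_level(cycle):
    """Total pseudo-steps taken at each polynomial order in one cycle."""
    totals = {}
    for lvl, nsteps in cycle:
        totals[lvl] = totals.get(lvl, 0) + nsteps
    return totals


def level_pseudo_dt(order, pseudo_dt, fact):
    """Assumed scaling: each level below `order` multiplies pseudo-dt by fact."""
    return {lvl: pseudo_dt * fact ** (order - lvl) for lvl in range(order, -1, -1)}


cycle = ast.literal_eval(CYCLE)
check_cycle(cycle, ORDER)
print(steps_per_level(cycle))
print(level_pseudo_dt(ORDER, PSEUDO_DT, PSEUDO_DT_FACT))
```

For the posted cycle this gives six steps at p = 2, two at p = 1 and three at p = 0 per cycle, which makes it easier to compare candidates by how much work they spend on the cheap coarse levels.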

Higher polynomial orders (on coarser meshes) are also to be preferred when running with implicit time stepping, as this gives more coarsening opportunities for polynomial multigrid. Going from p = 4 to p = 0 is a much more substantial coarsening than going from p = 2 to p = 0, and thus polynomial multigrid can be more effective.
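To illustrate the size of that difference for hexahedra: a tensor-product hex element of order p has (p + 1)^3 solution points, so the number of points per element shrinks far more between p = 4 and p = 0 than between p = 2 and p = 0. A quick sketch (the function name is only for illustration):

```python
def hex_soln_pts(p: int) -> int:
    """Solution points in one tensor-product hexahedral element of order p."""
    return (p + 1) ** 3


for top in (2, 4):
    fine, coarse = hex_soln_pts(top), hex_soln_pts(0)
    print(f"p = {top} -> p = 0: {fine} points per element down to {coarse} "
          f"(factor {fine // coarse})")
```

That is a factor of 27 for p = 2 against 125 for p = 4, so the coarsest level is relatively much cheaper at the higher order.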

Picking the right ratio between dt and pseudo-dt is also important. @WillT has more direct experience here, however.

Regards, Freddie.

Thanks. My production run is aiming for a polynomial order of 5. I will try that order and see what happens. In your experience with implicit stepping, how much performance gain can be achieved compared with the standard time stepping scheme on the compressible NS solver?

Best wishes,
Zhenyang

In general I would only expect to see a minimal performance improvement for the compressible solver.

Regards, Freddie.