PyFR simulating NACA0021 airfoil in deep stall

I have now run the 4c case on 12 V100 GPUs. The start-up went as follows:

100.0% [===========================>] 100.00/100.00 ela: 01:26:54 rem: 00:00:00

which is reasonable (although p = 1 is somewhat pathological, so the speed-up is less than the 7/4 we might expect). Restarting at p = 4, I observed:

31.7% [++++++==>                  ] 126.96/400.00 ela: 19:28:46 rem: 197:17:17

which has made it nicely past the point where you were observing issues. The 7c case is also still running without issues with:

32.7% [++++++==>                  ] 130.91/400.00 ela: 36:44:03 rem: 319:45:38
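
As a rough cross-check of the two progress lines above, the wall-clock cost per unit of simulated time can be compared between the 4c and 7c runs. The sketch below assumes that both p = 4 runs were restarted from t = 100 and that `ela` counts wall time since that restart; the helper name `seconds_per_time_unit` is made up for this illustration.

```python
import re

# Progress lines copied from the posts above.
LINE_4C = "31.7% [++++++==>                  ] 126.96/400.00 ela: 19:28:46 rem: 197:17:17"
LINE_7C = "32.7% [++++++==>                  ] 130.91/400.00 ela: 36:44:03 rem: 319:45:38"

# Assumption: both runs were restarted from the p = 1 warm-up at t = 100.
T_RESTART = 100.0


def seconds_per_time_unit(line, t_restart=T_RESTART):
    """Wall-clock seconds spent per unit of simulated time since the restart."""
    m = re.search(r"\]\s*([\d.]+)/[\d.]+\s+ela:\s*(\d+):(\d+):(\d+)", line)
    if m is None:
        raise ValueError("unrecognised progress line")
    t_now = float(m.group(1))
    hours, minutes, seconds = (int(g) for g in m.groups()[1:])
    elapsed = 3600 * hours + 60 * minutes + seconds
    return elapsed / (t_now - t_restart)


cost_4c = seconds_per_time_unit(LINE_4C)
cost_7c = seconds_per_time_unit(LINE_7C)
ratio = cost_7c / cost_4c
print(f"4c: {cost_4c:.0f} s/unit, 7c: {cost_7c:.0f} s/unit, ratio: {ratio:.2f}")
```

Under those assumptions the 7c run costs roughly 1.6 times as much per unit of simulated time as the 4c run, which can be compared with the 7/4 factor mentioned above.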

In terms of next steps can you try moving to the latest (Git master) version of PyFR, reimporting your mesh, and re-running?

Regards, Freddie.

Hello Freddie,

Thanks for your kind answer. I am actually using PyFR v1.12.0. Unfortunately, our new cluster does not have internet access, so I cannot use git. However, I noticed that the most recent release of PyFR is 1.12.1, so I can try installing it manually, replacing the version in my Python environment via

python setup.py build
python setup.py install

from the newest version.

I’ll try that ASAP and will soon come back to you.

Best,
Federico Cipolletta.

Hi Federico,

Yes, it's definitely worth giving v1.12.1 a try. I think there was an issue with the mesh importer in v1.12.0 that led to erroneous linearisation of curved-element meshes, which could be causing your stability problems.

Peter

Hello everyone,

I tried installing PyFR v1.12.1 on top of my Python environment, which contained PyFR v1.12.0. However, upgrading simply via

python setup.py build
python setup.py install

does not seem to work out of the box. From the second command, I obtain:

...
running install_data
copying pyfr/__main__.py -> build/bdist.linux-x86_64/egg/
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying pyfr.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pyfr.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pyfr.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pyfr.egg-info/entry_points.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pyfr.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pyfr.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
  File "build/bdist.linux-x86_64/egg/pyfr/solvers/base/system.py", line 80
    if (m := re.match(f'spt_(.+?)_p{rallocs.prank}$', f)):
          ^
SyntaxError: invalid syntax

  File "build/bdist.linux-x86_64/egg/pyfr/writers/native.py", line 98
    if (m := re.match(bn, f)):
          ^
SyntaxError: invalid syntax

  File "build/bdist.linux-x86_64/egg/pyfr/integrators/base.py", line 71
    if (m := re.match('soln-plugin-(.+?)(?:-(.+))?$', s)):
          ^
SyntaxError: invalid syntax

  File "build/bdist.linux-x86_64/egg/pyfr/integrators/dual/pseudo/multip.py", line 79
    if (m := re.match(f'solver-(.*)-mg-p{l}$', s)):
          ^
SyntaxError: invalid syntax

  File "build/bdist.linux-x86_64/egg/pyfr/rank_allocator.py", line 58
    if (m := re.match(r'con_p(\d+)p(\d+)$', f)):
          ^
SyntaxError: invalid syntax

  File "build/bdist.linux-x86_64/egg/pyfr/partitioners/base.py", line 64
    if (mi := re.match(r'con_p(\d+)$', f)):
           ^
SyntaxError: invalid syntax

zip_safe flag not set; analyzing archive contents...
creating dist
...

and, even though pip freeze reports the correct new version of PyFR (i.e. 1.12.1), when I attempt to run the NACA0021 test I obtain the following error:

...
  File "/davinci-1/home/cipollettaf/pyfr_mlnx/lib/python3.7/site-packages/pyfr-1.12.1-py3.7.egg/pyfr/partitioners/base.py", line 64
    if (mi := re.match(r'con_p(\d+)$', f)):
           ^
SyntaxError: invalid syntax
...

repeated once for each GPU I am running on. This error seems to happen at mesh import, because the .pyfrm file is not produced.

Are you aware of differences between version 1.12.0 and 1.12.1 that can produce those errors?

If I try to install PyFR 1.12.1 via pip install, I get an error indicating that my Python (3.7.9) is too old, since the code requires Python >= 3.8.

I verified that the setup.py script is actually explicitly requiring

python_requires='>=3.8'

on line 157. However, simply modifying this line does not fix the problem, and the errors remain. It seems that the changes in the mesh importer (essentially the use of := in some if conditions) implicitly introduced the requirement of Python 3.8 (see https://towardsdatascience.com/when-and-why-to-use-over-in-python-b91168875453). You may want to update the requirements in the official documentation, unless you intend to avoid Python 3.8 syntax.
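
To make the point about := concrete, here is a small sketch of a version guard together with a pre-3.8 spelling of one of the patterns from the traceback above. The function names `python_is_new_enough` and `match_partition_pre38` are invented for this example and are not part of PyFR; the regular expression is the one shown in the error for pyfr/partitioners/base.py.

```python
import re
import sys

# Minimum interpreter version quoted from setup.py (python_requires='>=3.8').
MIN_PYTHON = (3, 8)


def python_is_new_enough(version=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version[:2]) >= minimum


def match_partition_pre38(name):
    """Pre-3.8 equivalent of: if (mi := re.match(r'con_p(\\d+)$', name)): ...

    The assignment and the test are written as two separate statements,
    which is the only form that a Python 3.7 parser accepts.
    """
    mi = re.match(r'con_p(\d+)$', name)
    if mi:
        return int(mi.group(1))
    return None


if __name__ == '__main__':
    print('Interpreter OK for PyFR 1.12.1:', python_is_new_enough())
    print(match_partition_pre38('con_p3'))
```

Because the walrus form is rejected when the file is parsed, not when it runs, editing python_requires cannot help: every module containing := fails to compile under 3.7 regardless of the declared requirement.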

Best,
Federico Cipolletta.

This is the fix for the issue I mentioned: Bug fix. · PyFR/PyFR@8f59954 · GitHub

Yes, we require Python 3.8 (although I believe this change came in with 1.12.0, so if you were using this version you should not be encountering these problems) and this carries through to PyPI, which notes the dependency on 3.8 or later.

I am unsure why the setup.py build/install route does not catch this although generally pip install . is a better bet than calling setup.py manually.

In terms of next steps I would verify what release you’ve been running heretofore, as it may not have been 1.12.0. Then, bump your Python version to 3.8 or later and proceed with installing 1.12.1.

Regards, Freddie.

Unfortunately for me, as we have restrictions regarding the internet connection from within the company, adopting pip is not straightforward.

However, creating a new Python environment from scratch and installing the required packages from source worked for me. The full list of packages installed in my new environment is a little longer than the requirements listed in the documentation, and is the following:

appdirs==1.4.4
Cython==3.0a6       (Required by numpy)
gimmik==2.1
h5py==3.3.0
Mako==1.1.4
MarkupSafe==1.1.1   (Required by Mako)
mpi4py==3.0.3
numpy==1.20.3
pyfr==1.12.1
pytools==2021.2.6
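
For anyone reproducing this environment offline, the annotated list above can be turned into plain name/version pairs and compared with what is actually installed. This is only a sketch: `parse_pins` and `check_environment` are hypothetical helper names, and `importlib.metadata` is used on the assumption that the interpreter is already Python 3.8 or later, as PyFR 1.12.1 requires.

```python
import re

# Package list copied from the post above, annotations included.
PINNED = """\
appdirs==1.4.4
Cython==3.0a6       (Required by numpy)
gimmik==2.1
h5py==3.3.0
Mako==1.1.4
MarkupSafe==1.1.1   (Required by Mako)
mpi4py==3.0.3
numpy==1.20.3
pyfr==1.12.1
pytools==2021.2.6
"""


def parse_pins(text):
    """Extract {package: version}, ignoring trailing '(Required by ...)' notes."""
    pins = {}
    for line in text.splitlines():
        m = re.match(r"\s*([A-Za-z0-9_.-]+)==(\S+)", line)
        if m:
            pins[m.group(1)] = m.group(2)
    return pins


def check_environment(pins):
    """Map each package to (wanted, found); found is None if not installed."""
    from importlib.metadata import PackageNotFoundError, version

    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        report[name] = (wanted, found)
    return report


if __name__ == "__main__":
    for name, (wanted, found) in check_environment(parse_pins(PINNED)).items():
        status = "ok" if found == wanted else "MISMATCH"
        print(f"{name:12s} wanted {wanted:10s} found {found} [{status}]")
```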

I am now running the p1 warmup of the NACA0021 test with the new environment and will then attempt switching again to p4.

I will come back to you as soon as I have results.

Best,
Federico Cipolletta.

Hello everyone,

It seems that adopting the latest version of PyFR (v 1.12.1) made my simulation stable.

To recap, I am running the 4c mesh case, with a warmup with order 1 up to time t=100 and restarting from there switching to order 4, and activating anti-aliasing. I am running the simulation on 12 NVIDIA A100 GPUs.

The warmup was completed with the following stdout:

 100.0% [===========================>] 100.00/100.00 ela: 01:04:45 rem: 00:00:00

The differences of the config files (warmup vs restart) that I am using are as follows:

--- P1_warmup/p1_warmup.ini     2021-07-20 09:41:39.000000000 +0200
+++ P4_4c/p4_aa.ini     2021-07-21 12:42:34.000000000 +0200
@@ -12,8 +12,8 @@
 [constants]
...
-mu = 0.000037
+mu = 0.0000037
@@ -22,9 +22,9 @@
 [solver]
...
-order = 1
+order = 4
+anti-alias = flux, surf-flux
...
 [solver-time-integrator]
...
@@ -32,7 +32,8 @@
-tend = 100.0
+tend = 400.0
...
 [soln-plugin-writer]
...
@@ -77,7 +78,7 @@
-basename = naca_p1_4c-{t:.2f}
+basename = naca_p4_4c-{t:.2f}
...
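
The same changes can be applied programmatically when generating the restart file from the warmup file. The following is a minimal sketch using Python's standard configparser rather than PyFR's own config reader; the `WARMUP` string is a trimmed stand-in containing only the keys that appear in the diff above, and `make_restart_config` is a hypothetical helper name.

```python
import configparser
import io

# Trimmed stand-in for the warmup file: only the keys touched by the diff.
WARMUP = """\
[constants]
mu = 0.000037

[solver]
order = 1

[solver-time-integrator]
tend = 100.0

[soln-plugin-writer]
basename = naca_p1_4c-{t:.2f}
"""

# Values taken from the '+' lines of the diff above.
OVERRIDES = {
    "constants": {"mu": "0.0000037"},
    "solver": {"order": "4", "anti-alias": "flux, surf-flux"},
    "solver-time-integrator": {"tend": "400.0"},
    "soln-plugin-writer": {"basename": "naca_p4_4c-{t:.2f}"},
}


def make_restart_config(text, overrides):
    """Return the config text with the given per-section overrides applied."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # keep key case as written instead of lower-casing
    cfg.read_string(text)
    for section, values in overrides.items():
        if not cfg.has_section(section):
            cfg.add_section(section)
        for key, value in values.items():
            cfg.set(section, key, value)
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


restart_text = make_restart_config(WARMUP, OVERRIDES)
print(restart_text)
```

Note that configparser drops comments when writing, so for a real config file a manual edit or a plain diff/patch may be preferable.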

The current status of the restart is as follows:

  34.9% [++++++===>                 ] 139.70/400.00 ela: 20:25:31 rem: 133:56:13

NOTE: if one forgets to activate the anti-aliasing, the simulation does crash, and this is probably due to the fact that the turbulence created by the deep stall position of the NACA airfoil profile with respect to the velocity of the fluid needs some artificial dissipation.

Thanks for the help!

Glad to hear you got it working. Just a note regarding:

“if one forgets to activate the anti-aliasing, the simulation does crash, and this is probably due to the fact that the turbulence created by the deep stall position of the NACA airfoil profile with respect to the velocity of the fluid needs some artificial dissipation.”

Anti-aliasing doesn’t add artificial dissipation; rather, it suppresses spurious transfer of energy from unresolved modes into the highest-energy resolved modes (which can then lead to instability).
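
A toy illustration of this mechanism, using a Fourier analogy rather than the polynomial basis that PyFR actually uses: squaring a wavenumber-3 cosine produces a wavenumber-6 component, which 8 sample points cannot represent, so its energy shows up at wavenumber 2 instead. Sampling the same product on 16 points keeps the energy where it belongs. The function names here are made up for the example, and the link to quadrature-based anti-aliasing is only a loose analogy.

```python
import cmath
import math


def fourier_coeff(samples, k):
    """Discrete Fourier coefficient of wavenumber k for equispaced samples."""
    n = len(samples)
    total = sum(s * cmath.exp(-2j * math.pi * k * j / n)
                for j, s in enumerate(samples))
    return total / n


def squared_mode_samples(n, wavenumber=3):
    """Sample cos(wavenumber * x)**2 at n equispaced points on [0, 2*pi)."""
    return [math.cos(wavenumber * 2 * math.pi * j / n) ** 2 for j in range(n)]


# cos(3x)**2 = 1/2 + cos(6x)/2, so the exact product has no k = 2 content.
coarse = squared_mode_samples(8)    # k = 6 is not resolvable on 8 points
fine = squared_mode_samples(16)     # k = 6 is resolvable on 16 points

aliased_k2 = abs(fourier_coeff(coarse, 2))
clean_k2 = abs(fourier_coeff(fine, 2))
true_k6 = abs(fourier_coeff(fine, 6))

print(f"k=2 on  8 points: {aliased_k2:.3f}  (spurious)")
print(f"k=2 on 16 points: {clean_k2:.3f}")
print(f"k=6 on 16 points: {true_k6:.3f}")
```

No dissipation is involved in the 16-point version: the nonlinear term is simply evaluated with enough resolution that its unresolved content is not misread as a resolved mode.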

@FedericoCipolletta @fdw
Dear Federico Cipolletta and Freddie,
I am trying to run the same simulation using PyFR 1.14.0. However, I get the same error (RuntimeError: Minimum sized time step rejected) even in the first part of the simulation. I followed your suggestions, but they do not seem to work.

Here are my PyFR configuration file and the shell script for running the simulation. Could you take a look, please?

[backend]
precision = double
rank-allocator = linear

[backend-openmp]
;cc = gcc
;cblas-st = Enter path to local single-threaded BLAS library for OpenMP backend
;cblas-mt = Enter path to local multi-threaded BLAS library for OpenMP backend

[constants]
gamma = 1.4
mu = 0.000037
Pr = 0.72

M = 0.1
uc = 1.0
rhoc = 1.0

[solver]
system = navier-stokes
order = 1
;viscosity-correction = none
;anti-alias = flux, surf-flux

[solver-time-integrator]
scheme = rk45
controller = pi
tstart = 0.0
tend = 100
dt = 2.5e-8
atol = 1e-6
rtol = 1e-6
min-fact = 0.3
max-fact = 1.05

[solver-interfaces]
riemann-solver = roem
ldg-beta = 0.5
ldg-tau = 0.1

[solver-interfaces-quad]
flux-pts = gauss-legendre
quad-deg = 11
quad-pts = gauss-legendre

[solver-interfaces-line]
flux-pts = gauss-legendre
quad-deg = 11

[solver-elements-tri]
soln-pts = williams-shunn
quad-deg = 9

[solver-elements-quad]
soln-pts = gauss-legendre
quad-deg = 11
quad-pts = gauss-legendre

[solver-elements-hex]
soln-pts = gauss-legendre
quad-deg = 11
quad-pts = gauss-legendre

[soln-plugin-writer]
dt-out = 5.0
basedir = ./solutions/
basename = naca_p1_4c-{t:.2f}

[soln-bcs-wall]
type = no-slp-adia-wall

[soln-bcs-inflow]
type = char-riem-inv
rho = rhoc
u = uc
v = 0
w = 0
p = rhoc*uc*uc/(M*M*gamma)

[soln-bcs-outflow]
type = char-riem-inv
p = rhoc*uc*uc/(M*M*gamma)
u = uc
v = 0
w = 0
rho = rhoc

[soln-ics]
rho = rhoc
u = uc
v = 0
w = 0
p = rhoc*uc*uc/(M*M*gamma)

[soln-filter]
freq = 0
cutoff = 0
order = 16
alpha = 36

[soln-plugin-fluidforce-wall]
nsteps = 100
file = wall-forces.csv
header = true

#!/bin/sh
rm -rf naca0021_p4_4c.pyfrm *.csv solutions/
mkdir solutions
pyfr import naca0021_p4_4c.msh naca0021_p4_4c.pyfrm
pyfr partition 20 naca0021_p4_4c.pyfrm .
mpiexec -n 20 pyfr run -b openmp -p naca0021_p4_4c.pyfrm p4_aa_01.ini
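
As a sanity check on the [constants] section of the configuration above, the derived quantities can be evaluated by hand. The sketch below recomputes the free-stream pressure expression used in the boundary and initial conditions, the resulting Mach number, and the Reynolds number for both viscosities that appear in this thread. The unit chord length is an assumption, since the mesh is not shown here.

```python
# Values copied from the [constants] section above.
gamma = 1.4
mu_warmup = 0.000037
M = 0.1
uc = 1.0
rhoc = 1.0

# Value of mu from the p = 4 restart diff earlier in the thread.
mu_restart = 0.0000037

# Assumption: the airfoil chord is 1 in mesh units.
chord = 1.0

# Same expression as in [soln-bcs-*] and [soln-ics].
p = rhoc * uc * uc / (M * M * gamma)

# Speed of sound and Mach number implied by that pressure.
sound_speed = (gamma * p / rhoc) ** 0.5
mach = uc / sound_speed

# Chord-based Reynolds numbers for the two viscosities.
re_warmup = rhoc * uc * chord / mu_warmup
re_restart = rhoc * uc * chord / mu_restart

print(f"p = {p:.4f}, Mach = {mach:.3f}")
print(f"Re (mu = {mu_warmup}) = {re_warmup:.0f}")
print(f"Re (mu = {mu_restart}) = {re_restart:.0f}")
```

With a unit chord, the posted mu corresponds to a Reynolds number of about 27,000, ten times lower than with the restart value of mu.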

Hello @hwtang,

I only compared the config that I managed to run with yours, and I noticed that you are setting the following two parameters

min-fact = 0.3
max-fact = 1.05

which configure the adaptive time-stepping for your simulation (have a look at this page: User Guide — Documentation). I did not specify either of those parameters, but it seems to me that you are using the default value for min-fact and a smaller (i.e. safer from the CFL point of view) value for max-fact.

I would suggest having a look at the dtstats file, which provides additional information about the time-stepping (for example, which steps are rejected and which are accepted), by adding the following block of parameters:

[soln-plugin-dtstats]
flushsteps = 500
file = dtstats.csv
header = true

Finally, when using the OpenMP backend as in your case, I recall having to specify the following:

[backend-openmp]
cc = gcc
cblas-mt = libmkl_rt.so
gimmik-max-nnz = 8192

but this may well depend on the environment (i.e. architecture and installation paths for the libraries) that I am running on, so it may not be related to your case.
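
Once dtstats.csv contains data, a few lines of Python are enough to count accepted and rejected steps. This sketch assumes the file has the columns n, t, dt, action and error, with action being either accept or reject; please check the header of your own file, as the layout may differ between PyFR versions. The `SAMPLE` rows are invented purely to exercise the function.

```python
import csv
import io
from collections import Counter


def summarise_dtstats(fileobj):
    """Count step actions and find the smallest rejected dt (None if none)."""
    counts = Counter()
    smallest_rejected = None
    for row in csv.DictReader(fileobj):
        action = row["action"].strip()
        counts[action] += 1
        if action == "reject":
            dt = float(row["dt"])
            if smallest_rejected is None or dt < smallest_rejected:
                smallest_rejected = dt
    return counts, smallest_rejected


# Made-up rows, only to show the expected shape of the input.
SAMPLE = """n,t,dt,action,error
1,0.0,2.5e-06,reject,3.2
1,0.0,7.5e-07,accept,0.4
2,7.5e-07,7.9e-07,accept,0.6
"""

counts, smallest = summarise_dtstats(io.StringIO(SAMPLE))
print(dict(counts), smallest)

# For a real run, something like:
# with open("dtstats.csv", newline="") as f:
#     print(summarise_dtstats(f))
```

A file that contains only the header yields empty counts, which by itself says that no step was recorded.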

Hi @FedericoCipolletta ,

Thank you very much for your reply. I have tried to remove min-fact and max-fact, but still got the same error.

Regarding the OpenMP backend, it is really driving me crazy. Whether I include or remove the following block, I get the same error (RuntimeError: Minimum sized time step rejected).

[backend-openmp]
cc = gcc
cblas-mt = libmkl_rt.so
gimmik-max-nnz = 8192

I also failed to run the example case with the OpenMP backend (see here, please), and got a different error.

Things are really weird. I am not sure whether the RuntimeError: Minimum sized time step rejected error in this NACA0021 case is caused by the same reason as the example case.

Besides the min-fact and max-fact, is there any other difference between the config you managed to run and mine?

Thank you for your help.

Can you try adding the block for the dtstats.csv file, choosing a reasonable frequency with respect to the time of your failure, and check what it reports for the rejected time step?

Sure, would this be OK?

[soln-plugin-dtstats]
flushsteps = 1
file = dtstats.csv
header = true

I tried different values of flushsteps, but it makes no difference. The dtstats.csv file is empty (only the header), and the wall-forces.csv is

t,px,py,pz,vx,vy,vz
0.0,-7.389644451905042e-13,-4.263256414560601e-13,0.0,0.0,0.0,0.0

@FedericoCipolletta
I have tried setting flushsteps to 1, 10, 500, and 1000; in every case the dtstats.csv file is empty. The time config is

[solver-time-integrator]
scheme = rk45
controller = pi
tstart = 0.0
tend = 100
dt = 2.5e-6
atol = 1e-5
rtol = 1e-5

So, it seems that your simulation is crashing as soon as it starts (I think that the dtstats file is empty because the code does not complete any time iteration). Did it, at least, output the initial data at time 0? The strange thing is that one would expect to have some info on the dt that has been rejected (and your error actually says that at some point the minimum time step has been rejected)…

As far as I saw, in case no convergence is reached, the PyFR code would try to decrease the timestep down to dt-min which has a default value of 1e-12 (at least in PyFR 1.12.3 it was so…). I never used PyFR 1.14.0, so I don’t know if something changed in this regard.
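
To give a feel for how quickly this happens, here is a deliberately simplified sketch of the rejection loop described above. It is not PyFR's controller code: it just assumes that every attempted step is rejected (as would happen if the solution is already invalid at the first step), that each rejection multiplies dt by min-fact, and that dt is clamped at dt-min. The numbers are the ones quoted in this thread (dt = 2.5e-6, min-fact = 0.3, dt-min = 1e-12).

```python
def attempts_until_failure(dt, dt_min=1e-12, min_fact=0.3):
    """Count rejected attempts before the minimum-sized step is itself rejected.

    Simplifying assumption: every attempt is rejected, and each rejection
    shrinks dt by min_fact, never going below dt_min.
    """
    attempts = 0
    while True:
        attempts += 1              # this attempt is rejected
        if dt <= dt_min:
            # Nothing smaller left to try; this is the point at which a
            # "Minimum sized time step rejected" error would be raised.
            return attempts
        dt = max(min_fact * dt, dt_min)


n_attempts = attempts_until_failure(2.5e-6)
print(f"Gives up after {n_attempts} consecutive rejections")
```

With these numbers the loop gives up after 14 consecutive rejections, so a run that is invalid from the very first step would fail almost immediately, well before any step is accepted.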

Apart from that, I don’t think that I could be of any more help…when I repeated this simulation, I did not activate the anti-aliasing when running at order 1 for the warmup, so I don’t think it is necessary…

No, the dtstats.csv file is empty (header only).

Yes, I have checked the source code, and PyFR 1.14 does the same thing.

Maybe I could try PyFR 1.12.3. Could you send me a copy of your config file so that I can run a test, please?

PyFR 1.12.3 will do the same thing with regards to adapting dt, if I’m not mistaken. I don’t think this issue has anything to do with the PyFR version, but more so your configuration/setup. This is quite a challenging case.

@WillT
Well, thank you very much for your suggestions. They are helpful.

However, I am still confused about the OpenMP backend. Could you take a look at this, please?