MPI_ABORT causes Open MPI to kill all MPI processes

Hello, my parallel computation uses Open MPI, and the error below occurs when more than 20 cores are used. Does anyone know what might be causing this issue?

raise RuntimeError('Minimum sized time step rejected')

RuntimeError: Minimum sized time step rejected

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

Your simulation, which is running with adaptive time stepping, is diverging. This is not an MPI issue but a simulation issue. You likely need to refine your mesh, turn on anti-aliasing or entropy filtering, or, if you are just starting your simulation, determine a suitable start-up procedure.
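
To illustrate why a diverging solution ends in this particular error, here is a minimal sketch of a generic adaptive step-size loop. It is not PyFR's actual controller: the function name, the parameter names and their default values are all assumptions made for illustration, and only the error message is taken from the traceback above. The idea is that a rejected step is retried with a smaller dt, and once dt has reached its floor and the error estimate still fails (for example because the solution contains NaNs), there is nothing left to try.

```python
def adapt_step(error_estimate, dt, dt_min=1e-12, order=4,
               safety=0.8, min_fact=0.3, max_fact=2.5):
    """Shrink dt until a step is accepted (hypothetical sketch, not PyFR code).

    error_estimate(dt) returns a normalised error; err < 1 means "accept".
    """
    while True:
        err = error_estimate(dt)

        # Accept the step if the normalised error is within tolerance
        if err < 1.0:
            return dt

        # The smallest permitted step has just been rejected: give up
        if dt <= dt_min:
            raise RuntimeError('Minimum sized time step rejected')

        # Otherwise shrink dt by a clamped factor and retry; a NaN or
        # infinite error falls through to the smallest factor, min_fact
        fact = safety * err ** (-1.0 / order)
        fact = min(max_fact, max(min_fact, fact))
        dt = max(dt_min, fact * dt)


# Well-behaved case: the error falls as dt**4, so a step is soon accepted
print(adapt_step(lambda dt: (dt / 1e-3) ** 4, dt=1e-2))

# Diverged case: the error estimate is NaN no matter how small dt gets
try:
    adapt_step(lambda dt: float('nan'), dt=1e-2)
except RuntimeError as exc:
    print(exc)
```

In the second case no amount of step-size reduction helps, which is why the remedy lies in the simulation setup rather than in MPI.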

Regards, Freddie.

I successfully ran this case before, but now I’m using a different workstation. The simulation works fine when using fewer than 20 CPU cores, but it crashes when more than 20 cores are used.

PyFR can give slightly different results when a case is run across multiple ranks. For simulations at the limit of stability, this can be enough to push things over the edge. You can handle this by changing ldg-beta in your config file.
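
As a rough sketch of how ldg-beta might be changed programmatically, the snippet below edits it with Python's standard configparser. To my understanding the option lives in the [solver-interfaces] section of a PyFR config file, but the excerpt, the helper name with_ldg_beta and every value shown are placeholders; the thread does not say which value to use.

```python
import configparser
import io

# Hypothetical excerpt of a PyFR config file; all values are placeholders
CFG_TEXT = """
[solver-interfaces]
riemann-solver = rusanov
ldg-beta = 0.5
ldg-tau = 0.1
"""


def with_ldg_beta(cfg_text, beta):
    """Return cfg_text with ldg-beta in [solver-interfaces] set to beta."""
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)

    # Only this one option is changed; everything else is left as it was
    cfg.set('solver-interfaces', 'ldg-beta', str(beta))

    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


# 0.0 is an illustrative value only, not a recommendation from the thread
print(with_ldg_beta(CFG_TEXT, 0.0))
```

Editing the config file by hand works just as well; a helper like this is mainly convenient when trying several values in turn.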

My recommendation would be to see what the solution looks like at the point of divergence.

Regards, Freddie.