Kernels for corrected gradient at interface

Hi,

I am trying to schedule a graph to compute the corrected gradients at some boundary interfaces. However, when using the graphs below, the values I get are not quite correct. Is there a dependency I am missing?

        g1 = self.backend.graph()
        g1.add_mpi_reqs(m['scal_fpts_recv'])

        # Interpolate the solution to the flux points
        g1.add_all(k['eles/disu'])

        # Pack and send these interpolated solutions to our neighbours
        g1.add_all(k['mpiint/scal_fpts_pack'], deps=k['eles/disu'])
        for send, pack in zip(m['scal_fpts_send'], k['mpiint/scal_fpts_pack']):
            g1.add_mpi_req(send, deps=[pack])

        # Compute the common solution at our internal/boundary interfaces
        for l in k['eles/copy_fpts']:
            g1.add(l, deps=deps(l, 'eles/disu'))
        kdeps = k['eles/copy_fpts'] or k['eles/disu']
        g1.add_all(k['iint/con_u'], deps=kdeps)
        g1.add_all(k['bcint/con_u'], deps=kdeps)

        # Compute the transformed gradient of the partially corrected solution
        g1.add_all(k['eles/tgradpcoru_upts'])


        g2 = self.backend.graph()

        # Compute the common solution at our MPI interfaces
        g2.add_all(k['mpiint/scal_fpts_unpack'])
        for l in k['mpiint/con_u']:
            g2.add(l, deps=deps(l, 'mpiint/scal_fpts_unpack'))

        # Compute the transformed gradient of the corrected solution
        g2.add_all(k['eles/tgradcoru_upts'], deps=k['mpiint/con_u'])

        # Obtain the physical gradients at the solution points
        for l in k['eles/gradcoru_upts']:
            g2.add(l, deps=deps(l, 'eles/tgradcoru_upts'))

        # Interpolate these gradients to the flux points
        for l in k['eles/gradcoru_fpts']:
            ldeps = deps(l, 'eles/gradcoru_upts')
            g2.add(l, deps=ldeps)

Best

When debugging graphs it is best to start with the OpenMP backend. Can you confirm if this works?
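
For instance, the backend can be selected at run time with the -b flag (the mesh and config file names here are placeholders for your own case):

    pyfr run -b openmp mesh.pyfrm case.ini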

Regards, Freddie.

For example, when computing the stress tensor from the fluid force plugin:

    vis.sum(axis=-1).sum(axis=-1)
    array([[ 6.88441146e-01,  2.70109297e-02,  2.77951849e-11],
           [ 2.70109297e-02, -3.28062962e-01,  4.78283019e-11],
           [ 2.77951849e-11,  4.78283019e-11, -3.60378184e-01]])

After executing that graph on the OpenMP backend and copying back the stress tensor array, I instead get:

    array([[ 7.33657501e-06,  1.55160888e-08, -3.62438320e-15],
           [ 1.55160888e-08, -3.70222175e-06,  7.77935729e-16],
           [-3.62438320e-15,  7.77935729e-16, -3.63435325e-06]])

Here the stress tensor is computed as in flux.mako:

    fpdtype_t rho  = uin[0];
    fpdtype_t rhou = uin[1], rhov = uin[2], rhow = uin[3];
    fpdtype_t E    = uin[4];

    fpdtype_t rcprho = 1.0/rho;
    fpdtype_t u = rcprho*rhou, v = rcprho*rhov, w = rcprho*rhow;

    fpdtype_t rho_x = grad_uin[0][0];
    fpdtype_t rho_y = grad_uin[1][0];
    fpdtype_t rho_z = grad_uin[2][0];

    // Velocity derivatives (rho*grad[u,v,w])
    fpdtype_t u_x = grad_uin[0][1] - u*rho_x;
    fpdtype_t u_y = grad_uin[1][1] - u*rho_y;
    fpdtype_t u_z = grad_uin[2][1] - u*rho_z;
    fpdtype_t v_x = grad_uin[0][2] - v*rho_x;
    fpdtype_t v_y = grad_uin[1][2] - v*rho_y;
    fpdtype_t v_z = grad_uin[2][2] - v*rho_z;
    fpdtype_t w_x = grad_uin[0][3] - w*rho_x;
    fpdtype_t w_y = grad_uin[1][3] - w*rho_y;
    fpdtype_t w_z = grad_uin[2][3] - w*rho_z;


    // Negated stress tensor elements
    fpdtype_t t_xx = -2*mu_c*rcprho*(u_x - ${1.0/3.0}*(u_x + v_y + w_z));
    fpdtype_t t_yy = -2*mu_c*rcprho*(v_y - ${1.0/3.0}*(u_x + v_y + w_z));
    fpdtype_t t_zz = -2*mu_c*rcprho*(w_z - ${1.0/3.0}*(u_x + v_y + w_z));
    fpdtype_t t_xy = -mu_c*rcprho*(v_x + u_y);
    fpdtype_t t_xz = -mu_c*rcprho*(u_z + w_x);
    fpdtype_t t_yz = -mu_c*rcprho*(w_y + v_z);

Any ideas? Is there a missing dependency, or some other reason for this?

Can you confirm that all of the kernels are indeed present? What I mean by this is that k['something'] actually contains a list of kernels (rather than being empty). You will want to confirm this for every kernel you add to each graph.
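
For example, something along these lines (a minimal sketch, assuming k behaves like a dict keyed on those names):

    # Report, for each key, whether it exists and whether it holds any kernels
    for key in ('eles/disu', 'mpiint/scal_fpts_pack', 'iint/con_u',
                'bcint/con_u', 'eles/tgradpcoru_upts', 'eles/gradcoru_upts'):
        kerns = k[key] if key in k else None
        print(f'Key: {key}')
        print(f'  Exists: {kerns is not None}')
        print(f'  Not empty: {kerns or False}')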

Regards, Freddie.

Here is the status of each kernel key (I am running with 1 rank); several of them are missing or empty:

Key: eles/disu
  Exists: True
  Not empty: [<pyfr.backends.cuda.gimmik.CUDAGiMMiKKernels.mul.<locals>.MulKernel object at 0x7f0704d92f90>]

Key: mpiint/scal_fpts_pack
  Exists: True
  Not empty: []

Key: eles/copy_fpts
  Exists: True
  Not empty: [<pyfr.backends.cuda.blasext.CUDABlasExtKernels.copy.<locals>.CopyKernel object at 0x7f0704d91ca0>]

Key: iint/con_u
  Exists: True
  Not empty: [<pyfr.backends.cuda.provider.CUDAPointwiseKernelProvider._instantiate_kernel.<locals>.PointwiseKernel object at 0x7f0782806780>]

Key: bcint/con_u
  Exists: True
  Not empty: [<pyfr.backends.cuda.provider.CUDAPointwiseKernelProvider._instantiate_kernel.<locals>.PointwiseKernel object at 0x7f06c77dfe90>, <pyfr.backends.cuda.provider.CUDAPointwiseKernelProvider._instantiate_kernel.<locals>.PointwiseKernel object at 0x7f0704d91640>, <pyfr.backends.cuda.provider.CUDAPointwiseKernelProvider._instantiate_kernel.<locals>.PointwiseKernel object at 0x7f06e23cde50>]

Key: eles/tgradpcoru_upts
  Exists: True
  Not empty: [<pyfr.backends.cuda.gimmik.CUDAGiMMiKKernels.mul.<locals>.MulKernel object at 0x7f07897c1430>]

Key: mpiint/scal_fpts_unpack
  Exists: False
  Not empty: False

Key: mpiint/con_u
  Exists: False
  Not empty: False

Key: eles/tgradcoru_upts
  Exists: True
  Not empty: [<pyfr.backends.cuda.gimmik.CUDAGiMMiKKernels.mul.<locals>.MulKernel object at 0x7f0704d91af0>]

Key: eles/gradcoru_upts
  Exists: False
  Not empty: False

Key: eles/gradcoru_fpts
  Exists: True
  Not empty: [<pyfr.backends.cuda.provider.CUDAUnorderedMetaKernel object at 0x7f06c77cb4d0>]

Is the issue related to gradcoru_upts? Is there a reason why it is missing?

Update: even using

  # Obtain the physical gradients at the solution points
  for l in k['eles/gradcoru_u']:
      g2.add(l, deps=deps(l, 'eles/tgradcoru_upts'))
  # Interpolate these gradients to the flux points
  for l in k['eles/gradcoru_fpts']:
      ldeps = deps(l, 'eles/gradcoru_u')
      g2.add(l, deps=ldeps)

results in the wrong values being computed

Best,

PyFR performs a lot of optimisations in this area depending on your simulation setup. These often eliminate the need for a separate gradient kernel in the main RHS evaluation.

Take a look at _compute_grads_graph to see how one can reliably obtain gradients. Note, however, that some more work will be needed to interpolate them as we can sometimes elide the interpolation kernel to the flux points (as is the case if you run with GLL points).
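
In practice this means the interpolation step has to tolerate an empty kernel list (a sketch; the dependency key is an assumption and should match whichever kernel actually produced the gradients):

    # The interpolation to the flux points may have been elided (e.g. with
    # GLL points), in which case this loop simply adds nothing to the graph
    for l in k['eles/gradcoru_fpts']:
        g2.add(l, deps=deps(l, 'eles/gradcoru_u'))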

Regards, Freddie.

I used the graph from _compute_grads_graph and then added interpolation after that as:

  for l in k['eles/gradcoru_fpts']:
      ldeps = deps(l, 'eles/gradcoru_u')
      g2.add(l, deps=ldeps)

It looks like that when I don’t use gradient fusion the results are correct, whilst when I do they are not. Could you point out the main differences in the gradient calculation with and without fusion, and how the kernel scheduling must be changed? I assume that everything up to eles/gradcoru_u is correct, and that it is just the face interpolation that must be changed. Moreover, even if I use GLL points and don’t use the gradcoru_fpts kernel, I still have the same issue.

Best,

When gradient fusion is enabled the gradients are untransformed as part of the transformed flux kernel. In this case they are put into a separate array. When gradient fusion is disabled they find themselves in the same array as the flux (with the flux kernel overwriting them when it runs).

It is likely your issue is due to the interpolation kernel using the wrong array since eles/gradcoru_u probably puts the gradients into a different matrix than the fused flux kernel.

Regards, Freddie.

So, to my understanding: if gradient fusion is enabled the gradients are in self._grad_upts and I need to take a view over that, whilst without gradient fusion they are in self._vect_upts.

So the cases are:

if self.grad_fusion:
    # Gradients are stored in self._grad_upts
elif 'flux' in self.antialias:
    # Gradients are stored in self._vect_qpts
else:
    # No gradient + flux kernel fusion, no flux-AA
    # Gradients are stored in self._vect_fpts

Based on those cases, can I then create the corresponding views?

Best,

Without fusion gradients end up in _vect_upts. With fusion they end up in _grad_upts. Anti-aliasing is not something you need to worry about as the gradients still end up in _vect_upts and are then interpolated to the quadrature points.

For GLL points note that the kernel to interpolate to the flux points may be empty due to another optimisation.
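
A minimal sketch of the selection this implies (using self.grad_fusion from your snippet above; the attribute names are assumptions about your element class):

    # Pick the matrix holding the physical gradients at the solution points
    if self.grad_fusion:
        gradsrc = self._grad_upts  # fusion: filled by the transformed flux kernel
    else:
        gradsrc = self._vect_upts  # no fusion: gradients end up here

    # ...then build the views for the flux point interpolation off gradsrc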

Regards, Freddie.

Results are now much closer, but still not the same (on both backends).

Fluid force plugin:

    [[ 9.26393831e+02  1.36812069e+02  1.19075588e-07]
     [ 1.36812069e+02 -8.32560606e+02 -5.50975211e-08]
     [ 1.19075588e-07 -5.50975211e-08 -9.38332249e+01]]

Kernels:

    [[ 4.90815666e+02  7.87091404e+01 -7.42978315e-08]
     [ 7.87091404e+01 -4.58450439e+02 -5.96999901e-08]
     [-7.42978315e-08 -5.96999901e-08 -3.23652266e+01]]

Is there anything else to look at?

There are likely some differences due to how the fluid force handles conversion between primitive and conservative variables. In particular if you convert before or after interpolating to the flux points. Note that the fluid force plugin can also perform quadrature (so it interpolates to quadrature points on the surface rather than flux points).
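
Because the conversion between conservative and primitive variables is nonlinear, it does not commute with the (linear) interpolation, so the order matters. A toy illustration (values purely illustrative, gamma = 1.4):

    import numpy as np

    gamma = 1.4

    def pressure(u):
        # u = (rho, rho*u, E) for a 1D conservative state
        rho, rhou, E = u
        return (gamma - 1)*(E - 0.5*rhou**2/rho)

    u1 = np.array([1.0, 0.5, 2.0])
    u2 = np.array([0.8, 0.1, 1.5])

    # Convert then interpolate (midpoint) vs interpolate then convert
    print(0.5*(pressure(u1) + pressure(u2)))  # 0.67375
    print(pressure(0.5*(u1 + u2)))            # 0.68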

Regards, Freddie.

I am currently converting to primitive variables after the flux point interpolation kernel (as in the fluid force plugin).

Regarding the quadrature, it is not enabled; I am falling back on the solution points I specified.

For reference, if I compare the pressure it is identical, even when I compute its projection along x, y and z.

Best,

Have you confirmed that the gradients are identical before interpolation to the flux points?

Regards, Freddie.

I am running with GLL points, so I have no k['eles/gradcoru_fpts'] kernel that interpolates those gradients.

Best,

The fluid force plugin does not use the corrected gradients but rather obtains them by differentiating the solution directly. This is slightly less accurate than using the gradients of the corrected solution.

Regards, Freddie.

OK, thanks a lot for the answer. So the kernel values are actually correct, and the discrepancy is due to the different computations performed.

Best,


Yes, we may change the fluid force plugin in the future to use the more accurate gradients. However, in the majority of simulations the viscous drag is relatively small, and so it is likely not worth the additional computational expense.

Regards, Freddie.