As shown in the table above, some kernels have also disappeared, such as intcflux. I’m very curious about what is going on here. Is this an optimisation? If so, how was it done?
PyFR v1.15 switched to using CUDA task graphs. It appears that whatever tool you are using for profiling cannot measure or time kernels launched as part of a graph. This is why all of the kernels involved in an RHS evaluation are missing.
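
For anyone unfamiliar with task graphs, here is a minimal standalone sketch of the idea using the CUDA runtime API. It is not PyFR's code: the `scale` and `shift` kernels are made-up stand-ins, and building the graph via stream capture is just my assumption for illustration (PyFR may well construct its graphs differently). The point is that the individual kernels are recorded once and then submitted together with a single `cudaGraphLaunch` call, so a tool that only hooks individual kernel launches may not see them.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernels; these are NOT PyFR kernels.
__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

__global__ void shift(float *x, float b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += b;
}

int main() {
    // Error checking is omitted to keep the sketch short.
    const int n = 1 << 20;
    const int block = 256, grid = (n + block - 1) / block;

    float *x;
    cudaMalloc(&x, n * sizeof(float));
    cudaMemset(x, 0, n * sizeof(float));

    cudaStream_t s;
    cudaStreamCreate(&s);

    // Record the kernel sequence into a graph instead of running it.
    cudaGraph_t graph;
    cudaStreamBeginCapture(s, cudaStreamCaptureModeGlobal);
    scale<<<grid, block, 0, s>>>(x, 2.0f, n);
    shift<<<grid, block, 0, s>>>(x, 1.0f, n);
    cudaStreamEndCapture(s, &graph);

    // Instantiate once, then replay the whole sequence per step.
    cudaGraphExec_t exec;
    cudaGraphInstantiateWithFlags(&exec, graph, 0);

    for (int step = 0; step < 10; ++step)
        cudaGraphLaunch(exec, s);  // one launch call covers both kernels
    cudaStreamSynchronize(s);

    float first;
    cudaMemcpy(&first, x, sizeof(float), cudaMemcpyDeviceToHost);
    printf("x[0] after 10 graph launches: %f\n", first);

    cudaGraphExecDestroy(exec);
    cudaGraphDestroy(graph);
    cudaStreamDestroy(s);
    cudaFree(x);
    return 0;
}
```

This should build with `nvcc` on a reasonably recent toolkit (`cudaGraphInstantiateWithFlags` needs CUDA 11.4 or later). Whether the kernels inside the graph show up individually in a timeline depends on the profiler and how it is configured to trace graphs.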