Can you prevent a GPU from being used?

Hi Freddie, Peter, et al.,

I was testing out the latest release and came across a small issue. When partitioning a case across four OpenMP processes and one OpenCL process, I encountered the following warnings:

(venv) [zdavis@Minerva cubes_3d]$ mpirun -np 5 ./launcher.sh cube_hex24.pyfrm cube.ini

/usr/local/lib/python3.4/site-packages/pyopencl/__init__.py:59: CompilerWarning: Built kernel retrieved from cache. Original from-source build had warnings:
Build on <pyopencl.Device 'Iris Pro' on 'Apple' at 0x1024500> succeeded, but said:
<program source>:26:23: warning: double precision constant requires cl_khr_fp64, casting to single precision
if (alpha0 == 0.0)
^
<program source>:28:28: warning: double precision constant requires cl_khr_fp64, casting to single precision
else if (alpha0 == 1.0)
^
warn(text, CompilerWarning)
/usr/local/lib/python3.4/site-packages/pyopencl/__init__.py:59: CompilerWarning: From-binary build succeeded, but resulted in non-empty logs:
Build on <pyopencl.Device 'Iris Pro' on 'Apple' at 0x1024500> succeeded, but said:
<program source>:26:23: warning: double precision constant requires cl_khr_fp64, casting to single precision
if (alpha0 == 0.0)
^
<program source>:28:28: warning: double precision constant requires cl_khr_fp64, casting to single precision
else if (alpha0 == 1.0)
^
warn(text, CompilerWarning)
100.0% [===============================>] 0.10/0.10 ela: 00:08:25 rem: 00:00:00

Here you can see that PyOpenCL is attempting to use the integrated graphics card rather than the discrete card. Given how long the simulation takes to complete, I am fairly certain the discrete card isn’t being used at all. Is there a way to be more explicit in the invocation, so that the integrated graphics chip is ignored and the discrete card is utilized? I haven’t had this issue in the past with integrated graphics and NVIDIA cards using the CUDA backend, so I was curious about this scenario. I realize this isn’t your typical use case, but if you have encountered this before I would be interested in any workarounds.

Best Regards,

Zach

Hi Zach,

No problem. Devices in OpenCL are specified as a combination of a
‘platform’ and a ‘device’. For example, on a Linux workstation if you
had the NVIDIA drivers installed and the Intel drivers installed you
would have a total of two platforms. You can specify the platform via

[backend-opencl]
platform-id = <name> | <number>

where <name> is the name of the platform and <number> its index
(OpenCL returns platforms in an ordered list). If not specified this
defaults to 0 (a number, hence the first platform). In your case
platform 0 is named ‘Apple’ (and I suspect it is the only platform).

Each platform presents a list of devices. In PyFR the desired device
can be selected through

[backend-opencl]
platform-id = …
device-id = <name> | <number>

In the above example I see the name of the device being used is ‘Iris
Pro’. The default here is ‘local-rank’ which is the node-local MPI
rank. This depends on what your launcher script does. If the first MPI
rank is using the OpenCL backend then you’ll pick device 0 here. What
you’ll want to do, then, is set device-id to be the name of the NVIDIA
GPU. (Or its number, if you happen to know it.)
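To make the name/number/local-rank resolution above concrete, here is a minimal sketch of the selection logic in plain Python. It is not PyFR’s actual code, and the device names are hypothetical stand-ins for what a platform might report:

```python
def pick_device(devices, device_id, local_rank=0):
    """Resolve a device-id setting against an ordered list of device names.

    device_id may be 'local-rank', a numeric index, or the name of a
    device -- mirroring the options described above.
    """
    if device_id == 'local-rank':
        # Node-local MPI rank: the default described above
        return local_rank
    if device_id.isdigit():
        # Explicit numeric index into the platform's device list
        return int(device_id)
    # Otherwise, match by device name
    for i, name in enumerate(devices):
        if device_id in name:
            return i
    raise ValueError(f'no device matching {device_id!r}')

# Hypothetical device list for a laptop with integrated + discrete GPUs
devs = ['Iris Pro', 'GeForce GT 750M']
print(pick_device(devs, 'local-rank'))       # 0: the integrated chip
print(pick_device(devs, 'GeForce GT 750M'))  # 1: the discrete card
```

This is why the default of ‘local-rank’ lands you on device 0 (the integrated chip) when the OpenCL rank is the first rank on the node, and why naming the discrete GPU avoids that.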

To make life easier PyFR substitutes environment variables when
parsing the config file. Hence, if any trickery is required you can do
it in the launcher script.
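For instance, something along these lines might work; the variable name here is just an example, and the exact substitution syntax may depend on your PyFR version:

```ini
; cube.ini -- have the launcher script export PYFR_DEVICE_ID per rank,
; e.g. export PYFR_DEVICE_ID=1 for the rank running the OpenCL backend
[backend-opencl]
platform-id = 0
device-id = ${PYFR_DEVICE_ID}
```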

Remove the quotation marks around the device-id value and everything
should work. (device-id = 1 should also do the trick.)

Regards, Freddie.