Extracting element index in reduction kernel

Hi,

I would like a clarification regarding the element index in the reduction kernel.

Specifically, for the OpenMP backend I can access the flattened index associated with a variable at a given quadrature point in a given element as:
idx = _y*BLK_SZ*${ncola} + ib*BLK_SZ*${ncola}*nrow + X_IDX_AOSOA(${i}, ${ncola});

However, I need to access a value of an array that has just nrow = 1 and ncola = 1 (one variable).

Is it therefore correct to extract the associated element index as:
idx = ib*BLK_SZ*${ncola}*nrow + X_IDX_AOSOA(0, ${ncola});

And for the CUDA backend I have the flattened index:
ixdtype_t idx = j*ldim + SOA_IX(i, blockIdx.y, gridDim.y);
hence the element index should be:

ixdtype_t i = ixdtype_t(blockIdx.x)*blockDim.x + tid;

Is this correct in both cases?
Best

In the OpenMP case I believe you can further simplify the X_IDX_AOSOA(…) macro to _xi + _xj.
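The simplification can be checked with a short Python sketch. It assumes the macro has the form ((_xi/SOA_SZ*nv + v)*SOA_SZ + _xj) and that _xi advances in multiples of SOA_SZ. Both the macro form and the value of SOA_SZ used here are assumptions for illustration and should be checked against the generated kernel source.

```python
SOA_SZ = 16  # assumed SoA width; hypothetical value for illustration


def x_idx_aosoa(xi, xj, v, nv):
    # Assumed form of X_IDX_AOSOA(v, nv); verify against your kernel source
    return (xi // SOA_SZ * nv + v) * SOA_SZ + xj


def simplified(xi, xj):
    # Proposed simplification for v = 0, nv = 1
    return xi + xj


# _xi steps through a block in multiples of SOA_SZ, _xj runs over the lanes
mismatches = [
    (xi, xj)
    for xi in range(0, 4 * SOA_SZ, SOA_SZ)
    for xj in range(SOA_SZ)
    if x_idx_aosoa(xi, xj, 0, 1) != simplified(xi, xj)
]
print(mismatches)  # prints []
```

Under these assumptions the two expressions agree only when there is a single variable (v = 0, nv = 1); with more variables the nv stride no longer cancels.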

Regards, Freddie.

I think something is still off with OpenMP.

The initial shape of my array on the host is (1, neles=52547); when blocked on OpenMP it becomes (3285, 1, neles=52547, 16).

I try to access the single value at a given element with the flattened index

idx = ib*BLK_SZ*${ncola}*nrow + X_IDX_AOSOA(0, ${ncola});

or

idx = ib*BLK_SZ*${ncola}*nrow + _xi + _xj;

but in both cases the result is wrong.

Any idea?

Best

The blocked OpenMP shape you’re using seems wrong. Can you confirm the ioshape and datashape of your array on the backend?

To understand what is going on you may want to fill the matrix with integers counting up from 0.
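A minimal sketch of such a check in plain Python, assuming the blocked layout is (nblocks, nrow, BLK_SZ) with the element axis padded up to a multiple of BLK_SZ. The layout, the pack helper and the flat_index formula are illustrative assumptions, not the backend's actual code; the element count and block size are the ones quoted earlier in the thread.

```python
NELES = 52547   # element count quoted in the thread
BLK_SZ = 16     # block size implied by the shapes in the thread
NROW = 1        # single variable


def pack(host_row):
    """Pad the element axis and split it into blocks of BLK_SZ (assumed layout)."""
    nblocks = -(-len(host_row) // BLK_SZ)  # ceiling division
    padded = host_row + [-1] * (nblocks * BLK_SZ - len(host_row))
    return [padded[b * BLK_SZ:(b + 1) * BLK_SZ] for b in range(nblocks)]


def flat_index(ele, row=0):
    """Flat offset of (row, ele) in an assumed (nblocks, NROW, BLK_SZ) layout."""
    ib, lane = divmod(ele, BLK_SZ)
    return ib * NROW * BLK_SZ + row * BLK_SZ + lane


host = list(range(NELES))                   # integers counting up from 0
blocks = pack(host)
flat = [v for blk in blocks for v in blk]   # flattened blocked storage

nblocks = len(blocks)
ok = all(flat[flat_index(e)] == e for e in range(NELES))
print(nblocks, ok)  # prints 3285 True
```

Because every entry stores its own element number, any value read back in the kernel immediately shows which element the index actually points at, and hence which stride is wrong.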

Regards, Freddie.

Hi,

the ioshape of the matrix is

(1, 52547)

and the datashape is

[3285, 1, 16]

Does it help?

I do not believe the reduction code has ever been tested on a matrix with a 2D ioshape. It is possible that some indices (sizes) are being picked up incorrectly. Does anything change if you go for an ioshape of (1, 1, 52547)?

Regards, Freddie.

Wouldn’t this break the other kernels, or compromise compatibility with the GPU backends?

The reduction code is reimplemented by each backend on its own. Thus, the implementations are all slightly different and may behave differently with non-standard (i.e., 2D) matrices. If you use matrices with a 3D ioshape (which is the common case) everything should work.

Regards, Freddie.