I want to save the view matrices from the first time step so they can be used in the following computation. I was planning to use a copy kernel for this, but before that I need to allocate the corresponding memory via backend.matrix(shape, tag…). However, the dimensions of the view matrices are confusing to me. Could you tell me how the data are arranged? Thanks.
Swap out the view for an exchange view. Then run a packing kernel to pack the contents of the view into a buffer (the buffer being automatically allocated and managed by the exchange view).
There should be no need to copy the view indices themselves.
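Something along these lines should be all that is needed (a minimal sketch; the names mirror what is already in the interface classes):

self._scal_lhs = self._scal_xchg_view(lhs, 'get_scal_fpts_for_inter')

self.kernels['scal_fpts_pack'] = lambda: be.kernel(
    'pack', self._scal_lhs
)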
Is it something like this piece of code (copied from baseacvec/interp.py)? Am I right in understanding that self._scal_lhs is the left-hand-side view and self._scal_rhs is a copy of it?
# Generate the left hand view matrix and its dual
self._scal_lhs = self._scal_xchg_view(lhs, 'get_scal_fpts_for_inter')
self._scal_rhs = be.xchg_matrix_for_view(self._scal_lhs)
self._pnorm_lhs = self._const_mat(lhs, 'get_pnorms_for_inter')
# Kernels
self.kernels['scal_fpts_pack'] = lambda: be.kernel(
    'pack', self._scal_lhs
)
self.kernels['scal_fpts_unpack'] = lambda: be.kernel(
    'unpack', self._scal_rhs
)
And what do the pack and unpack kernels actually do? What are the functions of these two kernels?
Packing is the process whereby the values pointed to by a view are copied into a buffer (and that buffer is then copied back to the host). This buffer, an exchange matrix, is a member of self._scal_lhs.
Unpacking is the process of copying data from the host to the device (but no kernel is run here).
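As a rough NumPy picture of what these two steps amount to (purely conceptual, not actual backend code):

import numpy as np

# "Device" data referenced by the view, plus the view's row/column indices
dev_soln = np.arange(50.0).reshape(5, 10)
rows = np.array([0, 2, 4])
cols = np.array([1, 5, 9])

# Pack: a kernel gathers the scattered entries into a contiguous exchange
# buffer on the device ...
dev_buf = dev_soln[rows, cols]
# ... which, on a GPU backend without device-aware MPI, is then copied into
# a host buffer so that MPI can send it
host_buf = dev_buf.copy()

# Unpack: the received host buffer is simply copied back onto the device;
# nothing needs to be scattered, so no compute kernel is required
dev_recv = host_buf.copy()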
I recently restarted working on this. Just to clarify what we discussed: I did the following to save the view from t = 0 for the t > 0 calculations:
# Swap out the view and pack it
self._sbase_lhs = self._scal_xchg_view(lhs, 'get_scal_fpts_for_inter')
# Kernels
self.kernels['lhs_fpts_pack'] = lambda: be.kernel(
    'pack', self._sbase_lhs
)
This packing kernel is then run, which copies the view values into the xchgmat matrix that is a member of self._sbase_lhs. In subsequent time steps, this matrix is passed to a pointwise kernel, for example:
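(A simplified sketch only; the kernel name and template arguments here are illustrative rather than my actual code.)

self.kernels['base_diff'] = lambda: be.kernel(
    'basediff', tplargs=tplargs, dims=[self.ninterfpts],
    ulin=self._scal_lhs, ublin=self._sbase_lhs.xchgmat
)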
Am I right about all of these steps? I tried checking the result of ulin[0] - ublin[0] at t = 0 inside this pointwise kernel, but the value is not 0. However, if I pass ublin=self._sbase_lhs into the kernel, the result is 0. Do you have any advice regarding that?
First, check that the arguments ublin and ubrin are specified with the mpi prefix in the kernel definition. Second, are you sure that the packing kernel is running and completing before your kernel runs?
Ah yes, the mpi prefix works. Thanks. Am I right that the view prefix, the mpi prefix, and no prefix each give a different mapping? Regarding the second point, the packing kernel is run right after the eles/disu kernel.
Yes, these prefixes determine how the kernel accesses the data. The mpi prefix is needed for anything that has been packed. It should probably be renamed to xchg so that it matches the name of the underlying data types.
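As a loose NumPy analogy of the three mappings (not what the code generator emits, just the idea):

import numpy as np

soln = np.arange(80.0).reshape(8, 10)   # stand-in for a dense solution matrix
rows = np.array([1, 4, 6])              # stand-in for the view's row indices
cols = np.array([2, 7, 9])              # ... and its column indices

# No prefix: the kernel indexes the dense matrix directly
direct = soln[0, 5]

# view prefix: every access goes indirectly through the stored index arrays
viewed = soln[rows, cols]

# mpi (xchg) prefix: the kernel reads the contiguous buffer that the pack
# kernel filled from those same indices
packed = soln[rows, cols].copy()
first_packed = packed[0]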
One more question on the MPI view matrices, to help my understanding:
In this block of code, there are kernels that do pack, send, recv and unpack:
# Generate second set of view matrices
self._vect_lhs = self._vect_xchg_view(lhs, 'get_vect_fpts_for_inter')
self._vect_rhs = be.xchg_matrix_for_view(self._vect_lhs)
# If we need to send our gradients to the RHS
if self.c['ldg-beta'] != -0.5:
    self.kernels['vect_fpts_pack'] = lambda: be.kernel(
        'pack', self._vect_lhs
    )
    self.mpireqs['vect_fpts_send'] = lambda: self._vect_lhs.sendreq(
        self._rhsrank, vect_fpts_tag
    )

# If we need to recv gradients from the RHS
if self.c['ldg-beta'] != 0.5:
    self.mpireqs['vect_fpts_recv'] = lambda: self._vect_rhs.recvreq(
        self._rhsrank, vect_fpts_tag
    )
    self.kernels['vect_fpts_unpack'] = lambda: be.kernel(
        'unpack', self._vect_rhs
    )
And in the user-defined pre-processing graphs:
g2 = self.backend.graph()
g2.add_mpi_reqs(m['vect_fpts_recv'])
# Pack and send these interpolated gradients to our neighbours
g2.add_all(k['mpiint/vect_fpts_pack'], deps=ideps)
for send, pack in zip(m['vect_fpts_send'], k['mpiint/vect_fpts_pack']):
    g2.add_mpi_req(send, deps=[pack])
g2.add_all(k['mpiint/vect_fpts_unpack'])
g2.commit()
Is this sufficient for the view matrix to be passed through to another rank and then used in the run-time graph? You said before that 'Unpacking is the process of copying data from the host to the device (but no kernel is run here)'. What does it mean that no kernel is run?
Also, does packing really copy data from the buffer to the host, even in device-aware MPI mode?
Packing on the send side always involves executing a kernel to collect the data together into a buffer on the backend. Depending on the backend (and other configuration options, such as CUDA-aware MPI), this buffer may then be copied over to the host. Unpacking never involves a kernel, but may involve a copy operation. This again depends on the backend and how it has been configured.
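If it helps, here is a standalone mpi4py/NumPy analogy of one pack/send/recv/unpack cycle (purely illustrative; it assumes exactly two ranks and ignores the device side entirely):

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nbr = 1 - rank                       # partner rank; assumes a two-rank run

soln = np.arange(20.0) + 100*rank    # stand-in for the solution data
idx = np.array([3, 7, 11, 19])       # stand-in for the view indices

# "pack": gather the view entries into a contiguous send buffer
sendbuf = soln[idx].copy()
recvbuf = np.empty_like(sendbuf)

# "send"/"recv": non-blocking point-to-point exchange of the packed buffers
reqs = [comm.Isend(sendbuf, dest=nbr), comm.Irecv(recvbuf, source=nbr)]
MPI.Request.Waitall(reqs)

# "unpack": nothing to rearrange; the received buffer is already contiguous
# (on a GPU backend this step would just be a host-to-device copy)
print(rank, recvbuf)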