Lagrange interpolation on triangles is no different from Lagrange interpolation on any space. A good reference for the theory on this is Theory and Practice of Finite Elements by Ern and Guermond, where they give enormous detail on Lagrange finite elements.
But ultimately if you have a set of points \mathbf{x}_i \in\hat{\mathbf{K}}\subset\mathbb{R}^n, where K is the reference space, then you can form a Lagrange polynomial. For simplicity, take n=2, then we can make the approximation of f as
f^\delta = \sum^N_{i=1}f(\mathbf{x}_i)l_i(\mathbf{x})\\
l_i(\mathbf{x}) = \prod^N_{j=1,j\neq i}\frac{x-x_j}{x_i-x_j}\cdot \frac{y-y_j}{y_i-y_j}
Of course, there are mathematical implications of the topology when this approximation is used in a numerical method, but do you see that at this level you don’t have to do anything different to form the interpolation in this canonical form? Although, topology can lead to simplifications, for example, if the element has a tensor product construction.
In practice, when you define an interpolation matrix, say from the solution to the flux points, we have something like u_f=Mu_s. One way to calculate M is by using a Vandermonde matrix, however, if you do this with monomials, then the condition number grows very rapidly. See Nodal Discontinuous Galerkin Methods by Hesthaven and Warburton Chp. 3&4.
Instead, if you use an orthogonal polynomial like Legendre or Jacobi, then the condition number grows much slower. But again, it doesn’t really matter what the element topology is, so long as your Vandermonde is invertible. (You can make it tall if you have more solution points than bases, which is somewhat akin to how the over-integration anti-aliasing works). There are also other ways of calculating M that can make use of quadrature, which is one way of handling the tall Vandermonde situation, and the quadrature will be dependent on the element topology.
I hope this helps. By the way, the matrix operators in PyFR are defined in shapes.py