It depends. Usually yes, but there are many caveats on that. But for now I will say that if you have sufficient resolution to be within the asymptotic regime then, for FR, higher orders generally lead to more accurate solutions. But as you may be getting by now, it depends.
Not trying to be too blunt, but no idea, this is why we run simulations. A mesh convergence study, rather than an order convergence study, will give you a more definitive answer. Order convergence studies are a bit tricky as changing the order changes the numerical properties of the scheme. See the previous question.
If adding the over-integration anti-aliasing makes your simulation fail, then I would suggest you look at your mesh. Over-integration should help things and if you make the numerics more accurate, but the simulation fails this is normally a sign that something was under-resolved. Modal filtering can be viewed as dissipation, and hence why it will often work when over-integration doesn’t, but that doesn’t make it correct. It just means that you are damping out the under-resolved physics that was causing the failure. This physics is likely important.
Yes it can have a reasonable impact, there are a couple of schools of thought on how to set zeta. Some people suggest it should be small, some suggest it should be large. To me large makes sense as this should help convergence the divergence faster. In some some work I am doing with @Sambit, we’ve found that ~7 can often give locally optimal performance. However, what the ‘best’ value is depends on: order, multigrid cycle, pseudo stepping scheme, physical time scheme, the ratio dt/pseudo-dt, and anything else you can think of. Increasing zeta though will increase the stiffness of the problem, as you can see from the largest eigenvalue of the flux:
If the problem is significantly stiffer with a high zeta, then this will lead to slower convergence in pseudo-time. But increasing zeta can improve convergence, so there is a trade off. I have tried the preconditioner by Turkel, I have it on a public branch I think, and that can help, but it is another tunable parameter and if you get it wrong it can slow things down a lot. Hence me and Freddie decided not to add it. If you’re interested hyperbolic-diffusion can about halve your iterations to convergence and increase stability. (Neither of these features are mainline so merge at your own risk).
I think that language is a bit strong, it should be a factor, we definitely care if the pressure field is converging to a sensible value. However, given that most of the time we are interested in the velocity field, a tolerance is usual set that is stricter on these.