This can be seen by examining how the propagator is constructed, e.g. in Feynman's original paper on the subject [1] or any thorough introduction to the material [2]. We consider all paths from the starting position S to the final position F, so any intermediate position P is already covered by this sum. If we ask for the probability of going from S to F through P, but without performing a measurement at P, we need to consider all paths from S to P and then all paths from P to F. However, this will include paths that start at S, go to some other arbitrary position P', and instantaneously travel through P before going back to P'. Such paths are, of course, already included in the original calculation.

We can see more explicitly that going through an intermediate position P does not affect the final result. Consider a propagator of the form given in [3], eq. 11.9:

$\Delta\left(S-P-F\right)=\int \frac{dq}{k}\,e^{-iq\left(S-P\right)}\,e^{-iq\left(P-F\right)}=\int \frac{dq}{k}\,e^{-iq\left(S-F\right)}$.

To the best of my knowledge this holds true for all propagators; see, for example, [3], eqs. 11.119 and 11.177.
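
For the simple form above, the cancellation can be written out in one line. The two exponents add and P drops out:

$-iq\left(S-P\right)-iq\left(P-F\right)=-iq\left(S-F\right)$,

so the integrand contains no reference to P, and the integral is unchanged whatever value P takes. This is only a worked restatement of the step in the equation above, using the same symbols and making no further assumption about the measure $dq/k$.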

In the Feynman propagator all paths are considered. The paths for virtual particles mathematically travel backwards in time. Therefore, the inclusion of virtual particles necessarily means including configurations that lie in the past of the initial position S. This would only violate causality if a measurement were performed at such a time. In the main text I argue that it is such measurements, rather than the numerical value of the time coordinate, that define causality, with only the duration between measurements being a physically meaningful quantity in the propagator.