For me, X-Form is the only way to adjust timing (**). Both Monophonic and Polyphonic garble the audio far too much. X-Form is very clean, but since it re-renders after every little change, the trick to using it is to separate the audio into clips short enough to fit in the Edit window. That way each render only works on a small clip, so it takes seconds instead of minutes.
** I should also mention that Melodyne can be used to move stuff around in time. Just COMMIT it once you like the result.