The official TV playback application, found on the CD with drivers, captures samples from the card into its buffer and concurrently plays them from the other end of the buffer. If, averaged over a few seconds, the buffer holds too few samples, they are being consumed faster than they arrive, so the SAA chip is told to produce them a bit faster. If too many accumulate, the SAA chip is told to produce them more slowly. That's it.
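The buffer-level feedback described above can be sketched roughly as follows. This is only an illustration of the idea, not the vendor application's code: the function name, the target fill level, the deadband and the step size are all made up for the example, and the returned value stands in for whatever rate control the capture chip actually exposes.

```python
def adjust_rate(fill_history, rate_ppm, target=512, deadband=64, step_ppm=50):
    """Return a new producer rate correction (in ppm).

    fill_history: recent buffer fill levels (in samples), e.g. one reading
                  per period over the last few seconds.
    rate_ppm:     correction currently applied to the producer.
    All names and default values here are hypothetical.
    """
    if not fill_history:
        raise ValueError("need at least one fill-level reading")

    # Average over the window so short bursts do not trigger a change.
    avg = sum(fill_history) / len(fill_history)

    if avg < target - deadband:
        # Buffer is draining: samples are consumed faster than they
        # arrive, so ask the producer to run a bit faster.
        return rate_ppm + step_ppm
    if avg > target + deadband:
        # Buffer is filling up: ask the producer to run slower.
        return rate_ppm - step_ppm
    # Within the deadband: leave the rate alone.
    return rate_ppm


if __name__ == "__main__":
    print(adjust_rate([100, 120, 110], rate_ppm=0))
    print(adjust_rate([900, 950, 1000], rate_ppm=0))
```

The deadband is there so that the rate is not nudged on every reading; only a sustained drift of the average fill level changes the correction.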
Ok. Well, xc5000 (which does the audio sampling) doesn't have it, AFAICT.
The xc5000 tuner used on this TV device doesn't provide any mechanism to control the audio PLL. It just sends the audio samples to the au0828 via an I2S bus. All the audio control is done by the USB bridge in the au0828, and that is quite limited. The only control that the au0828 accepts is over the URB buffers (e.g., the number of URB packets and the URB size).
It's probably worth noting that Mauro's explanation here is incorrect - the xc5000 does *not* put out I2S. It outputs an SIF which is fed to the au8522. The au8522 has the audio decoder, and it's responsible for putting out I2S to the au0828.
Hence the xc5000's PLL would have no role here.
In fact, you should see the exact same behavior on the A/V input, since the au8522 is responsible for the I2S clock which drives the cs5503 (the 5503 is in slave mode).
Devin