I worked a bit on the PXA SSP code last night and was able to come up with a configuration which uses non-network mode for I2S and works well on the Zylonite. I'll post the current series I have as a follow-up to this; if you could take a look, that'd be great. I haven't yet worked through all the testing I'd like to do.
Unfortunately it will have broken Daniel's configuration: I inverted the sense of LRCLK because the chip did not seem to generate an LRCLK with a non-zero frame delay. I need to check whether this is just something I've overlooked. Hopefully the only effect on Daniel's system is that the left and right channels are swapped.
Having worked through non-network mode, my feeling is that we should be able to come up with something that can figure out the extra clock cycles needed for Daniel's configuration with less of a special case. Non-network mode does seem like a better default than network mode because it avoids needing to look at the TDM configuration unless you actually want to use TDM.