Using the generic dmix code with a semaphore resolves this performance problem, so S16/S32 support is OK in the end.
But this leads to another question about the kernel driver code: why does the driver allocate / map uncached pages rather than write-combined ones? Pierre, Jerome, any clue?
On CHT and BYT the hardware fabric is organized such that HDMI DMA transactions are not snooped, so the DMA fetches data only from DDR. On most non-Atom platforms the transactions are snooped, so the fabric returns data from the cache if it has been updated there.
In the past we faced a problem where the DMA fetched stale data because the cache had not been flushed to DDR. That's why we marked the pages as uncached.
Thanks, that's what I expected. Similar hacks are applied to some other audio platforms, such as AMD HDMI HD-audio. But my question is about write-combined (WC) pages. There are four page caching modes: write-back (WB), write-through (WT), write-combined (WC) and uncached (UC).
Usually WC is enough for a use case like the one above; it's not necessary to disable caching entirely via UC. WC worked fine for HD-audio, at least. For LPE audio, is UC really stated as a mandatory requirement?
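A minimal sketch of the four modes listed above, only to make the distinction concrete. The enum and function names are hypothetical and are not the kernel's identifiers; the classification reflects the usual x86 behaviour of each memory type, not anything specific to the LPE driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only (not kernel identifiers). */
enum page_cache_mode { MODE_WB, MODE_WT, MODE_WC, MODE_UC };

/*
 * True if a CPU store may live only in the CPU cache for a while.
 * A non-snooped DMA engine reading DDR could then fetch stale data.
 */
bool store_may_stay_in_cache(enum page_cache_mode mode)
{
    switch (mode) {
    case MODE_WB:
        return true;   /* written back to memory only on eviction/flush */
    case MODE_WT:      /* every store also goes to memory */
    case MODE_WC:      /* stores bypass the cache */
    case MODE_UC:      /* no caching at all */
        return false;
    default:
        assert(0);     /* unknown mode */
        return false;
    }
}

/*
 * True if a store may be held in a write-combining buffer until that
 * buffer fills or is drained by a fence or serializing access.
 */
bool store_may_sit_in_wc_buffer(enum page_cache_mode mode)
{
    return mode == MODE_WC;
}
```

Under these assumptions only WB leaves data in the cache, which matches the stale-data problem described for the non-snooped fabric; WC avoids that but adds the buffering question.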
No, there are no such guidelines from the HW guys. Technically WC would work as well.
The question I have is: when we use a smaller period size with, say, 2 periods with WC, how do we ensure that the last write has been propagated to main memory? The last write may be smaller than the WC cache size.
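One common way to address a partially filled WC buffer on x86 is a store fence after the copy, since SFENCE drains the write-combining buffers. The following is a user-space sketch only, not the driver's code: `copy_period_and_drain` is a hypothetical helper, and the non-x86 branch is just a stand-in fence so the example builds elsewhere. In a kernel driver the analogous step would be a write barrier such as `wmb()` before telling the hardware that new data is available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined(__x86_64__)
#include <xmmintrin.h>   /* _mm_sfence() */
#else
#include <stdatomic.h>   /* portable stand-in fence */
#endif

/*
 * Hypothetical helper: copy one (possibly short) period into the ring
 * buffer, then issue a store fence so the data does not linger in a
 * partially filled write-combining buffer.
 */
void copy_period_and_drain(void *ring, const void *src, size_t bytes)
{
    assert(ring != NULL && src != NULL);

    memcpy(ring, src, bytes);

#if defined(__x86_64__)
    _mm_sfence();   /* orders prior stores and drains WC buffers */
#else
    atomic_thread_fence(memory_order_seq_cst);   /* assumption: stand-in */
#endif
}
```

The fence matters only when `ring` is actually mapped WC; on ordinary cached memory, as in a user-space test, it is harmless.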
thanks,
Takashi