[alsa-devel] Audio buffer mmap-ing on sh arch vs. cache aliasing problem

Takashi Iwai tiwai at suse.de
Tue Sep 16 20:18:06 CEST 2008


At Thu, 04 Sep 2008 12:23:54 +0100,
Pawel MOLL wrote:
> 
> Hi Guys,
> 
> Recently I was investigating an issue with capturing audio using USB
> Audio Class device on a sh4-based board. "Bad voice quality" was
> reported...
> 
> Finally I have traced the problem to something which is (unfortunately)
> well known to sh developers as a D-cache aliasing (or synonym) problem. 
> 
> Briefly speaking: due to some MMU design decisions, one can have two
> different virtual address pointing to the same physical location, which
> is fine, but going via different cache slots! So if there was a value of
> "0" in the memory and user "A" will write "1" there, user "B" will still
> read "0"...
> 
> The solution is to ensure all TLB entries (so virtual memory areas) are
> beginning from a 16kB-aligned virtual address. Otherwise it is necessary
> to flush the cache between accesses from "A" and "B" sides.
> 
> And now. The USB Audio Class driver (sound/usb/usbaudio.c) is allocating
> the sound buffer like this...
> 
> static int snd_pcm_alloc_vmalloc_buffer(struct snd_pcm_substream *subs, size_t size)
> {
> [...]
> 	runtime->dma_area = vmalloc(size);
> [...]
> }
> 
> ... and vmalloc will return a page(4k)-aligned pointer, possibly not
> 16k-aligned one. This is the source of all evil ;-) 

Well, the main problem is the cache coherency with mmap in general,
and not about the page size.  The 4k and 16k things are just a
coincidence, I'd say.

> When using RW transfers everything is fine, as data is memcopied between
> two independent buffers.
> 
> In a case of MMAP mode we will end up with such a situation:
> 
> 1. The driver will memcpy data from an URB to the buffer. It will
> populate several cache lines. Let's call it a "kernel cache".
> 2. As the library has mapped the non-16k-aligned address via different
> cache line ("user cache"), it will read some rubbish from the physical
> memory, populating the "user cache" by the way.
> 3. Some time later the "kernel cache" is flushed, writing the data to
> the memory.
> 4. Some new data from URB is entering "kernel cache".
> 5. The library will access mmap-ed area again, going via the "user
> cache", which may or may not reflect the correct data (depending on the
> fact was the "user cache" flushed or not) etc.
> 
> Of course this cycle is completely not deterministic, so some of the
> "kernel cache" lines will be flushed before being accessed from user
> space, other not... The final effect is... hmm... bizarre :-) First, of
> course, you will get a (probably loud) glitch (rubbish from buffer's
> underlying memory, before the first valid data is written back), and
> then something that could be described as an "echo" ;-) I mean - you are
> capturing "1 2 3 4 5 6 7..." and the result is (G stands for Glitch ;-)
> "G 1 23 3 4 4 6 67..."
> 
> As a quick-and-dirty work-around I have modified
> snd_pcm_alloc_vmalloc_buffer() to allocate always 12kB more and then use
> 16k-aligned pointer:
> 
> static int snd_pcm_alloc_vmalloc_buffer(struct snd_pcm_substream *subs, size_t size)
> {
>         struct snd_pcm_runtime *runtime = subs->runtime;
>         if (runtime->dma_addr) {
>                 if (runtime->dma_bytes >= size)
>                         return 0; /* already large enough */
>                 vfree((void *)runtime->dma_addr);
>         }
>         runtime->dma_addr = (unsigned long)vmalloc(size + 12 * 1024);
>         if (!runtime->dma_addr)
>                 return -ENOMEM;
>         runtime->dma_area = (void *)ALIGN(runtime->dma_addr, 16 * 1024);
>         runtime->dma_bytes = size;
>         return 0;
> }
> 
> Of course it cannot be regarded as anything more as a hack. And
> definitely not as a solution... So my question is:
> 
> Any idea how the "proper solution" should look like? I see no obvious
> method to get an arch-independent "mmap-compliant" buffer. This problem
> was found on a sh arch, using an USB Audio Class device, however out
> there some other architectures are suffering from this MMU "feature" as
> well (ie. see http://docs.hp.com/en/B3906-90006/ch07s09.html) and
> possibly other drivers could behave "wrongly" (quoted as the driver is
> actually innocent...)

Currently, it's PITA on Linux to get a non-cached mmap properly in a
generic way.  Especially together with coherent memory allocations,
there is no API for that.  But, in the case of usb-audio, the problem
is the vmaloc'ed intermediate buffer, and this could be a bit easier.

One thing we can try is a patch like below.  But, I'm not sure whether
this is correct over all architectures, too.  At best, a generic API
would be helpful for such a thing...

thanks,

Takashi

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 9cf81bc..cecf4b4 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3136,6 +3136,7 @@ static int snd_pcm_default_mmap(struct snd_pcm_substream *substream,
 				struct vm_area_struct *area)
 {
 	area->vm_ops = &snd_pcm_vm_ops_data;
+	area->vm_page_prot = pgprot_noncached(area->vm_page_prot);
 	area->vm_private_data = substream;
 	area->vm_flags |= VM_RESERVED;
 	atomic_inc(&substream->mmap_count);


More information about the Alsa-devel mailing list