[alsa-devel] paravirtualized alsa kernel driver for XEN
Hi,
I am Stefano Panella, I am new to the list and I would like to take the opportunity to ask some questions since I am trying to write a paravirtualized alsa driver for XEN.
If all goes well I would also like to upstream it on linux.
I have been reading the documentation on "Writing an ALSA Driver" and I am still not completely clear on the meaning of the "pointer" callback in the PCM operations.
The description says:
"This callback is called when the PCM middle layer inquires the current hardware position on the buffer."
My questions are:
1) In the case of a playback stream, does the pointer refer to the sample currently playing on the DAC, or to the last frame read by the HW from the ALSA memory buffer?
2) What does the pointer mean in the case of a capture stream? Is it the position of the current frame on the ADC, or the latest frame written into the ALSA buffer?
3) If it is the frame on the DAC/ADC, what happens if the callback does not return the real DAC/ADC frame position but an approximate value, say one rounded to a multiple of 64 frames?
4) Is there any test I could run to check that I have implemented the "pointer" callback correctly? Or any application that needs very high "pointer" precision, such as frame precision?
Thanks very much in advance,
Stefano
On 03/19/2012 06:15 PM, Stefano Panella wrote:
- In the case of a playback stream, does the pointer refer to the sample currently playing on the DAC, or to the last frame read by the HW from the ALSA memory buffer?
- What does the pointer mean in the case of a capture stream? Is it the position of the current frame on the ADC, or the latest frame written into the ALSA buffer?
I'd say that for both, it is being used by applications to know what memory they can read from or write to. But other people here might know better.
- If it is the frame on the DAC/ADC, what happens if the callback does not return the real DAC/ADC frame position but an approximate value, say one rounded to a multiple of 64 frames?
For the JACK sound server, I think it only needs to be as accurate as the period (i.e., if you have 4 periods with 64 samples each, you need to be able to return 0, 64, 128 and 192).
For PulseAudio it's worse. The coarser the granularity, the more difficult it is for PulseAudio to achieve low-latency operation. PulseAudio also rewinds/rewrites the buffer occasionally and uses the pointer to know where it should start rewriting from.
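To make the granularity point concrete, here is a minimal sketch in plain C (not kernel code) of a pointer value that is only accurate to the period. The function name and the idea of a free-running frame counter are assumptions made for illustration, not part of any ALSA API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: map a free-running count of frames consumed by the
 * hardware onto a buffer position that is only accurate to one period.
 * With 4 periods of 64 frames it can only return 0, 64, 128 or 192.
 */
static size_t period_granular_pointer(size_t frames_done,
                                      size_t period_size,
                                      size_t periods)
{
    assert(period_size > 0 && periods > 0);

    size_t buffer_size = period_size * periods;
    size_t pos = frames_done % buffer_size;  /* wrap into the ring buffer */

    return pos - (pos % period_size);        /* round down to a period boundary */
}
```

A driver reporting positions like this would probably satisfy a period-driven client, while a client that rewinds to an arbitrary frame would see up to one period of error.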
- Is there any test I could run to check that I have implemented the "pointer" callback correctly? Or any application that needs very high "pointer" precision, such as frame precision?
PulseAudio has an alsa-time-test application that relies heavily on the pointer callback being accurate. It's only for playback and I've never used it myself so I'm not completely sure about how to interpret the numbers.
In general, I believe PulseAudio (especially with timer-scheduling mode enabled) stress tests the driver quite hard and as such it is sometimes being used as a measure to see if the audio driver is successful. :-)
Hopefully this provides some initial insights.
Hi,
Thanks very much for your quick, detailed and useful response.
On 20/03/12 09:52, David Henningsson wrote:
PulseAudio has an alsa-time-test application that relies heavily on the pointer callback being accurate. It's only for playback and I've never used it myself so I'm not completely sure about how to interpret the numbers.
I will try this for sure.
In general, I believe PulseAudio (especially with timer-scheduling mode enabled) stress tests the driver quite hard and as such it is sometimes being used as a measure to see if the audio driver is successful. :-)
Hopefully this provides some initial insights.
Yes
I will let you know how this work is progressing
Thanks again,
Stefano
Stefano Panella wrote:
I have been reading the documentation on "Writing an ALSA Driver" and I am still not completely clear on the meaning of the "pointer" callback in the pcm operations.
- In the case of a playback stream, does the pointer refer to the sample currently playing on the DAC, or to the last frame read by the HW from the ALSA memory buffer?
It's the position of the first frame not yet read from the memory buffer.
The delay between the DMA and the DAC output would be reported by adjusting runtime->delay, but drivers usually do not bother to do this, except when this delay becomes rather large because of additional queueing, e.g., in the USB driver.
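As a rough illustration of how the pointer and runtime->delay combine, the sketch below models the playback latency an application would see: the frames still queued in the ring buffer plus whatever the driver reports as extra delay. It is a simplified stand-alone model, assuming free-running frame counters (the kernel wraps them at a boundary, which is ignored here); the function and parameter names are made up for this example.

```c
#include <assert.h>

/*
 * Simplified model of the playback delay seen by an application.
 *   appl_ptr      - frames written by the application so far
 *   hw_ptr        - first frame not yet read from the buffer, i.e. the
 *                   value the pointer callback describes
 *   runtime_delay - extra frames queued between the DMA and the DAC,
 *                   i.e. what a driver would put in runtime->delay
 */
static long model_playback_delay(unsigned long appl_ptr,
                                 unsigned long hw_ptr,
                                 long runtime_delay)
{
    assert(appl_ptr >= hw_ptr);  /* the model ignores counter wrap-around */

    long queued = (long)(appl_ptr - hw_ptr);  /* still in the ring buffer */
    return queued + runtime_delay;
}
```

With 1000 frames written, a pointer at 744 and no extra delay, the model gives 256 frames; a driver that knows about 48 more frames queued downstream would make that 304.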
- Is there any test I could run to check that I have implemented the "pointer" callback correctly? Or any application that needs very high "pointer" precision, such as frame precision?
PulseAudio. Or run mplayer and look at the A-V value in the status line.
Regards, Clemens
Hi,
Thanks for taking the time to answer about this detail.
On 20/03/12 13:10, Clemens Ladisch wrote:
It's the position of the first frame not yet read from the memory buffer.
The delay between the DMA and the DAC output would be reported by adjusting runtime->delay, but drivers usually do not bother to do this, except when this delay becomes rather large because of additional queueing, e.g., in the USB driver.
OK, I was wondering about applications like Skype that try to do background noise cancellation or to eliminate echo when using speakers. I was thinking that in this case the delay should be accurate for both playback and capture, or am I wrong?
- Is there any test I could run to check that I have implemented the "pointer" callback correctly? Or any application that needs very high "pointer" precision, such as frame precision?
PulseAudio. Or run mplayer and look at the A-V value in the status line.
I will try these as well.
Regards, Clemens
If this PV Xen ALSA driver starts to work properly, where should I post the patches? Who is currently the Linux kernel ALSA driver maintainer?
Thanks again,
Stefano.
Stefano Panella wrote:
On 20/03/12 13:10, Clemens Ladisch wrote:
The delay between the DMA and the DAC output would be reported by adjusting runtime->delay, but drivers usually do not bother to do this, except when this delay becomes rather large because of additional queueing, e.g., in the USB driver.
OK, I was wondering about applications like Skype that try to do background noise cancellation or to eliminate echo when using speakers. I was thinking that in this case the delay should be accurate for both playback and capture, or am I wrong?
If the sound data is regularly moved from the VM's buffer to the host's buffer, then the additional latency of the host is big enough that it's worth reporting.
If you map the host's buffer into the VM's address space, there is no additional latency, but I don't know if this is feasible. If not, you could also use the pcm_ops.copy callback to copy the data from the VM to the host as soon as the application writes it.
Who is currently the Linux kernel ALSA driver maintainer?
See "SOUND" in the MAINTAINERS file.
Regards, Clemens
On 21/03/12 13:37, Clemens Ladisch wrote:
If the sound data is regularly moved from the VM's buffer to the host's buffer, then the additional latency of the host is big enough that it's worth reporting.
If you map the host's buffer into the VM's address space, there is no additional latency, but I don't know if this is feasible. If not, you could also use the pcm_ops.copy callback to copy the data from the VM to the host as soon as the application writes it.
My PV ALSA driver allocates some non-contiguous pages for the audio buffer, and the get_page callback is called to retrieve the position of every page. These pages are also mapped from the backend in dom0.
The backend in dom0 is a userspace process using PortAudio on top of ALSA. The process is running at realtime priority and calls a callback to feed data every 64 frames. In the callback I copy 64 frames from the shared pages to the PortAudio buffer and update the HW pointer in another shared page accordingly.
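The backend algorithm described above could look roughly like the following sketch. It is a stand-alone model, not the actual backend: the `shared_ring` layout, the `backend_fill` name and the 4-byte frame size (16-bit stereo) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_BYTES 4u  /* assumption: 16-bit stereo */

/* Hypothetical layout of the state shared between guest and backend. */
struct shared_ring {
    uint8_t *data;           /* playback buffer shared with the guest */
    uint32_t buffer_frames;  /* ring size in frames */
    uint32_t hw_ptr;         /* next frame the backend will read */
};

/*
 * Called by the audio API in dom0 whenever it wants `frames` more frames
 * (64 in the setup described above). Copies them out of the shared ring,
 * handling wrap-around, then publishes the new position for the guest.
 */
static void backend_fill(struct shared_ring *ring, uint8_t *out,
                         uint32_t frames)
{
    assert(frames <= ring->buffer_frames);

    uint32_t first = ring->buffer_frames - ring->hw_ptr;  /* frames until wrap */
    if (first > frames)
        first = frames;

    memcpy(out, ring->data + (size_t)ring->hw_ptr * FRAME_BYTES,
           (size_t)first * FRAME_BYTES);
    memcpy(out + (size_t)first * FRAME_BYTES, ring->data,
           (size_t)(frames - first) * FRAME_BYTES);

    /* A real implementation would need a memory barrier before this store. */
    ring->hw_ptr = (ring->hw_ptr + frames) % ring->buffer_frames;
}
```

The guest's pointer callback would then simply return the published `hw_ptr`, which is why its granularity equals the backend's callback size.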
Would it be possible to run the backend in dom0 kernel space and to use my shared pages from the ALSA driver in the VM as real pages for the HW, instead of going all the way through userspace->portaudio->alsalib->alsa-kernel-layer->Real-HW?
Who is currently the Linux kernel ALSA driver maintainer?
See "SOUND" in the MAINTAINERS file.
OK
Regards, Clemens
Thanks again very much and regards,
Stefano
Stefano Panella wrote:
The backend in dom0 is a userspace process using PortAudio on top of ALSA. The process is running at realtime priority and calls a callback to feed data every 64 frames. In the callback I copy 64 frames from the shared pages to the PortAudio buffer and update the HW pointer in another shared page accordingly.
If PortAudio requires you to use a callback, this is the only algorithm that you can implement.
Would it be possible to run the backend in dom0 kernel space and to use my shared pages from the alsa-driver in the VM as real pages for the HW?
No, the buffer pages are always under the control of and allocated by the actual driver (some devices might have special requirements, or do not support mmap at all).
If the actual driver supports mmap, you would have to map these dom0 pages into the VM, and I guess this is not possible. (?)
As far as I can see, you have two options:
1) Stay with the current algorithm. You get an additional latency corresponding to dom0's buffer size, and your process is forced to wake up every 64 frames (or whatever PortAudio is configured for).
2) Replace PortAudio with ALSA in the backend, and implement the copy callback in your driver. Any call of snd_pcm_write*() in the VM will result in one or more calls to your driver's copy(), which should end up as a call to snd_pcm_write*() in dom0. (Using the copy callback also implies that the driver does not support mmap.)
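For option 2, the guest side of a copy-based design might be modelled as below. This is only a sketch under stated assumptions: the real copy callback has a different signature (and that signature differs between kernel versions), and `pv_stream`, `pv_copy_model` and the `forwarded` counter are hypothetical names standing in for the hand-off to the backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-stream state of the PV driver. */
struct pv_stream {
    uint8_t *shared;          /* buffer shared with the backend */
    size_t buffer_frames;     /* size of that buffer in frames */
    size_t frame_bytes;       /* bytes per frame */
    unsigned long forwarded;  /* frames handed to the backend so far */
};

/*
 * Model of a playback copy callback: place `count` frames at frame offset
 * `pos` in the shared buffer and record them as forwarded. The chunk is
 * assumed not to wrap around the end of the buffer.
 */
static int pv_copy_model(struct pv_stream *s, size_t pos,
                         const void *src, size_t count)
{
    assert(s->frame_bytes > 0);

    if (pos + count > s->buffer_frames)
        return -1;  /* chunk would run past the end of the buffer */

    memcpy(s->shared + pos * s->frame_bytes, src, count * s->frame_bytes);
    s->forwarded += count;  /* stand-in for notifying the backend */
    return 0;
}
```

In a real driver the last step would be whatever mechanism tells dom0 to write the same frames to its own PCM, so data reaches the host as soon as the application writes it.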
Regards, Clemens
Hi,
On 22/03/12 12:09, Clemens Ladisch wrote:
As far as I can see, you have two options:
1) Stay with the current algorithm. You get an additional latency corresponding to dom0's buffer size, and your process is forced to wake up every 64 frames (or whatever PortAudio is configured for).
2) Replace PortAudio with ALSA in the backend, and implement the copy callback in your driver. Any call of snd_pcm_write*() in the VM will result in one or more calls to your driver's copy(), which should end up as a call to snd_pcm_write*() in dom0. (Using the copy callback also implies that the driver does not support mmap.)
I am now switching to writing the backend using alsa + dmix + dsnoop directly, since I did not manage to get information about precise timing in .
I would like echo cancellation to work in the guest, but I am getting a bit lost and I am no longer certain which approach would be best.
I have available for sure:
1) a set of shared pages, used for:
- playback buffer
- capture buffer
- general purpose commands/events/flags
2) a way to send a notification from the backend that results in an interrupt in the guest PV ALSA kernel driver
- I use it at the moment to send a period interrupt
3) a way to have a callback executed in the backend code, triggered from the guest ALSA driver, which I use for:
- prepare
- trigger
- open
- close
My question is now:
To make echo cancellation work in the guest the guest needs to know exactly which sample in the buffer is going in and out, right?
How can I design my alsa backend userspace program and the pv alsa kernel driver for the guest to do this?
Is there any echo cancellation test program to check that the PV ALSA driver in the guest and the backend ALSA app in dom0 are working well, without doing a Skype call every time?
What would be very critical in the guest PV ALSA driver?
- Should the first audio sample be played/captured right after the trigger, or can it happen at a later time (maybe runtime->delay could be set)?
- Should the pointer callback reflect the sample currently being played/captured?
- What should the pointer callback return in case of a fixed delay in playback and capture?
- Do I need an interrupt every period triggered from the backend, or can I use the hrtimer as in the dummy sound card example?
- In order to use only the copy callback and remove mmap support, what should I do?
On the backend application side:
- How does using the dmix, dsnoop, asym plugins affect sample position accuracy when using snd_pcm_delay, snd_pcm_avail, snd_pcm_status_get_delay?
- How could I be woken up exactly every period when using the dmix, dsnoop, asym plugins?
Sorry for so many questions, but I have been doing experiments so far and everything is working well except echo cancellation, and I would like to know whether I have completely misunderstood something.
Thanks for all your patience,
Regards,
Stefano
Stefano Panella wrote:
To make echo cancellation work in the guest the guest needs to know exactly which sample in the buffer is going in and out, right?
Yes. (Although many drivers don't bother to report the delay, and Skype works anyway, but that's probably because the delay isn't too big anyway.)
How can I design my alsa backend userspace program and the pv alsa kernel driver for the guest to do this?
Report the correct delay, i.e., in the pv pointer callback, call snd_pcm_status() in the backend to get both DMA position and delay, and compute from these the delay to set in runtime->delay.
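One possible way to turn the backend's status into a value for runtime->delay is sketched below. It assumes a design in which the guest's pointer counts frames already taken out of the shared buffer, so that everything still queued on the host side is extra delay. The function name and the `in_flight` counter are hypothetical; only snd_pcm_status_get_delay() is a real alsa-lib call, and it is referenced in comments rather than called so that the sketch compiles on its own.

```c
#include <assert.h>

/*
 * Assumed inputs, as a backend could obtain them:
 *   host_delay - result of snd_pcm_status_get_delay() on the host PCM:
 *                frames written to the host device but not yet played
 *   in_flight  - frames already taken from the guest's buffer but not yet
 *                written to the host device (hypothetical backend counter)
 * Returns the number of frames the guest driver would report as
 * runtime->delay under these assumptions.
 */
static long guest_runtime_delay(long host_delay, long in_flight)
{
    assert(in_flight >= 0);

    if (host_delay < 0)  /* defensive clamp, in case a negative value is reported */
        host_delay = 0;

    return host_delay + in_flight;
}
```

Because the host delay can change over time, this would be recomputed on every pointer callback rather than set once.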
(Well, I don't know if Skype actually uses the delay value ...)
Is there any echo cancellation test program
None that I know of.
What would be very critical in the guest pv alsa driver?
- Should the first audio sample be played/captured right after the trigger, or can it happen at a later time
In your pv trigger callback, you should ultimately call the backend trigger, i.e., snd_pcm_start() etc.
How fast the first sample is played depends on the host sound driver (and the latency of your pv->dom0 call).
(maybe runtime->delay could be set)?
You do not know if there is a constant delay or if it can change (as with, e.g., the USB driver), so you should recompute the delay in the pointer callback.
- should the pointer callback reflect the sample being currently played/captured?
No, it's the DMA position, i.e., the position in the buffer where the application can read or write data.
- what should the pointer callback return in case of a fixed delay in playback and capture?
The callback's return value doesn't depend on the delay. However, if the delay changes, it must adjust runtime->delay.
- do I need an interrupt every period triggered from the backend or can I use the hrtimer as in the dummy sound card example?
The interrupts (i.e., the snd_pcm_period_elapsed() calls) must be synchronized with the sample clock. A period trigger is a guarantee from the driver to the ALSA framework that 1) (at least) one period has elapsed since the last call, and 2) all the data in the period has been played, so the application can overwrite it with new data (i.e., the return value of the pointer callback is now at least at the end of the period).
The dummy driver uses a timer because there is no other clock, and what is in the buffer doesn't actually matter for it.
You must ensure that a period_elapsed call in the pv driver does not happen earlier than the corresponding period interrupt in the backend; the easiest way to do this would be to connect them.
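The ordering requirement can be illustrated with a small stand-alone helper: the guest only signals a period once the backend's own frame counter has actually crossed a period boundary. This is a sketch with assumed names and free-running counters, not code from any driver.

```c
#include <assert.h>

/*
 * Given the previous and current free-running frame counts reported by
 * the backend, return how many whole period boundaries were crossed.
 * The guest driver would call snd_pcm_period_elapsed() only when the
 * result is non-zero, so it never signals a period before the backend
 * has finished with that period's data.
 */
static unsigned long periods_crossed(unsigned long prev_frames,
                                     unsigned long cur_frames,
                                     unsigned long period_size)
{
    assert(period_size > 0 && cur_frames >= prev_frames);

    return cur_frames / period_size - prev_frames / period_size;
}
```

A free-running timer in the guest cannot give this guarantee, because it has no knowledge of where the host's sample clock really is.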
-in order to only use the copy callback, and remove mmap support, what should I do?
Er, implement the copy callback (copy the indicated data from userspace into the shared buffer, or the other way around), and drop the SNDRV_PCM_INFO_MMAP* flags. And you don't need to allocate a buffer in the pv driver.
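Dropping the mmap flags amounts to clearing two bits in the hardware description's info field. The sketch below uses local copies of the flag values so that it compiles outside the kernel; they are believed to match the SNDRV_PCM_INFO_* definitions in the kernel's sound headers, but that should be checked against your tree. The helper name is made up for this example.

```c
#include <assert.h>

/* Local stand-ins for SNDRV_PCM_INFO_* (assumed values, see note above). */
#define INFO_MMAP            0x00000001u
#define INFO_MMAP_VALID      0x00000002u
#define INFO_INTERLEAVED     0x00000100u
#define INFO_BLOCK_TRANSFER  0x00010000u

/* Return the info flags with mmap support removed, leaving the rest alone. */
static unsigned int info_without_mmap(unsigned int info)
{
    unsigned int result = info & ~(INFO_MMAP | INFO_MMAP_VALID);

    assert((result & (INFO_MMAP | INFO_MMAP_VALID)) == 0);
    return result;
}
```

In a real driver you would simply leave the two MMAP flags out of the `.info` initializer instead of masking them at run time.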
On the backend application side:
- How does using the dmix, dsnoop, asym plugins affect sample position accuracy when using snd_pcm_delay, snd_pcm_avail, snd_pcm_status_get_delay?
The asym plugin just instantiates another plugin.
The dmix and dsnoop plugins just pass through snd_pcm_avail, but AFAICS they do not report the delay at all.
- how could I be woken up exactly every period using dmix, dsnoop, asym plugins?
The dmix/dsnoop plugins get their interrupts from the base device; they are woken up normally like any other type of device.
everything is working well except echo cancellation
Try with hw instead of dmix. If that works, it would imply that echo cancellation does not work with an unvirtualized dmix either, but I don't know what device Skype actually opens.
Regards, Clemens
participants (3): Clemens Ladisch, David Henningsson, Stefano Panella