[alsa-devel] [PATCH] ad1848 ac loop
New thread for the the final version of the patches for my fixes to the ad1848 auto-calibration loop. I made a couple suggested changes. These should be ok to apply, and should be independent of the other as yet un-applied patches that have been posted by Rene and Krzysztof.
I'm not an expert on this chip by any means, but from looking at the code I think there are few things that could be fixed.
The reg_lock spinlock isn't acquired from the irq handler, as the handler doesn't use any of the indirect registers. I think this means that is isn't necessary to use the irq disabling versions of the spin_lock functions with reg_lock.
It does look like there is a different SMP race condition wrt the irq handler.
On 09/19/2007 07:22 AM, Trent Piepho wrote:
New thread for the the final version of the patches for my fixes to the ad1848 auto-calibration loop. I made a couple suggested changes. These should be ok to apply, and should be independent of the other as yet un-applied patches that have been posted by Rene and Krzysztof.
I don't believe I have anything outstanding anymore. If these are applied, I'll place one more on top to clean up that printk mess in there.
First "simplest fix" patch acked as a necesary fix for a dumb msleep while atomic bug that was introduced (by me).
Second "simplify" patch acked both as a simplification and a bugfix for a race that was introduced in the conversion to msleep().
Rene.
On 09/19/2007 07:22 AM, Trent Piepho wrote:
I'm not an expert on this chip by any means, but from looking at the code I think there are few things that could be fixed.
I'd agree, although I'd suggest to at this point keep anything (in ad1848, cs4231 wants a similar setup in its mce_down) not completely trivial for after 1.0.15.
The reg_lock spinlock isn't acquired from the irq handler, as the handler doesn't use any of the indirect registers. I think this means that is isn't necessary to use the irq disabling versions of the spin_lock functions with reg_lock.
This would appear to be the case yes.
It does look like there is a different SMP race condition wrt the irq handler. From snd_ad1848_interrupt(): 578 if ((chip->mode & AD1848_MODE_PLAY) && chip->playback_substream && 579 (chip->mode & AD1848_MODE_RUNNING)) 580 snd_pcm_period_elapsed(chip->playback_substream);
Another CPU could close the substream between the check and the call to snd_pcm_period_elapsed(). snd_pcm_period_elapsed() could be called with either NULL or a stale substream pointer depending on how the code compiled. I'd think creating an irq spinlock would be an easy way to fix this. The irq handler would grab it, and the same with the open() and close() functions around the code that sets the substream pointers. Alternatively, one could disabled the irq handler during open and close.
Also sounds sensible, and the irq spinlock the expected method.
I think there might also be problem with setting mixers controls during auto-calibration . While waiting for auto-calibration to finish, the chip isn't locked. If another thread tries to set the volume via the mixer interface, it won't be blocked and will try to talk to the chip during AC. The datasheet just says to wait for auto-calibration to finish, it doesn't say what happens if you don't. So maybe there isn't any problem here.
Guess we should be testing this, but generallt, "hand-off" would be the best policy yes.
Rene.
On Tue, 18 Sep 2007 22:22:42 -0700 (PDT) Trent Piepho xyzzy@speakeasy.org wrote:
I'm not an expert on this chip by any means, but from looking at the code I think there are few things that could be fixed.
The reg_lock spinlock isn't acquired from the irq handler, as the handler doesn't use any of the indirect registers. I think this means that is isn't necessary to use the irq disabling versions of the spin_lock functions with reg_lock.
Normally, this is not true as at least counter registers are reloaded. This is not a case for the ad1848. However, quick test showed that playback is stopped from the irq handler if there is no more data (indirect register 9 write).
It does look like there is a different SMP race condition wrt the irq handler. From snd_ad1848_interrupt(): 578 if ((chip->mode & AD1848_MODE_PLAY) && chip->playback_substream && 579 (chip->mode & AD1848_MODE_RUNNING)) 580 snd_pcm_period_elapsed(chip->playback_substream);
Another CPU could close the substream between the check and the call to snd_pcm_period_elapsed(). snd_pcm_period_elapsed() could be called with either NULL or a stale substream pointer depending on how the code compiled. I'd think creating an irq spinlock would be an easy way to fix this. The irq handler would grab it, and the same with the open() and close() functions around the code that sets the substream pointers. Alternatively, one could disabled the irq handler during open and close.
I would not do this in the driver code. This should be handled in the alsa pcm layer and probably it is. At least the snd_pcm_period_elapsed() does locking and irq disabling. Could any alsa guru enlighten us on pcm open/close locking and interrupt disabling?
If it is handled in the pcm layer it is fixed for all drivers in one place.
I think there might also be problem with setting mixers controls during auto-calibration . While waiting for auto-calibration to finish, the chip isn't locked. If another thread tries to set the volume via the mixer interface, it won't be blocked and will try to talk to the chip during AC. The datasheet just says to wait for auto-calibration to finish, it doesn't say what happens if you don't. So maybe there isn't any problem here.
I would not do this until someone experience the problem. Otherwise, we will created the code to fix bugs which may not exist. Later, someone would try to simplify the code and nobody will remember why it was done.
The ad1848 calibrates only DAC/ADC so no mixer values are set. In the worst case, I expect some noise during auto-calibration, but I experience "pops" always during auto-calibration on the ad1848 (sc-6000 board). The only solution is to do the calibration only once during initialization than switch it off. Muting outputs does not help.
Regards, Krzysztof
On Wed, 19 Sep 2007, Krzysztof Helt wrote:
On Tue, 18 Sep 2007 22:22:42 -0700 (PDT) Trent Piepho xyzzy@speakeasy.org wrote:
I'm not an expert on this chip by any means, but from looking at the code I think there are few things that could be fixed.
The reg_lock spinlock isn't acquired from the irq handler, as the handler doesn't use any of the indirect registers. I think this means that is isn't necessary to use the irq disabling versions of the spin_lock functions with reg_lock.
Normally, this is not true as at least counter registers are reloaded. This is not a case for the ad1848. However, quick test showed that playback is stopped from the irq handler if there is no more data (indirect register 9 write).
Hmm, I don't see where that happens. Unless snd_pcm_period_elapsed() is allowed to call the trigger callback? I know it will call the pointer callback, which doesn't use any indirect registers.
I know that trigger, ack, and pointer are called with interrupts disabled and a spinlock held. But that is not the same as being called from interrupt context. AFAIK, only pointer via snd_pcm_period_elapsed() will be called from interrupt context.
It does look like there is a different SMP race condition wrt the irq handler. From snd_ad1848_interrupt(): 578 if ((chip->mode & AD1848_MODE_PLAY) && chip->playback_substream && 579 (chip->mode & AD1848_MODE_RUNNING)) 580 snd_pcm_period_elapsed(chip->playback_substream);
Another CPU could close the substream between the check and the call to snd_pcm_period_elapsed(). snd_pcm_period_elapsed() could be called with either NULL or a stale substream pointer depending on how the code compiled. I'd think creating an irq spinlock would be an easy way to fix this. The irq handler would grab it, and the same with the open() and close() functions around the code that sets the substream pointers. Alternatively, one could disabled the irq handler during open and close.
I would not do this in the driver code. This should be handled in the alsa pcm layer and probably it is. At least the snd_pcm_period_elapsed() does locking and irq disabling. Could any alsa guru enlighten us on pcm open/close locking and interrupt disabling?
I don't see how the alsa pcm layer could lock around the interrupt handler. The handler is registered directly to the kernel, not via alsa, so there's no way the alsa pcm layer could know if the irq handler is running, or if one even exists.
On Wed, 19 Sep 2007 12:50:54 -0700 (PDT) Trent Piepho xyzzy@speakeasy.org wrote:
On Wed, 19 Sep 2007, Krzysztof Helt wrote:
However, quick test showed that playback is stopped from the irq handler if there is no more data (indirect register 9 write).
Hmm, I don't see where that happens. Unless snd_pcm_period_elapsed() is allowed to call the trigger callback? I know it will call the pointer callback, which doesn't use any indirect registers.
I know that trigger, ack, and pointer are called with interrupts disabled and a spinlock held. But that is not the same as being called from interrupt context.
I do not understand these two sentences so probably you are right. I can provide you backtrace how the snd_ad1848_trigger() is called:
... snd_ad1848_trigger snd_ad1848_playback_trigger snd_pcm_do_stop snd_pcm_action_single snd_pcm_period_elapsed scheduler_tick run_timer_softirq snd_ad1848_interrupt ... do_IRQ ...
Another CPU could close the substream between the check and the call to snd_pcm_period_elapsed(). snd_pcm_period_elapsed() could be called with either NULL or a stale substream pointer depending on how the code compiled. I'd think creating an irq spinlock would be an easy way to fix this. The irq handler would grab it, and the same with the open() and close() functions around the code that sets the substream pointers. Alternatively, one could disabled the irq handler during open and close.
I would not do this in the driver code. This should be handled in the alsa pcm layer and probably it is. At least the snd_pcm_period_elapsed() does locking and irq disabling. Could any alsa guru enlighten us on pcm open/close locking and interrupt disabling?
I don't see how the alsa pcm layer could lock around the interrupt handler. The handler is registered directly to the kernel, not via alsa, so there's no way the alsa pcm layer could know if the irq handler is running, or if one even exists.
The pcm layer may disable interrupts. IMO, this should be enough to stop entering any interrupt handler. I am not expert on this but following your reasoning any core kernel code (vm, scheduler) could not be interrupt-safe because most of interrupt handlers are in drivers and the core has no idea that drivers exist.
Regards, Krzysztof
On Wed, 19 Sep 2007, Krzysztof Helt wrote:
On Wed, 19 Sep 2007 12:50:54 -0700 (PDT) Trent Piepho xyzzy@speakeasy.org wrote:
On Wed, 19 Sep 2007, Krzysztof Helt wrote:
However, quick test showed that playback is stopped from the irq handler if there is no more data (indirect register 9 write).
Hmm, I don't see where that happens. Unless snd_pcm_period_elapsed() is allowed to call the trigger callback? I know it will call the pointer callback, which doesn't use any indirect registers.
I know that trigger, ack, and pointer are called with interrupts disabled and a spinlock held. But that is not the same as being called from interrupt context.
I do not understand these two sentences so probably you are right. I can provide you backtrace how the snd_ad1848_trigger() is called:
You're right, it looks like trigger can be called from the irq handler. It looks like this: irqhander snd_pcm_period_elapsed snd_pcm_update_hw_ptr_interrupt snd_pcm_update_hw_ptr_pos substream->ops->pointer(substream) snd_pcm_update_hw_ptr_post snd_pcm_playback_avail snd_pcm_drain_done snd_pcm_action_single(&snd_pcm_action_stop) ops->do_action = snd_pcm_do_stop substream->ops->trigger(substream, SNDRV_PCM_TRIGGER_STOP)
Another CPU could close the substream between the check and the call to snd_pcm_period_elapsed(). snd_pcm_period_elapsed() could be called with either NULL or a stale substream pointer depending on how the code compiled. I'd think creating an irq spinlock would be an easy way to fix this. The irq handler would grab it, and the same with the open() and close() functions around the code that sets the substream pointers. Alternatively, one could disabled the irq handler during open and close.
I would not do this in the driver code. This should be handled in the alsa pcm layer and probably it is. At least the snd_pcm_period_elapsed() does locking and irq disabling. Could any alsa guru enlighten us on pcm open/close locking and interrupt disabling?
I don't see how the alsa pcm layer could lock around the interrupt handler. The handler is registered directly to the kernel, not via alsa, so there's no way the alsa pcm layer could know if the irq handler is running, or if one even exists.
The pcm layer may disable interrupts. IMO, this should be enough to stop entering any interrupt handler.
The pcm layer can't disable interrupts before calling _any_ of the pcm ops. When interrupts are disabled, there are things you can't do, like sleep or call vmalloc. Only the pointer, trigger, and ack pcm ops have interrupts disabled.
It's also worth noting that only local interrupts are disabled, not global interrupts. Absent a spin-lock, an SMP system can still have an irq handler run while another thread on another CPU has interupts disabled.
I am not expert on this but following your reasoning any core kernel code (vm, scheduler) could not be interrupt-safe because most of interrupt handlers are in drivers and the core has no idea that drivers exist.
None of the variables used by drivers' interrupt handlers are modified by the core kernel code. In this case, chip->playback_substream and chip->record_substream are the relevant variables. They are used non-atomically from the irq handler, so any code which modifies them must be protected from the irq handler.
On 9/20/07, Trent Piepho xyzzy@speakeasy.org wrote:
It's also worth noting that only local interrupts are disabled, not global interrupts. Absent a spin-lock, an SMP system can still have an irq handler run while another thread on another CPU has interupts disabled.
I am not expert on this but following your reasoning any core kernel code (vm, scheduler) could not be interrupt-safe because most of interrupt handlers are in drivers and the core has no idea that drivers exist.
None of the variables used by drivers' interrupt handlers are modified by the core kernel code. In this case, chip->playback_substream and chip->record_substream are the relevant variables. They are used non-atomically from the irq handler, so any code which modifies them must be protected from the irq handler.
Most ALSA drivers would not enable the device interrupt until trigger START, and disable them in trigger STOP. So your interrupt should never be enabled during open()/close().
Lee
On Thu, 20 Sep 2007, Lee Revell wrote:
On 9/20/07, Trent Piepho xyzzy@speakeasy.org wrote:
It's also worth noting that only local interrupts are disabled, not global interrupts. Absent a spin-lock, an SMP system can still have an irq handler run while another thread on another CPU has interupts disabled.
I am not expert on this but following your reasoning any core kernel code (vm, scheduler) could not be interrupt-safe because most of interrupt handlers are in drivers and the core has no idea that drivers exist.
None of the variables used by drivers' interrupt handlers are modified by the core kernel code. In this case, chip->playback_substream and chip->record_substream are the relevant variables. They are used non-atomically from the irq handler, so any code which modifies them must be protected from the irq handler.
Most ALSA drivers would not enable the device interrupt until trigger START, and disable them in trigger STOP. So your interrupt should never be enabled during open()/close().
Multiple substreams per device. One could call open or close on one substream while another is active.
Just because one has told the device to stop generating interrupts, doesn't mean there isn't a pending interrupt already being handled. That's what synchronize_irq() is for.
At Tue, 18 Sep 2007 22:22:42 -0700 (PDT), Trent Piepho wrote:
New thread for the the final version of the patches for my fixes to the ad1848 auto-calibration loop. I made a couple suggested changes. These should be ok to apply, and should be independent of the other as yet un-applied patches that have been posted by Rene and Krzysztof.
OK, I merged them to HG tree now. Thanks.
I'm not an expert on this chip by any means, but from looking at the code I think there are few things that could be fixed.
The reg_lock spinlock isn't acquired from the irq handler, as the handler doesn't use any of the indirect registers. I think this means that is isn't necessary to use the irq disabling versions of the spin_lock functions with reg_lock.
It does look like there is a different SMP race condition wrt the irq handler.
From snd_ad1848_interrupt():
578 if ((chip->mode & AD1848_MODE_PLAY) && chip->playback_substream && 579 (chip->mode & AD1848_MODE_RUNNING)) 580 snd_pcm_period_elapsed(chip->playback_substream);
Another CPU could close the substream between the check and the call to snd_pcm_period_elapsed(). snd_pcm_period_elapsed() could be called with either NULL or a stale substream pointer depending on how the code compiled. I'd think creating an irq spinlock would be an easy way to fix this. The irq handler would grab it, and the same with the open() and close() functions around the code that sets the substream pointers. Alternatively, one could disabled the irq handler during open and close.
Yes, it'd be an easy solution.
I think there might also be problem with setting mixers controls during auto-calibration . While waiting for auto-calibration to finish, the chip isn't locked. If another thread tries to set the volume via the mixer interface, it won't be blocked and will try to talk to the chip during AC. The datasheet just says to wait for auto-calibration to finish, it doesn't say what happens if you don't. So maybe there isn't any problem here.
Hm, this may happen, too.
Anyway, let's do more cleanups after 1.0.15 release.
Takashi
participants (5)
-
Krzysztof Helt
-
Lee Revell
-
Rene Herman
-
Takashi Iwai
-
Trent Piepho