[alsa-devel] snd_pcsp locking mess

newer
[alsa-devel] Nvidia hdmi problems

older
[alsa-devel] [PATCH1/2] hda: add...

Stas Sergeev

18 May 2008 18 May '08

9:20 p.m.

bugme-daemon@bugzilla.kernel.org wrote:

...

------- Comment #17 from tiwai@suse.de 2008-05-18 10:22 ------- Anyway, what I meant is that the part touching I/O ports could be a faster path than softirq. And the hrtimer code can use a bit simpler locks. The hrtimer callback needs just one thing about DMA buffer - the buffer is allocated and the address doesn't change during the callback. So, we'd need the lock for the hr callback and for the PCM prepare, hwparams and free callbacks, i.e. to assure that the hrtimer callback is finished when these PCM callbacks may change the DMA buffer. Thus this lock doesn't have to be PCM substream lock.

I wonder if it would be better to introduce such a lock in an alsa core rather than in a driver. The same way as snd_pcm_stream_lock().

...

The current lock mess may come from the PCM trigger (STOP) called from snd_pcm_period_elapsed(). So, putting snd_pcm_period_elapsed() out of hrtimer lock is anyway necessary. A tasklet for snd_pcm_period_elapsed() is needed for this reason, too.

Wouldn't it be better to just make snd_pcm_period_elapsed() to not take the stream lock? There is a requirement right now to drop the stream lock before calling this. I still beleive this is the root of all the headache. If you need to drop it, then you either accept the race or need a second lock, and the troubles ensue.

Show replies by date

Takashi Iwai

19 May 19 May

8:50 a.m.

At Sun, 18 May 2008 22:20:43 +0400, Stas Sergeev wrote:

...

bugme-daemon@bugzilla.kernel.org wrote:

...
------- Comment #17 from tiwai@suse.de 2008-05-18 10:22 ------- Anyway, what I meant is that the part touching I/O ports could be a faster path than softirq. And the hrtimer code can use a bit simpler locks. The hrtimer callback needs just one thing about DMA buffer - the buffer is allocated and the address doesn't change during the callback. So, we'd need the lock for the hr callback and for the PCM prepare, hwparams and free callbacks, i.e. to assure that the hrtimer callback is finished when these PCM callbacks may change the DMA buffer. Thus this lock doesn't have to be PCM substream lock.

I wonder if it would be better to introduce such a lock in an alsa core rather than in a driver. The same way as snd_pcm_stream_lock().

I don't think it's good to introduce yet another lock in the PCM core. More lock, more complex. Even now the lock handling is too complex (and might be buggy in some cases) in ALSA PCM core.

I think, however, it'd be worth to consider snd_pcm_period_elapsed() to be softirq in PCM core so that the driver doesn't need initializations. This will also reduce the spin unlock/re-lock procedure in IRQ handler around snd_pcm_perild_elapsed().

...

...
The current lock mess may come from the PCM trigger (STOP) called from snd_pcm_period_elapsed(). So, putting snd_pcm_period_elapsed() out of hrtimer lock is anyway necessary. A tasklet for snd_pcm_period_elapsed() is needed for this reason, too.

Wouldn't it be better to just make snd_pcm_period_elapsed() to not take the stream lock? There is a requirement right now to drop the stream lock before calling this. I still beleive this is the root of all the headache.

No. The work done by snd_pcm_period_elapsed() touches critical sections and is delicated enough to protect by a lock.

The root of all the headache is the order of the spinlock. It's just one lock in the PCM; and you must take it somewhere to protect.

A typical solution for this kind of problem is to use a global lock on the top. But, in your case, we can't -- because hrtimer doesn't want such (of course). That's why snd_pcm_period_elapsed() is better put out to a tasklet in your case. Also, this will gives two advantages: no spinlock mess and the reduction of too long codepath in hrtimer callback.

...

If you need to drop it, then you either accept the race or need a second lock, and the troubles ensue.

Takashi

Stas Sergeev

8:01 p.m.

Hello.

Takashi Iwai wrote:

...

...
...
Anyway, what I meant is that the part touching I/O ports could be a faster path than softirq. And the hrtimer code can use a bit simpler locks. The hrtimer callback needs just one thing about DMA buffer - the buffer is allocated and the address doesn't change during the callback. So, we'd need the lock for the hr callback and for the PCM prepare, hwparams and free callbacks, i.e. to assure that the hrtimer callback is finished when these PCM callbacks may change the DMA buffer. Thus this lock doesn't have to be PCM substream lock.

I wonder if it would be better to introduce such a lock in an alsa core rather than in a driver. The same way as snd_pcm_stream_lock().

I don't think it's good to introduce yet another lock in the PCM core. More lock, more complex. Even now the lock handling is too complex (and might be buggy in some cases) in ALSA PCM core.

Then I might be missing something. Right now I touch runtime->dma_area[] in a callback. You say it have to be protected, but not with alsa lock, but with my own lock. But I do not touch it ever in any other place. Everything else happens in an alsa core. So how would I possibly protect it with some lock private to snd-pcsp?

...

I think, however, it'd be worth to consider snd_pcm_period_elapsed() to be softirq in PCM core so that the driver doesn't need initializations. This will also reduce the spin unlock/re-lock procedure in IRQ handler around snd_pcm_perild_elapsed().

I guess it will help, but such a solution has a tendency of being classified as a "stupid hack", which you've already seen recently. :) There really must be other ways, that do not carry such a risk.

...

...
Wouldn't it be better to just make snd_pcm_period_elapsed() to not take the stream lock? There is a requirement right now to drop the stream lock before calling this. I still beleive this is the root of all the headache.

No. The work done by snd_pcm_period_elapsed() touches critical sections and is delicated enough to protect by a lock.

I didn't mean it should run without a lock. But it shouldn't take it itself, allowing the caller to _not drop_ it before calling. Is this acceptable?

...

The root of all the headache is the order of the spinlock. It's just one lock in the PCM; and you must take it somewhere to protect.

Certainly, I take it. But the unfortunate policy is that I have to drop it before calling snd_pcm_period_elapsed(). Can we change that? Of course, that will mean more changes than just to not take the lock in snd_pcm_period_elapsed(). Most likely the snd_pcm_stream_lock() and all its users will have to be slightly adjusted, but I think its pretty much a trivial changes, which will lead to a much cleaner code. And I remember the last time we discussed this, you admitted (some of) the other drivers do get this part wrong too. So fixing it properly once and for all might be a good idea as you are at it again.

Takashi Iwai

21 May 21 May

3:33 p.m.

At Mon, 19 May 2008 21:01:20 +0400, Stas Sergeev wrote:

...

Hello.

Takashi Iwai wrote:

...
...
...
Anyway, what I meant is that the part touching I/O ports could be a faster path than softirq. And the hrtimer code can use a bit simpler locks. The hrtimer callback needs just one thing about DMA buffer - the buffer is allocated and the address doesn't change during the callback. So, we'd need the lock for the hr callback and for the PCM prepare, hwparams and free callbacks, i.e. to assure that the hrtimer callback is finished when these PCM callbacks may change the DMA buffer. Thus this lock doesn't have to be PCM substream lock.

I wonder if it would be better to introduce such a lock in an alsa core rather than in a driver. The same way as snd_pcm_stream_lock().

I don't think it's good to introduce yet another lock in the PCM core. More lock, more complex. Even now the lock handling is too complex (and might be buggy in some cases) in ALSA PCM core.

Then I might be missing something. Right now I touch runtime->dma_area[] in a callback. You say it have to be protected, but not with alsa lock, but with my own lock. But I do not touch it ever in any other place. Everything else happens in an alsa core. So how would I possibly protect it with some lock private to snd-pcsp?

The only thing that we make sure is that runtime->dma_area is present and not changed during the hrtimer callback. Since this callback is called only when the stream is started and running, the only places we check are callbacks that may change the DMA buffer, namely, hw_params and hw_free. These should sync with hrtimer callback so that it's no longer called before actually changing the buffer.

In this scenario, the trigger-stop could be asynchronous. Thus, the prepare PCM callback should sync also with the hrtimer callback to assure that the pending callback is finished.

...

...
I think, however, it'd be worth to consider snd_pcm_period_elapsed() to be softirq in PCM core so that the driver doesn't need initializations. This will also reduce the spin unlock/re-lock procedure in IRQ handler around snd_pcm_perild_elapsed().

I guess it will help, but such a solution has a tendency of being classified as a "stupid hack", which you've already seen recently. :)

No, it's totally different. The purpose of moving snd_pcm_period_elapsed() to a tasklet is completely valid, just like what we did for bottom-halves -- splitting fast and slow paths. Using a tasklet for starting the stream is really a dirty hack.

...

There really must be other ways, that do not carry such a risk.

...
...
Wouldn't it be better to just make snd_pcm_period_elapsed() to not take the stream lock? There is a requirement right now to drop the stream lock before calling this. I still beleive this is the root of all the headache.

No. The work done by snd_pcm_period_elapsed() touches critical sections and is delicated enough to protect by a lock.

I didn't mean it should run without a lock. But it shouldn't take it itself, allowing the caller to _not drop_ it before calling. Is this acceptable?

Not yet. Some callbacks are supposed to be called outside the lock. But, yes, it's worth to consider in future.

...

...
The root of all the headache is the order of the spinlock. It's just one lock in the PCM; and you must take it somewhere to protect.

Certainly, I take it. But the unfortunate policy is that I have to drop it before calling snd_pcm_period_elapsed(). Can we change that?

The substream lock can be removed from hrtimer callback gracefully. So, you have no longer the issue with this lock/unlock procedure. In hrtimer callback, you just need to have the DMA-buffer. That can be protected without the substream lock, too, I believe.

...

Of course, that will mean more changes than just to not take the lock in snd_pcm_period_elapsed(). Most likely the snd_pcm_stream_lock() and all its users will have to be slightly adjusted, but I think its pretty much a trivial changes, which will lead to a much cleaner code. And I remember the last time we discussed this, you admitted (some of) the other drivers do get this part wrong too. So fixing it properly once and for all might be a good idea as you are at it again.

I don't agree here unless any code is shown :)

thanks,

Takashi

Stas Sergeev

22 May 22 May

11:28 p.m.

Hello.

Takashi Iwai wrote:

...

The substream lock can be removed from hrtimer callback gracefully. So, you have no longer the issue with this lock/unlock procedure. In hrtimer callback, you just need to have the DMA-buffer. That can be protected without the substream lock, too, I believe.

OK. But still, the substream have to be protected against the close callback. So I still need 2 locks. One for the substream, another for the DMA buffer. That makes things better but not much. I don't want 2 locks in a softirq callback, even if they are independant. Or am I still missing your point?

...

In this scenario, the trigger-stop could be asynchronous. Thus, the prepare PCM callback should sync also with the hrtimer callback to assure that the pending callback is finished.

OK, so I need 3 locks after all - adding one for the position pointers. Now that may be not even better than using the pcm_stream_lock() after all... Can there be a more advanced version of a pcm stream lock? The one that will take something persistant as an argument (chip->pcm perhaps), take some lock from there, then check if the substream is not freed, and then take the substream lock. And if substream was freed - it unlocks and returns an error. Effectively similar to the current snd_pcm_stream_lock(), but without a need for the second lock to protect the substream itself.

...

...
code. And I remember the last time we discussed this, you admitted (some of) the other drivers do get this part wrong too. So fixing it properly once and for all might be a good idea as you are at it again.

I don't agree here unless any code is shown :)

OK, lets try. My assumption is that all drivers do have the locking wrong because of that policy, but please show me what am I missing. :) So lets see hda_intel.c. What it does is to release some reg_lock before calling snd_pcm_period_elapsed(). It takes that lock in azx_pcm_close(). If it spins on that lock in azx_pcm_close() on one CPU and drops that lock in an IRQ handler on another CPU, then, I beleive, it will crash in snd_pcm_period_elapsed(), as the substream is freed before it takes the lock again. What am I missing? (that can't happen with a single CPU perhaps, as the lock is IRQ-safe, but IIRC this doesn't help on SMP at all) Or, what nm256.c does... hmm, it doesn't even do "substream = NULL" in a close callback - why?

Takashi Iwai

23 May 23 May

1:51 p.m.

At Fri, 23 May 2008 00:28:36 +0400, Stas Sergeev wrote:

...

Hello.

Takashi Iwai wrote:

...
The substream lock can be removed from hrtimer callback gracefully. So, you have no longer the issue with this lock/unlock procedure. In hrtimer callback, you just need to have the DMA-buffer. That can be protected without the substream lock, too, I believe.

OK. But still, the substream have to be protected against the close callback.

No, close callback is always after the hw_free callback. So, in hw_free callback, you just need to assure that hrtimer callback is finished (and/or cancel it). After that, you are free to work without considering hrtimer stuff.

We don't drop pcm substream lock here because we don't need to acquire it.

Takashi

Stas Sergeev

27 May 27 May

4:46 p.m.

Hello.

Takashi Iwai wrote:

...

No, close callback is always after the hw_free callback. So, in hw_free callback, you just need to assure that hrtimer callback is finished (and/or cancel it). After that, you are free to work without considering hrtimer stuff.

Here is my first attempt on that. The optimization you mention above is possible, but I haven't done it (yet). Because I think it will obscure things. I can't really say if this code is better than before - it looks cleaner but adds 3 locks.

...

We don't drop pcm substream lock here because we don't need to acquire it.

Why not? Because it is not recommended to use any more, or for some other reason? I mostly need the hrtimer callback to be protected from all the pcm callbacks, modulo the (IMO questionable) optimization. So why not would I just use the substream lock?

Stas Sergeev

4:47 p.m.

Hello.

Takashi Iwai wrote:

...

No, close callback is always after the hw_free callback. So, in hw_free callback, you just need to assure that hrtimer callback is finished (and/or cancel it). After that, you are free to work without considering hrtimer stuff.

...

We don't drop pcm substream lock here because we don't need to acquire it.

Takashi Iwai

6:50 p.m.

At Tue, 27 May 2008 17:47:19 +0400, Stas Sergeev wrote:

...

Hello.

Takashi Iwai wrote:

...
No, close callback is always after the hw_free callback. So, in hw_free callback, you just need to assure that hrtimer callback is finished (and/or cancel it). After that, you are free to work without considering hrtimer stuff.

Here is my first attempt on that. The optimization you mention above is possible, but I haven't done it (yet). Because I think it will obscure things. I can't really say if this code is better than before - it looks cleaner but adds 3 locks.

Grrr, that's horrible.

...

...
We don't drop pcm substream lock here because we don't need to acquire it.

Why not? Because it is not recommended to use any more, or for some other reason? I mostly need the hrtimer callback to be protected from all the pcm callbacks, modulo the (IMO questionable) optimization. So why not would I just use the substream lock?

Well, let's take a look back how the callbacks are called. The callbacks that are called during PCM running are the following:

- hrtimer callback gets called from hrtimer: This updates the pcsp port and updates the PCM pointer.

- PCM pointer callback: This returns the current PCM pointer

So, actually between these calls, we have only one thing to protect - the PCM pointer. The DMA-buffer readiness is necessary only for hrtimer callback.

And, in addition, we need to call somewhere snd_pcm_period_elapsed() when the period is finished. This can be done in a tasklet. But, note that this is no callback, and not what pcsp driver itself actually cares.

Now, remember that hrtimer is started only via trigger start callback. That is, hrtimer callback isn called first after PCM is actually started; hence the DMA buffer is ready when it's called.

Then, if we assure that the DMA buffer isn't changed and the PCM substream still exists during hrtimer callback (and/or its execution finishes), hrtimer callback can safely run without *any* lock. This is pretty easy. We just need to add appropriate calls to synchronize the finish of hrtimer (and tasklet) at each place that can change the DMA buffers.

Maybe codes tell better than words. The below is my untested patch.

Takashi

--- diff --git a/sound/drivers/pcsp/pcsp.c b/sound/drivers/pcsp/pcsp.c index 1899cf0..87219bf 100644 --- a/sound/drivers/pcsp/pcsp.c +++ b/sound/drivers/pcsp/pcsp.c @@ -96,7 +96,7 @@ static int __devinit snd_card_pcsp_probe(int devnum, struct device *dev) return -EINVAL;

hrtimer_init(&pcsp_chip.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); - pcsp_chip.timer.cb_mode = HRTIMER_CB_SOFTIRQ; + pcsp_chip.timer.cb_mode = HRTIMER_CB_IRQSAFE; pcsp_chip.timer.function = pcsp_do_timer;

card = snd_card_new(index, id, THIS_MODULE, 0); @@ -188,10 +188,8 @@ static int __devexit pcsp_remove(struct platform_device *dev)

static void pcsp_stop_beep(struct snd_pcsp *chip) { - spin_lock_irq(&chip->substream_lock); - if (!chip->playback_substream) - pcspkr_stop_sound(); - spin_unlock_irq(&chip->substream_lock); + pcsp_sync_stop(chip); + pcspkr_stop_sound(); }

#ifdef CONFIG_PM diff --git a/sound/drivers/pcsp/pcsp.h b/sound/drivers/pcsp/pcsp.h index 1d661f7..70533a3 100644 --- a/sound/drivers/pcsp/pcsp.h +++ b/sound/drivers/pcsp/pcsp.h @@ -77,6 +77,7 @@ struct snd_pcsp { extern struct snd_pcsp pcsp_chip;

extern enum hrtimer_restart pcsp_do_timer(struct hrtimer *handle); +extern void pcsp_sync_stop(struct snd_pcsp *chip);

extern int snd_pcsp_new_pcm(struct snd_pcsp *chip); extern int snd_pcsp_new_mixer(struct snd_pcsp *chip); diff --git a/sound/drivers/pcsp/pcsp_lib.c b/sound/drivers/pcsp/pcsp_lib.c index e341f3f..40f95f5 100644 --- a/sound/drivers/pcsp/pcsp_lib.c +++ b/sound/drivers/pcsp/pcsp_lib.c @@ -8,6 +8,7 @@

#include <linux/module.h> #include <linux/moduleparam.h> +#include <linux/interrupt.h> #include <sound/pcm.h> #include <asm/io.h> #include "pcsp.h" @@ -19,6 +20,22 @@ MODULE_PARM_DESC(nforce_wa, "Apply NForce chipset workaround "

#define DMIX_WANTS_S16 1

+/* + * Call snd_pcm_period_elapsed in a tasklet + * This avoids spinlock messes and long-running irq contexts + */ +static void pcsp_call_pcm_elapsed(unsigned long priv) +{ + if (atomic_read(&pcsp_chip.timer_active)) { + struct snd_pcm_substream *substream; + substream = pcsp_chip.playback_substream; + if (substream) + snd_pcm_period_elapsed(substream); + } +} + +static DECLARE_TASKLET(pcsp_pcm_tasklet, pcsp_call_pcm_elapsed, 0); + enum hrtimer_restart pcsp_do_timer(struct hrtimer *handle) { unsigned char timer_cnt, val; @@ -28,41 +45,23 @@ enum hrtimer_restart pcsp_do_timer(struct hrtimer *handle) struct snd_pcm_substream *substream; struct snd_pcm_runtime *runtime; struct snd_pcsp *chip = container_of(handle, struct snd_pcsp, timer); + unsigned long flags;

if (chip->thalf) { outb(chip->val61, 0x61); chip->thalf = 0; if (!atomic_read(&chip->timer_active)) - return HRTIMER_NORESTART; + goto stop; hrtimer_forward(&chip->timer, chip->timer.expires, ktime_set(0, chip->ns_rem)); return HRTIMER_RESTART; }

- spin_lock_irq(&chip->substream_lock); - /* Takashi Iwai says regarding this extra lock: - - If the irq handler handles some data on the DMA buffer, it should - do snd_pcm_stream_lock(). - That protects basically against all races among PCM callbacks, yes. - However, there are two remaining issues: - 1. The substream pointer you try to lock isn't protected _before_ - this lock yet. - 2. snd_pcm_period_elapsed() itself acquires the lock. - The requirement of another lock is because of 1. When you get - chip->playback_substream, it's not protected. - Keeping this lock while snd_pcm_period_elapsed() assures the substream - is still protected (at least, not released). And the other status is - handled properly inside snd_pcm_stream_lock() in - snd_pcm_period_elapsed(). - - */ - if (!chip->playback_substream) - goto exit_nr_unlock1; - substream = chip->playback_substream; - snd_pcm_stream_lock(substream); if (!atomic_read(&chip->timer_active)) - goto exit_nr_unlock2; + goto stop; + substream = chip->playback_substream; + if (!substream) + goto stop;

runtime = substream->runtime; fmt_size = snd_pcm_format_physical_width(runtime->format) >> 3; @@ -87,6 +86,8 @@ enum hrtimer_restart pcsp_do_timer(struct hrtimer *handle)

period_bytes = snd_pcm_lib_period_bytes(substream); buffer_bytes = snd_pcm_lib_buffer_bytes(substream); + + spin_lock_irqsave(&chip->substream_lock, flags); chip->playback_ptr += PCSP_INDEX_INC() * fmt_size; periods_elapsed = chip->playback_ptr - chip->period_ptr; if (periods_elapsed < 0) { @@ -102,18 +103,15 @@ enum hrtimer_restart pcsp_do_timer(struct hrtimer *handle) * or ALSA will BUG on us. */ chip->playback_ptr %= buffer_bytes;

- snd_pcm_stream_unlock(substream); - if (periods_elapsed) { - snd_pcm_period_elapsed(substream); chip->period_ptr += periods_elapsed * period_bytes; chip->period_ptr %= buffer_bytes; + tasklet_schedule(&pcsp_pcm_tasklet); } - - spin_unlock_irq(&chip->substream_lock); + spin_unlock_irqrestore(&chip->substream_lock, flags);

if (!atomic_read(&chip->timer_active)) - return HRTIMER_NORESTART; + goto stop;

chip->ns_rem = PCSP_PERIOD_NS(); ns = (chip->thalf ? PCSP_CALC_NS(timer_cnt) : chip->ns_rem); @@ -121,10 +119,7 @@ enum hrtimer_restart pcsp_do_timer(struct hrtimer *handle) hrtimer_forward(&chip->timer, chip->timer.expires, ktime_set(0, ns)); return HRTIMER_RESTART;

-exit_nr_unlock2: - snd_pcm_stream_unlock(substream); -exit_nr_unlock1: - spin_unlock_irq(&chip->substream_lock); + stop: return HRTIMER_NORESTART; }

@@ -164,26 +159,35 @@ static void pcsp_stop_playing(struct snd_pcsp *chip) spin_unlock(&i8253_lock); }

+/* + * Force to stop and sync the stream + */ +void pcsp_sync_stop(struct snd_pcsp *chip) +{ + local_irq_disable(); + pcsp_stop_playing(chip); + local_irq_enable(); + hrtimer_cancel(&chip->timer); + tasklet_kill(&pcsp_pcm_tasklet); +} + static int snd_pcsp_playback_close(struct snd_pcm_substream *substream) { struct snd_pcsp *chip = snd_pcm_substream_chip(substream); #if PCSP_DEBUG printk(KERN_INFO "PCSP: close called\n"); #endif - if (atomic_read(&chip->timer_active)) { - printk(KERN_ERR "PCSP: timer still active\n"); - pcsp_stop_playing(chip); - } - spin_lock_irq(&chip->substream_lock); + pcsp_sync_stop(chip); chip->playback_substream = NULL; - spin_unlock_irq(&chip->substream_lock); return 0; }

static int snd_pcsp_playback_hw_params(struct snd_pcm_substream *substream, struct snd_pcm_hw_params *hw_params) { + struct snd_pcsp *chip = snd_pcm_substream_chip(substream); int err; + pcsp_sync_stop(chip); err = snd_pcm_lib_malloc_pages(substream, params_buffer_bytes(hw_params)); if (err < 0) @@ -193,9 +197,11 @@ static int snd_pcsp_playback_hw_params(struct snd_pcm_substream *substream,

static int snd_pcsp_playback_hw_free(struct snd_pcm_substream *substream) { + struct snd_pcsp *chip = snd_pcm_substream_chip(substream); #if PCSP_DEBUG printk(KERN_INFO "PCSP: hw_free called\n"); #endif + pcsp_sync_stop(chip); return snd_pcm_lib_free_pages(substream); }

@@ -211,6 +217,7 @@ static int snd_pcsp_playback_prepare(struct snd_pcm_substream *substream) snd_pcm_lib_period_bytes(substream), substream->runtime->periods); #endif + pcsp_sync_stop(chip); chip->playback_ptr = 0; chip->period_ptr = 0; return 0; @@ -241,7 +248,11 @@ static snd_pcm_uframes_t snd_pcsp_playback_pointer(struct snd_pcm_substream *substream) { struct snd_pcsp *chip = snd_pcm_substream_chip(substream); - return bytes_to_frames(substream->runtime, chip->playback_ptr); + unsigned int pos; + spin_lock(&chip->substream_lock); + pos = chip->playback_ptr; + spin_unlock(&chip->substream_lock); + return bytes_to_frames(substream->runtime, pos); }

static struct snd_pcm_hardware snd_pcsp_playback = { @@ -278,9 +289,7 @@ static int snd_pcsp_playback_open(struct snd_pcm_substream *substream) return -EBUSY; } runtime->hw = snd_pcsp_playback; - spin_lock_irq(&chip->substream_lock); chip->playback_substream = substream; - spin_unlock_irq(&chip->substream_lock); return 0; }

Stas Sergeev

8:40 p.m.

Hello.

Takashi Iwai wrote:

...

Well, let's take a look back how the callbacks are called. The callbacks that are called during PCM running are the following:

hrtimer callback gets called from hrtimer: This updates the pcsp port and updates the PCM pointer.

PCM pointer callback: This returns the current PCM pointer

So, actually between these calls, we have only one thing to protect - the PCM pointer. The DMA-buffer readiness is necessary only for hrtimer callback.

OK.

...

And, in addition, we need to call somewhere snd_pcm_period_elapsed() when the period is finished. This can be done in a tasklet.

What is the difference between calling everything in softirq (current solution) and calling snd_pcm_period_elapsed() in a tasklet, with the rest in a hardirq context? What will this solve?

...

Now, remember that hrtimer is started only via trigger start callback. That is, hrtimer callback isn called first after PCM is actually started; hence the DMA buffer is ready when it's called.

Then, if we assure that the DMA buffer isn't changed and the PCM substream still exists during hrtimer callback (and/or its execution finishes), hrtimer callback can safely run without *any* lock.

OK so how do you protect the PCM pointer after all? Even in your example you use the chip->substream_lock for that (why?), so it doesn't look like you run a callback without any lock at all.

...

This is pretty easy. We just need to add appropriate calls to synchronize the finish of hrtimer (and tasklet) at each place that can change the DMA buffers.

I wonder if this is SMP-safe (comments below).

...

+/*

Force to stop and sync the stream

*/

+void pcsp_sync_stop(struct snd_pcsp *chip) +{

local_irq_disable();

So what will happen if the timer callback is executing on another CPU at that point?

...

pcsp_stop_playing(chip);

This will set chip->timer_active to 0, but who knows if the callback on another CPU have already passed the check or not?

...

local_irq_enable();

hrtimer_cancel(&chip->timer);

tasklet_kill(&pcsp_pcm_tasklet);

Ouch, IIRC all three are discouraged to use...

I (hopefully) now understand the logic you are trying to explain, and I don't think it is completely trouble-free (I may be wrong of course, but certainly it won't be straightforward for every reader why is that code correct). So the question is still why not to simply use snd_pcm_stream_lock(). And if right now it is barely usefull, then why it can't be improved to simply not require a second lock that guards a substream itself?

Takashi Iwai

28 May 28 May

1:13 p.m.

At Tue, 27 May 2008 21:40:13 +0400, Stas Sergeev wrote:

...

Hello.

Takashi Iwai wrote:

...
Well, let's take a look back how the callbacks are called. The callbacks that are called during PCM running are the following:

hrtimer callback gets called from hrtimer: This updates the pcsp port and updates the PCM pointer.

PCM pointer callback: This returns the current PCM pointer

So, actually between these calls, we have only one thing to protect - the PCM pointer. The DMA-buffer readiness is necessary only for hrtimer callback.

OK.

...
And, in addition, we need to call somewhere snd_pcm_period_elapsed() when the period is finished. This can be done in a tasklet.

What is the difference between calling everything in softirq (current solution) and calling snd_pcm_period_elapsed() in a tasklet, with the rest in a hardirq context? What will this solve?

Think just like the old-good bottomhalf. The register flip with hrtimer callback is definitely a fast path, so it should be called inside the hard irq.

OTOH, the update of the whole PCM via snd_pcm_period_elapsed() is a slow path. Especially in the case of hrtimer, it brings the lock mess, too. Thus it's better to put this into a softirq.

Issuing softirq at each hrtimer callback isn't optimal from performance perspective. For RT-kernels, it doesn't matter, but for normal kernels, it does.

...

...
Now, remember that hrtimer is started only via trigger start callback. That is, hrtimer callback isn called first after PCM is actually started; hence the DMA buffer is ready when it's called.

Then, if we assure that the DMA buffer isn't changed and the PCM substream still exists during hrtimer callback (and/or its execution finishes), hrtimer callback can safely run without *any* lock.

OK so how do you protect the PCM pointer after all? Even in your example you use the chip->substream_lock for that (why?),

Because the pointer callback and the hrtimer callback can race about chip->playback_ptr. The lock is simply to protect it. Don't take it as a PCM substream lock.

It can be implemened even without a lock, e.g. using a simple barrier, though. I was just too lazy.

...

so it doesn't look like you run a callback without any lock at all.

You don't need a lock for the PCM callback itself. The pointer callback is called inside the PCM substream lock.

...

...
This is pretty easy. We just need to add appropriate calls to synchronize the finish of hrtimer (and tasklet) at each place that can change the DMA buffers.

I wonder if this is SMP-safe (comments below).

...
+/*

Force to stop and sync the stream

*/

+void pcsp_sync_stop(struct snd_pcsp *chip) +{

local_irq_disable();

So what will happen if the timer callback is executing on another CPU at that point?

That doesn't matter, because this irqsave is needed just for i8253_lock in pcsp_stop_playing() (as it doesn't save irq).

...

...

pcsp_stop_playing(chip);

This will set chip->timer_active to 0, but who knows if the callback on another CPU have already passed the check or not?

If it passed, then we simply wait until the next hrtimer callback finished. That's what hrtimer_cancel() does. So, after that, it is guaranteed that neither hrtimer and tasklet remains.

...

...

local_irq_enable();

hrtimer_cancel(&chip->timer);

tasklet_kill(&pcsp_pcm_tasklet);

Ouch, IIRC all three are discouraged to use...

Really? Any pointer?

...

I (hopefully) now understand the logic you are trying to explain, and I don't think it is completely trouble-free (I may be wrong of course, but certainly it won't be straightforward for every reader why is that code correct). So the question is still why not to simply use snd_pcm_stream_lock().

Because the substream lock isn't necessary! It just makes things more complicated -- if we take substream lock, AB/BA deadlocks occur again.

...

And if right now it is barely usefull, then why it can't be improved to simply not require a second lock that guards a substream itself?

Because it won't be simpler, as you already tried.

Takashi

Stas Sergeev

11:08 p.m.

Hello.

Takashi Iwai wrote:

...

OTOH, the update of the whole PCM via snd_pcm_period_elapsed() is a slow path. Especially in the case of hrtimer, it brings the lock mess, too.

Unless the entire callback is done in a softirq, as it currently is, no?

...

Issuing softirq at each hrtimer callback isn't optimal from performance perspective. For RT-kernels, it doesn't matter, but for normal kernels, it does.

Oh. But in that case I guess it would really be better if just snd_pcm_period_elapsed() would raise a softirq. That's something you proposed in the past IIRC. Is this acceptable to have that change not in the driver, or do you think it is snd-pcsp-specific, and no other driver will benefit? Btw, could you please point me to where to read about the softirq expense? I thought they are very cheap, almost zero overhead. And yes, if the sotfirqs are expensive, then indeed I see why you propose all these tricks you do... But if they are not and we can keep the entire callback in a softirq as it currently is, then there might be a simpler solutions.

...

...
the chip->substream_lock for that (why?),

Because the pointer callback and the hrtimer callback can race about

I asked only why have you chosen chip->substream_lock - IIRC it was intended not for this.

...

...
...

pcsp_stop_playing(chip);

This will set chip->timer_active to 0, but who knows if the callback on another CPU have already passed the check or not?

If it passed, then we simply wait until the next hrtimer callback finished. That's what hrtimer_cancel() does. So, after that, it is guaranteed that neither hrtimer and tasklet remains.

Yes, but the state doesn't seem to be guaranteed. Running the handler after the pcsp_stop_playing(chip) called, looks nasty. The timer will be in a different mode. Nothing bad will likely to happen, but... esp with the nforce workaround, where the callback is supposed to be called even amount of times. Doesn't look very good to me, even if nothing bad can happen...

...

Because the substream lock isn't necessary! It just makes things more complicated -- if we take substream lock, AB/BA deadlocks occur again.

Even if the callback is called within the softirq?

...

...
And if right now it is barely usefull, then why it can't be improved to simply not require a second lock that guards a substream itself?

Because it won't be simpler, as you already tried.

I haven't tried that. :) I have tried the snd_pcm_stream_lock(), and yes, its nasty. But to me its nastiness is only because it needs another lock to protect the substream itself. So it is barely usefull. But it can be improved (or an alternative provided), the way the second lock is not needed. Then I will need only that - a single lock, nothing more. This might not be even difficult to do at all.

Takashi Iwai

29 May 29 May

9:03 a.m.

At Thu, 29 May 2008 00:08:32 +0400, Stas Sergeev wrote:

...

Hello.

Takashi Iwai wrote:

...
OTOH, the update of the whole PCM via snd_pcm_period_elapsed() is a slow path. Especially in the case of hrtimer, it brings the lock mess, too.

Unless the entire callback is done in a softirq, as it currently is, no?

...
Issuing softirq at each hrtimer callback isn't optimal from performance perspective. For RT-kernels, it doesn't matter, but for normal kernels, it does.

Oh. But in that case I guess it would really be better if just snd_pcm_period_elapsed() would raise a softirq. That's something you proposed in the past IIRC. Is this acceptable to have that change not in the driver, or do you think it is snd-pcsp-specific, and no other driver will benefit?

It would be good in general. However, better to be the next step after changing pcsp.

...

Btw, could you please point me to where to read about the softirq expense? I thought they are very cheap, almost zero overhead.

One big disadvantage of the softirq is its (possible) timing inaccuracy. (Note that this is about the normal non-preempt kernel.) The thing like pcsp driver does, flipping the bits at a high rate, isn't for softirq per se.

...

And yes, if the sotfirqs are expensive, then indeed I see why you propose all these tricks you do... But if they are not and we can keep the entire callback in a softirq as it currently is, then there might be a simpler solutions.

...
...
the chip->substream_lock for that (why?),

Because the pointer callback and the hrtimer callback can race about

I asked only why have you chosen chip->substream_lock - IIRC it was intended not for this.

...
...
...

pcsp_stop_playing(chip);

This will set chip->timer_active to 0, but who knows if the callback on another CPU have already passed the check or not?

If it passed, then we simply wait until the next hrtimer callback finished. That's what hrtimer_cancel() does. So, after that, it is guaranteed that neither hrtimer and tasklet remains.

Yes, but the state doesn't seem to be guaranteed. Running the handler after the pcsp_stop_playing(chip) called, looks nasty.

This is there because pcsp_stop_playing() doesn't sync. Both hrtimer_cancel() and tasklet_kill() are basically just for sync, not for killing the object.

...

The timer will be in a different mode. Nothing bad will likely to happen, but... esp with the nforce workaround, where the callback is supposed to be called even amount of times. Doesn't look very good to me, even if nothing bad can happen...

Well, it can be done in another away if you are so nervous about it.

Simply add chip->timer_running atomic_t and reset it when the callback finished with HRTIMER_NORESTART. In another side, wait until chip->timer_running becomes zero. But, it's just an uneeded addition of the code.

...

...
Because the substream lock isn't necessary! It just makes things more complicated -- if we take substream lock, AB/BA deadlocks occur again.

Even if the callback is called within the softirq?

The runtime callbacks, pointer and trigger, are called already inside they substream lock held by the PCM core. Thus it doesn't matter whether it's in a hard or a softirq.

...

...
...
And if right now it is barely usefull, then why it can't be improved to simply not require a second lock that guards a substream itself?

Because it won't be simpler, as you already tried.

I haven't tried that. :) I have tried the snd_pcm_stream_lock(), and yes, its nasty. But to me its nastiness is only because it needs another lock to protect the substream itself.

No, you don't need to protect the substream at all. Remember that hrtimer callback requires only the DMA-buffer pointer during its operation. No other thing is needed from the substream. This can be guaranteed with a simpler logic.

...

So it is barely usefull. But it can be improved (or an alternative provided), the way the second lock is not needed. Then I will need only that - a single lock, nothing more. This might not be even difficult to do at all.

Well, my version is almost i8259_lock only. As mentioned, chip->substream_lock can be removed safely with a barrier.

Takashi

Stas Sergeev

8:07 p.m.

Hello.

Takashi Iwai wrote:

...

...
really be better if just snd_pcm_period_elapsed() would raise a softirq. That's something you proposed in the past IIRC. Is this acceptable to have that change not in the driver, or do you think it is snd-pcsp-specific, and no other driver will benefit?

It would be good in general. However, better to be the next step after changing pcsp.

But why do you need both changes? If you are going to change snd_pcm_period_elapsed() itself, then how would the snd-pcsp change be justified?

...

One big disadvantage of the softirq is its (possible) timing inaccuracy. (Note that this is about the normal non-preempt kernel.) The thing like pcsp driver does, flipping the bits at a high rate, isn't for softirq per se.

OK, I see.

...

Well, it can be done in another away if you are so nervous about it. Simply add chip->timer_running atomic_t and reset it when the callback finished with HRTIMER_NORESTART. In another side, wait until chip->timer_running becomes zero. But, it's just an uneeded addition of the code.

But for some reasons it looks safer to me...

...

...
...
Because the substream lock isn't necessary! It just makes things more complicated -- if we take substream lock, AB/BA deadlocks occur again.

Even if the callback is called within the softirq?

The runtime callbacks, pointer and trigger, are called already inside they substream lock held by the PCM core. Thus it doesn't matter whether it's in a hard or a softirq.

The difference is that in a softirq, the hrtimer code doesn't take a lock. The AB/BA problem was because of the hrtimer' lock, which is not there with the softirq callback. So what exactly A and B do you mean here?

...

...
and yes, its nasty. But to me its nastiness is only because it needs another lock to protect the substream itself.

No, you don't need to protect the substream at all.

But that was suggested by you in the past. :) But adding a syncronization point should be better I agree. I'll see if some minimal change is possible that will look sane to you and wont make me ask 100 questions about its safety at the same time. :)

Takashi Iwai

2 Jun 2 Jun

12:36 p.m.

At Thu, 29 May 2008 21:07:55 +0400, Stas Sergeev wrote:

...

Hello.

Takashi Iwai wrote:

...
...
really be better if just snd_pcm_period_elapsed() would raise a softirq. That's something you proposed in the past IIRC. Is this acceptable to have that change not in the driver, or do you think it is snd-pcsp-specific, and no other driver will benefit?

It would be good in general. However, better to be the next step after changing pcsp.

But why do you need both changes? If you are going to change snd_pcm_period_elapsed() itself, then how would the snd-pcsp change be justified?

snd-pcsp needs anyway the change to HRTIMER_IRQSAFE again.

And, moving snd_pcm_period_elapsed() to a tasklet isn't needed for RT kernels. It's basically a good thing, but just not mandatory like snd-pcsp.

...

...
One big disadvantage of the softirq is its (possible) timing inaccuracy. (Note that this is about the normal non-preempt kernel.) The thing like pcsp driver does, flipping the bits at a high rate, isn't for softirq per se.

OK, I see.

...
Well, it can be done in another away if you are so nervous about it. Simply add chip->timer_running atomic_t and reset it when the callback finished with HRTIMER_NORESTART. In another side, wait until chip->timer_running becomes zero. But, it's just an uneeded addition of the code.

But for some reasons it looks safer to me...

A spinlock is like a gun: you can protect yourself with it, and you can shoot others with it :)

...

...
...
...
Because the substream lock isn't necessary! It just makes things more complicated -- if we take substream lock, AB/BA deadlocks occur again.

Even if the callback is called within the softirq?

The runtime callbacks, pointer and trigger, are called already inside they substream lock held by the PCM core. Thus it doesn't matter whether it's in a hard or a softirq.

The difference is that in a softirq, the hrtimer code doesn't take a lock.

But it may cause automatically latencies for beep controls. The latency isn't a big issue for snd_pcm_period_elapsed() (unless you want realtime thingy with snd-pcsp), but for beep controls, it does.

...

The AB/BA problem was because of the hrtimer' lock, which is not there with the softirq callback. So what exactly A and B do you mean here?

That's what I meant for, too. Note that in the above text, I presume that the hrtimer is called in HRTIMER_IRQSAFE.

...

...
...
and yes, its nasty. But to me its nastiness is only because it needs another lock to protect the substream itself.

No, you don't need to protect the substream at all.

But that was suggested by you in the past. :)

Yeah, that's my failure, obviously. I just thought your hrtimer callback must be constantly called and hard to sync.

...

But adding a syncronization point should be better I agree. I'll see if some minimal change is possible that will look sane to you and wont make me ask 100 questions about its safety at the same time. :)

OK.

thanks,

Takashi

Stas Sergeev

21 Aug 21 Aug

11:06 a.m.

Hello.

Takashi Iwai wrote:

...

Maybe codes tell better than words. The below is my untested patch.

I finally got around to have a look at the patch and gave it some testing. I think it is the best solution for what we currently have, but I see a few simplifications to it if some simple changes to the alsa core are done.

1. We can have only one sync point instead of many, and that should be the stop callback. It is not possible right now because the stop callback can be asynchronous. This can be solved by providing the separate callback to be called from snd_pcm_period_elapsed(). I already proposed this a few years ago and made a patch, but it was rejected in favour of the current excessive locking we have in pcsp. But now as we are at simplifying it, I wonder if that idea can be re-evaluated. Note that the change was not intrusive at all. The new callback was optional. If it is not provided, then the old logic applies.

...

...
...
+/*

...
...

Force to stop and sync the stream

*/

+void pcsp_sync_stop(struct snd_pcsp *chip) +{

local_irq_disable();

So what will happen if the timer callback is executing on another CPU at that point?

That doesn't matter, because this irqsave is needed just for i8253_lock in pcsp_stop_playing() (as it doesn't save irq).

local_irq_disable() can be avoided as well by introducing the separate async stop callback.

3. The AB/BA locking can be avoided without the use of the softirq I think. Right now snd_pcm_period_elapsed() takes the pcm_stream_lock() because it calls the stop callback which needs that lock. But, having the separate stop callback, snd_pcm_period_elapsed() may probably just take some other lock for its internal needs, and take the pcm_stream_lock() only if the separate stop callback is not provided.

The above changes look fairly simple, and I even had the patch for 1 already. The changes to snd-pcsp will became much simpler with them I beleive. And having the possibility to provide a separate callback for the IRQ context looks like the really must-have thing to me from the very beginning. I even remember the old OSS callbacks which were passed the flag to indicate whether or not they are called from an IRQ context. I prefer the separate callback to the flag, but in alsa there is currently neither.

Takashi Iwai

12:06 p.m.

At Thu, 21 Aug 2008 12:06:07 +0400, Stas Sergeev wrote:

...

Hello.

Takashi Iwai wrote:

...
Maybe codes tell better than words. The below is my untested patch.

I finally got around to have a look at the patch and gave it some testing. I think it is the best solution for what we currently have, but I see a few simplifications to it if some simple changes to the alsa core are done.

We can have only one sync point

instead of many, and that should be the stop callback. It is not possible right now because the stop callback can be asynchronous. This can be solved by providing the separate callback to be called from snd_pcm_period_elapsed(). I already proposed this a few years ago and made a patch, but it was rejected in favour of the current excessive locking we have in pcsp. But now as we are at simplifying it, I wonder if that idea can be re-evaluated. Note that the change was not intrusive at all. The new callback was optional. If it is not provided, then the old logic applies.

I don't remember exactly, but adding a new callback doesn't sound so perfect. Well, it's just one pointer, but it's added all over places.

Anyway, could you repost it? Then we can discuss about it more concretely.

...

...
...
...
+/*

...
...

Force to stop and sync the stream

*/

+void pcsp_sync_stop(struct snd_pcsp *chip) +{

local_irq_disable();

So what will happen if the timer callback is executing on another CPU at that point?

That doesn't matter, because this irqsave is needed just for i8253_lock in pcsp_stop_playing() (as it doesn't save irq).

local_irq_disable() can be avoided as well by introducing the separate async stop callback.

It'll be nice, then.

...

The AB/BA locking can be avoided

without the use of the softirq I think. Right now snd_pcm_period_elapsed() takes the pcm_stream_lock() because it calls the stop callback which needs that lock. But, having the separate stop callback, snd_pcm_period_elapsed() may probably just take some other lock for its internal needs, and take the pcm_stream_lock() only if the separate stop callback is not provided.

The above changes look fairly simple, and I even had the patch for 1 already. The changes to snd-pcsp will became much simpler with them I beleive. And having the possibility to provide a separate callback for the IRQ context looks like the really must-have thing to me from the very beginning. I even remember the old OSS callbacks which were passed the flag to indicate whether or not they are called from an IRQ context. I prefer the separate callback to the flag, but in alsa there is currently neither.

Well, the IRQ context and locking are different things. A known problem with PCM substream lock is that it's to be used for multiple bound streams as well. My concern is whether this can be still avoided well by your changes.

But, in general, I'd love to clean up the PCM stuff, especially these lock messes. Let's cook a stew a bit more.

thanks,

Takashi

Stas Sergeev

1:25 p.m.

Hello.

Takashi Iwai wrote:

...

I don't remember exactly, but adding a new callback doesn't sound so perfect. Well, it's just one pointer, but it's added all over places.

Well, it can be a concern only as long as the callback is entirely useless. :) If it does the right thing, then it certainly worth a pointer.

...

Anyway, could you repost it? Then we can discuss about it more concretely.

Attached, and the entire message is at the bottom.

...

A known problem with PCM substream lock is that it's to be used for multiple bound streams as well. My concern is whether this can be still avoided well by your changes.

That's why I haven't made the patch for the PCM locking change. Someone else should know better how to make it the safe way.

Below is an old message with the patch. --- Hi.

I was trying to get the locking right in my pcsp driver, and I have the following problem. I am using the chip->playback_substream in the IRQ handler context. To prevent the chance of closing the substream on another CPU while the IRQ handler still messes with it, I decided to protect it with the spinlock. So I acquire the same lock both in an IRQ handler and in the pcsp_stop_playing() routine. That way I can be sure they never step on each other's feet, even on SMP. The problem is though that ALSA calls the trigger() function to stop the playback both within the task context and within the IRQ context (later is via snd_pcm_period_elapsed(), which is called from an IRQ context). So the above locking scheme does not work, because trigger() being called from an IRQ context, takes the already taken lock. I can drop the lock before calling snd_pcm_period_elapsed(), but I beleive this opens a (very small) race condition. What can be a solution to this problem? I think it is never too good to call the same callbacks from both the task and IRQ contexts, but the ALSA does this. In other words, can something like the attached patch ever be applied, or am I misunderstanding the problem completely? The patch adds the separate callback, which is intended to be called from an IRQ context only. Does this look like the right solution? ---

Takashi Iwai

20 Oct 20 Oct

4:05 p.m.

Hi,

whipping this old horse again...

At Thu, 21 Aug 2008 14:25:36 +0400, Stas Sergeev wrote:

...

Hello.

Takashi Iwai wrote:

...
I don't remember exactly, but adding a new callback doesn't sound so perfect. Well, it's just one pointer, but it's added all over places.

Well, it can be a concern only as long as the callback is entirely useless. :) If it does the right thing, then it certainly worth a pointer.

...
Anyway, could you repost it? Then we can discuss about it more concretely.

Attached, and the entire message is at the bottom.

...
A known problem with PCM substream lock is that it's to be used for multiple bound streams as well. My concern is whether this can be still avoided well by your changes.

That's why I haven't made the patch for the PCM locking change. Someone else should know better how to make it the safe way.

Indeed, the async trigger is nice to have in the common place. However, the change wouldn't be as trivial as it sounds, as you mentioned. By async nature, there can be a transition phase between the XRUN and STOP, which can cause races.

A possible solution is to introduce an intermediate sound state (e.g. SOUND_STATE_TRIGGER_PENDING), but this anyway requires a major rewrite in the PCM core code.

So, I first applied my patch for 2.6.29 (not 28) for further testing. Let's consider more on async trigger later...

thanks,

Takashi

...

Below is an old message with the patch.

Hi.

I was trying to get the locking right in my pcsp driver, and I have the following problem. I am using the chip->playback_substream in the IRQ handler context. To prevent the chance of closing the substream on another CPU while the IRQ handler still messes with it, I decided to protect it with the spinlock. So I acquire the same lock both in an IRQ handler and in the pcsp_stop_playing() routine. That way I can be sure they never step on each other's feet, even on SMP. The problem is though that ALSA calls the trigger() function to stop the playback both within the task context and within the IRQ context (later is via snd_pcm_period_elapsed(), which is called from an IRQ context). So the above locking scheme does not work, because trigger() being called from an IRQ context, takes the already taken lock. I can drop the lock before calling snd_pcm_period_elapsed(), but I beleive this opens a (very small) race condition. What can be a solution to this problem? I think it is never too good to call the same callbacks from both the task and IRQ contexts, but the ALSA does this. In other words, can something like the attached patch ever be applied, or am I misunderstanding the problem completely? The patch adds the separate callback, which is intended to be called from an IRQ context only. Does this look like the right solution?

[2 as_stop.diff <text/x-patch (7bit)>] --- linux-2.6.21/sound/core/pcm_native.c.old 2007-05-12 11:30:06.000000000 +0400 +++ linux-2.6.21/sound/core/pcm_native.c 2007-05-20 01:22:45.000000000 +0400 @@ -922,8 +922,14 @@ static int snd_pcm_do_stop(struct snd_pcm_substream *substream, int state) { if (substream->runtime->trigger_master == substream &&
   snd_pcm_running(substream))
substream->ops->trigger(substream, SNDRV_PCM_TRIGGER_STOP);
   snd_pcm_running(substream)) {
if (substream->ops->async_stop)
	/* the driver provides a separate callback
	 * for the IRQ context */
	substream->ops->async_stop(substream);
else
	substream->ops->trigger(substream, SNDRV_PCM_TRIGGER_STOP);
} return 0; /* unconditonally stop all substreams */
}

--- linux-2.6.21/include/sound/pcm.h.old 2007-05-12 11:30:01.000000000 +0400 +++ linux-2.6.21/include/sound/pcm.h 2007-05-20 01:21:29.000000000 +0400 @@ -68,6 +68,7 @@ int (*hw_free)(struct snd_pcm_substream *substream); int (*prepare)(struct snd_pcm_substream *substream); int (*trigger)(struct snd_pcm_substream *substream, int cmd);

int (*async_stop)(struct snd_pcm_substream *substream); snd_pcm_uframes_t (*pointer)(struct snd_pcm_substream *substream); int (*copy)(struct snd_pcm_substream *substream, int channel, snd_pcm_uframes_t pos,

Stas Sergeev

21 Oct 21 Oct

12:51 a.m.

Hi.

Takashi Iwai wrote:

...

Indeed, the async trigger is nice to have in the common place. However, the change wouldn't be as trivial as it sounds, as you mentioned. By async nature, there can be a transition phase between the XRUN and STOP, which can cause races.

Could you please elaborate on how my proposed patch could possibly affect that? It basically doesn't do anything at all except providing one more callback for what would otherwise had to be done in a trigger() callback anyway. If there is a race with that patch, then I pretty much suspect it was with an old code too. I can't imagine any possible change. What have changed?

Takashi Iwai

9:27 a.m.

At Tue, 21 Oct 2008 01:51:57 +0400, Stas Sergeev wrote:

...

Hi.

Takashi Iwai wrote:

...
Indeed, the async trigger is nice to have in the common place. However, the change wouldn't be as trivial as it sounds, as you mentioned. By async nature, there can be a transition phase between the XRUN and STOP, which can cause races.

Could you please elaborate on how my proposed patch could possibly affect that? It basically doesn't do anything at all except providing one more callback for what would otherwise had to be done in a trigger() callback anyway. If there is a race with that patch, then I pretty much suspect it was with an old code too. I can't imagine any possible change. What have changed?

The PCM status is changed immediately after calling trigger(_async) callback XRUN or SETUP. That is, you can start the stream again soon after snd_pcm_stop(). In the case of async operation, the hardware may be likely still running, but the PCM core doesn't know about it and allows you to restart the stream. So it's racy, at least from the PCM core viewpoint.

Usually async operation has a way to indicate the pending status, either setting the state to WORKING/PENDING, or having an additional flag. In either way, we need the change in the PCM core side.

Takashi

Stas Sergeev

10:08 a.m.

Hi.

Takashi Iwai wrote:

...

The PCM status is changed immediately after calling trigger(async) callback XRUN or SETUP. That is, you can start the stream again soon after snd_pcm_stop(). In the case of async operation, the hardware may be likely still running, but the PCM core doesn't know about it and allows you to restart the stream. So it's racy, at least from the PCM core viewpoint.

OK but how does _my patch_ affects this? Before, the trigger() callback was called both synchronously and asynchronously. My patch only provides the callback to make it possible to separate the sync and async parts. It doesn't do anything more. It doesn't change anything at all. So how could exactly that patch introduce the race you mentioned? There was already an async invocation of trigger() callback. I wanted to add just a callback under different name for that. What could my patch possibly change? How does it _introduce_ the race?

Takashi Iwai

10:16 a.m.

At Tue, 21 Oct 2008 11:08:17 +0400, Stas Sergeev wrote:

...

Hi.

Takashi Iwai wrote:

...
The PCM status is changed immediately after calling trigger(async) callback XRUN or SETUP. That is, you can start the stream again soon after snd_pcm_stop(). In the case of async operation, the hardware may be likely still running, but the PCM core doesn't know about it and allows you to restart the stream. So it's racy, at least from the PCM core viewpoint.

OK but how does _my patch_ affects this?

No, your patch does essentially NOTHING by itself. However...

...

Before, the trigger() callback was called both synchronously and asynchronously. My patch only provides the callback to make it possible to separate the sync and async parts. It doesn't do anything more. It doesn't change anything at all. So how could exactly that patch introduce the race you mentioned? There was already an async invocation of trigger() callback. I wanted to add just a callback under different name for that. What could my patch possibly change? How does it _introduce_ the race?

Because the async trigger itself can be racy easily, there is no merit to add a common callback until we have a proper handling of delayed triggers in the PCM core side.

Takashi

6096

Age (days ago)

6252

Last active (days ago)

List overview

Download

22 comments

2 participants

tags (0)

participants (2)

Stas Sergeev
Takashi Iwai