On Fri, 09 Sep 2022 17:45:25 +0200, Ville Syrjälä wrote:
> Hi Takashi,
> commit 7206998f578d ("ALSA: hda: Fix potential deadlock at codec unbinding") introduced a problem on at least one of my older machines.
> The problem happens when hda_codec_driver_remove() encounters a codec without any pcms (and thus the refcount is 1) and tries to call refcount_dec(). Turns out refcount_dec() doesn't like to be used for dropping the refcount to 0, and instead it spews a warning and does its saturate thing. The subsequent wait_event() is then permanently stuck waiting on the saturated refcount.
> I've definitely seen the same kind of pattern used elsewhere in the kernel as well, so the fact that refcount_t can't be used to implement it is a bit of a surprise to me. I guess most other places still use atomic_t instead.
Does the patch below work around it? It seems to be a subtle difference between refcount_dec() and refcount_dec_and_test().
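To illustrate the difference, here is a simplified userspace model of the two helpers. It is only a sketch of the semantics as described above, not the kernel implementation: the model_* names and MODEL_REFCOUNT_SATURATED are made up for this example, and a plain int stands in for the atomic counter, so it is single-threaded only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's saturation value. */
#define MODEL_REFCOUNT_SATURATED (INT_MIN / 2)

/* Plain int instead of an atomic: single-threaded illustration only. */
struct model_refcount {
	int refs;
};

/* Models refcount_dec(): dropping the last reference is treated as a bug. */
static void model_refcount_dec(struct model_refcount *r)
{
	assert(r != NULL);
	int old = r->refs--;

	if (old <= 1) {
		/* The counter would reach 0 (or below): warn and saturate. */
		fprintf(stderr, "model: refcount_dec hit 0, saturating\n");
		r->refs = MODEL_REFCOUNT_SATURATED;
	}
}

/* Models refcount_dec_and_test(): reaching 0 is the expected "last put". */
static bool model_refcount_dec_and_test(struct model_refcount *r)
{
	assert(r != NULL);
	int old = r->refs--;

	if (old == 1)
		return true;	/* we dropped the last reference */
	if (old <= 0) {
		/* Decrement of an already-dead counter: warn and saturate. */
		fprintf(stderr, "model: underflow, saturating\n");
		r->refs = MODEL_REFCOUNT_SATURATED;
	}
	return false;
}
```

Starting from refs == 1, model_refcount_dec() leaves the counter at the saturated value, so a wait condition of the form "counter reads 0" never becomes true. model_refcount_dec_and_test() returns true in the same situation and leaves the counter at 0, which is why the wait can simply be skipped in that case.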
thanks,
Takashi
-- 8< --
--- a/sound/pci/hda/hda_bind.c
+++ b/sound/pci/hda/hda_bind.c
@@ -157,10 +157,11 @@ static int hda_codec_driver_remove(struct device *dev)
 		return codec->bus->core.ext_ops->hdev_detach(&codec->core);
 	}
 
-	refcount_dec(&codec->pcm_ref);
-	snd_hda_codec_disconnect_pcms(codec);
-	snd_hda_jack_tbl_disconnect(codec);
-	wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
+	if (!refcount_dec_and_test(&codec->pcm_ref)) {
+		snd_hda_codec_disconnect_pcms(codec);
+		snd_hda_jack_tbl_disconnect(codec);
+		wait_event(codec->remove_sleep, !refcount_read(&codec->pcm_ref));
+	}
 	snd_power_sync_ref(codec->bus->card);
 
 	if (codec->patch_ops.free)