[alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
Hi Takashi,
We're trying to reduce the HD-A driver initialization time when more than one codecs are connected to the bus, but are blocked. Would you please share some advices on this?
Usually, there is one HD-A controller connecting to two codecs: one on-board codec and one integrated display codec. During initialization, the codecs are created and configured in a serial way.
Creating a codec may cost 6~20ms, and then building controls make cost about 15~30ms. So I had thought it's helpful to build controls for different codecs in parallel in function snd_hda_build_controls(), as we did in driver suspend/resume.
But the test shows this doesn't work:
(1) Using a workqueue without WQ_UNBOUND, the work items, which calls snd_hda_codec_build_controls(), are processed one after one
(I think because there is no explicit sleep in the work function). So the total time does not change.
(2) Using an unbouned work queue for the bus or using separate kthreads for each codec, the work items for different codecs are processed in parallel.
However, the total time is still the sum of two codec as that without parallel. I think that because snd_hda_codec_read() are used during initialization, the command execution on bus is serialized in the low level. So even if we parallel initializing two codecs in high level, commands on bus are still processed one by one so the total time cannot reduce.
We're blocked here. Is there any other means to reduce the time?
Thanks Mengdong
At Thu, 28 Nov 2013 14:57:22 +0000, Lin, Mengdong wrote:
Hi Takashi,
We're trying to reduce the HD-A driver initialization time when more than one codecs are connected to the bus, but are blocked. Would you please share some advices on this?
Usually, there is one HD-A controller connecting to two codecs: one on-board codec and one integrated display codec. During initialization, the codecs are created and configured in a serial way.
Creating a codec may cost 6~20ms, and then building controls make cost about 15~30ms. So I had thought it's helpful to build controls for different codecs in parallel in function snd_hda_build_controls(), as we did in driver suspend/resume.
But the test shows this doesn't work:
(1) Using a workqueue without WQ_UNBOUND, the work items, which calls snd_hda_codec_build_controls(), are processed one after one
You have to synchronize before starting building PCMs and controls. That is, you can parallelize snd_hda_codec_configure() calls, but the rest has to be serialized.
(I think because there is no explicit sleep in the work function). So the total time does not change.
(2) Using an unbouned work queue for the bus or using separate kthreads for each codec, the work items for different codecs are processed in parallel.
However, the total time is still the sum of two codec as that without parallel. I think that because snd_hda_codec_read() are used during initialization, the command execution on bus is serialized in the low level. So even if we parallel initializing two codecs in high level, commands on bus are still processed one by one so the total time cannot reduce.
If the bottleneck is the time consumed to perform the whole verbs, then it's unavoidable, of course. You have only a single street no matter how many cars are running.
So, you must analyze the problem before starting such optimizations. For example, put tracepoints to measure the time consumption.
We're blocked here. Is there any other means to reduce the time?
Reduce the amount of init verbs.
So far, the driver tries to initialize all things properly by itself without trusting the default values the codec sets (e.g. setting the amp to mute, selecting the selector to 0). These verbs are often redundant.
But, assuming some default values is of course risky, if the hardware doesn't follow it by some reason -- or if the same path is executed in a different situation like runtime PM.
Takashi
2013-11-28 22:57, Lin, Mengdong skrev:
Hi Takashi,
We're trying to reduce the HD-A driver initialization time when more than one codecs are connected to the bus, but are blocked. Would you please share some advices on this?
Usually, there is one HD-A controller connecting to two codecs: one on-board codec and one integrated display codec. During initialization, the codecs are created and configured in a serial way.
Creating a codec may cost 6~20ms, and then building controls make cost about 15~30ms.
Sorry for interrupting, but I just wonder - I assume you have a maximum of 4 CPUs. Can't the other 3 CPUs be used to load other non-audio hardware in parallel instead? It sounds you're going to run into lock contention instead if you try to modify the same card from two threads simultaneously.
At Fri, 29 Nov 2013 14:30:48 +0800, David Henningsson wrote:
2013-11-28 22:57, Lin, Mengdong skrev:
Hi Takashi,
We're trying to reduce the HD-A driver initialization time when more than one codecs are connected to the bus, but are blocked. Would you please share some advices on this?
Usually, there is one HD-A controller connecting to two codecs: one on-board codec and one integrated display codec. During initialization, the codecs are created and configured in a serial way.
Creating a codec may cost 6~20ms, and then building controls make cost about 15~30ms.
Sorry for interrupting, but I just wonder - I assume you have a maximum of 4 CPUs. Can't the other 3 CPUs be used to load other non-audio hardware in parallel instead? It sounds you're going to run into lock contention instead if you try to modify the same card from two threads simultaneously.
IIRC, PCI device probes are done sequentially because asynchronous probe caused too many troubles. And if it's a module, it's anyway more strictly serialized.
BTW, reading the description above again, I wonder why building controls takes so long time. It's essentially only a bunch of ctl element creations, so it should be almost purely a CPU-bound task...
Takashi
2013-11-29 14:40, Takashi Iwai skrev:
At Fri, 29 Nov 2013 14:30:48 +0800, David Henningsson wrote:
2013-11-28 22:57, Lin, Mengdong skrev:
Hi Takashi,
We're trying to reduce the HD-A driver initialization time when more than one codecs are connected to the bus, but are blocked. Would you please share some advices on this?
Usually, there is one HD-A controller connecting to two codecs: one on-board codec and one integrated display codec. During initialization, the codecs are created and configured in a serial way.
Creating a codec may cost 6~20ms, and then building controls make cost about 15~30ms.
Sorry for interrupting, but I just wonder - I assume you have a maximum of 4 CPUs. Can't the other 3 CPUs be used to load other non-audio hardware in parallel instead? It sounds you're going to run into lock contention instead if you try to modify the same card from two threads simultaneously.
IIRC, PCI device probes are done sequentially because asynchronous probe caused too many troubles. And if it's a module, it's anyway more strictly serialized.
Okay. It just seems to me that from a bird's eye view that parallelizing module loading and pci probing would give us much bigger benefits than trying to parallellize internally in the snd-hda-intel driver.
Btw, using a deferred azx_probe_continue (like we do if there's a firmware file), would one get more parallelization with other drivers? If so we could consider making that the default.
At Fri, 29 Nov 2013 15:31:59 +0800, David Henningsson wrote:
2013-11-29 14:40, Takashi Iwai skrev:
At Fri, 29 Nov 2013 14:30:48 +0800, David Henningsson wrote:
2013-11-28 22:57, Lin, Mengdong skrev:
Hi Takashi,
We're trying to reduce the HD-A driver initialization time when more than one codecs are connected to the bus, but are blocked. Would you please share some advices on this?
Usually, there is one HD-A controller connecting to two codecs: one on-board codec and one integrated display codec. During initialization, the codecs are created and configured in a serial way.
Creating a codec may cost 6~20ms, and then building controls make cost about 15~30ms.
Sorry for interrupting, but I just wonder - I assume you have a maximum of 4 CPUs. Can't the other 3 CPUs be used to load other non-audio hardware in parallel instead? It sounds you're going to run into lock contention instead if you try to modify the same card from two threads simultaneously.
IIRC, PCI device probes are done sequentially because asynchronous probe caused too many troubles. And if it's a module, it's anyway more strictly serialized.
Okay. It just seems to me that from a bird's eye view that parallelizing module loading and pci probing would give us much bigger benefits than trying to parallellize internally in the snd-hda-intel driver.
Yeah, for a long run, this would be far benefit. But be warned, several attempts in the past failed due to weird hardwares :)
Btw, using a deferred azx_probe_continue (like we do if there's a firmware file), would one get more parallelization with other drivers? If so we could consider making that the default.
This sounds like a good idea as we have already that code. The patch would be just a few-liners.
Takashi
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Friday, November 29, 2013 3:50 PM To: David Henningsson Cc: Lin, Mengdong; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
At Fri, 29 Nov 2013 15:31:59 +0800, David Henningsson wrote:
2013-11-29 14:40, Takashi Iwai skrev:
At Fri, 29 Nov 2013 14:30:48 +0800, David Henningsson wrote:
2013-11-28 22:57, Lin, Mengdong skrev:
Hi Takashi,
We're trying to reduce the HD-A driver initialization time when more
than one codecs are connected to the bus, but are blocked.
Would you please share some advices on this?
Usually, there is one HD-A controller connecting to two codecs: one
on-board codec and one integrated display codec.
During initialization, the codecs are created and configured in a
serial way.
Creating a codec may cost 6~20ms, and then building controls make
cost about 15~30ms.
Sorry for interrupting, but I just wonder - I assume you have a maximum of 4 CPUs. Can't the other 3 CPUs be used to load other non-audio hardware in parallel instead? It sounds you're going to run into lock contention instead if you try to modify the same card from two threads simultaneously.
IIRC, PCI device probes are done sequentially because asynchronous probe caused too many troubles. And if it's a module, it's anyway more strictly serialized.
Okay. It just seems to me that from a bird's eye view that parallelizing module loading and pci probing would give us much bigger benefits than trying to parallellize internally in the snd-hda-intel driver.
Yeah, for a long run, this would be far benefit. But be warned, several attempts in the past failed due to weird hardwares :)
Btw, using a deferred azx_probe_continue (like we do if there's a firmware file), would one get more parallelization with other drivers? If so we could consider making that the default.
This sounds like a good idea as we have already that code. The patch would be just a few-liners.
If azx_probe_continue() is delayed, could user space get incomplete audio information after boot? I mean is it possible that azx_probe_continue() does not complete before user space check ALSA info?
Thanks Mengdong
At Fri, 29 Nov 2013 08:05:35 +0000, Lin, Mengdong wrote:
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Friday, November 29, 2013 3:50 PM To: David Henningsson Cc: Lin, Mengdong; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
At Fri, 29 Nov 2013 15:31:59 +0800, David Henningsson wrote:
2013-11-29 14:40, Takashi Iwai skrev:
At Fri, 29 Nov 2013 14:30:48 +0800, David Henningsson wrote:
2013-11-28 22:57, Lin, Mengdong skrev:
Hi Takashi,
We're trying to reduce the HD-A driver initialization time when more
than one codecs are connected to the bus, but are blocked.
Would you please share some advices on this?
Usually, there is one HD-A controller connecting to two codecs: one
on-board codec and one integrated display codec.
During initialization, the codecs are created and configured in a
serial way.
Creating a codec may cost 6~20ms, and then building controls make
cost about 15~30ms.
Sorry for interrupting, but I just wonder - I assume you have a maximum of 4 CPUs. Can't the other 3 CPUs be used to load other non-audio hardware in parallel instead? It sounds you're going to run into lock contention instead if you try to modify the same card from two threads simultaneously.
IIRC, PCI device probes are done sequentially because asynchronous probe caused too many troubles. And if it's a module, it's anyway more strictly serialized.
Okay. It just seems to me that from a bird's eye view that parallelizing module loading and pci probing would give us much bigger benefits than trying to parallellize internally in the snd-hda-intel driver.
Yeah, for a long run, this would be far benefit. But be warned, several attempts in the past failed due to weird hardwares :)
Btw, using a deferred azx_probe_continue (like we do if there's a firmware file), would one get more parallelization with other drivers? If so we could consider making that the default.
This sounds like a good idea as we have already that code. The patch would be just a few-liners.
If azx_probe_continue() is delayed, could user space get incomplete audio information after boot? I mean is it possible that azx_probe_continue() does not complete before user space check ALSA info?
Not impossible. But then user-space should be triggered in event driven way by udev or such.
Takashi
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Friday, November 29, 2013 4:08 PM To: Lin, Mengdong Cc: David Henningsson; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
At Fri, 29 Nov 2013 08:05:35 +0000, Lin, Mengdong wrote:
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Friday, November 29, 2013 3:50 PM To: David Henningsson Cc: Lin, Mengdong; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
At Fri, 29 Nov 2013 15:31:59 +0800, David Henningsson wrote:
2013-11-29 14:40, Takashi Iwai skrev:
At Fri, 29 Nov 2013 14:30:48 +0800, David Henningsson wrote:
2013-11-28 22:57, Lin, Mengdong skrev: > Hi Takashi, > > We're trying to reduce the HD-A driver initialization time > when more
than one codecs are connected to the bus, but are blocked.
> Would you please share some advices on this? > > Usually, there is one HD-A controller connecting to two > codecs: one
on-board codec and one integrated display codec.
> During initialization, the codecs are created and configured > in a
serial way.
> > Creating a codec may cost 6~20ms, and then building controls > make
cost about 15~30ms.
Sorry for interrupting, but I just wonder - I assume you have a maximum of 4 CPUs. Can't the other 3 CPUs be used to load other non-audio hardware in parallel instead? It sounds you're going to run into lock contention instead if you try to modify the same card from two threads simultaneously.
IIRC, PCI device probes are done sequentially because asynchronous probe caused too many troubles. And if it's a module, it's anyway more strictly serialized.
Okay. It just seems to me that from a bird's eye view that parallelizing module loading and pci probing would give us much bigger benefits than trying to parallellize internally in the
snd-hda-intel driver.
Yeah, for a long run, this would be far benefit. But be warned, several attempts in the past failed due to weird hardwares :)
Btw, using a deferred azx_probe_continue (like we do if there's a firmware file), would one get more parallelization with other drivers? If so we could consider making that the default.
This sounds like a good idea as we have already that code. The patch would be just a few-liners.
If azx_probe_continue() is delayed, could user space get incomplete
audio information after boot?
I mean is it possible that azx_probe_continue() does not complete
before user space check ALSA info?
Not impossible. But then user-space should be triggered in event driven way by udev or such.
Hi Takashi/David,
So is it okay to defer azx_probe_continue() by default?
If yes, is it also necessary to defer snd_card_create() at the end of azx_probe_continue()? So user space can only see the sound card when azx_probe_continue() completes.
We've checked the driver initialization time, the bottleneck is the bus. In three major three time-consuming functions, snd_hda_codec_new, azx_codec_configure, and snd_hda_codec_build_controls, almost all the time is spent in codec_exec_verb().
So deferring azx_probe_continue() will be more helpful to reduce the initialization time and get more parallelization with other drivers.
Thanks Mengdong
At Mon, 2 Dec 2013 07:23:11 +0000, Lin, Mengdong wrote:
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Friday, November 29, 2013 4:08 PM To: Lin, Mengdong Cc: David Henningsson; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
At Fri, 29 Nov 2013 08:05:35 +0000, Lin, Mengdong wrote:
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Friday, November 29, 2013 3:50 PM To: David Henningsson Cc: Lin, Mengdong; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
At Fri, 29 Nov 2013 15:31:59 +0800, David Henningsson wrote:
2013-11-29 14:40, Takashi Iwai skrev:
At Fri, 29 Nov 2013 14:30:48 +0800, David Henningsson wrote: > 2013-11-28 22:57, Lin, Mengdong skrev: >> Hi Takashi, >> >> We're trying to reduce the HD-A driver initialization time >> when more
than one codecs are connected to the bus, but are blocked.
>> Would you please share some advices on this? >> >> Usually, there is one HD-A controller connecting to two >> codecs: one
on-board codec and one integrated display codec.
>> During initialization, the codecs are created and configured >> in a
serial way.
>> >> Creating a codec may cost 6~20ms, and then building controls >> make
cost about 15~30ms.
> Sorry for interrupting, but I just wonder - I assume you have a > maximum of 4 CPUs. Can't the other 3 CPUs be used to load other > non-audio hardware in parallel instead? It sounds you're going > to run into lock contention instead if you try to modify the > same card from two threads simultaneously. IIRC, PCI device probes are done sequentially because asynchronous probe caused too many troubles. And if it's a module, it's anyway more strictly serialized.
Okay. It just seems to me that from a bird's eye view that parallelizing module loading and pci probing would give us much bigger benefits than trying to parallellize internally in the
snd-hda-intel driver.
Yeah, for a long run, this would be far benefit. But be warned, several attempts in the past failed due to weird hardwares :)
Btw, using a deferred azx_probe_continue (like we do if there's a firmware file), would one get more parallelization with other drivers? If so we could consider making that the default.
This sounds like a good idea as we have already that code. The patch would be just a few-liners.
If azx_probe_continue() is delayed, could user space get incomplete
audio information after boot?
I mean is it possible that azx_probe_continue() does not complete
before user space check ALSA info?
Not impossible. But then user-space should be triggered in event driven way by udev or such.
Hi Takashi/David,
So is it okay to defer azx_probe_continue() by default?
Yes.
If yes, is it also necessary to defer snd_card_create() at the end of azx_probe_continue()?
No, that's a significant difference. The snd_card_create() invocation isn't related with the hardware access itself, thus it's no point to delay. If you delay the card instance creation, this may result in races between the card order, which we'd like to avoid.
So user space can only see the sound card when azx_probe_continue() completes.
The completion of the card creation is notified only after snd_card_register(), so it doesn't matter when to call snd_card_create().
The below is an untested patch. I should have added in the previous reply...
Takashi
--- diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index dfdb96603636..3deec907bf1a 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -3805,7 +3805,7 @@ static int azx_probe(struct pci_dev *pci, static int dev; struct snd_card *card; struct azx *chip; - bool probe_now; + bool schedule_probe; int err;
if (dev >= SNDRV_CARDS) @@ -3844,7 +3844,7 @@ static int azx_probe(struct pci_dev *pci, chip->disabled = true; }
- probe_now = !chip->disabled; + schedule_probe = !chip->disabled;
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (patch[dev] && *patch[dev]) { @@ -3855,25 +3855,17 @@ static int azx_probe(struct pci_dev *pci, azx_firmware_cb); if (err < 0) goto out_free; - probe_now = false; /* continued in azx_firmware_cb() */ + schedule_probe = false; /* continued in azx_firmware_cb() */ } #endif /* CONFIG_SND_HDA_PATCH_LOADER */
- /* continue probing in work context, avoid request_module deadlock */ - if (probe_now && (chip->driver_caps & AZX_DCAPS_I915_POWERWELL)) { -#ifdef CONFIG_SND_HDA_I915 - probe_now = false; - schedule_work(&chip->probe_work); -#else +#ifndef CONFIG_SND_HDA_I915 + if (chip->driver_caps & AZX_DCAPS_I915_POWERWELL) snd_printk(KERN_ERR SFX "Haswell must build in CONFIG_SND_HDA_I915\n"); #endif - }
- if (probe_now) { - err = azx_probe_continue(chip); - if (err < 0) - goto out_free; - } + if (schedule_probe) + schedule_work(&chip->probe_work);
dev++; complete_all(&chip->probe_wait);
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Monday, December 02, 2013 4:24 PM
Btw, using a deferred azx_probe_continue (like we do if there's a firmware file), would one get more parallelization with
other drivers?
If so we could consider making that the default.
This sounds like a good idea as we have already that code. The patch would be just a few-liners.
If azx_probe_continue() is delayed, could user space get incomplete
audio information after boot?
I mean is it possible that azx_probe_continue() does not complete
before user space check ALSA info?
Not impossible. But then user-space should be triggered in event driven way by udev or such.
Hi Takashi/David,
So is it okay to defer azx_probe_continue() by default?
Yes.
If yes, is it also necessary to defer snd_card_create() at the end of
azx_probe_continue()?
No, that's a significant difference. The snd_card_create() invocation isn't related with the hardware access itself, thus it's no point to delay. If you delay the card instance creation, this may result in races between the card order, which we'd like to avoid.
So user space can only see the sound card when azx_probe_continue()
completes.
The completion of the card creation is notified only after snd_card_register(), so it doesn't matter when to call snd_card_create().
The below is an untested patch. I should have added in the previous reply...
Hi Takashi,
Many thanks for the patch and information!
Is my understanding below is right? snd_card_register() will create the device files under /sys/devices/pci0000:00/0000:00:1b.0/sound/cardx (or /sys/devices/pci0000:00/0000:00:03.0/sound/cardx for Haswell display audio), and trigger uevent to notify user space that an audio device is ready.
And for the patch,
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (patch[dev] && *patch[dev]) { @@ -3855,25 +3855,17 @@ static int azx_probe(struct pci_dev *pci, azx_firmware_cb); if (err < 0) goto out_free;
probe_now = false; /* continued in azx_firmware_cb() */
}schedule_probe = false; /* continued in azx_firmware_cb() */
Should it be 'schedule_probe = true'? I guess the driver intends to defer firmware loading.
Thanks Mengdong
At Mon, 2 Dec 2013 09:48:24 +0000, Lin, Mengdong wrote:
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Monday, December 02, 2013 4:24 PM
> Btw, using a deferred azx_probe_continue (like we do if > there's a firmware file), would one get more parallelization with
other drivers?
> If so we could consider making that the default.
This sounds like a good idea as we have already that code. The patch would be just a few-liners.
If azx_probe_continue() is delayed, could user space get incomplete
audio information after boot?
I mean is it possible that azx_probe_continue() does not complete
before user space check ALSA info?
Not impossible. But then user-space should be triggered in event driven way by udev or such.
Hi Takashi/David,
So is it okay to defer azx_probe_continue() by default?
Yes.
If yes, is it also necessary to defer snd_card_create() at the end of
azx_probe_continue()?
No, that's a significant difference. The snd_card_create() invocation isn't related with the hardware access itself, thus it's no point to delay. If you delay the card instance creation, this may result in races between the card order, which we'd like to avoid.
So user space can only see the sound card when azx_probe_continue()
completes.
The completion of the card creation is notified only after snd_card_register(), so it doesn't matter when to call snd_card_create().
The below is an untested patch. I should have added in the previous reply...
Hi Takashi,
Many thanks for the patch and information!
Is my understanding below is right? snd_card_register() will create the device files under /sys/devices/pci0000:00/0000:00:1b.0/sound/cardx (or /sys/devices/pci0000:00/0000:00:03.0/sound/cardx for Haswell display audio), and trigger uevent to notify user space that an audio device is ready.
And for the patch,
Yes.
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (patch[dev] && *patch[dev]) { @@ -3855,25 +3855,17 @@ static int azx_probe(struct pci_dev *pci, azx_firmware_cb); if (err < 0) goto out_free;
probe_now = false; /* continued in azx_firmware_cb() */
}schedule_probe = false; /* continued in azx_firmware_cb() */
Should it be 'schedule_probe = true'? I guess the driver intends to defer firmware loading.
No, in the case of f/w loading, f/w loader itself calls azx_probe_continue(), thus you don't need to schedule it manually.
We may move the f/w loading into azx_probe_continue() for simplicity, though, after applying the always-deferred-probe patch. Then it can be back to request_firmware() again (not _nowait()).
Takashi
At Mon, 02 Dec 2013 10:52:00 +0100, Takashi Iwai wrote:
At Mon, 2 Dec 2013 09:48:24 +0000, Lin, Mengdong wrote:
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Monday, December 02, 2013 4:24 PM
> > Btw, using a deferred azx_probe_continue (like we do if > > there's a firmware file), would one get more parallelization with
other drivers?
> > If so we could consider making that the default. > > This sounds like a good idea as we have already that code. The > patch would be just a few-liners.
If azx_probe_continue() is delayed, could user space get incomplete
audio information after boot?
I mean is it possible that azx_probe_continue() does not complete
before user space check ALSA info?
Not impossible. But then user-space should be triggered in event driven way by udev or such.
Hi Takashi/David,
So is it okay to defer azx_probe_continue() by default?
Yes.
If yes, is it also necessary to defer snd_card_create() at the end of
azx_probe_continue()?
No, that's a significant difference. The snd_card_create() invocation isn't related with the hardware access itself, thus it's no point to delay. If you delay the card instance creation, this may result in races between the card order, which we'd like to avoid.
So user space can only see the sound card when azx_probe_continue()
completes.
The completion of the card creation is notified only after snd_card_register(), so it doesn't matter when to call snd_card_create().
The below is an untested patch. I should have added in the previous reply...
Hi Takashi,
Many thanks for the patch and information!
Is my understanding below is right? snd_card_register() will create the device files under /sys/devices/pci0000:00/0000:00:1b.0/sound/cardx (or /sys/devices/pci0000:00/0000:00:03.0/sound/cardx for Haswell display audio), and trigger uevent to notify user space that an audio device is ready.
And for the patch,
Yes.
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (patch[dev] && *patch[dev]) { @@ -3855,25 +3855,17 @@ static int azx_probe(struct pci_dev *pci, azx_firmware_cb); if (err < 0) goto out_free;
probe_now = false; /* continued in azx_firmware_cb() */
}schedule_probe = false; /* continued in azx_firmware_cb() */
Should it be 'schedule_probe = true'? I guess the driver intends to defer firmware loading.
No, in the case of f/w loading, f/w loader itself calls azx_probe_continue(), thus you don't need to schedule it manually.
We may move the f/w loading into azx_probe_continue() for simplicity, though, after applying the always-deferred-probe patch. Then it can be back to request_firmware() again (not _nowait()).
BTW, while checking my previous patch, I noticed that the complete_all() should be put in the delayed probe part, too. The following patch fixes it.
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: hda - Fix complete_all() timing in deferred probes
When the probe of snd-hda-intel driver is deferred due to f/w loading or the nested module loading, complete_all() should be also delayed until the initialization really finished. Otherwise, vga-switcheroo client would start switching before the actual init is done.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/pci/hda/hda_intel.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index c6d230193da6..27aa14007cbd 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -3876,7 +3876,8 @@ static int azx_probe(struct pci_dev *pci, }
dev++; - complete_all(&chip->probe_wait); + if (chip->disabled) + complete_all(&chip->probe_wait); return 0;
out_free: @@ -3953,10 +3954,10 @@ static int azx_probe_continue(struct azx *chip) if ((chip->driver_caps & AZX_DCAPS_PM_RUNTIME) || chip->use_vga_switcheroo) pm_runtime_put_noidle(&pci->dev);
- return 0; - out_free: - chip->init_failed = 1; + if (err < 0) + chip->init_failed = 1; + complete_all(&chip->probe_wait); return err; }
Hi Takashi,
Thanks again for your clarification and patch! Will these two patches enter sound.git or sound-unstable.git at the moment?
Regards Mengdong
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Monday, December 02, 2013 6:19 PM To: Lin, Mengdong Cc: David Henningsson; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
At Mon, 02 Dec 2013 10:52:00 +0100, Takashi Iwai wrote:
At Mon, 2 Dec 2013 09:48:24 +0000, Lin, Mengdong wrote:
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Monday, December 02, 2013 4:24 PM
> > > Btw, using a deferred azx_probe_continue (like we do if > > > there's a firmware file), would one get more > > > parallelization with
other drivers?
> > > If so we could consider making that the default. > > > > This sounds like a good idea as we have already that code. > > The patch would be just a few-liners. > > If azx_probe_continue() is delayed, could user space get > incomplete audio information after boot? > I mean is it possible that azx_probe_continue() does not > complete before user space check ALSA info?
Not impossible. But then user-space should be triggered in event driven way by udev or such.
Hi Takashi/David,
So is it okay to defer azx_probe_continue() by default?
Yes.
If yes, is it also necessary to defer snd_card_create() at the end of
azx_probe_continue()?
No, that's a significant difference. The snd_card_create() invocation isn't related with the hardware access itself, thus it's no point to delay. If you delay the card instance creation, this may result in races between the card order, which we'd like to
avoid.
So user space can only see the sound card when azx_probe_continue()
completes.
The completion of the card creation is notified only after snd_card_register(), so it doesn't matter when to call
snd_card_create().
The below is an untested patch. I should have added in the previous reply...
Hi Takashi,
Many thanks for the patch and information!
Is my understanding below is right? snd_card_register() will create the device files under /sys/devices/pci0000:00/0000:00:1b.0/sound/cardx (or /sys/devices/pci0000:00/0000:00:03.0/sound/cardx for Haswell
display audio), and trigger uevent to notify user space that an audio device is ready.
And for the patch,
Yes.
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (patch[dev] && *patch[dev]) { @@ -3855,25 +3855,17 @@
static
int azx_probe(struct pci_dev *pci, azx_firmware_cb); if (err < 0) goto out_free;
probe_now = false; /* continued in azx_firmware_cb() */
schedule_probe = false; /* continued in azx_firmware_cb()
*/
}
Should it be 'schedule_probe = true'? I guess the driver intends to defer firmware loading.
No, in the case of f/w loading, f/w loader itself calls azx_probe_continue(), thus you don't need to schedule it manually.
We may move the f/w loading into azx_probe_continue() for simplicity, though, after applying the always-deferred-probe patch. Then it can be back to request_firmware() again (not _nowait()).
BTW, while checking my previous patch, I noticed that the complete_all() should be put in the delayed probe part, too. The following patch fixes it.
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: hda - Fix complete_all() timing in deferred probes
When the probe of snd-hda-intel driver is deferred due to f/w loading or the nested module loading, complete_all() should be also delayed until the initialization really finished. Otherwise, vga-switcheroo client would start switching before the actual init is done.
Signed-off-by: Takashi Iwai tiwai@suse.de
sound/pci/hda/hda_intel.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index c6d230193da6..27aa14007cbd 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -3876,7 +3876,8 @@ static int azx_probe(struct pci_dev *pci, }
dev++;
- complete_all(&chip->probe_wait);
- if (chip->disabled)
return 0;complete_all(&chip->probe_wait);
out_free: @@ -3953,10 +3954,10 @@ static int azx_probe_continue(struct azx *chip) if ((chip->driver_caps & AZX_DCAPS_PM_RUNTIME) || chip->use_vga_switcheroo) pm_runtime_put_noidle(&pci->dev);
- return 0;
out_free:
- chip->init_failed = 1;
- if (err < 0)
chip->init_failed = 1;
- complete_all(&chip->probe_wait); return err;
}
-- 1.8.4.4
At Mon, 2 Dec 2013 11:27:55 +0000, Lin, Mengdong wrote:
Hi Takashi,
Thanks again for your clarification and patch! Will these two patches enter sound.git or sound-unstable.git at the moment?
Yes, I'm going to merge them unless anyone finds problems.
Takashi
Regards Mengdong
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Monday, December 02, 2013 6:19 PM To: Lin, Mengdong Cc: David Henningsson; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] HD-Audio: How to reduce driver initializaton time if multiple codecs present on the bus?
At Mon, 02 Dec 2013 10:52:00 +0100, Takashi Iwai wrote:
At Mon, 2 Dec 2013 09:48:24 +0000, Lin, Mengdong wrote:
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Monday, December 02, 2013 4:24 PM
> > > > Btw, using a deferred azx_probe_continue (like we do if > > > > there's a firmware file), would one get more > > > > parallelization with
other drivers?
> > > > If so we could consider making that the default. > > > > > > This sounds like a good idea as we have already that code. > > > The patch would be just a few-liners. > > > > If azx_probe_continue() is delayed, could user space get > > incomplete > audio information after boot? > > I mean is it possible that azx_probe_continue() does not > > complete > before user space check ALSA info? > > Not impossible. But then user-space should be triggered in > event driven way by udev or such.
Hi Takashi/David,
So is it okay to defer azx_probe_continue() by default?
Yes.
If yes, is it also necessary to defer snd_card_create() at the end of
azx_probe_continue()?
No, that's a significant difference. The snd_card_create() invocation isn't related with the hardware access itself, thus it's no point to delay. If you delay the card instance creation, this may result in races between the card order, which we'd like to
avoid.
So user space can only see the sound card when azx_probe_continue()
completes.
The completion of the card creation is notified only after snd_card_register(), so it doesn't matter when to call
snd_card_create().
The below is an untested patch. I should have added in the previous reply...
Hi Takashi,
Many thanks for the patch and information!
Is my understanding below is right? snd_card_register() will create the device files under /sys/devices/pci0000:00/0000:00:1b.0/sound/cardx (or /sys/devices/pci0000:00/0000:00:03.0/sound/cardx for Haswell
display audio), and trigger uevent to notify user space that an audio device is ready.
And for the patch,
Yes.
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (patch[dev] && *patch[dev]) { @@ -3855,25 +3855,17 @@
static
int azx_probe(struct pci_dev *pci, azx_firmware_cb); if (err < 0) goto out_free;
probe_now = false; /* continued in azx_firmware_cb() */
schedule_probe = false; /* continued in azx_firmware_cb()
*/
}
Should it be 'schedule_probe = true'? I guess the driver intends to defer firmware loading.
No, in the case of f/w loading, f/w loader itself calls azx_probe_continue(), thus you don't need to schedule it manually.
We may move the f/w loading into azx_probe_continue() for simplicity, though, after applying the always-deferred-probe patch. Then it can be back to request_firmware() again (not _nowait()).
BTW, while checking my previous patch, I noticed that the complete_all() should be put in the delayed probe part, too. The following patch fixes it.
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: hda - Fix complete_all() timing in deferred probes
When the probe of snd-hda-intel driver is deferred due to f/w loading or the nested module loading, complete_all() should be also delayed until the initialization really finished. Otherwise, vga-switcheroo client would start switching before the actual init is done.
Signed-off-by: Takashi Iwai tiwai@suse.de
sound/pci/hda/hda_intel.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index c6d230193da6..27aa14007cbd 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -3876,7 +3876,8 @@ static int azx_probe(struct pci_dev *pci, }
dev++;
- complete_all(&chip->probe_wait);
- if (chip->disabled)
return 0;complete_all(&chip->probe_wait);
out_free: @@ -3953,10 +3954,10 @@ static int azx_probe_continue(struct azx *chip) if ((chip->driver_caps & AZX_DCAPS_PM_RUNTIME) || chip->use_vga_switcheroo) pm_runtime_put_noidle(&pci->dev);
- return 0;
out_free:
- chip->init_failed = 1;
- if (err < 0)
chip->init_failed = 1;
- complete_all(&chip->probe_wait); return err;
}
-- 1.8.4.4
At Mon, 02 Dec 2013 13:18:11 +0100, Takashi Iwai wrote:
At Mon, 2 Dec 2013 11:27:55 +0000, Lin, Mengdong wrote:
Hi Takashi,
Thanks again for your clarification and patch! Will these two patches enter sound.git or sound-unstable.git at the moment?
Yes, I'm going to merge them unless anyone finds problems.
Turned out the original patch leading to a build error depending on kconfig. Below is the fixed version.
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: hda - Always do delayed probes for HD-audio devices
HD-audio devices tend to take long time for finishing the whole probing procedure. In this patch, the time-consuming part of the probing procedure, the codec probe and the rest initializations, are moved in the work, so that they can be done asynchronously in parallel with probes of other devices.
Since we already have this mechanism in the driver code for the firmware and i915 request_symbol() stuff, we just need to enable it always; the resultant patch even reduces more lines, which is an additional bonus.
Credit goes to David Henningsson, who suggested this workaround.
Reported-by: Mengdong Lin mengdong.lin@intel.com Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/pci/hda/hda_intel.c | 28 +++++++--------------------- 1 file changed, 7 insertions(+), 21 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 486d25fa7e97..80c5f14e8ecd 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -543,9 +543,7 @@ struct azx { /* for pending irqs */ struct work_struct irq_pending_work;
-#ifdef CONFIG_SND_HDA_I915 struct work_struct probe_work; -#endif
/* reboot notifier (for mysterious hangup problem at power-down) */ struct notifier_block reboot_notifier; @@ -3500,12 +3498,10 @@ static void azx_check_snoop_available(struct azx *chip) } }
-#ifdef CONFIG_SND_HDA_I915 static void azx_probe_work(struct work_struct *work) { azx_probe_continue(container_of(work, struct azx, probe_work)); } -#endif
/* * constructor @@ -3582,10 +3578,8 @@ static int azx_create(struct snd_card *card, struct pci_dev *pci, return err; }
-#ifdef CONFIG_SND_HDA_I915 /* continue probing in work context as may trigger request module */ INIT_WORK(&chip->probe_work, azx_probe_work); -#endif
*rchip = chip;
@@ -3805,7 +3799,7 @@ static int azx_probe(struct pci_dev *pci, static int dev; struct snd_card *card; struct azx *chip; - bool probe_now; + bool schedule_probe; int err;
if (dev >= SNDRV_CARDS) @@ -3844,7 +3838,7 @@ static int azx_probe(struct pci_dev *pci, chip->disabled = true; }
- probe_now = !chip->disabled; + schedule_probe = !chip->disabled;
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (patch[dev] && *patch[dev]) { @@ -3855,25 +3849,17 @@ static int azx_probe(struct pci_dev *pci, azx_firmware_cb); if (err < 0) goto out_free; - probe_now = false; /* continued in azx_firmware_cb() */ + schedule_probe = false; /* continued in azx_firmware_cb() */ } #endif /* CONFIG_SND_HDA_PATCH_LOADER */
- /* continue probing in work context, avoid request_module deadlock */ - if (probe_now && (chip->driver_caps & AZX_DCAPS_I915_POWERWELL)) { -#ifdef CONFIG_SND_HDA_I915 - probe_now = false; - schedule_work(&chip->probe_work); -#else +#ifndef CONFIG_SND_HDA_I915 + if (chip->driver_caps & AZX_DCAPS_I915_POWERWELL) snd_printk(KERN_ERR SFX "Haswell must build in CONFIG_SND_HDA_I915\n"); #endif - }
- if (probe_now) { - err = azx_probe_continue(chip); - if (err < 0) - goto out_free; - } + if (schedule_probe) + schedule_work(&chip->probe_work);
dev++; if (chip->disabled)
participants (3)
-
David Henningsson
-
Lin, Mengdong
-
Takashi Iwai