[alsa-devel] [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
System hang is observed While trying to do suspend. It was because of an infinite loop in the AIC23 resume path which was trying to restore AIC23 register values from the register cache.
This patch fixes the problem by correcting the resume path and properly activating/deactivating the digital interface while doing the suspend / off transitions.
Signed-off-by: Anuj Aggarwal anuj.aggarwal@ti.com --- sound/soc/codecs/tlv320aic23.c | 16 +++------------- 1 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/sound/soc/codecs/tlv320aic23.c b/sound/soc/codecs/tlv320aic23.c index 6b24d8b..cabe60c 100644 --- a/sound/soc/codecs/tlv320aic23.c +++ b/sound/soc/codecs/tlv320aic23.c @@ -565,13 +565,12 @@ static int tlv320aic23_set_bias_level(struct snd_soc_codec *codec, case SND_SOC_BIAS_PREPARE: break; case SND_SOC_BIAS_STANDBY: - /* everything off except vref/vmid, */ - tlv320aic23_write(codec, TLV320AIC23_PWR, reg | 0x0040); + /* Activate the digital interface */ + tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x1); break; case SND_SOC_BIAS_OFF: - /* everything off, dac mute, inactive */ + /* Deactivate the digital interface */ tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0); - tlv320aic23_write(codec, TLV320AIC23_PWR, 0xffff); break; } codec->bias_level = level; @@ -615,7 +614,6 @@ static int tlv320aic23_suspend(struct platform_device *pdev, struct snd_soc_device *socdev = platform_get_drvdata(pdev); struct snd_soc_codec *codec = socdev->card->codec;
- tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0); tlv320aic23_set_bias_level(codec, SND_SOC_BIAS_OFF);
return 0; @@ -625,14 +623,6 @@ static int tlv320aic23_resume(struct platform_device *pdev) { struct snd_soc_device *socdev = platform_get_drvdata(pdev); struct snd_soc_codec *codec = socdev->card->codec; - int i; - u16 reg; - - /* Sync reg_cache with the hardware */ - for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); i++) { - u16 val = tlv320aic23_read_reg_cache(codec, reg); - tlv320aic23_write(codec, reg, val); - }
tlv320aic23_set_bias_level(codec, SND_SOC_BIAS_STANDBY); tlv320aic23_set_bias_level(codec, codec->suspend_bias_level);
On Wed, Nov 25, 2009 at 06:40:31PM +0530, Anuj Aggarwal wrote:
System hang is observed While trying to do suspend. It was because of an infinite loop in the AIC23 resume path which was trying to restore AIC23 register values from the register cache.
This doesn't appear to tie in with the actual patch - where's the infinite loop? The register cache restore loop has a constant bound and I can't see anything that would make it spin infinitely.
This patch fixes the problem by correcting the resume path and properly activating/deactivating the digital interface while doing the suspend / off transitions.
What do you mean by "properly activating/deactivating the digital interface"?
case SND_SOC_BIAS_STANDBY:
/* everything off except vref/vmid, */
tlv320aic23_write(codec, TLV320AIC23_PWR, reg | 0x0040);
/* Activate the digital interface */
break; case SND_SOC_BIAS_OFF:tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x1);
/* everything off, dac mute, inactive */
tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0);/* Deactivate the digital interface */
break;tlv320aic23_write(codec, TLV320AIC23_PWR, 0xffff);
This looks wrong - the driver is now no longer managing bias levels at all.
- u16 reg;
- /* Sync reg_cache with the hardware */
- for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); i++) {
u16 val = tlv320aic23_read_reg_cache(codec, reg);
tlv320aic23_write(codec, reg, val);
- }
This also looks wrong, the register cache restore has been completely removed so if the power to the device is cut during suspend then the device will go back to register defaults and no longer function correctly after resume.
-----Original Message----- From: Mark Brown [mailto:broonie@opensource.wolfsonmicro.com] Sent: Wednesday, November 25, 2009 7:04 PM To: Aggarwal, Anuj Cc: alsa-devel@alsa-project.org; linux-omap@vger.kernel.org Subject: Re: [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
On Wed, Nov 25, 2009 at 06:40:31PM +0530, Anuj Aggarwal wrote:
System hang is observed While trying to do suspend. It was because of an infinite loop in the AIC23 resume path which was trying to restore AIC23 register values from the register cache.
This doesn't appear to tie in with the actual patch - where's the infinite loop? The register cache restore loop has a constant bound and I can't see anything that would make it spin infinitely.
[Aggarwal, Anuj] Here is the original code:
<snip> int i; u16 reg;
/* Sync reg_cache with the hardware */ for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); i++) { u16 val = tlv320aic23_read_reg_cache(codec, reg); tlv320aic23_write(codec, reg, val); } </snip>
I can see 'i' getting incremented, instead of 'reg'. Isn't this an infinite loop?
This patch fixes the problem by correcting the resume path and properly activating/deactivating the digital interface while doing the suspend / off transitions.
What do you mean by "properly activating/deactivating the digital interface"?
[Aggarwal, Anuj] The driver was only deactivating the digital interface and not activating it. Hence added that part.
case SND_SOC_BIAS_STANDBY:
/* everything off except vref/vmid, */
tlv320aic23_write(codec, TLV320AIC23_PWR, reg | 0x0040);
/* Activate the digital interface */
break; case SND_SOC_BIAS_OFF:tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x1);
/* everything off, dac mute, inactive */
tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0);/* Deactivate the digital interface */
break;tlv320aic23_write(codec, TLV320AIC23_PWR, 0xffff);
This looks wrong - the driver is now no longer managing bias levels at all.
[Aggarwal, Anuj] The bias levels were wrongly managed earlier. While doing a suspend, it was powering off all the interfaces and while coming up, it was not switching on the required interfaces. Hence no sound was coming out if audio was played back. Moreover, I referred twl4030 codec and there also the driver was only handling the codec power control bit in _STANDBY. So based my patch on that. On each bias level, which are the interfaces which need to be switched on / off?
- u16 reg;
- /* Sync reg_cache with the hardware */
- for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); i++) {
u16 val = tlv320aic23_read_reg_cache(codec, reg);
tlv320aic23_write(codec, reg, val);
- }
This also looks wrong, the register cache restore has been completely removed so if the power to the device is cut during suspend then the device will go back to register defaults and no longer function correctly after resume.
[Aggarwal, Anuj] I can add that register cache restore again with the necessary checks.
On Thu, Nov 26, 2009 at 12:49:11AM +0530, Aggarwal, Anuj wrote:
<snip> int i; u16 reg;
/* Sync reg_cache with the hardware */ for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); i++) { u16 val = tlv320aic23_read_reg_cache(codec, reg); tlv320aic23_write(codec, reg, val); }
</snip>
I can see 'i' getting incremented, instead of 'reg'. Isn't this an infinite loop?
Yes, that does look entirely buggy - it seems fairly clear that suspend and resume have never worked with this driver. The fact that you were removing the loop entirely meant I didn't read the actual code for bugs.
This patch fixes the problem by correcting the resume path and properly activating/deactivating the digital interface while doing the suspend / off transitions.
What do you mean by "properly activating/deactivating the digital interface"?
[Aggarwal, Anuj] The driver was only deactivating the digital interface and not activating it. Hence added that part.
So this is a separate issue, and should ideally be a separate patch - it looks like you've got three different changes here, one to fix the buggy register cache restore, one to manage the PWR bit on suspend and another to possibly fix the bias configuration. You certainly need to identify all three changes in the changelog, especially for things like this which look like they should go in as bug fixes (and bearing in mind that it's the end of the release cycle).
Glancing at the code there's much more management required here - there's other places where it's managed and the code needs to be joined up with them. The setting should only be restored if the device was active when suspended, not unconditionally.
case SND_SOC_BIAS_STANDBY:
/* everything off except vref/vmid, */
tlv320aic23_write(codec, TLV320AIC23_PWR, reg | 0x0040);
/* Activate the digital interface */
break; case SND_SOC_BIAS_OFF:tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x1);
/* everything off, dac mute, inactive */
tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0);/* Deactivate the digital interface */
break;tlv320aic23_write(codec, TLV320AIC23_PWR, 0xffff);
This looks wrong - the driver is now no longer managing bias levels at all.
[Aggarwal, Anuj] The bias levels were wrongly managed earlier. While doing a suspend, it was powering off all the interfaces and while coming up, it was not switching on the required interfaces. Hence no sound was coming out if audio was played back.
Note that this CODEC driver impelements DAPM and so all the bits in the power register which are controlled by DAPM will be managed by the ASoC core without any action by the driver.
On each bias level, which are the interfaces which need to be switched on / off?
The biases and any other device-wide settings, everything else should be managed via DAPM if it's in use.
-----Original Message----- From: Mark Brown [mailto:broonie@opensource.wolfsonmicro.com] Sent: Thursday, November 26, 2009 1:10 AM To: Aggarwal, Anuj Cc: alsa-devel@alsa-project.org; linux-omap@vger.kernel.org Subject: Re: [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
On Thu, Nov 26, 2009 at 12:49:11AM +0530, Aggarwal, Anuj wrote:
<snip> int i; u16 reg;
/* Sync reg_cache with the hardware */ for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); i++) { u16 val = tlv320aic23_read_reg_cache(codec, reg); tlv320aic23_write(codec, reg, val); }
</snip>
I can see 'i' getting incremented, instead of 'reg'. Isn't this an infinite loop?
Yes, that does look entirely buggy - it seems fairly clear that suspend and resume have never worked with this driver. The fact that you were removing the loop entirely meant I didn't read the actual code for bugs.
This patch fixes the problem by correcting the resume path and properly activating/deactivating the digital interface while doing the suspend / off transitions.
What do you mean by "properly activating/deactivating the digital interface"?
[Aggarwal, Anuj] The driver was only deactivating the digital interface and not activating it. Hence added that part.
So this is a separate issue, and should ideally be a separate patch - it looks like you've got three different changes here, one to fix the buggy register cache restore, one to manage the PWR bit on suspend and another to possibly fix the bias configuration. You certainly need to identify all three changes in the changelog, especially for things like this which look like they should go in as bug fixes (and bearing in mind that it's the end of the release cycle).
Fine with me, I will push individual patches for all the issues.
Glancing at the code there's much more management required here - there's other places where it's managed and the code needs to be joined up with them. The setting should only be restored if the device was active when suspended, not unconditionally.
case SND_SOC_BIAS_STANDBY:
/* everything off except vref/vmid, */
tlv320aic23_write(codec, TLV320AIC23_PWR, reg | 0x0040);
/* Activate the digital interface */
break; case SND_SOC_BIAS_OFF:tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x1);
/* everything off, dac mute, inactive */
tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0);/* Deactivate the digital interface */
break;tlv320aic23_write(codec, TLV320AIC23_PWR, 0xffff);
This looks wrong - the driver is now no longer managing bias levels at all.
[Aggarwal, Anuj] The bias levels were wrongly managed earlier. While
doing
a suspend, it was powering off all the interfaces and while coming up,
it
was not switching on the required interfaces. Hence no sound was coming out if audio was played back.
Note that this CODEC driver impelements DAPM and so all the bits in the power register which are controlled by DAPM will be managed by the ASoC core without any action by the driver.
On each bias level, which are the interfaces which need to be switched on / off?
The biases and any other device-wide settings, everything else should be managed via DAPM if it's in use.
Anuj Aggarwal wrote:
System hang is observed While trying to do suspend. It was because of an infinite loop in the AIC23 resume path which was trying to restore AIC23 register values from the register cache.
This patch fixes the problem by correcting the resume path and properly activating/deactivating the digital interface while doing the suspend / off transitions.
Signed-off-by: Anuj Aggarwal anuj.aggarwal@ti.com
sound/soc/codecs/tlv320aic23.c | 16 +++------------- 1 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/sound/soc/codecs/tlv320aic23.c b/sound/soc/codecs/tlv320aic23.c index 6b24d8b..cabe60c 100644 --- a/sound/soc/codecs/tlv320aic23.c +++ b/sound/soc/codecs/tlv320aic23.c @@ -565,13 +565,12 @@ static int tlv320aic23_set_bias_level(struct snd_soc_codec *codec, case SND_SOC_BIAS_PREPARE: break; case SND_SOC_BIAS_STANDBY:
/* everything off except vref/vmid, */
tlv320aic23_write(codec, TLV320AIC23_PWR, reg | 0x0040);
/* Activate the digital interface */
break; case SND_SOC_BIAS_OFF:tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x1);
/* everything off, dac mute, inactive */
tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0);/* Deactivate the digital interface */
break; } codec->bias_level = level;tlv320aic23_write(codec, TLV320AIC23_PWR, 0xffff);
@@ -615,7 +614,6 @@ static int tlv320aic23_suspend(struct platform_device *pdev, struct snd_soc_device *socdev = platform_get_drvdata(pdev); struct snd_soc_codec *codec = socdev->card->codec;
tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0); tlv320aic23_set_bias_level(codec, SND_SOC_BIAS_OFF);
return 0;
@@ -625,14 +623,6 @@ static int tlv320aic23_resume(struct platform_device *pdev) { struct snd_soc_device *socdev = platform_get_drvdata(pdev); struct snd_soc_codec *codec = socdev->card->codec;
- int i;
- u16 reg;
- /* Sync reg_cache with the hardware */
- for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); i++) {
u16 val = tlv320aic23_read_reg_cache(codec, reg);
tlv320aic23_write(codec, reg, val);
- }
Changing to for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); reg++) {
should be enough to fix the infinite loop. Could you try this without the rest of the patch?
tlv320aic23_set_bias_level(codec, SND_SOC_BIAS_STANDBY); tlv320aic23_set_bias_level(codec, codec->suspend_bias_level);
Adding Arun to cc list
-----Original Message----- From: Troy Kisky [mailto:troy.kisky@boundarydevices.com] Sent: Thursday, November 26, 2009 1:53 AM To: Aggarwal, Anuj Cc: alsa-devel@alsa-project.org; linux-omap@vger.kernel.org; Mark Brown; Arun KS Subject: Re: [alsa-devel] [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
Anuj Aggarwal wrote:
System hang is observed While trying to do suspend. It was because of an infinite loop in the AIC23 resume path which was trying to restore AIC23 register values from the register cache.
This patch fixes the problem by correcting the resume path and properly activating/deactivating the digital interface while doing the suspend / off transitions.
Signed-off-by: Anuj Aggarwal anuj.aggarwal@ti.com
sound/soc/codecs/tlv320aic23.c | 16 +++------------- 1 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/sound/soc/codecs/tlv320aic23.c
b/sound/soc/codecs/tlv320aic23.c
index 6b24d8b..cabe60c 100644 --- a/sound/soc/codecs/tlv320aic23.c +++ b/sound/soc/codecs/tlv320aic23.c @@ -565,13 +565,12 @@ static int tlv320aic23_set_bias_level(struct
snd_soc_codec *codec,
case SND_SOC_BIAS_PREPARE: break; case SND_SOC_BIAS_STANDBY:
/* everything off except vref/vmid, */
tlv320aic23_write(codec, TLV320AIC23_PWR, reg | 0x0040);
/* Activate the digital interface */
break; case SND_SOC_BIAS_OFF:tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x1);
/* everything off, dac mute, inactive */
tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0);/* Deactivate the digital interface */
break; } codec->bias_level = level;tlv320aic23_write(codec, TLV320AIC23_PWR, 0xffff);
@@ -615,7 +614,6 @@ static int tlv320aic23_suspend(struct
platform_device *pdev,
struct snd_soc_device *socdev = platform_get_drvdata(pdev); struct snd_soc_codec *codec = socdev->card->codec;
tlv320aic23_write(codec, TLV320AIC23_ACTIVE, 0x0); tlv320aic23_set_bias_level(codec, SND_SOC_BIAS_OFF);
return 0;
@@ -625,14 +623,6 @@ static int tlv320aic23_resume(struct
platform_device *pdev)
{ struct snd_soc_device *socdev = platform_get_drvdata(pdev); struct snd_soc_codec *codec = socdev->card->codec;
- int i;
- u16 reg;
- /* Sync reg_cache with the hardware */
- for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); i++) {
u16 val = tlv320aic23_read_reg_cache(codec, reg);
tlv320aic23_write(codec, reg, val);
- }
Changing to for (reg = 0; reg < ARRAY_SIZE(tlv320aic23_reg); reg++) {
should be enough to fix the infinite loop. Could you try this without the rest of the patch?
I have already tried that but it doesn't fix the problem entirely. Still, I was not able to playback properly because the bias settings were not handled correctly and as you can see, it writes to some invalid registers too (all registers don't exist between 0 and ARRAY_SIZE).
tlv320aic23_set_bias_level(codec, SND_SOC_BIAS_STANDBY); tlv320aic23_set_bias_level(codec, codec->suspend_bias_level);
Adding Arun to cc list
On Thu, Nov 26, 2009 at 08:31:11AM +0530, Aggarwal, Anuj wrote:
I have already tried that but it doesn't fix the problem entirely. Still, I was not able to playback properly because the bias settings were not handled correctly and as you can see, it writes to some invalid registers too (all registers don't exist between 0 and ARRAY_SIZE).
Please also note the following patch which is queued for 2.6.32:
commit 50b6bce59d154b5db137907a5c0ed45a4e7a3829 Author: Mark Brown broonie@opensource.wolfsonmicro.com Date: Mon Nov 23 13:11:53 2009 +0000
ASoC: Fix suspend with active audio streams
When we get a stream suspend event force the power down since otherwise the stream would remain marked as active. In future we'll probably want to make this stream-specific and add an interface to make the power down of other widgets optional in order to support leaving bypass paths active while suspending the processor.
Cc: stable@kernel.org Reported-by: Joonyoung Shim jy0922.shim@samsung.com Tested-by: Joonyoung Shim jy0922.shim@samsung.com Acked-by: Liam Girdwood lrg@slimlogic.co.uk Signed-off-by: Mark Brown broonie@opensource.wolfsonmicro.com
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index d89f6dc..66d4c16 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -973,9 +973,19 @@ static int dapm_power_widgets(struct snd_soc_codec *codec, int event) if (!w->power_check) continue;
- power = w->power_check(w); - if (power) - sys_power = 1; + /* If we're suspending then pull down all the + * power. */ + switch (event) { + case SND_SOC_DAPM_STREAM_SUSPEND: + power = 0; + break; + + default: + power = w->power_check(w); + if (power) + sys_power = 1; + break; + }
if (w->power == power) continue; @@ -999,8 +1009,12 @@ static int dapm_power_widgets(struct snd_soc_codec *codec, int event) case SND_SOC_DAPM_STREAM_RESUME: sys_power = 1; break; + case SND_SOC_DAPM_STREAM_SUSPEND: + sys_power = 0; + break; case SND_SOC_DAPM_STREAM_NOP: sys_power = codec->bias_level != SND_SOC_BIAS_STANDBY; + break; default: break; }
-----Original Message----- From: Mark Brown [mailto:broonie@opensource.wolfsonmicro.com] Sent: Thursday, November 26, 2009 3:53 PM To: Aggarwal, Anuj Cc: 'Troy Kisky'; alsa-devel@alsa-project.org; linux-omap@vger.kernel.org; Arun KS Subject: Re: [alsa-devel] [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
On Thu, Nov 26, 2009 at 08:31:11AM +0530, Aggarwal, Anuj wrote:
I have already tried that but it doesn't fix the problem entirely. Still, I was not able to playback properly because the bias settings were not handled correctly and as you can see, it writes to some invalid registers too (all registers don't exist between 0 and
ARRAY_SIZE).
Please also note the following patch which is queued for 2.6.32:
commit 50b6bce59d154b5db137907a5c0ed45a4e7a3829 Author: Mark Brown broonie@opensource.wolfsonmicro.com Date: Mon Nov 23 13:11:53 2009 +0000
ASoC: Fix suspend with active audio streams
[Aggarwal, Anuj] I am also seeing one more problem with the original driver, although I am not sure if the problem is in the audio driver. When tried to capture, using NFS as storage, it gives overrun error and comes out with: arecord: pcm_read:1617: read error: Input/output error
It happen always after ~20 sec, file size ~5MB. Tried with multiple configurations in arecord but no use. When tried: arecord -f cd /dev/null, it works fine. Same issue doesn't come too when I try to store the captured audio file on a MMC card.
Any idea what could be the problem? Why arecord goes for a toss after a single overrun error and why it is happening always after ~20 sec? Is there something which can be tried to narrow down the problem?
On Thu, Nov 26, 2009 at 08:26:44PM +0530, Aggarwal, Anuj wrote:
driver, although I am not sure if the problem is in the audio driver. When tried to capture, using NFS as storage, it gives overrun error and comes out with: arecord: pcm_read:1617: read error: Input/output error
It happen always after ~20 sec, file size ~5MB. Tried with multiple configurations in arecord but no use. When tried: arecord -f cd /dev/null, it works fine. Same issue doesn't come too when I try to store the captured audio file on a MMC card.
Any idea what could be the problem? Why arecord goes for a toss after a single overrun error and why it is happening always after ~20 sec? Is there something which can be tried to narrow down the problem?
Sounds like you've narrowed the problem down to a performance issue with NFS writeout - it's probably having trouble keeping up with your I/O rate. This isn't 100% surprising with smaller systems, sometimes tuning the NFS configuration can resolve the issue but sometimes the hardware is just underspecified.
arecord is a pretty basic program and doesn't try terribly hard to recover from errors.
-----Original Message----- From: Mark Brown [mailto:broonie@opensource.wolfsonmicro.com] Sent: Thursday, November 26, 2009 8:41 PM To: Aggarwal, Anuj Cc: 'Troy Kisky'; alsa-devel@alsa-project.org; linux-omap@vger.kernel.org; Arun KS Subject: Re: [alsa-devel] [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
On Thu, Nov 26, 2009 at 08:26:44PM +0530, Aggarwal, Anuj wrote:
driver, although I am not sure if the problem is in the audio driver. When tried to capture, using NFS as storage, it gives overrun error and
comes out with:
arecord: pcm_read:1617: read error: Input/output error
It happen always after ~20 sec, file size ~5MB. Tried with multiple configurations in arecord but no use. When tried: arecord -f cd /dev/null, it works fine. Same issue doesn't come too when I try to store the captured audio file on a MMC card.
Any idea what could be the problem? Why arecord goes for a toss after a single overrun error and why it is happening always after ~20 sec? Is there something which can be tried to narrow down the problem?
Sounds like you've narrowed the problem down to a performance issue with NFS writeout - it's probably having trouble keeping up with your I/O rate. This isn't 100% surprising with smaller systems, sometimes tuning the NFS configuration can resolve the issue but sometimes the hardware is just underspecified.
[Aggarwal, Anuj] I am still surprised how this could be a NFS writeout issue as we are seeing a consistent read/write rate of 2Mbps over tftp. When dd command is used for read/write to further check NFS performance, 2Mbps for write and 4Mbps for read is observed. Does that still mean nfs is the culprit? What could be tweaked in audio/network driver to avoid this problem, any suggestions?
arecord is a pretty basic program and doesn't try terribly hard to recover from errors.
[Aggarwal, Anuj] Any other utility to try capture which does error recovering too?
The performance numbers are in Mbytes/sec. Sorry for the typo :(
-----Original Message----- From: linux-omap-owner@vger.kernel.org [mailto:linux-omap- owner@vger.kernel.org] On Behalf Of Aggarwal, Anuj Sent: Thursday, November 26, 2009 8:52 PM To: Mark Brown Cc: 'Troy Kisky'; alsa-devel@alsa-project.org; linux-omap@vger.kernel.org; Arun KS Subject: RE: [alsa-devel] [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
-----Original Message----- From: Mark Brown [mailto:broonie@opensource.wolfsonmicro.com] Sent: Thursday, November 26, 2009 8:41 PM To: Aggarwal, Anuj Cc: 'Troy Kisky'; alsa-devel@alsa-project.org; linux-
omap@vger.kernel.org;
Arun KS Subject: Re: [alsa-devel] [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
On Thu, Nov 26, 2009 at 08:26:44PM +0530, Aggarwal, Anuj wrote:
driver, although I am not sure if the problem is in the audio driver. When tried to capture, using NFS as storage, it gives overrun error
and
comes out with:
arecord: pcm_read:1617: read error: Input/output error
It happen always after ~20 sec, file size ~5MB. Tried with multiple configurations in arecord but no use. When tried: arecord -f cd /dev/null, it works fine. Same issue doesn't come too when I try to store the captured audio file on a MMC card.
Any idea what could be the problem? Why arecord goes for a toss after
a
single overrun error and why it is happening always after ~20 sec? Is there something which can be tried to narrow down the problem?
Sounds like you've narrowed the problem down to a performance issue with NFS writeout - it's probably having trouble keeping up with your I/O rate. This isn't 100% surprising with smaller systems, sometimes tuning the NFS configuration can resolve the issue but sometimes the hardware is just underspecified.
[Aggarwal, Anuj] I am still surprised how this could be a NFS writeout issue as we are seeing a consistent read/write rate of 2Mbps over tftp. When dd command is used for read/write to further check NFS performance, 2Mbps for write and 4Mbps for read is observed. Does that still mean nfs is the culprit? What could be tweaked in audio/network driver to avoid this problem, any suggestions?
arecord is a pretty basic program and doesn't try terribly hard to recover from errors.
[Aggarwal, Anuj] Any other utility to try capture which does error recovering too? -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 26, 2009 at 08:52:08PM +0530, Aggarwal, Anuj wrote:
[Aggarwal, Anuj] I am still surprised how this could be a NFS writeout issue as we are seeing a consistent read/write rate of 2Mbps over tftp. When dd command is used for read/write to further check NFS performance, 2Mbps for write and 4Mbps for read is observed. Does that still mean nfs is the culprit? What could be tweaked in audio/network driver to avoid this problem, any suggestions?
There can also be issues with the way the data gets pushed into NFS interacting poorly - it's not just the raw data rate that's in play here, it's also things like how often the writes are done and how big they are. Possibly also overhead from interacting with the ethernet chip but that's not normally an issue for anything modern.
The fact that this only happens when NFS is in use seems a fairly clear pointer to an interacton there.
[Aggarwal, Anuj] Any other utility to try capture which does error recovering too?
Not for the console off the top of my head, and TBH I don't really know how good the error handling is in the various apps. You could also try playing with the buffer size options in arecord.
-----Original Message----- From: Mark Brown [mailto:broonie@opensource.wolfsonmicro.com] Sent: Thursday, November 26, 2009 9:00 PM To: Aggarwal, Anuj Cc: 'Troy Kisky'; alsa-devel@alsa-project.org; linux-omap@vger.kernel.org; Arun KS Subject: Re: [alsa-devel] [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
On Thu, Nov 26, 2009 at 08:52:08PM +0530, Aggarwal, Anuj wrote:
[Aggarwal, Anuj] I am still surprised how this could be a NFS writeout
issue
as we are seeing a consistent read/write rate of 2Mbps over tftp. When
dd
command is used for read/write to further check NFS performance, 2Mbps
for write and 4Mbps for read is observed.
Does that still mean nfs is the culprit? What could be tweaked in
audio/network driver to avoid this problem, any suggestions?
There can also be issues with the way the data gets pushed into NFS interacting poorly - it's not just the raw data rate that's in play here, it's also things like how often the writes are done and how big they are. Possibly also overhead from interacting with the ethernet chip but that's not normally an issue for anything modern.
The fact that this only happens when NFS is in use seems a fairly clear pointer to an interacton there.
[Aggarwal, Anuj] We were able to fine tune NFS and use arecord to capture large files. But some more problems cropped up when tried to suspend / resume. Basic playback is working fine wrt suspend/resume. But capture, either tried independently or with playback, is creating a system wide hang. I fixed that infinite-loop in resume path but I believe something else needs cleanup too. Any pointers?
[Aggarwal, Anuj] Any other utility to try capture which does error recovering too?
Not for the console off the top of my head, and TBH I don't really know how good the error handling is in the various apps. You could also try playing with the buffer size options in arecord.
On Fri, Nov 27, 2009 at 01:37:54PM +0530, Aggarwal, Anuj wrote:
[Aggarwal, Anuj] We were able to fine tune NFS and use arecord to capture large files. But some more problems cropped up when tried to suspend / resume. Basic playback is working fine wrt suspend/resume. But capture, either tried independently or with playback, is creating a system wide hang. I fixed that infinite-loop in resume path but I believe something else needs cleanup too. Any pointers?
Have you got CONFIG_DETECT_SOFTLOCKUP turned on? With any luck that will at least pinpoint where the code has locked up. Otherwise it's going to be a case of bisection via printk().
A contrast and compare of the hardware initialisation on initial boot compared to coming out of suspend might identify the problem. If the playback is working it's *probably* something missing in the CPU init that's specific to capture.
On Fri, Nov 27, 2009 at 01:37:54PM +0530, Aggarwal, Anuj wrote:
[Aggarwal, Anuj] We were able to fine tune NFS and use arecord to capture large files. But some more problems cropped up when tried to suspend / resume. Basic playback is working fine wrt suspend/resume. But capture,
Incidentally, it'd be good to get the fixes for the CODEC driver (at least the infinite loop one) in sooner rather than later - there's a clear bug there which it'd good to try to get into 2.6.32.
On Thu, 2009-11-26 at 20:52 +0530, Aggarwal, Anuj wrote:
-----Original Message----- From: Mark Brown [mailto:broonie@opensource.wolfsonmicro.com] Sent: Thursday, November 26, 2009 8:41 PM To: Aggarwal, Anuj Cc: 'Troy Kisky'; alsa-devel@alsa-project.org; linux-omap@vger.kernel.org; Arun KS Subject: Re: [alsa-devel] [PATCH] ASoC: AM3517: Fix AIC23 suspend/resume hang
On Thu, Nov 26, 2009 at 08:26:44PM +0530, Aggarwal, Anuj wrote:
driver, although I am not sure if the problem is in the audio driver. When tried to capture, using NFS as storage, it gives overrun error and
comes out with:
arecord: pcm_read:1617: read error: Input/output error
It happen always after ~20 sec, file size ~5MB. Tried with multiple configurations in arecord but no use. When tried: arecord -f cd /dev/null, it works fine. Same issue doesn't come too when I try to store the captured audio file on a MMC card.
Any idea what could be the problem? Why arecord goes for a toss after a single overrun error and why it is happening always after ~20 sec? Is there something which can be tried to narrow down the problem?
Sounds like you've narrowed the problem down to a performance issue with NFS writeout - it's probably having trouble keeping up with your I/O rate. This isn't 100% surprising with smaller systems, sometimes tuning the NFS configuration can resolve the issue but sometimes the hardware is just underspecified.
[Aggarwal, Anuj] I am still surprised how this could be a NFS writeout issue as we are seeing a consistent read/write rate of 2Mbps over tftp. When dd command is used for read/write to further check NFS performance, 2Mbps for write and 4Mbps for read is observed. Does that still mean nfs is the culprit? What could be tweaked in audio/network driver to avoid this problem, any suggestions?
My two cents... could also try connecting the target directly to your host machine (using nfs), through a cross-cable (one with the Tx and Rx crossed). If no errors are seen, then it surely must be an attribution of network congestion.
Regards, Philby
participants (5)
-
Aggarwal, Anuj
-
Anuj Aggarwal
-
Mark Brown
-
Philby John
-
Troy Kisky