On Mon, 15 May 2023 16:49:48 +0200, Amadeusz Sławiński wrote:
On 5/15/2023 3:02 PM, Takashi Iwai wrote:
On Mon, 15 May 2023 13:19:29 +0200, Amadeusz Sławiński wrote:
On 5/12/2023 2:24 PM, Takashi Iwai wrote:
On Fri, 12 May 2023 14:00:54 +0200, Amadeusz Sławiński wrote:
On 5/12/2023 1:33 PM, Takashi Iwai wrote:
On Fri, 12 May 2023 13:23:49 +0200, Takashi Iwai wrote: > > On Thu, 11 May 2023 19:20:17 +0200, > Amadeusz Sławiński wrote: >> >> On 5/11/2023 5:58 PM, Takashi Iwai wrote: >>> On Thu, 11 May 2023 17:31:37 +0200, >>> Amadeusz Sławiński wrote: >>>> >>>> On 5/10/2023 2:21 PM, Takashi Iwai wrote: >>>>> On Tue, 09 May 2023 12:10:06 +0200, >>>>> Amadeusz Sławiński wrote: >>>> Then capture stream starts and seems to assume that >>>> registers were already set, so it doesn't write them to hw. >>> >>> ... it seems this didn't happen, and that's the inconsistency. >>> >>> So the further question is: >>> At the point just before you start recording, is the codec in runtime >>> suspended? Or it's running? >>> >>> If it's runtime-suspended, snd_hda_regmap_sync() must be called from >>> alc269_resume() via runtime-resume, and this must write out the >>> cached values. Then the bug can be along with that line. >>> >>> Or if it's running, it means that the previous check of >>> snd_hdac_keep_power_up() was bogus (or racy). >>> >> >> Well, it is in... let's call it semi powered state. When snd_hda_intel >> driver is loaded with power_save=X option it sets timeout to X seconds >> and problem only happens when I start the stream before those X >> seconds pass and it runs first runtime suspend. After it suspends it >> then uses standard pm_runtime_resume and works correctly. That's why >> the pm_runtime_force_suspend(&codec->core.dev); mentioned in first >> email in thread "fixes" the problem, as it forces it to be instantly >> suspended instead of waiting for timeout and then later normal >> resume-play/record-suspend flow can be followed. > > Hm, then maybe it's a bad idea to rely on the usage count there. > Even if the usage is 0, the device can be still active, and the update > can be missed. > > How about the patch like below?
Scratch that, it returns a wrong value. A simpler version like below works instead?
Yes it was broken, arecord didn't even start capturing ;)
Takashi
--- a/sound/hda/hdac_device.c +++ b/sound/hda/hdac_device.c @@ -611,10 +611,9 @@ EXPORT_SYMBOL_GPL(snd_hdac_power_up_pm); int snd_hdac_keep_power_up(struct hdac_device *codec) { if (!atomic_inc_not_zero(&codec->in_pm)) {
int ret = pm_runtime_get_if_in_use(&codec->dev);
if (!ret)
if (!pm_runtime_active(&codec->dev)) return -1;
if (ret < 0)
if (pm_runtime_get_sync(&codec->dev) < 0) return 0; } return 1;
This one seems to work, as in I'm able to record before first suspend hits. However device stays in D0 when no stream is running... # cat /sys/devices/pci0000:00/0000:00:0e.0/power_state D0
OK, one step forward. The previous change was bad in anyway, as we shouldn't sync there at all.
So, the problem becomes clearer now: it's in the lazy update mechanism that misses the case that has to be written.
Scratch the previous one again, and could you try the following one instead?
Takashi
--- a/sound/hda/hdac_regmap.c +++ b/sound/hda/hdac_regmap.c @@ -293,8 +293,17 @@ static int hda_reg_write(void *context, unsigned int reg, unsigned int val) if (verb != AC_VERB_SET_POWER_STATE) { pm_lock = codec_pm_lock(codec);
if (pm_lock < 0)
return codec->lazy_cache ? 0 : -EAGAIN;
if (pm_lock < 0) {
/* skip the actual write if it's in lazy-update mode
* and only if the device is actually suspended;
* the usage count can be zero at transition phase
* (either suspending/resuming or auto-suspend sleep)
*/
if (codec->lazy_cache &&
pm_runtime_suspended(&codec->dev))
return 0;
return -EAGAIN;
} if (is_stereo_amp_verb(reg)) {}
With this one we are back to same behavior as without it. When capture is started before first suspend it records silence. After waiting for timeout and suspend it records correctly.
Hm, interesting. Does it mean that the pm_runtime_get_if_in_use() (in snd_hdac_keep_power_up()) returns a non-zero value? Or is pm_runtime_suspended() returns really true there?
So I've tested with vanilla kernel, where pm_runtime_get_if_in_use returns -22 until loaded and then 13 times "0" until arecord.
With above patch it returns 13 times "0" and then one more time "1".
pm_runtime_suspended() returns 0 (needed to modify patch a bit)
Patch:
diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c index 035b720bf602..62880952e398 100644 --- a/sound/hda/hdac_device.c +++ b/sound/hda/hdac_device.c @@ -612,6 +612,7 @@ int snd_hdac_keep_power_up(struct hdac_device *codec) { if (!atomic_inc_not_zero(&codec->in_pm)) { int ret = pm_runtime_get_if_in_use(&codec->dev);
pr_err("DEBUG:%s:%d %s ret=%d\n", __FILE__, __LINE__,
__func__, ret); if (!ret) return -1; if (ret < 0) diff --git a/sound/hda/hdac_regmap.c b/sound/hda/hdac_regmap.c index fe3587547cfe..d6cf3fa2d4e7 100644 --- a/sound/hda/hdac_regmap.c +++ b/sound/hda/hdac_regmap.c @@ -293,8 +293,19 @@ static int hda_reg_write(void *context, unsigned int reg, unsigned int val)
if (verb != AC_VERB_SET_POWER_STATE) { pm_lock = codec_pm_lock(codec);
if (pm_lock < 0)
return codec->lazy_cache ? 0 : -EAGAIN;
if (pm_lock < 0) {
bool x;
/* skip the actual write if it's in lazy-update mode
* and only if the device is actually suspended;
* the usage count can be zero at transition phase
* (either suspending/resuming or auto-suspend
sleep)
*/
x = pm_runtime_suspended(&codec->dev);
pr_err("DEBUG: %s:%d x = %d\n", __FILE__,
__LINE__, x);
if (codec->lazy_cache && x)
return 0;
return -EAGAIN;
} } if (is_stereo_amp_verb(reg)) {
Part of vanilla dmesg (contains only first chunk):
(snip)
Part of fully patched dmesg:
(snip)
[ 79.556191] DEBUG:sound/hda/hdac_device.c:615 snd_hdac_keep_power_up ret=0 [ 79.556234] DEBUG:sound/hda/hdac_device.c:615 snd_hdac_keep_power_up ret=0
If here ret==0, hdac_keep_power_up() should return -1, and it's the return value of codec_pm_lock(). So it must print out the value of "x" (pm_runtime_suspend() result), but I don't see it. What's the missing piece...?
I think there are two problems:
- After probe codec is powered down
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/soun...), even though according to power management it is still running
I guess it's in the auto-suspend state, so it's still not suspended but the device itself is active, while the usage count is 0.
That's fine, and I thought my second patch handling it. That is, if the usage count is 0 and the device is not suspended, it should return -EAGAIN and make the caller retry with the full power up. The code path is with CALL_RUN_FUNC() macro in hdac_regmap.c, and with -EAGAIN return value, it tries snd_hdac_power_up_pm() and call the function again.
- When stream is started before first suspend, resume function
doesn't run and it is a function which syncs cached registers. By resume function I mean https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/soun... which calls snd_hda_regmap_sync() or through in case of the platform I test it on codec->patch_ops.resume(codec) -> alc269_resume, which also calls snd_hda_regmap_sync().
It's also expected, per se. Since it's been not suspended, it assumes that the value got already written, and no resume is needed.
Takashi