Re: HDA, power saving and recording

18 May 2023

On Thu, 18 May 2023 11:00:54 +0200,
Cezary Rojewski wrote:
...
On 2023-05-17 4:52 PM, Takashi Iwai wrote:
...
On Wed, 17 May 2023 15:15:13 +0200,
Cezary Rojewski wrote:
...
...
...
After reading this conversation few times I came to conclusion that
codec device should be kept up as long as its runtime_status=0
(RPM_ACTIVE), regardless if usage_count is 0 or not. Basically, in
autosuspend case usage_count and runtime_status for the device are
both 0 so, if we are not ignoring the former by calling
pm_runtime_get_if_in_use() then we end up caching the writes during
the autosuspend period - period when the device is no longer used but
there is still some time left before it's suspended.

--- a/sound/hda/hdac_device.c
+++ b/sound/hda/hdac_device.c
@@ -611,7 +611,7 @@ EXPORT_SYMBOL_GPL(snd_hdac_power_up_pm);
  int snd_hdac_keep_power_up(struct hdac_device *codec)
  {
         if (!atomic_inc_not_zero(&codec->in_pm)) {

          int ret = pm_runtime_get_if_in_use(&codec->dev);




          int ret = pm_runtime_get_if_active(&codec->dev, true);
           if (!ret)
                   return -1;
           if (ret < 0)





Results for the above look good.
OK, this seems to be a workaround.
I took a deeper look at the issue now, and noticed that it's a messy
problem.
While we want to address this issue first - I've even messaged you
about this for the very first time early 2021 :D but it's not marked
as high prio so it remained unaddressed till now - me and Amadeo spent
some time analyzing most of the pm-related code for sound/hda and we
believe most of it could be replaced by native pm_runtime_xxx code and
fields such as ->in_pm could be dropped.
It's a bit tricky.  The in_pm refcount check is mandatory because the
very same code path is called recursively for enabling itself.
...
However, this won't take a
day or two and it's best if first we get to know the background
what/why/when some of those specific functions/members exist in the
hda code.
Yeah, I'd love to clean up / fix the stuff, but this is in some
sensitive area, so let's get it carefully.
...
...
The check with pm_runtime_get_if_in_use() itself isn't wrong, but it
needs the exceptional handling.
In addition to that, however, we need to work around the case where
the register gets updated twice with -EAGAIN handling; at the first
update, the regcache gets updated while the actual write gives an
error.  Then at the second update, it checks only the cache and
believes as if it's been already written, and the write is skipped.
This was what Amadeusz experienced with my previous patch.
So, we need to paper over those two.
OTOH, if we replace the primary check with
pm_runtime_get_if_active(true), this makes the things *mostly*
working, too.  This increases the usage count forcibly when the device
is active, and we'll continue to write the register.
The caveat is that the auto-suspend timer is still ticking in this
case.  But, this won't be a big problem, as the timer callback does
check the state and cancel itself if the device is actually in use.
So, I guess we should go for pm_runtime_get_if_in_use().
But please give it more tests.
I believe you meant pm_runtime_get_if_active(true) in the last one. If
yes, then indeed I'm delaying the patch upload until more tests are
run.
Ah correct, sorry for the typo, I mean *_if_active().
...
Once again, thank you for the input. Ramping up and addressing this
problem wouldn't happen so quickly without your guidance.
I'm going to submit your change as a proper fix patch soon later.
Then let's dig down and tidy the rest things up.
thanks,
Takashi