Re: [alsa-devel] [PATCH 2/2] ASoC: DA7219: Implement error check on reg read and write

6 Dec 2018


      On Wed, Dec 05, 2018 at 10:50:29AM -0700, Daniel Kurtz wrote:
...
I agree, there is no guarantee here once things have gone wrong, and
the concerns above are reasonable.  However, in the real world, I2C
transactions do sometimes fail for various reasons.  The I2C (and
It's *vanishingly* rare and like I say we don't have any constructive
recovery plans - I've never seen a practical system that had any better
idea than hoping the user noticed a problem and reboots.  Probably the
best we can really do is try to reset the device, or at least resync the
register map.  That would probably do something useful in the
vanishingly rare case where it were a singular glitch, at the cost of
obvious and painful user visible issues (which would probably be
happening anyway).  The nearest I've seen to that is one of the CODEC
drivers which has watchdog code to monitor the device in case it
spontaneously reboots and will resync the register cache if that happened.
I'm gathering that there are Chromebooks that do have I/O problems here?
...
other bus) APIs have ways of reporting up their errors so callers can
take appropriate action.  This codec driver can run on all kinds of
hardware that can experience transient I2C errors, thus it sounds like
a reasonable idea to have the driver do some error checking on the
APIs it calls and take whatever action it can.  Just ignoring the
errors and proceeding like nothing is wrong is one option, but we can
probably do a little better by at least checking for errors, abort the
current operation, and pass up errors to higher layers when an i2c
transaction failure is detected.  If nothing else, this would enable
higher policy layers to take appropriate corrective action - for
example, if there is an i2c error when configuring a codec, it seems
advisable to report this up so that a machine driver would know to
abort and not turn on downstream amplifiers [I am assuming here that
something like this would happen, I don't know if the sound stack
really works this way].
Yeah, so this is partly why I'm not super thrilled here - none of the
layers currently have any especially bright ideas about how to handle
things and giving up part way through an operation and returning an
error without any attempt to unwind or recover feels like it's just
trying to give a false sense of security.
...
Once the default "check, abort and report" error checking is in place,
we could perhaps think about actions that the driver could take to
recover from various failures, such as resetting the device or
unwinding previous transactions before aborting, or retrying....  but
those are all policy, and this patch is more mechanism that enables
policy.
But is this going to be a useful way of handling such policy and is any
work on that from whoever has these unreliable systems likely to be
forthcoming?  We're going to end up with partially done reconfigurations
in the register cache which is a potential issue for recovery strategies
based around resyncing that, it's not so bad on things like starting a
stream but bias level reconfigurations could be fun.

Re: [alsa-devel] [PATCH 2/2] ASoC: DA7219: Implement error check on reg read and write

Mark Brown