Guenter Roeck groeck@google.com 於 2020年7月3日 週五 下午11:58寫道:
On Fri, Jul 3, 2020 at 3:56 AM Enric Balletbo i Serra enric.balletbo@collabora.com wrote:
Hi Yu-Hsuan,
On 3/7/20 11:40, Yu-Hsuan Hsu wrote:
Enric Balletbo i Serra enric.balletbo@collabora.com 於 2020年7月3日 週五 下午5:19寫道:
Hi Yu-Hsuan,
On 3/7/20 10:48, Yu-Hsuan Hsu wrote:
Enric Balletbo i Serra enric.balletbo@collabora.com 於 2020年7月3日 週五 下午4:38寫道:
Hi Yu-Hsuan,
Thank you for your patch
On 3/7/20 9:19, Yu-Hsuan Hsu wrote: > Log results of failed EC commands to identify a problem more easily. > > Replace cros_ec_cmd_xfer_status with cros_ec_cmd_xfer because the result > has already been checked in this function. The wrapper is not needed. >
Nack, we did an effort to remove all public users of cros_ec_cmd_xfer() in favour of cros_ec_cmd_xfer_status() and you are reintroducing again. You can do the same but using cros_ec_cmd_xfer_status(). In fact, your patch will not build on top of the upcoming changes.
Thanks! But I have a question about implementing it. Does it look like the one below? ret = cros_ec_cmd_xfer_status(ec_dev, msg); if (ret < 0) {
In this case will already print an error.
What are you trying to achieve?
If the only reason is of this patch is print a message you should either, or enable dynamic printk and enable dev_dbg or event better use the kernel trace functionality. There is no need to be more verbose.
Example: $ echo 1 > /sys/kernel/debug/tracing/events/cros_ec/enable $ cat /sys/kernel/debug/tracing/trace
369.416372: cros_ec_request_start: version: 0, command: EC_CMD_USB_PD_POWER_INFO 369.420528: cros_ec_request_done: version: 0, command:
EC_CMD_USB_PD_POWER_INFO, ec result: EC_RES_SUCCESS, retval: 16
Cheers, Enric
Thank Enric,
The situation is that some users encountered errors on ChromeBook.
And, aren't you able to reproduce the issue?
From their feedback reports, we only get the message like 'cros-ec-codec GOOG0013:00: ASoC: Failed to set DAI format: -71'. We know that -71 is -EPROTO but it is not clear enough for us to find out the root cause. That's why we want the detail of the result.
If I am not mistaken this ends calling i2s_rx_set_daifmt() into the EC firmware, if the result is -EPROTO that means is not returning EC_RES_SUCCESS, so there are few options:
if (i2s_rx_enabled) return EC_RES_BUSY; if (daifmt >= EC_CODEC_I2S_RX_DAIFMT_COUNT) return EC_RES_INVALID_PARAM; if (audio_codec_i2s_rx_set_daifmt(daifmt) != EC_SUCCESS) return EC_RES_ERROR;
Because the situation happens on users' side, it is not possible for them to enable kernel trace (ChromeOS does not allow users to touch kernel).
Are you sure that when you know the error code you'll find the root cause (without adding more prints)? There is only three possibilities? You can't start adding prints just to debug a user issue because you don't allow to be more verbose. I understand that might help you but is not the way to go.
Hi Enric and Guenter,
Thanks for your inspiring comments. I'm not sure whether we will find the root cause if I know the error code. But I think it's not a point. We wanted to add this error log because we found that the current one is not enough. Since it is a real error, it would be better if we can make it more detailed, right? In addition, we thought it would be helpful in the future as well. That's why we chose to upstream instead of merging into our source tree only.
You should really reproduce the issue yourself an use actual debug tools/prints./traces
We are trying but still unable to reproduce this issue. However, as I maintained above, it is not a main concern of this change.
Another possibility would be to change cros_ec_cmd_xfer_status() to return a more granular error code, such as -EINVAL for EC_RES_INVALID_PARAM, -EBUSY for EC_RES_BUSY, -EINPROGRESS for EC_RES_IN_PROGRESS, -ETIMEDOUT for EC_RES_TIMEOUT, -EOVERFLOW for EC_RES_OVERFLOW, -ENODATA for EC_RES_UNAVAILABLE, and so on.
Since there are many kinds of results from EC, why not just make users able to check on their own? For example, users can wait and try again if the result is EC_RES_BUSY.
However, it appears that the various low level functions already replace various EC error codes with a blank EC_RES_ERROR. No amount of logging will tell us what exactly went wrong in those functions. Lucky for us, audio_codec_i2s_rx_set_daifmt() only ever returns EC_SUCCESS, so we know that the problem is either that i2s_rx_enabled is true or that daifmt is too large. None of those really warrants more verbose logging.
From the context, my personal bet is that i2s_rx_enabled is true: I don't immediately see how disabling it is enforced before trying to set the DAI format, and I don't see how "daifmt >= EC_CODEC_I2S_RX_DAIFMT_COUNT" can ever be true.
I totally agree. According to the source, it seems that both path are impossible. I'm not really understand the whole path but is it possible for EC to return other results? I will do more tests and look carefully into the source. Really thanks for your suggestions.
Cheers, Yu-Hsuan
Guenter
Guenter
Cheers, Enric
The other way we thought is changing dev_dbg to dev_err in cros_ec_cmd_xfer_status. But we are not sure whether it is also an error for other usages.
if (ret == -EPROTO) dev_err(..., msg->result) goto error; } I'm not sure whether it makes sense to check ret == -EPROTO here.
> Signed-off-by: Yu-Hsuan Hsu yuhsuan@chromium.org > --- > sound/soc/codecs/cros_ec_codec.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/sound/soc/codecs/cros_ec_codec.c b/sound/soc/codecs/cros_ec_codec.c > index 8d45c628e988e..a4ab62f59efa6 100644 > --- a/sound/soc/codecs/cros_ec_codec.c > +++ b/sound/soc/codecs/cros_ec_codec.c > @@ -90,10 +90,17 @@ static int send_ec_host_command(struct cros_ec_device *ec_dev, uint32_t cmd, > if (outsize) > memcpy(msg->data, out, outsize); > > - ret = cros_ec_cmd_xfer_status(ec_dev, msg); > + ret = cros_ec_cmd_xfer(ec_dev, msg); > if (ret < 0) > goto error; > > + if (msg->result != EC_RES_SUCCESS) { > + dev_err(ec_dev->dev, "Command %d failed: %d\n", cmd, > + msg->result); > + ret = -EPROTO; > + goto error; > + } > + > if (insize) > memcpy(in, msg->data, insize); > >