I do not see issue #3063 on my side. No initialization failure or timeout has occurred.
It's rather random; we've only seen the error in long daily tests.
I'm now trying to solve the issue with the max98373_io_init() function, as suggested, instead of adding regmap_cache_dirty() in the suspend function. max98373_io_init() was not called from max98373_update_status() on audio resume because max98373->hw_init was 1 and the status was SDW_SLAVE_ATTACHED. max98373_update_status() does not get SDW_SLAVE_UNATTACHED. I confirmed that the issue can be resolved if the SDW_SLAVE_UNATTACHED event arrives at max98373_update_status() before SDW_SLAVE_ATTACHED is triggered. sdw_handle_slave_status() actually does get SDW_SLAVE_UNATTACHED, but the function exits at https://github.com/thesofproject/linux/blob/topic/sof-dev/drivers/soundwire/... before reaching https://github.com/thesofproject/linux/blob/topic/sof-dev/drivers/soundwire/... I'm not sure how to solve this issue because this code is shared with other SoundWire drivers as well.
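To make the gating described above concrete, here is a minimal userspace model, not the driver code itself. The `model_*` names are hypothetical, and clearing `hw_init` on UNATTACHED is an assumption inferred from the observation that an UNATTACHED event before ATTACHED makes the problem go away. The status values 0 and 1 follow the debug traces in this thread.

```c
#include <stdbool.h>

/* Userspace model only: names mirror the driver, but nothing here is kernel code. */
enum model_status { MODEL_UNATTACHED = 0, MODEL_ATTACHED = 1 };

struct model_amp {
	bool hw_init;      /* stands in for max98373->hw_init */
	int io_init_calls; /* counts modelled max98373_io_init() runs */
};

/* Models the update_status() gating: I/O init runs only while hw_init is clear. */
static int model_update_status(struct model_amp *amp, enum model_status status)
{
	/* Assumption: an UNATTACHED report is what clears hw_init. */
	if (status == MODEL_UNATTACHED)
		amp->hw_init = false;

	/* Already initialized, or not attached: nothing to do. */
	if (amp->hw_init || status != MODEL_ATTACHED)
		return 0;

	/* Stand-in for the register writes done by max98373_io_init(). */
	amp->io_init_calls++;
	amp->hw_init = true;
	return 0;
}
```

In this model, an ATTACHED report that arrives while `hw_init` is still set is a no-op, which matches the resume behaviour reported above. Re-initialization only happens if UNATTACHED is delivered first.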
There may be some confusion here.
The SoundWire spec says the device will show up as Device #0. That means status[0] = ATTACHED.
The driver reads the devID registers and programs the device number N. The device will then report as device #N in PING frames. The controller hardware will detect that device and call the function to update the status a second time.
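The two-step sequence described above can be sketched as a small userspace model. This is a simplification under stated assumptions, not the bus code: the `model_*` names are hypothetical, only one device is tracked, and the callback is counted only for ATTACHED reports.

```c
/* Userspace model only; status[] is indexed by device number, as in a PING snapshot. */
enum model_status { MODEL_UNATTACHED = 0, MODEL_ATTACHED = 1 };

struct model_dev {
	int dev_num;      /* 0 means "not enumerated yet" */
	int update_calls; /* modelled update_status() invocations */
};

/* One pass over a status snapshot; assigned_num is the number N to program. */
static void model_handle_status(struct model_dev *dev,
				const enum model_status *status,
				int assigned_num)
{
	if (status[0] == MODEL_ATTACHED) {
		/* Device #0 is present: program number N, then wait for the next report. */
		dev->dev_num = assigned_num;
		return;
	}

	/* No callback for a device that has no non-zero number yet. */
	if (dev->dev_num == 0)
		return;

	/* Second report: the device now answers as Device #N. */
	if (status[dev->dev_num] == MODEL_ATTACHED)
		dev->update_calls++;
}
```

With this model, the first snapshot (status[0] = ATTACHED) only assigns the device number, and the callback fires on the second snapshot, once the device reports under its new number.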
I'm sharing the debug messages for the resume event for your reference.
[ 127.490644] [DEBUG3] intel_resume_runtime
[ 127.490655] [DEBUG3] intel_resume_runtime SDW_INTEL_CLK_STOP_BUS_RESET
[ 127.490658] [DEBUG3] intel_init
[ 127.490660] [DEBUG3] intel_link_power_up
[ 127.490977] [DEBUG3] intel_resume_runtime SDW_UNATTACH_REQUEST_MASTER_RESET ..
[ 127.490980] [DEBUG4] sdw_clear_slave_status request: 1
[ 127.490983] [DEBUG4] sdw_modify_slave_status, ID:7, status: 0
[ 127.490986] [DEBUG4] sdw_modify_slave_status, ID:3, status: 0
[ 127.490994] [DEBUG3] intel_shim_wake wake_enable:0
[ 127.491060] [DEBUG3] intel_shim_wake wake_enable:0
[ 127.491191] [DEBUG] max98373_resume, first_hw_init: 1, unattach_request: 1
[ 127.491194] [DEBUG] max98373_resume, INF MODE: 0
[ 127.491953] [DEBUG4] sdw_handle_slave_status IN
[ 127.491956] [DEBUG4] sdw_handle_slave_status, status[1] : 0, slave->status: 0, id:7 // UNATTACHED
[ 127.491958] [DEBUG4] sdw_handle_slave_status, status[2] : 0, slave->status: 0, id:3
[ 127.491960] [DEBUG4] sdw_handle_slave_status IN2 status[0] = 1
[ 127.492808] [DEBUG4] sdw_handle_slave_status IN
[ 127.492810] [DEBUG4] sdw_handle_slave_status, status[1] : 1, slave->status: 0, id:7 // ATTACHED
[ 127.492812] [DEBUG4] sdw_handle_slave_status, status[2] : 1, slave->status: 0, id:3
[ 127.492814] [DEBUG4] sdw_handle_slave_status IN2 status[0] = 0
[ 127.492816] [DEBUG4] sdw_handle_slave_status IN3
[ 127.492818] [DEBUG4] sdw_handle_slave_status status[1] = SDW_SLAVE_ATTACHED, slave->status : 0, slave:7, prev_status:0
[ 127.492820] [DEBUG4] sdw_modify_slave_status, ID:7, status: 1
[ 127.493008] [DEBUG4] sdw_update_slave_status update_status(1) IN slave:7
[ 127.493010] [DEBUG4] sdw_update_slave_status update_status(1) OUT
[ 127.493012] [DEBUG] max98373_update_status IN hw_init:1, status: 1, slave :7
[ 127.493015] [DEBUG] max98373_update_status IN2 hw_init:1, max98373->first_hw_init: 1, status: 1
[ 127.493017] [DEBUG4] sdw_handle_slave_status status[2] = SDW_SLAVE_ATTACHED, slave->status : 0, slave:3, prev_status:0
[ 127.493019] [DEBUG4] sdw_modify_slave_status, ID:3, status: 1
[ 127.493199] [DEBUG4] sdw_update_slave_status update_status(1) IN slave:3
[ 127.493201] [DEBUG4] sdw_update_slave_status update_status(1) OUT
[ 127.493204] [DEBUG] max98373_update_status IN hw_init:1, status: 1, slave :3
[ 127.493207] [DEBUG] max98373_update_status IN2 hw_init:1, max98373->first_hw_init: 1, status: 1
I don't really see anything in this sequence that differs from my explanations?
update_status() is only called when the device has a non-zero device number.
There may be a real problem with update_status() not being called, but I just don't see it so far.
One way to improve the traces would be to use dev_dbg(); that way we'd have a trace of which device is being handled. There are two devices managed by the same driver, so a trace with pr_debug() doesn't tell us much.