On 2021/04/22 Lucas Stach <l.stach@pengutronix.de> wrote:
But it is the runtime_auto_suspend timer, rather than the hardware, that decides when to enter runtime suspend, while the transfer data size and the transfer rate on the IP bus decide when the DMA interrupt happens.
But it isn't the hardware that decides to drop the rpm refcount to 0 and start the autosuspend timer, it's the driver.
Yes, the driver should never let the rpm refcount drop to 0 in such a case. But I think that case is a common one for cyclic DMA with runtime_auto_suspend, so some DMA drivers, such as qcom/bam_dma.c, also add pm_runtime_get_sync()/pm_runtime_put_autosuspend() in the interrupt handler for safety, rather than only pm_runtime_mark_last_busy().
Generally, we can call pm_runtime_get_sync(fsl_chan->dev)/pm_runtime_mark_last_busy() in the interrupt handler and hope that the runtime_auto_suspend timer expires later than the next interrupt arrives. But if the transfer data size in cyclic mode is large and the transfer rate is very slow, like 115200 or lower on a UART, the fixed autosuspend timer of 100ms/200ms may not be enough. Hence, runtime suspend may execute while the DMA interrupt is triggered and caught by the GIC (with the interrupt handler held off by spin_lock_irqsave() in pm_suspend_timer_fn()),
and then the interrupt handler starts to run after runtime suspend.
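To put rough numbers on the slow-UART case, here is a small back-of-the-envelope sketch. It assumes 8N1 framing (10 bits on the wire per data byte), which is my assumption and not something stated above, and the helper names are made up for illustration only.

```c
#include <assert.h>

/* Assumption: 8N1 framing, i.e. 10 bits on the wire per data byte. */
#define BITS_PER_BYTE_ON_WIRE 10u

/*
 * Hypothetical helper: time in milliseconds (rounded down) needed to
 * move one cyclic period of period_bytes at the given baud rate.
 */
static unsigned int period_time_ms(unsigned int period_bytes, unsigned int baud)
{
	assert(baud > 0);
	/* 64-bit intermediate avoids overflow for large periods. */
	return (unsigned int)(((unsigned long long)period_bytes *
			       BITS_PER_BYTE_ON_WIRE * 1000ull) / baud);
}

/*
 * Hypothetical helper: non-zero if the autosuspend delay can expire
 * before the next period interrupt arrives.
 */
static int timer_can_expire_first(unsigned int period_bytes, unsigned int baud,
				  unsigned int autosuspend_delay_ms)
{
	return period_time_ms(period_bytes, baud) > autosuspend_delay_ms;
}
```

Under that framing assumption, a 2048-byte period at 115200 takes about 177 ms, which is already longer than a 100 ms autosuspend delay, while a 1024-byte period (about 88 ms) would still fit.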
If your driver code drops the rpm refcount to 0 and starts the autosuspend timer while a cyclic transfer is still in flight, this is clearly a bug. Autosuspend is not there to paper over driver bugs, but to amortize the cost of actually suspending and resuming the hardware. Your driver code must still work even if the timeout is 0, i.e. the hardware is immediately suspended after you drop the rpm refcount to 0.
If you still have transfers queued/in-flight the driver code must keep a rpm reference.
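The rule above can be illustrated with a tiny userspace model. This is only a sketch and not kernel code: struct model_dev and all model_* functions are hypothetical stand-ins, and the put path deliberately models an autosuspend delay of 0, so the "hardware" suspends as soon as the count reaches 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a device's runtime-PM state. */
struct model_dev {
	int usage_count;	/* models the rpm refcount */
	bool suspended;		/* true once the "hardware" is powered down */
};

/* Models pm_runtime_get_sync(): take a reference and resume if needed. */
static void model_get(struct model_dev *dev)
{
	dev->usage_count++;
	dev->suspended = false;
}

/* Models a put with autosuspend delay 0: suspend at once when the count hits 0. */
static void model_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0)
		dev->suspended = true;
}

/* The channel takes a reference when a transfer is issued ... */
static void model_issue_pending(struct model_dev *dev)
{
	model_get(dev);
}

/* ... and only drops it when the channel is terminated. */
static void model_terminate_all(struct model_dev *dev)
{
	model_put(dev);
}

/* Period interrupt: returns true if the registers were still accessible. */
static bool model_irq(struct model_dev *dev)
{
	return !dev->suspended;
}
```

With the reference held from model_issue_pending() to model_terminate_all(), a short-lived get/put pair inside the interrupt path never brings the count to 0, so model_irq() always sees a resumed device even though the modelled timeout is 0.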
Yes, that's what I proposed as a fix before, as below: 'I have a simple workaround that disables runtime suspend in the issue_pending worker by calling pm_runtime_forbid() and then enables runtime auto suspend in dmaengine_terminate_all, so that we can simply regard the edma channel as always runtime resumed between issue_pending and channel termination and ignore the above interrupt handler/scu-pd limitation.'