pcm|dmaengine|imx-sdma race condition on i.MX6
Hi, we've found a race condition with the PCM on the i.MX6 which results in an -EIO for the SNDRV_PCM_IOCTL_READI_FRAMES ioctl after an -EPIPE (XRUN).
A possible reproduction may look like the following reduced call graph during a PCM capture:
us -> ioctl(SNDRV_PCM_IOCTL_READI_FRAMES)
        - wait_for_avail()
          - schedule_timeout()
   -> snd_pcm_update_hw_ptr0()
        - snd_pcm_update_state: EPIPE (XRUN)
          - sdma_disable_channel_async()   # gets scheduled away due to sleep
us <- ioctl(SNDRV_PCM_IOCTL_READI_FRAMES) returns -EPIPE
us -> ioctl(SNDRV_PCM_IOCTL_PREPARE)       # as reaction to the EPIPE (XRUN)
us -> ioctl(SNDRV_PCM_IOCTL_READI_FRAMES)  # next try to capture frames
        - sdma_prep_dma_cyclic()
          - sdma_load_context()            # not loaded as context_loaded is 1
        - wait_for_avail()
          - schedule_timeout()
            # now the sdma_channel_terminate_work() comes back and sets
            # context_loaded = false and frees in vchan_dma_desc_free_list()
us <- ioctl returns -EIO (capture write error (DMA or IRQ trouble?))
What we have found out, based on our understanding: The dmaengine documentation states that a dmaengine_terminate_async() must be followed by a dmaengine_synchronize(). However, in pcm_dmaengine.c, only dmaengine_terminate_async() is called (for performance reasons and because it might be called from an interrupt handler).
In our tests, we saw that user space immediately calls ioctl(SNDRV_PCM_IOCTL_PREPARE) as a reaction to the XRUN (the previous ioctl(SNDRV_PCM_IOCTL_READI_FRAMES) returns with -EPIPE). In our case (imx-sdma.c), the termination really happens asynchronously in a worker thread, which is not awaited/synchronized by the ioctl(SNDRV_PCM_IOCTL_PREPARE) call.
Since the syscall immediately enters an atomic context (snd_pcm_stream_lock_irq()), we are not able to flush the work of the termination worker from within the DMA context. This leads to an unterminated DMA getting re-initialized and then terminated.
On the i.MX6 platform the problem is (if I got it correctly) that the sdma_channel_terminate_work() called after the -EPIPE gets scheduled away (for the 1-2ms sleep [1]). During that time, user space already issues the ioctl(SNDRV_PCM_IOCTL_PREPARE) and ioctl(SNDRV_PCM_IOCTL_READI_FRAMES). As neither of them is synchronized with the terminate_worker, the vchan_dma_desc_free_list() [2] and "sdmac->context_loaded = false;" [3] are executed during the wait_for_avail() [4] of the ioctl(SNDRV_PCM_IOCTL_READI_FRAMES).
To make sure we identified the problem correctly, we tested adding a "dmaengine_synchronize()" before the snd_pcm_prepare() in [5]. This fixed the race condition in all our tests. (Before that, we were able to reproduce it in 100% of the test runs.)
Based on our understanding, there are two different points to ensure the termination: either ensure that the termination is finished within the previous SNDRV_PCM_IOCTL_READI_FRAMES call (inside the DMA context), or finish it in the SNDRV_PCM_IOCTL_PREPARE call (and all other applicable ioctls) before entering the atomic context (from the PCM context).
We initially thought about implementing the first approach, basically splitting up the dma_device terminate_all operation into a sync (busy-wait) and an async one. This would align the operations with the DMAengine interface and would enable a sync termination variant from atomic contexts. However, we saw that the dma_free_attrs() function has a WARN_ON on irqs disabled, which would be the case for the sync variant.
Side note: We found this issue on the current v5.4.y LTS branch, but it also affects v5.8.y.
Any feedback or pointers on how we may fix the problem are warmly welcome! If anything is unclear please just ask :-)
regards,
Richard Leitner
Benjamin Bara
[1] https://elixir.bootlin.com/linux/v5.8/source/drivers/dma/imx-sdma.c#L1066
[2] https://elixir.bootlin.com/linux/v5.8/source/drivers/dma/imx-sdma.c#L1071
[3] https://elixir.bootlin.com/linux/v5.8/source/drivers/dma/imx-sdma.c#L1072
[4] https://elixir.bootlin.com/linux/v5.8/source/sound/core/pcm_lib.c#L1825
[5] https://elixir.bootlin.com/linux/v5.8/source/sound/core/pcm_native.c#L3226
On 2020/08/13 19:23: Richard Leitner richard.leitner@skidata.com wrote:
Hi, we've found a race condition with the PCM on the i.MX6 which results in an -EIO for the SNDRV_PCM_IOCTL_READI_FRAMES ioctl after an -EPIPE (XRUN).
A possible reproduction may look like the following reduced call graph during a PCM capture:
us -> ioctl(SNDRV_PCM_IOCTL_READI_FRAMES)
        - wait_for_avail()
          - schedule_timeout()
   -> snd_pcm_update_hw_ptr0()
        - snd_pcm_update_state: EPIPE (XRUN)
          - sdma_disable_channel_async()   # gets scheduled away due to sleep
us <- ioctl(SNDRV_PCM_IOCTL_READI_FRAMES) returns -EPIPE
us -> ioctl(SNDRV_PCM_IOCTL_PREPARE)       # as reaction to the EPIPE (XRUN)
us -> ioctl(SNDRV_PCM_IOCTL_READI_FRAMES)  # next try to capture frames
        - sdma_prep_dma_cyclic()
          - sdma_load_context()            # not loaded as context_loaded is 1
        - wait_for_avail()
          - schedule_timeout()
            # now the sdma_channel_terminate_work() comes back and sets
            # context_loaded = false and frees in vchan_dma_desc_free_list()
us <- ioctl returns -EIO (capture write error (DMA or IRQ trouble?))
Seems the write error is caused by context_loaded not being set to false before the next transfer starts? If yes, please have a try with patches 03/04 of the below patch set. Anyway, could you post your failure log? https://lkml.org/lkml/2020/8/11/111
What we have found out, based on our understanding: The dmaengine documentation states that a dmaengine_terminate_async() must be followed by a dmaengine_synchronize(). However, in pcm_dmaengine.c, only dmaengine_terminate_async() is called (for performance reasons and because it might be called from an interrupt handler).
In our tests, we saw that user space immediately calls ioctl(SNDRV_PCM_IOCTL_PREPARE) as a reaction to the XRUN (the previous ioctl(SNDRV_PCM_IOCTL_READI_FRAMES) returns with -EPIPE). In our case (imx-sdma.c), the termination really happens asynchronously in a worker thread, which is not awaited/synchronized by the ioctl(SNDRV_PCM_IOCTL_PREPARE) call.
Since the syscall immediately enters an atomic context (snd_pcm_stream_lock_irq()), we are not able to flush the work of the termination worker from within the DMA context. This leads to an unterminated DMA getting re-initialized and then terminated.
On the i.MX6 platform the problem is (if I got it correctly) that the sdma_channel_terminate_work() called after the -EPIPE gets scheduled away (for the 1-2ms sleep [1]). During that time, user space already issues the ioctl(SNDRV_PCM_IOCTL_PREPARE) and ioctl(SNDRV_PCM_IOCTL_READI_FRAMES). As neither of them is synchronized with the terminate_worker, the vchan_dma_desc_free_list() [2] and "sdmac->context_loaded = false;" [3] are executed during the wait_for_avail() [4] of the ioctl(SNDRV_PCM_IOCTL_READI_FRAMES).
To make sure we identified the problem correctly, we tested adding a "dmaengine_synchronize()" before the snd_pcm_prepare() in [5]. This fixed the race condition in all our tests. (Before that, we were able to reproduce it in 100% of the test runs.)
Based on our understanding, there are two different points to ensure the termination: either ensure that the termination is finished within the previous SNDRV_PCM_IOCTL_READI_FRAMES call (inside the DMA context), or finish it in the SNDRV_PCM_IOCTL_PREPARE call (and all other applicable ioctls) before entering the atomic context (from the PCM context).
We initially thought about implementing the first approach, basically splitting up the dma_device terminate_all operation into a sync (busy-wait) and an async one. This would align the operations with the DMAengine interface and would enable a sync termination variant from atomic contexts. However, we saw that the dma_free_attrs() function has a WARN_ON on irqs disabled, which would be the case for the sync variant. Side note: We found this issue on the current v5.4.y LTS branch, but it also affects v5.8.y.
Any feedback or pointers how we may fix the problem are warmly welcome! If anything is unclear please just ask :-)
regards; Richard Leitner Benjamin Bara
[1] https://elixir.bootlin.com/linux/v5.8/source/drivers/dma/imx-sdma.c#L1066
[2] https://elixir.bootlin.com/linux/v5.8/source/drivers/dma/imx-sdma.c#L1071
[3] https://elixir.bootlin.com/linux/v5.8/source/drivers/dma/imx-sdma.c#L1072
[4] https://elixir.bootlin.com/linux/v5.8/source/sound/core/pcm_lib.c#L1825
[5] https://elixir.bootlin.com/linux/v5.8/source/sound/core/pcm_native.c#L3226
We think this is not an i.MX6-specific problem, but a problem of the DMAengine usage by the PCM. In case of an XRUN, the DMA channel is never closed; instead, first a SNDRV_PCM_TRIGGER_STOP and then a SNDRV_PCM_TRIGGER_START is triggered. The SNDRV_PCM_TRIGGER_STOP simply executes a dmaengine_terminate_async() [1] but does not await the termination by calling dmaengine_synchronize(), which is required as stated by the documentation [2]. Anyway, we are not able to fix this in the pcm_dmaengine layer, either at the end of SNDRV_PCM_TRIGGER_STOP (called from the DMA-complete interrupt handler) or at the beginning of SNDRV_PCM_TRIGGER_START (called from a PCM ioctl), since dmaengine_synchronize() requires a non-atomic context.
Based on my understanding, most of the DMA implementations don't even implement device_synchronize, and if they do, it might not really be necessary since the terminate_all operation is synchronous.
With the i.MX6, it looks a bit different: since [4], the terminate_all operation really schedules a worker which waits the required ~1ms and then does the context freeing. Now, the ioctl(SNDRV_PCM_IOCTL_PREPARE) and the following ioctl(SNDRV_PCM_IOCTL_READI_FRAMES), which are called from user space to handle/recover from an XRUN, are in a race with the terminate_worker. If the terminate_worker finishes earlier, everything is fine. Otherwise, sdma_prep_dma_cyclic() is called and sets up everything, and as soon as it is scheduled out to wait for data, the terminate_worker is scheduled and kills it. In this case, we wait in [5] until the timeout is reached and return with -EIO.
Based on our understanding, there exist two different fixing approaches: We thought that the pcm_dmaengine should handle this by either synchronizing the DMA on a trigger or terminating it synchronously. However, as we are in an atomic context, we either have to give up the atomic context of the PCM to finish the termination or we have to design a synchronous terminate variant which is callable from an atomic context.
For the first option, which is potentially more performant, we have to leave the atomic PCM context and we are not sure if we are allowed to. For the second option, we would have to divide the dma_device terminate_all into an atomic sync and an async one, which would align with the dmaengine API, giving it the option to ensure termination in an atomic context. Based on my understanding, most of them are synchronous anyways, for the currently async ones we would have to implement busy waits. However, with this approach, we reach the WARN_ON [6] inside of an atomic context, indicating we might not do the right thing.
Regarding the failure log (at bottom): I haven't added the ioctl syscalls, but this is basically the output with additional prints to be able to follow the code execution path. An XRUN (buffer size is 480 but 960 available) leads to a SNDRV_PCM_TRIGGER_STOP. This leads to terminate_async, starting the terminate_worker. Next, the XRUN recovery triggers SNDRV_PCM_TRIGGER_START, calling sdma_prep_dma_cyclic() and then waiting for the DMA in wait_for_avail(). Next we see the two freeings, first the old descriptor, then the newly added one; so the terminate_worker is back at work. Now the DMA is terminated while we are still waiting on data from it.
What do you think about it? Is any of the provided solutions practicable? If you need further information or additional logging, feel free to ask.
Best regards Benjamin
[1] https://elixir.bootlin.com/linux/latest/source/sound/core/pcm_dmaengine.c#L2... [2] https://www.kernel.org/doc/html/latest/driver-api/dmaengine/client.html#furt... [3] https://elixir.bootlin.com/linux/latest/source/sound/core/pcm_dmaengine.c#L1... [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... [5] https://elixir.bootlin.com/linux/v5.8/source/sound/core/pcm_lib.c#L1875 [6] https://elixir.bootlin.com/linux/latest/source/kernel/dma/mapping.c#L306
*Failure Log from latest 5.4 LTS kernel:*
[ 535.201598] imx-sgtl5000 sound: snd_pcm_period_elapsed()
[ 535.201610] imx-sgtl5000 sound: snd_pcm_period_elapsed: calls snd_pcm_update_hw_ptr0()
[ 535.201626] imx-sdma 20ec000.sdma: sdma_tx_status channel: 2
[ 535.201640] snd_pcm_capture_avail: hw_ptr: 960, appl_ptr: 0, avail: 960, boundary: 2013265920
[ 535.201655] imx-sgtl5000 sound: snd_dmaengine_pcm_trigger command: 0
[ 535.201664] imx-sdma 20ec000.sdma: sdma_disable_channel_async channel: 2
[ 535.201672] imx-sdma 20ec000.sdma: sdma_disable_channel channel: 2
[ 535.201752] imx-sgtl5000 sound: wait_for_avail: tout=999, state=4
[ 535.201760] imx-sdma 20ec000.sdma: sdma_channel_terminate_work channel: 2
[ 535.201877] imx-sgtl5000 sound: snd_pcm_do_reset: ioctl SNDRV_PCM_IOCTL1_RESET
[ 535.201888] imx-sgtl5000 sound: snd_pcm_lib_ioctl_reset: calls snd_pcm_update_hw_ptr()
[ 535.201912] imx-sgtl5000 sound: snd_dmaengine_pcm_trigger command: 1
[ 535.201922] imx-sdma 20ec000.sdma: sdma_prep_dma_cyclic channel: 2
[ 535.201931] imx-sdma 20ec000.sdma: sdma_config_write channel: 1
[ 535.201939] imx-sdma 20ec000.sdma: sdma_config_channel channel: 2
[ 535.201948] imx-sdma 20ec000.sdma: sdma_disable_channel channel: 2
[ 535.201959] imx-sdma 20ec000.sdma: sdma_load_context channel: 2
[ 535.201967] imx-sdma 20ec000.sdma: sdma_transfer_init channel: 2
[ 535.201983] imx-sdma 20ec000.sdma: sdma_load_context channel: 2
[ 535.201995] imx-sdma 20ec000.sdma: entry 0: count: 256 dma: 0x4a300000 intr
[ 535.202005] imx-sdma 20ec000.sdma: entry 1: count: 256 dma: 0x4a300100 intr
[ 535.202014] imx-sdma 20ec000.sdma: entry 2: count: 256 dma: 0x4a300200 intr
[ 535.202023] imx-sdma 20ec000.sdma: entry 3: count: 256 dma: 0x4a300300 intr
[ 535.202033] imx-sdma 20ec000.sdma: entry 4: count: 256 dma: 0x4a300400 intr
[ 535.202042] imx-sdma 20ec000.sdma: entry 5: count: 256 dma: 0x4a300500 intr
[ 535.202050] imx-sdma 20ec000.sdma: entry 6: count: 256 dma: 0x4a300600 intr
[ 535.202059] imx-sdma 20ec000.sdma: entry 7: count: 256 dma: 0x4a300700 intr
[ 535.202067] imx-sdma 20ec000.sdma: entry 8: count: 256 dma: 0x4a300800 intr
[ 535.202077] imx-sdma 20ec000.sdma: entry 9: count: 256 dma: 0x4a300900 intr
[ 535.202086] imx-sdma 20ec000.sdma: entry 10: count: 256 dma: 0x4a300a00 intr
[ 535.202094] imx-sdma 20ec000.sdma: entry 11: count: 256 dma: 0x4a300b00 intr
[ 535.202103] imx-sdma 20ec000.sdma: entry 12: count: 256 dma: 0x4a300c00 intr
[ 535.202111] imx-sdma 20ec000.sdma: entry 13: count: 256 dma: 0x4a300d00 intr
[ 535.202120] imx-sdma 20ec000.sdma: entry 14: count: 256 dma: 0x4a300e00 wrap intr
[ 535.202135] imx-sdma 20ec000.sdma: vchan 8aa58994: txd 0a262722[8]: submitted
[ 535.202145] imx-sdma 20ec000.sdma: sdma_issue_pending channel: 2
[ 535.202181] snd_pcm_capture_avail: hw_ptr: 0, appl_ptr: 0, avail: 0, boundary: 2013265920
[ 535.202192] snd_pcm_capture_avail: hw_ptr: 0, appl_ptr: 0, avail: 0, boundary: 2013265920
[ 535.202202] imx-sgtl5000 sound: wait_for_avail: avail=0, state=3, twake=64
[ 535.203182] imx-sdma 20ec000.sdma: txd 19499aa8: freeing
[ 535.203207] imx-sdma 20ec000.sdma: txd 0a262722: freeing
[ 545.766059] imx-sgtl5000 sound: wait_for_avail: tout=0, state=3
[ 545.766075] imx-sgtl5000 sound: capture write error (DMA or IRQ trouble?)
On 2020/08/17 15:29 Benjamin Bara - SKIDATA Benjamin.Bara@skidata.com wrote:
We think this is not an i.MX6-specific problem, but a problem of the DMAengine usage by the PCM. In case of an XRUN, the DMA channel is never closed; instead, first a SNDRV_PCM_TRIGGER_STOP and then a SNDRV_PCM_TRIGGER_START is triggered. The SNDRV_PCM_TRIGGER_STOP simply executes a dmaengine_terminate_async() [1] but does not await the termination by calling dmaengine_synchronize(), which is required as stated by the documentation [2]. Anyway, we are not able to fix this in the pcm_dmaengine layer, either at the end of SNDRV_PCM_TRIGGER_STOP (called from the DMA-complete interrupt handler) or at the beginning of SNDRV_PCM_TRIGGER_START (called from a PCM ioctl), since dmaengine_synchronize() requires a non-atomic context.
Based on my understanding, most of the DMA implementations don't even implement device_synchronize, and if they do, it might not really be necessary since the terminate_all operation is synchronous.
With the i.MX6, it looks a bit different: since [4], the terminate_all operation really schedules a worker which waits the required ~1ms and then does the context freeing. Now, the ioctl(SNDRV_PCM_IOCTL_PREPARE) and the following ioctl(SNDRV_PCM_IOCTL_READI_FRAMES), which are called from user space to handle/recover from an XRUN, are in a race with the terminate_worker. If the terminate_worker finishes earlier, everything is fine. Otherwise, sdma_prep_dma_cyclic() is called and sets up everything, and as soon as it is scheduled out to wait for data, the terminate_worker is scheduled and kills it. In this case, we wait in [5] until the timeout is reached and return with -EIO.
Based on our understanding, there exist two different fixing approaches: We thought that the pcm_dmaengine should handle this by either synchronizing the DMA on a trigger or terminating it synchronously. However, as we are in an atomic context, we either have to give up the atomic context of the PCM to finish the termination or we have to design a synchronous terminate variant which is callable from an atomic context.
For the first option, which is potentially more performant, we have to leave the atomic PCM context and we are not sure if we are allowed to. For the second option, we would have to divide the dma_device terminate_all into an atomic sync and an async one, which would align with the dmaengine API, giving it the option to ensure termination in an atomic context. Based on my understanding, most of them are synchronous anyways, for the currently async ones we would have to implement busy waits. However, with this approach, we reach the WARN_ON [6] inside of an atomic context, indicating we might not do the right thing.
Regarding the failure log (at bottom): I haven't added the ioctl syscalls, but this is basically the output with additional prints to be able to follow the code execution path. An XRUN (buffer size is 480 but 960 available) leads to a SNDRV_PCM_TRIGGER_STOP. This leads to terminate_async, starting the terminate_worker. Next, the XRUN recovery triggers SNDRV_PCM_TRIGGER_START, calling sdma_prep_dma_cyclic() and then waiting for the DMA in wait_for_avail(). Next we see the two freeings, first the old descriptor, then the newly added one; so the terminate_worker is back at work. Now the DMA is terminated while we are still waiting on data from it.
What do you think about it? Is any of the provided solutions practicable? If you need further information or additional logging, feel free to ask.
A busy wait is not good, I think. Would you please have a try with the attached patch, which is based on https://lkml.org/lkml/2020/8/11/111? The basic idea is to keep the freed descriptors on another list and free them later in the terminate_worker, instead of freeing them all directly in the terminate_worker via vchan_get_all_descriptors(), which may break the next descriptor coming in soon after.
-----Original Message----- From: Robin Gong yibin.gong@nxp.com Sent: Montag, 17. August 2020 11:23 busy_wait is not good I think, would you please have a try with the attached patch which is based on https://lkml.org/lkml/2020/8/11/111? The basic idea is to keep the freed descriptor into another list for freeing in later terminate_worker instead of freeing directly all in terminate_worker by vchan_get_all_descriptors which may break next descriptor coming soon
The idea sounds good, but with this approach we are still not sure that the 1ms (the ultimate reason why this is a problem) elapses between DMA disabling and re-enabling.
If we are allowed to leave the atomic PCM context on each trigger, synchronize the DMA and then enter it back again, everything is fine. This might be the most performant and elegant solution. However, since we are in an atomic context for a reason, it might not be wanted by the PCM system that the DMA termination completion of the previous context happens within the next call, but we are not sure about that. In this case, a busy wait is not a good solution, but a necessary one, or at least the only valid solution we are aware of.
Anyhow, based on my understanding, either the start or the stop trigger has to wait the 1ms (or what's left of it).
On 2020/08/17 19:38 Benjamin Bara - SKIDATA Benjamin.Bara@skidata.com wrote:
-----Original Message----- From: Robin Gong yibin.gong@nxp.com Sent: Montag, 17. August 2020 11:23
busy_wait is not good I think, would you please have a try with the attached patch which is based on https://lkml.org/lkml/2020/8/11/111? The basic idea is to keep the freed descriptor into another list for freeing in later terminate_worker instead of freeing directly all in terminate_worker by vchan_get_all_descriptors which may break next descriptor coming soon
The idea sounds good, but with this attempt we are still not sure that the 1ms (the ultimate reason why this is a problem) is awaited between DMA disabling and re-enabling.
The original 1ms delay is for ensuring that the sdma channel has indeed stopped; otherwise, sdma may still access an IP's FIFO (like uart/sai...) during the last Water-Mark-Level-sized transfer. The worst case is that some IPs such as uart may cause sdma to hang after UCR2_RXEN/UCR2_TXEN are disabled ("Timeout waiting for CH0 ready" would be caught). So I would suggest synchronizing dma after the channel is terminated. But given the PCM system's limitation, there seems to be no choice but to terminate asynchronously. If sdma can access the audio FIFO without hanging after the PCM driver terminates the dma channel, and the rx/tx data buffers are not illegal, maybe the 1ms is not a must, because only harmless garbage data would be touched by sdma and ignored by the PCM driver. The current sdma driver with my patches can ensure the following:
-- The last terminated transfer will be stopped before the next quick transfer starts, because the context load (sdma_load_context) is done by channel 0, which has the lowest priority. In other words, a successful dmaengine_prep_* call for the next transfer means a new, normal transfer without any impact from the last terminated transfer.
-- No potential interrupt from the terminated transfer can be handled before the next transfer starts, because 'sdmac->desc' has been set to NULL in sdma_terminate_all.
If we are allowed to leave the atomic PCM context on each trigger, synchronize the DMA and then enter it back again, everything is fine. This might be the most performant and elegant solution. However, since we are in an atomic context for a reason, it might not be wanted by the PCM system that the DMA termination completion of the previous context happens within the next call, but we are not sure about that. In this case, a busy wait is not a good solution, but a necessary one, or at least the only valid solution we are aware of.
Anyhow, based on my understanding, either the start or the stop trigger has to wait the 1ms (or what's left of it).
On 8/17/20 9:28 AM, Benjamin Bara - SKIDATA wrote:
We think this is not an i.MX6-specific problem, but a problem of the DMAengine usage by the PCM. In case of an XRUN, the DMA channel is never closed; instead, first a SNDRV_PCM_TRIGGER_STOP and then a SNDRV_PCM_TRIGGER_START is triggered. The SNDRV_PCM_TRIGGER_STOP simply executes a dmaengine_terminate_async() [1] but does not await the termination by calling dmaengine_synchronize(), which is required as stated by the documentation [2]. Anyway, we are not able to fix this in the pcm_dmaengine layer, either at the end of SNDRV_PCM_TRIGGER_STOP (called from the DMA-complete interrupt handler) or at the beginning of SNDRV_PCM_TRIGGER_START (called from a PCM ioctl), since dmaengine_synchronize() requires a non-atomic context.
I think this might be an sdma specific problem after all. dmaengine_terminate_async() will issue a request to stop the DMA. But it is still safe to issue the next transfer, even without calling dmaengine_synchronize(). The DMA should start the new transfer at its earliest convenience in that case.
dmaengine_synchronize() is so that the consumer has a guarantee that the DMA is finished using the resources (e.g. the memory buffers) associated with the DMA transfer, so it can safely free them.
Based on my understanding, most of the DMA implementations don't even implement device_synchronize, and if they do, it might not really be necessary since the terminate_all operation is synchronous.
There are a lot of buggy DMAengine drivers :) Pretty much all of them need device_synchronize() to get this right.
With the i.MX6, it looks a bit different: since [4], the terminate_all operation really schedules a worker which waits the required ~1ms and then does the context freeing. Now, the ioctl(SNDRV_PCM_IOCTL_PREPARE) and the following ioctl(SNDRV_PCM_IOCTL_READI_FRAMES), which are called from user space to handle/recover from an XRUN, are in a race with the terminate_worker. If the terminate_worker finishes earlier, everything is fine. Otherwise, sdma_prep_dma_cyclic() is called and sets up everything, and as soon as it is scheduled out to wait for data, the terminate_worker is scheduled and kills it. In this case, we wait in [5] until the timeout is reached and return with -EIO.
Based on our understanding, there exist two different fixing approaches: We thought that the pcm_dmaengine should handle this by either synchronizing the DMA on a trigger or terminating it synchronously. However, as we are in an atomic context, we either have to give up the atomic context of the PCM to finish the termination or we have to design a synchronous terminate variant which is callable from an atomic context.
For the first option, which is potentially more performant, we have to leave the atomic PCM context and we are not sure if we are allowed to. For the second option, we would have to divide the dma_device terminate_all into an atomic sync and an async one, which would align with the dmaengine API, giving it the option to ensure termination in an atomic context. Based on my understanding, most of them are synchronous anyways, for the currently async ones we would have to implement busy waits. However, with this approach, we reach the WARN_ON [6] inside of an atomic context, indicating we might not do the right thing.
I don't know how feasible this is to implement in the SDMA dmaengine driver. But I think what it should do is to have some flag to indicate whether a terminate is in progress. If a new transfer is issued while a terminate is in progress, the transfer should go on a list. Once the terminate finishes, it should check the list and start the transfers if there are any on the list.
- Lars
On 8/19/20 1:08 PM, Lars-Peter Clausen wrote:
On 8/17/20 9:28 AM, Benjamin Bara - SKIDATA wrote:
We think this is not an i.MX6-specific problem, but a problem of the DMAengine usage by the PCM. In case of an XRUN, the DMA channel is never closed; instead, first a SNDRV_PCM_TRIGGER_STOP and then a SNDRV_PCM_TRIGGER_START is triggered. The SNDRV_PCM_TRIGGER_STOP simply executes a dmaengine_terminate_async() [1] but does not await the termination by calling dmaengine_synchronize(), which is required as stated by the documentation [2]. Anyway, we are not able to fix this in the pcm_dmaengine layer, either at the end of SNDRV_PCM_TRIGGER_STOP (called from the DMA-complete interrupt handler) or at the beginning of SNDRV_PCM_TRIGGER_START (called from a PCM ioctl), since dmaengine_synchronize() requires a non-atomic context.
I think this might be an sdma specific problem after all. dmaengine_terminate_async() will issue a request to stop the DMA. But it is still safe to issue the next transfer, even without calling dmaengine_synchronize(). The DMA should start the new transfer at its earliest convenience in that case.
dmaengine_synchronize() is so that the consumer has a guarantee that the DMA is finished using the resources (e.g. the memory buffers) associated with the DMA transfer, so it can safely free them.
You can think of dmaengine_terminate_async() and dmaengine_issue_pending() as adding operations to a command queue. The DMA is responsible that the operations are executed in the same order that they were added to the queue and to make sure that their execution does not conflict.
dmaengine_synchronize() is for external consumers to wait until all operations in the command queue have been completed.
-----Original Message----- From: Lars-Peter Clausen lars@metafoo.de Sent: Mittwoch, 19. August 2020 13:08 I think this might be an sdma specific problem after all. dmaengine_terminate_async() will issue a request to stop the DMA. But it is still safe to issue the next transfer, even without calling dmaengine_synchronize(). The DMA should start the new transfer at its earliest convenience in that case.
dmaengine_synchronize() is so that the consumer has a guarantee that the DMA is finished using the resources (e.g. the memory buffers) associated with the DMA transfer, so it can safely free them.
Thank you for the clarifications!
I don't know how feasible this is to implement in the SDMA dmaengine driver. But I think what it should do is to have some flag to indicate whether a terminate is in progress. If a new transfer is issued while a terminate is in progress, the transfer should go on a list. Once the terminate finishes, it should check the list and start the transfers if there are any on the list.
IMHO that's nearly what Robin's patches do, so it should be sufficient to put the descriptors to free in an extra termination list and ensure that a new transfer is handled after the last termination is done.
@Robin: Is it possible to tag the commits for the stable-tree Cc: stable@vger.kernel.org?
Best regards and thank you! Benjamin Richard
On 2020/08/19 22:26 Benjamin Bara - SKIDATA Benjamin.Bara@skidata.com wrote:
-----Original Message----- From: Lars-Peter Clausen lars@metafoo.de Sent: Mittwoch, 19. August 2020 13:08 I think this might be an sdma specific problem after all. dmaengine_terminate_async() will issue a request to stop the DMA. But it is still safe to issue the next transfer, even without calling dmaengine_synchronize(). The DMA should start the new transfer at its earliest convenience in that case.
dmaengine_synchronize() is so that the consumer has a guarantee that the DMA engine has finished using the resources (e.g. the memory buffers) associated with the DMA transfer, so it can safely free them.
Thank you for the clarifications!
I don't know how feasible this is to implement in the SDMA dmaengine driver. But I think what it should do is to have some flag to indicate if a terminate is in progress. If a new transfer is issued while terminate is in progress the transfer should go on a list. Once terminate finishes it should check the list and start the transfer if there are any on the list.
IMHO that's nearly what Robin's patches does, so this should be sufficient: Putting the descriptors to free in an extra termination list and ensuring that a new transfer is handled after the last termination is done.
Hello Benjamin, it seems Lars's list is not the extra termination list in my patch. Instead, he means the next descriptor should be moved onto a separate pending list if the previous terminate has not finished yet, and handling of the pending list should restart once the terminate is done, if the list is not empty.
I like the idea, but on SDMA I think that race condition may never happen, because if the next descriptor is obtained during the usleep_range() in sdma_channel_terminate_work(), the last channel has in fact already stopped. As I mentioned before, the context load (sdma_load_context) is done by channel0, which has the lowest priority; in other words, a successful dmaengine_prep_* call for the next transfer means a new, normal transfer without any impact from the last terminated transfer.

Normal data transfers run on non-channel0 SDMA channels (channel1, 2, ... etc.), which have higher priority than channel0, so if channel0 gets to run and load the context, any 'potential transfer' on the previously terminated channel has already been done. So there is no need for an extra list, since this is no different from a normal transfer, even though sdma_channel_terminate_work() may run later to free the last descriptor (a software-only impact, which my patch fixes).

A transfer on an SDMA channel is split into many small pieces of Water-Mark-Level (WML) size. Once a DMA request event is acknowledged and the channel is scheduled by the SDMA core, it transfers WML bytes of data and is then scheduled out in favour of other channels. A 'potential transfer' can stay alive on a terminated channel only when the following two things happen at the same time:

[1] The channel is transferring a WML-sized chunk (SDMA side).
[2] The channel is terminated by sdma_disable_channel() at the same moment (ARM side).

Even if that happens, the 'potential transfer' is very short, because fetching/filling the FIFO in WML size (the FIFO size or half of it) is very quick; after that, the channel is terminated at the hardware level. I think the 1ms from the design team is just a generous safety margin. So I don't think it's a big deal, as long as the memory buffer stays available and the descriptors are handled so they have no impact on the next transfer, as my patch does.
@Robin: Is it possible to tag the commits for the stable-tree Cc: stable@vger.kernel.org?
Could my patch work on your side? If yes, I will add Cc: stable@vger.kernel.org
Best regards and thank you all! Benjamin Richard
On Thu, Aug 20, 2020 at 03:01:44PM +0000, Robin Gong wrote:
On 2020/08/19 22:26 Benjamin Bara - SKIDATA Benjamin.Bara@skidata.com wrote:
@Robin: Is it possible to tag the commits for the stable-tree Cc: stable@vger.kernel.org?
Could my patch work on your side? If yes, I will add Cc: stable@vger.kernel.org
I've tested the patches 3 & 4 (removing sdmac->context_loaded) of the series you mentioned and sent Tested-by tags for them [1,2], as they fix the EIO problems for our use case.
So from our side they are fine for stable.
[1] https://lore.kernel.org/dmaengine/20200817053813.GA551027@pcleri/T/#u [2] https://lore.kernel.org/dmaengine/20200817053820.GB551027@pcleri/T/#u
regards;rl
On 2020/08/21 12:34 Richard Leitner richard.leitner@skidata.com wrote:
On Thu, Aug 20, 2020 at 03:01:44PM +0000, Robin Gong wrote:
On 2020/08/19 22:26 Benjamin Bara - SKIDATA Benjamin.Bara@skidata.com
wrote:
@Robin: Is it possible to tag the commits for the stable-tree Cc: stable@vger.kernel.org?
Could my patch work on your side? If yes, I will add Cc: stable@vger.kernel.org
I've tested the patches 3 & 4 (removing sdmac->context_loaded) of the series you mentioned and sent Tested-by tags for them [1,2], as they fix the EIO problems for our use case.
So from our side they are fine for stable.
Okay, I thought these patches just reduce the issue on your side rather than fully fixing it, and the patch I posted in https://www.spinics.net/lists/arm-kernel/msg829972.html could resolve the next descriptor being wrongly freed by vchan_get_all_descriptors() in sdma_channel_terminate_work(). Anyway, I'll add 'Cc: stable@vger.kernel.org' and your Tested-by tags in 3 & 4, then resend it again, thanks.
On Fri, Aug 21, 2020 at 09:21:37AM +0000, Robin Gong wrote:
On 2020/08/21 12:34 Richard Leitner richard.leitner@skidata.com wrote:
On Thu, Aug 20, 2020 at 03:01:44PM +0000, Robin Gong wrote:
On 2020/08/19 22:26 Benjamin Bara - SKIDATA Benjamin.Bara@skidata.com
wrote:
@Robin: Is it possible to tag the commits for the stable-tree Cc: stable@vger.kernel.org?
Could my patch work on your side? If yes, I will add Cc: stable@vger.kernel.org
I've tested the patches 3 & 4 (removing sdmac->context_loaded) of the series you mentioned and sent Tested-by tags for them [1,2], as they fix the EIO problems for our use case.
So from our side they are fine for stable.
Okay, I thought these patches just reduce the issue on your side rather than fully fixing it, and the patch
As Benjamin mentioned, the issue isn't "fixed" for us from the logical/technical side. Nonetheless, the EIO error no longer occurs with the patches applied. Therefore the issue is, for me, "fixed from a userspace point of view", as users don't get/see the error anymore.
I posted in https://www.spinics.net/lists/arm-kernel/msg829972.html could resolve the next descriptor being wrongly freed by vchan_get_all_descriptors() in sdma_channel_terminate_work(). Anyway, I'll add 'Cc: stable@vger.kernel.org' and your Tested-by tags in 3 & 4, then resend it again, thanks.
Great. Thank you!
regards;rl
On Wed, Aug 19, 2020 at 01:08:29PM +0200, Lars-Peter Clausen wrote:
For the first option, which is potentially more performant, we have to leave the atomic PCM context and we are not sure if we are allowed to. For the second option, we would have to divide the dma_device terminate_all into an atomic sync and an async one, which would align with the dmaengine API, giving it the option to ensure termination in an atomic context. Based on my understanding, most of them are synchronous anyways, for the currently async ones we would have to implement busy waits. However, with this approach, we reach the WARN_ON [6] inside of an atomic context, indicating we might not do the right thing.
I don't know how feasible this is to implement in the SDMA dmaengine driver. But I think what it should do is to have some flag to indicate if a terminate is in progress. If a new transfer is issued while terminate is in progress the transfer should go on a list. Once terminate finishes it should check the list and start the transfer if there are any on the list.
The list is already there in form of the vchan helpers the driver uses.
I think the big mistake the driver makes is to configure fields in struct sdma_channel and also the hardware directly in sdma_prep_memcpy(), sdma_prep_slave_sg() and sdma_prep_dma_cyclic(). All information should be stored in the struct sdma_desc allocated in the prep functions and only be used when it's time to fire that specific descriptor.
More specifically, sdma_config_write() must not be called from sdma_prep_slave_sg() or sdma_prep_dma_cyclic(), but instead must be called from sdma_start_desc(). sdma_config_ownership() must also be called later, in sdma_start_desc(). 'direction' must be a member of struct sdma_desc, not of struct sdma_channel.
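The restructuring described here can be illustrated with a toy model (invented names loosely mirroring the driver; not real driver code): per-transfer state such as the direction lives in the descriptor, and the hardware is only touched when that descriptor is actually started.

```c
#include <assert.h>

/* Sketch of the suggested split: prep functions record everything in
 * the descriptor; only start_desc() configures the "hardware".
 * Illustrative only -- not the actual imx-sdma driver code. */

enum dir { DIR_NONE, DIR_TO_DEVICE, DIR_FROM_DEVICE };

struct toy_desc {
    enum dir direction;     /* moved here from the channel struct */
};

struct toy_chan {
    enum dir hw_direction;  /* what the "hardware" is configured for */
};

/* prep: software only -- record the parameters in the descriptor,
 * do NOT touch the channel or the hardware. */
static void prep_slave_sg(struct toy_desc *d, enum dir direction)
{
    d->direction = direction;
}

/* start_desc: the single place where hardware configuration happens,
 * using only state carried by the descriptor being fired. */
static void start_desc(struct toy_chan *c, struct toy_desc *d)
{
    c->hw_direction = d->direction;   /* like sdma_config_write() */
}

static int demo(void)
{
    struct toy_chan c = { .hw_direction = DIR_NONE };
    struct toy_desc d1, d2;

    prep_slave_sg(&d1, DIR_FROM_DEVICE);
    prep_slave_sg(&d2, DIR_TO_DEVICE);
    /* Preparing d2 did not clobber anything d1 needs... */
    if (c.hw_direction != DIR_NONE)
        return -1;
    start_desc(&c, &d1);              /* ...only starting touches HW */
    return c.hw_direction == DIR_FROM_DEVICE ? 0 : -2;
}
```

With this split, preparing a second descriptor can no longer corrupt the state an earlier, still-pending descriptor relies on.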
Overall this sounds like a fair amount of work to do, but should be feasible and IMO is a step in the right direction.
Sascha
On 2020/08/20 14:52 Sascha Hauer s.hauer@pengutronix.de wrote:
On Wed, Aug 19, 2020 at 01:08:29PM +0200, Lars-Peter Clausen wrote:
For the first option, which is potentially more performant, we have to leave the atomic PCM context and we are not sure if we are allowed to. For the second option, we would have to divide the dma_device terminate_all into an atomic sync and an async one, which would align with the dmaengine API, giving it the option to ensure termination in
an atomic context.
Based on my understanding, most of them are synchronous anyways, for the currently async ones we would have to implement busy waits. However, with this approach, we reach the WARN_ON [6] inside of an atomic context, indicating we might not do the right thing.
I don't know how feasible this is to implement in the SDMA dmaengine
driver.
But I think what it should do is to have some flag to indicate if a terminate is in progress. If a new transfer is issued while terminate is in progress the transfer should go on a list. Once terminate finishes it should check the list and start the transfer if there are any on the
list.
The list is already there in form of the vchan helpers the driver uses.
It seems Lars's major concern is the race condition between the next descriptor and sdma_channel_terminate_work(), which frees the last terminated descriptor, not the ability of vchan to support multiple descriptors. But anyway, I think we should take care that vchan_get_all_descriptors() does not free descriptors during the terminate phase when that is done in a worker like sdma_channel_terminate_work(), since it may wrongly free the next descriptor. That's what my patch 0001-dmaengine-imx-sdma-add-terminated-list-for-freed-des.patch does, attached at https://www.spinics.net/lists/arm-kernel/msg829972.html
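The effect of the terminated-list patch can be modelled in a few lines of plain C (names invented; this only models the list handling, not the real vchan code): terminate splices the channel's current descriptors onto a separate list, so the late-running worker can no longer free a descriptor prepared after the terminate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the fix: keep a 'terminated' list separate from the
 * channel's active descriptors. Illustrative only. */

struct toy_desc {
    struct toy_desc *next;
    bool freed;
};

struct toy_chan {
    struct toy_desc *active;      /* descriptors owned by the channel */
    struct toy_desc *terminated;  /* parked for the deferred worker */
};

static void prep(struct toy_chan *c, struct toy_desc *d)
{
    d->next = c->active;
    c->active = d;
}

/* terminate: move everything currently on the channel onto the
 * terminated list, instead of letting the worker later grab whatever
 * happens to be on the channel at that point. */
static void terminate_async(struct toy_chan *c)
{
    struct toy_desc *d = c->active;
    c->active = NULL;
    while (d) {
        struct toy_desc *n = d->next;
        d->next = c->terminated;
        c->terminated = d;
        d = n;
    }
}

/* Deferred worker: frees only the terminated list, never active ones. */
static void terminate_worker(struct toy_chan *c)
{
    while (c->terminated) {
        struct toy_desc *d = c->terminated;
        c->terminated = d->next;
        d->freed = true;          /* vchan would free the desc here */
    }
}

static int demo(void)
{
    struct toy_chan c = { .active = NULL, .terminated = NULL };
    struct toy_desc old_d = { .next = NULL, .freed = false };
    struct toy_desc new_d = { .next = NULL, .freed = false };

    prep(&c, &old_d);
    terminate_async(&c);    /* user space saw -EPIPE, re-prepares... */
    prep(&c, &new_d);       /* ...and issues the next transfer */
    terminate_worker(&c);   /* worker runs late, as in the race */

    if (!old_d.freed)
        return -1;          /* the old descriptor must be freed */
    return new_d.freed ? -2 : 0;  /* the new one must survive */
}
```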
I think the big mistake the driver makes is to configure fields in struct sdma_channel and also the hardware directly in sdma_prep_memcpy(), sdma_prep_slave_sg() and sdma_prep_dma_cyclic(). All information should be stored in the struct sdma_desc allocated in the prep functions and only be used when it's time to fire that specific descriptor.
Sorry Sascha, that seems to be another topic: your intention is that sdma_prep_* only touches software state and that all hardware access is moved into one place inside sdma_start_desc(). I agree that would make the code cleaner, but my concern is that sdma_start_desc() is protected by a spinlock, which should be held as short as possible, while some hardware accesses such as the context load may cost some time. Anyway, that's another topic; maybe we can refine it in the future.
More specifically sdma_config_write() may not be called from sdma_prep_slave_sg() or sdma_prep_dma_cyclic(), but instead must be called from sdma_start_desc(). sdma_config_ownership() also must be called later in sdma_start_desc(). 'direction' must be a member of struct sdma_desc, not of struct sdma_channel.
Overall this sounds like a fair amount of work to do, but should be feasible and IMO is a step in the right direction.
Sascha
--
On Fri, Aug 21, 2020 at 09:52:00AM +0000, Robin Gong wrote:
On 2020/08/20 14:52 Sascha Hauer s.hauer@pengutronix.de wrote:
On Wed, Aug 19, 2020 at 01:08:29PM +0200, Lars-Peter Clausen wrote:
For the first option, which is potentially more performant, we have to leave the atomic PCM context and we are not sure if we are allowed to. For the second option, we would have to divide the dma_device terminate_all into an atomic sync and an async one, which would align with the dmaengine API, giving it the option to ensure termination in
an atomic context.
Based on my understanding, most of them are synchronous anyways, for the currently async ones we would have to implement busy waits. However, with this approach, we reach the WARN_ON [6] inside of an atomic context, indicating we might not do the right thing.
I don't know how feasible this is to implement in the SDMA dmaengine
driver.
But I think what it should do is to have some flag to indicate if a terminate is in progress. If a new transfer is issued while terminate is in progress the transfer should go on a list. Once terminate finishes it should check the list and start the transfer if there are any on the
list.
The list is already there in form of the vchan helpers the driver uses.
It seems Lars's major concern is the race condition between the next descriptor and sdma_channel_terminate_work(), which frees the last terminated descriptor, not the ability of vchan to support multiple descriptors. But anyway, I think we should take care that vchan_get_all_descriptors() does not free descriptors during the terminate phase when that is done in a worker like sdma_channel_terminate_work(), since it may wrongly free the next descriptor. That's what my patch 0001-dmaengine-imx-sdma-add-terminated-list-for-freed-des.patch does, attached at https://www.spinics.net/lists/arm-kernel/msg829972.html
Indeed this should solve the problem of freeing descriptors allocated between terminate_all and a following prep_slave*.
I think the big mistake the driver makes is to configure fields in struct sdma_channel and also the hardware directly in sdma_prep_memcpy(), sdma_prep_slave_sg() and sdma_prep_dma_cyclic(). All information should be stored in the struct sdma_desc allocated in the prep functions and only be used when it's time to fire that specific descriptor.
Sorry Sascha, that seems to be another topic: your intention is that sdma_prep_* only touches software state and that all hardware access is moved into one place inside sdma_start_desc(). I agree that would make the code cleaner, but my concern is that sdma_start_desc() is protected by a spinlock, which should be held as short as possible, while some hardware accesses such as the context load may cost some time. Anyway, that's another topic; maybe we can refine it in the future.
Yes, you are right. This is another topic.
Sascha
participants (5)
- Benjamin Bara - SKIDATA
- Lars-Peter Clausen
- Richard Leitner
- Robin Gong
- Sascha Hauer