On 10/12/2015 02:01 PM, Lars-Peter Clausen wrote:
On 10/08/2015 03:46 PM, Matt Campbell wrote:
Hi Lars,
Note: re-sending this message because I accidentally wasn't plain text email so it got bounced from the email list (thanks GMail). Sorry for the double mail to those that are CC'd.
Thanks for the info, this felt like a limitation to the DMA engine before I found this older thread so it's good to have that feeling validated.
Just for posterity I'll describe the situation I'm in with the RT patch. I'm on an i.MX6 solo processor which uses the imx-sdma driver. As far as I can tell, terminate_all properly shuts off the DMA (simply disabling the bit in the controller). The issue I'm having is the described in the following steps:
- An SDMA interrupt comes in, is handled by the top half and the
bottom half tasklet is scheduled.
- Control returns to my RT process which is higher priority than the
tasklet thread. My process decides it is time to shutdown and calls snd_pcm_close which will terminate the DMA and deallocates everything (including the substream which is the argument data to the tasklet)
- With my process out of the way the dma tasklet can run. It
eventually calls dmaengine_pcm_dma_complete (or in my case imx_pcm_dma_complete). You'll run into problems when dereferencing the now invalid substream pointer.
I have one other note about your proposed fix. In dmaengine_pcm_dma_complete you can end up with xrun handling which will result in a reset of the stream including calling terminate_all on the DMA. I'm worried that even with an API call to the DMA system to synchronize completion of DMA tasklets, you could end up with a deadlock when this is called from the tasklet itself when resetting the stream. I think we need to put the tasklet synchronization just before snd_pcm_release is called (as you suggested in the older thread). This still makes me uneasy because deadlock is still possible if the tasklet takes locks that are held when snd_pcm_release is called. Thanks for the input!
Yes, very good analysis. This is the reason why we can't synchronize in terminate_all() itself and have to introduce a separate synchronization primitive to the API. Most users will be fine with terminating and syncing at the same time, hence it makes sense to introduce a new API terminate_all_sync() to cover those common cases. For cases where it is not possible, e.g. because they either terminate the DMA from atomic context or they can end up calling terminate_all() from the completion callback, a new separate API function (e.g. dmaengine_synchronize()) will be introduced. Users which use the asynchronous terminate API need to call dmaengine_synchronize() before they free any resources that are accessed in either the completion callback or the memory that is accessed by the DMA itself.
This still makes me uneasy because deadlock is still possible if the tasklet takes locks that are held when snd_pcm_release is called. Thanks for the input
Good point. If that can happen we properly need to reference count the struct that is passed to the complete callback.
Thinking a bit more about this, this shouldn't be possible to happen. The completion callback runs in atomic context, which means in can only use spinlocks etc. On the other hand the dmaengine_synchronize() function must not be called from atomic context, this means it can't be called while holding a lock that can be taken from within the complete callback.