Re: [alsa-devel] [PATCH] ASoC: dmaengine: add runtime status checking in dmaengine_pcm_dma_complete
On Fri, Jun 07, 2013 at 07:57:29PM +0800, Qiao Zhou wrote:
The dmaengine_pcm_dma_complete callback is usually executed after the DMA interrupt, which uses tasklet_schedule, a workqueue, or another method so that the interrupt handler can return quickly.
In some corner cases, where the PCM stream is released unexpectedly (for example, when the media server is killed), the runtime parameters will be freed. If this happens between t1 and t2 in the chart below, the callback will try to access members of the parameters that have already been freed, and the kernel panics.
To avoid this issue, add runtime checking before other handling in dmaengine_pcm_dma_complete. If the PCM stream has already been released, just ignore the current handling and return.
This doesn't seem like a good or robust way of fixing this, if we're tearing down the resources the DMA is using while the DMA is in progress then in the worst case that might include the memory being DMAed and of course there's races if you just check the pointer - the pointer can be checked at the same time as it's being freed (or between the free and the clear).
I think we should be either halting the DMA or waiting for it to finish here.
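The race described above (a check that lands between the free and the clear) can be illustrated with a small deterministic user-space model. This is only a sketch: every type and function name here is hypothetical, the "freed" state is tracked with a flag instead of really releasing memory, and the interleaving is forced by hand rather than produced by real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: "freed" is a flag so the demo never touches
 * released memory. None of these names are kernel APIs. */
struct mock_runtime {
    bool freed;
};

struct mock_substream {
    struct mock_runtime *runtime;
};

/* The kind of check the patch proposes: return early on a NULL pointer. */
static bool callback_would_touch_freed(const struct mock_substream *ss)
{
    if (!ss->runtime)
        return false;            /* stream already released: return early */
    return ss->runtime->freed;   /* true would be a use-after-free in real code */
}

/* Teardown is two separate steps; a callback can run between them. */
static void teardown_step_free(struct mock_substream *ss)
{
    ss->runtime->freed = true;   /* stands in for freeing the runtime data */
}

static void teardown_step_clear(struct mock_substream *ss)
{
    ss->runtime = NULL;
}

/* Returns true if a callback run between free and clear hits freed data. */
bool null_check_races_with_teardown(void)
{
    struct mock_runtime rt = { .freed = false };
    struct mock_substream ss = { .runtime = &rt };
    bool hit;

    teardown_step_free(&ss);
    hit = callback_would_touch_freed(&ss);   /* pointer is still non-NULL here */
    teardown_step_clear(&ss);
    assert(!callback_would_touch_freed(&ss)); /* the check only helps after the clear */

    return hit;
}
```

In this model the pointer check passes in exactly the window where the data is already gone, which is why halting the DMA or waiting for it is the more robust direction.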
On 06/07/2013 04:34 PM, Mark Brown wrote:
This doesn't seem like a good or robust way of fixing this, if we're tearing down the resources the DMA is using while the DMA is in progress then in the worst case that might include the memory being DMAed and of course there's races if you just check the pointer - the pointer can be checked at the same time as it's being freed (or between the free and the clear).
I think we should be either halting the DMA or waiting for it to finish here.
I haven't seen the original patch, but the proper solution to this problem should be to add a check to snd_dmaengine_pcm_close() to see if the DMA is still running, and if it is, call dmaengine_terminate_all() for the DMA channel associated with the PCM. Everything else will probably still be racy.
- Lars
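The ordering suggested here (stop the channel in the close path, then release the runtime data) can be sketched in user space. This is a minimal model under stated assumptions, not the kernel code: mock_chan, mock_runtime, mock_terminate_all() and mock_pcm_close() are hypothetical stand-ins for the DMA channel, the PCM runtime, dmaengine_terminate_all() and snd_dmaengine_pcm_close(), and the model assumes that terminating a channel drops its pending callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct mock_runtime {
    int periods_elapsed;
};

struct mock_chan {
    bool running;
    void (*callback)(void *arg);
    void *callback_arg;
    int callbacks_delivered;
};

/* Plays the role of the completion callback: it touches runtime data. */
static void mock_dma_complete(void *arg)
{
    struct mock_runtime *rt = arg;

    rt->periods_elapsed++;
}

/* Plays the role of terminating the channel: stop and drop the callback. */
static void mock_terminate_all(struct mock_chan *chan)
{
    chan->running = false;
    chan->callback = NULL;
    chan->callback_arg = NULL;
}

/* Models the deferred completion (tasklet or workqueue) firing later. */
static void mock_fire_completion(struct mock_chan *chan)
{
    if (chan->callback) {
        chan->callback(chan->callback_arg);
        chan->callbacks_delivered++;
    }
}

/* Close path: halt the channel first, free the runtime data second. */
static void mock_pcm_close(struct mock_chan *chan, struct mock_runtime **rt)
{
    if (chan->running)
        mock_terminate_all(chan);
    free(*rt);
    *rt = NULL;
}

/* Returns the number of callbacks delivered after close, or -1 if
 * the allocation failed. */
int late_callbacks_after_close(void)
{
    struct mock_runtime *rt = calloc(1, sizeof(*rt));
    struct mock_chan chan = { .running = true, .callback = mock_dma_complete };
    int before;

    if (!rt)
        return -1;
    chan.callback_arg = rt;

    mock_fire_completion(&chan);          /* normal period completion */
    before = chan.callbacks_delivered;
    assert(before == 1);

    mock_pcm_close(&chan, &rt);
    mock_fire_completion(&chan);          /* late completion is ignored */

    return chan.callbacks_delivered - before;
}
```

The point of the sketch is the order inside mock_pcm_close(): once the channel has been terminated, a late completion has no callback left to run, so freeing the runtime afterwards is safe in this model.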
On 06/09/2013 03:37 PM, Lars-Peter Clausen wrote:
I haven't seen the original patch, but the proper solution to this problem should be to add a check to snd_dmaengine_pcm_close() to see if the DMA is still running.
Ok, since this will never happen, I suppose the problem is rather that the DMA callback is called after dmaengine_terminate_all() has been called, which sounds like a bug in the dmaengine driver. This will likely also be a problem for other users of that dmaengine driver, not only the ASoC driver, so it should be fixed in the dmaengine driver.
- Lars
On Sun, Jun 09, 2013 at 03:51:09PM +0200, Lars-Peter Clausen wrote:
Ok, since this will never happen, I suppose the problem is rather that the DMA callback is called after dmaengine_terminate_all() has been called, which sounds like a bug in the dmaengine driver. This will likely also be a problem for other users of that dmaengine driver, not only the ASoC driver, so it should be fixed in the dmaengine driver.
Just to clarify, what is it that makes you say that this will never happen?
On 06/10/2013 11:31 AM, Mark Brown wrote:
Just to clarify, what is it that makes you say that this will never happen?
At least that is my understanding of snd_pcm_release_substream(), that it will first make sure that the stream is stopped, by calling snd_pcm_drop(), before closing the stream.
- Lars
On Mon, Jun 10, 2013 at 12:46:52PM +0200, Lars-Peter Clausen wrote:
At least that is my understanding of snd_pcm_release_substream(), that it will first make sure that the stream is stopped, by calling snd_pcm_drop(), before closing the stream.
Yes, you need to call dmaengine_terminate_all(). But even then we might have transactions in flight, or some DMA controllers can't abort immediately (they need to wait until FIFOs are flushed, etc.). In general it is good practice to call dma_sync_wait() before you tear down the client. If you still see an issue, then it is a buggy driver :)
-- ~Vinod
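The teardown order recommended here (terminate, wait until the channel is idle, only then free) can be sketched with a bounded polling loop. This is a user-space illustration only: mock_chan, mock_poll() and mock_sync_wait() are hypothetical, and the model simply assumes the controller flushes one FIFO word per poll. It does not reproduce the signature or timeout behaviour of the kernel's dma_sync_wait().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a DMA channel; not the kernel's types. */
enum mock_status { MOCK_COMPLETE, MOCK_IN_PROGRESS, MOCK_TIMEOUT };

struct mock_chan {
    bool terminated;
    int fifo_words_left;   /* data the controller must flush before it is idle */
};

/* Request a stop; the hardware keeps draining its FIFO afterwards. */
static void mock_terminate_all(struct mock_chan *chan)
{
    chan->terminated = true;
}

/* One polling step: the controller flushes one FIFO word per poll. */
static enum mock_status mock_poll(struct mock_chan *chan)
{
    if (chan->fifo_words_left > 0) {
        chan->fifo_words_left--;
        return MOCK_IN_PROGRESS;
    }
    return MOCK_COMPLETE;
}

/* Bounded wait: poll until the channel is idle or the budget runs out. */
static enum mock_status mock_sync_wait(struct mock_chan *chan, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (mock_poll(chan) == MOCK_COMPLETE)
            return MOCK_COMPLETE;
    }
    return MOCK_TIMEOUT;
}

/* Teardown order: terminate, wait for idle, and only then report that
 * the buffers may be freed. */
bool safe_to_free_after_teardown(int fifo_words, int max_polls)
{
    struct mock_chan chan = { .terminated = false, .fifo_words_left = fifo_words };

    mock_terminate_all(&chan);
    assert(chan.terminated);
    return mock_sync_wait(&chan, max_polls) == MOCK_COMPLETE;
}
```

A caller would free the transfer buffers only when safe_to_free_after_teardown() returns true; a false result corresponds to the wait giving up while the controller is still busy.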
On 06/12/2013 09:43 AM, Vinod Koul wrote:
Yes, you need to call dmaengine_terminate_all(). But even then we might have transactions in flight, or some DMA controllers can't abort immediately (they need to wait until FIFOs are flushed, etc.). In general it is good practice to call dma_sync_wait() before you tear down the client. If you still see an issue, then it is a buggy driver :)
Even if the driver can't abort the transfer immediately, I'd still expect not to see any calls to the descriptor's callback after dmaengine_terminate_all() has been called.
We should probably still call dma_sync_wait() though before we free any of the DMA transfer buffers. But I guess this will open a whole new can of bugs, since none of the drivers actually seem to mark a descriptor as completed if the transfer is aborted using dmaengine_terminate_all().
- Lars
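The concern about waiting after an abort can be made concrete with a toy cookie model. This is a sketch under assumptions, loosely inspired by dmaengine's cookie bookkeeping but not taken from it: all names are hypothetical, and the completes_on_terminate flag is an invented switch that distinguishes a driver that retires aborted descriptors from one that does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cookie bookkeeping; not the kernel's data structures. */
struct mock_chan {
    int last_issued;              /* cookie of the most recent submitted descriptor */
    int last_completed;           /* cookie of the most recent completed descriptor */
    bool completes_on_terminate;  /* does the driver retire aborted descriptors? */
};

static int mock_submit(struct mock_chan *chan)
{
    return ++chan->last_issued;
}

static void mock_terminate_all(struct mock_chan *chan)
{
    /* A driver that skips this leaves aborted cookies "in progress" forever. */
    if (chan->completes_on_terminate)
        chan->last_completed = chan->last_issued;
}

static bool mock_is_complete(const struct mock_chan *chan, int cookie)
{
    return cookie <= chan->last_completed;
}

/* Bounded wait on one cookie; returns false if it never completes. */
static bool mock_sync_wait(const struct mock_chan *chan, int cookie, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (mock_is_complete(chan, cookie))
            return true;
    }
    return false;
}

/* Returns whether a wait issued after terminate finishes for the given
 * driver behaviour. */
bool wait_after_terminate_finishes(bool completes_on_terminate)
{
    struct mock_chan chan = { .completes_on_terminate = completes_on_terminate };
    int cookie = mock_submit(&chan);

    assert(!mock_is_complete(&chan, cookie));   /* transfer still in flight */
    mock_terminate_all(&chan);
    return mock_sync_wait(&chan, cookie, 1000);
}
```

In this model, a wait issued after terminate only returns successfully when the driver marks the aborted descriptor as completed; otherwise the wait exhausts its polling budget.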
On Wed, Jun 12, 2013 at 02:15:24PM +0200, Lars-Peter Clausen wrote:
Even if the driver can't abort the transfer immediately, I'd still expect not to see any calls to the descriptor's callback after dmaengine_terminate_all() has been called.
It'd certainly be much less surprising - if something's terminated it really oughtn't to be generating callbacks.
We should probably still call dma_sync_wait() though before we free any of the DMA transfer buffers. But I guess this will open a whole new can of bugs, since none of the drivers actually seem to mark a descriptor as completed if the transfer is aborted using dmaengine_terminate_all().
Oh joy.
On 06/12/2013 10:39 PM, Mark Brown wrote:
It'd certainly be much less surprising - if something's terminated it really oughtn't to be generating callbacks.
Oh joy.
Mark, Lars, Vinod
Indeed it's a DMA handling issue, and I'm preparing to implement proper handling in the DMA driver. Thanks a lot for these suggestions.
participants (4): Lars-Peter Clausen, Mark Brown, Qiao Zhou, Vinod Koul