[alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max constrained by preallocated bytes issue
With the current code, we preallocate the DMA buffer for substreams at the pcm_new() stage. substream->buffer_bytes_max stores the size that was actually preallocated, and substream->dma_max stores the maximum size to which the DMA buffer can be expanded at the hw_params() stage.
At the pcm_open() stage, the maximum constraint of HW_PARAM_BUFFER_BYTES is set to substream->buffer_bytes_max and returned to user space as the upper bound of the HW_PARAM_BUFFER_BYTES interval. As a result, the user can't choose a buffer-bytes value larger than the preallocated buffer size, and the buffer reallocation never actually happens.
Change the maximum constraint of HW_PARAM_BUFFER_BYTES to substream->dma_max to fix the issue described above.
Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
---
 sound/core/pcm_native.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index c375c41496f8..326e921006e7 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2301,7 +2301,7 @@ static int snd_pcm_hw_rule_buffer_bytes_max(struct snd_pcm_hw_params *params,
 	struct snd_interval t;
 	struct snd_pcm_substream *substream = rule->private;
 	t.min = 0;
-	t.max = substream->buffer_bytes_max;
+	t.max = substream->dma_max;
 	t.openmin = 0;
 	t.openmax = 0;
 	t.integer = 1;
On Thu, 16 Jan 2020 05:53:18 +0100, Keyon Jie wrote:
> With the current code, we preallocate the DMA buffer for substreams at the pcm_new() stage. substream->buffer_bytes_max stores the size that was actually preallocated, and substream->dma_max stores the maximum size to which the DMA buffer can be expanded at the hw_params() stage.
No, it's the other way round: the former, buffer_bytes_max, is the max size defined by the driver (i.e. passed in snd_pcm_hardware), and the latter, dma_max, is the max preallocation size (passed to the preallocation helper).
> At the pcm_open() stage, the maximum constraint of HW_PARAM_BUFFER_BYTES is set to substream->buffer_bytes_max and returned to user space as the upper bound of the HW_PARAM_BUFFER_BYTES interval. As a result, the user can't choose a buffer-bytes value larger than the preallocated buffer size, and the buffer reallocation never actually happens.
> Change the maximum constraint of HW_PARAM_BUFFER_BYTES to substream->dma_max to fix the issue described above.
I don't think the logic in the code you're changing is wrong. If there is a problem, it must be somewhere else.
Might it rather be the FIXME code found in snd_pcm_hw_constraints_complete()?
thanks,
Takashi
On Thu, 2020-01-16 at 08:15 +0100, Takashi Iwai wrote:
Hi Takashi, thanks for your comment.
First of all, have you ever hit the issue I mentioned in the commit message, that we can't set buffer_bytes larger than the preallocated DMA bytes?
I found this issue on various kinds of platforms, not only SOF/SoC ones but also legacy HDA ones.
Secondly, I am not clear about the design intention of substream->buffer_bytes_max and substream->dma_max. If it is as you commented above, can you help answer my questions inline in the code below?
void snd_pcm_lib_preallocate_pages(struct snd_pcm_substream *substream,
				   int type, struct device *data,
				   size_t size, size_t max)

static void preallocate_pages(struct snd_pcm_substream *substream,
			      int type, struct device *data,
			      size_t size, size_t max, bool managed)
{
	...
	if (substream->dma_buffer.bytes > 0)
		substream->buffer_bytes_max = substream->dma_buffer.bytes;
	//Keyon: this is the actually allocated buffer size. What is the
	//intention here, and why is it assigned to buffer_bytes_max, which
	//is later used to constrain _HW_PARAM_BUFFER_BYTES?
	substream->dma_max = max;
	//Keyon: it looks like this is the only place where the *max* param
	//is used if we don't define SND_VERBOSE_PROCFS? What relationship
	//does it have with the preallocation itself?
	...
}
> I don't think the logic in the code you're changing is wrong. If there is a problem, it must be somewhere else.
> Might it rather be the FIXME code found in snd_pcm_hw_constraints_complete()?
I just tried removing the FIXME code and it doesn't help. The rule snd_pcm_hw_rule_buffer_bytes_max here limits the maximum of SNDRV_PCM_HW_PARAM_BUFFER_BYTES, and this limit is returned to user space (e.g. aplay) for the subsequent hw_params(). Is this intentional?
int snd_pcm_hw_constraints_complete(struct snd_pcm_substream *substream)
{
	...
	err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES,
				  snd_pcm_hw_rule_buffer_bytes_max, substream,
				  SNDRV_PCM_HW_PARAM_BUFFER_BYTES, -1);
	if (err < 0)
		return err;

	/* FIXME: remove */
	if (runtime->dma_bytes) {
		err = snd_pcm_hw_constraint_minmax(runtime,
						   SNDRV_PCM_HW_PARAM_BUFFER_BYTES,
						   0, runtime->dma_bytes);
		if (err < 0)
			return err;
	}
	...
	return 0;
}
Thanks, ~Keyon
On Thu, 16 Jan 2020 10:50:33 +0100, Keyon Jie wrote:
Oh, you're right, and I completely misread the patch.
Now I've had a coffee and can tell you the story behind the scenes.
I believe the current code intentionally limits the size to the preallocated size. This limitation was introduced to avoid trying to allocate a larger buffer when a buffer has already been preallocated. In the past, most hardware allocated continuous pages for a buffer, and the allocation of a large buffer was quite likely to fail. This was the reason for the buffer preallocation. So the driver wanted to tell user space the limit. If users need an extra-large buffer, they are supposed to fiddle with the prealloc procfs entry (either writing zero to clear the preallocation or setting a large enough buffer beforehand).
For SG-buffers, though, the limitation makes less sense than for continuous pages. For example, the patch below removes the limitation for SG-buffers. But changing this would definitely cause a behavior difference, and I don't know whether it's a reasonable move -- I'm afraid that apps would start hogging too much memory if the limitation is gone.
thanks,
Takashi
---
diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c
index d4702cc1d376..6a6c3469bbcd 100644
--- a/sound/core/pcm_memory.c
+++ b/sound/core/pcm_memory.c
@@ -96,6 +96,29 @@ void snd_pcm_lib_preallocate_free_for_all(struct snd_pcm *pcm)
 }
 EXPORT_SYMBOL(snd_pcm_lib_preallocate_free_for_all);
 
+/* set up substream->buffer_bytes_max, which is used in hw_constraint */
+static void set_buffer_bytes_max(struct snd_pcm_substream *substream,
+				 size_t size)
+{
+	substream->buffer_bytes_max = UINT_MAX;
+
+	if (!size)
+		return; /* no preallocation */
+
+	/* for SG-buffers, no limitation is needed */
+	switch (substream->dma_buffer.dev.type) {
+#ifdef CONFIG_SND_DMA_SGBUF
+	case SNDRV_DMA_TYPE_DEV_SG:
+	case SNDRV_DMA_TYPE_DEV_UC_SG:
+#endif
+	case SNDRV_DMA_TYPE_VMALLOC:
+		return;
+	}
+
+	/* for continuous buffers, limit to the preallocated size */
+	substream->buffer_bytes_max = size;
+}
+
 #ifdef CONFIG_SND_VERBOSE_PROCFS
 /*
  * read callback for prealloc proc file
@@ -156,10 +179,8 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
 				buffer->error = -ENOMEM;
 				return;
 			}
-			substream->buffer_bytes_max = size;
-		} else {
-			substream->buffer_bytes_max = UINT_MAX;
 		}
+		set_buffer_bytes_max(substream, size);
 		if (substream->dma_buffer.area)
 			snd_dma_free_pages(&substream->dma_buffer);
 		substream->dma_buffer = new_dmab;
@@ -206,10 +227,8 @@ static void preallocate_pages(struct snd_pcm_substream *substream,
 	if (size > 0 && preallocate_dma && substream->number < maximum_substreams)
 		preallocate_pcm_pages(substream, size);
-
-	if (substream->dma_buffer.bytes > 0)
-		substream->buffer_bytes_max = substream->dma_buffer.bytes;
 	substream->dma_max = max;
+	set_buffer_bytes_max(substream, substream->dma_buffer.bytes);
 	if (max > 0)
 		preallocate_info_init(substream);
 	if (managed)
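The rule that the pcm_memory.c patch above proposes can be tried outside the kernel with a small user-space model. This is only a sketch under stated assumptions: the enum and the function name model_buffer_bytes_max() are hypothetical stand-ins for the kernel's SNDRV_DMA_TYPE_* values and for set_buffer_bytes_max(), and the function returns the limit instead of storing it in the substream.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's SNDRV_DMA_TYPE_* values. */
enum model_dma_type {
	MODEL_DMA_DEV,		/* continuous pages */
	MODEL_DMA_DEV_SG,	/* scatter-gather pages */
	MODEL_DMA_VMALLOC,	/* vmalloc'ed pages */
};

/*
 * Model of the proposed rule: return the value that would end up in
 * substream->buffer_bytes_max for a given buffer type and preallocated
 * size in bytes.
 */
static size_t model_buffer_bytes_max(enum model_dma_type type,
				     size_t prealloc_bytes)
{
	if (!prealloc_bytes)
		return UINT_MAX;	/* no preallocation: no limit */

	switch (type) {
	case MODEL_DMA_DEV_SG:
	case MODEL_DMA_VMALLOC:
		return UINT_MAX;	/* SG/vmalloc: no limit needed */
	default:
		break;
	}

	return prealloc_bytes;		/* continuous: cap at preallocation */
}
```

With a 64 kB preallocation, the model keeps the 65536-byte cap for a continuous buffer and lifts it to UINT_MAX for an SG buffer; without preallocation the limit is UINT_MAX in both cases.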
On Thu, 2020-01-16 at 11:27 +0100, Takashi Iwai wrote:
Thank you for sharing; it is interesting and I learned something new.
> For SG-buffers, though, the limitation makes less sense than for continuous pages. For example, the patch below removes the limitation for SG-buffers. But changing this would definitely cause a behavior difference, and I don't know whether it's a reasonable move -- I'm afraid that apps would start hogging too much memory if the limitation is gone.
I just went through all callers of snd_pcm_lib_preallocate_pages*(). For those using SNDRV_DMA_TYPE_DEV, some set *size* equal to *max*, and some set *max* to several times *size*. IMHO, the *max* values, rather than the *size* values, are the ones matched to the hardware's limitations, aren't they?
In this case, I still think my patch handles all SNDRV_DMA_TYPE_DEV and SNDRV_DMA_TYPE_DEV_SG cases more gracefully: we still take the limitation set by the specific driver, from the *max* param. The test results look very good here: we get exactly the buffer-bytes that user space requested via aplay, as long as it fits the interval and constraints.
What's your opinion about it?
> @@ -156,10 +179,8 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
>  				buffer->error = -ENOMEM;
If we won't honor this change from the user's fiddling for SG buffers, shouldn't we also skip reallocating the DMA pages here?
Thanks, ~Keyon
Alsa-devel mailing list Alsa-devel@alsa-project.org https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On Thu, 16 Jan 2020 12:25:38 +0100, Keyon Jie wrote:
Well, I have mixed feelings. Certainly we'd need some better way to allow a larger buffer allocation, especially for HDA. OTOH, if the buffer was preallocated, it's meant to be actually used. That's the point of the hw_constraint setup.
And now, thinking again after another cup of coffee, I wonder why we preallocate for HDA at all. For HD-audio, the allocation of any large buffer would very likely succeed because of the SG-buffer.
So, just setting the preallocation size to 0 (but keeping everything else) would work, e.g. something like below? The help text needs adjustment, but you can see the rough idea.
thanks,
Takashi
--- a/sound/hda/Kconfig
+++ b/sound/hda/Kconfig
@@ -21,9 +21,10 @@ config SND_HDA_EXT_CORE
 	select SND_HDA_CORE
 
 config SND_HDA_PREALLOC_SIZE
-	int "Pre-allocated buffer size for HD-audio driver"
+	int "Pre-allocated buffer size for HD-audio driver" if !SND_DMA_SGBUF
 	range 0 32768
-	default 64
+	default 64 if !SND_DMA_SGBUF
+	default 0 if SND_DMA_SGBUF
 	help
 	  Specifies the default pre-allocated buffer-size in kB for the
 	  HD-audio driver. A larger buffer (e.g. 2048) is preferred
-----Original Message-----
From: Alsa-devel <alsa-devel-bounces@alsa-project.org> On Behalf Of Takashi Iwai
Sent: Thursday, January 16, 2020 7:51 PM
To: Keyon Jie <yang.jie@linux.intel.com>
Cc: alsa-devel@alsa-project.org
Subject: Re: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max constrained by preallocated bytes issue
> Well, I have mixed feelings. Certainly we'd need some better way to allow a larger buffer allocation, especially for HDA. OTOH, if the buffer was preallocated, it's meant to be actually used. That's the point of the hw_constraint setup.
So if the buffer was preallocated, it won't be re-allocated at the hw_params() stage. Does this conflict with the re-allocation logic in hw_params()?
> And now, thinking again after another cup of coffee, I wonder why we preallocate for HDA at all. For HD-audio, the allocation of any large buffer would very likely succeed because of the SG-buffer.
> So, just setting the preallocation size to 0 (but keeping everything else) would work, e.g. something like below? The help text needs adjustment, but you can see the rough idea.
So, do you suggest not doing preallocation (or calling it with size 0) for all drivers using TYPE_SG? I am fine with that if it is the recommended method. I can try this on an SOF I2S platform to see if it works for the very large buffer size we require.
Thanks, ~Keyon
-----Original Message-----
From: Jie, Yang
Sent: Thursday, January 16, 2020 10:14 PM
To: 'Takashi Iwai' <tiwai@suse.de>; Keyon Jie <yang.jie@linux.intel.com>
Cc: alsa-devel@alsa-project.org
Subject: RE: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max constrained by preallocated bytes issue
> So, do you suggest not doing preallocation (or calling it with size 0) for all drivers using TYPE_SG? I am fine with that if it is the recommended method. I can try this on an SOF I2S platform to see if it works for the very large buffer size we require.
I tried it and found that setting size 0 for preallocation doesn't work for me. I even tried setting the size as large as the max (which user space may request for buffer-bytes), and it still doesn't work.
Thanks, ~Keyon
On Thu, 16 Jan 2020 16:31:02 +0100, Jie, Yang wrote:
> I tried it and found that setting size 0 for preallocation doesn't work for me. I even tried setting the size as large as the max (which user space may request for buffer-bytes), and it still doesn't work.
How did you test it? I quickly checked on my machine just now, and it seems to be working...
# echo 1024 > /proc/asound/card0/pcm0p/sub0/prealloc
# aplay -Dhw:0 -v --buffer-size=1048576 foo.wav
Hardware PCM card 0 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  ....
  buffer_size  : 262144

# echo 0 > /proc/asound/card0/pcm0p/sub0/prealloc
# aplay -Dhw:0 -v --buffer-size=1048576 foo.wav
Hardware PCM card 0 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  ....
  buffer_size  : 1048576
Takashi
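The two buffer_size values in the log above are consistent with a simple clamp of the requested size to the preallocated size. The sketch below is a user-space illustration only, not kernel code: the helper names are hypothetical, the 4-byte frame (16-bit stereo) is an assumption because the log elides the sample format, and every other hw constraint (period size, the driver's own maximum) is ignored.

```c
#include <stddef.h>

/* Hypothetical helper: bytes in one frame (one sample for every channel). */
static size_t frame_bytes(unsigned int channels, unsigned int bytes_per_sample)
{
	return (size_t)channels * bytes_per_sample;
}

/*
 * Clamp a requested buffer size (in frames) to what fits in the
 * preallocated buffer. prealloc_kb == 0 models "no preallocation",
 * in which case the request passes through unchanged.
 */
static size_t effective_buffer_frames(size_t requested_frames,
				      size_t prealloc_kb,
				      size_t bytes_per_frame)
{
	size_t max_frames;

	if (prealloc_kb == 0)
		return requested_frames;

	max_frames = prealloc_kb * 1024 / bytes_per_frame;
	return requested_frames < max_frames ? requested_frames : max_frames;
}
```

Under the 4-bytes-per-frame assumption, a 1024 kB preallocation holds 1048576 / 4 = 262144 frames, which matches the first run; with the preallocation cleared, the requested 1048576 frames pass through, as in the second run.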
> So, do you suggest not doing preallocation (or calling it with size 0) for all drivers using TYPE_SG? I am fine with that if it is the recommended method. I can try this on an SOF I2S platform to see if it works for the very large buffer size we require.
Keyon, so that the rest of us can follow this patch, would you mind clarifying what drives the need for a 'very large buffer size', and what order of magnitude this very large size would be?
FWIW, we measured consistently on different Windows/Linux platforms, maybe 10 years ago, that once you reach a buffer of 1s (384 kB), the benefits of increasing the buffer size further are marginal in terms of power consumption, and larger buffers create all kinds of issues with volume updates and deferred routing changes.
Thanks -Pierre
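For reference, the "1s (384 kB)" figure above follows from the usual PCM size arithmetic: bytes = rate x channels x bytes per sample x duration. The sketch below is only an illustration; the function name is hypothetical, and the stream parameters that reproduce 384 kB (48 kHz, 2 channels, 4 bytes per sample) are an assumption, since the message does not state the format.

```c
#include <stddef.h>

/*
 * Bytes needed to hold `seconds` of interleaved PCM audio.
 * All parameters are illustrative; none of them come from the thread.
 */
static size_t pcm_buffer_bytes(unsigned int rate_hz, unsigned int channels,
			       unsigned int bytes_per_sample,
			       unsigned int seconds)
{
	return (size_t)rate_hz * channels * bytes_per_sample * seconds;
}
```

Under that assumption, pcm_buffer_bytes(48000, 2, 4, 1) gives 384000 bytes, i.e. 384 kB for one second of audio.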
On 1/16/2020 5:39 PM, Pierre-Louis Bossart wrote:
We need a bigger buffer on the host side to compensate for the wake-up time from d0ix to d0, which takes ~2 seconds on my setup. So, with smaller buffer sizes (less than 2 seconds) we overwrite data, since the FW keeps copying while the host doesn't read until it's up and running again.
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
> We need bigger buffer on host side to compensate the wake up time from d0ix to d0 which takes ~2 seconds on my setup. So, with smaller buffer sizes like < 2 seconds we overwrite data since FW keeps copying while host doesn't read until it's up and running again.
Right, that's a valid case, but that's 256 kB, not 'very large' or likely to ever trigger an OOM case.
On Thu, 16 Jan 2020 18:40:26 +0100, Pierre-Louis Bossart wrote:
> Right, that's a valid case, but that's 256 kB, not 'very large' or likely to ever trigger an OOM case.
That size shouldn't matter, and would work even with the preallocation.
My concern is that removing the limitation would allow allocations that are too large. Even with the dma_max limit, it can go up to 32MB of physical pages per stream for HDA. Depending on the hardware setup, many streams may be assigned (e.g. HDMI codecs), and there may be multiple codecs / controllers; imagine that all those allocated pages are pinned and can't be swapped out...
Takashi
On 2020/1/17 4:37 AM, Takashi Iwai wrote:
> That size shouldn't matter, and would work even with the preallocation.
>
> My concern is that removing the limitation would allow the allocation of too large sizes. Even with dma_max limit, it can go up to 32MB physical pages per stream for HDA. Depending on the hardware setup, there can be a lot of streams assignment (e.g. HDMI codecs) and multiple codecs / controllers, and imagine that all those allocated pages are pinned and can't be swapped out...
Hi Takashi, I get your concern here, but if we switch to the dma_max limit, we won't change the preallocated buffer; it will still be 64KB for each stream. User space can ask to reallocate the buffer for each stream up to 32MB, but the pinned buffers that can't be swapped out are only the 64KB preallocated ones. Am I wrong?
Thanks, ~Keyon
On Fri, 17 Jan 2020 06:30:18 +0100, Keyon Jie wrote:
> Hi Takashi, I get your concern here, but if we switch to use dma_max limit, we won't change the preallocated buffer, it will be still 64KB for each stream, user space can ask for re-allocate buffer for each stream up to 32MB, but those pinned and can't be swapped out ones are the 64KB preallocated ones only, am I wrong?
No, in general, all sound hardware buffers are pinned.
Takashi
On 2020/1/17 3:57 PM, Takashi Iwai wrote:
> No, in general, all sound hardware buffers are pinned.
Sorry, I must have been wrong here. What I was focusing on is the allocated SG DMA buffers; I am not sure whether those are what you call "hardware buffers" here.
My understanding was like this:
1. In the pcm_new() stage, the device PCM driver should call snd_pcm_lib_preallocate_pages()-> snd_pcm_lib_preallocate_pages()-> preallocate_pcm_pages(), and then substream->dma_buffer is initialized with the preallocated buffer.
2. In the pcm_open() stage, the device PCM driver should call snd_pcm_lib_malloc_pages()-> snd_dma_alloc_pages() if we need to reallocate a bigger buffer. *The substream->dma_buffer won't be freed; Takashi, this is what I thought you called the "pinned" buffer.* And the bigger buffers reallocated via snd_dma_alloc_pages() will be freed at pcm_close(), per my understanding?
Thanks, ~Keyon
On Fri, 17 Jan 2020 11:13:31 +0100, Keyon Jie wrote:
> Sorry, I must have been wrong here, what I was focusing on is those allocated SG DMA buffers, I am not sure if they are those you called "hardware buffers" here.
>
> My understanding was like this:
>
> - in pcm_new() stage, the device PCM driver should call
>   snd_pcm_lib_preallocate_pages()-> snd_pcm_lib_preallocate_pages()-> preallocate_pcm_pages() and then the substream->dma_buffer is initialized with the preallocated buffer.
> - in pcm_open() stage, the device PCM driver should call
>   snd_pcm_lib_malloc_pages()-> snd_dma_alloc_pages() //if we need to reallocate bigger buffer. *The substream->dma_buffer won't be freed, Takashi, this is what I thought you named "pinned" buffer.* And those reallocated bigger buffer via snd_dma_alloc_pages() will be freed at pcm_close() per my understanding?
What I meant by "pinned" is that the pages are not swapped out by the swapper process, unlike user-space or anonymous pages. So if you open all streams (say 16 streams) on a machine with 32MB buffers, it'll cost half a GB. And we have no restriction on which user may do it, so any normal user who has access to the sound device can easily consume half a GB of kernel-space pages. For a big server that's no problem, but for a small system it's costly.
Takashi
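The half-GB figure above is simply the stream count times the per-stream maximum. A small sketch of that arithmetic, using the 16 streams and 32MB from the message (taken as 32 MiB here); the function name is hypothetical:

```c
#include <stdint.h>

/*
 * Worst-case amount of pinned (non-swappable) memory if every stream
 * is opened with the largest buffer it is allowed to request.
 */
uint64_t worst_case_pinned_bytes(unsigned int streams, uint64_t bytes_per_stream)
{
	return (uint64_t)streams * bytes_per_stream;
}
```

For 16 streams at 32 MiB each this gives 512 MiB, while the same 16 streams at the default 64 kB preallocation would pin only 1 MiB.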
On 2020/1/17 6:30 PM, Takashi Iwai wrote:
> What I meant as "pinned" is that the pages are not swapped out by swapper process like the user-space or anonymous pages. So if you open all streams (say 16 streams) on a machine with 32MB buffers, it'll cost a half GB. And, we have no restriction about which user may do it, so all normal users who have the access to the sound device can consume a half GB kernel space pages easily. For a big server it's no problem, but for a small system, it's costing.
Understood. You are concerned about an intentional memory-consumption attack from user space, and you propose that normal users should be limited to the default 64KB; if a larger buffer is required, they should use the procfs expert mode. Is my understanding correct?
Thanks, ~Keyon
On Fri, 17 Jan 2020 11:56:48 +0100, Keyon Jie wrote:
> Understood, you are concerning about intentional attack from user space about memory consuming, you propose that normal user should be permitted to use the default 64KB only, if larger buffer required, please use proc fs expert mode, is my understanding correct?
Well, a normal user may want a 1MB or 2MB buffer, and that's not too bad. So most distros already set a larger preallocation for HD-audio explicitly via CONFIG_SND_HDA_PREALLOC_SIZE, without procfs adjustment, I believe. The system then allows normal users buffers up to the given size.
Takashi
On 2020/1/17 1:40 AM, Pierre-Louis Bossart wrote:
> Right, that's a valid case, but that's 256 kB, not 'very large' or likely to ever trigger an OOM case.
For S24_LE, it is 512KB. The point is that if we can't reallocate the buffer at the hw_params() stage, then we need to follow a BKM of preallocating, at pcm_new(), the largest DMA buffer that we claim to support. I think this is actually another kind of waste, since these largest pinned buffers can't be swapped out...
Thanks, ~Keyon
On Fri, 17 Jan 2020 06:37:16 +0100, Keyon Jie wrote:
> For S24_LE, it is 512KB, the point is that if we can't re-allocate buffer at hw_params() stage, then we need follow a BKM that we have to preallocate the largest DMA buffer that we claim to support at pcm_new(), I think this is actually another kind of waste with these largest pinned buffer that can't be swapped out...
Well, that's the case where you'd need a larger preallocation. I guess many distros already set it to a higher value for PulseAudio. The default 64kB is just for historical and compatibility reasons, and we may extend it to 1MB or so now.
Takashi
On 2020/1/17 4:00 PM, Takashi Iwai wrote:
> Well, that's the case you'd need a larger preallocation. I guess many distros already set it to a higher value for PulseAudio. The default 64kB is just from historical and compatibility reason, and we may extend it to 1MB or so now.
In the SOF driver, we don't use a kernel config item like CONFIG_SND_HDA_PREALLOC_SIZE for HDA; the code for it is:
    snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
                                  SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
                                  le32_to_cpu(caps->buffer_size_min),
                                  le32_to_cpu(caps->buffer_size_max));
So the preallocated size is configured via the topology file, that is caps->buffer_size_min; there is no chance for PulseAudio to reconfigure it.
So, it looks like we have to change it to this if we don't change the ALSA core:
     snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream,
                                   SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
    -                              le32_to_cpu(caps->buffer_size_min),
    +                              le32_to_cpu(caps->buffer_size_max),
                                   le32_to_cpu(caps->buffer_size_max));
Thanks, ~Keyon
On Fri, 17 Jan 2020 11:43:24 +0100, Keyon Jie wrote:
> So, it looks like we have to change it to this if we don't change the ALSA core:
>
>   snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream, SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
> - le32_to_cpu(caps->buffer_size_min),
> + le32_to_cpu(caps->buffer_size_max),
>   le32_to_cpu(caps->buffer_size_max));
Yes, passing buffer_size_min for the preallocation already sounds bad. The default value should be sufficient for usual operations, not the cost-cutting minimum; otherwise there is no merit in preallocation.
Alternatively, we may pass 0 there, indicating no limitation. But this would need some other adjustments, e.g. snd_pcm_hardware should have a lower buffer_bytes_max.
Takashi
On 2020/1/17 7:12 PM, Takashi Iwai wrote:
> Yes, passing buffer_size_min for the preallocation sounds already bad. The default value should be sufficient for usual operations, not the cost-cutting minimum. Otherwise there is no merit of preallocation.
>
> Alternatively, we may pass 0 there, indicating no limitation, too. But, this would need a bit other adjustment, e.g. snd_pcm_hardware should have lower buffer_bytes_max.
Thank you Takashi. Then let's preallocate with caps->buffer_size_max. As we don't specify any limitations in snd_pcm_hardware today, we want to leave it configurable by each specific topology file for different machines.
Thanks, ~Keyon
On Sun, 19 Jan 2020 04:52:55 +0100, Keyon Jie wrote:
> Thank you Takashi, then let's follow it to pre-allocate with caps->buffer_size_max, as we don't specify any limitations in snd_pcm_hardware today, we want to leave it configurable to each specific topology file for different machines.
How big is caps->buffer_size_max? Passing the value there means actually trying to allocate the given size by default, and it'd be a lot of waste if too large a value (e.g. 32MB) is passed there.
I think we can go for passing zero as the default, which means skipping preallocation. In addition, we may add an upper limit on the total amount of allocation per card, controlled in pcm_memory.c, for example. This logic can be applied to the legacy HDA, too.
This should be relatively easy, and I'll provide the patch next week.
Takashi
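The per-card upper limit proposed above can be pictured as a simple byte budget shared by all streams of a card. The following is only an illustrative user-space sketch of that idea, not the patch Takashi refers to; the struct, the function names and the 32 MiB figure used in the usage note are all assumptions made for this example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-card accounting; none of these names come from the kernel. */
struct card_budget {
	size_t limit_bytes;	/* upper limit for the whole card */
	size_t used_bytes;	/* sum of buffers currently allocated */
};

/* Reserve bytes for one stream; returns 0 on success, -ENOMEM if over the limit. */
int card_budget_reserve(struct card_budget *b, size_t bytes)
{
	/* used_bytes never exceeds limit_bytes, so this subtraction cannot wrap */
	if (bytes > b->limit_bytes - b->used_bytes)
		return -ENOMEM;
	b->used_bytes += bytes;
	return 0;
}

/* Give the bytes back when a stream's buffer is freed. */
void card_budget_release(struct card_budget *b, size_t bytes)
{
	b->used_bytes = bytes > b->used_bytes ? 0 : b->used_bytes - bytes;
}
```

With a budget of 32 MiB, for instance, an 8 MiB deep-buffer stream would be accepted, a further 32 MiB request would be refused with -ENOMEM, and the space becomes available again once the first stream releases its buffer. The point of the design is that the limit bounds the card as a whole, so opening many streams cannot multiply the pinned memory.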
On 2020/1/19 3:09 PM, Takashi Iwai wrote:
> How big is caps->buffer_size_max? Passing the value there means actually trying to allocate the given size as default, and it'd be a lot of waste if a too large value (e.g. 32MB) is passed there.
It varies for each stream: most are only 65536 bytes, whereas the one for Wake-On-Voice, which might need a buffer of more than 4 seconds, could be up to about 1~2 MB, and another one for deep-buffer playback can be up to about 8 MB.
> I think we can go for passing zero as default, which means skipping preallocation.
Just did an experiment and this works for me. I believe we still need to call snd_pcm_set_managed_buffer() even though the preallocation is skipped in this case, right?
> In addition, we may add an upper limit of the total amount of allocation per card, controlled in pcm_memory.c, for example. This logic can be applied to the legacy HDA, too.
>
> This should be relatively easy, and I'll provide the patch in the next week.
OK, that's fine for me also, thank you.
~Keyon
Takashi _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On Sun, 19 Jan 2020 09:11:17 +0100, Keyon Jie wrote:
On 2020/1/19 3:09 PM, Takashi Iwai wrote:
On Sun, 19 Jan 2020 04:52:55 +0100, Keyon Jie wrote:
On 2020/1/17 7:12 PM, Takashi Iwai wrote:
On Fri, 17 Jan 2020 11:43:24 +0100, Keyon Jie wrote:
In SOF driver, we don't use kernel config item like CONFIG_SND_HDA_PREALLOC_SIZE for HDA, the code for it is:
snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream, SNDRV_DMA_TYPE_DEV_SG, sdev->dev, le32_to_cpu(caps->buffer_size_min), le32_to_cpu(caps->buffer_size_max));
So the preallocated size is configured via topology file, that is caps->buffer_size_min, no chance for PulseAudio to reconfigure it.
So, it looks like we have to change it to this if we don't change the ALSA core:
snd_pcm_lib_preallocate_pages(pcm->streams[stream].substream, SNDRV_DMA_TYPE_DEV_SG, sdev->dev,
le32_to_cpu(caps->buffer_size_min),
le32_to_cpu(caps->buffer_size_max), le32_to_cpu(caps->buffer_size_max));
Yes, passing buffer_size_min for the preallocation sounds already bad. The default value should be sufficient for usual operations, not the cost-cutting minimum. Otherwise there is no merit of preallocation.
Alternatively, we may pass 0 there, indicating no limitation, too. But, this would need a bit other adjustment, e.g. snd_pcm_hardware should have lower buffer_bytes_max.
Thank you Takashi, then let's follow it to pre-allocate with caps->buffer_size_max, as we don't specify any limitations in snd_pcm_hardware today, we want to leave it configurable to each specific topology file for different machines.
How big is caps->buffer_size_max? Passing the value there means actually trying to allocate the given size as default, and it'd be a lot of waste if a too large value (e.g. 32MB) is passed there.
It varies for each stream, most of them are 65536 Bytes only, whereas one for Wake-On-Voice might need a > 4 Seconds buffer could be up to about 1~2MBytes, and another one for deep-buffer playback can be up to about 8MBytes.
Hm, so this varies so much depending on the use case? I thought it comes from the topology file and it's essentially consistent over various purposes.
I think we can go for passing zero as default, which means skipping preallocation. In addition, we may add an upper limit of the total
Just did an experiment and this works for me, I believe we still need to call snd_pcm_set_managed_buffer() though the preallocation is skipped in this, right?
No, snd_pcm_set_managed_buffer() is the new PCM preallocation API. The old snd_pcm_lib_preallocate*() is almost gone.
amount of allocation per card, controlled in pcm_memory.c, for example. This logic can be applied to the legacy HDA, too.
This should be relatively easy, and I'll provide the patch in the next week.
OK, that's fine for me also, thank you.
Below is a quick hack for HDA. We still need a certain amount of preallocation for non-x86 systems that don't support SG-buffers, so a bit of a trick is applied to Kconfig.
Totally untested, as usual.
thanks,
Takashi
---
diff --git a/include/sound/core.h b/include/sound/core.h
index 0e14b7a3e67b..ac8b692b69b4 100644
--- a/include/sound/core.h
+++ b/include/sound/core.h
@@ -120,6 +120,9 @@ struct snd_card {
 	int sync_irq;			/* assigned irq, used for PCM sync */
 	wait_queue_head_t remove_sleep;
 
+	size_t total_pcm_alloc_bytes;	/* total amount of allocated buffers */
+	struct mutex memory_mutex;	/* protection for the above */
+
 #ifdef CONFIG_PM
 	unsigned int power_state;	/* power state */
 	wait_queue_head_t power_sleep;
diff --git a/sound/core/init.c b/sound/core/init.c
index faa9f03c01ca..b02a99766351 100644
--- a/sound/core/init.c
+++ b/sound/core/init.c
@@ -211,6 +211,7 @@ int snd_card_new(struct device *parent, int idx, const char *xid,
 	INIT_LIST_HEAD(&card->ctl_files);
 	spin_lock_init(&card->files_lock);
 	INIT_LIST_HEAD(&card->files_list);
+	mutex_init(&card->memory_mutex);
 #ifdef CONFIG_PM
 	init_waitqueue_head(&card->power_sleep);
 #endif
diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c
index d4702cc1d376..4883b0ccd475 100644
--- a/sound/core/pcm_memory.c
+++ b/sound/core/pcm_memory.c
@@ -27,6 +27,37 @@ MODULE_PARM_DESC(maximum_substreams, "Maximum substreams with preallocated DMA m
 
 static const size_t snd_minimum_buffer = 16384;
 
+static unsigned long max_alloc_per_card = 32UL * 1024UL * 1024UL * 1024UL;
+module_param(max_alloc_per_card, ulong, 0644);
+MODULE_PARM_DESC(max_alloc_per_card, "Max total allocation bytes per card.");
+
+static int do_alloc_pages(struct snd_card *card, int type, struct device *dev,
+			  size_t size, struct snd_dma_buffer *dmab)
+{
+	int err;
+
+	if (card->total_pcm_alloc_bytes + size > max_alloc_per_card)
+		return -ENOMEM;
+	err = snd_dma_alloc_pages(type, dev, size, dmab);
+	if (!err) {
+		mutex_lock(&card->memory_mutex);
+		card->total_pcm_alloc_bytes += dmab->bytes;
+		mutex_unlock(&card->memory_mutex);
+	}
+	return err;
+}
+
+static void do_free_pages(struct snd_card *card, struct snd_dma_buffer *dmab)
+{
+	if (!dmab->area)
+		return;
+	mutex_lock(&card->memory_mutex);
+	WARN_ON(card->total_pcm_alloc_bytes < dmab->bytes);
+	card->total_pcm_alloc_bytes -= dmab->bytes;
+	mutex_unlock(&card->memory_mutex);
+	snd_dma_free_pages(dmab);
+	dmab->area = NULL;
+}
 
 /*
  * try to allocate as the large pages as possible.
@@ -37,16 +68,15 @@ static const size_t snd_minimum_buffer = 16384;
 static int preallocate_pcm_pages(struct snd_pcm_substream *substream, size_t size)
 {
 	struct snd_dma_buffer *dmab = &substream->dma_buffer;
+	struct snd_card *card = substream->pcm->card;
 	size_t orig_size = size;
 	int err;
 
 	do {
-		if ((err = snd_dma_alloc_pages(dmab->dev.type, dmab->dev.dev,
-					       size, dmab)) < 0) {
-			if (err != -ENOMEM)
-				return err; /* fatal error */
-		} else
-			return 0;
+		err = do_alloc_pages(card, dmab->dev.type, dmab->dev.dev,
+				     size, dmab);
+		if (err != -ENOMEM)
+			return err;
 		size >>= 1;
 	} while (size >= snd_minimum_buffer);
 	dmab->bytes = 0; /* tell error */
@@ -62,10 +92,7 @@ static int preallocate_pcm_pages(struct snd_pcm_substream *substream, size_t siz
  */
 static void snd_pcm_lib_preallocate_dma_free(struct snd_pcm_substream *substream)
 {
-	if (substream->dma_buffer.area == NULL)
-		return;
-	snd_dma_free_pages(&substream->dma_buffer);
-	substream->dma_buffer.area = NULL;
+	do_free_pages(substream->pcm->card, &substream->dma_buffer);
 }
 
 /**
@@ -130,6 +157,7 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
 					       struct snd_info_buffer *buffer)
 {
 	struct snd_pcm_substream *substream = entry->private_data;
+	struct snd_card *card = substream->pcm->card;
 	char line[64], str[64];
 	size_t size;
 	struct snd_dma_buffer new_dmab;
@@ -150,9 +178,10 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
 		memset(&new_dmab, 0, sizeof(new_dmab));
 		new_dmab.dev = substream->dma_buffer.dev;
 		if (size > 0) {
-			if (snd_dma_alloc_pages(substream->dma_buffer.dev.type,
-						substream->dma_buffer.dev.dev,
-						size, &new_dmab) < 0) {
+			if (do_alloc_pages(card,
+					   substream->dma_buffer.dev.type,
+					   substream->dma_buffer.dev.dev,
+					   size, &new_dmab) < 0) {
 				buffer->error = -ENOMEM;
 				return;
 			}
@@ -161,7 +190,7 @@ static void snd_pcm_lib_preallocate_proc_write(struct snd_info_entry *entry,
 			substream->buffer_bytes_max = UINT_MAX;
 		}
 		if (substream->dma_buffer.area)
-			snd_dma_free_pages(&substream->dma_buffer);
+			do_free_pages(card, &substream->dma_buffer);
 		substream->dma_buffer = new_dmab;
 	} else {
 		buffer->error = -EINVAL;
@@ -346,6 +375,7 @@ struct page *snd_pcm_sgbuf_ops_page(struct snd_pcm_substream *substream, unsigne
  */
 int snd_pcm_lib_malloc_pages(struct snd_pcm_substream *substream, size_t size)
 {
+	struct snd_card *card = substream->pcm->card;
 	struct snd_pcm_runtime *runtime;
 	struct snd_dma_buffer *dmab = NULL;
 
@@ -374,9 +404,10 @@ int snd_pcm_lib_malloc_pages(struct snd_pcm_substream *substream, size_t size)
 		if (! dmab)
 			return -ENOMEM;
 		dmab->dev = substream->dma_buffer.dev;
-		if (snd_dma_alloc_pages(substream->dma_buffer.dev.type,
-					substream->dma_buffer.dev.dev,
-					size, dmab) < 0) {
+		if (do_alloc_pages(card,
+				   substream->dma_buffer.dev.type,
+				   substream->dma_buffer.dev.dev,
+				   size, dmab) < 0) {
 			kfree(dmab);
 			return -ENOMEM;
 		}
@@ -397,6 +428,7 @@ EXPORT_SYMBOL(snd_pcm_lib_malloc_pages);
  */
 int snd_pcm_lib_free_pages(struct snd_pcm_substream *substream)
 {
+	struct snd_card *card = substream->pcm->card;
 	struct snd_pcm_runtime *runtime;
 
 	if (PCM_RUNTIME_CHECK(substream))
@@ -406,7 +438,7 @@ int snd_pcm_lib_free_pages(struct snd_pcm_substream *substream)
 		return 0;
 	if (runtime->dma_buffer_p != &substream->dma_buffer) {
 		/* it's a newly allocated buffer. release it now. */
-		snd_dma_free_pages(runtime->dma_buffer_p);
+		do_free_pages(card, runtime->dma_buffer_p);
 		kfree(runtime->dma_buffer_p);
 	}
 	snd_pcm_set_runtime_buffer(substream, NULL);
diff --git a/sound/hda/Kconfig b/sound/hda/Kconfig
index b0c88fe040ee..4ca6b09056f3 100644
--- a/sound/hda/Kconfig
+++ b/sound/hda/Kconfig
@@ -21,14 +21,16 @@ config SND_HDA_EXT_CORE
 	select SND_HDA_CORE
 
 config SND_HDA_PREALLOC_SIZE
-	int "Pre-allocated buffer size for HD-audio driver"
+	int "Pre-allocated buffer size for HD-audio driver" if !SND_DMA_SGBUF
 	range 0 32768
-	default 64
+	default 0 if SND_DMA_SGBUF
+	default 64 if !SND_DMA_SGBUF
 	help
 	  Specifies the default pre-allocated buffer-size in kB for the
 	  HD-audio driver. A larger buffer (e.g. 2048) is preferred
 	  for systems using PulseAudio. The default 64 is chosen just
 	  for compatibility reasons.
+	  On x86 systems, the default is zero as we need no preallocation.
 
 	  Note that the pre-allocation size can be changed dynamically
 	  via a proc file (/proc/asound/card*/pcm*/sub*/prealloc), too.
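As an aside for readers, the per-card accounting idea in the patch above can be sketched in plain user-space C. This is only a toy model under stated assumptions: `toy_card`, `toy_buffer`, `toy_alloc`, `toy_free` and `toy_preallocate` are hypothetical names, `malloc()` stands in for `snd_dma_alloc_pages()`, the model is single-threaded so the mutex is left out, and unlike the kernel helper the halving routine here returns an error when nothing fits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two fields the patch adds to struct snd_card. */
struct toy_card {
	size_t total_alloc_bytes;	/* running total of live buffers */
	size_t max_alloc_bytes;		/* cap, like max_alloc_per_card */
};

/* Hypothetical stand-in for struct snd_dma_buffer. */
struct toy_buffer {
	void *area;
	size_t bytes;
};

/* Allocate only if the request keeps the card within its cap. */
int toy_alloc(struct toy_card *card, size_t size, struct toy_buffer *buf)
{
	if (size == 0)
		return -EINVAL;
	/* Written as a subtraction so the comparison cannot overflow. */
	if (card->total_alloc_bytes > card->max_alloc_bytes ||
	    size > card->max_alloc_bytes - card->total_alloc_bytes)
		return -ENOMEM;
	buf->area = malloc(size);
	if (!buf->area)
		return -ENOMEM;
	buf->bytes = size;
	card->total_alloc_bytes += size;
	return 0;
}

/* Release a buffer and give its bytes back to the card's budget. */
void toy_free(struct toy_card *card, struct toy_buffer *buf)
{
	if (!buf->area)
		return;
	assert(card->total_alloc_bytes >= buf->bytes);	/* mirrors the WARN_ON */
	card->total_alloc_bytes -= buf->bytes;
	free(buf->area);
	buf->area = NULL;
	buf->bytes = 0;
}

/* Halve the request on -ENOMEM until it fits or drops below min_size. */
int toy_preallocate(struct toy_card *card, size_t size, size_t min_size,
		    struct toy_buffer *buf)
{
	do {
		int err = toy_alloc(card, size, buf);

		if (err != -ENOMEM)
			return err;
		size >>= 1;
	} while (size >= min_size);
	buf->bytes = 0;		/* nothing could be allocated */
	return -ENOMEM;
}
```

With a 1024-byte cap and 768 bytes already in use, a 512-byte request is refused, while `toy_preallocate()` falls back to 256 bytes and fills the budget exactly.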
On 2020/1/19 5:04 PM, Takashi Iwai wrote:
On Sun, 19 Jan 2020 09:11:17 +0100, Keyon Jie wrote:
On 2020/1/19 3:09 PM, Takashi Iwai wrote:
It varies for each stream, most of them are 65536 Bytes only, whereas one for Wake-On-Voice might need a > 4 Seconds buffer could be up to about 1~2MBytes, and another one for deep-buffer playback can be up to about 8MBytes.
Hm, so this varies so much depending on the use case? I thought it comes from the topology file and it's essentially consistent over various purposes.
Yes, we set a different buffer_bytes_max limit for each stream depending on its use case. Basically, we set it to just the maximum value we claim to support, since we don't want to waste any system memory.
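To put the sizes mentioned above in perspective, here is a small worked calculation in C. The sample rates and formats are assumptions chosen for illustration only; the thread does not state which formats the SOF topologies actually use, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to hold `seconds` of interleaved PCM audio. */
size_t pcm_buffer_bytes(unsigned int rate_hz, unsigned int channels,
			unsigned int bytes_per_sample, unsigned int seconds)
{
	return (size_t)rate_hz * channels * bytes_per_sample * seconds;
}

/* Whole milliseconds of audio that fit into a buffer of `bytes`. */
unsigned int pcm_buffer_msecs(size_t bytes, unsigned int rate_hz,
			      unsigned int channels,
			      unsigned int bytes_per_sample)
{
	/* 64-bit intermediate so that bytes * 1000 cannot overflow. */
	uint64_t bytes_per_sec = (uint64_t)rate_hz * channels * bytes_per_sample;

	return (unsigned int)((uint64_t)bytes * 1000 / bytes_per_sec);
}
```

Assuming 48 kHz stereo with 4-byte samples, 4 seconds of audio needs 1,536,000 bytes, which is consistent with the 1~2 MBytes quoted for Wake-On-Voice. Assuming 48 kHz stereo with 2-byte samples, a 65536-byte buffer holds about 341 ms, and an 8 MiB deep buffer holds roughly 43.7 seconds.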
I think we can go for passing zero as default, which means skipping preallocation. In addition, we may add an upper limit of the total
Just did an experiment and this works for me, I believe we still need to call snd_pcm_set_managed_buffer() though the preallocation is skipped in this, right?
No, snd_pcm_set_managed_buffer() is the new PCM preallocation API. The old snd_pcm_lib_preallocate*() is almost gone.
What I actually asked is: since the preallocation will be skipped (by passing size=0), can we simply not call snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*() in our SOF PCM driver? I believe not (we still need the call to do the initialization other than the buffer allocation)?
amount of allocation per card, controlled in pcm_memory.c, for example. This logic can be applied to the legacy HDA, too.
This should be relatively easy, and I'll provide the patch in the next week.
OK, that's fine for me also, thank you.
Below is a quick hack for HDA. We still need the certain amount of preallocation for non-x86 systems that don't support SG-buffers, so a bit of trick is applied to Kconfig.
Totally untested, as usual.
Did a quick test (plus passing size 0 for preallocation in the SOF PCM driver) and it works for my use case (no regression compared with not applying this patch). Thank you.
Thanks, ~Keyon
thanks,
Takashi
On Sun, 19 Jan 2020 11:14:56 +0100, Keyon Jie wrote:
On 2020/1/19 5:04 PM, Takashi Iwai wrote:
On Sun, 19 Jan 2020 09:11:17 +0100, Keyon Jie wrote:
On 2020/1/19 3:09 PM, Takashi Iwai wrote:
It varies for each stream, most of them are 65536 Bytes only, whereas one for Wake-On-Voice might need a > 4 Seconds buffer could be up to about 1~2MBytes, and another one for deep-buffer playback can be up to about 8MBytes.
Hm, so this varies so much depending on the use case? I thought it comes from the topology file and it's essentially consistent over various purposes.
Yes, we add different buffer_bytes_max limitation to each stream depending on its use case, basically we set it to the maximum value we claim to support only, we don't want to waste any of the system memory.
I think we can go for passing zero as default, which means skipping preallocation. In addition, we may add an upper limit of the total
Just did an experiment and this works for me, I believe we still need to call snd_pcm_set_managed_buffer() though the preallocation is skipped in this, right?
No, snd_pcm_set_managed_buffer() is the new PCM preallocation API. The old snd_pcm_lib_preallocate*() is almost gone.
What I asked is actually that since the preallocation will be skipped(with passing size=0), can we just not calling snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*() in our SOF PCM driver? I believe no(we still need the invoking to do initialization except buffer allocating)?
You still need to call it. Otherwise the PCM core doesn't know what kind of buffer type has to be allocated.
Basically snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*() does two things: set the buffer type and its preallocation (default and max size). The latter default size can be 0, meaning that no default preallocation is performed. Also the max can be 0, i.e. no preallocation is needed at all for the buffers (e.g. vmalloc buffers). Meanwhile the buffer type and its device pointer are mandatory and can't be skipped.
amount of allocation per card, controlled in pcm_memory.c, for example. This logic can be applied to the legacy HDA, too.
This should be relatively easy, and I'll provide the patch in the next week.
OK, that's fine for me also, thank you.
Below is a quick hack for HDA. We still need the certain amount of preallocation for non-x86 systems that don't support SG-buffers, so a bit of trick is applied to Kconfig.
Totally untested, as usual.
Did a quick test(plus passing 0 size for preallocate in SOF PCM driver) and it works for my use case(no regression comparing that without applying this patch), Thank you.
OK, will tidy up and submit later.
Takashi
On 2020/1/19 6:43 PM, Takashi Iwai wrote:
On Sun, 19 Jan 2020 11:14:56 +0100, Keyon Jie wrote:
On 2020/1/19 5:04 PM, Takashi Iwai wrote:
On Sun, 19 Jan 2020 09:11:17 +0100, Keyon Jie wrote:
On 2020/1/19 3:09 PM, Takashi Iwai wrote:
It varies for each stream, most of them are 65536 Bytes only, whereas one for Wake-On-Voice might need a > 4 Seconds buffer could be up to about 1~2MBytes, and another one for deep-buffer playback can be up to about 8MBytes.
Hm, so this varies so much depending on the use case? I thought it comes from the topology file and it's essentially consistent over various purposes.
Yes, we add different buffer_bytes_max limitation to each stream depending on its use case, basically we set it to the maximum value we claim to support only, we don't want to waste any of the system memory.
I think we can go for passing zero as default, which means skipping preallocation. In addition, we may add an upper limit of the total
Just did an experiment and this works for me, I believe we still need to call snd_pcm_set_managed_buffer() though the preallocation is skipped in this, right?
No, snd_pcm_set_managed_buffer() is the new PCM preallocation API. The old snd_pcm_lib_preallocate*() is almost gone.
What I asked is actually that since the preallocation will be skipped(with passing size=0), can we just not calling snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*() in our SOF PCM driver? I believe no(we still need the invoking to do initialization except buffer allocating)?
You still need to call it. Otherwise the PCM core doesn't know what kind of buffer type has to be allocated.
Basically snd_pcm_set_managed_buffer() or snd_pcm_lib_preallocate*() does two things: set the buffer type and its preallocation (default and max size). The latter default size can be 0, meaning that no default preallocation is performed. Also the max can be 0, i.e. no preallocation is needed at all for the buffers (e.g. vmalloc buffers). Meanwhile the buffer type and its device pointer are mandatory and can't be skipped.
Got it, thanks for the guidance, Takashi.
Thanks, ~Keyon
amount of allocation per card, controlled in pcm_memory.c, for example. This logic can be applied to the legacy HDA, too.
This should be relatively easy, and I'll provide the patch in the next week.
OK, that's fine for me also, thank you.
Below is a quick hack for HDA. We still need the certain amount of preallocation for non-x86 systems that don't support SG-buffers, so a bit of trick is applied to Kconfig.
Totally untested, as usual.
Did a quick test(plus passing 0 size for preallocate in SOF PCM driver) and it works for my use case(no regression comparing that without applying this patch), Thank you.
OK, will tidy up and submit later.
Takashi
On Thu, 16 Jan 2020 15:14:28 +0100, Jie, Yang wrote:
-----Original Message-----
From: Alsa-devel alsa-devel-bounces@alsa-project.org On Behalf Of Takashi Iwai
Sent: Thursday, January 16, 2020 7:51 PM
To: Keyon Jie yang.jie@linux.intel.com
Cc: alsa-devel@alsa-project.org
Subject: Re: [alsa-devel] [PATCH] ALSA: pcm: fix buffer_bytes max constrained by preallocated bytes issue
On Thu, 16 Jan 2020 12:25:38 +0100, Keyon Jie wrote:
On Thu, 2020-01-16 at 11:27 +0100, Takashi Iwai wrote:
On Thu, 16 Jan 2020 10:50:33 +0100,
Oh, you're right, and I completely misread the patch.
Now I took a coffee and can tell you the story behind the scene.
I believe the current code is intentionally limiting the size to the preallocated size. This limitation was introduced to avoid trying to allocate a larger buffer when a buffer has already been preallocated. In the past, most hardware needed contiguous pages for a buffer, and the allocation of a large buffer was quite likely to fail. This was the reason for the buffer preallocation. So, the driver wanted to tell user space the limit. If users need an extra large buffer, they are supposed to fiddle with the prealloc procfs file (either setting zero to clear the preallocation or setting a large enough buffer beforehand).
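The limit described in this paragraph can be modeled in a few lines of C. This is a toy sketch, not the kernel's hw-rule code: `effective_buffer_bytes_max` is a hypothetical name, `hw_max` stands for the driver's `snd_pcm_hardware.buffer_bytes_max`, and a `prealloc_bytes` of 0 is assumed to mean that the preallocation was cleared or never made.

```c
#include <stddef.h>

/*
 * Upper bound for the buffer size as seen by user space in this model.
 * An existing preallocation caps the size; without one, only the
 * driver's own maximum applies.
 */
size_t effective_buffer_bytes_max(size_t hw_max, size_t prealloc_bytes)
{
	if (prealloc_bytes > 0 && prealloc_bytes < hw_max)
		return prealloc_bytes;
	return hw_max;
}
```

In this model a driver maximum of 8 MiB with a 64 KiB preallocation yields a 64 KiB limit, and clearing the preallocation restores the full 8 MiB.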
Thank you for sharing; it is interesting and I learned something from it.
For SG-buffers, though, the limitation makes less sense than for contiguous pages. E.g. a patch below removes the limitation for SG-buffers. But changing this would definitely cause a behavior difference, and I don't know whether it's a reasonable move -- I'm afraid that apps would start hogging too much memory if the limitation is gone.
I just went through all the callers of snd_pcm_lib_preallocate_pages*(). For those using SNDRV_DMA_TYPE_DEV, some set the *size* equal to the *max*, and some set the *max* to several times the *size*. IMHO, the *max* values, rather than the *size* values, are the ones matched to the hardware's limitations, aren't they?
In this case, I still think my patch handles all the SNDRV_DMA_TYPE_DEV and SNDRV_DMA_TYPE_DEV_SG cases more gracefully: we still take the limitation set by the specific driver from the *max* param. The test results look very nice here: we get exactly the buffer-bytes that user space requested via aplay, as long as it fits the interval and constraints.
Well, I have a mixed feeling. Certainly we'd need some better way to allow a larger buffer allocation, especially for HDA. OTOH, if the buffer was preallocated, it's meant to be used actually. That's the point of the hw_constraint setup.
So if the buffer was preallocated, it won't be reallocated at the hw_params() stage. Does this conflict with the reallocation logic in hw_params()?
If a larger buffer than the preallocated one is requested, the PCM core tries to allocate another buffer while the preallocated one remains untouched. It's because such a larger allocation is supposed to be a one-off thing. Normal usage should fit within the preallocated buffer size.
That said, if a larger buffer is required too frequently, it makes no sense to keep the preallocation. Or, preallocate larger buffers instead.
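The decision described above can be written down as a tiny C model. The names `toy_substream`, `buffer_source` and `pick_buffer` are hypothetical and the sketch only mirrors the rule as stated in this message; it is not the PCM core's implementation.

```c
#include <stddef.h>

/* Hypothetical substream state: only the preallocated size matters here. */
struct toy_substream {
	size_t prealloc_bytes;	/* 0 means nothing was preallocated */
};

enum buffer_source {
	USE_PREALLOCATED,	/* request fits into the preallocated buffer */
	ALLOCATE_NEW		/* one-off allocation; preallocation stays as-is */
};

/* Decide where the buffer for a request of `requested` bytes comes from. */
enum buffer_source pick_buffer(const struct toy_substream *s, size_t requested)
{
	if (s->prealloc_bytes > 0 && requested <= s->prealloc_bytes)
		return USE_PREALLOCATED;
	return ALLOCATE_NEW;
}
```

With a 64 KiB preallocation, requests up to 64 KiB reuse it, and a 128 KiB request triggers a separate allocation. A substream without any preallocation always allocates.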
And now thinking again after another cup of coffee, I wonder why we do preallocate for HDA at all. For HD-audio, the allocation of any large buffer would succeed very likely because of SG-buffer.
So, just setting 0 to the preallocation size (but keeping else) would work, e.g. something like below? The help text needs adjustment, but you can see the rough idea.
So, do you suggest not doing preallocation (or calling it with size 0) for all drivers with TYPE_SG? I am fine with that if it is the recommended method. I can try this on an SOF I2S platform to see whether it works as we require for very large buffer sizes.
This really depends on the use case, and I'm not yet sure whether no preallocation is really recommended or not. Without preallocation, each PCM open involves a large number of page allocations, which makes it easier for users to hog resources. It'll use vmalloc addresses, which are not unlimited, and may easily trigger OOM.
thanks,
Takashi
participants (5): Jie, Yang; Keyon Jie; Pierre-Louis Bossart; Rajwa, Marcin; Takashi Iwai