[Sound-open-firmware] Mixer doubts
Hello, everyone.
I’ve been reading through the mixer code and there’s one part which leaves me somewhat doubtful:
/* mix N PCM source streams to one sink stream */ static void mix_n(struct comp_dev *dev, struct comp_buffer *sink, struct comp_buffer **sources, uint32_t num_sources, uint32_t frames) { int32_t *src; int32_t *dest = sink->w_ptr; int32_t count; int64_t val[2]; int i; int j;
count = frames * dev->params.channels;
for (i = 0; i < count; i += 2) { val[0] = 0; val[1] = 0; for (j = 0; j < num_sources; j++) { src = sources[j]->r_ptr;
/* TODO: clamp */ val[0] += src[i]; val[1] += src[i + 1]; }
/* TODO: best place for attenuation ? */ dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1)); } }
More specifically: dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1));
Going back in the history of the file I found it was supposed to be an average. However, it will only be an average if num_sources is power of 2, because otherwise such bitwise operations will yield different results from what I take should be an optimized way for division.
For example, suppose we have 2 cases: • We have 2 sources, thus num_sources >> 1 will equal 1, valid right bitshift, same as division by the number of sources, which is 2 • We have 3 sources, thus num_sources >> 1 will equal 1 once again, no longer valid, because it will yield the same result as division by 2, instead of 3.
There’s one more “problem”, by averaging the samples like that we will end up with quieter output for when we have 1 source completely silent. Is that intentional?
Whether it’s all deliberate or not, or simply whether I misunderstood the code, I hope you can help me clear this up.
Thank you, Sławomir Błauciak
On Tue, 2018-07-17 at 08:45 +0200, slawomir.blauciak@linux.intel.com wrote:
Hello, everyone.
I’ve been reading through the mixer code and there’s one part which leaves me somewhat doubtful:
/* mix N PCM source streams to one sink stream */ static void mix_n(struct comp_dev *dev, struct comp_buffer *sink, struct comp_buffer **sources, uint32_t num_sources, uint32_t frames) { int32_t *src; int32_t *dest = sink->w_ptr; int32_t count; int64_t val[2]; int i; int j;
count = frames * dev->params.channels; for (i = 0; i < count; i += 2) { val[0] = 0; val[1] = 0; for (j = 0; j < num_sources; j++) { src = sources[j]->r_ptr; /* TODO: clamp */ val[0] += src[i]; val[1] += src[i + 1]; } /* TODO: best place for attenuation ? */ dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1)); }
}
More specifically: dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1));
Going back in the history of the file I found it was supposed to be an average. However, it will only be an average if num_sources is power of 2, because otherwise such bitwise operations will yield different results from what I take should be an optimized way for division.
For example, suppose we have 2 cases: • We have 2 sources, thus num_sources >> 1 will equal 1, valid right bitshift, same as division by the number of sources, which is 2 • We have 3 sources, thus num_sources >> 1 will equal 1 once again, no longer valid, because it will yield the same result as division by 2, instead of 3.
There’s one more “problem”, by averaging the samples like that we will end up with quieter output for when we have 1 source completely silent. Is that intentional?
Whether it’s all deliberate or not, or simply whether I misunderstood the code, I hope you can help me clear this up.
The intention is we have a generic optimised version that is easy to vectorize for 2,4,8 channels (on any architecture). We will need to have a separate generic mixer for 1,3,5,6 etc.
We should also have architecture specific mixing too like we have for xtensa HiFi3 volume.
Liam
Thank you, Sławomir Błauciak
Sound-open-firmware mailing list Sound-open-firmware@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/sound-open-firmware
On 7/17/18 4:45 AM, Liam Girdwood wrote:
On Tue, 2018-07-17 at 08:45 +0200, slawomir.blauciak@linux.intel.com wrote:
Hello, everyone.
I’ve been reading through the mixer code and there’s one part which leaves me somewhat doubtful:
/* mix N PCM source streams to one sink stream */ static void mix_n(struct comp_dev *dev, struct comp_buffer *sink, struct comp_buffer **sources, uint32_t num_sources, uint32_t frames) { int32_t *src; int32_t *dest = sink->w_ptr; int32_t count; int64_t val[2]; int i; int j;
count = frames * dev->params.channels; for (i = 0; i < count; i += 2) { val[0] = 0; val[1] = 0; for (j = 0; j < num_sources; j++) { src = sources[j]->r_ptr; /* TODO: clamp */ val[0] += src[i]; val[1] += src[i + 1]; } /* TODO: best place for attenuation ? */ dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1)); }
}
More specifically: dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1));
Going back in the history of the file I found it was supposed to be an average. However, it will only be an average if num_sources is power of 2, because otherwise such bitwise operations will yield different results from what I take should be an optimized way for division.
For example, suppose we have 2 cases: • We have 2 sources, thus num_sources >> 1 will equal 1, valid right bitshift, same as division by the number of sources, which is 2 • We have 3 sources, thus num_sources >> 1 will equal 1 once again, no longer valid, because it will yield the same result as division by 2, instead of 3.
There’s one more “problem”, by averaging the samples like that we will end up with quieter output for when we have 1 source completely silent. Is that intentional?
Whether it’s all deliberate or not, or simply whether I misunderstood the code, I hope you can help me clear this up.
The intention is we have a generic optimised version that is easy to vectorize for 2,4,8 channels (on any architecture). We will need to have a separate generic mixer for 1,3,5,6 etc.
We should also have architecture specific mixing too like we have for xtensa HiFi3 volume.
I also don't agree with this right shift. It's well intended and aimed at avoiding saturations as you add more sources into the mix, but I don't think it should be done this way. In general, the only solutions that work well are: 1. make sure an attenuation is applied prior to the mix (i.e. don't compensate for saturations in the mix) 2. implement a soft limiter as part of the mixer
The #1 always win in practice for customer devices due to its simplicity. #2 is typical for pro-audio stuff.
Liam
Thank you, Sławomir Błauciak
Sound-open-firmware mailing list Sound-open-firmware@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/sound-open-firmware
Sound-open-firmware mailing list Sound-open-firmware@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/sound-open-firmware
On 17.07.2018 16:02, Pierre-Louis Bossart wrote:
On 7/17/18 4:45 AM, Liam Girdwood wrote:
On Tue, 2018-07-17 at 08:45 +0200, slawomir.blauciak@linux.intel.com wrote:
More specifically: dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1));
Going back in the history of the file I found it was supposed to be an average. However, it will only be an average if num_sources is power of 2, because otherwise such bitwise operations will yield different results from what I take should be an optimized way for division.
For example, suppose we have 2 cases: • We have 2 sources, thus num_sources >> 1 will equal 1, valid right bitshift, same as division by the number of sources, which is 2 • We have 3 sources, thus num_sources >> 1 will equal 1 once again, no longer valid, because it will yield the same result as division by 2, instead of 3.
There’s one more “problem”, by averaging the samples like that we will end up with quieter output for when we have 1 source completely silent. Is that intentional?
Whether it’s all deliberate or not, or simply whether I misunderstood the code, I hope you can help me clear this up.
The intention is we have a generic optimised version that is easy to vectorize for 2,4,8 channels (on any architecture). We will need to have a separate generic mixer for 1,3,5,6 etc.
We should also have architecture specific mixing too like we have for xtensa HiFi3 volume.
I also don't agree with this right shift. It's well intended and aimed at avoiding saturations as you add more sources into the mix, but I don't think it should be done this way. In general, the only solutions that work well are:
- make sure an attenuation is applied prior to the mix (i.e. don't
compensate for saturations in the mix) 2. implement a soft limiter as part of the mixer
The #1 always win in practice for customer devices due to its simplicity. #2 is typical for pro-audio stuff.
Yep, I proposed earlier this also but it was postponed. We should make sure the mixer uses saturating add operation and set mixer gain to 0 dB for all inputs. It's too tricky for system gains planning to have hidden gain enhancements (and the issue with non 2^N inputs number). We have already #1 enabled if the topology is setup to ensure no signal clipping in mixing.
The #2 with soft limiter as a separate component is in future plans too.
There's also third option to scale the pipelines to have some margin above 0 dB level (e.g. 24 dB) and have soft limiter or saturation only as last processing component in the pipelines. Though it works only with fixed point 32 bit and float pipelines where audio quality is not harmed significantly due to the extra room for the signal.
Thanks, Seppo
Liam
Thank you, Sławomir Błauciak
Sound-open-firmware mailing list Sound-open-firmware@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/sound-open-firmware
Sound-open-firmware mailing list Sound-open-firmware@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/sound-open-firmware
Sound-open-firmware mailing list Sound-open-firmware@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/sound-open-firmware
On 7/23/18 6:38 AM, Seppo Ingalsuo wrote:
On 17.07.2018 16:02, Pierre-Louis Bossart wrote:
On 7/17/18 4:45 AM, Liam Girdwood wrote:
On Tue, 2018-07-17 at 08:45 +0200, slawomir.blauciak@linux.intel.com wrote:
More specifically: dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1));
Going back in the history of the file I found it was supposed to be an average. However, it will only be an average if num_sources is power of 2, because otherwise such bitwise operations will yield different results from what I take should be an optimized way for division.
For example, suppose we have 2 cases: • We have 2 sources, thus num_sources >> 1 will equal 1, valid right bitshift, same as division by the number of sources, which is 2 • We have 3 sources, thus num_sources >> 1 will equal 1 once again, no longer valid, because it will yield the same result as division by 2, instead of 3.
There’s one more “problem”, by averaging the samples like that we will end up with quieter output for when we have 1 source completely silent. Is that intentional?
Whether it’s all deliberate or not, or simply whether I misunderstood the code, I hope you can help me clear this up.
The intention is we have a generic optimised version that is easy to vectorize for 2,4,8 channels (on any architecture). We will need to have a separate generic mixer for 1,3,5,6 etc.
We should also have architecture specific mixing too like we have for xtensa HiFi3 volume.
I also don't agree with this right shift. It's well intended and aimed at avoiding saturations as you add more sources into the mix, but I don't think it should be done this way. In general, the only solutions that work well are:
- make sure an attenuation is applied prior to the mix (i.e. don't
compensate for saturations in the mix) 2. implement a soft limiter as part of the mixer
The #1 always win in practice for customer devices due to its simplicity. #2 is typical for pro-audio stuff.
Yep, I proposed earlier this also but it was postponed. We should make sure the mixer uses saturating add operation and set mixer gain to 0 dB for all inputs. It's too tricky for system gains planning to have hidden gain enhancements (and the issue with non 2^N inputs number). We have already #1 enabled if the topology is setup to ensure no signal clipping in mixing.
The #2 with soft limiter as a separate component is in future plans too.
There's also third option to scale the pipelines to have some margin above 0 dB level (e.g. 24 dB) and have soft limiter or saturation only as last processing component in the pipelines. Though it works only with fixed point 32 bit and float pipelines where audio quality is not harmed significantly due to the extra room for the signal.
So there's no one in favor of keeping the existing solution so let's remove this right-shift for now, and refine better plans with a soft limiter for later.
On 23.07.2018 16:56, Pierre-Louis Bossart wrote:
On 7/23/18 6:38 AM, Seppo Ingalsuo wrote:
On 17.07.2018 16:02, Pierre-Louis Bossart wrote:
On 7/17/18 4:45 AM, Liam Girdwood wrote:
On Tue, 2018-07-17 at 08:45 +0200, slawomir.blauciak@linux.intel.com wrote:
More specifically: dest[i] = (val[0] >> (num_sources >> 1)); dest[i + 1] = (val[1] >> (num_sources >> 1));
Going back in the history of the file I found it was supposed to be an average. However, it will only be an average if num_sources is power of 2, because otherwise such bitwise operations will yield different results from what I take should be an optimized way for division.
For example, suppose we have 2 cases: • We have 2 sources, thus num_sources >> 1 will equal 1, valid right bitshift, same as division by the number of sources, which is 2 • We have 3 sources, thus num_sources >> 1 will equal 1 once again, no longer valid, because it will yield the same result as division by 2, instead of 3.
There’s one more “problem”, by averaging the samples like that we will end up with quieter output for when we have 1 source completely silent. Is that intentional?
Whether it’s all deliberate or not, or simply whether I misunderstood the code, I hope you can help me clear this up.
The intention is we have a generic optimised version that is easy to vectorize for 2,4,8 channels (on any architecture). We will need to have a separate generic mixer for 1,3,5,6 etc.
We should also have architecture specific mixing too like we have for xtensa HiFi3 volume.
I also don't agree with this right shift. It's well intended and aimed at avoiding saturations as you add more sources into the mix, but I don't think it should be done this way. In general, the only solutions that work well are:
- make sure an attenuation is applied prior to the mix (i.e. don't
compensate for saturations in the mix) 2. implement a soft limiter as part of the mixer
The #1 always win in practice for customer devices due to its simplicity. #2 is typical for pro-audio stuff.
Yep, I proposed earlier this also but it was postponed. We should make sure the mixer uses saturating add operation and set mixer gain to 0 dB for all inputs. It's too tricky for system gains planning to have hidden gain enhancements (and the issue with non 2^N inputs number). We have already #1 enabled if the topology is setup to ensure no signal clipping in mixing.
The #2 with soft limiter as a separate component is in future plans too.
There's also third option to scale the pipelines to have some margin above 0 dB level (e.g. 24 dB) and have soft limiter or saturation only as last processing component in the pipelines. Though it works only with fixed point 32 bit and float pipelines where audio quality is not harmed significantly due to the extra room for the signal.
So there's no one in favor of keeping the existing solution so let's remove this right-shift for now, and refine better plans with a soft limiter for later.
OK, I will make a pull request!
Thanks, Seppo
Sound-open-firmware mailing list Sound-open-firmware@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/sound-open-firmware
participants (4)
-
Liam Girdwood
-
Pierre-Louis Bossart
-
Seppo Ingalsuo
-
slawomir.blauciak@linux.intel.com