[alsa-devel] [PATCH 1/2] ASoC: dapm: Break out of widget search when source and sink are located
Currently snd_soc_dapm_add_route will continue to search the widget list even after both the source and sink for the route have been located. This patch breaks out of the search when both are located giving a small improvement in probe time for drivers.
Signed-off-by: Charles Keepax ckeepax@opensource.wolfsonmicro.com ---
Note these patches depend on:
commit 92fa12426741d52b39ec92ad77c9843d3fc2b3d6 Author: Lars-Peter Clausen lars@metafoo.de Date: Fri May 1 18:02:42 2015 +0200
ASoC: dapm: Add new widgets to the end of the widget list
Thanks, Charles
sound/soc/soc-dapm.c | 10 ++++++++-- 1 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index a2e5f22..ea3348e 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2617,14 +2617,20 @@ static int snd_soc_dapm_add_route(struct snd_soc_dapm_context *dapm, list_for_each_entry(w, &dapm->card->widgets, list) { if (!wsink && !(strcmp(w->name, sink))) { wtsink = w; - if (w->dapm == dapm) + if (w->dapm == dapm) { wsink = w; + if (wsource) + break; + } continue; } if (!wsource && !(strcmp(w->name, source))) { wtsource = w; - if (w->dapm == dapm) + if (w->dapm == dapm) { wsource = w; + if (wsink) + break; + } } } /* use widget from another DAPM context if not found from this */
Some CODECs have a significant number of DAPM routes and for each route, when it is added to the card, the entire card widget list must be searched. When adding routes it is very likely, however, that adjacent routes will require adjacent widgets. For example all the routes for a mux are likely added in a block and the sink widget will be the same each time and it is also quite likely that the source widgets are sequential located in the widget list.
This patch adds an optional cache argument to snd_soc_dapm_add_route, if given, this argument will hold the source and sink widgets from the last call to snd_soc_dapm_add_route. A small search of the widget list will be made from those points for both the sink and source. Currently this search only checks both the last widget and the one adjacent to it.
On wm8280 which has approximately 500 widgets and 30000 routes (one of the largest CODECs in mainline), the number of paths that hit the cache is 24000, which significantly improves probe time.
Signed-off-by: Charles Keepax ckeepax@opensource.wolfsonmicro.com --- sound/soc/soc-dapm.c | 40 ++++++++++++++++++++++++++++++++++++++-- 1 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index ea3348e..95d3ea5 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2585,8 +2585,26 @@ err: return ret; }
+static struct snd_soc_dapm_widget * +dapm_check_path_cache(const char *name, struct snd_soc_dapm_widget *w, int n) +{ + int i; + + if (w) { + for (i = 0; i < n; i++) { + if (!strcmp(name, w->name)) + return w; + + w = list_next_entry(w, list); + } + } + + return NULL; +} + static int snd_soc_dapm_add_route(struct snd_soc_dapm_context *dapm, - const struct snd_soc_dapm_route *route) + const struct snd_soc_dapm_route *route, + struct snd_soc_dapm_path *cache) { struct snd_soc_dapm_widget *wsource = NULL, *wsink = NULL, *w; struct snd_soc_dapm_widget *wtsource = NULL, *wtsink = NULL; @@ -2610,6 +2628,14 @@ static int snd_soc_dapm_add_route(struct snd_soc_dapm_context *dapm, source = route->source; }
+ if (cache) { + wsink = dapm_check_path_cache(sink, cache->sink, 2); + wsource = dapm_check_path_cache(source, cache->source, 2); + + if (wsink && wsource) + goto skip_search; + } + /* * find src and dest widgets over all widgets but favor a widget from * current DAPM context @@ -2650,6 +2676,12 @@ static int snd_soc_dapm_add_route(struct snd_soc_dapm_context *dapm, return -ENODEV; }
+skip_search: + if (cache) { + cache->sink = wsink; + cache->source = wsource; + } + ret = snd_soc_dapm_add_path(dapm, wsource, wsink, route->control, route->connected); if (ret) @@ -2741,10 +2773,14 @@ int snd_soc_dapm_add_routes(struct snd_soc_dapm_context *dapm, const struct snd_soc_dapm_route *route, int num) { int i, r, ret = 0; + struct snd_soc_dapm_path cache = { + .source = NULL, + .sink = NULL, + };
mutex_lock_nested(&dapm->card->dapm_mutex, SND_SOC_DAPM_CLASS_INIT); for (i = 0; i < num; i++) { - r = snd_soc_dapm_add_route(dapm, route); + r = snd_soc_dapm_add_route(dapm, route, &cache); if (r < 0) { dev_err(dapm->dev, "ASoC: Failed to add route %s -> %s -> %s\n", route->source,
On 05/07/2015 12:33 PM, Charles Keepax wrote:
Some CODECs have a significant number of DAPM routes and for each route, when it is added to the card, the entire card widget list must be searched. When adding routes it is very likely, however, that adjacent routes will require adjacent widgets. For example all the routes for a mux are likely added in a block and the sink widget will be the same each time and it is also quite likely that the source widgets are sequential located in the widget list.
This patch adds an optional cache argument to snd_soc_dapm_add_route, if given, this argument will hold the source and sink widgets from the last call to snd_soc_dapm_add_route. A small search of the widget list will be made from those points for both the sink and source. Currently this search only checks both the last widget and the one adjacent to it.
On wm8280 which has approximately 500 widgets and 30000 routes (one of the largest CODECs in mainline), the number of paths that hit the cache is 24000, which significantly improves probe time.
That's crazy! ;)
I wonder if it makes sense to come up with a more machine readable blob format that can be pre-compiled where we don't have to search the widget list each time a route is added. Instead of having it reference the widgets by name use a index that points into an array of widgets. That makes the lookup time O(1).
Signed-off-by: Charles Keepax ckeepax@opensource.wolfsonmicro.com
sound/soc/soc-dapm.c | 40 ++++++++++++++++++++++++++++++++++++++-- 1 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index ea3348e..95d3ea5 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2585,8 +2585,26 @@ err: return ret; }
+static struct snd_soc_dapm_widget * +dapm_check_path_cache(const char *name, struct snd_soc_dapm_widget *w, int n) +{
- int i;
- if (w) {
for (i = 0; i < n; i++) {
if (!strcmp(name, w->name))
return w;
w = list_next_entry(w, list);
This will return garbage if w is the last widget in the list.
Otherwise the patch looks sensible.
On Thu, May 07, 2015 at 01:22:07PM +0200, Lars-Peter Clausen wrote:
On 05/07/2015 12:33 PM, Charles Keepax wrote:
Some CODECs have a significant number of DAPM routes and for each route, when it is added to the card, the entire card widget list must be searched. When adding routes it is very likely, however, that adjacent routes will require adjacent widgets. For example all the routes for a mux are likely added in a block and the sink widget will be the same each time and it is also quite likely that the source widgets are sequential located in the widget list.
This patch adds an optional cache argument to snd_soc_dapm_add_route, if given, this argument will hold the source and sink widgets from the last call to snd_soc_dapm_add_route. A small search of the widget list will be made from those points for both the sink and source. Currently this search only checks both the last widget and the one adjacent to it.
On wm8280 which has approximately 500 widgets and 30000 routes (one of the largest CODECs in mainline), the number of paths that hit the cache is 24000, which significantly improves probe time.
That's crazy! ;)
I wonder if it makes sense to come up with a more machine readable blob format that can be pre-compiled where we don't have to search the widget list each time a route is added. Instead of having it reference the widgets by name use a index that points into an array of widgets. That makes the lookup time O(1).
I don't think its necessary just yet we have one more slightly larger CODEC (700 widgets, 46000 routes), which we haven't managed to get around to upstreaming yet. But it looks like the size is levelling off about there for now and this solution is probably good enough for that size.
Signed-off-by: Charles Keepax ckeepax@opensource.wolfsonmicro.com
sound/soc/soc-dapm.c | 40 ++++++++++++++++++++++++++++++++++++++-- 1 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index ea3348e..95d3ea5 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2585,8 +2585,26 @@ err: return ret; }
+static struct snd_soc_dapm_widget * +dapm_check_path_cache(const char *name, struct snd_soc_dapm_widget *w, int n) +{
- int i;
- if (w) {
for (i = 0; i < n; i++) {
if (!strcmp(name, w->name))
return w;
w = list_next_entry(w, list);
This will return garbage if w is the last widget in the list.
Otherwise the patch looks sensible.
Oops, thanks I will fix that up.
Thanks, Charles
On 07-05-15 13:22, Lars-Peter Clausen wrote:
On 05/07/2015 12:33 PM, Charles Keepax wrote:
Some CODECs have a significant number of DAPM routes and for each route, when it is added to the card, the entire card widget list must be searched. When adding routes it is very likely, however, that adjacent routes will require adjacent widgets. For example all the routes for a mux are likely added in a block and the sink widget will be the same each time and it is also quite likely that the source widgets are sequential located in the widget list.
This patch adds an optional cache argument to snd_soc_dapm_add_route, if given, this argument will hold the source and sink widgets from the last call to snd_soc_dapm_add_route. A small search of the widget list will be made from those points for both the sink and source. Currently this search only checks both the last widget and the one adjacent to it.
On wm8280 which has approximately 500 widgets and 30000 routes (one of the largest CODECs in mainline), the number of paths that hit the cache is 24000, which significantly improves probe time.
That's crazy! ;)
I wonder if it makes sense to come up with a more machine readable blob format that can be pre-compiled where we don't have to search the widget list each time a route is added. Instead of having it reference the widgets by name use a index that points into an array of widgets. That makes the lookup time O(1).
Storing the names in a btree or similar structure should be a nice balance. That would make name lookups O(log(n)), meaning that it would find a name in 30000 items in about 5 steps.
A hashtable or precompiled "id" would only be viable if the whole table can be calculated at compile time. With many devices being hot-pluggable and creating multiple instances at runtime, I don't really think that'd be feasible - if it were, you might as well let the linker fill in the right pointer and skip the whole searching-on-name thing.
If new entries are being added in ascending order, inserting into the btree is a fast operation as well.
Drawback is that you'll need to store the btree graph in memory so it'll consume a bit more (something like two extra pointers per entry).
...
Kind regards,
Mike Looijmans System Expert
TOPIC Embedded Products Eindhovenseweg 32-C, NL-5683 KH Best Postbus 440, NL-5680 AK Best Telefoon: +31 (0) 499 33 69 79 Telefax: +31 (0) 499 33 69 70 E-mail: mike.looijmans@topicproducts.com Website: www.topicproducts.com
Please consider the environment before printing this e-mail
On Thu, May 07, 2015 at 01:22:07PM +0200, Lars-Peter Clausen wrote:
That's crazy! ;)
I wonder if it makes sense to come up with a more machine readable blob format that can be pre-compiled where we don't have to search the widget list each time a route is added. Instead of having it reference the widgets by name use a index that points into an array of widgets. That makes the lookup time O(1).
I'm a bit reluctant to complicate the build process, but like I said inn my review there's definitely things we can do at runtime if people want to look at this - build some data structure that indexes the names in a way that allows us to look them up more quickly. In general I'd expect it to be negligable cost for small devices and a win for larger ones.
On Thu, May 07, 2015 at 11:33:59AM +0100, Charles Keepax wrote:
Some CODECs have a significant number of DAPM routes and for each route, when it is added to the card, the entire card widget list must be
The idea here is good but I'm having a hard time loving the implementation. It feels like it's hacking this specific implementation in many different places without really abstracting things - having a neat trick hidden in one place would be one thinng but it's spread through several different places.
+static struct snd_soc_dapm_widget * +dapm_check_path_cache(const char *name, struct snd_soc_dapm_widget *w, int n) +{
- int i;
- if (w) {
for (i = 0; i < n; i++) {
if (!strcmp(name, w->name))
return w;
w = list_next_entry(w, list);
}
This will not be happy if there are fewer than n entries to look at.
static int snd_soc_dapm_add_route(struct snd_soc_dapm_context *dapm,
const struct snd_soc_dapm_route *route)
const struct snd_soc_dapm_route *route,
struct snd_soc_dapm_path *cache)
The fact that we're directly using the path structure as the type for the cache here is a warning flag, it means that everywhere that knows there is a cache at all needs to know about the specific type of the cache. If we were to improve the cache in the future, for example doing something like building a hash that maps names to widgets for the duration of probe, then everywhere would need to change.
- if (cache) {
wsink = dapm_check_path_cache(sink, cache->sink, 2);
wsource = dapm_check_path_cache(source, cache->source, 2);
if (wsink && wsource)
goto skip_search;
- }
This is the only place where we have _check_path_cache() calls so n is always 2 here and it is unclear where that number comes from or what it means without going and peering at the implementation and the commit log. It seems it would be better to hide that number inside the function.
+skip_search:
- if (cache) {
cache->sink = wsink;
cache->source = wsource;
- }
Putting this into a store hit function would be good.
mutex_lock_nested(&dapm->card->dapm_mutex, SND_SOC_DAPM_CLASS_INIT); for (i = 0; i < num; i++) {
r = snd_soc_dapm_add_route(dapm, route);
r = snd_soc_dapm_add_route(dapm, route, &cache);
Should we just have the cache in the dapm context or the card instead of locally? It's only two pointers at the minute so there doesn't seem to be much cost from keeping it around and it might generate somme more hits for some use cases. Or to put it another way why is the cache optional and created and re-created here like this?
On Thu, May 07, 2015 at 01:48:11PM +0100, Mark Brown wrote:
On Thu, May 07, 2015 at 11:33:59AM +0100, Charles Keepax wrote:
Some CODECs have a significant number of DAPM routes and for each route, when it is added to the card, the entire card widget list must be
The idea here is good but I'm having a hard time loving the implementation. It feels like it's hacking this specific implementation in many different places without really abstracting things - having a neat trick hidden in one place would be one thinng but it's spread through several different places.
+static struct snd_soc_dapm_widget * +dapm_check_path_cache(const char *name, struct snd_soc_dapm_widget *w, int n) +{
- int i;
- if (w) {
for (i = 0; i < n; i++) {
if (!strcmp(name, w->name))
return w;
w = list_next_entry(w, list);
}
This will not be happy if there are fewer than n entries to look at.
Yeah will fix that.
static int snd_soc_dapm_add_route(struct snd_soc_dapm_context *dapm,
const struct snd_soc_dapm_route *route)
const struct snd_soc_dapm_route *route,
struct snd_soc_dapm_path *cache)
The fact that we're directly using the path structure as the type for the cache here is a warning flag, it means that everywhere that knows there is a cache at all needs to know about the specific type of the cache. If we were to improve the cache in the future, for example doing something like building a hash that maps names to widgets for the duration of probe, then everywhere would need to change.
- if (cache) {
wsink = dapm_check_path_cache(sink, cache->sink, 2);
wsource = dapm_check_path_cache(source, cache->source, 2);
if (wsink && wsource)
goto skip_search;
- }
This is the only place where we have _check_path_cache() calls so n is always 2 here and it is unclear where that number comes from or what it means without going and peering at the implementation and the commit log. It seems it would be better to hide that number inside the function.
I was sort of thinking along the lines of if anyone wanted to add CODEC specific levels of searching the cache or some such. But I have no problem moving this into the function, probably was a bit premature generalisation there.
+skip_search:
- if (cache) {
cache->sink = wsink;
cache->source = wsource;
- }
Putting this into a store hit function would be good.
No problems.
mutex_lock_nested(&dapm->card->dapm_mutex, SND_SOC_DAPM_CLASS_INIT); for (i = 0; i < num; i++) {
r = snd_soc_dapm_add_route(dapm, route);
r = snd_soc_dapm_add_route(dapm, route, &cache);
Should we just have the cache in the dapm context or the card instead of locally? It's only two pointers at the minute so there doesn't seem to be much cost from keeping it around and it might generate somme more hits for some use cases. Or to put it another way why is the cache optional and created and re-created here like this?
:-) The original version I did put the pointers in the DAPM context, but I somehow managed to talk myself out of it as I was nervous of bloating the DAPM context and it felt like this only applied to adding routes. Adding it to the card feels kinda reasonable as that is where the lists we are caching reside.
I will have a think over and push a new version soon.
Thanks, Charles
On Thu, May 07, 2015 at 02:52:22PM +0100, Charles Keepax wrote:
This is the only place where we have _check_path_cache() calls so n is always 2 here and it is unclear where that number comes from or what it means without going and peering at the implementation and the commit log. It seems it would be better to hide that number inside the function.
I was sort of thinking along the lines of if anyone wanted to add CODEC specific levels of searching the cache or some such. But I have no problem moving this into the function, probably was a bit premature generalisation there.
I think there's probably a long way to go with making the generic cache more performant before we need to start adding CODEC specific code.
:-) The original version I did put the pointers in the DAPM context, but I somehow managed to talk myself out of it as I was nervous of bloating the DAPM context and it felt like this only applied to adding routes. Adding it to the card feels kinda reasonable as that is where the lists we are caching reside.
If it's just one or two pointers it's not that big a deal in the DAPM context, we don't have that many of those per card or per system - if it was going into the widgets that'd hurt a lot more.
On 05/07/2015 02:48 PM, Mark Brown wrote: [...]
+skip_search:
- if (cache) {
cache->sink = wsink;
cache->source = wsource;
- }
Putting this into a store hit function would be good.
mutex_lock_nested(&dapm->card->dapm_mutex, SND_SOC_DAPM_CLASS_INIT); for (i = 0; i < num; i++) {
r = snd_soc_dapm_add_route(dapm, route);
r = snd_soc_dapm_add_route(dapm, route, &cache);
Should we just have the cache in the dapm context or the card instead of locally? It's only two pointers at the minute so there doesn't seem to be much cost from keeping it around and it might generate somme more hits for some use cases. Or to put it another way why is the cache optional and created and re-created here like this?
I don't think the cache will produce hits anywhere else other then adding routes from the same route table. That's the only place were the same widget or widgets close to each other are accessed sequentially. I'd prefer to keep it locally.
On Thu, May 07, 2015 at 04:53:34PM +0200, Lars-Peter Clausen wrote:
On 05/07/2015 02:48 PM, Mark Brown wrote:
Should we just have the cache in the dapm context or the card instead of locally? It's only two pointers at the minute so there doesn't seem to be much cost from keeping it around and it might generate somme more hits for some use cases. Or to put it another way why is the cache optional and created and re-created here like this?
I don't think the cache will produce hits anywhere else other then adding routes from the same route table. That's the only place were the same widget or widgets close to each other are accessed sequentially. I'd prefer to keep it locally.
It's probably the overwhelmingly common case with the current implementation at least, we might get some extras from things like mic biases on boards and CODECs that handle variants by using multiple routing tables but they'll be vastly less common than things from the same table. My thinking here is more that improving the encapsulation makes it easier to do a better job later for no meaningful cost now.
On 05/07/2015 12:33 PM, Charles Keepax wrote:
Currently snd_soc_dapm_add_route will continue to search the widget list even after both the source and sink for the route have been located. This patch breaks out of the search when both are located giving a small improvement in probe time for drivers.
Signed-off-by: Charles Keepax ckeepax@opensource.wolfsonmicro.com
Reviewed-by: Lars-Peter Clausen lars@metafoo.de
On Thu, May 07, 2015 at 11:33:58AM +0100, Charles Keepax wrote:
Currently snd_soc_dapm_add_route will continue to search the widget list even after both the source and sink for the route have been located. This patch breaks out of the search when both are located giving a small improvement in probe time for drivers.
Applied, thanks.
participants (4)
-
Charles Keepax
-
Lars-Peter Clausen
-
Mark Brown
-
Mike Looijmans