Re: [alsa-devel] [PATCH 1/2 v2] ASoC: soc-cache: block based rbtree compression

2 May 2011


      On Mon, May 02, 2011 at 04:47:41PM +0200, Takashi Iwai wrote:
...
At Mon, 2 May 2011 15:42:56 +0100,
Dimitris Papastamos wrote:
...
On Mon, May 02, 2011 at 04:29:01PM +0200, Takashi Iwai wrote:
...
At Mon,  2 May 2011 13:28:28 +0100,
Dimitris Papastamos wrote:
...
This patch prepares the ground for the actual rbtree optimization patch
which will save a pointer to the last accessed register block that was used
in either the read() or write() functions.
Each node manages a block of up to RBTREE_BLOCK_NUM registers.  There can be
no two nodes with overlapping blocks.  Currently there is no check in the code
to scream in case that ever happens.  Each block has a base and top register,
all others lie in between these registers.  Note that variable length blocks
aren't supported.  So if you have an interval [8, 15] and only some of those
registers actually exist on the device, the block will have the non-existent
registers as zero.  There is also no way of reporting that any of those
non-existent registers were accessed/modified.
The larger the block size, the more probable it is that one of the
managed registers is non-zero, and therefore the node will need to be
allocated at initialization time and waste space.
If register N is accessed and it is not part of any of the current
blocks in the rbtree, a new node is created with a base register
which is floor(N / RBTREE_BLOCK_NUM) * RBTREE_BLOCK_NUM and a top
register as base_register + RBTREE_BLOCK_NUM - 1.  All other registers
in the block are initialized as expected and the node is inserted into
the tree.
Signed-off-by: Dimitris Papastamos dp@opensource.wolfsonmicro.com
Looking through the patch, I wonder whether it gives a performance
gain enough for the additional complexity.  Did you measure somehow?
This patch might still provide a performance benefit by itself since by
increasing the block size we decrease the number of nodes in the rbtree
that need to be looked up everytime we touch the cache.
Right.  But does it buy so much, in comparison with the additional
complexity of the code?  That's my primary question.
For some codecs, the register maps group together registers of similar
functionality.  With an optimal block size or near optimal, there will
be only one rbtree look up if we continually touch any of those
registers.  This helps primarily when we sync the register map as well
as during initialization.  I think the added code complexity is worth
it.  I could perhaps send patches incrementally to simplify this
further.
Thanks,
Dimitris