On Tue 05-12-23 21:22:59, Yury Norov wrote:
On Mon, Dec 04, 2023 at 07:51:01PM +0100, Jan Kara wrote:
This series is a result of discussion [1]. All find_bit() functions imply exclusive access to the bitmaps. However, KCSAN reports quite a number of warnings related to find_bit() API. Some of them are not pointing to real bugs because in many situations people intentionally allow concurrent bitmap operations.
If so, find_bit() can be annotated such that KCSAN will ignore it:
bit = data_race(find_first_bit(bitmap, nbits));
No, this is not a correct thing to do. If concurrent bitmap changes can happen, find_first_bit() as it is currently implemented isn't ever a safe choice because it can call __ffs(0) which is dangerous as you properly note above. I proposed adding READ_ONCE() into find_first_bit() / find_next_bit() implementation to fix this issue but you disliked that. So other option we have is adding find_first_bit() and find_next_bit() variants that take volatile 'addr' and we have to use these in code like xas_find_chunk() which cannot be converted to your new helpers.
Here is some examples when concurrent operations with plain find_bit() are acceptable:
- two threads running find_*_bit(): safe wrt ffs(0) and returns correct value, because underlying bitmap is unchanged;
- find_next_bit() in parallel with set or clear_bit(), when modifying a bit prior to the start bit to search: safe and correct;
- find_first_bit() in parallel with set_bit(): safe, but may return wrong bit number;
- find_first_zero_bit() in parallel with clear_bit(): same as above.
In last 2 cases find_bit() may not return a correct bit number, but it may be OK if caller requires any (not exactly first) set or clear bit, correspondingly.
In such cases, KCSAN may be safely silenced.
True - but these are special cases. In particular the case in xas_find_chunk() is not any of these special cases. It is using find_next_bit() which is can be racing with clear_bit(). So what are your plans for such usecase?
Honza