I'm trying to address a bug where a thread ends up spinning and consuming an entire CPU. The issue seems to be this code in sound/core/pcm_native.c:

/* Writer in rwsem may block readers even during its waiting in queue,
 * and this may lead to a deadlock when the code path takes read sem
 * twice (e.g. one in snd_pcm_action_nonatomic() and another in
 * snd_pcm_stream_lock()).  As a (suboptimal) workaround, let writer to
 * spin until it gets the lock.
 */
static inline void down_write_nonblock(struct rw_semaphore *lock)
{
	while (!down_write_trylock(lock))
		cond_resched();
}

This workaround was introduced in commit 67ec1072b053c15564e6090ab30127895dc77a89.
We suspect that a normal thread (SCHED_OTHER) holds the read lock while a real-time thread (SCHED_RR or SCHED_FIFO) is trying to take the write lock. If both threads are pinned to the same CPU for some reason, the reader thread never gets scheduled: the real-time writer is always runnable, so cond_resched() just picks it again. The reader therefore never releases the lock, down_write_trylock() keeps failing, and we never make progress.
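
To make the suspected scenario concrete, here is a toy userspace model of it. This is only a sketch and not kernel code: every toy_* name is hypothetical, the "scheduler" is a strict-priority loop on one simulated CPU that alternates only between tasks of equal priority, and it ignores real-world details such as RT throttling. It assumes the reader already holds the lock and needs reader_work ticks of CPU time before releasing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of these exist in the kernel. */
struct toy_task {
	int prio;	/* higher value always wins while runnable */
	bool runnable;
};

struct toy_rwsem {
	int readers;	/* number of current read holders */
	bool writer;	/* true once a writer holds the lock */
};

/* Mimics the semantics of down_write_trylock(): never blocks. */
static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->readers > 0 || sem->writer)
		return false;
	sem->writer = true;
	return true;
}

/* Strict priority; tasks of equal priority simply alternate. */
static struct toy_task *toy_pick_next(struct toy_task *w, struct toy_task *r,
				      const struct toy_task *last)
{
	if (!w->runnable)
		return r->runnable ? r : NULL;
	if (!r->runnable)
		return w;
	if (w->prio != r->prio)
		return w->prio > r->prio ? w : r;
	return last == w ? r : w;
}

/*
 * Returns true if the spinning writer gets the lock within max_ticks.
 * The reader starts out holding the lock and releases it only after it
 * has been scheduled reader_work times.
 */
bool toy_simulate(int writer_prio, int reader_prio, long max_ticks,
		  long reader_work)
{
	struct toy_rwsem sem = { .readers = 1, .writer = false };
	struct toy_task writer = { .prio = writer_prio, .runnable = true };
	struct toy_task reader = { .prio = reader_prio, .runnable = true };
	struct toy_task *last = NULL;
	long t;

	assert(reader_work > 0);

	for (t = 0; t < max_ticks; t++) {
		struct toy_task *cur = toy_pick_next(&writer, &reader, last);

		if (!cur)
			break;
		if (cur == &writer) {
			/* One iteration of the down_write_nonblock() loop. */
			if (toy_down_write_trylock(&sem))
				return true;
			/* cond_resched() here: the writer stays runnable. */
		} else if (--reader_work == 0) {
			/* Reader finished its critical section. */
			sem.readers--;
			reader.runnable = false;
		}
		last = cur;
	}
	return false;
}
```

In this model, toy_simulate(50, 0, 1000, 3) never lets the reader run, so the writer spins for all 1000 ticks without getting the lock, whereas toy_simulate(0, 0, 1000, 3) lets the two tasks alternate and the writer succeeds once the reader has released. That matches what we think we are seeing, but it is only a model of our hypothesis and not evidence for it.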
Does this sound right? What can we do to fix this?
Thanks,
Rob.