
On Tue, 22 Apr 2025 12:29:31 +0200, Hillf Danton wrote:
On Tue, 22 Apr 2025 09:03:20 +0200 Takashi Iwai wrote:
On Tue, 22 Apr 2025 01:38:59 +0200, Hillf Danton wrote:
On Mon, 21 Apr 2025 16:36:30 +0200 Takashi Iwai wrote:
On Mon, 21 Apr 2025 12:43:42 +0200, Hillf Danton wrote:
I misread "Which reads and writes are you trying to solve?", though I showed the read/write; it is a bad case, particularly with UAF.
Could you tell us what will happen if the race is not fixed? Could ep be freed with in-flight urbs for example?
Before the patch, wait_clear_urbs() might return before all pending eps have actually finished, so it can lead to a UAF.
Got it.
Is it still a race if the wait loop in wait_clear_urbs() ends before the urb complete callback completes, given the last sentence in your commit message? If not, please ignore my noise.
Well, about your concern with the missing barrier: that would make wait_clear_urbs() miss the refcount decrement, hence it would rather delay the return. So it shouldn't lead to a further UAF; at most it might lead to an unnecessary delay.
That said, I'm willing to take a fix even for a theoretical issue if it clarifies what it really fixes. But scratching a random surface isn't what we want.
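For illustration, the delayed-return argument above can be modeled in userspace with C11 atomics. This is only a sketch under assumptions: struct ep_model, model_complete(), model_all_done() and model_wait() are hypothetical names and not the snd-usb-audio code, and C11 release/acquire ordering merely stands in for whatever barriers a real patch would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the actual driver structures. */
struct ep_model {
    atomic_int pending;   /* stands in for the count of in-flight URBs */
    int last_status;      /* plain data written by the completion side */
};

/* Completion side: write the data first, then drop the count with
 * release ordering so the data is visible to whoever observes the drop. */
static void model_complete(struct ep_model *ep, int status)
{
    ep->last_status = status;
    int prev = atomic_fetch_sub_explicit(&ep->pending, 1,
                                         memory_order_release);
    assert(prev > 0); /* a completion without a pending URB is a bug */
}

/* Waiter side, one poll: the acquire load pairs with the release above. */
static bool model_all_done(struct ep_model *ep)
{
    return atomic_load_explicit(&ep->pending, memory_order_acquire) == 0;
}

/* Bounded wait loop: returns how many extra polls were needed. */
static int model_wait(struct ep_model *ep, int max_polls)
{
    int polls = 0;

    while (!model_all_done(ep) && polls < max_polls)
        polls++;
    return polls;
}
```

In this model the counter only goes down while the waiter polls, so a stale read can only show a value that is too high. That keeps the waiter polling longer, but it does not end the wait early.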
Thank you for shedding light on the race. Given a) the mb in 26fbe9772b8c ("USB: core: Fix hang in usb_kill_urb by adding memory barriers"), and b) that the urb complete callback is invoked in giveback, see __usb_hcd_giveback_urb(),
use the urb routines instead to close the race.
I'm afraid that it can break things in this form; the stopped stream might be restarted without reinitializing URBs. That is, this is called not only from disconnect or close, but also just for stopping the stream in the middle.
I'd like to test the proposed change locally, so please point me to the test programs that could trigger the breakage.
Well, you need to follow the code logic. The function is called not only from release / disconnect, but also in the code pattern trigger(STOP) -> prepare -> sync_stop, and the URBs aren't released / re-initialized in this case.
For the problem you raised, I suppose it's better to stick with your first approach with the manual barriers. But you'd need to describe exactly what it does and why it's needed, as a proper patch.
thanks,
Takashi