From: Jason Gunthorpe jgg@nvidia.com Sent: Thursday, October 1, 2020 8:04 PM
On Thu, Oct 01, 2020 at 02:14:23PM +0200, Greg KH wrote:
On Thu, Oct 01, 2020 at 08:58:47AM -0300, Jason Gunthorpe wrote:
On Thu, Oct 01, 2020 at 01:05:51PM +0200, Greg KH wrote:
You have to be _VERY_ careful after calling ancillary_device_initialize(), as now you can not just free up the memory if something goes wrong before ancillary_device_add() is called, right?
I've looked at way too many versions of this patch and related. This is the only one so far that I didn't find various bugs on the error cases.
But you haven't seen the callers of this function. Without this documented, you will have problems.
I've seen the Intel irdma, both versions of the SOF stuff and an internal mlx5 patch..
Look at the SOF example, it has perfectly paired error unwinds. Each function has unwind that cleans up exactly what it creates. Every 'free' unwind is paired with an 'alloc' in the same function. Simple. Easy to audit. Easy to correctly enhance down the road.
This is the common kernel goto error design pattern.
Why is this two-step process even needed here?
Initializing the refcount close to the allocation is a common design pattern as is initializing it close to registration. Both options are tricky, both have various common subtle bugs, both have awkward elements.
At the end of the day, after something like 20 iterations, this is the first series that actually doesn't have error unwind bugs.
Can we empower Dave to make this choice? It is not like it is wild or weird, the driver core already exposes _initialize and _add functions that work in exactly the same way.
Why is this two-step process even needed here? Without this documented, you will have problems.
And it is also documented in "Ancillary device" section in the Documentation/driver-api/ancillary_bus.rst Below is the snippet from the patch.
It is likely worth to add this part of the documentation using standard kernel kdoc format where export function definition of initialize() and add() resides, so that it cannot be missed out.
+Registering an ancillary_device is a two-step process. First you must +call ancillary_device_initialize(), which will check several aspects of +the ancillary_device struct and perform a device_initialize(). After +this step completes, any error state must have a call to put_device() ^^^^^^^^^^^^ +in its resolution path. The second step in registering an +ancillary_device is to perform a call to ancillary_device_add(), which +will set the name of the device and add the device to the bus.