Conceptual bug on SoundWire probe/remove?

Greg Kroah-Hartman gregkh at linuxfoundation.org
Fri Mar 25 17:26:17 CET 2022


On Fri, Mar 25, 2022 at 10:51:51AM -0500, Pierre-Louis Bossart wrote:
> 
> 
> On 3/24/22 06:15, Greg Kroah-Hartman wrote:
> > On Wed, Mar 23, 2022 at 02:45:59PM -0500, Pierre-Louis Bossart wrote:
> > > Hi,
> > > I could use feedback/guidance on a possible conceptual bug in the SoundWire
> > > probe and bus handling.
> > > 
> > > When we probe a driver, the code does this:
> > > 
> > > static int sdw_drv_probe(struct device *dev)
> > > {
> > > 	struct sdw_slave *slave = dev_to_sdw_dev(dev);
> > > 	struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
> > > 	const struct sdw_device_id *id;
> > > 	const char *name;
> > > 	int ret;
> > > 
> > > 	/*
> > > 	 * fw description is mandatory to bind
> > > 	 */
> > > 	if (!dev->fwnode)
> > > 		return -ENODEV;
> > > 
> > > 	if (!IS_ENABLED(CONFIG_ACPI) && !dev->of_node)
> > > 		return -ENODEV;
> > > 
> > > 	id = sdw_get_device_id(slave, drv);
> > > 	if (!id)
> > > 		return -ENODEV;
> > > 
> > > 	slave->ops = drv->ops;
> > 
> > That is wrong and should never happen as you lost all reference
> > counting.  Please don't do that.
> 
> ok, so I think we all agree on the issue. It's not new code, it's been there
> since December 2017 and 4.16

It's hard to notice that in code review :(

> > > The last line is the problematic one. If at some point, the user does an
> > > rmmod and unbinds the SoundWire codec driver, the .remove will be called and
> > > the 'drv' will no longer be valid, but we will still have a reference to
> > > drv->ops and use that pointer in the bus code, e.g.
> > > 
> > > 		/* Update the Slave driver */
> > > 		if (slave_notify && slave->ops &&
> > > 		    slave->ops->interrupt_callback) {
> > > 			slave_intr.sdca_cascade = sdca_cascade;
> > > 			slave_intr.control_port = clear;
> > > 			memcpy(slave_intr.port, &port_status,
> > > 			       sizeof(slave_intr.port));
> > > 
> > > 			slave->ops->interrupt_callback(slave,
> > > &slave_intr);
> > > 		}
> > > 
> > > I noodled with a potential fix in
> > > https://github.com/thesofproject/linux/pull/3534/commits/82d64fb0fd39b532263f060a8ec86e47e9ab305b
> > > 
> > > where I force-reset this slave->ops pointer, but it is likely to be very
> > > racy.
> > 
> > Just properly reference count things and this should be ok.  You can
> > NEVER just save off a pointer to a random thing that you do not own
> > without increasing the reference count, otherwise bad things will
> > happen.  It's always been this way.
> 
> but I am not following the reference count recommendation in that specific
> case. If we increase a reference count in the probe, wouldn't that prevent
> the unbind/remove?

Depends on what you are increasing the reference count of :)

bind/unbind has nothing to do with the reference count of objects, it
only prevents the devices/whatever from being removed from the system
(hopefully.)

When doing "give me the driver ops of that driver over there" you have
to be VERY careful.  It's a very uncommon pattern that I can't think of
anyone else doing outside of a bus logic.  I don't recommend it, instead
grab references to the devices themselves and go through the device and
what is bound/unbound to that.  Also keep things in sync when you grab
those other devices, that can also get messy.

How about just not doing this at all?



More information about the Alsa-devel mailing list