Re: [RFC] races in access to pci_devices

Patrick Mochel (mochel@osdl.org)
Thu, 15 Nov 2001 08:35:12 -0800 (PST)


On Thu, 15 Nov 2001, Andrew Morton wrote:

> Jeff Garzik wrote:
> >
> > I haven't looked at it in over a year, but from a quick look, all the
> > list access look like they can be protected by a simple spinlock.
>
> I don't think so? We do things like calling driver probe methods
> in the middle of a driver list walk.

That's true. However, ->probe() and the rest of the struct pci_driver
methods are all called from the PCI layer. It seems that a per-device
rwsem would could protect competing calls from screwing things up (a
->remove() call during a ->probe() call).

> An rwsem _may_ be suitable, but I'm not sure that we don't do
> a nested walk in some circumstances, and AFAIK our rwsems
> still are not safe for the same thread to do a down_read() twice.

If the device was locked by the PCI layer before all calls down to the
driver, and it was documented in BIG bold letters, this _should_ prevent
drivers from doing silly things like locking itself again, right? ;)

> Then there's the bus list, and the order of its lock wrt the device
> list.
>
> One approach would be to use a spinlock and a per-device refcount.
> So something like:

It seems as though having two lists of devices complicates things quite a
bit. I was looking into trying to obviate the need for a global device
list yesterday, but it seems painful at best (and probably worthy of
another thread).

Suppose you didn't have a global device list, but you had a global bus
list (including all subordinate buses).

Could you do:

struct pci_bus * bus;
struct pci_dev * dev;

for_each_bus(bus) {
down_read(&bus->lock);
for_each_child(bus,dev) {
down_read(&dev->lock);
dev->driver->foo();
up_read(&dev->lock);
}
up_read(&bus->lock);
}

[snip]

I started to implement proper locking for the new driver model a couple of
days ago. So, far it only touches the top-level device tree, but attempts
to deal with the same class of problems. (Though, I won't assert that it
is even close to the best solution).

The simple cases I've touched so far are registering and unregistering
devices in the tree. I don't want a device to be unregistered before its
completely done registering, and I don't want two register calls to stomp
on the same parent's list of children.

So, I do something like:

device_reigster(struct device * dev)
{
down_write(&dev->lock);
...
down_write(&parent->lock);
list_add_tail(&dev->node,&parent->devices);
up_write(&parent->lock);
...
up_write(&dev->lock);
}

Is this right, or is there something better?

> I _think_ all this list traversal happens in process context now.
> Not sure about the PCI hotplug driver though.
>
> It's really sticky. Which is why it isn't fixed :(
>
> Sigh. Maybe go for an rwsem in 2.5, backport when it stops
> deadlocking?

I am more than willing to work on this, as I have a strong interest in
getting this right for the global device tree (just point me in the right
direction..)

-pat

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/