Re: [alsa-devel] a7839e96 (PNP: increase max resources) breaks my ALSA intel8x0 sound card
Avuton Olrich wrote:
With v2.6.24 my second ALSA sound device stopped working.
After bisection it says this was the offending commit.
a7839e960675b549f06209d18283d5cee2ce9261 is first bad commit commit a7839e960675b549f06209d18283d5cee2ce9261 Author: Zhao Yakui yakui.zhao@intel.com Date: Wed Nov 28 16:21:21 2007 -0800
PNP: increase the maximum number of resources On some systems the number of resources(IO,MEM) returnedy by PNP device is greater than the PNP constant, for example motherboard devices. It brings that some resources can't be reserved and resource confilicts. This will cause PCI resources are assigned wrongly in some systems, and cause hang. This is a regression since we deleted ACPI motherboard driver and use PNP system driver. [akpm@linux-foundation.org: fix text and coding-style a bit] Signed-off-by: Li Shaohua <shaohua.li@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Thomas Renninger <trenn@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The audio device is 00:1b.0 (see my lspci -vvv output), the other audio device works fine.
http://avuton.googlepages.com/v2.6-before (dmesg revision before) http://avuton.googlepages.com/v2.6-after (dmesg broken revision) http://avuton.googlepages.com/lspci-vvv http://avuton.googlepages.com/config (from the broken revision) http://avuton.googlepages.com/iomem http://avuton.googlepages.com/ioports
Here's why the driver fails to load:
[ 31.133060] ACPI: PCI Interrupt 0000:00:1b.0[A] -> GSI 22 (level, low) -> IRQ 22 [ 31.133141] PCI: Unable to reserve mem region #1:4000@febf8000 for device 0000:00:1b.0 [ 31.133197] ACPI: PCI interrupt for device 0000:00:1b.0 disabled [ 31.133244] HDA Intel: probe of 0000:00:1b.0 failed with error -16
The iomem location of the HDA controller conflicts with this reservation by the BIOS:
[ 22.906654] system 00:08: iomem range 0xfebfa000-0xfebfac00 has been reserved
There was a patch floating around to ignore PnPACPI reservations which conflict with PCI BARs, which appears to be what's happening in this case. That patch originally worked for any board, but was later made specific to a certain Supermicro motherboard which had the sata_nv controller MMIO regions marked as reserved, preventing the driver from loading. We may need a more general solution. See:
On Sun, 27 Jan 2008 12:17:17 -0600 Robert Hancock hancockr@shaw.ca wrote:
Avuton Olrich wrote:
With v2.6.24 my second ALSA sound device stopped working.
After bisection it says this was the offending commit.
a7839e960675b549f06209d18283d5cee2ce9261 is first bad commit commit a7839e960675b549f06209d18283d5cee2ce9261 Author: Zhao Yakui yakui.zhao@intel.com Date: Wed Nov 28 16:21:21 2007 -0800
PNP: increase the maximum number of resources On some systems the number of resources(IO,MEM) returnedy by PNP device is greater than the PNP constant, for example motherboard devices. It brings that some resources can't be reserved and resource confilicts. This will cause PCI resources are assigned wrongly in some systems, and cause hang. This is a regression since we deleted ACPI motherboard driver and use PNP system driver. [akpm@linux-foundation.org: fix text and coding-style a bit] Signed-off-by: Li Shaohua <shaohua.li@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Thomas Renninger <trenn@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The audio device is 00:1b.0 (see my lspci -vvv output), the other audio device works fine.
http://avuton.googlepages.com/v2.6-before (dmesg revision before) http://avuton.googlepages.com/v2.6-after (dmesg broken revision) http://avuton.googlepages.com/lspci-vvv http://avuton.googlepages.com/config (from the broken revision) http://avuton.googlepages.com/iomem http://avuton.googlepages.com/ioports
I don't think anything has happened yet on this?
Here's why the driver fails to load:
[ 31.133060] ACPI: PCI Interrupt 0000:00:1b.0[A] -> GSI 22 (level, low) -> IRQ 22 [ 31.133141] PCI: Unable to reserve mem region #1:4000@febf8000 for device 0000:00:1b.0 [ 31.133197] ACPI: PCI interrupt for device 0000:00:1b.0 disabled [ 31.133244] HDA Intel: probe of 0000:00:1b.0 failed with error -16
The iomem location of the HDA controller conflicts with this reservation by the BIOS:
[ 22.906654] system 00:08: iomem range 0xfebfa000-0xfebfac00 has been reserved
There was a patch floating around to ignore PnPACPI reservations which conflict with PCI BARs, which appears to be what's happening in this case. That patch originally worked for any board, but was later made specific to a certain Supermicro motherboard which had the sata_nv controller MMIO regions marked as reserved, preventing the driver from loading. We may need a more general solution. See:
Thanks. If we were to remove the supermicro-specificity, would this be a sufficiently general solution?
Andrew Morton wrote:
There was a patch floating around to ignore PnPACPI reservations which conflict with PCI BARs, which appears to be what's happening in this case. That patch originally worked for any board, but was later made specific to a certain Supermicro motherboard which had the sata_nv controller MMIO regions marked as reserved, preventing the driver from loading. We may need a more general solution. See:
Thanks. If we were to remove the supermicro-specificity, would this be a sufficiently general solution?
I think so. There was one objection that it introduced a dependency on pnpacpi loading after PCI bus enumeration, though.
Linus also suggested that pnpacpi could be marking the resources as "present but unused" so that drivers can request those regions but we still prevent dynamically assigning resources into them.
On Thu, 31 Jan 2008, Robert Hancock wrote:
I think so. There was one objection that it introduced a dependency on pnpacpi loading after PCI bus enumeration, though.
Linus also suggested that pnpacpi could be marking the resources as "present but unused" so that drivers can request those regions but we still prevent dynamically assigning resources into them.
I _think_ that's what ACPI used to do before switching over to the PnPACPI thing, so I do think that "present but not reserved" approach is not just the right one, but also the (historically) tested one.
Linus
On Thursday 31 January 2008 05:50:13 pm Linus Torvalds wrote:
On Thu, 31 Jan 2008, Robert Hancock wrote:
I think so. There was one objection that it introduced a dependency on pnpacpi loading after PCI bus enumeration, though.
Linus also suggested that pnpacpi could be marking the resources as "present but unused" so that drivers can request those regions but we still prevent dynamically assigning resources into them.
I _think_ that's what ACPI used to do before switching over to the PnPACPI thing, so I do think that "present but not reserved" approach is not just the right one, but also the (historically) tested one.
The reservation happens in drivers/pnp/system.c, and it does mark the region as "not busy."
I think the problem here is that the PCI BAR is bigger and spans the region reported by ACPI:
[ 22.906654] system 00:08: iomem range 0xfebfa000-0xfebfac00 has been reserved [ 31.133141] PCI: Unable to reserve mem region #1:4000@febf8000 for device 0000:00:1b.0
We can easily add more BIOSes to the PNP quirk.
I really don't want to use the earlier quirk that scanned PCI devices from a PNP quirk. I think that's just wrong because PNP (which conceptually includes ACPI) is what tells us about PCI root bridges.
Bjorn
On Mon, 4 Feb 2008, Bjorn Helgaas wrote:
I think the problem here is that the PCI BAR is bigger and spans the region reported by ACPI:
Ok, then it doesn't help that it's not busy.
In that case, the only real fix is to simply do the ACPI reservations *after* PCI probing. Which is what it should have done to begin with.
We can easily add more BIOSes to the PNP quirk.
No. Don't add quirks just because the basic ordering is shit.
I really don't want to use the earlier quirk that scanned PCI devices from a PNP quirk. I think that's just wrong because PNP (which conceptually includes ACPI) is what tells us about PCI root bridges.
So? Do the bridge listing separately from resource marking. Why tie the two together? They have absolutely *nothing* to do with each other.
The fact is, scanning devices should happen first. And AFTER the device tree is scanned, we can then safely add all the special resources that don't show up as normal devices.
Linus
On Monday 04 February 2008 11:18:09 am Linus Torvalds wrote:
On Mon, 4 Feb 2008, Bjorn Helgaas wrote:
I think the problem here is that the PCI BAR is bigger and spans the region reported by ACPI:
Ok, then it doesn't help that it's not busy.
In that case, the only real fix is to simply do the ACPI reservations *after* PCI probing. Which is what it should have done to begin with.
I'm sure you're right, but I don't understand why yet. Here's what I think is happening; please correct me where I'm going wrong:
1) enumerate PNP & ACPI devices 2) initialize PNP & ACPI drivers 2a) register ACPI PCI root bridge driver, which enumerates PCI devices behind the bridge 2b) register PNP system driver and reserve resources (this is where the current quirk skips some reservations) 3) initialize PCI drivers 3a) register intel8x0 sound driver and reserve conflicting resources
I really don't want to use the earlier quirk that scanned PCI devices from a PNP quirk. I think that's just wrong because PNP (which conceptually includes ACPI) is what tells us about PCI root bridges.
So? Do the bridge listing separately from resource marking. Why tie the two together? They have absolutely *nothing* to do with each other.
The fact is, scanning devices should happen first. And AFTER the device tree is scanned, we can then safely add all the special resources that don't show up as normal devices.
I think you're suggesting that we should do 2a first, to enumerate all PCI devices, and only later do 2b. But I don't know how to accomplish that cleanly.
It does happen in that order today, but only because the ACPI drivers are registered before the PNP drivers. I think that's an artificial distinction, so I don't want to rely on it. If the PCI bridge driver became a PNP driver, we could use link ordering to make sure it still came first, although that seems a little fragile to me.
Bjorn
On Mon, 4 Feb 2008, Bjorn Helgaas wrote:
I'm sure you're right, but I don't understand why yet. Here's what I think is happening; please correct me where I'm going wrong:
- enumerate PNP & ACPI devices
- initialize PNP & ACPI drivers 2a) register ACPI PCI root bridge driver, which enumerates PCI devices behind the bridge 2b) register PNP system driver and reserve resources (this is where the current quirk skips some reservations)
- initialize PCI drivers 3a) register intel8x0 sound driver and reserve conflicting resources
So where in this would you put the
pcibios_init() -> pcibios_resource_survey()
call (it's a subsys_initcall)?
THAT is the thing that actually registers the PCI resurces we've found into the resource tree!
It's very inconveniently placed as-is, since it literally depends on the whole initcall ordering (and the link order within that subsys_initcall thing), and all of this is architecture-driven rather than driven from some central place.
So this is the thing that I think should happen before any PnP or ACPI drivers actually start registerign themselves (but obviously needs to happen after the PCI buses have been enumerated).
The ACPI/PnP tables shouldn't be able to break the enumeration of the actual hardware devices, now should it?
I think you're suggesting that we should do 2a first, to enumerate all PCI devices, and only later do 2b. But I don't know how to accomplish that cleanly.
We should enumerate the PCI devices, then register their resources (and no, I'm not at *all* convinced it should happen as a separate subsys_initcall), and then register the PnP resources.
So I think we should have roughly something like:
- arch_initcall: this could enumerate the ACPI/PnP devices (but not register anything). Alternatively, do it as subsys_initcall, and just make sure it happens early with link-order.
- subsys_initcall: this should do that pcibios_init() thing that surveys the resources (and the PCI enumeration needs to have happened before, probably in the same initcall thanks to link order)
- PnP/ACPI resource allocation *after* it, but before driver loading (which wll cause new resources to be allocated). This could be fs_initcall, or whatever (that's what things like "acpi_event_init" already do).
- regular drivers will come along much later, as part of driver_initcall, and by the time this happens, we've now reserved all resources we know about.
Basically, we just want to register the most trust-worthy resources before we register anything less trust-worthy. And actual device probing simply tends to be more trust-worthy than any randomly broken ACPI/PnP tables.
Linus
On Monday 04 February 2008 02:16:52 pm Linus Torvalds wrote:
On Mon, 4 Feb 2008, Bjorn Helgaas wrote:
So where in this would you put the
pcibios_init() -> pcibios_resource_survey()
call (it's a subsys_initcall)?
THAT is the thing that actually registers the PCI resurces we've found into the resource tree!
I think pcibios_init() currently happens after we register ACPI & PNP drivers, i.e., at 3 below:
1) enumerate PNP & ACPI devices 2) initialize PNP & ACPI drivers 2a) register ACPI PCI root bridge driver, which enumerates PCI devices behind the bridge 2b) register PNP system driver and reserve resources (this is where the current quirk skips some reservations) 3) pcibios_init() -> pcibios_resource_survey() 4) initialize PCI drivers 4a) register intel8x0 sound driver and reserve resources (conflict happens here)
The ACPI/PnP tables shouldn't be able to break the enumeration of the actual hardware devices, now should it?
Well, no :-) We have to make PNP & ACPI smart enough to not cause trouble, and I fully accept that the burden is on PNP.
But PNPBIOS and ACPI by definition are for devices that don't have their own enumeration protocol. Obviously, we have a lot of legacy drivers that blindly probe for devices at magic addresses, but that's validating guesswork, not actually enumerating anything.
In this particular case, we can easily enumerate all the PCI devices in domain 0. But for machines that have multiple PCI domains, I don't think we want to exhaustively scan all possible domains. ACPI tells us what root bridges exist and what domain each is in, so we can scan a little more efficiently.
We should enumerate the PCI devices, then register their resources (and no, I'm not at *all* convinced it should happen as a separate subsys_initcall), and then register the PnP resources.
So I think we should have roughly something like:
- arch_initcall: this could enumerate the ACPI/PnP devices (but not register anything). Alternatively, do it as subsys_initcall, and just make sure it happens early with link-order.
Scanning PCI buses has to happen here, which currently means that we have to register the ACPI PCI root driver so we know which domains and buses to scan.
subsys_initcall: this should do that pcibios_init() thing that surveys the resources (and the PCI enumeration needs to have happened before, probably in the same initcall thanks to link order)
PnP/ACPI resource allocation *after* it, but before driver loading (which wll cause new resources to be allocated). This could be fs_initcall, or whatever (that's what things like "acpi_event_init" already do).
If we put the PNP system driver here, we can easily do a quirk that ignores PNP resources that overlap PCI resources. But it's kind of ugly to have the ACPI PCI root driver early and other PNP drivers later because they're basically similar animals.
- regular drivers will come along much later, as part of driver_initcall, and by the time this happens, we've now reserved all resources we know about.
Basically, we just want to register the most trust-worthy resources before we register anything less trust-worthy. And actual device probing simply tends to be more trust-worthy than any randomly broken ACPI/PnP tables.
I agree that PCI BARs are likely more trustworthy than firmware tables. Maybe we could figure out a way to do the PNP reservations, then revert them if we find an overlapping PCI BAR.
Does anybody with this motherboard (or the Supermicro board with similar SATA problems) also have Windows on it? I'm curious to see how Windows deals with this conflict, e.g., what shows up in the device manager?
Bjorn
On Feb 4, 2008 11:03 PM, Bjorn Helgaas bjorn.helgaas@hp.com wrote:
Does anybody with this motherboard (or the Supermicro board with similar SATA problems) also have Windows on it? I'm curious to see how Windows deals with this conflict, e.g., what shows up in the device manager?
Sorry, I don't own Windows.
On Tue, 5 Feb 2008, Bjorn Helgaas wrote:
- PnP/ACPI resource allocation *after* it, but before driver loading (which wll cause new resources to be allocated). This could be fs_initcall, or whatever (that's what things like "acpi_event_init" already do).
If we put the PNP system driver here, we can easily do a quirk that ignores PNP resources that overlap PCI resources.
No, you don't need any quirks: you just do an "insert_resource()" and ignore the error return. If the (bogus) PnP resource clashes with the (correct) hardware PCI resource, the insert will simply fail. No quirks needed.
But it's kind of
ugly to have the ACPI PCI root driver early and other PNP drivers later because they're basically similar animals.
No they are not.
If one does just device enumeration, and the other does resource registrations, then they ARE NOT similar animals at all. Don't claim that they are.
Linus
On Tuesday 05 February 2008 11:15:12 am Linus Torvalds wrote:
On Tue, 5 Feb 2008, Bjorn Helgaas wrote:
- PnP/ACPI resource allocation *after* it, but before driver loading (which wll cause new resources to be allocated). This could be fs_initcall, or whatever (that's what things like "acpi_event_init" already do).
If we put the PNP system driver here, we can easily do a quirk that ignores PNP resources that overlap PCI resources.
No, you don't need any quirks: you just do an "insert_resource()" and ignore the error return. If the (bogus) PnP resource clashes with the (correct) hardware PCI resource, the insert will simply fail. No quirks needed.
But it's kind of
ugly to have the ACPI PCI root driver early and other PNP drivers later because they're basically similar animals.
No they are not.
If one does just device enumeration, and the other does resource registrations, then they ARE NOT similar animals at all. Don't claim that they are.
Whoa, easy :-) I just meant they're similar in that we discover PCI root bridges and other PNP devices by traversing the ACPI namespace, so unless we make special arrangements, we bind drivers to them at roughly the same time.
I'll play with your insert_resource() idea and see if I can figure something out.
Bjorn
On Feb 5, 2008 12:12 PM, Bjorn Helgaas bjorn.helgaas@hp.com wrote:
I'll play with your insert_resource() idea and see if I can figure something out.
Any word on this yet?
Thanks!
On Tuesday 05 February 2008 01:12:46 pm Bjorn Helgaas wrote:
On Tuesday 05 February 2008 11:15:12 am Linus Torvalds wrote:
No, you don't need any quirks: you just do an "insert_resource()" and ignore the error return. If the (bogus) PnP resource clashes with the (correct) hardware PCI resource, the insert will simply fail. No quirks needed.
I'll play with your insert_resource() idea and see if I can figure something out.
Sorry for the delay. I did work on this, but I don't see how this can work. pcibios_init() marks its reservations as not busy, so the subsequent PNP request doesn't fail, even if it clashes.
The PNP system driver is an fs_initcall(), so it already happens after pcibios_init():
1) register ACPI PCI root bridge driver, which enumerates PCI devices behind the bridge 2) pcibios_init() -> pcibios_resource_survey() -> request_resource() 3) register PNP system driver -> request_region() 4) register intel8x0 sound driver and reserve resources (conflict happens here)
We have reservations in this order:
febf8000-febfbfff : 0000:00:1b.0 -- from pcibios_resource_survey (!busy) febfa000-febfac00 : pnp 00:08 -- from PNP system driver (!busy) febf8000-febfbfff : ICH HD audio -- fails because it spans the PNP region
The PNP reservation succeeds even though the PCI reservation has already happened, so I don't see how we can do this without a quirk that ignores the bogus PNP resources.
Bjorn
On Thu, 14 Feb 2008, Bjorn Helgaas wrote:
Sorry for the delay. I did work on this, but I don't see how this can work. pcibios_init() marks its reservations as not busy, so the subsequent PNP request doesn't fail, even if it clashes.
It *shouldn't* fail.
Things should fail only when two different drivers have requested the same region. NOT when something tells the system that a region _exists_.
Two different things.
Linus
On Thursday 14 February 2008 12:42:52 pm Linus Torvalds wrote:
On Thu, 14 Feb 2008, Bjorn Helgaas wrote:
Sorry for the delay. I did work on this, but I don't see how this can work. pcibios_init() marks its reservations as not busy, so the subsequent PNP request doesn't fail, even if it clashes.
It *shouldn't* fail.
Things should fail only when two different drivers have requested the same region. NOT when something tells the system that a region _exists_.
The sound driver doesn't fail because two different drivers have requested the same region; it fails because PNP told us a region exists, and the sound region crosses the edge of the PNP region.
You wrote earlier that:
On Tue, 5 Feb 2008, Bjorn Helgaas wrote:
- PnP/ACPI resource allocation *after* it, but before driver loading (which wll cause new resources to be allocated). This could be fs_initcall, or whatever (that's what things like "acpi_event_init" already do).
If we put the PNP system driver here, we can easily do a quirk that ignores PNP resources that overlap PCI resources.
No, you don't need any quirks: you just do an "insert_resource()" and ignore the error return. If the (bogus) PnP resource clashes with the (correct) hardware PCI resource, the insert will simply fail. No quirks needed.
I thought you were suggesting here that the PNP system driver would do an insert_resource(), and it would fail if it clashed with the PCI resource.
Can you be more explicit about how you think I should fix this?
Bjorn
On Thu, 14 Feb 2008, Bjorn Helgaas wrote:
On Thursday 14 February 2008 12:42:52 pm Linus Torvalds wrote:
It *shouldn't* fail.
Things should fail only when two different drivers have requested the same region. NOT when something tells the system that a region _exists_.
The sound driver doesn't fail because two different drivers have requested the same region; it fails because PNP told us a region exists, and the sound region crosses the edge of the PNP region.
Right, and that was a bug.
It *shouldn't* fail. The PnP resource should be inserted _after_ the PCI region has been inserted, and _that_ should fail, since the PnP region is crap and cannot be inserted "half-way".
Linus
On Thursday 14 February 2008 01:26:59 pm Linus Torvalds wrote:
On Thu, 14 Feb 2008, Bjorn Helgaas wrote:
On Thursday 14 February 2008 12:42:52 pm Linus Torvalds wrote:
It *shouldn't* fail.
Things should fail only when two different drivers have requested the same region. NOT when something tells the system that a region _exists_.
The sound driver doesn't fail because two different drivers have requested the same region; it fails because PNP told us a region exists, and the sound region crosses the edge of the PNP region.
Right, and that was a bug.
It *shouldn't* fail. The PnP resource should be inserted _after_ the PCI region has been inserted, and _that_ should fail, since the PnP region is crap and cannot be inserted "half-way".
That means the PNP system driver has to be registered after the PCI driver. We can't guarantee that, especially if the sound driver is a module.
Bjorn
On Thu, 14 Feb 2008, Bjorn Helgaas wrote:
That means the PNP system driver has to be registered after the PCI driver.
After the PCI *subsystem*
Here's the actual problem:
[ 31.133141] PCI: Unable to reserve mem region #1:4000@febf8000 for device 0000:00:1b.0
and here is what the resource tree *should* look like:
febf8000-febfbfff : 0000:00:1b.0 febf8000-febfbfff : ICH HD audio
that's with a working sound driver.
Notice how there is *two* resources there: there's the PCI bus resource itself, the one called 0000:00:1b.0 (which was active at boot), and there is the "driver resource" that nests inside of it ("ICH HD audio").
The PCI bus resource is created and inserted into the resource when the PCI bus discovery happens - long before the driver comes along at all.
And the problem is this ABSOLUTE CRAP that happened much earlier:
[ 22.906610] system 00:08: iomem range 0xfebfe000-0xfebfec00 has been reserved [ 22.906654] system 00:08: iomem range 0xfebfa000-0xfebfac00 has been reserved
which happens (a) before the PCI bus probing (b) despite the fact that the PCI bus probing already found something at that address, and PnP shouldn't pollute the *correct* resources.
and the fix is to move that crap PnP to _after_ the PCI bus has been probed, and make sure that if the PnP resource clashes with the known good ones, then PnP gets the h*ll out and doesn't insert it at all.
See?
We can't guarantee that, especially if the sound driver is a module.
This has *nothing* to do with the sound driver.
This has everything to do with the fact that the PnP layer is a piece of crap and overrides/messes with the *correct* resources that we found during PCI bus probing.
Is that really so hard to understand?
So here is (for the *fifth* time) what PnP should do:
- only try to insert its resources *after* the PCI bus probing has happened - and if that fails, LET THE CRAP FAIL instead of making the *good* code fail!
Ok?
Linus
On Thursday 14 February 2008 02:37:15 pm Linus Torvalds wrote:
On Thu, 14 Feb 2008, Bjorn Helgaas wrote:
That means the PNP system driver has to be registered after the PCI driver.
After the PCI *subsystem*
Here's the actual problem:
[ 31.133141] PCI: Unable to reserve mem region #1:4000@febf8000 for device 0000:00:1b.0
and here is what the resource tree *should* look like:
febf8000-febfbfff : 0000:00:1b.0 febf8000-febfbfff : ICH HD audio
that's with a working sound driver.
Notice how there is *two* resources there: there's the PCI bus resource itself, the one called 0000:00:1b.0 (which was active at boot), and there is the "driver resource" that nests inside of it ("ICH HD audio").
The PCI bus resource is created and inserted into the resource when the PCI bus discovery happens - long before the driver comes along at all.
And the problem is this ABSOLUTE CRAP that happened much earlier:
[ 22.906610] system 00:08: iomem range 0xfebfe000-0xfebfec00 has been reserved [ 22.906654] system 00:08: iomem range 0xfebfa000-0xfebfac00 has been reserved
which happens (a) before the PCI bus probing
I don't think so. pci_scan_bus_parented() happened earlier, here:
[ 22.875359] ACPI: PCI Root Bridge [PCI0] (0000:00)
and pcibios_init() is a subsys_initcall() that happens before the PNP system driver registers via fs_initcall() (I had this order wrong in a previous email).
This has everything to do with the fact that the PnP layer is a piece of crap and overrides/messes with the *correct* resources that we found during PCI bus probing.
There are clearly problems with PNP. I'm trying to help improve the situation.
So here is (for the *fifth* time) what PnP should do:
- only try to insert its resources *after* the PCI bus probing has happened
That already happens.
- and if that fails, LET THE CRAP FAIL instead of making the *good* code fail!
The PNP resource fits entirely inside the PCI bus resource, so the PNP insertion will only fail if the sound driver has already been loaded.
It seems like we're talking past each other. Would anybody else like to step in and help explain this to me? Maybe a fresh viewpoint will help me find a clue.
Bjorn
On Thu, 14 Feb 2008, Bjorn Helgaas wrote:
[ 22.906610] system 00:08: iomem range 0xfebfe000-0xfebfec00 has been reserved [ 22.906654] system 00:08: iomem range 0xfebfa000-0xfebfac00 has been reserved
The PNP resource fits entirely inside the PCI bus resource, so the PNP insertion will only fail if the sound driver has already been loaded.
Ok, it does indeed fit entirely in (and the discussion about "clashing" misled me - the PCI resource doesn't actually clash, it's just a subset). And the problem then ends up that the PnP thing adds resources to inside the PCI resource. It shouldn't. It's crap.
It should insert the resource to the root resource (or a bridge resource), or not at all. If somebody else has already inserted a real device resource, we already know about it, and the PnP information is going to be just making things worse.
The *really* basic issue is that PnP and ACPI generally report utter crap. We should always totally ignore them EXCEPT AS A WAY TO AVOID _NEW_ ALLOCATIONS.
That's the only valid reason to believe in ACPI: we don't know what the hell it's talking about, but we _do_ know that we shouldn't be allocating new resources over it (because if it actually happens to be correct, it is some random scary stuff that we obviously didn't find out about).
But the moment we have better information (where "we actually scanned the hardware" is the very definition of better information), we should always totally discard any ACPI crud. It's guaranteed to be worse than what we already know about.
That's all I ever wanted. To have ACPI realize that it should never ever mess with something we know better about.
Linus
On Thu, 14 Feb 2008, Linus Torvalds wrote:
It should insert the resource to the root resource (or a bridge resource), or not at all. If somebody else has already inserted a real device resource, we already know about it, and the PnP information is going to be just making things worse.
Hmm. The approach I'd take is to always insert the thing into the root resource. If we do want to let PnP insert it into some lower-level bus, we'd need to have some way to distinguish "bus" from "device", and we don't.
So right now, how about just making PnP/ACPI just use
root = (flags & IORESOURCE_MEM) ? iomem_resource : ioport_resource; request_resource(root, newresource);
which is what we do for the e820 map and the other special resources we know about (ie the magic resources we make up ourselves like video ram and our standard PCI/ISA resource lists like the <0x100 DMA/PIC/FPU IO ports etc)
Linus
On Thursday 14 February 2008 04:14:45 pm Linus Torvalds wrote:
On Thu, 14 Feb 2008, Linus Torvalds wrote:
It should insert the resource to the root resource (or a bridge resource), or not at all. If somebody else has already inserted a real device resource, we already know about it, and the PnP information is going to be just making things worse.
Hmm. The approach I'd take is to always insert the thing into the root resource. If we do want to let PnP insert it into some lower-level bus, we'd need to have some way to distinguish "bus" from "device", and we don't.
So right now, how about just making PnP/ACPI just use
root = (flags & IORESOURCE_MEM) ? iomem_resource : ioport_resource; request_resource(root, newresource);
which is what we do for the e820 map and the other special resources we know about (ie the magic resources we make up ourselves like video ram and our standard PCI/ISA resource lists like the <0x100 DMA/PIC/FPU IO ports etc)
That makes sense, but ... there are BIOSes that actually list nested resources, e.g., http://bugzilla.kernel.org/show_bug.cgi?id=9514#c29 :
[000000290 - 000000294] Motherboard resources [000000290 - 00000029F] Motherboard resources
We'd have to make sure we don't start with the 0x290-0x294 resource, because then we'll fail when we try to reserve 0x290-0x29f, and we really should avoid using the 0x295-0x2f9 piece as well.
And even if we do figure out the "outermost" resources, I'd worry that some BIOS will have separate ACPI devices that have overlapping resources. Then Murphy's Law says that we'll find the device with the smaller resource first, reserve it, then find a device with an enclosing resource, and leave something unreserved.
Maybe that's still better than a quirk; I don't know. I guess it's pretty ugly either way.
Bjorn
On Thu, 14 Feb 2008, Bjorn Helgaas wrote:
That makes sense, but ... there are BIOSes that actually list nested resources, e.g., http://bugzilla.kernel.org/show_bug.cgi?id=9514#c29 :
[000000290 - 000000294] Motherboard resources [000000290 - 00000029F] Motherboard resources
Heh.
Not that I can really see us ever being confused by this in practice. Sure, we'd skip the second resource (because it's busy), but that's what we used to do before anyway, wasn't it? And even if we do, there's very little reason we'd ever actually try to reserve that 294-29f range.
In practice, the only resources we actually tend to end up really allocating tends to be things like Cardbus (or a whole PCI bus either due to hotplug or due to the BIOS not bothering), which tend want a big window for the bus window, so the "off-by-one" kinds of things wouldn't really worry me. Very seldom do we actually have small resources to worry about on a PC platform, since they all tend to be pre-allocated by the BIOS anyway.
That said, if we really want to handle this, we could certainly add a whole new ioresource flag bit that says "allow inserting resources into this", and we could set that bit not just for the PnP reservations themselves, but in PCI bus resources too.
Basically, there are two different ways of inserting a resource:
- the "register this subresource" thing that "request_resource()" does, which just works on the one given resource and honors the BUSY bit.
- the "insert this resource into the tree" (insert_resource()) that starts from the root and tries to find the right location. It honors the BUSY bit too, but we could extend it to _only_ extend into resources that allow it.
Here's the trivial patch to add the infrastructure for this, but I did *not* do the required "mark PCI bus resources as IORESOURCE_INSERT" etc (nor obviously the markers to make PnP resources themselves be marked as "insert").
But if you want to play around with it, I think this is at least *conceptually* a sane idea. Another way to see it is that a resource with the IORESOURCE_INSERT bit is "subdivisible", and then it's very obvious that a normal PCI device resource should not have it set: it is a "leaf" resources that can not be split up.
So the only way to populate "leaf" resources is by explicitly specifying them and doing a "request_resource()" on the resource, while the non-leaf IORESOURCE_INSERT resources can be populated from below with ioresrouce_insert().
Is this over-designed? I dunno. The _implementation_ on a resource level is certainly trivial.
Oh, and if it wasn't obvious: the patch is untested. It compiles. That's all I'm going to say about it, especially since without marking any resources with the new IORESOURCE_INSERT bit, the only thing you can do with "insert_resource()" is to insert into the root level (because we don't check the level we are passing into)
Linus
--- include/linux/ioport.h | 1 + kernel/resource.c | 2 ++ 2 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 605d237..0a56410 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -45,6 +45,7 @@ struct resource_list { #define IORESOURCE_RANGELENGTH 0x00008000 #define IORESOURCE_SHADOWABLE 0x00010000 #define IORESOURCE_BUS_HAS_VGA 0x00080000 +#define IORESOURCE_INSERT 0x00100000 /* Allow insert_resource() */
#define IORESOURCE_DISABLED 0x10000000 #define IORESOURCE_UNSET 0x20000000 diff --git a/kernel/resource.c b/kernel/resource.c index 82aea81..d6da786 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -397,6 +397,8 @@ int insert_resource(struct resource *parent, struct resource *new) result = -EBUSY; if (first == parent) goto out; + if (!(first->flags & IORESOURCE_INSERT)) + goto out;
if ((first->start > new->start) || (first->end < new->end)) break;
On Thursday 14 February 2008 05:40:36 pm Linus Torvalds wrote:
That said, if we really want to handle this, we could certainly add a whole new ioresource flag bit that says "allow inserting resources into this", and we could set that bit not just for the PnP reservations themselves, but in PCI bus resources too.
Basically, there are two different ways of inserting a resource:
the "register this subresource" thing that "request_resource()" does, which just works on the one given resource and honors the BUSY bit.
the "insert this resource into the tree" (insert_resource()) that starts from the root and tries to find the right location. It honors the BUSY bit too, but we could extend it to _only_ extend into resources that allow it.
I guess it's time to get back to this problem.
I don't want to make PNP insert resources only at the root. That would avoid the ALSA problem because the enclosing PCI resource is allocated first, but it also precludes us from dealing with lots of valid ACPI information. ACPI can describe layers of bridges and devices behind them. It don't think we should throw away that information just to solve this problem.
And I don't really want to add IORESOURCE_INSERT. It's probably a useful idea, but I don't think it helps solve *this* problem. We'd have to mark both the PCI and the PNP regions as "insert", and I think we'd end up with the exact same situation we have today:
febf8000-febfbfff : 0000:00:1b.0 -- from pcibios_resource_survey (!busy) febfa000-febfac00 : pnp 00:08 -- from PNP system driver (!busy) febf8000-febfbfff : ICH HD audio -- fails; spans part of PNP region
I'm back to the quirk idea, which is still ugly but seems the most straightforward to me. Here's my current patch (this won't apply directly because we'd have to revert the Supermicro-specific quirk first).
Excuse me a minute while I put on my asbestos underwear.
Bjorn
PNP: disable PNP motherboard resources that overlap PCI BARs
Some BIOSes have PNP motherboard devices with resources that partially overlap PCI BARs. The PNP system driver claims these motherboard resources, which prevents the normal PCI driver from requesting them later.
This patch disables the PNP resources that conflict with PCI BARs so they won't be claimed by the PNP system driver.
Of course, this only works if PCI devices have already been enumerated. Currently, PCI devices are discovered before any PNP init via this path:
acpi_pci_root_init() -> acpi_pci_root_add() -> pci_acpi_scan_root() -> pci_scan_bus_parented() -> pci_scan_child_bus() -> ...
References: https://bugzilla.redhat.com/show_bug.cgi?id=280641 https://bugzilla.redhat.com/show_bug.cgi?id=313491 http://lkml.org/lkml/2008/1/9/449 http://thread.gmane.org/gmane.linux.acpi.devel/27312 http://lkml.org/lkml/2008/1/27/168
Signed-off-by: Bjorn Helgaas bjorn.helgaas@hp.com
Index: work7/drivers/pnp/quirks.c =================================================================== --- work7.orig/drivers/pnp/quirks.c 2008-02-27 10:17:40.000000000 -0700 +++ work7/drivers/pnp/quirks.c 2008-02-27 10:19:35.000000000 -0700 @@ -108,6 +108,77 @@ "pnp: SB audio device quirk - increasing port range\n"); }
+ +#include <linux/pci.h> + +static void quirk_system_pci_resources(struct pnp_dev *dev) +{ + struct pci_dev *pdev = NULL; + resource_size_t pnp_start, pnp_end, pci_start, pci_end; + int i, j; + + /* + * Some BIOSes have PNP motherboard devices with resources that + * partially overlap PCI BARs. The PNP system driver claims these + * motherboard resources, which prevents the normal PCI driver from + * requesting them later. + * + * This patch disables the PNP resources that conflict with PCI BARs + * so they won't be claimed by the PNP system driver. + */ + for_each_pci_dev(pdev) { + for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) { + if (!(pci_resource_flags(pdev, i) & IORESOURCE_MEM) || + pci_resource_len(pdev, i) == 0) + continue; + + pci_start = pci_resource_start(pdev, i); + pci_end = pci_resource_end(pdev, i); + for (j = 0; j < PNP_MAX_MEM; j++) { + if (!pnp_mem_valid(dev, j) || + pnp_mem_len(dev, j) == 0) + continue; + + pnp_start = pnp_mem_start(dev, j); + pnp_end = pnp_mem_end(dev, j); + + /* + * If the PNP region doesn't overlap the PCI + * region at all, there's no problem. + */ + if (pnp_end < pci_start || pnp_start > pci_end) + continue; + + /* + * If the PNP region completely encloses (or is + * at least as large as) the PCI region, that's + * also OK. For example, this happens when the + * PNP device describes a bridge with PCI + * behind it. + */ + if (pnp_start <= pci_start && + pnp_end >= pci_end) + continue; + + /* + * Otherwise, the PNP region overlaps *part* of + * the PCI region, and that might prevent a PCI + * driver from requesting its resources. + */ + dev_warn(&dev->dev, "mem resource " + "(0x%llx-0x%llx) overlaps %s BAR %d " + "(0x%llx-0x%llx), disabling\n", + (unsigned long long) pnp_start, + (unsigned long long) pnp_end, + pci_name(pdev), i, + (unsigned long long) pci_start, + (unsigned long long) pci_end); + pnp_mem_flags(dev, j) = 0; + } + } + } +} + /* * PnP Quirks * Cards or devices that need some tweaking due to incomplete resource info @@ -128,6 +199,8 @@ {"CTL0043", quirk_sb16audio_resources}, {"CTL0044", quirk_sb16audio_resources}, {"CTL0045", quirk_sb16audio_resources}, + {"PNP0c01", quirk_system_pci_resources}, + {"PNP0c02", quirk_system_pci_resources}, {""} };
participants (5)
-
Andrew Morton
-
Avuton Olrich
-
Bjorn Helgaas
-
Linus Torvalds
-
Robert Hancock