On Fri, 19 Jan 2018 13:58:01 -0800 syzbot syzbot+29f08ad5cb6820798dfe@syzkaller.appspotmail.com wrote:
Hello,
syzbot hit the following crash on mmots commit 2164355612187e55e8d60a28d2cc6b2337841a7e (Fri Jan 19 01:07:54 2018 +0000) pci: test for unexpectedly disabled bridges
So far this crash happened 2 times on mmots. C reproducer is attached. syzkaller reproducer is attached. Raw console output is attached. compiler: gcc (GCC) 7.1.1 20170620 .config is attached.
IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+29f08ad5cb6820798dfe@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. See footer for details. If you forward the report, please keep this part and the footer.
BUG: unable to handle kernel paging request at ffffc90001691000 IP: memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:65 PGD 1dad2c067 P4D 1dad2c067 PUD 1dad2d067 PMD 1c6a8f067 PTE 0 Oops: 0002 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 5739 Comm: syzkaller592073 Not tainted 4.15.0-rc8-mm1+ #57 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:65 RSP: 0018:ffff8801cbbdfb78 EFLAGS: 00010246 RAX: fffff520002d3f00 RBX: ffffc90001691000 RCX: 000000000000ee51 RDX: 000000000000ee51 RSI: 0000000000000000 RDI: ffffc90001691000 RBP: ffff8801cbbdfb98 R08: fffff520002d3fcb R09: ffffc90001691000 R10: 0000000000001dcb R11: fffff520002d3fca R12: 000000000000ee51 R13: 0000000000000000 R14: 00007ffffffff000 R15: 000000002001be51 FS: 00007f88ae7d7700(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc90001691000 CR3: 00000001ccefa005 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: memset include/linux/string.h:329 [inline] _copy_from_user+0xe9/0x110 lib/usercopy.c:16 copy_from_user include/linux/uaccess.h:147 [inline] snd_pcm_oss_write1 sound/core/oss/pcm_oss.c:1347 [inline] snd_pcm_oss_write+0x438/0x880 sound/core/oss/pcm_oss.c:2659 __vfs_write+0xef/0x970 fs/read_write.c:480 vfs_write+0x189/0x510 fs/read_write.c:544 SYSC_write fs/read_write.c:589 [inline] SyS_write+0xef/0x220 fs/read_write.c:581 entry_SYSCALL_64_fastpath+0x29/0xa0 RIP: 0033:0x44a559 RSP: 002b:00007f88ae7d6da8 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000006dcc24 RCX: 000000000044a559 RDX: 000000000000fe51 RSI: 000000002000c000 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 00000000006dcc20 R13: 7073642f7665642f R14: 00800000c0045006 R15: 0000000000000001 Code: 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 RIP: memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:65 RSP: ffff8801cbbdfb78 CR2: ffffc90001691000 ---[ end trace 8f421641f3e10f44 ]--- Kernel panic - not syncing: Fatal exception Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds..
It's hard to believe that the (four year old) workaround-for-a-pci-restoring-bug.patch could cause this.
From: Linus Torvalds torvalds@linux-foundation.org Subject: pci: test for unexpectedly disabled bridges
The all-ones value is not just a "device didn't exist" case, it's also potentially a quite valid value, so not restoring it would be wrong.
What *would* be interesting is to hear where the bad values came from in the first place. It sounds like the device state is saved after the PCI bus controller in front of the device has been crapped on, resulting in the PCI config cycles never reaching the device at all.
Something along this patch (together with suspend/resume debugging output) migth help pinpoint it. But it really sounds like something totally brokenly turned off the PCI bridge (some ACPI shutdown crud? I wouldn't be entirely surprised)
Cc: Greg KH greg@kroah.com Signed-off-by: Andrew Morton akpm@linux-foundation.org ---
drivers/pci/pci.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff -puN drivers/pci/pci.c~workaround-for-a-pci-restoring-bug drivers/pci/pci.c --- a/drivers/pci/pci.c~workaround-for-a-pci-restoring-bug +++ a/drivers/pci/pci.c @@ -1094,6 +1094,15 @@ static void pci_restore_pcix_state(struc int pci_save_state(struct pci_dev *dev) { int i; + u32 val; + + /* Unable to read PCI device/manufacturer state? Something is seriously wrong! */ + if (pci_read_config_dword(dev, 0, &val) || val == 0xffffffff) { + printk("Broken read from PCI device %s\n", pci_name(dev)); + WARN_ON(1); + return -1; + } + /* XXX: 100% dword access ok here? */ for (i = 0; i < 16; i++) pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]); _