Crash in acpi_ns_validate_handle triggered by soundwire on Linux 5.10

Rafael J. Wysocki rafael at kernel.org
Thu Jan 28 13:13:24 CET 2021


On Wed, Jan 27, 2021 at 8:19 PM Marcin Ślusarz <marcin.slusarz at gmail.com> wrote:
>
> On Wed, Jan 27, 2021 at 6:28 PM Pierre-Louis Bossart
> <pierre-louis.bossart at linux.intel.com> wrote:
> > > Weird, I can't reproduce this problem with my self-compiled kernel :/
> > > I don't even see soundwire modules loaded in. Manually loading them of course
> > > doesn't do much.
> > >
> > > Previously I could boot into the "faulty" kernel by using "recovery mode", but
> > > I can't do that anymore - it crashes too.
> > >
> > > Maybe there's some kind of race and this bug depends on some specific
> > > ordering of events?
> >
> > missing Kconfig?
> > You need CONFIG_SOUNDWIRE and CONFIG_SND_SOC_SOF_INTEL_SOUNDWIRE
> > selected to enter this sdw_intel_acpi_scan() routine.
>
> It was a PEBKAC, but a slightly different one. I won't bore you with
> (embarrassing) details ;).
>
> I reproduced the problem, tested both your and Rafael's patches
> and the kernel still crashes, with the same stack trace.
> (Yes, I'm sure I booted the right kernel :)
>
> Why "recovery mode" stopped working (or worked previously) is still a mystery.

So for clarity, you've tried this:

static int snd_intel_dsp_check_soundwire(struct pci_dev *pci)
{
    struct sdw_intel_acpi_info info;
    acpi_handle handle;
    int ret;

    handle = ACPI_HANDLE(&pci->dev);
    if (!handle)
        return -ENODEV;

and it has not made a difference?

And the relevant part of the trace is:

RIP: 0010:acpi_ns_validate_handle+0x1a/0x23
Code: 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 44 00 00 48 8d 57 ff 48 89 f8 48 83 fa fd 76 08 48 8b 05 0c b8 67 01 c3 <80> 7f 08 0f 74 02 31 c0 c3 0f 1f 44 00 00 48 8b 3d f6 b7 67 01 e8
RSP: 0000:ffffc388807c7b20 EFLAGS: 00010213
RAX: 0000000000000048 RBX: ffffc388807c7b70 RCX: 0000000000000000
RDX: 0000000000000047 RSI: 0000000000000246 RDI: 0000000000000048
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffffc0f5f4d1 R11: ffffffff8f0cb268 R12: 0000000000001001
R13: ffffffff8e33b160 R14: 0000000000000048 R15: 0000000000000000
FS:  00007f24548288c0(0000) GS:ffff9f781fb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 0000000106158004 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
 acpi_get_data_full+0x4d/0x92
 acpi_bus_get_device+0x1f/0x40
 sdw_intel_acpi_scan+0x59/0x230 [soundwire_intel]
 ? strstr+0x22/0x60
 ? dmi_matches+0x76/0xe0
 snd_intel_dsp_driver_probe.cold+0xaf/0x163 [snd_intel_dspcfg]
 azx_probe+0x7a/0x970 [snd_hda_intel]
 local_pci_probe+0x42/0x80
 ? _cond_resched+0x16/0x40
 pci_device_probe+0xfd/0x1b0

so it looks like we got to sdw_intel_acpi_scan() with a non-NULL, but
otherwise invalid, parent_handle which was then passed to
acpi_bus_get_device().  From there it reached acpi_get_data_full()
and acpi_ns_validate_handle(), which crashed because it tried to
dereference the handle via ACPI_GET_DESCRIPTOR_TYPE().  The registers
are consistent with that: RDI is 0x48 and the faulting address in CR2
is 0x50.
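
To make the arithmetic concrete, here is a small user-space sketch of
why a handle value of 0x48 would fault at 0x50.  It is only a model:
the struct layout and all model_* names are hypothetical stand-ins, not
the ACPICA definitions, and it assumes the descriptor type byte sits
right after one 8-byte field and that only NULL and an all-ones "root"
alias bypass the type check.  Nothing is dereferenced; it just computes
the address the check would read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the common header of an ACPI descriptor:
 * one 8-byte field followed by the one-byte descriptor type.
 * This layout is an assumption made for illustration only. */
struct model_descriptor {
    uint64_t common_pointer;
    uint8_t  descriptor_type;
};

/* Address a descriptor-type check would read for a given handle value.
 * Pure arithmetic: the handle is never dereferenced here. */
uint64_t model_type_field_address(uint64_t handle)
{
    return handle + offsetof(struct model_descriptor, descriptor_type);
}

/* Assumed special cases that skip the type check: a NULL handle and an
 * all-ones value standing in for the root object. */
int model_skips_type_check(uint64_t handle)
{
    return handle == 0 || handle == UINT64_MAX;
}
```

With handle = 0x48 the model says the type check is not skipped and the
byte read lands at 0x48 + 8 = 0x50, which matches RDI and CR2 in the
trace above.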

To debug it further, can you please modify
snd_intel_dsp_check_soundwire() to read like this:

static int snd_intel_dsp_check_soundwire(struct pci_dev *pci)
{
    struct sdw_intel_acpi_info info;
    struct acpi_device *adev = NULL;
    acpi_handle handle;
    int ret;

    handle = ACPI_HANDLE(&pci->dev);
    if (!handle)
        return -ENODEV;

    if (acpi_bus_get_device(handle, &adev))
        return -ENODEV;

and see what happens then?

