[alsa-devel] [PATCH 1/3] Revert "ASoC: Intel: Skylake: Acquire irq after RIRB allocation"
This reverts commit 12eeeb4f4733bbc4481d01df35933fc15beb8b19.
The patch doesn't fix accessing memory with null pointer in skl_interrupt().
There are two problems: 1) skl_init_chip() is called twice, before and after dma buffer is allocate. The first call sets bus->chip_init which prevents the second from initializing bus->corb.buf and rirb.buf from bus->rb.area. 2) snd_hdac_bus_init_chip() enables interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. There is a small window which skl_interrupt() can be called if irq has been acquired. If so, it crashes when using null dma buffer pointers.
Will fix the problems in the following patches. Also attaching the crash for future reference.
[ 16.949148] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI [ 16.950829] gsmi: Log Shutdown Reason 0x03 [ 16.950830] Modules linked in: uvcvideo(+) videobuf2_vmalloc snd_soc_skl(+) videobuf2_memops videobuf2_v4l2 videobuf2_core snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi snd_hda_ext_core snd_hda_core snd_soc_max98357a acpi_als snd_soc_da7219 lzo lzo_compress zram snd_seq_dummy snd_seq snd_seq_device bridge stp llc ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_mark fuse cfg80211 iio_trig_sysfs cros_ec_sensors cros_ec_sensors_ring cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf industrialio smsc95xx usbnet mii joydev [ 16.950874] CPU: 0 PID: 1083 Comm: chrome Not tainted 4.14.64 #14 [ 16.950875] Hardware name: Google Yorp/Yorp, BIOS Google_Yorp.10985.0.2018_08_20_1648 08/17/2018 [ 16.950878] task: ffff88015b1c2b80 task.stack: ffff880155f30000 [ 16.950887] RIP: 0010:snd_hdac_bus_update_rirb+0x19b/0x4cf [snd_hda_core] [ 16.950889] RSP: 0000:ffff88015c807c08 EFLAGS: 00010003 [ 16.950891] RAX: 0000000000000101 RBX: 000000000000080c RCX: 1ffff10026822185 [ 16.950893] RDX: dffffc0000000000 RSI: ffff88015b1c2b80 RDI: ffffc90000514058 [ 16.950894] RBP: ffff88015c807c68 R08: 0000000000000000 R09: 0000000000000000 [ 16.950895] R10: 0000000000000000 R11: ffffffffc043074f R12: 0000000000000800 [ 16.950897] R13: 0000000000000001 R14: 0000000000000002 R15: 1ffff10026822119 [ 16.950899] FS: 00007d85924cc740(0000) GS:ffff88015c800000(0000) knlGS:0000000000000000 [ 16.950900] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.950902] CR2: 000058a54df16038 CR3: 00000001551c0000 CR4: 00000000003406f0 [ 16.950903] Call Trace: [ 16.950906] <IRQ> [ 16.950918] skl_interrupt+0x19e/0x2d6 [snd_soc_skl] [ 16.950926] ? dma_supported+0xb5/0xb5 [snd_soc_skl] [ 16.950933] __handle_irq_event_percpu+0x27a/0x6c8 [ 16.950937] ? __irq_wake_thread+0x1d1/0x1d1 [ 16.950942] ? __do_softirq+0x57a/0x69e [ 16.950944] handle_irq_event_percpu+0x95/0x1ba [ 16.950948] ? _raw_spin_unlock+0x65/0xdc [ 16.950951] ? __handle_irq_event_percpu+0x6c8/0x6c8 [ 16.950953] ? _raw_spin_unlock+0x65/0xdc [ 16.950957] ? time_cpufreq_notifier+0x483/0x483 [ 16.950959] handle_irq_event+0x89/0x123 [ 16.950962] handle_fasteoi_irq+0x16f/0x425 [ 16.950965] handle_irq+0x1fe/0x28e [ 16.950969] do_IRQ+0x6e/0x12e [ 16.950972] common_interrupt+0x7a/0x7a [ 16.950974] </IRQ> [ 16.950976] RIP: 0033:0x58097f61a5c0 [ 16.950978] RSP: 002b:00007ffe95c971a8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffbc [ 16.950980] RAX: 000058097f61a5c0 RBX: 00000e4ac5220560 RCX: 0000000000004e10 [ 16.950982] RDX: 000058098563df20 RSI: 00007ffe95c97250 RDI: 00000e4ac5220500 [ 16.950983] RBP: 00007ffe95c97410 R08: 0000000000000000 R09: 00007ffe95c97250 [ 16.950984] R10: 0000000000000000 R11: 0000000000000000 R12: 00000e4ac5220560 [ 16.950986] R13: 00000e4ac5220560 R14: 00000e4ac47c9650 R15: 0000580987646350 [ 16.950988] Code: 74 12 48 89 df e8 eb 2d 8e cd 48 ba 00 00 00 00 00 fc ff df 4c 8b 23 44 89 f0 83 c8 01 0f b7 c0 49 8d 1c 84 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 da 01 00 00 44 8b 3b 41 0f b7 c6 49 8d 1c [ 16.951031] RIP: snd_hdac_bus_update_rirb+0x19b/0x4cf [snd_hda_core] RSP: ffff88015c807c08 [ 16.951036] ---[ end trace 58bf9ece1775bc92 ]--- [ 16.956871] Kernel panic - not syncing: Fatal exception in interrupt [ 16.956888] Kernel Offset: 0xc800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Signed-off-by: Yu Zhao yuzhao@google.com --- sound/soc/intel/skylake/skl.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c index cf09721ca13e..dce649485649 100644 --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -838,7 +838,11 @@ static int skl_first_init(struct hdac_bus *bus)
snd_hdac_bus_parse_capabilities(bus);
+ if (skl_acquire_irq(bus, 0) < 0) + return -EBUSY; + pci_set_master(pci); + synchronize_irq(bus->irq);
gcap = snd_hdac_chip_readw(bus, GCAP); dev_dbg(bus->dev, "chipset global capabilities = 0x%x\n", gcap); @@ -871,12 +875,6 @@ static int skl_first_init(struct hdac_bus *bus) if (err < 0) return err;
- err = skl_acquire_irq(bus, 0); - if (err < 0) - return err; - - synchronize_irq(bus->irq); - /* initialize chip */ skl_init_pci(skl);
In snd_hdac_bus_init_chip(), we enable interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. If irq has been acquired and irq handler uses the dma buffer, kernel may crash when interrupt comes in.
Fix the problem by postponing enabling irq after dma buffer initialization. And warn once on null dma buffer pointer during the initialization.
Signed-off-by: Yu Zhao yuzhao@google.com --- sound/hda/hdac_controller.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index 560ec0986e1a..11057d9f84ec 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -40,6 +40,8 @@ static void azx_clear_corbrp(struct hdac_bus *bus) */ void snd_hdac_bus_init_cmd_io(struct hdac_bus *bus) { + WARN_ON_ONCE(!bus->rb.area); + spin_lock_irq(&bus->reg_lock); /* CORB set up */ bus->corb.addr = bus->rb.addr; @@ -479,13 +481,15 @@ bool snd_hdac_bus_init_chip(struct hdac_bus *bus, bool full_reset) /* reset controller */ azx_reset(bus, full_reset);
- /* initialize interrupts */ + /* clear interrupts */ azx_int_clear(bus); - azx_int_enable(bus);
/* initialize the codec command I/O */ snd_hdac_bus_init_cmd_io(bus);
+ /* enable interrupts after CORB/RIRB buffers are initialized above */ + azx_int_enable(bus); + /* program the position buffer */ if (bus->use_posbuf && bus->posbuf.addr) { snd_hdac_chip_writel(bus, DPLBASE, (u32)bus->posbuf.addr);
Internally, skl_init_chip() calls snd_hdac_bus_init_chip() which 1) sets bus->chip_init to prevent multiple entrances before device is stopped; 2) enables interrupt.
We shouldn't use it for the purpose of resetting device only because 1) when we really want to initialize device, we won't be able to do so; 2) we are ready to handle interrupt yet, and kernel crashes when interrupt comes in.
Rename azx_reset() to snd_hdac_bus_reset_link(), and use it to reset device properly.
Fixes: 60767abcea3d ("ASoC: Intel: Skylake: Reset the controller in probe") Signed-off-by: Yu Zhao yuzhao@google.com --- include/sound/hdaudio.h | 1 + sound/hda/hdac_controller.c | 7 ++++--- sound/soc/intel/skylake/skl.c | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/sound/hdaudio.h b/include/sound/hdaudio.h index 6f1e1f3b3063..cd1773d0e08f 100644 --- a/include/sound/hdaudio.h +++ b/include/sound/hdaudio.h @@ -412,6 +412,7 @@ void snd_hdac_bus_init_cmd_io(struct hdac_bus *bus); void snd_hdac_bus_stop_cmd_io(struct hdac_bus *bus); void snd_hdac_bus_enter_link_reset(struct hdac_bus *bus); void snd_hdac_bus_exit_link_reset(struct hdac_bus *bus); +int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset);
void snd_hdac_bus_update_rirb(struct hdac_bus *bus); int snd_hdac_bus_handle_stream_irq(struct hdac_bus *bus, unsigned int status, diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index 11057d9f84ec..74244d8e2909 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -385,7 +385,7 @@ void snd_hdac_bus_exit_link_reset(struct hdac_bus *bus) EXPORT_SYMBOL_GPL(snd_hdac_bus_exit_link_reset);
/* reset codec link */ -static int azx_reset(struct hdac_bus *bus, bool full_reset) +int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset) { if (!full_reset) goto skip_reset; @@ -410,7 +410,7 @@ static int azx_reset(struct hdac_bus *bus, bool full_reset) skip_reset: /* check to see if controller is ready */ if (!snd_hdac_chip_readb(bus, GCTL)) { - dev_dbg(bus->dev, "azx_reset: controller not ready!\n"); + dev_dbg(bus->dev, "controller not ready!\n"); return -EBUSY; }
@@ -425,6 +425,7 @@ static int azx_reset(struct hdac_bus *bus, bool full_reset)
return 0; } +EXPORT_SYMBOL_GPL(snd_hdac_bus_reset_link);
/* enable interrupts */ static void azx_int_enable(struct hdac_bus *bus) @@ -479,7 +480,7 @@ bool snd_hdac_bus_init_chip(struct hdac_bus *bus, bool full_reset) return false;
/* reset controller */ - azx_reset(bus, full_reset); + snd_hdac_bus_reset_link(bus, full_reset);
/* clear interrupts */ azx_int_clear(bus); diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c index dce649485649..1d17be0f78a0 100644 --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -834,7 +834,7 @@ static int skl_first_init(struct hdac_bus *bus) return -ENXIO; }
- skl_init_chip(bus, true); + snd_hdac_bus_reset_link(bus, true);
snd_hdac_bus_parse_capabilities(bus);
On Mon, 10 Sep 2018 23:23:58 +0200, Yu Zhao wrote:
Internally, skl_init_chip() calls snd_hdac_bus_init_chip() which
- sets bus->chip_init to prevent multiple entrances before device
is stopped; 2) enables interrupt.
We shouldn't use it for the purpose of resetting device only because
- when we really want to initialize device, we won't be able to do
so; 2) we are ready to handle interrupt yet, and kernel crashes when interrupt comes in.
Rename azx_reset() to snd_hdac_bus_reset_link(), and use it to reset device properly.
Fixes: 60767abcea3d ("ASoC: Intel: Skylake: Reset the controller in probe") Signed-off-by: Yu Zhao yuzhao@google.com
That makes sense.
And I noticed that the legacy HD-audio driver potentially needs the same reset (although we haven't heard of any problem for years).
So now I wonder whether this requirement of reset is really mandatory for the real hardware, or just theoretical...
In anyway,
Reviewed-by: Takashi Iwai tiwai@suse.de
thanks,
Takashi
On Mon, 10 Sep 2018 23:21:50 +0200, Yu Zhao wrote:
In snd_hdac_bus_init_chip(), we enable interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. If irq has been acquired and irq handler uses the dma buffer, kernel may crash when interrupt comes in.
Fix the problem by postponing enabling irq after dma buffer initialization. And warn once on null dma buffer pointer during the initialization.
Signed-off-by: Yu Zhao yuzhao@google.com
Looks good to me.
Reviewed-by: Takashi Iwai tiwai@suse.de
BTW, the reason why this hasn't been hit on the legacy HD-audio driver is that we allocate usually with MSI, so the irq is isolated.
Any reason that Intel SKL driver doesn't use MST?
thanks,
Takashi
On Tue, Sep 11, 2018 at 08:06:49AM +0200, Takashi Iwai wrote:
On Mon, 10 Sep 2018 23:21:50 +0200, Yu Zhao wrote:
In snd_hdac_bus_init_chip(), we enable interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. If irq has been acquired and irq handler uses the dma buffer, kernel may crash when interrupt comes in.
Fix the problem by postponing enabling irq after dma buffer initialization. And warn once on null dma buffer pointer during the initialization.
Signed-off-by: Yu Zhao yuzhao@google.com
Looks good to me.
Reviewed-by: Takashi Iwai tiwai@suse.de
BTW, the reason why this hasn't been hit on the legacy HD-audio driver is that we allocate usually with MSI, so the irq is isolated.
Any reason that Intel SKL driver doesn't use MST?
This I'm not sure. Vinod might have answer to it, according to https://patchwork.kernel.org/patch/6375831/#13796611
On 11-09-18, 14:58, Yu Zhao wrote:
On Tue, Sep 11, 2018 at 08:06:49AM +0200, Takashi Iwai wrote:
On Mon, 10 Sep 2018 23:21:50 +0200, Yu Zhao wrote:
In snd_hdac_bus_init_chip(), we enable interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. If irq has been acquired and irq handler uses the dma buffer, kernel may crash when interrupt comes in.
Fix the problem by postponing enabling irq after dma buffer initialization. And warn once on null dma buffer pointer during the initialization.
Signed-off-by: Yu Zhao yuzhao@google.com
Looks good to me.
Reviewed-by: Takashi Iwai tiwai@suse.de
BTW, the reason why this hasn't been hit on the legacy HD-audio driver is that we allocate usually with MSI, so the irq is isolated.
Any reason that Intel SKL driver doesn't use MST?
This I'm not sure. Vinod might have answer to it, according to https://patchwork.kernel.org/patch/6375831/#13796611
IIRC (seemed quite some time back) we faced issues with using MSI on SKL and didnt try afterwards. If Intel folks can try it and check. Pierre is out, maybe Liam can help..?
On Mon, 10 Sep 2018 23:17:18 +0200, Yu Zhao wrote:
This reverts commit 12eeeb4f4733bbc4481d01df35933fc15beb8b19.
The patch doesn't fix accessing memory with null pointer in skl_interrupt().
There are two problems: 1) skl_init_chip() is called twice, before and after dma buffer is allocate. The first call sets bus->chip_init which prevents the second from initializing bus->corb.buf and rirb.buf from bus->rb.area. 2) snd_hdac_bus_init_chip() enables interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. There is a small window which skl_interrupt() can be called if irq has been acquired. If so, it crashes when using null dma buffer pointers.
Will fix the problems in the following patches. Also attaching the crash for future reference.
[ 16.949148] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI [ 16.950829] gsmi: Log Shutdown Reason 0x03 [ 16.950830] Modules linked in: uvcvideo(+) videobuf2_vmalloc snd_soc_skl(+) videobuf2_memops videobuf2_v4l2 videobuf2_core snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi snd_hda_ext_core snd_hda_core snd_soc_max98357a acpi_als snd_soc_da7219 lzo lzo_compress zram snd_seq_dummy snd_seq snd_seq_device bridge stp llc ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_mark fuse cfg80211 iio_trig_sysfs cros_ec_sensors cros_ec_sensors_ring cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf industrialio smsc95xx usbnet mii joydev [ 16.950874] CPU: 0 PID: 1083 Comm: chrome Not tainted 4.14.64 #14 [ 16.950875] Hardware name: Google Yorp/Yorp, BIOS Google_Yorp.10985.0.2018_08_20_1648 08/17/2018 [ 16.950878] task: ffff88015b1c2b80 task.stack: ffff880155f30000 [ 16.950887] RIP: 0010:snd_hdac_bus_update_rirb+0x19b/0x4cf [snd_hda_core] [ 16.950889] RSP: 0000:ffff88015c807c08 EFLAGS: 00010003 [ 16.950891] RAX: 0000000000000101 RBX: 000000000000080c RCX: 1ffff10026822185 [ 16.950893] RDX: dffffc0000000000 RSI: ffff88015b1c2b80 RDI: ffffc90000514058 [ 16.950894] RBP: ffff88015c807c68 R08: 0000000000000000 R09: 0000000000000000 [ 16.950895] R10: 0000000000000000 R11: ffffffffc043074f R12: 0000000000000800 [ 16.950897] R13: 0000000000000001 R14: 0000000000000002 R15: 1ffff10026822119 [ 16.950899] FS: 00007d85924cc740(0000) GS:ffff88015c800000(0000) knlGS:0000000000000000 [ 16.950900] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.950902] CR2: 000058a54df16038 CR3: 00000001551c0000 CR4: 00000000003406f0 [ 16.950903] Call Trace: [ 16.950906] <IRQ> [ 16.950918] skl_interrupt+0x19e/0x2d6 [snd_soc_skl] [ 16.950926] ? dma_supported+0xb5/0xb5 [snd_soc_skl] [ 16.950933] __handle_irq_event_percpu+0x27a/0x6c8 [ 16.950937] ? __irq_wake_thread+0x1d1/0x1d1 [ 16.950942] ? __do_softirq+0x57a/0x69e [ 16.950944] handle_irq_event_percpu+0x95/0x1ba [ 16.950948] ? _raw_spin_unlock+0x65/0xdc [ 16.950951] ? __handle_irq_event_percpu+0x6c8/0x6c8 [ 16.950953] ? _raw_spin_unlock+0x65/0xdc [ 16.950957] ? time_cpufreq_notifier+0x483/0x483 [ 16.950959] handle_irq_event+0x89/0x123 [ 16.950962] handle_fasteoi_irq+0x16f/0x425 [ 16.950965] handle_irq+0x1fe/0x28e [ 16.950969] do_IRQ+0x6e/0x12e [ 16.950972] common_interrupt+0x7a/0x7a [ 16.950974] </IRQ> [ 16.950976] RIP: 0033:0x58097f61a5c0 [ 16.950978] RSP: 002b:00007ffe95c971a8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffbc [ 16.950980] RAX: 000058097f61a5c0 RBX: 00000e4ac5220560 RCX: 0000000000004e10 [ 16.950982] RDX: 000058098563df20 RSI: 00007ffe95c97250 RDI: 00000e4ac5220500 [ 16.950983] RBP: 00007ffe95c97410 R08: 0000000000000000 R09: 00007ffe95c97250 [ 16.950984] R10: 0000000000000000 R11: 0000000000000000 R12: 00000e4ac5220560 [ 16.950986] R13: 00000e4ac5220560 R14: 00000e4ac47c9650 R15: 0000580987646350 [ 16.950988] Code: 74 12 48 89 df e8 eb 2d 8e cd 48 ba 00 00 00 00 00 fc ff df 4c 8b 23 44 89 f0 83 c8 01 0f b7 c0 49 8d 1c 84 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 da 01 00 00 44 8b 3b 41 0f b7 c6 49 8d 1c [ 16.951031] RIP: snd_hdac_bus_update_rirb+0x19b/0x4cf [snd_hda_core] RSP: ffff88015c807c08 [ 16.951036] ---[ end trace 58bf9ece1775bc92 ]--- [ 16.956871] Kernel panic - not syncing: Fatal exception in interrupt [ 16.956888] Kernel Offset: 0xc800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Please try to rip off uninteresting hex values and other stuff there.
Signed-off-by: Yu Zhao yuzhao@google.com
Put Fixes tag.
thanks,
Takashi
On Tue, Sep 11, 2018 at 08:03:21AM +0200, Takashi Iwai wrote:
Yu Zhao wrote:
Will fix the problems in the following patches. Also attaching the crash for future reference.
[ 16.950887] RIP: 0010:snd_hdac_bus_update_rirb+0x19b/0x4cf [snd_hda_core] [ 16.950889] RSP: 0000:ffff88015c807c08 EFLAGS: 00010003 [ 16.950891] RAX: 0000000000000101 RBX: 000000000000080c RCX: 1ffff10026822185
Please try to rip off uninteresting hex values and other stuff there.
Right:
Please think hard before including complete backtraces in upstream reports, they are very large and contain almost no useful information relative to their size so often obscure the relevant content in your message. If part of the backtrace is usefully illustrative then it's usually better to pull out the relevant sections.
On Tue, Sep 11, 2018 at 05:36:36PM +0100, Mark Brown wrote:
On Tue, Sep 11, 2018 at 08:03:21AM +0200, Takashi Iwai wrote:
Yu Zhao wrote:
Will fix the problems in the following patches. Also attaching the crash for future reference.
[ 16.950887] RIP: 0010:snd_hdac_bus_update_rirb+0x19b/0x4cf [snd_hda_core] [ 16.950889] RSP: 0000:ffff88015c807c08 EFLAGS: 00010003 [ 16.950891] RAX: 0000000000000101 RBX: 000000000000080c RCX: 1ffff10026822185
Please try to rip off uninteresting hex values and other stuff there.
Right:
Please think hard before including complete backtraces in upstream reports, they are very large and contain almost no useful information relative to their size so often obscure the relevant content in your message. If part of the backtrace is usefully illustrative then it's usually better to pull out the relevant sections.
Thanks, will do in v2.
This reverts commit 12eeeb4f4733bbc4481d01df35933fc15beb8b19.
The patch doesn't fix accessing memory with null pointer in skl_interrupt().
There are two problems: 1) skl_init_chip() is called twice, before and after dma buffer is allocate. The first call sets bus->chip_init which prevents the second from initializing bus->corb.buf and rirb.buf from bus->rb.area. 2) snd_hdac_bus_init_chip() enables interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. There is a small window which skl_interrupt() can be called if irq has been acquired. If so, it crashes when using null dma buffer pointers.
Will fix the problems in the following patches. Also attaching the crash for future reference.
[ 16.949148] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI <snipped> [ 16.950903] Call Trace: [ 16.950906] <IRQ> [ 16.950918] skl_interrupt+0x19e/0x2d6 [snd_soc_skl] [ 16.950926] ? dma_supported+0xb5/0xb5 [snd_soc_skl] [ 16.950933] __handle_irq_event_percpu+0x27a/0x6c8 [ 16.950937] ? __irq_wake_thread+0x1d1/0x1d1 [ 16.950942] ? __do_softirq+0x57a/0x69e [ 16.950944] handle_irq_event_percpu+0x95/0x1ba [ 16.950948] ? _raw_spin_unlock+0x65/0xdc [ 16.950951] ? __handle_irq_event_percpu+0x6c8/0x6c8 [ 16.950953] ? _raw_spin_unlock+0x65/0xdc [ 16.950957] ? time_cpufreq_notifier+0x483/0x483 [ 16.950959] handle_irq_event+0x89/0x123 [ 16.950962] handle_fasteoi_irq+0x16f/0x425 [ 16.950965] handle_irq+0x1fe/0x28e [ 16.950969] do_IRQ+0x6e/0x12e [ 16.950972] common_interrupt+0x7a/0x7a [ 16.950974] </IRQ> <snipped> [ 16.951031] RIP: snd_hdac_bus_update_rirb+0x19b/0x4cf [snd_hda_core] RSP: ffff88015c807c08 [ 16.951036] ---[ end trace 58bf9ece1775bc92 ]---
Fixes: 2eeeb4f4733b ("ASoC: Intel: Skylake: Acquire irq after RIRB allocation") Signed-off-by: Yu Zhao yuzhao@google.com --- sound/soc/intel/skylake/skl.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c index e7fd14daeb4f..d174cbe35f7a 100644 --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -912,7 +912,11 @@ static int skl_first_init(struct hdac_bus *bus)
snd_hdac_bus_parse_capabilities(bus);
+ if (skl_acquire_irq(bus, 0) < 0) + return -EBUSY; + pci_set_master(pci); + synchronize_irq(bus->irq);
gcap = snd_hdac_chip_readw(bus, GCAP); dev_dbg(bus->dev, "chipset global capabilities = 0x%x\n", gcap); @@ -945,12 +949,6 @@ static int skl_first_init(struct hdac_bus *bus) if (err < 0) return err;
- err = skl_acquire_irq(bus, 0); - if (err < 0) - return err; - - synchronize_irq(bus->irq); - /* initialize chip */ skl_init_pci(skl);
In snd_hdac_bus_init_chip(), we enable interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. If irq has been acquired and irq handler uses the dma buffer, kernel may crash when interrupt comes in.
Fix the problem by postponing enabling irq after dma buffer initialization. And warn once on null dma buffer pointer during the initialization.
Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Yu Zhao yuzhao@google.com --- sound/hda/hdac_controller.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index 560ec0986e1a..11057d9f84ec 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -40,6 +40,8 @@ static void azx_clear_corbrp(struct hdac_bus *bus) */ void snd_hdac_bus_init_cmd_io(struct hdac_bus *bus) { + WARN_ON_ONCE(!bus->rb.area); + spin_lock_irq(&bus->reg_lock); /* CORB set up */ bus->corb.addr = bus->rb.addr; @@ -479,13 +481,15 @@ bool snd_hdac_bus_init_chip(struct hdac_bus *bus, bool full_reset) /* reset controller */ azx_reset(bus, full_reset);
- /* initialize interrupts */ + /* clear interrupts */ azx_int_clear(bus); - azx_int_enable(bus);
/* initialize the codec command I/O */ snd_hdac_bus_init_cmd_io(bus);
+ /* enable interrupts after CORB/RIRB buffers are initialized above */ + azx_int_enable(bus); + /* program the position buffer */ if (bus->use_posbuf && bus->posbuf.addr) { snd_hdac_chip_writel(bus, DPLBASE, (u32)bus->posbuf.addr);
Internally, skl_init_chip() calls snd_hdac_bus_init_chip() which 1) sets bus->chip_init to prevent multiple entrances before device is stopped; 2) enables interrupt.
We shouldn't use it for the purpose of resetting device only because 1) when we really want to initialize device, we won't be able to do so; 2) we are ready to handle interrupt yet, and kernel crashes when interrupt comes in.
Rename azx_reset() to snd_hdac_bus_reset_link(), and use it to reset device properly.
Fixes: 60767abcea3d ("ASoC: Intel: Skylake: Reset the controller in probe") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Yu Zhao yuzhao@google.com --- include/sound/hdaudio.h | 1 + sound/hda/hdac_controller.c | 7 ++++--- sound/soc/intel/skylake/skl.c | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/sound/hdaudio.h b/include/sound/hdaudio.h index 6f1e1f3b3063..cd1773d0e08f 100644 --- a/include/sound/hdaudio.h +++ b/include/sound/hdaudio.h @@ -412,6 +412,7 @@ void snd_hdac_bus_init_cmd_io(struct hdac_bus *bus); void snd_hdac_bus_stop_cmd_io(struct hdac_bus *bus); void snd_hdac_bus_enter_link_reset(struct hdac_bus *bus); void snd_hdac_bus_exit_link_reset(struct hdac_bus *bus); +int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset);
void snd_hdac_bus_update_rirb(struct hdac_bus *bus); int snd_hdac_bus_handle_stream_irq(struct hdac_bus *bus, unsigned int status, diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index 11057d9f84ec..74244d8e2909 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -385,7 +385,7 @@ void snd_hdac_bus_exit_link_reset(struct hdac_bus *bus) EXPORT_SYMBOL_GPL(snd_hdac_bus_exit_link_reset);
/* reset codec link */ -static int azx_reset(struct hdac_bus *bus, bool full_reset) +int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset) { if (!full_reset) goto skip_reset; @@ -410,7 +410,7 @@ static int azx_reset(struct hdac_bus *bus, bool full_reset) skip_reset: /* check to see if controller is ready */ if (!snd_hdac_chip_readb(bus, GCTL)) { - dev_dbg(bus->dev, "azx_reset: controller not ready!\n"); + dev_dbg(bus->dev, "controller not ready!\n"); return -EBUSY; }
@@ -425,6 +425,7 @@ static int azx_reset(struct hdac_bus *bus, bool full_reset)
return 0; } +EXPORT_SYMBOL_GPL(snd_hdac_bus_reset_link);
/* enable interrupts */ static void azx_int_enable(struct hdac_bus *bus) @@ -479,7 +480,7 @@ bool snd_hdac_bus_init_chip(struct hdac_bus *bus, bool full_reset) return false;
/* reset controller */ - azx_reset(bus, full_reset); + snd_hdac_bus_reset_link(bus, full_reset);
/* clear interrupts */ azx_int_clear(bus); diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c index d174cbe35f7a..29225623b4b4 100644 --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -908,7 +908,7 @@ static int skl_first_init(struct hdac_bus *bus) return -ENXIO; }
- skl_init_chip(bus, true); + snd_hdac_bus_reset_link(bus, true);
snd_hdac_bus_parse_capabilities(bus);
On Tue, Sep 11, 2018 at 03:12:46PM -0600, Yu Zhao wrote:
This reverts commit 12eeeb4f4733bbc4481d01df35933fc15beb8b19.
The patch doesn't fix accessing memory with null pointer in skl_interrupt().
Please use normal subject lines and so on even for revert patches - they're patches just the same as any other.
On Wed, Sep 12, 2018 at 11:20:20AM +0100, Mark Brown wrote:
On Tue, Sep 11, 2018 at 03:12:46PM -0600, Yu Zhao wrote:
This reverts commit 12eeeb4f4733bbc4481d01df35933fc15beb8b19.
The patch doesn't fix accessing memory with null pointer in skl_interrupt().
Please use normal subject lines and so on even for revert patches - they're patches just the same as any other.
Well, they seem pretty normal to me, and have been used by many:
commit 82ab86e82911107ead6fc7cd73568f75bc266a57 Author: Mark Brown broonie@kernel.org Date: Mon Oct 30 19:40:25 2017 +0000
Revert "ASoC: rt5651: Enable jack detection on JD* pins"
This reverts commit 60d5a1a47b9a8381c08d2263b11ac9c757c87746.
Signed-off-by: Mark Brown broonie@kernel.org
Anyway, I'll update them in v3.
On Wed, Sep 12, 2018 at 01:32:21PM -0600, Yu Zhao wrote:
On Wed, Sep 12, 2018 at 11:20:20AM +0100, Mark Brown wrote:
Please use normal subject lines and so on even for revert patches - they're patches just the same as any other.
Well, they seem pretty normal to me, and have been used by many:
Many people do them but that doesn't mean they're good practice!
This reverts commit 12eeeb4f4733bbc4481d01df35933fc15beb8b19.
The patch claims it fixes accessing memory with null pointer on skl_interrupt() and snd_hdac_bus_update_rirb() path, but in fact it has no effect.
There are two problems: 1) skl_init_chip() is called twice, before and after dma buffer is allocate. The first call sets bus->chip_init which prevents the second from initializing bus->corb.buf and rirb.buf from bus->rb.area. 2) snd_hdac_bus_init_chip() enables interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. There is a small window which skl_interrupt() can be called if irq has been acquired. If so, it crashes when using null dma buffer pointers.
Will fix the problems in the following patches. Also attaching the crash for future reference.
[ 16.949148] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI <snipped> [ 16.950903] Call Trace: [ 16.950906] <IRQ> [ 16.950918] skl_interrupt+0x19e/0x2d6 [snd_soc_skl] [ 16.950926] ? dma_supported+0xb5/0xb5 [snd_soc_skl] [ 16.950933] __handle_irq_event_percpu+0x27a/0x6c8 [ 16.950937] ? __irq_wake_thread+0x1d1/0x1d1 [ 16.950942] ? __do_softirq+0x57a/0x69e [ 16.950944] handle_irq_event_percpu+0x95/0x1ba [ 16.950948] ? _raw_spin_unlock+0x65/0xdc [ 16.950951] ? __handle_irq_event_percpu+0x6c8/0x6c8 [ 16.950953] ? _raw_spin_unlock+0x65/0xdc [ 16.950957] ? time_cpufreq_notifier+0x483/0x483 [ 16.950959] handle_irq_event+0x89/0x123 [ 16.950962] handle_fasteoi_irq+0x16f/0x425 [ 16.950965] handle_irq+0x1fe/0x28e [ 16.950969] do_IRQ+0x6e/0x12e [ 16.950972] common_interrupt+0x7a/0x7a [ 16.950974] </IRQ> <snipped> [ 16.951031] RIP: snd_hdac_bus_update_rirb+0x19b/0x4cf [snd_hda_core] RSP: ffff88015c807c08 [ 16.951036] ---[ end trace 58bf9ece1775bc92 ]---
Fixes: 2eeeb4f4733b ("ASoC: Intel: Skylake: Acquire irq after RIRB allocation") Signed-off-by: Yu Zhao yuzhao@google.com --- sound/soc/intel/skylake/skl.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c index e7fd14daeb4f..d174cbe35f7a 100644 --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -912,7 +912,11 @@ static int skl_first_init(struct hdac_bus *bus)
snd_hdac_bus_parse_capabilities(bus);
+ if (skl_acquire_irq(bus, 0) < 0) + return -EBUSY; + pci_set_master(pci); + synchronize_irq(bus->irq);
gcap = snd_hdac_chip_readw(bus, GCAP); dev_dbg(bus->dev, "chipset global capabilities = 0x%x\n", gcap); @@ -945,12 +949,6 @@ static int skl_first_init(struct hdac_bus *bus) if (err < 0) return err;
- err = skl_acquire_irq(bus, 0); - if (err < 0) - return err; - - synchronize_irq(bus->irq); - /* initialize chip */ skl_init_pci(skl);
In snd_hdac_bus_init_chip(), we enable interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. If irq has been acquired and irq handler uses the dma buffer, kernel may crash when interrupt comes in.
Fix the problem by postponing enabling irq after dma buffer initialization. And warn once on null dma buffer pointer during the initialization.
Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Yu Zhao yuzhao@google.com --- sound/hda/hdac_controller.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index 560ec0986e1a..11057d9f84ec 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -40,6 +40,8 @@ static void azx_clear_corbrp(struct hdac_bus *bus) */ void snd_hdac_bus_init_cmd_io(struct hdac_bus *bus) { + WARN_ON_ONCE(!bus->rb.area); + spin_lock_irq(&bus->reg_lock); /* CORB set up */ bus->corb.addr = bus->rb.addr; @@ -479,13 +481,15 @@ bool snd_hdac_bus_init_chip(struct hdac_bus *bus, bool full_reset) /* reset controller */ azx_reset(bus, full_reset);
- /* initialize interrupts */ + /* clear interrupts */ azx_int_clear(bus); - azx_int_enable(bus);
/* initialize the codec command I/O */ snd_hdac_bus_init_cmd_io(bus);
+ /* enable interrupts after CORB/RIRB buffers are initialized above */ + azx_int_enable(bus); + /* program the position buffer */ if (bus->use_posbuf && bus->posbuf.addr) { snd_hdac_chip_writel(bus, DPLBASE, (u32)bus->posbuf.addr);
Internally, skl_init_chip() calls snd_hdac_bus_init_chip() which 1) sets bus->chip_init to prevent multiple entrances before device is stopped; 2) enables interrupt.
We shouldn't use it for the purpose of resetting device only because 1) when we really want to initialize device, we won't be able to do so; 2) we are ready to handle interrupt yet, and kernel crashes when interrupt comes in.
Rename azx_reset() to snd_hdac_bus_reset_link(), and use it to reset device properly.
Fixes: 60767abcea3d ("ASoC: Intel: Skylake: Reset the controller in probe") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Yu Zhao yuzhao@google.com --- include/sound/hdaudio.h | 1 + sound/hda/hdac_controller.c | 7 ++++--- sound/soc/intel/skylake/skl.c | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/sound/hdaudio.h b/include/sound/hdaudio.h index 6f1e1f3b3063..cd1773d0e08f 100644 --- a/include/sound/hdaudio.h +++ b/include/sound/hdaudio.h @@ -412,6 +412,7 @@ void snd_hdac_bus_init_cmd_io(struct hdac_bus *bus); void snd_hdac_bus_stop_cmd_io(struct hdac_bus *bus); void snd_hdac_bus_enter_link_reset(struct hdac_bus *bus); void snd_hdac_bus_exit_link_reset(struct hdac_bus *bus); +int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset);
void snd_hdac_bus_update_rirb(struct hdac_bus *bus); int snd_hdac_bus_handle_stream_irq(struct hdac_bus *bus, unsigned int status, diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index 11057d9f84ec..74244d8e2909 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -385,7 +385,7 @@ void snd_hdac_bus_exit_link_reset(struct hdac_bus *bus) EXPORT_SYMBOL_GPL(snd_hdac_bus_exit_link_reset);
/* reset codec link */ -static int azx_reset(struct hdac_bus *bus, bool full_reset) +int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset) { if (!full_reset) goto skip_reset; @@ -410,7 +410,7 @@ static int azx_reset(struct hdac_bus *bus, bool full_reset) skip_reset: /* check to see if controller is ready */ if (!snd_hdac_chip_readb(bus, GCTL)) { - dev_dbg(bus->dev, "azx_reset: controller not ready!\n"); + dev_dbg(bus->dev, "controller not ready!\n"); return -EBUSY; }
@@ -425,6 +425,7 @@ static int azx_reset(struct hdac_bus *bus, bool full_reset)
return 0; } +EXPORT_SYMBOL_GPL(snd_hdac_bus_reset_link);
/* enable interrupts */ static void azx_int_enable(struct hdac_bus *bus) @@ -479,7 +480,7 @@ bool snd_hdac_bus_init_chip(struct hdac_bus *bus, bool full_reset) return false;
/* reset controller */ - azx_reset(bus, full_reset); + snd_hdac_bus_reset_link(bus, full_reset);
/* clear interrupts */ azx_int_clear(bus); diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c index d174cbe35f7a..29225623b4b4 100644 --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -908,7 +908,7 @@ static int skl_first_init(struct hdac_bus *bus) return -ENXIO; }
- skl_init_chip(bus, true); + snd_hdac_bus_reset_link(bus, true);
snd_hdac_bus_parse_capabilities(bus);
On Wed, Sep 12, 2018 at 01:44:41PM -0600, Yu Zhao wrote:
In snd_hdac_bus_init_chip(), we enable interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. If irq has been acquired and irq handler uses the dma buffer, kernel may crash when interrupt comes in.
sound/hda/hdac_controller.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
This is an ALSA commit, not an ASoC one.
participants (4)
-
Mark Brown
-
Takashi Iwai
-
Vinod
-
Yu Zhao