[alsa-devel] [PATCH] sound/usb: fix to release stream resources from media_snd_device_delete()

Shuah Khan shuahkh at osg.samsung.com
Tue Mar 22 14:03:42 CET 2016


On 03/21/2016 10:01 PM, Shuah Khan wrote:
> On 03/19/2016 07:31 AM, Shuah Khan wrote:
>> On 03/19/2016 06:10 AM, Mauro Carvalho Chehab wrote:
>>> Em Fri, 18 Mar 2016 20:50:31 -0600
>>> Shuah Khan <shuahkh at osg.samsung.com> escreveu:
>>>
>>>> Fix to release stream resources from media_snd_device_delete() before
>>>> media device is unregistered. Without this change, stream resource free
>>>> is attempted after the media device is unregistered which would result
>>>> in use-after-free errors.
>>>>
>>>> Signed-off-by: Shuah Khan <shuahkh at osg.samsung.com>
>>>> ---
>>>>
>>>> - Ran bind/unbind loop (1000 iteration) test on snd-usb-audio
>>>>   while running mc_nextgen_test loop (1000 iterations) in parallel.
>>>> - Ran bind/unbind and rmmod/modprobe tests on both drivers. Also
>>>>   generated graphs when after bind/unbind, rmmod/modprobe. Graphs
>>>>   look good.
>>>> - Note: Please apply the following patch to fix memory leak:
>>>>   sound/usb: Fix memory leak in media_snd_stream_delete() during unbind
>>>>   https://lkml.org/lkml/2016/3/16/1050
>>>
>>> Yeah, a way better!
>>>
>>> For normal bind/unbind, it seems to be working fine. Also
>>> for driver's rmmod, so:
>>>
>>> Tested-by: Mauro Carvalho Chehab <mchehab at osg.samsung.com>
>>>
>>> PS.:
>>> ===
>>>
>>> There are still some troubles if we run an infinite loop
>>> binding/unbinding au0828 and another one binding/unbinding
>>> snd-usb-audio, like this one:
>>
>> Yes. I noticed this one when I was running a loop of 1000 on au0828.
>> I couldn't reproduce this when I tested this patch.
>>
>> P.S: au08282 loops takes longer btw since au0828 initialization is lot more
>> complex. We have to look at this one.
>>
>>>
>>>
>>> [   91.556188] ------------[ cut here ]------------
>>> [   91.556500] WARNING: CPU: 1 PID: 2920 at drivers/media/media-entity.c:392 __media_entity_pipeline_start+0x271/0xce0 [media]()
>>> [   91.556626] Modules linked in: ir_lirc_codec lirc_dev au8522_dig xc5000 tuner au8522_decoder au8522_common au0828 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core tveeprom dvb_core rc_core v4l2_common videodev snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device media cpufreq_powersave cpufreq_conservative cpufreq_userspace cpufreq_stats parport_pc ppdev lp parport snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iTCO_wdt kvm iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha256_ssse3 sha256_generic hmac drbg snd_hda_codec_realtek i915 snd_hda_codec_generic aesni_intel aes_x86_64 btusb lrw btrtl gf128mul snd_hda_intel glue_helper ablk_helper btbcm btintel cryptd psmouse snd_hda_codec bluetooth
>>> [   91.556693]  snd_hwdep i2c_algo_bit sg snd_hda_core serio_raw pcspkr evdev drm_kms_helper snd_pcm rfkill drm snd_timer mei_me snd i2c_i801 soundcore lpc_ich mei mfd_core battery video dw_dmac i2c_designware_platform dw_dmac_core i2c_designware_core acpi_pad button tpm_tis tpm ext4 crc16 mbcache jbd2 dm_mod sd_mod hid_generic usbhid ahci libahci libata e1000e xhci_pci ptp scsi_mod ehci_pci ehci_hcd pps_core xhci_hcd fan thermal sdhci_acpi sdhci mmc_core i2c_hid hid
>>> [   91.556748] CPU: 1 PID: 2920 Comm: v4l_id Tainted: G      D W       4.5.0+ #62
>>> [   91.556751] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0350.2015.0812.1722 08/12/2015
>>> [   91.556754]  ffffffffa0e247a0 ffff8803a3d17b08 ffffffff819447c1 0000000000000000
>>> [   91.556759]  ffff88009bbe17c0 ffff8803a3d17b48 ffffffff8114fd16 ffffffffa0e20651
>>> [   91.556763]  ffff8803a47c9c58 ffff8803a477d2c8 ffff8803a5ac96f8 dffffc0000000000
>>> [   91.556768] Call Trace:
>>> [   91.556774]  [<ffffffff819447c1>] dump_stack+0x85/0xc4
>>> [   91.556778]  [<ffffffff8114fd16>] warn_slowpath_common+0xc6/0x120
>>> [   91.556783]  [<ffffffffa0e20651>] ? __media_entity_pipeline_start+0x271/0xce0 [media]
>>> [   91.556786]  [<ffffffff8114feea>] warn_slowpath_null+0x1a/0x20
>>> [   91.556790]  [<ffffffffa0e20651>] __media_entity_pipeline_start+0x271/0xce0 [media]
>>> [   91.556794]  [<ffffffff822d9aca>] ? __schedule+0x78a/0x2570
>>> [   91.556797]  [<ffffffff8156e428>] ? memset+0x28/0x30
>>> [   91.556802]  [<ffffffffa0e203e0>] ? media_entity_pipeline_stop+0x60/0x60 [media]
>>> [   91.556806]  [<ffffffff8124a73d>] ? trace_hardirqs_on+0xd/0x10
>>> [   91.556810]  [<ffffffffa1430a05>] ? au0828_enable_source+0x55/0x9f0 [au0828]
>>> [   91.556813]  [<ffffffff822ddb20>] ? mutex_trylock+0x400/0x400
>>> [   91.556818]  [<ffffffffa1440833>] ? au0828_v4l2_close+0xb3/0x590 [au0828]
>>> [   91.556822]  [<ffffffffa1430da4>] au0828_enable_source+0x3f4/0x9f0 [au0828]
>>> [   91.556824]  [<ffffffff822ddb20>] ? mutex_trylock+0x400/0x400
>>> [   91.556834]  [<ffffffffa133acf6>] v4l_enable_media_source+0x66/0x90 [videodev]
>>> [   91.556839]  [<ffffffffa14409da>] au0828_v4l2_close+0x25a/0x590 [au0828]
>>> [   91.556846]  [<ffffffffa1301270>] v4l2_release+0xf0/0x210 [videodev]
>>> [   91.556849]  [<ffffffff815c2c4f>] __fput+0x20f/0x6d0
>>> [   91.556853]  [<ffffffff815c317e>] ____fput+0xe/0x10
>>> [   91.556856]  [<ffffffff811acde7>] task_work_run+0x137/0x200
>>> [   91.556859]  [<ffffffff81005d54>] exit_to_usermode_loop+0x154/0x180
>>> [   91.556863]  [<ffffffff8124a1b6>] ? trace_hardirqs_on_caller+0x16/0x590
>>> [   91.556866]  [<ffffffff810073a6>] syscall_return_slowpath+0x186/0x1c0
>>> [   91.556869]  [<ffffffff822e7a1c>] entry_SYSCALL_64_fastpath+0xbf/0xc1
>>> [   91.556872] ---[ end trace 3e95e11ff0b9efad ]---
>>>
>>> I suspect that something is not properly protected by the graph mutex at
>>> either ALSA or au0828 driver.
>>>
>>> I also got a GPF:
>>>
>>> [   95.398931] au0828: Start Pipeline: Xceive XC5000->au0828a vbi Error -16
>>> [   95.398936] au0828: Deactivate link Error 0
>>> [   95.398943] tuner 6-0061: Putting tuner to sleep
>>> [   95.398960] kasan: CONFIG_KASAN_INLINE enabled
>>> [   95.398962] kasan: GPF could be caused by NULL-ptr deref or user memory access
>>> [   95.398967] general protection fault: 0000 [#2] SMP KASAN
>>> [   95.399020] Modules linked in: ir_lirc_codec lirc_dev au8522_dig xc5000 tuner au8522_decoder au8522_common au0828 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core tveeprom dvb_core rc_core v4l2_common videodev snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device media cpufreq_powersave cpufreq_conservative cpufreq_userspace cpufreq_stats parport_pc ppdev lp parport snd_hda_codec_hdmi intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iTCO_wdt kvm iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha256_ssse3 sha256_generic hmac drbg snd_hda_codec_realtek i915 snd_hda_codec_generic aesni_intel aes_x86_64 btusb lrw btrtl gf128mul snd_hda_intel glue_helper ablk_helper btbcm btintel cryptd psmouse snd_hda_codec bluetooth
>>> [   95.399853]  snd_hwdep i2c_algo_bit sg snd_hda_core serio_raw pcspkr evdev drm_kms_helper snd_pcm rfkill drm snd_timer mei_me snd i2c_i801 soundcore lpc_ich mei mfd_core battery video dw_dmac i2c_designware_platform dw_dmac_core i2c_designware_core acpi_pad button tpm_tis tpm ext4 crc16 mbcache jbd2 dm_mod sd_mod hid_generic usbhid ahci libahci libata e1000e xhci_pci ptp scsi_mod ehci_pci ehci_hcd pps_core xhci_hcd fan thermal sdhci_acpi sdhci mmc_core i2c_hid hid
>>> [   95.400389] CPU: 2 PID: 2966 Comm: v4l_id Tainted: G      D W       4.5.0+ #62
>>> [   95.400455] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0350.2015.0812.1722 08/12/2015
>>> [   95.400541] task: ffff88009bbe4740 ti: ffff8803a45a0000 task.ti: ffff8803a45a0000
>>> [   95.400610] RIP: 0010:[<ffffffff81db0aa4>]  [<ffffffff81db0aa4>] usb_set_interface+0x34/0x9c0
>>> [   95.400694] RSP: 0018:ffff8803a45a7d18  EFLAGS: 00010202
>>> [   95.400744] RAX: dffffc0000000000 RBX: ffff8803a3cbab2c RCX: 1ffff10006ba18e2
>>> [   95.400809] RDX: 0000000000000008 RSI: 0000000000000000 RDI: 0000000000000040
>>> [   95.400873] RBP: ffff8803a45a7d88 R08: 0000000000000001 R09: 0000000000000001
>>> [   95.400938] R10: ffff8803a4414a38 R11: 0000000000000000 R12: ffff8803bc482400
>>> [   95.401004] R13: dffffc0000000000 R14: ffff8803a3cb8000 R15: 0000000000000000
>>> [   95.401069] FS:  00007f19fbd19700(0000) GS:ffff8803c6900000(0000) knlGS:0000000000000000
>>> [   95.401141] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [   95.401190] CR2: 0000558300d51d44 CR3: 000000009af29000 CR4: 00000000003406e0
>>> [   95.401255] Stack:
>>> [   95.401277]  ffff8803a5b5ea20 ffff8803a45a7d40 ffffffffa1490f0f 0000000000000061
>>> [   95.401358]  ffff8803a5b5e510 ffff8803a45a7d58 ffffffffa147003c ffff880300000000
>>> [   95.401438]  ffff880300000000 ffff8803a3cbab2c ffff8803bc482400 dffffc0000000000
>>> [   95.401518] Call Trace:
>>> [   95.401547]  [<ffffffffa1490f0f>] ? xc5000_sleep+0x8f/0xd0 [xc5000]
>>> [   95.405722]  [<ffffffffa147003c>] ? fe_standby+0x3c/0x50 [tuner]
>>> [   95.409842]  [<ffffffffa1440b20>] au0828_v4l2_close+0x3a0/0x590 [au0828]
>>> [   95.412140]  [<ffffffffa1301270>] v4l2_release+0xf0/0x210 [videodev]
>>> [   95.414119]  [<ffffffff815c2c4f>] __fput+0x20f/0x6d0
>>> [   95.416046]  [<ffffffff815c317e>] ____fput+0xe/0x10
>>> [   95.417879]  [<ffffffff811acde7>] task_work_run+0x137/0x200
>>> [   95.419656]  [<ffffffff81005d54>] exit_to_usermode_loop+0x154/0x180
>>> [   95.421420]  [<ffffffff8124a1b6>] ? trace_hardirqs_on_caller+0x16/0x590
>>> [   95.423177]  [<ffffffff810073a6>] syscall_return_slowpath+0x186/0x1c0
>>> [   95.424911]  [<ffffffff822e7a1c>] entry_SYSCALL_64_fastpath+0xbf/0xc1
>>> [   95.426629] Code: 00 00 00 00 fc ff df 48 89 e5 41 57 41 56 41 55 41 54 49 89 ff 53 48 83 c7 40 48 83 ec 48 89 55 c8 48 89 fa 48 c1 ea 03 89 75 d0 <80> 3c 02 00 0f 85 67 07 00 00 49 8d 7f 18 48 b8 00 00 00 00 00 
>>> [   95.428560] RIP  [<ffffffff81db0aa4>] usb_set_interface+0x34/0x9c0
>>> [   95.430358]  RSP <ffff8803a45a7d18>
>>> [   95.432566] ---[ end trace 3e95e11ff0b9efaf ]---
>>>
>>> As nobody expects someone to do infinite loops binding/unbinding
>>> both drivers, I guess it is ok to keep it for Kernel 4.6, but
>>> we need to find a way fix those locking issues for dynamic graph
>>> changes.
>>>
>>
>> Yes. Right. We have to look at these closely. I agree nobody runs a stress
>> bin/ubind and rmmod/modprobe like we are doing. I think there is some lock
>> and race type issue in au0828 v4l2 release path. I have had to fix some issues
>> there in the past. I agree with you that we are good for 4.6-rc1 and I can 
>> continue to look at this. We can get this fixed for rc2 or rc3.
>>
> 
> I isolated this problem to an existing race condition between au0828_v4l2_open()
> and  au0828_v4l2_close(). It won't show up in normal scenarios. The bind/unbind
> loop test triggers it.
> 
> au0828_v4l2_close() check for dev_state == DEV_DISCONNECTED will fail to
> detect the device disconnected state correctly, if au0828_v4l2_open() runs
> to set the DEV_INITIALIZED bit. A loop test of bind/unbind found this bug
> by increasing the likelihood of au0828_v4l2_open() occurring while unbind
> is in progress. When au0828_v4l2_close() fails to detect that the device
> is in disconnect state, it attempts to power down the device and fails with
> the both of the above problems.
> 
> The second one from xc5000_sleep() would trigger without any of the new
> patches. I am building Linux 4.5 to verify that.
> 

Hi Mauro,

Reproduced the problem on Linux 4.5 release. Here is the log. The patch
I sent https://lkml.org/lkml/2016/3/22/3 should go into stable releases
as well.

Mar 22 06:55:16 anduin kernel: [  883.235413] PM: Removing info for No Bus:video1
Mar 22 06:55:16 anduin kernel: [  883.236165] device: 'vbi0': device_unregister
Mar 22 06:55:16 anduin kernel: [  883.237883] kasan: CONFIG_KASAN_INLINE enabled
Mar 22 06:55:16 anduin kernel: [  883.237893] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] SMP KASAN
Mar 22 06:55:16 anduin kernel: [  883.237988] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables snd_usb_audio snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi ir_lirc_codec snd_seq_device lirc_dev binfmt_misc rc_hauppauge au8522_dig nls_iso8859_1 xc5000 tuner au8522_decoder au8522_common hp_wmi sparse_keymap au0828 ghash_clmulni_intel aesni_intel tveeprom aes_x86_64 ablk_helper dvb_core cryptd lrw gf128mul uvcvideo glue_helper rc_core v4l2_common joydev videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core serio_raw videodev media k10temp i2c_piix4 snd_hda_codec_idt snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec tpm_infineon snd_hda_core snd_hwdep snd_pcm hp_accel lis3lv02d input_polldev mac_hid snd_timer kvm_amd tpm_tis kvm irqbypass parport_pc ppdev lp parport autofs4 p
 l
2303 usbserial hid_generic psmouse usbhid hid radeon i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt firewire_ohci fb_sys_fops sdhci_pci firewire_core sdhci crc_itu_t r8169 drm mii wmi video
Mar 22 06:55:16 anduin kernel: [  883.239547] CPU: 1 PID: 2793 Comm: v4l_id Not tainted 4.5.0 #16
Mar 22 06:55:16 anduin kernel: [  883.239613] Hardware name: Hewlett-Packard HP ProBook 6475b/180F, BIOS 68TTU Ver. F.04 08/03/2012
Mar 22 06:55:16 anduin kernel: [  883.239709] task: ffff88009a0fcd80 ti: ffff8800942f0000 task.ti: ffff8800942f0000
Mar 22 06:55:16 anduin kernel: [  883.239790] RIP: 0010:[<ffffffff82102264>]  [<ffffffff82102264>] usb_set_interface+0x34/0x9d0
Mar 22 06:55:16 anduin kernel: [  883.239896] PM: Removing info for No Bus:vbi0
Mar 22 06:55:16 anduin kernel: [  883.239947] RSP: 0018:ffff8800942f7d10  EFLAGS: 00010212
Mar 22 06:55:16 anduin kernel: [  883.240007] RAX: dffffc0000000000 RBX: ffff8801eef75338 RCX: ffff88009a0fd5b8
Mar 22 06:55:16 anduin kernel: [  883.240093] RDX: 0000000000000008 RSI: 0000000000000000 RDI: 0000000000000040
Mar 22 06:55:16 anduin kernel: [  883.240189] RBP: ffff8800942f7d88 R08: 0000000000000006 R09: 0000000000000000
Mar 22 06:55:16 anduin kernel: [  883.240284] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801eef76874
Mar 22 06:55:16 anduin kernel: [  883.240381] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8801eef74000
Mar 22 06:55:16 anduin kernel: [  883.240478] FS:  00007f83985c4700(0000) GS:ffff8801fa880000(0000) knlGS:0000000000000000
Mar 22 06:55:16 anduin kernel: [  883.240586] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 22 06:55:16 anduin kernel: [  883.240664] CR2: 000055fd2e0ec000 CR3: 000000009426d000 CR4: 00000000000406e0
Mar 22 06:55:16 anduin kernel: [  883.240754] Stack:
Mar 22 06:55:16 anduin kernel: [  883.240779]  dffffc0000000000 ffff8801eef74000 ffff8800942f7d40 ffffffffa0f58f0f
Mar 22 06:55:16 anduin kernel: [  883.240873]  ffff88009a0fd5d8 ffff8801eeee4050 ffff8800942f7d58 ffffffffa0e3803c
Mar 22 06:55:16 anduin kernel: [  883.240966]  ffff880100000000 ffff880000000000 ffff8801eef75338 ffff8801eef76874
Mar 22 06:55:16 anduin kernel: [  883.241058] Call Trace:
Mar 22 06:55:16 anduin kernel: [  883.241095]  [<ffffffffa0f58f0f>] ? xc5000_sleep+0x8f/0xd0 [xc5000]
Mar 22 06:55:16 anduin kernel: [  883.241166]  [<ffffffffa0e3803c>] ? fe_standby+0x3c/0x50 [tuner]
Mar 22 06:55:16 anduin kernel: [  883.241242]  [<ffffffffa0e8bade>] au0828_v4l2_close+0x36e/0x570 [au0828]
Mar 22 06:55:16 anduin kernel: [  883.241348]  [<ffffffffa0c70ec0>] v4l2_release+0xf0/0x210 [videodev]
Mar 22 06:55:16 anduin kernel: [  883.241438]  [<ffffffff815662fc>] __fput+0x1fc/0x6c0
Mar 22 06:55:16 anduin kernel: [  883.241508]  [<ffffffff8156682e>] ____fput+0xe/0x10
Mar 22 06:55:16 anduin kernel: [  883.241583]  [<ffffffff8116a813>] task_work_run+0x133/0x1f0
Mar 22 06:55:16 anduin kernel: [  883.241664]  [<ffffffff810035d0>] exit_to_usermode_loop+0x140/0x170
Mar 22 06:55:16 anduin kernel: [  883.241747]  [<ffffffff810061b6>] syscall_return_slowpath+0x156/0x1b0
Mar 22 06:55:16 anduin kernel: [  883.241838]  [<ffffffff82829db1>] int_ret_from_sys_call+0x25/0x9f
Mar 22 06:55:16 anduin kernel: [  883.241921] Code: 00 00 00 00 fc ff df 48 89 e5 41 57 41 56 41 55 41 54 49 89 fd 53 48 83 c7 40 48 83 ec 50 89 55 c8 48 89 fa 48 c1 ea 03 89 75 d0 <80> 3c 02 00 0f 85 7b 07 00 00 49 8d 7d 18 48 b8 00 00 00 00 00
Mar 22 06:55:16 anduin kernel: [  883.242471] RIP  [<ffffffff82102264>] usb_set_interface+0x34/0x9d0
Mar 22 06:55:16 anduin kernel: [  883.242612]  RSP <ffff8800942f7d10>

thanks,
-- Shuah

-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shuahkh at osg.samsung.com | (970) 217-8978


More information about the Alsa-devel mailing list