[alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix memory allocation
Yang, Libin
libin.yang at intel.com
Mon Jun 17 09:30:42 CEST 2019
>-----Original Message-----
>From: Yang, Libin
>Sent: Monday, June 3, 2019 5:10 PM
>To: Takashi Iwai <tiwai at suse.de>
>Cc: Pierre-Louis Bossart <pierre-louis.bossart at linux.intel.com>; alsa-
>devel at alsa-project.org; broonie at kernel.org
>Subject: RE: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec: fix
>memory allocation
>
>Hi Takashi,
>
>>-----Original Message-----
>>From: Takashi Iwai [mailto:tiwai at suse.de]
>>Sent: Thursday, May 23, 2019 4:27 PM
>>To: Yang, Libin <libin.yang at intel.com>
>>Cc: Pierre-Louis Bossart <pierre-louis.bossart at linux.intel.com>; alsa-
>>devel at alsa-project.org; broonie at kernel.org
>>Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel: hda-codec:
>>fix memory allocation
>>
>>On Thu, 23 May 2019 10:21:20 +0200,
>>Yang, Libin wrote:
>>>
>>> Hi Takashi,
>>>
>>> >-----Original Message-----
>>> >From: Takashi Iwai [mailto:tiwai at suse.de]
>>> >Sent: Thursday, May 23, 2019 4:16 PM
>>> >To: Yang, Libin <libin.yang at intel.com>
>>> >Cc: Pierre-Louis Bossart <pierre-louis.bossart at linux.intel.com>;
>>> >alsa- devel at alsa-project.org; broonie at kernel.org
>>> >Subject: Re: [alsa-devel] [PATCH v2 12/12] ASoC: SOF: Intel:
>>> >hda-codec: fix memory allocation
>>> >
>>> >On Thu, 23 May 2019 10:03:03 +0200,
>>> >Yang, Libin wrote:
>>> >>
>>> >> Please let me describe the issue here.
>>> >>
>>> >> The test case is:
>>> >> 1) Unload module with script "sudo ./sof_remove.sh" ,
>>> >> 2) reload module with script "sudo ./sof_insert.sh"
>>> >>
>>> >> After several rounds of removing and inserting kernel modules,
>>> >> system will complain like below:
>>> >> "BUG: unable to handle kernel paging request at 000000292a282031"
>>> >
>>> >Did you try some kernel debug options? It might show what went wrong.
>>>
>>> No, I haven't. I'm not sure which options I can use for this case.
>>> Could you please give me some suggestions?
>>
>>You can enable CONFIG_DEBUG_DEVRES and adjust the devres.log option
>for
>>showing each devres allocation and removal. And I'd try
>>CONFIG_DEBUG_SLAB and CONFIG_DEBUG_KMEMLEAK or whatever
>interesting in
>>CONFIG_DEBUG section, too.
>
>Thanks for your suggestion. After more than 1 week debug, I think maybe I
>have root caused this issue from the devres.log message.
>
>Below is my finding.
>1. When initialing the codecs, snd_hdac_ext_bus_device_init() will be called,
>and it will set hdev->dev.release = default_release.
>However, for analog codec (not hdac_hdmi codec), hdac_hda_codec_probe()
>will be called later. And it will call snd_hda_codec_device_new(), which will
>reset codec->code.dev.release = snd_hda_codec_dev_release; This means
>hdac_hdmi's hdev dev release is default_release() defined in hdac_ext_bus.c,
>and other hda codec's hdev dev release is snd_hda_codec_dev_release().
>
>Both default_release() and snd_hda_codec_dev_release() will call kfree() to
>free the hdac_device (or its container) in the current code.
>
>2. When we run rmmod sof_pci_dev, it will free the sdev. If we use Struct
>hdac_device *hdev = devm_kzalloc(sdev->dev...). This means hdev will also be
>freed automatically.
>
>In the removal, snd_hdac_ext_bus_device_remove() will be called to remove
>the hdev (in this function it is struct hdac_device *codec.
>The name is not aligned in different places).
>However for hdac_hdmi, the hdev->dev is used by other code.
>So calling device.release() (the function default_release()) will be postponed.
>After after sdev is freed, the device.release() will be called. But for devm_xxx,
>hdev will also be freed when sdev is freed. This means hdev.dev after sdev is
>freed is invalid now as hdev has already freed. It will access invalid memory.
>This will cause the bug.
>
>So I think we should not use devm_xxx, and let's free the hdev manually.
>
>At the end of this topic, I still found 2 suspicious code in the current code.
>1. in sound/soc/intel/skylate/skl.c
>it calls hdev = devm_kazlloc() or hda_codec = devm_kzalloc().
>As we will call kfree() in the current code, should we replace it with kzalloc()?
>Maybe we need cavs drivers owner's help on it.
>
>2. in snd_hdac_ext_bus_device_remove()
>It will call snd_hdac_device_unregister() to unregister the hdac_devices and
>put_device(&codec->dev) to release the dev.
>For analog codec, snd_hdac_device_unregister() will free the codec->dev's
>kobject. And snd_hda_codec_dev_release() will be called to free the
>hdac_device.
>So it is invalid to call put_device(&codec->dev). If you print
>refcound_read(&(codec->dev.kobj.kref.refcount)) for analog codec before
>put_device(), you will find the refcount has already been 0.
I tested on the latest code. The behavior on kernel 5.2.0-rc1 is different from
it on kernel 5.0.0 of https://github.com/thesofproject/linux/. (thesofproject
kernel has already been upgraded to 5.2.0-rc1 now)
On kernel 5.0.0 (https://github.com/thesofproject/linux/), when removing
the sof kernel modules, the sequence is:
1. devm_kazlloc_release() to release hdac_device
2. snd_hdac_ext_bus_device_exit(): this function will use hdac_device
However, on kernel 5.2.0-rc1, the sequence is:
1. snd_hdac_ext_bus_device_exit()
2. devm_kazlloc_release() to release hdac_device
So I think it is safe now to use devm_xxx to allocate the hdac_device.
Also, Ranjani found a duplicated release of hdac_device. She has
a patch to fix it.
Regards,
Libin
>
>Regards,
>Libin
>
>>
>>
>>Takashi
More information about the Alsa-devel
mailing list