On Dec 25 2015 04:10, Stefan Richter wrote:
On Dec 24 Stefan Richter wrote:
On Dec 24 Takashi Sakamoto wrote: [...]
Between bootstrap, firmware loading and post configuration, some bus resets are observed.
This commit offloads most processing of probe function into workqueue and schedules the workqueue after successive bus resets. This has an effect to get correct hardware information and avoid involvement to bus reset storm.
For code simplicity, this change effects all of Dice-based models, i.e. Dice II, Dice Jr., Dice Mini and Dice III. Due to the same reason, sound card instance is kept till card free function even if some errors are detected in the workqueue.
[...]
--- a/sound/firewire/dice/dice.c +++ b/sound/firewire/dice/dice.c @@ -18,6 +18,8 @@ MODULE_LICENSE("GPL v2"); #define WEISS_CATEGORY_ID 0x00 #define LOUD_CATEGORY_ID 0x10
+#define PROBE_DELAY_MS (2 * MSEC_PER_SEC)
static int check_dice_category(struct fw_unit *unit) { struct fw_device *device = fw_parent_device(unit); @@ -185,6 +187,9 @@ static void dice_card_free(struct snd_card *card) { struct snd_dice *dice = card->private_data;
- /* The workqueue for registration uses the memory block. */
- cancel_work_sync(&dice->dwork.work);
According to the kerneldoc comment at cancel_work_sync, this should actually be cancel_delayed_work_sync(&dice->dwork);.
- snd_dice_stream_destroy_duplex(dice); snd_dice_transaction_destroy(dice); fw_unit_put(dice->unit);
@@ -192,6 +197,66 @@ static void dice_card_free(struct snd_card *card) mutex_destroy(&dice->mutex); }
Another comment to dice_card_free() which is aside from this patch:
You are calling mutex_destroy(&dice->mutex) here. But if I am right in my understanding that dice_card_free() is called from dice_remove(), then this is wrong because dice_card_free() still holding the mutex.
I am not sure what the proper fix is. Perhaps just never perform mutex_destroy().
Indeed.
I decide to move the cancel_work_sync() to .remove callback. (actually, it should be cancel_delayed_work_sync().) After .remove callback, .update callback is not called anymore therefore no works are schedulled for the registration.
Then, the mutex_lock()/mutex_unlock() is not need for this purpose. When sound card is released, the work is confirmed to finished and unschedulled anymore because .remove is called before.
+static void do_registration(struct work_struct *work) +{
- struct snd_dice *dice = container_of(work, struct snd_dice, dwork.work);
- int err;
- mutex_lock(&dice->mutex);
So the worker needs to obtain &dice->mutex. But...
[...]
@@ -263,14 +307,25 @@ static void dice_remove(struct fw_unit *unit) { struct snd_dice *dice = dev_get_drvdata(&unit->device);
- /* For a race condition of struct snd_card.shutdown. */
- mutex_lock(&dice->mutex);
- /* No need to wait for releasing card object in this context. */ snd_card_free_when_closed(dice->card);
- mutex_unlock(&dice->mutex);
}
...if I read snd_card_free_when_closed() and its surrounding code correctly, then dice_card_free() is called from within this, which can cause a deadlock. Right?
If so, then it seems to me that cancel_delayed_work_sync(&dice->dwork) should be moved out of dice_card_free() and be put at the beginning of dice_remove().
Thanks
Takashi Sakamoto