An announcement for kernel-global workqueue users.
Hello.
The Linux kernel provides kernel-global WQs (namely, system_wq, system_highpri_wq, system_long_wq, system_unbound_wq, system_freezable_wq, system_power_efficient_wq and system_freezable_power_efficient_wq). But since an attempt to flush a kernel-global WQ can deadlock, Tejun Heo thinks that we should stop calling flush_scheduled_work() and flush_workqueue(system_*). The callers as of Linux 5.17 are listed below.
----------
$ git grep -nF 'flush_scheduled_work()'
drivers/acpi/osl.c:1182: * invoke flush_scheduled_work()/acpi_os_wait_events_complete() to flush
drivers/acpi/osl.c:1575: flush_scheduled_work();
drivers/block/aoe/aoedev.c:324: flush_scheduled_work();
drivers/block/aoe/aoedev.c:523: flush_scheduled_work();
drivers/crypto/atmel-ecc.c:401: flush_scheduled_work();
drivers/crypto/atmel-sha204a.c:162: flush_scheduled_work();
drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c:2606: flush_scheduled_work();
drivers/gpu/drm/bridge/lontium-lt9611uxc.c:985: flush_scheduled_work();
drivers/gpu/drm/i915/display/intel_display.c:10790: flush_scheduled_work();
drivers/gpu/drm/i915/gt/selftest_execlists.c:87: flush_scheduled_work();
drivers/iio/light/tsl2563.c:811: flush_scheduled_work();
drivers/infiniband/hw/mlx4/cm.c:511: flush_scheduled_work();
drivers/infiniband/hw/mlx4/cm.c:543: flush_scheduled_work(); /* make sure all timers were flushed */
drivers/infiniband/ulp/isert/ib_isert.c:2639: flush_scheduled_work();
drivers/input/mouse/psmouse-smbus.c:320: flush_scheduled_work();
drivers/md/dm.c:229: flush_scheduled_work();
drivers/message/fusion/mptscsih.c:1234: flush_scheduled_work();
drivers/net/phy/phy.c:1060: /* Cannot call flush_scheduled_work() here as desired because
drivers/net/usb/lan78xx.c:3240: * can't flush_scheduled_work() until we drop rtnl (later),
drivers/net/usb/usbnet.c:853: * can't flush_scheduled_work() until we drop rtnl (later),
drivers/net/wireless/ath/ath6kl/usb.c:481: flush_scheduled_work();
drivers/net/wwan/wwan_hwsim.c:537: flush_scheduled_work(); /* Wait deletion works completion */
drivers/nvme/target/configfs.c:1557: flush_scheduled_work();
drivers/nvme/target/rdma.c:1587: flush_scheduled_work();
drivers/nvme/target/rdma.c:2056: flush_scheduled_work();
drivers/nvme/target/tcp.c:1818: flush_scheduled_work();
drivers/nvme/target/tcp.c:1879: flush_scheduled_work();
drivers/nvme/target/tcp.c:1884: flush_scheduled_work();
drivers/platform/surface/surface_acpi_notify.c:863: flush_scheduled_work();
drivers/power/supply/ab8500_btemp.c:975: flush_scheduled_work();
drivers/power/supply/ab8500_chargalg.c:1993: flush_scheduled_work();
drivers/power/supply/ab8500_charger.c:3400: flush_scheduled_work();
drivers/power/supply/ab8500_fg.c:3021: flush_scheduled_work();
drivers/rapidio/devices/tsi721.c:2944: flush_scheduled_work();
drivers/rtc/dev.c:99: flush_scheduled_work();
drivers/scsi/mpt3sas/mpt3sas_scsih.c:12409: flush_scheduled_work();
drivers/scsi/qla2xxx/qla_target.c:1568: flush_scheduled_work();
drivers/staging/olpc_dcon/olpc_dcon.c:386: flush_scheduled_work();
sound/soc/intel/atom/sst/sst.c:363: flush_scheduled_work();
$ git grep -nF 'flush_workqueue(system_'
drivers/block/rnbd/rnbd-clt.c:1776: flush_workqueue(system_long_wq);
drivers/infiniband/core/device.c:2857: flush_workqueue(system_unbound_wq);
include/linux/workqueue.h:592: flush_workqueue(system_wq);
----------
I tried to send a patch that emits a warning when flushing of kernel-global WQs is attempted ( https://lkml.kernel.org/r/2efd5461-fccd-f1d9-7138-0a6767cbf5fe@I-love.SAKURA... ). But Linus does not want such a patch ( https://lkml.kernel.org/r/CAHk-=whWreGjEQ6yasspzBrNnS7EQiL+SknToWt=SzUh4Xomy... ).
Steps for converting from kernel-global WQs to a module's local WQ are shown below. But since an oversight in Step 4 results in breakage, I think that this conversion should be handled carefully by the maintainers/developers of each module, who are familiar with that module. (This is why I'm sending this mail rather than sending patches, in order to ask for your cooperation.)
----------
Step 0: Consider if flushing kernel-global WQs is unavoidable.
For example, commit 081bdc9fe05bb232 ("RDMA/ib_srp: Fix a deadlock") simply removed the flush_workqueue(system_long_wq) call.
For another example, schedule_on_each_cpu() does not need to call flush_scheduled_work(), because schedule_on_each_cpu() knows the list of all "struct work_struct" instances which need to be flushed and can flush each of them using flush_work().
If flushing kernel-global WQs is still unavoidable, please proceed to the following steps.
Step 1: Declare a variable for your module.
struct workqueue_struct *my_wq;
Step 2: Create a WQ for your module from the __init function. The same flags as those used by the corresponding kernel-global WQ can be used when creating the WQ for your module.
my_wq = alloc_workqueue("my_wq_name", 0, 0);
Step 3: Destroy the WQ created in Step 2 from the __exit function (and from the error handling path of the __init function, if the __init function may fail after creating the WQ).
destroy_workqueue(my_wq);
Step 4: Throughout your module, replace each call such as schedule_work() with the corresponding queue_work() call, for every work item which should be handled by the WQ for your module.
Step 5: Replace flush_scheduled_work() and flush_workqueue(system_*) calls with flush_workqueue() of the WQ for your module.
flush_workqueue(my_wq);
----------
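For illustration only, Steps 1 to 5 might fit together in a single-file module roughly as follows. This is a minimal sketch, not a patch for any of the callers listed above: the names my_work, my_work_fn, my_init and my_exit are hypothetical, the WQ is declared static on the assumption that only this one file uses it, and the flags passed to alloc_workqueue() are the ones from Step 2. It has to be built against a kernel tree.

```c
// SPDX-License-Identifier: GPL-2.0
/* Sketch only: my_work, my_work_fn, my_init and my_exit are hypothetical. */
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/workqueue.h>

/* Step 1: a WQ which belongs to this module. */
static struct workqueue_struct *my_wq;

/* Hypothetical work item which used to be queued via schedule_work(). */
static void my_work_fn(struct work_struct *work)
{
	pr_info("my_work_fn ran on the module's WQ\n");
}
static DECLARE_WORK(my_work, my_work_fn);

static int __init my_init(void)
{
	/* Step 2: create the WQ (flags and max_active as in the mail). */
	my_wq = alloc_workqueue("my_wq_name", 0, 0);
	if (!my_wq)
		return -ENOMEM;

	/* Step 4: was schedule_work(&my_work). */
	queue_work(my_wq, &my_work);
	return 0;
}

static void __exit my_exit(void)
{
	/* Step 5: was flush_scheduled_work(). */
	flush_workqueue(my_wq);

	/* Step 3: destroy the WQ created in Step 2. */
	destroy_workqueue(my_wq);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");
```

In this small sketch the flush in my_exit() is kept only to show where Step 5 lands; destroy_workqueue() drains the WQ by itself. If the module has just a few known work items, flushing each of them with flush_work() as described in Step 0 may avoid the need for a local WQ at all.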
Regards.
Tetsuo Handa