Message-ID: <5306AF8E.3080006@hurleysoftware.com>
Date:	Thu, 20 Feb 2014 20:44:46 -0500
From:	Peter Hurley <peter@...leysoftware.com>
To:	Tejun Heo <tj@...nel.org>, laijs@...fujitsu.com
CC:	linux-kernel@...r.kernel.org,
	Stefan Richter <stefanr@...6.in-berlin.de>,
	linux1394-devel@...ts.sourceforge.net,
	Chris Boot <bootc@...tc.net>, linux-scsi@...r.kernel.org,
	target-devel@...r.kernel.org
Subject: Re: [PATCH 4/9] firewire: don't use PREPARE_DELAYED_WORK

On 02/20/2014 03:44 PM, Tejun Heo wrote:
> PREPARE_[DELAYED_]WORK() are being phased out.  They have few users
> and a nasty surprise in terms of reentrancy guarantee as workqueue
> considers work items to be different if they don't have the same work
> function.
>
> firewire core-device and sbp2 have been multiplexing work items
> with multiple work functions.  Introduce fw_device_workfn() and
> sbp2_lu_workfn() which invoke fw_device->workfn and
> sbp2_logical_unit->workfn respectively and always use the two
> functions as the work functions and update the users to set the
> ->workfn fields instead of overriding work functions using
> PREPARE_DELAYED_WORK().
>
> It would probably be best to route this with other related updates
> through the workqueue tree.
>
> Compile tested.
>
> Signed-off-by: Tejun Heo <tj@...nel.org>
> Cc: Stefan Richter <stefanr@...6.in-berlin.de>
> Cc: linux1394-devel@...ts.sourceforge.net
> Cc: Chris Boot <bootc@...tc.net>
> Cc: linux-scsi@...r.kernel.org
> Cc: target-devel@...r.kernel.org
> ---
>   drivers/firewire/core-device.c | 22 +++++++++++++++-------
>   drivers/firewire/sbp2.c        | 17 +++++++++++++----
>   include/linux/firewire.h       |  1 +
>   3 files changed, 29 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/firewire/core-device.c b/drivers/firewire/core-device.c
> index de4aa40..2c6d5e1 100644
> --- a/drivers/firewire/core-device.c
> +++ b/drivers/firewire/core-device.c
> @@ -916,7 +916,7 @@ static int lookup_existing_device(struct device *dev, void *data)
>   		old->config_rom_retries = 0;
>   		fw_notice(card, "rediscovered device %s\n", dev_name(dev));
>
> -		PREPARE_DELAYED_WORK(&old->work, fw_device_update);
> +		old->workfn = fw_device_update;
>   		fw_schedule_device_work(old, 0);
>
>   		if (current_node == card->root_node)
> @@ -1075,7 +1075,7 @@ static void fw_device_init(struct work_struct *work)
>   	if (atomic_cmpxchg(&device->state,
>   			   FW_DEVICE_INITIALIZING,
>   			   FW_DEVICE_RUNNING) == FW_DEVICE_GONE) {
> -		PREPARE_DELAYED_WORK(&device->work, fw_device_shutdown);
> +		device->workfn = fw_device_shutdown;
>   		fw_schedule_device_work(device, SHUTDOWN_DELAY);

Implied mb of test_and_set_bit() in queue_work_on() ensures that the
newly assigned work function is visible on all cpus before evaluating
whether or not the work can be queued.

Ok.

>   	} else {
>   		fw_notice(card, "created device %s: GUID %08x%08x, S%d00\n",
> @@ -1196,13 +1196,20 @@ static void fw_device_refresh(struct work_struct *work)
>   		  dev_name(&device->device), fw_rcode_string(ret));
>    gone:
>   	atomic_set(&device->state, FW_DEVICE_GONE);
> -	PREPARE_DELAYED_WORK(&device->work, fw_device_shutdown);
> +	device->workfn = fw_device_shutdown;
>   	fw_schedule_device_work(device, SHUTDOWN_DELAY);
>    out:
>   	if (node_id == card->root_node->node_id)
>   		fw_schedule_bm_work(card, 0);
>   }
>
> +static void fw_device_workfn(struct work_struct *work)
> +{
> +	struct fw_device *device = container_of(to_delayed_work(work),
> +						struct fw_device, work);

I think this needs an smp_rmb() here.

> +	device->workfn(work);
> +}
> +

Otherwise this cpu could speculatively load workfn before
set_work_pool_and_clear_pending() clears PENDING. The old workfn could
then have been loaded while PENDING was still set, in which case a
concurrent queue_work_on() rejected the work as already pending.

Result: the new work function never runs.

But this exposes a more general problem that I believe workqueue itself
should handle: loads and stores in the work item function should not be
speculated ahead of the clearing of PENDING in
set_work_pool_and_clear_pending().

IOW, the beginning of the work function should act like a barrier in
the same way that queue_work_on() (et al.) already does.
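
To make the ordering requirement concrete, here is a minimal userspace
sketch in C11 atomics. It is a model, not kernel code: struct model_work,
model_queue() and model_run() are hypothetical names, and a sequentially
consistent fence is used on both sides as a conservative stand-in for the
barriers discussed above. Which kernel primitive is actually sufficient
is not settled by this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct model_work;
typedef void (*model_fn)(struct model_work *);

/* Hypothetical stand-in for a work item with a multiplexed ->workfn. */
struct model_work {
	atomic_bool pending;		/* models WORK_STRUCT_PENDING */
	_Atomic(model_fn) workfn;	/* models fw_device->workfn */
	int ran_old;
	int ran_new;
};

void model_old_fn(struct model_work *w) { w->ran_old++; }
void model_new_fn(struct model_work *w) { w->ran_new++; }

/* Queue side: publish the function, full fence, then try to claim PENDING. */
bool model_queue(struct model_work *w, model_fn fn)
{
	atomic_store_explicit(&w->workfn, fn, memory_order_relaxed);
	/* Assumption: models the implied mb of test_and_set_bit(). */
	atomic_thread_fence(memory_order_seq_cst);
	/* Returns true only if PENDING was clear, i.e. the work got queued. */
	return !atomic_exchange_explicit(&w->pending, true,
					 memory_order_relaxed);
}

/* Worker side: clear PENDING, full fence, only then load ->workfn. */
void model_run(struct model_work *w)
{
	atomic_store_explicit(&w->pending, false, memory_order_relaxed);
	/* Keeps the ->workfn load from being satisfied before the clear. */
	atomic_thread_fence(memory_order_seq_cst);
	model_fn fn = atomic_load_explicit(&w->workfn, memory_order_relaxed);
	fn(w);
}
```

With both fences in place, either model_queue() sees PENDING clear and
queues the work, or model_run() sees the newly stored function. Without
the fence in model_run(), neither outcome is guaranteed, which is the
lost-update case described above. A single-threaded walk-through (queue
with model_old_fn, re-queue with model_new_fn while still pending, then
run) dispatches model_new_fn, but it cannot demonstrate the reordering
itself.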

Regards,
Peter Hurley
