lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date:   Fri, 25 Sep 2020 15:16:11 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     Hillf Danton <hdanton@...a.com>
Cc:     qiang.zhang@...driver.com, tj@...nel.org,
        akpm@...ux-foundation.org, Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] kernel/kthread.c: kthread_worker: add work status
 check in timer_fn

On Fri 2020-09-25 20:38:21, Hillf Danton wrote:
> 
> On Fri, 25 Sep 2020 11:30:46 +0200 Petr Mladek wrote:
> > 
> > On Fri 2020-09-25 13:07:59, qiang.zhang@...driver.com wrote:
> > > From: Zqiang <qiang.zhang@...driver.com>
> > > 
> > > When queue delayed work to worker, at some point after that the timer_fn
> > > will be call, add work to worker's work_list, at this time, the work may
> > > be cancel, so add "work->canceling" check current work status.
> > 
> > Great catch!
> > 
> > I was able to understand the problem from the description. Though I
> > would still try to improve it a bit. I suggest:
> > 
> > <new_text>
> > Subject: kthread_worker: Prevent queuing delayed work from timer_fn when it is being canceled
> > 
> > There is a small race window when a delayed work is being canceled and
> > the work still might be queued from the timer_fn:
> > 
> > CPU0					CPU1
> > 
> > kthread_cancel_delayed_work_sync()
> >   __kthread_cancel_work_sync()
> >     __kthread_cancel_work()
> > 	work->canceling++;
> > 
> > 					kthread_delayed_work_timer_fn()
> > 					  kthread_insert_work();
> > 
> > BUG: kthread_insert_work() should not get called when work->canceling
> > is set.
> 
> Seems like the diagram above can't cover the case that the timer fired
> and started acquiring the lock half a tick before here came the cancel
> that took the lock then set the canceling mark.
> Nor is it a bug given that cancel is always flushing the current work.

This is the same as:

CPU0					CPU1

					kthread_delayed_work_timer_fn()
					  kthread_insert_work();

kthread_cancel_delayed_work_sync()


It just shows that kthread_cancel_delayed_work_sync() can't stop work
that is already being proceed. In this case, it has to wait until it
finishes.

By other words, this patch can't fix any real problem with timing.
kthread_cancel_delayed_work_sync() only guarantees that the work
in neither queued nor running when it finishes.


But I still think that the patch makes sense. work->lock synchronizes
manipulation of the work state. And it is wrong to queue the work
when canceling flag is set.

By other words. The bug was harmless. But it still was a bug.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ