lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 31 Oct 2015 11:00:12 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Jeff Layton <jlayton@...chiereds.net>
Cc:	linux-kernel@...r.kernel.org, bfields@...ldses.org,
	Michael Skralivetsky <michael.skralivetsky@...marydata.com>,
	Chris Worley <chris.worley@...marydata.com>,
	Trond Myklebust <trond.myklebust@...marydata.com>,
	Lai Jiangshan <laijs@...fujitsu.com>
Subject: Re: timer code oops when calling mod_delayed_work

(cc'ing Lai)

Hello, Jeff.

On Thu, Oct 29, 2015 at 01:58:36PM -0400, Jeff Layton wrote:
> crash> p cache_cleaner
> cache_cleaner = $12 = {
>   work = {
>     data = {
>       counter = 0xfffffffe1

If I'm not mistaken, PENDING, flush color 14, OFFQ and POOL_NONE.

>     }, 
>     entry = {
>       next = 0xffffffffa03623c8 <cache_cleaner+8>, 
>       prev = 0xffffffffa03623c8 <cache_cleaner+8>

Empty entry.

>     }, 
>     func = 0xffffffffa03333c0 <cache_cleaner_func>
>   }, 
>   timer = {
>     entry = {
>       next = 0x0, 
>       pprev = 0xffff88085fd0eaf8
>     }, 
>     expires = 0x100021e99, 
>     function = 0xffffffff810b66a0 <delayed_work_timer_fn>, 
>     data = 0xffffffffa03623c0, 
>     flags = 0x200014, 
>     slack = 0xffffffff, 
>     start_pid = 0x0, 
>     start_site = 0x0, 
>     start_comm = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
>   }, 
>   wq = 0xffff88085f48fc00, 
>   cpu = 0x14
> }
> 
> So the PENDING bit is set (lowest bit in data.counter), and timer->entry.pprev
> pprev pointer is not NULL (so timer_pending is true). I also see that
> there are several nfsd threads running the shrinker at the same time.
> 
> There is one potential problem that I see, but I'd appreciate someone
> sanity checking me on this. Here is mod_delayed_work_on:
...
> ...and here is the beginning of try_to_grab_pending:
> 
> ------------------[snip]------------------------
>         /* try to steal the timer if it exists */
>         if (is_dwork) {
>                 struct delayed_work *dwork = to_delayed_work(work);
> 
>                 /*
>                  * dwork->timer is irqsafe.  If del_timer() fails, it's
>                  * guaranteed that the timer is not queued anywhere and not
>                  * running on the local CPU.
>                  */
>                 if (likely(del_timer(&dwork->timer)))
>                         return 1;
>         }
> 
>         /* try to claim PENDING the normal way */
>         if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work)))
>                 return 0;
> ------------------[snip]------------------------
> 
> 
> ...so if del_timer returns true, we'll return 1 from
> try_to_grab_pending without actually setting the
> WORK_STRUCT_PENDING_BIT, and then will end up calling
> __queue_delayed_work.
> 
> That seems wrong to me -- shouldn't we be ensuring that that bit is set
> when returning 1 from try_to_grab_pending to guard against concurrent
> queue_delayed_work_on calls?

But if try_to_grab_pending() succeeded at stealing dwork->timer, it's
known that the PENDING bit must already be set.  IOW, the bit is
stolen together with the timer.

Heh, this one is tricky.  Yeah, try_to_grab_pending() missing PENDING
would explain the failure but I can't see how it'd leak at the moment.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ