Message-ID: <20070525081631.GB985@ff.dom.local>
Date:	Fri, 25 May 2007 10:16:31 +0200
From:	Jarek Poplawski <jarkao2@...pl>
To:	Jason Wessel <jason.wessel@...driver.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: [BUG] 2.6.21 hang in cancel_rearming_delayed_workqueue()

On 25-05-2007 05:21, Jason Wessel wrote:
> There is a problem with calling cancel_rearming_delayed_work() if the 
> timer was not yet active.
> 
> I see this problem when netpoll_cleanup() is called without having done 
> any work because it had not processed any packets yet.  The problem 
> appears to be a result of the loop check 
> while(!cancel_delayed_work(dwork)).    This endlessly loops because 
> del_timer_sync() can return 0 or 1 for success which is passed back as a 
> result to the final invariant check for the loop.  In this particular 
> case zero will always be returned because the timer is not active.
> 
> It is possible that the problem exists elsewhere, but I thought I would 
> ask if this is expected?
> 
> #0  del_timer_sync (timer=0xc7ed90f8) at kernel/timer.c:530
> #1  0xc012f08e in cancel_rearming_delayed_workqueue (wq=0xc7fee800,
>   dwork=0xc7ed90e8) at include/linux/workqueue.h:201
> #2  0xc012f0af in cancel_rearming_delayed_work (dwork=0x20)
>   at kernel/workqueue.c:680
> #3  0xc0312f78 in netpoll_cleanup (np=0xc880bf40) at net/core/netpoll.c:784
> 
> Possible fix.
> 
> Signed-off-by: Jason Wessel <jason.wessel@...driver.com>
> 
> Index: linux-2.6.21/kernel/workqueue.c
> ===================================================================
> --- linux-2.6.21.orig/kernel/workqueue.c
> +++ linux-2.6.21/kernel/workqueue.c
> @@ -666,7 +666,7 @@ EXPORT_SYMBOL(flush_scheduled_work);
> void cancel_rearming_delayed_workqueue(struct workqueue_struct *wq,
>                                      struct delayed_work *dwork)
> {
> -       while (!cancel_delayed_work(dwork))
> +       while (cancel_delayed_work(dwork) > 0)
>               flush_workqueue(wq);
> }
> EXPORT_SYMBOL(cancel_rearming_delayed_workqueue);

It's a very optimistic change...

I wonder how this could all have worked for so long (or how it is
supposed to work now without breaking other callers) with an (almost)
reversed condition?

According to this comment:

" * cancel_rearming_delayed_workqueue - reliably kill off a delayed
 work whose handler rearms the delayed work."

So it cannot be used in netpoll_cleanup() if no rearming takes place
at all during the cancel. This is tricky behaviour, of course, and it
has been changed in 2.6.22-rc.
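
To make the two loop conditions easier to compare, here is a rough
user-space sketch. It is not kernel code: struct toy_dwork, the toy_*
helpers and the max_iter cap are all made up for illustration, and the
only kernel behaviour assumed is that cancel_delayed_work() passes back
del_timer_sync()'s result (1 if an armed timer was removed, 0 otherwise).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct delayed_work: only the state the loop sees. */
struct toy_dwork {
	bool timer_pending;	/* timer armed and not yet fired */
	bool work_queued;	/* handler queued on the workqueue, not yet run */
	bool handler_rearms;	/* handler re-arms the timer when it runs */
};

/* Assumed to mirror del_timer_sync(): 1 if an armed timer was removed, else 0. */
static int toy_cancel_delayed_work(struct toy_dwork *w)
{
	int was_pending = w->timer_pending;

	w->timer_pending = false;
	return was_pending;
}

/* Toy flush: run a queued handler once; a rearming handler re-arms the timer. */
static void toy_flush_workqueue(struct toy_dwork *w)
{
	if (w->work_queued) {
		w->work_queued = false;
		if (w->handler_rearms)
			w->timer_pending = true;
	}
}

/* 2.6.21 condition. Returns the number of flushes, or -1 if max_iter is hit. */
static int old_loop(struct toy_dwork *w, int max_iter)
{
	int n = 0;

	assert(max_iter > 0);
	while (!toy_cancel_delayed_work(w)) {
		if (++n >= max_iter)
			return -1;	/* would spin forever without the cap */
		toy_flush_workqueue(w);
	}
	return n;
}

/* Condition from the proposed patch. Same return convention as old_loop(). */
static int new_loop(struct toy_dwork *w, int max_iter)
{
	int n = 0;

	assert(max_iter > 0);
	while (toy_cancel_delayed_work(w) > 0) {
		if (++n >= max_iter)
			return -1;
		toy_flush_workqueue(w);
	}
	return n;
}
```

In this toy model, a work that was never armed makes old_loop() hit the
cap, which corresponds to the reported hang, while new_loop() returns at
once. But when the timer has already fired and a rearming handler is
queued, new_loop() also returns at once and leaves the handler queued,
whereas old_loop() flushes it and then removes the re-armed timer.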

Regards,
Jarek P.