lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20070803201458.GA181@tv-sign.ru>
Date:	Sat, 4 Aug 2007 00:14:58 +0400
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Chuck Ebbert <cebbert@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	richard kennedy <richard@....demon.co.uk>,
	linux-kernel@...r.kernel.org
Subject: Re: Processes spinning forever, apparently in lock_timer_base()?

Chuck Ebbert wrote:
>
> Looks like the same problem with spinlock unfairness we've seen
> elsewhere: it seems to be looping here? Or is everyone stuck
> just waiting for writeout?
> 
> lock_timer_base():
>         for (;;) {
>                 tvec_base_t *prelock_base = timer->base;
>                 base = tbase_get_base(prelock_base);
>                 if (likely(base != NULL)) {
>                         spin_lock_irqsave(&base->lock, *flags);
>                         if (likely(prelock_base == timer->base))
>                                 return base;
>                         /* The timer has migrated to another CPU */
>                         spin_unlock_irqrestore(&base->lock, *flags);
>                 }
>                 cpu_relax();
>         }

I don't think there is an unfairness problem. We are looping only
if timer->base changes in between. IOW, there is no "lock + unlock
+ lock(same_lock)" here, we take another lock on each iteration.

And:

>  [<c0407208>] do_IRQ+0xbd/0xd1
>  [<c042ffcb>] lock_timer_base+0x19/0x35
>  [<c04300df>] __mod_timer+0x9a/0xa4
>  [<c060bb55>] schedule_timeout+0x70/0x8f
>
> ...
>
>  [<c0407208>] do_IRQ+0xbd/0xd1
>  [<c042ffcb>] lock_timer_base+0x19/0x35
>  [<c04300df>] __mod_timer+0x9a/0xa4
>  [<c060bb55>] schedule_timeout+0x70/0x8f
>
> ...
>
>  [<c042ffcb>] lock_timer_base+0x19/0x35
>  [<c04300df>] __mod_timer+0x9a/0xa4
>  [<c060bb55>] schedule_timeout+0x70/0x8f

All traces start from schedule_timeout()->mod_timer(). This timer
is local, nobody can see it, its ->base can't be NULL or changed.

This means that lock_timer_base() can't loop, but can't take the
tvec_t_base_s.lock. But in that case the trace should different,
strange.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ