linux-kernel - Re: [BUG] Lockless patches cause hardlock under heavy IO

Date:	Tue, 24 Jun 2008 10:13:45 +1000
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	paulmck@...ux.vnet.ibm.com
Cc:	Ryan Hope <rmh3093@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-mm@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] Lockless patches cause hardlock under heavy IO

On Monday 23 June 2008 23:05, Paul E. McKenney wrote:
> On Mon, Jun 23, 2008 at 09:54:52PM +1000, Nick Piggin wrote:
> > On Monday 23 June 2008 13:51, Ryan Hope wrote:
> > > well, I get the hardlock on -mm without using reiser4; I am pretty
> > > sure it is swap related
> >
> > The guys seeing hangs don't use PREEMPT_RCU, do they?
> >
> > In my swapping tests, I found -mm3 to be stable with classic RCU, but
> > on a hunch, I tried PREEMPT_RCU and it crashed a couple of times rather
> > quickly. First crash was in find_get_pages so I suspected lockless
> > pagecache doing something subtly wrong with the RCU API, but I just got
> > another crash in __d_lookup:
>
> Could you please send me a repeat-by?  (At least Alexey is no longer
> alone!)

OK, I had DEBUG_PAGEALLOC in the .config, which I think is probably
important to reproduce it (but the fact that I'm reproducing oopses
with << PAGE_SIZE objects like dentries and radix tree nodes indicates
that there is even more free-before-grace activity going undetected --
if you construct a test case using full pages, it might become even
easier to detect with DEBUG_PAGEALLOC).

2 socket, 8 core x86 system.

I mounted two tmpfs filesystems, one contains a single large file
which is formatted as 1K block size ext3 and mounted loopback, the
other is used directly. Linux kernel source is unpacked on each mount
and concurrent make -j128 on each. This pushes it pretty hard into
swap. Classic RCU survived another 5 hours of this last night.
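
The setup described above can be sketched as a dry-run command list. This is
only an illustration, not the script used in the report: the mount points,
image size, tarball path, and the helper name repro_commands are all
assumptions, and the kernel tree is assumed to already have a .config. The
script only prints the commands; running them would need root and enough
swap.

```python
import shlex


def repro_commands(direct_mnt="/mnt/tmpfs-direct",    # hypothetical mount point
                   backing_mnt="/mnt/tmpfs-backing",  # hypothetical mount point
                   loop_mnt="/mnt/ext3-loop",         # hypothetical mount point
                   image_size_mb=4096,                # assumed; not in the report
                   tarball="/tmp/linux.tar.bz2",      # assumed path
                   jobs=128):
    """Return (setup, builds): commands to prepare the mounts and to build."""
    image = f"{backing_mnt}/ext3.img"
    setup = [
        # Two tmpfs filesystems: one used directly, one holding a large file.
        ["mount", "-t", "tmpfs", "tmpfs", direct_mnt],
        ["mount", "-t", "tmpfs", "tmpfs", backing_mnt],
        # The large file, formatted as ext3 with a 1K block size.
        ["dd", "if=/dev/zero", f"of={image}", "bs=1M", f"count={image_size_mb}"],
        ["mkfs.ext3", "-F", "-b", "1024", image],
        # Mount the ext3 image via loopback.
        ["mount", "-o", "loop", image, loop_mnt],
    ]
    builds = []
    for mnt in (direct_mnt, loop_mnt):
        src = f"{mnt}/src"
        # Unpack the kernel source on each mount.
        setup.append(["mkdir", "-p", src])
        setup.append(["tar", "-xf", tarball, "-C", src, "--strip-components=1"])
        # One build per mount; the two are meant to run concurrently.
        builds.append(["make", "-C", src, f"-j{jobs}"])
    return setup, builds


if __name__ == "__main__":
    setup, builds = repro_commands()
    for cmd in setup:
        print(shlex.join(cmd))
    # Start both builds in the background so they run at the same time.
    for cmd in builds:
        print(shlex.join(cmd) + " &")
    print("wait")
```

The kernel under test would also need PREEMPT_RCU and DEBUG_PAGEALLOC
enabled, as noted above.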

But that's a fairly convoluted test for an RCU problem. I expect it
should be easier to trigger with something more targeted...