linux-kernel - Re: [patch 9/9] mm: fix pagecache write deadlocks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20070204103620.33c24cad.akpm@linux-foundation.org>
Date:	Sun, 4 Feb 2007 10:36:20 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Nick Piggin <npiggin@...e.de>
Cc:	Linux Kernel <linux-kernel@...r.kernel.org>,
	Linux Filesystems <linux-fsdevel@...r.kernel.org>,
	Linux Memory Management <linux-mm@...ck.org>
Subject: Re: [patch 9/9] mm: fix pagecache write deadlocks

On Sun, 4 Feb 2007 16:10:51 +0100 Nick Piggin <npiggin@...e.de> wrote:

> On Sun, Feb 04, 2007 at 03:15:49AM -0800, Andrew Morton wrote:
> > On Sun, 4 Feb 2007 12:03:17 +0100 Nick Piggin <npiggin@...e.de> wrote:
> > 
> > > On Sun, Feb 04, 2007 at 02:56:02AM -0800, Andrew Morton wrote:
> > > > On Sun, 4 Feb 2007 11:46:09 +0100 Nick Piggin <npiggin@...e.de> wrote:
> > > > 
> > > > If that recollection is right, I think we could afford to reintroduce that
> > > > problem, frankly.  Especially as it only happens in the incredibly rare
> > > > case of that get_user()ed page getting unmapped under our feet.
> > > 
> > > Dang. I was hoping to fix it without introducing data corruption.
> > 
> > Well.  It's a compromise.  Being practical about it, I reeeealy doubt that
> > anyone will hit this combination of circumstances.
> 
> They're not likely to hit the deadlocks, either. Probability gets more
> likely after my patch to lock the page in the fault path. But practially,
> we could live without that too, because the data corruption it fixes is
> very rare as well. Which is exactly what we've been doing quite happily
> for most of 2.6, including all distro kernels (I think).

Thing is, an application which is relying on the contents of that page is
already unreliable (or really peculiar), because it can get indeterminate
results anyway.

> ...
> 
> On a P4 Xeon, SMP kernel, on a tmpfs filesystem, a 1GB dd if=/dev/zero write
> had the following performance (higher is worse):
> 
> Orig kernel			New kernel
> new file (no pagecache)
> 4K  blocks 1.280s		1.287s (+0.5%)
> 64K blocks 1.090s		1.105s (+1.4%)
> notrunc (uptodate pagecache)
> 4K  blocks 0.976s		1.001s (+0.5%)
> 64K blocks 0.780s		0.792s (+1.5%)
> 
> [numbers are better than +/- 0.005]
> 
> So we lose somewhere between half and one and a half of one percent
> performance in a pagecache write intensive workload.

That's not too bad - caches are fast.  Did you look at optimising the
handling of that temp page, ensure that we always use the same page?  I
guess the page allocator per-cpu-pages thing is being good here.

I'm not sure how, though.  Park a copy in the task_struct, just as an
experiment.  But that'd de-optimise multiple-tasks-writing-on-the-same-cpu.
Maybe a per-cpu thing?  Largely duplicates the page allocator's per-cpu-pages.

Of course, we're also increasing caceh footprint, which this test won't
show.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/