linux-kernel - Re: [PATCH] [8/18] BKL-removal: Remove BKL from remote

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <200801280358.14024.ak@suse.de>
Date:	Mon, 28 Jan 2008 03:58:13 +0100
From:	Andi Kleen <ak@...e.de>
To:	Trond Myklebust <Trond.Myklebust@...app.com>
Cc:	Steve French <smfrench@...il.com>, swhiteho@...hat.com,
	sfrench@...ba.org, vandrove@...cvut.cz,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	akpm@...l.org
Subject: Re: [PATCH] [8/18] BKL-removal: Remove BKL from remote_llseek

On Monday 28 January 2008 00:08:56 Trond Myklebust wrote:
> 
> On Sun, 2008-01-27 at 16:18 -0600, Steve French wrote:
> > If two seeks overlap, can't you end up with an f_pos value that is
> > different than what either thread seeked to? or if you have a seek and
> > a read overlap can't you end up with the read occurring in the midst
> > of an update of f_pos (which takes more than one instruction on
> > various architectures), e.g. reading an f_pos, which has only the
> > lower half of a 64 bit field updated?   I agree that you shouldn't
> > have seeks racing in parallel but I think it is preferable to get
> > either the updated f_pos or the earlier f_pos not something 1/2
> > updated.
> 
> Why? The threads are doing something inherently liable to corrupt data
> anyway. If they can race over the seek, why wouldn't they race over the
> read or write too?
> The race in lseek() should probably be the least of your worries in this
> case.

The problem is that it's not a race in who gets to do its thing first, but a 
parallel reader can actually see a corrupted value from the two independent 
words on 32bit (e.g. during a 4GB). And this could actually completely corrupt 
f_pos when it happens with two racing relative seeks or read/write()s

I would consider that a bug.

Fixes would be either to always take a spinlock to update this (nasty on 
platforms where spinlocks are expensive like P4) or define some architecture
specific way to read/write 64bit values consistently. In theory also some
lazy locking seqlock like mechanism could be used, but that had the disadvantage
of being theoretically starvable.

-Andi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/