Message-Id: <201003291643.01214.arnd@arndb.de>
Date: Mon, 29 Mar 2010 15:43:00 +0100
From: Arnd Bergmann <arnd@...db.de>
To: Andi Kleen <andi@...stfloor.org>
Cc: Jiri Kosina <jkosina@...e.cz>,
Frederic Weisbecker <fweisbec@...il.com>,
linux-kernel@...r.kernel.org, Matthew Wilcox <matthew@....cx>,
Thomas Gleixner <tglx@...utronix.de>, jblunck@...e.de,
Alan Cox <alan@...ux.intel.com>, Ingo Molnar <mingo@...e.hu>,
gregkh@...e.de
Subject: Re: [GIT, RFC] Killing the Big Kernel Lock II
On Monday 29 March 2010, Andi Kleen wrote:
> Arnd Bergmann <arnd@...db.de> writes:
>
> >> - The seek function in uhci-debug.c probably is still racy.
> >
> > That function could be removed in favor of using generic_file_llseek
> > and setting i_size to up->size.
>
> Does that lock against read in libfs?
No.
> > Also, the race is only between concurrent calls of llseek on
> > the same file descriptor, which is undefined anyway.
> > The current code also doesn't protect you against partial updates
> > of f_pos during ->read() on 32 bit systems (nothing ever does),
>
> That is not what I meant.
>
> > and it even fails to protect against the concurrent llseek race
> > because the assignment to f_pos is done outside of the BKL.
>
> I wasn't sure it would protect against parallel reads.
>
> Does it?
There is no way for any driver or file system right now to protect
against that, nor has there been for a long time[1]. The sys_read and
sys_write functions use file_pos_write() to update the file->f_pos
without taking any lock, and they pass a local variable into the
*ppos argument of the ->read/->write file operations, which means
that the file operation itself cannot add locking to the update
either.
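
A rough user-space sketch of the pattern described above, not the kernel
source itself: the names model_loff_t, struct file_model, model_read and
model_sys_read are hypothetical stand-ins for loff_t, struct file, a
->read file operation and sys_read. It only illustrates that the file
operation sees a pointer to the caller's local copy, so it has no way to
put a lock around the store back into f_pos.

```c
#include <stddef.h>
#include <string.h>

typedef long long model_loff_t;      /* hypothetical stand-in for loff_t */

struct file_model {                  /* hypothetical stand-in for struct file */
    model_loff_t f_pos;
    const char *data;
    size_t size;
};

/* Models a ->read operation: *ppos points at the caller's local variable,
 * never at f->f_pos, so locking here cannot cover the f_pos update. */
static long model_read(struct file_model *f, char *buf, size_t count,
                       model_loff_t *ppos)
{
    if (*ppos < 0 || (size_t)*ppos >= f->size)
        return 0;
    if (count > f->size - (size_t)*ppos)
        count = f->size - (size_t)*ppos;
    memcpy(buf, f->data + *ppos, count);
    *ppos += (model_loff_t)count;
    return (long)count;
}

/* Models the sys_read pattern: plain load, call ->read, plain store. */
static long model_sys_read(struct file_model *f, char *buf, size_t count)
{
    model_loff_t pos = f->f_pos;     /* unlocked load of the position */
    long ret = model_read(f, buf, count, &pos);
    f->f_pos = pos;                  /* unlocked store, as file_pos_write() does */
    return ret;
}
```

Two callers running model_sys_read() on the same file_model at the same
time would both load the old position and both store their own result,
and nothing inside model_read() could prevent that.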
We never do in-place updates of file->f_pos, but on architectures
where a 64 bit load can see incorrect data from a 64 bit store,
any concurrent read/write/llseek combinations may cause problems,
except for two concurrent llseek. Also, llseek is usually serialized
with readdir/getdents for file systems.
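
To make the 64 bit load/store concern concrete, here is a deterministic,
single-threaded sketch. It assumes a machine that writes a 64 bit value
as two 32 bit halves, low half first; the order is an assumption and real
architectures differ. struct split_pos and the helper names are
hypothetical. The "reader" is simply run between the two half stores.

```c
#include <stdint.h>

/* Hypothetical model of a 64 bit position held as two 32 bit words. */
struct split_pos {
    uint32_t lo;
    uint32_t hi;
};

static uint64_t split_load(const struct split_pos *p)
{
    return ((uint64_t)p->hi << 32) | p->lo;
}

static void store_lo(struct split_pos *p, uint64_t v)
{
    p->lo = (uint32_t)v;
}

static void store_hi(struct split_pos *p, uint64_t v)
{
    p->hi = (uint32_t)(v >> 32);
}

/* Returns the value a reader observes if it runs after the low half of
 * the update from prev to next has landed but before the high half. */
static uint64_t torn_read(uint64_t prev, uint64_t next)
{
    struct split_pos p;
    uint64_t seen;

    store_lo(&p, prev);
    store_hi(&p, prev);      /* position starts out as prev */

    store_lo(&p, next);      /* first half of the update */
    seen = split_load(&p);   /* the concurrent reader runs here */
    store_hi(&p, next);      /* second half of the update */

    return seen;
}
```

With prev = 0xFFFFFFFF and next = 0x100000000 the reader sees 0, a
position that was never stored. As long as both values fit in 32 bits
the high half does not change and the reader simply sees next.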
> > The patch looks correct, but I probably wouldn't bother with the rename,
> > and simply drop the BKL in the caller.
>
> I think a rename is better, I take compile errors over subtle
> breakage any day.
ok, fine with me.
Arnd
[1] http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=55f09ec0087c160533eab791607d92c9ce6222ae
was merged in linux-2.6.8, which opened this race.