Message-ID: <20171219224054.GV21978@ZenIV.linux.org.uk>
Date: Tue, 19 Dec 2017 22:40:54 +0000
From: Al Viro <viro@...IV.linux.org.uk>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Giuseppe Scrivano <gscrivan@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, alexander.deucher@....com,
broonie@...nel.org, chris@...is-wilson.co.uk,
David Miller <davem@...emloft.net>, deepa.kernel@...il.com,
Greg KH <gregkh@...uxfoundation.org>,
luc.vanoostenryck@...il.com, lucien xin <lucien.xin@...il.com>,
Ingo Molnar <mingo@...nel.org>,
Neil Horman <nhorman@...driver.com>,
syzkaller-bugs@...glegroups.com,
Vladislav Yasevich <vyasevich@...il.com>
Subject: Re: [PATCH linux-next] mqueue: fix IPC namespace use-after-free
On Tue, Dec 19, 2017 at 03:49:24PM -0600, Eric W. Biederman wrote:
> > what would you be delaying? kmem_cache_alloc() for struct mount and assignments
> > to its fields? That's noise; if anything, I would expect the main cost with
> > plenty of containers to be in sget() scanning the list of mqueue superblocks.
> > And we can get rid of that, while we are at it - to hell with mount_ns(), with
> > that approach we can just use mount_nodev() instead. The logic in
> > mq_internal_mount() will deal with multiple instances - if somebody has already
> > triggered creation of internal mount, all subsequent calls in that ipcns will
> > end up avoiding kern_mount_data() entirely. And if you have two callers
> > racing - sure, you will get two superblocks. Not for long, though - the first
> > one to get to setting ->mq_mnt (serialized on mq_lock) wins, the second loses
> > and promptly destroys his vfsmount and superblock. I seriously suspect that
> > variant below would cut down on the cost a whole lot more - as it is, we have
> > a total of O(N^2) spent in the loop inside sget_userns() when we create
> > N ipcns and mount in each of those; this patch should cut that to
> > O(N)...
>
> If that is where the cost is, is there any point in delaying creating
> the internal mount at all?
We won't know without the profiles... Incidentally, is there any point in
using mount_ns() for procfs? A similar scheme (with ->proc_mnt instead of
->mq_mnt, of course) would live with mount_nodev() just fine, and it's
definitely less costly - we don't bother with the loop in sget_userns()
at all that way.
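
For illustration only, here is a userspace sketch of the "create, then publish
under the lock, loser cleans up" scheme described in the quoted text. It is not
the kernel patch: every toy_* name is a hypothetical stand-in (toy_ipcns for the
ipc namespace, its lock for mq_lock, its mnt field for ->mq_mnt), a pthread
mutex stands in for the spinlock, and malloc()/free() stand in for creating and
destroying the vfsmount and superblock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's vfsmount or ipc_namespace. */
struct toy_mount {
    int id;
};

struct toy_ipcns {
    pthread_mutex_t lock;   /* plays the role of mq_lock */
    struct toy_mount *mnt;  /* plays the role of ->mq_mnt */
};

/* Mounts created and not yet destroyed; only there to make cleanup visible. */
static int toy_live_mounts;

static struct toy_mount *toy_create_mount(int id)
{
    struct toy_mount *m = malloc(sizeof(*m));

    if (m) {
        m->id = id;
        toy_live_mounts++;
    }
    return m;
}

static void toy_destroy_mount(struct toy_mount *m)
{
    free(m);
    toy_live_mounts--;
}

/*
 * Publish 'candidate' unless somebody else got there first.  The first
 * caller to set ns->mnt wins; a loser destroys its own candidate and
 * returns the winner's mount instead.
 */
static struct toy_mount *toy_publish(struct toy_ipcns *ns,
                                     struct toy_mount *candidate)
{
    struct toy_mount *winner;

    pthread_mutex_lock(&ns->lock);
    if (!ns->mnt)
        ns->mnt = candidate;
    winner = ns->mnt;
    pthread_mutex_unlock(&ns->lock);

    if (winner != candidate)
        toy_destroy_mount(candidate);   /* lost the race: clean up */
    return winner;
}

/* Lazy lookup: reuse the published mount if any, otherwise create one. */
static struct toy_mount *toy_internal_mount(struct toy_ipcns *ns, int id)
{
    struct toy_mount *m;

    pthread_mutex_lock(&ns->lock);
    m = ns->mnt;
    pthread_mutex_unlock(&ns->lock);
    if (m)
        return m;                       /* already created: no new mount */

    m = toy_create_mount(id);
    if (!m)
        return NULL;
    return toy_publish(ns, m);
}
```

Creation happens outside the lock, so two racing callers may briefly hold two
mounts; only the pointer assignment is serialized. Nothing in this path walks a
list of existing instances, which is the property that would let the real thing
skip the sget_userns() loop, assuming the rest of the scheme holds up.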