Message-ID: <20070213224131.GK22104@elte.hu>
Date: Tue, 13 Feb 2007 23:41:31 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Andi Kleen <andi@...stfloor.org>
Cc: linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Arjan van de Ven <arjan@...radead.org>,
Christoph Hellwig <hch@...radead.org>,
Andrew Morton <akpm@....com.au>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Ulrich Drepper <drepper@...hat.com>,
Zach Brown <zach.brown@...cle.com>,
Evgeniy Polyakov <johnpol@....mipt.ru>,
"David S. Miller" <davem@...emloft.net>,
Benjamin LaHaise <bcrl@...ck.org>,
Suparna Bhattacharya <suparna@...ibm.com>,
Davide Libenzi <davidel@...ilserver.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [patch 05/11] syslets: core code

* Andi Kleen <andi@...stfloor.org> wrote:
> > > > + if (!access_ok(VERIFY_WRITE, arg_ptr, sizeof(*arg_ptr)))
> > > > + return -EFAULT;
> > >
> > > It's a little unclear why you do that many individual access_ok()s.
> > > And why is the target constant sized anyways?
> >
> > each indirect pointer has to be checked separately, before
> > dereferencing it. (Andrew pointed out that the checks should be
> > VERIFY_READ; I fixed that in my tree.)
>
> But why only constant sized? It could be a variable length object,
> couldn't it?

I think what you might be missing is that it's only the 6 syscall
arguments that are fetched via indirect pointers - and each such
pointer points at a single word-sized argument, which is why every
access_ok() check is constant sized. The security checks on the
argument values themselves are then done by the system calls, just as
for any other syscall. It's a bit awkward to think about, but it is
surprisingly clean in the assembly, and it simplified syslet
programming too.
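
As an illustrative sketch (not the actual patch code - the function
name is made up, and it assumes the atom holding the six indirect
pointers has already been copied into kernel space):

	#include <asm/uaccess.h>

	static long fetch_syslet_args(long __user **arg_ptrs, long *args)
	{
		int i;

		for (i = 0; i < 6; i++) {
			long __user *arg_ptr = arg_ptrs[i];

			/* check each indirect pointer separately,
			   before dereferencing it: */
			if (!access_ok(VERIFY_READ, arg_ptr, sizeof(*arg_ptr)))
				return -EFAULT;
			if (__get_user(args[i], arg_ptr))
				return -EFAULT;
		}
		return 0;
	}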
> > get_user_pages() would have to be limited in some way - and I didn't
> > want
>
> If you only use it for a small ring buffer it is naturally limited.

yeah, but 'small' is a dangerous word when it comes to adding IO
interfaces ;-)

> > a single page is enough for 1024 completion pointers - that's more
> > than enough for most purposes - and the default mlock limit is 40K.
>
> Then limit it to a single page and use gup
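
(For reference, the suggested single-page gup approach would look
roughly like this - a sketch only, using the 2.6-era get_user_pages()
signature; pin_ring_page() is a made-up name:)

	#include <linux/mm.h>
	#include <linux/sched.h>

	static struct page *pin_ring_page(unsigned long uaddr)
	{
		struct page *page;
		int ret;

		down_read(&current->mm->mmap_sem);
		ret = get_user_pages(current, current->mm,
				     uaddr & PAGE_MASK,
				     1,	/* exactly one page */
				     1,	/* write access */
				     0,	/* no force */
				     &page, NULL);
		up_read(&current->mm->mmap_sem);

		return ret == 1 ? page : NULL;
	}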

1024 completion pointers per page (512 on 64-bit) is a lot, but not
ALOT. It is also certainly not ALOOOOT :-) Really, people will want to
have more than 512 disks/spindles in the same box. I have used such a
beast myself. For Tux workloads and benchmarks we had parallelism
levels of millions of pending requests (!) on a single system -
networking, socket limits and disk IO combined with thousands of
clients do create such scenarios. I really think that such 'pinned
pages' are a pretty natural fit for sys_mlock() and RLIMIT_MEMLOCK,
and since the kernel side is careful to use the _inatomic() uaccess
methods, it's safe (and fast) as well.
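
To illustrate the userspace side (a hedged sketch - the one-page ring
size and the idea of handing the buffer to the kernel are placeholders,
not the syslet ABI):

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <sys/resource.h>
	#include <unistd.h>

	int main(void)
	{
		long page_size = sysconf(_SC_PAGESIZE);
		struct rlimit rl;
		void *ring;

		/* one page of completion pointers must fit within
		   RLIMIT_MEMLOCK (40K by default, as noted above): */
		if (getrlimit(RLIMIT_MEMLOCK, &rl) ||
		    rl.rlim_cur < (rlim_t)page_size) {
			fprintf(stderr, "RLIMIT_MEMLOCK too low\n");
			return 1;
		}

		/* page-aligned allocation, so mlock() pins exactly
		   one page: */
		if (posix_memalign(&ring, page_size, page_size))
			return 1;
		memset(ring, 0, page_size);

		/* the pinned page counts against RLIMIT_MEMLOCK: */
		if (mlock(ring, page_size)) {
			perror("mlock");
			return 1;
		}

		/* ... register 'ring' with the kernel as the
		   completion ring ... */

		munlock(ring, page_size);
		free(ring);
		return 0;
	}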
Ingo