[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070213222443.GH22104@elte.hu>
Date: Tue, 13 Feb 2007 23:24:43 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Andi Kleen <andi@...stfloor.org>
Cc: linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Arjan van de Ven <arjan@...radead.org>,
Christoph Hellwig <hch@...radead.org>,
Andrew Morton <akpm@....com.au>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Ulrich Drepper <drepper@...hat.com>,
Zach Brown <zach.brown@...cle.com>,
Evgeniy Polyakov <johnpol@....mipt.ru>,
"David S. Miller" <davem@...emloft.net>,
Benjamin LaHaise <bcrl@...ck.org>,
Suparna Bhattacharya <suparna@...ibm.com>,
Davide Libenzi <davidel@...ilserver.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [patch 05/11] syslets: core code
* Andi Kleen <andi@...stfloor.org> wrote:
> Ingo Molnar <mingo@...e.hu> writes:
>
> > +
> > +static struct async_thread *
> > +pick_ready_cachemiss_thread(struct async_head *ah)
>
> The cachemiss names are confusing. I assume that's just a left over
> from Tux?
yeah. Although 'stuff goes async' is quite similar to a cachemiss. We
didnt have some resource available right now so the syscall has to block
== i.e. some cache was not available.
> > +
> > + memset(atom->args, 0, sizeof(atom->args));
> > +
> > + ret |= __get_user(arg_ptr, &uatom->arg_ptr[0]);
> > + if (!arg_ptr)
> > + return ret;
> > + if (!access_ok(VERIFY_WRITE, arg_ptr, sizeof(*arg_ptr)))
> > + return -EFAULT;
>
> It's a little unclear why you do that many individual access_ok()s.
> And why is the target constant sized anyways?
each indirect pointer has to be checked separately, before dereferencing
it. (Andrew pointed out that they should be VERIFY_READ, i fixed that in
my tree)
it looks a bit scary in C but the assembly code is very fast and quite
straightforward.
> + /*
> + * Lock down the ring. Note: user-space should not munlock() this,
> + * because if the ring pages get swapped out then the async
> + * completion code might return a -EFAULT instead of the expected
> + * completion. (the kernel safely handles that case too, so this
> + * isnt a security problem.)
> + *
> + * mlock() is better here because it gets resource-accounted
> + * properly, and even unprivileged userspace has a few pages
> + * of mlock-able memory available. (which is more than enough
> + * for the completion-pointers ringbuffer)
> + */
>
> If it's only a few pages you don't need any resource accounting. If
> it's more then it's nasty to steal the users quota. I think plain
> gup() would be better.
get_user_pages() would have to be limited in some way - and i didnt want
to add yet another wacky limit thing - so i just used the already
existing mlock() infrastructure for this. If Oracle wants to set up a 10
MB ringbuffer, they can set the PAM resource limits to 11 MB and still
have enough stuff left. And i dont really expect GPG to start using
syslets - just yet ;-)
a single page is enough for 1024 completion pointers - that's more than
enough for most purposes - and the default mlock limit is 40K.
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists