[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFzBXzQS90XYZKHRr0N66uBrDmS=o_U9u=e4qq2CzTg8ww@mail.gmail.com>
Date: Tue, 26 Nov 2013 12:03:11 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Andi Kleen <andi@...stfloor.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH] Add a text_poke syscall v2
On Tue, Nov 26, 2013 at 11:05 AM, Andy Lutomirski <luto@...capital.net> wrote:
>
> IIRC someone proposed that, rather than specifying a "handler", that any
> user thread that traps just wait until the poke completes. This would
> complicate the kernel implementation a bit, but it would make the user
> code a good deal simpler. Is there any reason that this is a bad idea?
Please do it this way instead, because user space will get the
callback version wrong and then - because it never actually triggers
in practice in normal situations - it will cause very *very* subtle
bugs that we can't fix.
Making the kernel serialize the accesses is the right thing to do.
Just a new per-mm mutex should trivially do it, then you don't even
have to check the "current->mm == bp_target_mm" thing at all, you just
make the bp handler do a simple
if (mutex_lock_interruptible(&mm->text_poke_mutex) >= 0)
mutex_unlock(&mm->text_poke_mutex);
and return. All done.
Plus the callback thing is pointless if we can do the instruction
switch atomically (which would be true for UP, for single-thread, and
potentially for certain sizes/alignments coupled with known rules for
particular micro-architectures). So it's not a particularly good
interface anyway.
Btw, I also think that there's a separate problem wrt shared pages.
Should we perhaps only allow this in private mappings? Because right
now it has that "current->mm == bp_target_mm" thing, and generally it
only works on one particular mm, but by using "get_user_pages_fast(,
1,..)" it really only requires write permissions on the page. So it
could be shared mapping, and I could easily see people doing that on
purpose ("open executable file, then use text_poke() to change it for
this architecture") and THAT DOES NOT WORK with the current patch if
somebody else is also running that app..
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists