linux-kernel - Re: [uml-devel] SYSCALL, ptrace and syscall restart breakages (Re: [RFC] weird crap with vdso on uml/i386)


Message-ID: <20110823021717.GA2203@ZenIV.linux.org.uk>
Date:	Tue, 23 Aug 2011 03:17:18 +0100
From:	Al Viro <viro@...IV.linux.org.uk>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	"H. Peter Anvin" <hpa@...or.com>, Andrew Lutomirski <luto@....edu>,
	Borislav Petkov <bp@...64.org>, Ingo Molnar <mingo@...nel.org>,
	"user-mode-linux-devel@...ts.sourceforge.net" 
	<user-mode-linux-devel@...ts.sourceforge.net>,
	Richard Weinberger <richard@....at>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"mingo@...hat.com" <mingo@...hat.com>
Subject: Re: [uml-devel] SYSCALL, ptrace and syscall restart breakages (Re:
 [RFC] weird crap with vdso on uml/i386)

On Tue, Aug 23, 2011 at 02:13:12AM +0100, Al Viro wrote:
> *UGH*.  OK,
> 	1) I'm an idiot; int_ret_from_sys_call does *not* usually step on
> rbp (it's callee-saved).  So normally ebp is left as is on the way out,
> which is why we don't see stuff getting buggered left, right and center.
> 	2) Sometimes it apparently does somehow happen.  I don't see where
> it happens yet, but uml breakage that started all of that looks *exactly*
> like that.  %ebp getting arg6 in it when we return into __kernel_vsyscall()
> from the kernel fits the observed pattern precisely.
> 	3) modulo that the situation is nowhere near as bad as I thought.
> Brown paperbag time for me - for missing that if my analysis had been correct
> we'd've seen breakage _much_ earlier.  Mea culpa.
> 	4) we still have a problem, apparently, but it's more narrow now -
> the question is when would %rbp be shat into?
> 
> Al, off to apply a serious self-LART...

So it smells like a nasty effect of PTRACE_SETREGS/PTRACE_POKEUSER on the
way *out* of a syscall (the same on the way in wouldn't have such an effect -
the modified EBP value would simply be lost after the syscall; passed to it
as arg6, but that's it).

All right, now I have a nice shiny reproducer for uml folks:
#include <unistd.h>

int main(void)
{
	char *s = sbrk(8192);
	*s = 0;		/* dirty the page; without this there is no segfault */
	brk(s);
	return 0;
}
will do it on the affected boxen.  It gets fucked in the second call of
brk().  What happens is this:
brk(3) in libc:
	about to call brk(2), will have to stomp on %ebx
	save ebx into ecx for the duration of call
	the new brk level into ebx, syscall number into eax
	hit __kernel_vsyscall()
		push ebp
		mov ecx, ebp
ecx is going to get stomped on, save it (i.e. original ebx) into ebp
		syscall
and at that point ebp has changed - it became equal to arg6, aka what we'd
put on stack, aka the value of ebp prior to all of that, aka the frame
pointer of caller.
		use ecx to set ss back to sanity
		mov ebp, ecx
and now ecx is buggered.
		pop ebp
the value of ebp has actually not been changed by that.
		ret
	put the value of ecx back into ebx
only it's not the value we used to have there.  Not anymore.  Now we are about
to store the return value of brk(2) into a static variable.  And that's where
it really hits the fan, since we are in PIC code and ebx is not what it
used to be.  So instead of that variable we access hell knows what address
and promptly segfault.
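
The register shuffle in the trace above can be replayed with a toy model.
This is only a sketch: struct regs, fake_syscall(), model_kernel_vsyscall()
and model_brk_wrapper() are made-up names for illustration, not libc or
kernel code, and the leak_arg6 flag simply assumes the misbehaviour described
above (ebp coming back from the syscall holding arg6) without saying where
it comes from.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the registers in the trace above; not kernel or libc code. */
struct regs {
	uint32_t eax, ebx, ecx, ebp;
	uint32_t stack[4];	/* tiny stand-in for the user stack */
	int sp;			/* number of stack slots in use */
};

/*
 * Hypothetical stand-in for the host kernel.  ecx never survives the
 * syscall instruction; leak_arg6 selects the misbehaving exit where ebp
 * comes back holding arg6, i.e. the word on top of the user stack.
 */
static void fake_syscall(struct regs *r, int leak_arg6)
{
	r->eax = 0;			/* pretend brk(2) succeeded */
	r->ecx = 0xdeadbeef;		/* stomped on by the instruction */
	if (leak_arg6)
		r->ebp = r->stack[r->sp - 1];
}

/* Same sequence of steps as the __kernel_vsyscall() part of the trace. */
static void model_kernel_vsyscall(struct regs *r, int leak_arg6)
{
	assert(r->sp < 4);
	r->stack[r->sp++] = r->ebp;	/* push ebp */
	r->ebp = r->ecx;		/* mov ecx, ebp: stash the saved ebx */
	fake_syscall(r, leak_arg6);	/* syscall */
	r->ecx = 0;			/* ecx used as scratch to reload ss */
	r->ecx = r->ebp;		/* mov ebp, ecx */
	r->ebp = r->stack[--r->sp];	/* pop ebp */
}

/* Models the libc wrapper; returns what ends up in ebx after the call. */
uint32_t model_brk_wrapper(uint32_t pic_base, uint32_t frame_ptr,
			   int leak_arg6)
{
	struct regs r = { .ebx = pic_base, .ebp = frame_ptr };

	r.ecx = r.ebx;		/* save ebx into ecx for the call */
	r.ebx = 0x0804a000;	/* new brk level (arbitrary value) */
	r.eax = 45;		/* __NR_brk on i386 */
	model_kernel_vsyscall(&r, leak_arg6);
	r.ebx = r.ecx;		/* put the value of ecx back into ebx */
	return r.ebx;
}
```

With leak_arg6 == 0 the wrapper gets its original ebx back; with
leak_arg6 == 1 it gets the caller's frame pointer instead, which is the
value PIC code then uses as its base.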

I have a very strong suspicion that I know what will turn out to be involved
in that - the page eviction done by sys_brk().  Note that dirtying this
sucker is really necessary - without *s = 0 it won't segfault at all.  With
it we get the segfault described above.

And page eviction on uml is nasty and convoluted as hell.  It has to do
munmap() on the process's VM.  Which is done in a rather sick way - we have a
stub present in the address space of every process, with a function that
does a given series of mmap/munmap/mprotect and traps itself.  The guest
kernel puts arguments for that sucker into a shared data page and continues
the process into that function.  Once it's done, we get the damn thing
stopped again, nice and ready for us to continue dealing with it.
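
A rough sketch of the shape of such a stub, for illustration only.  This is
NOT the actual uml stub: struct stub_op, struct stub_data and stub_run() are
hypothetical names, the real thing would not go through libc wrappers, and
the layout of the shared data page is an assumption made up for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>

enum stub_op_kind { STUB_MMAP, STUB_MUNMAP, STUB_MPROTECT };

/* One requested operation (hypothetical layout). */
struct stub_op {
	enum stub_op_kind kind;
	void *addr;
	size_t len;
	int prot;		/* ignored for STUB_MUNMAP */
};

#define STUB_MAX_OPS 16

/* Stand-in for the shared data page filled in by the guest kernel. */
struct stub_data {
	int nr_ops;
	int done;		/* how many ops completed successfully */
	struct stub_op ops[STUB_MAX_OPS];
};

/*
 * Run the requested series of operations, stopping at the first failure,
 * then optionally trap so that a tracer gets control back.  Without a
 * tracer attached, SIGTRAP would simply kill the process.
 */
int stub_run(struct stub_data *d, int trap_when_done)
{
	assert(d->nr_ops >= 0 && d->nr_ops <= STUB_MAX_OPS);
	d->done = 0;
	for (int i = 0; i < d->nr_ops; i++) {
		const struct stub_op *op = &d->ops[i];
		int ok = 0;

		switch (op->kind) {
		case STUB_MMAP:
			ok = mmap(op->addr, op->len, op->prot,
				  MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
				  -1, 0) != MAP_FAILED;
			break;
		case STUB_MUNMAP:
			ok = munmap(op->addr, op->len) == 0;
			break;
		case STUB_MPROTECT:
			ok = mprotect(op->addr, op->len, op->prot) == 0;
			break;
		}
		if (!ok)
			break;
		d->done++;
	}
	if (trap_when_done)
		raise(SIGTRAP);
	return d->done;
}
```

In the scheme described above the tracer side would fill in the data page,
point the stopped process at the stub, continue it and wait for the trap;
every one of those steps is a ptrace() call on the host.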

Something in that shitstorm of ptrace() calls ends up doing SETREGS
when the victim sits on the way out of a (host) syscall.  Boom...