Date:	Thu, 22 Nov 2012 04:30:43 +1100 (EST)
From:	u3557@...o.sublimeip.com (Amnon Shiloh)
To:	oleg@...hat.com (Oleg Nesterov)
Cc:	rostedt@...dmis.org (Steven Rostedt),
	fweisbec@...il.com (Frederic Weisbecker),
	mingo@...hat.com (Ingo Molnar),
	a.p.zijlstra@...llo.nl (Peter Zijlstra),
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arch_check_bp_in_kernelspace: fix the range check

Hi Oleg,

Yes, I can see that "arch/x86/kernel/vsyscall_64.c"
has changed dramatically since I last looked at it.

Since this is the case, I no longer need to trap the vsyscall page.

Now that "vsyscall" has effectively been replaced by the vdso, however,
this creates a new problem for me, and probably for anyone else who
uses some form of checkpoint/restore:

Suppose a process is checkpointed because the system needs to reboot
for a kernel upgrade, and is then restored on the new, different kernel.
The VDSO page saved with the process may no longer match the new kernel:
it could, for example, fetch data from addresses in the vsyscall page
that now contain different things; or, if the hardware was also changed,
it may use machine instructions that are now illegal.

As I don't mind forgoing the "fast" sys_time(), my obvious solution
is to disable the vdso for traced processes that may be checkpointed.

One way to do it would be by brute force: straight after "execve",
unmap the tracee's vdso page, then manipulate the ELF tables in
its memory so that the VDSO entry is gone and the library will not go
looking for it.  Alternatively, the function table within the VDSO
page can be erased.
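
As a rough illustration of the "ELF tables" step, assuming that what the
library consults is the AT_SYSINFO_EHDR entry of the auxiliary vector:
the sketch below rewrites that entry to AT_IGNORE in a local copy of the
vector.  The helper name hide_vdso_in_auxv() is hypothetical; a real
tracer would presumably have to apply the same change to the tracee's
stack (for example with PTRACE_POKEDATA) before the dynamic loader reads
the vector, and would still need to unmap the page separately.

```c
#include <assert.h>
#include <elf.h>     /* Elf64_auxv_t, AT_NULL, AT_IGNORE, AT_SYSINFO_EHDR */
#include <stddef.h>

/*
 * Hypothetical helper (not an existing API): walk an AT_NULL-terminated
 * copy of a 64-bit auxiliary vector and turn every AT_SYSINFO_EHDR entry
 * (the vdso base address) into AT_IGNORE, so that code scanning the
 * vector no longer finds a vdso.
 *
 * Returns the number of entries that were changed.
 */
int hide_vdso_in_auxv(Elf64_auxv_t *auxv)
{
	int changed = 0;

	assert(auxv != NULL);

	for (; auxv->a_type != AT_NULL; auxv++) {
		if (auxv->a_type == AT_SYSINFO_EHDR) {
			auxv->a_type = AT_IGNORE;   /* entry is skipped by readers */
			auxv->a_un.a_val = 0;       /* do not leak the old address */
			changed++;
		}
	}
	return changed;
}
```

This only covers the 64-bit layout; a 32-bit tracee would use Elf32_auxv_t
and may also carry an AT_SYSINFO entry.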

I just wonder whether you know of an easier and more standard way
to disable the vdso for user-mode code - ideally on a per-process basis,
or otherwise, if that is too hard, on the whole computer.  I searched
the web and found references to "/proc/sys/vm/vdso_enable", but I
have no such file or "sysctl" option on my system.
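
Whichever method ends up being used, one way to check whether a process
still has a vdso mapped (and to find the range to unmap) is to look for
the "[vdso]" line in /proc/<pid>/maps.  A minimal sketch of parsing one
such line follows; parse_vdso_line() is a hypothetical helper name, and
reading the file itself is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an existing API): given one line of
 * /proc/<pid>/maps, report whether it describes the vdso mapping and,
 * if so, store its start and end addresses.
 *
 * Returns 1 for a well-formed "[vdso]" line, 0 otherwise.
 */
int parse_vdso_line(const char *line,
		    unsigned long long *start, unsigned long long *end)
{
	unsigned long long lo, hi;

	assert(line != NULL && start != NULL && end != NULL);

	if (strstr(line, "[vdso]") == NULL)
		return 0;                       /* some other mapping */
	if (sscanf(line, "%llx-%llx", &lo, &hi) != 2)
		return 0;                       /* malformed address range */

	*start = lo;
	*end = hi;
	return 1;
}
```

A caller would feed it each line read from the maps file and stop at the
first line for which it returns 1.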

Best Regards,
Amnon.


> 
> Hi Amnon,
> 
> Please read my previous email ;)
> http://marc.info/?l=linux-kernel&m=135342649119153
> 
> On 11/21, u3557@...o.sublimeip.com wrote:
> >
> > Hi Oleg,
> >
> > > Or. Perhaps we can define TRAP_VSYSCALL and change emulate_vsyscall() to
> > > do
> > >
> > >
> > > 	if (current->ptrace && test_thread_flag(TIF_SYSCALL_TRACE))
> > > 		send_sigtrap(TRAP_VSYSCALL, ...);
> > >
> > > if it returns true?
> > >
> >
> > I wish it were possible, but the vsyscall page is entered in user-mode,
> 
> Only in NATIVE mode. emulate_vsyscall() runs in kernel mode.
> 
> And in the NATIVE mode PTRACE_SYSCALL should work just fine, because:
> 
> > The vsyscall page was designed in order to avoid user/kernel context
> > switches,
> 
> True, it was. But not today. Please look at __vsyscall_page:
> 
> 	__vsyscall_page:
> 
> 		mov $__NR_gettimeofday, %rax
> 		syscall
> 		ret
> 
> If you want the "fast" sys_time() without entering the kernel, you can
> use __vdso_time(). And since vdso has the user-space mapping you can
> insert "int3" or use hw breakpoints.
> 
> At least this is my understanding after I glanced at the new implementation.
> 
> 
> However. It is not that I think that TRAP_VSYSCALL is really good idea.
> At least it needs another option...
> 
> Oleg.
> 
> 

