Message-ID: <CALCETrWUGuWUPLHy2UwJHfX=c1fDwpL9hk1NwO8fkkHh8gUpnA@mail.gmail.com>
Date: Thu, 10 Sep 2015 12:17:34 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Denys Vlasenko <dvlasenk@...hat.com>
Cc: Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>,
Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>, Oleg Nesterov <oleg@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Alexei Starovoitov <ast@...mgrid.com>,
Will Drewry <wad@...omium.org>,
Kees Cook <keescook@...omium.org>, X86 ML <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 RESEND] x86/asm/entry/32, selftests: Add 'test_syscall_vdso' test
On Thu, Sep 10, 2015 at 12:04 PM, Denys Vlasenko <dvlasenk@...hat.com> wrote:
> On 09/10/2015 12:01 AM, Andy Lutomirski wrote:
>> On Mon, Sep 7, 2015 at 8:56 AM, Denys Vlasenko <dvlasenk@...hat.com> wrote:
>>> This new test checks that all x86 registers are preserved across
>>> 32-bit syscalls. It tests syscalls through VDSO (if available)
>>> and through INT 0x80, normally and under ptrace.
>>>
>>> If the kernel is a 64-bit one, high registers (r8..r15) are poisoned
>>> before the syscall is called and are checked afterwards.
>>>
>>> They must be either preserved, or cleared to zero (but r11 is special);
>>> r12..15 must be preserved for INT 0x80.
>>>
>>> EFLAGS is checked for changes too, but a change there is not
>>> considered to be a bug (paravirt kernels do not preserve
>>> arithmetic flags).
>>>
>>> Run-tested on 64-bit kernel:
>>>
>>> $ ./test_syscall_vdso_32
>>> [RUN] Executing 6-argument 32-bit syscall via VDSO
>>> [OK] Arguments are preserved across syscall
>>> [NOTE] R11 has changed:0000000000200ed7 - assuming clobbered by SYSRET insn
>>> [OK] R8..R15 did not leak kernel data
>>> [RUN] Executing 6-argument 32-bit syscall via INT 80
>>> [OK] Arguments are preserved across syscall
>>> [OK] R8..R15 did not leak kernel data
>>> [RUN] Running tests under ptrace
>>> [RUN] Executing 6-argument 32-bit syscall via VDSO
>>> [OK] Arguments are preserved across syscall
>>> [OK] R8..R15 did not leak kernel data
>>> [RUN] Executing 6-argument 32-bit syscall via INT 80
>>> [OK] Arguments are preserved across syscall
>>> [OK] R8..R15 did not leak kernel data
>>>
>>> On 32-bit paravirt kernel:
>>>
>>> $ ./test_syscall_vdso_32
>>> [NOTE] Not a 64-bit kernel, won't test R8..R15 leaks
>>> [RUN] Executing 6-argument 32-bit syscall via VDSO
>>> [WARN] Flags before=0000000000200ed7 id 0 00 o d i s z 0 a 0 p 1 c
>>> [WARN] Flags after=0000000000200246 id 0 00 i z 0 0 p 1
>>> [WARN] Flags change=0000000000000c91 0 00 o d s 0 a 0 0 c
>>> [OK] Arguments are preserved across syscall
>>> [RUN] Executing 6-argument 32-bit syscall via INT 80
>>> [OK] Arguments are preserved across syscall
>>> [RUN] Running tests under ptrace
>>> [RUN] Executing 6-argument 32-bit syscall via VDSO
>>> [OK] Arguments are preserved across syscall
>>> [RUN] Executing 6-argument 32-bit syscall via INT 80
>>> [OK] Arguments are preserved across syscall
>>>
>>> Signed-off-by: Denys Vlasenko <dvlasenk@...hat.com>
>>> CC: Linus Torvalds <torvalds@...ux-foundation.org>
>>> CC: Steven Rostedt <rostedt@...dmis.org>
>>> CC: Ingo Molnar <mingo@...nel.org>
>>> CC: Borislav Petkov <bp@...en8.de>
>>> CC: "H. Peter Anvin" <hpa@...or.com>
>>> CC: Andy Lutomirski <luto@...capital.net>
>>> CC: Oleg Nesterov <oleg@...hat.com>
>>> CC: Frederic Weisbecker <fweisbec@...il.com>
>>> CC: Alexei Starovoitov <ast@...mgrid.com>
>>> CC: Will Drewry <wad@...omium.org>
>>> CC: Kees Cook <keescook@...omium.org>
>>> CC: x86@...nel.org
>>> CC: linux-kernel@...r.kernel.org
>>
>> Acked-by: Andy Lutomirski <luto@...nel.org>
>>
>> with minor caveats below, none of which are show-stoppers...
>>
>>> + /* INT80 syscall entrypoint can be used by
>>> + * 64-bit programs too, unlike SYSCALL/SYSENTER.
>>> + * Therefore it must preserve R12+
>>> + * (they are callee-saved registers in 64-bit C ABI).
>>> + *
>>> + * This was probably historically not intended,
>>> + * but R8..11 are clobbered (cleared to 0).
>>> + * IOW: they are the only registers which aren't
>>> + * preserved across INT80 syscall.
>>> + */
>>> + if (*r64 == 0 && num <= 11)
>>> + continue;
>>
>> Ugh. I'll change my big entry patchset to preserve these and maybe to
>> preserve all of the 64-bit regs.
>
> If you do that, this won't change the ABI: we don't _promise_
> to save them. If we accidentally do, that means nothing.
>
> If you do that, the test won't fail. The code above does
> not require regs to be 0 - there is further code which
> also allows them to be unchanged.
>
> (I'm not very comfortable about additional six push/pops
> which are necessary for this to happen. I'm surprised
> maintainers tentatively agreed to that -
> I was grilled and asked to prove with measurements
> that *one* additional push+pop wasn't adding significant overhead).

I suspect that I need to make the series faster.

Also, int $0x80 isn't a fast path for any legitimate use case except
Debian, and I would argue that Debian is just buggy.
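
For reference, the acceptance rule discussed above for INT 0x80 can be
sketched in C roughly as follows. This is only an illustration under
stated assumptions, not the selftest's actual code: the names
check_reg_int80, reg_verdict and POISON are hypothetical, and the sketch
covers only the INT 0x80 path (the R11/SYSRET special case on the VDSO
path is left out).

```c
#include <stdint.h>

/* Hypothetical poison pattern; the real selftest may use other values. */
#define POISON 0x7f7f7f7f7f7f7f7fULL

enum reg_verdict {
	REG_PRESERVED,	/* value unchanged across the syscall */
	REG_CLEARED,	/* value zeroed, tolerated for r8..r11 only */
	REG_BAD		/* anything else: clobbered or leaked */
};

/*
 * Classify one of r8..r15 after an INT 0x80 syscall.
 * num is the register number (8..15); before/after are its values
 * around the syscall.
 */
static enum reg_verdict check_reg_int80(int num, uint64_t before,
					uint64_t after)
{
	/* An unchanged register is always acceptable. */
	if (after == before)
		return REG_PRESERVED;
	/* r8..r11 may be cleared to 0; r12..r15 must be preserved. */
	if (after == 0 && num <= 11)
		return REG_CLEARED;
	return REG_BAD;
}
```

With this rule, a kernel that starts preserving r8..r11 still passes,
since an unchanged register is accepted before the zero check is reached.
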
--Andy
--
Andy Lutomirski
AMA Capital Management, LLC