

Message-ID: <CABqD9hbhcEr0idfBVb9sXiAP1rBVUtzWmORq0WL1MF-eWW-nvQ@mail.gmail.com>
Date:	Fri, 13 Jan 2012 17:10:41 -0600
From:	Will Drewry <wad@...omium.org>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	linux-kernel@...r.kernel.org, keescook@...omium.org,
	john.johansen@...onical.com, serge.hallyn@...onical.com,
	coreyb@...ux.vnet.ibm.com, pmoore@...hat.com, eparis@...hat.com,
	djm@...drot.org, torvalds@...ux-foundation.org,
	segoon@...nwall.com, rostedt@...dmis.org, jmorris@...ei.org,
	scarybeasts@...il.com, avi@...hat.com, penberg@...helsinki.fi,
	viro@...iv.linux.org.uk, luto@....edu, mingo@...e.hu,
	akpm@...ux-foundation.org, khilman@...com, borislav.petkov@....com,
	amwang@...hat.com, ak@...ux.intel.com, eric.dumazet@...il.com,
	gregkh@...e.de, dhowells@...hat.com, daniel.lezcano@...e.fr,
	linux-fsdevel@...r.kernel.org,
	linux-security-module@...r.kernel.org, olofj@...omium.org,
	mhalcrow@...gle.com, dlaor@...hat.com,
	Roland McGrath <mcgrathr@...omium.org>,
	Andi Kleen <andi@...stfloor.org>
Subject: Re: [RFC,PATCH 1/2] seccomp_filters: system call filtering using BPF

On Fri, Jan 13, 2012 at 1:01 PM, Will Drewry <wad@...omium.org> wrote:
> On Fri, Jan 13, 2012 at 11:31 AM, Oleg Nesterov <oleg@...hat.com> wrote:
>> On 01/12, Will Drewry wrote:
>>>
>>> On Thu, Jan 12, 2012 at 11:23 AM, Oleg Nesterov <oleg@...hat.com> wrote:
>>> > On 01/12, Will Drewry wrote:
>>> >>
>>> >> On Thu, Jan 12, 2012 at 10:22 AM, Oleg Nesterov <oleg@...hat.com> wrote:
>>> >> >> +      */
>>> >> >> +     regs = seccomp_get_regs(regs_tmp, &regs_size);
>>> >> >
>>> >> > Stupid question. I am sure you know what are you doing ;) and I know
>>> >> > nothing about !x86 arches.
>>> >> >
>>> >> > But could you explain why it is designed to use user_regs_struct ?
>>> >> > Why we can't simply use task_pt_regs() and avoid the (costly) regsets?
>>> >>
>>> >> So on x86 32, it would work since user_regs_struct == task_pt_regs
>>> >> (iirc), but on x86-64 and others, that's not true.
>>> >
>>> > Yes sure, I meant that userpace should use pt_regs too.
>>> >
>>> >> If it would be appropriate to expose pt_regs to userspace, then I'd
>>> >> happily do so :)
>>> >
>>> > Ah, so that was the reason. But it is already exported? At least I see
>>> > the "#ifndef __KERNEL__" definition in arch/x86/include/asm/ptrace.h.
>>> >
>>> > Once again, I am not arguing, just trying to understand. And I do not
>>> > know if this definition is part of abi.
>>>
>>> I don't either :/  My original idea was to operate on task_pt_regs(current),
>>> but I noticed that PTRACE_GETREGS/SETREGS only uses the
>>> user_regs_struct. So I went that route.
>>
>> Well, I don't know where user_regs_struct come from initially. But
>> probably it is needed to allow to access the "artificial" things like
>> fs_base. Or perhaps this struct mimics the layout in the coredump.
>
> Not sure - added Roland whose name was on many of the files :)
>
> I just noticed that ptrace ABI allows pt_regs access using the register
> macros (PTRACE_PEEKUSR) and user_regs_struct access (PTRACE_GETREGS).
>
> But I think the latter is guaranteed to have a certain layout while the macros
> for PEEKUSR can do post-processing fixup.  (Which could be done in the
> bpf evaluator load_pointer() helper if needed.)
>
>>> I'd love for pt_regs to be fair game to cut down on the copying!
>>
>> Me too. I see no point in using user_regs_struct.
>
> I'll rev the change to use pt_regs and drop all the helper code.  If
> no one says otherwise, that certainly seems ideal from a performance
> perspective, and I see pt_regs exported to userland along with ptrace
> abi register offset macros.

On second thought, pt_regs is scary :)

From looking at
  http://lxr.linux.no/linux+v3.2.1/arch/x86/include/asm/syscall.h#L97
and the ia32 syscall entry code, it appears that, for x86 at least, the
pt_regs for compat processes will be 8 bytes wide per register on the
stack.  This means that if a self-filtering 32-bit program runs on a 64-bit
host under IA32_EMU, its filters will always index into pt_regs incorrectly.
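
To make the indexing problem concrete, here is a minimal sketch. The two
structs are hypothetical, simplified stand-ins (not the kernel's real pt_regs
definitions), and load32() and filter_view_of_dx() are names invented for
illustration. It assumes the only difference between the two views is the
width of each register slot:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified layouts for illustration only.
 * These are NOT the kernel's real pt_regs definitions.
 */
struct toy_regs32 { uint32_t bx, cx, dx, si; }; /* what a 32-bit filter author assumes */
struct toy_regs64 { uint64_t bx, cx, dx, si; }; /* 8-byte slots, as saved by a 64-bit kernel */

/* Read 32 bits at a byte offset, roughly what a BPF word load would do. */
static uint32_t load32(const void *base, size_t size, size_t off)
{
	uint32_t v;

	assert(off + sizeof(v) <= size);
	memcpy(&v, (const unsigned char *)base + off, sizeof(v));
	return v;
}

/* Value a filter sees when it indexes "dx" using the 32-bit offset. */
static uint32_t filter_view_of_dx(const struct toy_regs64 *regs)
{
	return load32(regs, sizeof(*regs), offsetof(struct toy_regs32, dx));
}
```

With these toy layouts, the 32-bit offset of dx is 8, which lands on the cx
slot of the 8-byte-wide layout, so the filter would compare against the wrong
register.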

I'm not 100% sure that I am reading the code right, but if I am, I can either
keep using user_regs_struct or fork the code's behavior based on compat. The
latter would need to be arch-dependent, which is pretty rough.
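
As a rough sketch of what the compat fork could look like on one arch:
compat_fixup_offset() is a hypothetical helper, not an existing kernel
function. It assumes the compat and native layouts differ only in slot width
(4 versus 8 bytes) and that the host is little-endian; any real fixup would
also have to account for per-arch differences in register order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing kernel function.
 * Translates a byte offset computed against 4-byte register slots into
 * the matching offset within 8-byte slots.  Assumes slot width is the
 * ONLY layout difference and that the host is little-endian, so the low
 * 32 bits of each value sit at the start of its slot.
 */
static size_t compat_fixup_offset(size_t off32, int is_compat)
{
	size_t slot, within;

	if (!is_compat)
		return off32;               /* native task: offset is already right */

	slot = off32 / sizeof(uint32_t);    /* which register the filter meant */
	within = off32 % sizeof(uint32_t);  /* byte position inside that register */
	return slot * sizeof(uint64_t) + within;
}
```

Under those assumptions, a compat filter's offset 8 (third register) would be
redirected to offset 16, while a native filter's offsets pass through
unchanged.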

Any thoughts?

I'll do a v5 rev for Eric's comments soon, but I'm not quite sure about the
pt_regs change yet.  If the performance boost is worth the effort of having a
per-arch fixup, I can go that route.  Otherwise, I could look at some
alternate approach for a faster-than-regview payload.

Thanks!