Message-ID: <CABqD9hbjGYA-jAOe-3CZEUV3MG2Qgs6SJ2irN7N+JMB2wj-mzA@mail.gmail.com>
Date: Thu, 12 Jan 2012 11:10:55 -0600
From: Will Drewry <wad@...omium.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: linux-kernel@...r.kernel.org, keescook@...omium.org,
john.johansen@...onical.com, serge.hallyn@...onical.com,
coreyb@...ux.vnet.ibm.com, pmoore@...hat.com, eparis@...hat.com,
djm@...drot.org, torvalds@...ux-foundation.org,
segoon@...nwall.com, rostedt@...dmis.org, jmorris@...ei.org,
scarybeasts@...il.com, avi@...hat.com, penberg@...helsinki.fi,
viro@...iv.linux.org.uk, luto@....edu, mingo@...e.hu,
akpm@...ux-foundation.org, khilman@...com, borislav.petkov@....com,
amwang@...hat.com, ak@...ux.intel.com, eric.dumazet@...il.com,
gregkh@...e.de, dhowells@...hat.com, daniel.lezcano@...e.fr,
linux-fsdevel@...r.kernel.org,
linux-security-module@...r.kernel.org, olofj@...omium.org,
mhalcrow@...gle.com, dlaor@...hat.com
Subject: Re: [RFC,PATCH 1/2] seccomp_filters: system call filtering using BPF
On Thu, Jan 12, 2012 at 10:22 AM, Oleg Nesterov <oleg@...hat.com> wrote:
> On 01/11, Will Drewry wrote:
>>
>> +__weak u8 *seccomp_get_regs(u8 *scratch, size_t *available)
>> +{
>> +	/* regset is usually returned based on task personality, not current
>> +	 * system call convention. This behavior makes it unsafe to execute
>> +	 * BPF programs over regviews if is_compat_task or the personality
>> +	 * have changed since the program was installed.
>> +	 */
>> +	const struct user_regset_view *view = task_user_regset_view(current);
>> +	const struct user_regset *regset = &view->regsets[0];
>> +	size_t scratch_size = *available;
>> +	if (regset->core_note_type != NT_PRSTATUS) {
>> +		/* The architecture should override this method for speed. */
>> +		regset = find_prstatus(view);
>> +		if (!regset)
>> +			return NULL;
>> +	}
>> +	*available = regset->n * regset->size;
>> +	/* Make sure the scratch space isn't exceeded. */
>> +	if (*available > scratch_size)
>> +		*available = scratch_size;
>> +	if (regset->get(current, regset, 0, *available, scratch, NULL))
>> +		return NULL;
>> +	return scratch;
>> +}
>> +
>> +/**
>> + * seccomp_test_filters - tests 'current' against the given syscall
>> + * @syscall: number of the system call to test
>> + *
>> + * Returns 0 on ok and non-zero on error/failure.
>> + */
>> +int seccomp_test_filters(int syscall)
>> +{
>> +	struct seccomp_filter *filter;
>> +	u8 regs_tmp[sizeof(struct user_regs_struct)], *regs;
>> +	size_t regs_size = sizeof(struct user_regs_struct);
>> +	int ret = -EACCES;
>> +
>> +	filter = current->seccomp.filter; /* uses task ref */
>> +	if (!filter)
>> +		goto out;
>> +
>> +	/* All filters in the list are required to share the same system call
>> +	 * convention so only the first filter is ever checked.
>> +	 */
>> +	if (seccomp_check_personality(filter))
>> +		goto out;
>> +
>> +	/* Grab the user_regs_struct. Normally, regs == &regs_tmp, but
>> +	 * that is not mandatory. E.g., it may return a point to
>> +	 * task_pt_regs(current). NULL checking is mandatory.
>> +	 */
>> +	regs = seccomp_get_regs(regs_tmp, &regs_size);
>
> Stupid question. I am sure you know what are you doing ;) and I know
> nothing about !x86 arches.
>
> But could you explain why it is designed to use user_regs_struct ?
> Why we can't simply use task_pt_regs() and avoid the (costly) regsets?
On x86-32 it would work, since user_regs_struct has the same layout as
the pt_regs that task_pt_regs() returns (iirc), but on x86-64 and other
arches that's not true. I don't think it's kosher to expose pt_regs to
userspace. That said, if an arch such as x86-32 overrides the weak
seccomp_get_regs(), it could just return task_pt_regs(current) and get
the fastest path.

If it would be appropriate to expose pt_regs to userspace, then I'd
happily do so :)
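
To make the override idea concrete, here is a minimal userspace sketch
of the two paths. It is not the patch's code: struct fake_pt_regs,
saved_regs, get_regs_generic(), get_regs_fast() and first_reg_matches()
are all hypothetical names, and a function pointer stands in for the
link-time selection that __weak would give in the kernel. The point is
only the calling contract: the caller must use the returned pointer
(which may or may not be the scratch buffer) and must check for NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a saved register frame; layout is made up. */
struct fake_pt_regs {
	unsigned long bx, cx, dx, ax, ip, sp;
};

/* Pretend these are the registers saved at system call entry. */
static struct fake_pt_regs saved_regs = { 1, 2, 3, 4, 5, 6 };

/*
 * Generic path: copy the registers into the caller's scratch buffer,
 * clamping the copy to the space the caller said was available.
 */
unsigned char *get_regs_generic(unsigned char *scratch, size_t *available)
{
	size_t want = sizeof(saved_regs);

	if (!scratch)
		return NULL;
	/* Make sure the scratch space isn't exceeded. */
	if (want > *available)
		want = *available;
	memcpy(scratch, &saved_regs, want);
	*available = want;
	return scratch;
}

/*
 * Fast path an arch could provide when the saved frame already has the
 * layout filters expect: no copy, just point at the saved frame.
 */
unsigned char *get_regs_fast(unsigned char *scratch, size_t *available)
{
	(void)scratch;	/* unused on this path */
	*available = sizeof(saved_regs);
	return (unsigned char *)&saved_regs;
}

/*
 * Caller side: works with either implementation because it only trusts
 * the returned pointer and size. Returns 0 on match, -1 otherwise.
 */
int first_reg_matches(unsigned char *(*get)(unsigned char *, size_t *),
		      unsigned long expected)
{
	unsigned char tmp[sizeof(struct fake_pt_regs)];
	size_t size = sizeof(tmp);
	unsigned char *regs = get(tmp, &size);
	unsigned long first;

	if (!regs || size < sizeof(first))
		return -1;
	memcpy(&first, regs, sizeof(first));
	return first == expected ? 0 : -1;
}
```

Calling first_reg_matches(get_regs_generic, 1) and
first_reg_matches(get_regs_fast, 1) gives the same answer; only the
copy is skipped in the second case.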