[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <AANLkTi=tmzKqLL30q9Mq+9u7s-M3sG2-MKz7pUJ1R08Z@mail.gmail.com>
Date: Tue, 1 Feb 2011 09:58:13 -0500
From: Eric Paris <eparis@...isplace.org>
To: Eric Paris <eparis@...hat.com>
Cc: linux-kernel@...r.kernel.org, mingo@...e.hu, agl@...gle.com,
fweisbec@...il.com, tzanussi@...il.com
Subject: Re: Using ftrace/perf as a basis for generic seccomp
On Wed, Jan 12, 2011 at 4:28 PM, Eric Paris <eparis@...hat.com> wrote:
> Some time ago Adam posted a patch to allow for a generic seccomp
> implementation (unlike the current seccomp where your choice is all
> syscalls or only read, write, sigreturn, and exit) which got little
> traction and it was suggested he instead do the same thing somehow using
> the tracing code:
> http://thread.gmane.org/gmane.linux.kernel/833556
> The actual method that this could be achieved was apparently left as an
> exercise for the reader. Since I'd like to do something similar (and
> actually basically reimplemented Adam's code before I found this thread)
> I guess that makes me the reader. I've never touched
> perf/ftrace/whatever so I'm not even knowledgeably enough to ask good
> questions so please, try to talk to me like a 2 year old.
>
> I started playing a bit having no idea where to start decided to see
> where something like:
> perf stat -e syscalls:sys_enter_read -e syscalls:sys_enter_write -- ./seccomp_test
> Ended up in the kernel. It ended up I saw in perf_syscall_enter(). So
> I decided to do a little hacking and added this little patch segment:
>
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index bac752f..6653995 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -495,8 +495,12 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
> int size;
>
> syscall_nr = syscall_get_nr(current, regs);
> - if (!test_bit(syscall_nr, enabled_perf_enter_syscalls))
> +
> + if (!test_bit(syscall_nr, enabled_perf_enter_syscalls)) {
> + if (current->seccomp.mode == 2)
> + do_exit(SIGKILL);
> return;
> + }
>
> sys_data = syscall_nr_to_meta(syscall_nr);
> if (!sys_data)
>
> Which appears to be a necessary, but not sufficient, requirement, since
> another unrelated task could also have a 'watch?' on other syscalls. So
> I hacked in this little PoS into the filter code.
>
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index eac7e33..d8c1c8f 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -4780,15 +4780,19 @@ void perf_tp_event(u64 addr, u64 count, void *record, int entry_size,
> .size = entry_size,
> .data = record,
> };
> + int found = 0;
>
> perf_sample_data_init(&data, addr);
> data.raw = &raw;
>
> hlist_for_each_entry_rcu(event, node, head, hlist_entry) {
> - if (perf_tp_event_match(event, &data, regs))
> + if (perf_tp_event_match(event, &data, regs)) {
> + found = 1;
> perf_swevent_event(event, count, 1, &data, regs);
> + }
> }
> -
> + if (current->seccomp.mode == 2 && !found)
> + do_exit(SIGKILL);
> perf_swevent_put_recursion_context(rctx);
> }
> EXPORT_SYMBOL_GPL(perf_tp_event);
>
> Which seems to get me a 'working' version of generic seccomp on top of
> ftrace. Problem is it makes me feel dirty, I'm logging a bunch of trace
> stuff I don't care about, and I'm sure its being done wrong 1001 ways.
> I know that do_exit(SIGKILL) is actually really wrong since it ends up
> giving me this crap (but i don't know how to do it better from there)
>
> note: seccomp-test[2485] exited with preempt_count 1
> BUG: sleeping function called from invalid context at kernel/rwsem.c:21
>
> So, finally, onto the question. How would you guys do it? The tracing
> code seems to me to be built on the idea of recording information on a
> very small limited set of events, not blocking access on the complement
> of a small limited set of events.
>
> I'm not seeing how the tracing code is better than the generic seccomp
> code that Adam wrote, but hopefully someone can enlighten me as to how
> this can be done reasonably. I need all the guidance you can offer
> because I don't really see what next steps should be!
Ping tracing people. I'm not seeing steps forward. At this point I'm
going to start looking at Adam's code again. I can think of a couple
of cleanups and simplifications to his code. I just don't see how
using tracing is supposed to work better.....
-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists