[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <1294867725.3237.230.camel@localhost.localdomain>
Date: Wed, 12 Jan 2011 16:28:45 -0500
From: Eric Paris <eparis@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: mingo@...e.hu, agl@...gle.com, fweisbec@...il.com,
tzanussi@...il.com
Subject: Using ftrace/perf as a basis for generic seccomp
Some time ago Adam posted a patch to allow for a generic seccomp
implementation (unlike the current seccomp where your choice is all
syscalls or only read, write, sigreturn, and exit) which got little
traction and it was suggested he instead do the same thing somehow using
the tracing code:
http://thread.gmane.org/gmane.linux.kernel/833556
The actual method that this could be achieved was apparently left as an
exercise for the reader. Since I'd like to do something similar (and
actually basically reimplemented Adam's code before I found this thread)
I guess that makes me the reader. I've never touched
perf/ftrace/whatever so I'm not even knowledgeably enough to ask good
questions so please, try to talk to me like a 2 year old.
I started playing a bit having no idea where to start decided to see
where something like:
perf stat -e syscalls:sys_enter_read -e syscalls:sys_enter_write -- ./seccomp_test
Ended up in the kernel. It ended up I saw in perf_syscall_enter(). So
I decided to do a little hacking and added this little patch segment:
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index bac752f..6653995 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -495,8 +495,12 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
int size;
syscall_nr = syscall_get_nr(current, regs);
- if (!test_bit(syscall_nr, enabled_perf_enter_syscalls))
+
+ if (!test_bit(syscall_nr, enabled_perf_enter_syscalls)) {
+ if (current->seccomp.mode == 2)
+ do_exit(SIGKILL);
return;
+ }
sys_data = syscall_nr_to_meta(syscall_nr);
if (!sys_data)
Which appears to be a necessary, but not sufficient, requirement, since
another unrelated task could also have a 'watch?' on other syscalls. So
I hacked in this little PoS into the filter code.
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index eac7e33..d8c1c8f 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4780,15 +4780,19 @@ void perf_tp_event(u64 addr, u64 count, void *record, int entry_size,
.size = entry_size,
.data = record,
};
+ int found = 0;
perf_sample_data_init(&data, addr);
data.raw = &raw;
hlist_for_each_entry_rcu(event, node, head, hlist_entry) {
- if (perf_tp_event_match(event, &data, regs))
+ if (perf_tp_event_match(event, &data, regs)) {
+ found = 1;
perf_swevent_event(event, count, 1, &data, regs);
+ }
}
-
+ if (current->seccomp.mode == 2 && !found)
+ do_exit(SIGKILL);
perf_swevent_put_recursion_context(rctx);
}
EXPORT_SYMBOL_GPL(perf_tp_event);
Which seems to get me a 'working' version of generic seccomp on top of
ftrace. Problem is it makes me feel dirty, I'm logging a bunch of trace
stuff I don't care about, and I'm sure its being done wrong 1001 ways.
I know that do_exit(SIGKILL) is actually really wrong since it ends up
giving me this crap (but i don't know how to do it better from there)
note: seccomp-test[2485] exited with preempt_count 1
BUG: sleeping function called from invalid context at kernel/rwsem.c:21
So, finally, onto the question. How would you guys do it? The tracing
code seems to me to be built on the idea of recording information on a
very small limited set of events, not blocking access on the complement
of a small limited set of events.
I'm not seeing how the tracing code is better than the generic seccomp
code that Adam wrote, but hopefully someone can enlighten me as to how
this can be done reasonably. I need all the guidance you can offer
because I don't really see what next steps should be!
-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists