Message-ID: <m1iptf1cm2.fsf@fess.ebiederm.org>
Date: Thu, 12 May 2011 20:52:05 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: "Serge E. Hallyn" <serge@...lyn.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
"Serge E. Hallyn" <serge.hallyn@...onical.com>,
Daniel Lezcano <daniel.lezcano@...e.fr>,
David Howells <dhowells@...hat.com>,
James Morris <jmorris@...ei.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
containers@...ts.linux-foundation.org,
Al Viro <viro@...iv.linux.org.uk>
Subject: Re: acl_permission_check: disgusting performance
"Serge E. Hallyn" <serge@...lyn.com> writes:
> Quoting Linus Torvalds (torvalds@...ux-foundation.org):
>> Those four instructions are about two thirds of the cost of the
>> function. The last two are about 50% of the cost.
>>
>> They are the accesses to "current", "->cred", "->user" and "->user_ns"
>> respectively (the cmp with the big constant is that compare against
>> "init_ns").
>>
>> Now, if we got rid of them, we wouldn't improve performance by 2/3rds
>> on that function, because we do need the two first accesses for
>> "fsuid" (which is the next check), and the third one (which is
>> currently "cred->user" ends up doing the cache miss that we'd take for
>> "cred->fsuid" anyway. So the first three costs are fairly inescapable.
>>
>> They are also cheaper, probably because those fields tend to be more
>> often in the cache. So it really is that fourth one that hurts the
>> most, as shown by it taking almost a third of the cycles of that
>> function.
>>
>> And it all comes from that annoying commit e795b71799ff0 ("userns:
>> userns: check user namespace for task->file uid equivalence checks"),
>> and I bet nobody involved thought about how expensive that was.
>>
>> That "user_ns" is _really_ expensive to load. And the fact that it's
>> after a chain of three other loads makes it all totally serialized,
>> and makes things much more expensive.
>>
>> Could we perhaps have "user_ns" directly in the "struct cred"? Or
>
> The only reason not to put it into struct cred would be to avoid growing
> the struct cred. For that matter, esp since you can't unshare the user_ns,
> it could also go right into the task_struct.
>
> (Eric's sys_setns patchset will eventually complicate that, but I don't
> think it'll be a problem)

From the perspective of a process, the user namespace and the pid
namespace will never change. I expect we will have something that lets
you change the user namespace and the pid namespace experienced by child
processes. So the sys_setns work should not affect this.

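To make the caching idea concrete, here is a minimal userspace sketch of
the two lookup shapes being discussed. It is not kernel code: the structs
are simplified stand-ins, and the cached user_ns field in struct cred plus
the names struct task, ns_chained and ns_cached are assumptions made up
for illustration. Because the namespace does not change for a running
process, a copy stored in the cred would stay in agreement with the
chained lookup.

```c
/* Userspace model only; simplified stand-ins, not the kernel structs. */
struct user_namespace { int id; };

struct user_struct {
	struct user_namespace *user_ns;
};

struct cred {
	struct user_struct *user;
	struct user_namespace *user_ns;	/* hypothetical cached copy */
};

struct task {				/* stand-in for task_struct */
	const struct cred *cred;
};

/* Today's shape: task -> cred -> user -> user_ns, each load depends
 * on the result of the previous one. */
static struct user_namespace *ns_chained(const struct task *t)
{
	return t->cred->user->user_ns;
}

/* Cached shape: task -> cred -> user_ns, one dependent load fewer. */
static struct user_namespace *ns_cached(const struct task *t)
{
	return t->cred->user_ns;
}
```

The cached field has to be filled in wherever the cred gets its user
pointer; the sketch leaves that step to the caller.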
>> could we avoid or short-circuit this check entirely somehow, since it
>> always checks against "init_ns"?
>
> Of course I'm hoping that before fall the check won't be against
> init_ns any more :) I was actually hoping to get back to that next
> week, so I can start by testing the caching you suggest.

Linus brings up a good point: we need to be very careful with the user
namespace and performance. That said, I think there is a cheap trick we
can do until the user namespace is actually good for something.

Something like my untested patch below.

Perhaps current_user_ns needs to move into user_namespace.h to get this
to compile. There are some weird circular header dependencies in there.

In any event, an inline version of current_user_ns that returns
&init_user_ns when user namespaces aren't compiled in should fix the
immediate performance problem by allowing the compiler to optimize the
namespace loads out.

diff --git a/include/linux/cred.h b/include/linux/cred.h
index 9aeeb0b..09c76c2 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -357,7 +357,17 @@ static inline void put_cred(const struct cred *_cred)
 #define _current_user_ns() (current_cred_xxx(user)->user_ns)
 #define current_security() (current_cred_xxx(security))
 
+#ifdef CONFIG_USER_NS
 extern struct user_namespace *current_user_ns(void);
+#else
+struct user_namespace;
+extern struct user_namespace init_user_ns;
+static inline struct user_namespace *current_user_ns(void)
+{
+
+	return &init_user_ns;
+}
+#endif
 
 #define current_uid_gid(_uid, _gid) \
 do { \
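
For anyone who wants to try the effect outside the kernel tree, a small
userspace model of the same trick might look like the following. This is
only a sketch: MODEL_USER_NS stands in for CONFIG_USER_NS, and
in_init_user_ns is a hypothetical caller modeled on the compare against
init_user_ns mentioned earlier in the thread, not an existing kernel
function.

```c
#include <stdbool.h>

/* Userspace model; the struct body is a placeholder. */
struct user_namespace { int unused; };
struct user_namespace init_user_ns;

#ifdef MODEL_USER_NS
/* Namespaces "compiled in": out-of-line lookup, defined elsewhere. */
struct user_namespace *current_user_ns(void);
#else
/* Namespaces "compiled out": there is only one possible answer. */
static inline struct user_namespace *current_user_ns(void)
{
	return &init_user_ns;
}
#endif

/* Hypothetical caller. Without MODEL_USER_NS this becomes
 * &init_user_ns == &init_user_ns after inlining, which the compiler
 * can fold to a constant without loading anything. */
static inline bool in_init_user_ns(void)
{
	return current_user_ns() == &init_user_ns;
}
```

Built without MODEL_USER_NS, the check needs no pointer chasing at all,
which is the point of the patch.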