Date:	Thu, 12 May 2011 20:52:05 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	"Serge E. Hallyn" <serge@...lyn.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Serge E. Hallyn" <serge.hallyn@...onical.com>,
	Daniel Lezcano <daniel.lezcano@...e.fr>,
	David Howells <dhowells@...hat.com>,
	James Morris <jmorris@...ei.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	containers@...ts.linux-foundation.org,
	Al Viro <viro@...iv.linux.org.uk>
Subject: Re: acl_permission_check: disgusting performance

"Serge E. Hallyn" <serge@...lyn.com> writes:

> Quoting Linus Torvalds (torvalds@...ux-foundation.org):
>> Those four instructions are about two thirds of the cost of the
>> function. The last two are about 50% of the cost.
>> 
>> They are the accesses to "current", "->cred", "->user" and "->user_ns"
>> respectively (the cmp with the big constant is that compare against
>> "init_ns").
>> 
>> Now, if we got rid of them, we wouldn't improve performance by 2/3rds
>> on that function, because we do need the two first accesses for
>> "fsuid" (which is the next check), and the third one (which is
>> currently "cred->user") ends up doing the cache miss that we'd take for
>> "cred->fsuid" anyway. So the first three costs are fairly inescapable.
>> 
>> They are also cheaper, probably because those fields tend to be more
>> often in the cache. So it really is that fourth one that hurts the
>> most, as shown by it taking almost a third of the cycles of that
>> function.
>> 
>> And it all comes from that annoying commit e795b71799ff0 ("userns:
>> userns: check user namespace for task->file uid equivalence checks"),
>> and I bet nobody involved thought about how expensive that was.
>> 
>> That "user_ns" is _really_ expensive to load. And the fact that it's
>> after a chain of three other loads makes it all totally serialized,
>> and makes things much more expensive.
>> 
>> Could we perhaps have "user_ns" directly in the "struct cred"? Or
>
> The only reason not to put it into struct cred would be to avoid growing
> the struct cred.  For that matter, esp since you can't unshare the user_ns,
> it could also go right into the task_struct.
>
> (Eric's sys_setns patchset will eventually complicate that, but I don't
> think it'll be a problem)

From the perspective of a process the user namespace and the pid
namespace will never change.  I expect we will have something that lets
you change the user namespace and the pid namespace experienced by child
processes.  So the sys_setns work should not affect this.
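
To make the load chain concrete, here is a minimal userspace sketch of the two access paths under discussion. The struct layouts are simplified stand-ins rather than the kernel's real definitions, and the user_ns field inside struct cred is the hypothetical cached copy that Linus asks about. The cache is only sound because, as noted above, a process never sees its user namespace change.

```c
#include <assert.h>

/* Simplified stand-ins for illustration; not the kernel's real layouts. */
struct user_namespace { int id; };
struct user_struct { struct user_namespace *user_ns; };
struct cred {
	unsigned int fsuid;
	struct user_struct *user;
	struct user_namespace *user_ns;	/* hypothetical cached copy */
};
struct task { const struct cred *cred; };

/* Chain discussed in the thread: task -> cred -> user -> user_ns. */
static struct user_namespace *ns_via_user(const struct task *t)
{
	return t->cred->user->user_ns;
}

/* With the cached field, the extra hop through ->user goes away. */
static struct user_namespace *ns_via_cred(const struct task *t)
{
	return t->cred->user_ns;
}

/* Builds one task and checks that both paths agree; returns 1 on success. */
static int demo_paths_agree(void)
{
	struct user_namespace ns = { 0 };
	struct user_struct user = { &ns };
	struct cred c = { 1000, &user, &ns };
	struct task t = { &c };

	assert(ns_via_user(&t) == &ns);
	return ns_via_user(&t) == ns_via_cred(&t);
}
```

Both helpers return the same pointer; the second one simply needs one dependent load fewer, and the line it touches is the one already pulled in for fsuid.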

>> could we avoid or short-circuit this check entirely somehow, since it
>> always checks against "init_ns"?
>
> Of course I'm hoping that before fall the check won't be against
> init_ns any more :)  I was actually hoping to get back to that next
> week, so I can start by testing the caching you suggest.

Linus brings up a good point: we need to be very careful with
the user namespace and performance.  That said, I think there is
a cheap trick we can use until the user namespace is actually
good for something.

Something like my untested patch below.

Perhaps current_user_ns needs to move into user_namespace.h to get this
to compile.  There are some weird circular header dependencies in there.

In any event an inline version of current_user_ns that returns
init_user_ns in the case where user namespaces aren't compiled in should
fix the immediate performance problems by allowing the compiler to
optimize them out.
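
As a rough standalone model of why this helps, the sketch below can be built outside the kernel. MODEL_USER_NS is a hypothetical stand-in for CONFIG_USER_NS, and in_init_ns() is an invented caller shaped like the comparison against init_user_ns that Linus describes; neither name exists in the kernel.

```c
#include <assert.h>

struct user_namespace { int unused; };
struct user_namespace init_user_ns;

#ifdef MODEL_USER_NS
/* Namespaces modelled as enabled: an out-of-line lookup the compiler
 * cannot see through at the call site. */
static struct user_namespace *model_current_ns = &init_user_ns;
struct user_namespace *current_user_ns(void)
{
	return model_current_ns;
}
#else
/* Namespaces compiled out: the answer is a link-time constant. */
static inline struct user_namespace *current_user_ns(void)
{
	return &init_user_ns;
}
#endif

/* Hypothetical caller: with the inline version above this reduces to
 * comparing &init_user_ns with itself, so no loads are needed. */
static inline int in_init_ns(void)
{
	return current_user_ns() == &init_user_ns;
}
```

Built without MODEL_USER_NS, an optimizing compiler should be able to fold in_init_ns() to a constant and drop the branch in its callers, which is the effect the patch below is after.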

diff --git a/include/linux/cred.h b/include/linux/cred.h
index 9aeeb0b..09c76c2 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -357,7 +357,17 @@ static inline void put_cred(const struct cred *_cred)
 #define _current_user_ns()	(current_cred_xxx(user)->user_ns)
 #define current_security()	(current_cred_xxx(security))
 
+#ifdef CONFIG_USER_NS
 extern struct user_namespace *current_user_ns(void);
+#else
+struct user_namespace;
+extern struct user_namespace init_user_ns;
+static inline struct user_namespace *current_user_ns(void)
+{
+
+	return &init_user_ns;
+}
+#endif
 
 #define current_uid_gid(_uid, _gid)		\
 do {						\