Date:	Thu, 7 Jul 2016 20:27:13 +0000
From:	Topi Miettinen <toiwoton@...il.com>
To:	Petr Mladek <pmladek@...e.com>
Cc:	"Serge E. Hallyn" <serge@...lyn.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Tejun Heo <tj@...nel.org>, lkml <linux-kernel@...r.kernel.org>,
	luto@...nel.org, Kees Cook <keescook@...omium.org>,
	Jonathan Corbet <corbet@....net>,
	Li Zefan <lizefan@...wei.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Serge Hallyn <serge.hallyn@...onical.com>,
	James Morris <james.l.morris@...cle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Howells <dhowells@...hat.com>,
	David Woodhouse <David.Woodhouse@...el.com>,
	Ard Biesheuvel <ard.biesheuvel@...aro.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
	"open list:CONTROL GROUP (CGROUP)" <cgroups@...r.kernel.org>,
	"open list:CAPABILITIES" <linux-security-module@...r.kernel.org>
Subject: Re: [PATCH] capabilities: add capability cgroup controller

On 07/07/16 09:16, Petr Mladek wrote:
> On Sun 2016-07-03 15:08:07, Topi Miettinen wrote:
>> The attached patch would make any uses of capabilities generate audit
>> messages. It works for simple tests as you can see from the commit
>> message, but unfortunately the call to audit_cgroup_list() deadlocks the
>> system when booting a full blown OS. There's no deadlock when the call
>> is removed.
>>
>> I guess that in some cases, cgroup_mutex and/or css_set_lock could be
>> already held earlier before entering audit_cgroup_list(). Holding the
>> locks is however required by task_cgroup_from_root(). Is there any way
>> to avoid this? For example, only print some kind of cgroup ID numbers
>> (are there unique and stable IDs, available without locks?) for those
>> cgroups where the task is registered in the audit message?
> 
> I am not sure anyone knows what really happens here. I suggest
> enabling lockdep. It might detect a possible deadlock even before it
> really happens, see Documentation/locking/lockdep-design.txt
> 
> It can be enabled by
> 
>    CONFIG_PROVE_LOCKING=y
> 
> It depends on
> 
>     CONFIG_DEBUG_KERNEL=y
> 
> and maybe some more options, see lib/Kconfig.debug

Thanks a lot! I caught this stack dump:

starting version 230
[    3.416647] ------------[ cut here ]------------
[    3.417310] WARNING: CPU: 0 PID: 95 at /home/topi/d/linux.git/kernel/locking/lockdep.c:2871 lockdep_trace_alloc+0xb4/0xc0
[    3.417605] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[    3.417923] Modules linked in:
[    3.418288] CPU: 0 PID: 95 Comm: systemd-udevd Not tainted 4.7.0-rc5+ #97
[    3.418444] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[    3.418726]  0000000000000086 000000007970f3b0 ffff88000016fb00 ffffffff813c9c45
[    3.418993]  ffff88000016fb50 0000000000000000 ffff88000016fb40 ffffffff81091e9b
[    3.419176]  00000b3705e2c798 0000000000000046 0000000000000410 00000000ffffffff
[    3.419374] Call Trace:
[    3.419511]  [<ffffffff813c9c45>] dump_stack+0x67/0x92
[    3.419644]  [<ffffffff81091e9b>] __warn+0xcb/0xf0
[    3.419745]  [<ffffffff81091f1f>] warn_slowpath_fmt+0x5f/0x80
[    3.419868]  [<ffffffff810e9a84>] lockdep_trace_alloc+0xb4/0xc0
[    3.419988]  [<ffffffff8120dc42>] kmem_cache_alloc_node+0x42/0x600
[    3.420156]  [<ffffffff8110432d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[    3.420170]  [<ffffffff8163183b>] __alloc_skb+0x5b/0x1d0
[    3.420170]  [<ffffffff81144f6b>] audit_log_start+0x29b/0x480
[    3.420170]  [<ffffffff810a2925>] ? __lock_task_sighand+0x95/0x270
[    3.420170]  [<ffffffff81145cc9>] audit_log_cap_use+0x39/0xf0
[    3.420170]  [<ffffffff8109cd75>] ns_capable+0x45/0x70
[    3.420170]  [<ffffffff8109cdb7>] capable+0x17/0x20
[    3.420170]  [<ffffffff812a2f50>] oom_score_adj_write+0x150/0x2f0
[    3.420170]  [<ffffffff81230997>] __vfs_write+0x37/0x160
[    3.420170]  [<ffffffff810e33b7>] ? update_fast_ctr+0x17/0x30
[    3.420170]  [<ffffffff810e3449>] ? percpu_down_read+0x49/0x90
[    3.420170]  [<ffffffff81233d47>] ? __sb_start_write+0xb7/0xf0
[    3.420170]  [<ffffffff81233d47>] ? __sb_start_write+0xb7/0xf0
[    3.420170]  [<ffffffff81231048>] vfs_write+0xb8/0x1b0
[    3.420170]  [<ffffffff812533c6>] ? __fget_light+0x66/0x90
[    3.420170]  [<ffffffff81232078>] SyS_write+0x58/0xc0
[    3.420170]  [<ffffffff81001f2c>] do_syscall_64+0x5c/0x300
[    3.420170]  [<ffffffff81849c9a>] entry_SYSCALL64_slow_path+0x25/0x25
[    3.420170] ---[ end trace fb586899fb556a5e ]---
[    3.447922] random: systemd-udevd urandom read with 3 bits of entropy available
[    4.014078] clocksource: Switched to clocksource tsc
Begin: Loading essential drivers ... done.

This is under QEMU, where the boot continues normally. On a real
computer, there's no such output and the system just seems to freeze.
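
As an aid to reading the dump above, here is a minimal sketch (Python, not
part of the patch) that keeps only the frames the unwinder did not mark
with "?". The helper name reliable_frames and the regular expression are
illustrative assumptions; the sample lines are copied from the trace.

```python
import re

# Matches one frame line of the form
#   [    3.419988]  [<ffffffff8120dc42>] kmem_cache_alloc_node+0x42/0x600
# An optional "? " before the symbol marks an address that was found on the
# stack but could not be confirmed as part of the call chain.
FRAME_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+\[<[0-9a-f]+>\]\s+(\?\s+)?"
    r"([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(log_text):
    """Return symbol names of frames not marked with '?', innermost first."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


# A few lines copied from the dump above.
SAMPLE = """\
[    3.420170]  [<ffffffff81144f6b>] audit_log_start+0x29b/0x480
[    3.420170]  [<ffffffff810a2925>] ? __lock_task_sighand+0x95/0x270
[    3.420170]  [<ffffffff81145cc9>] audit_log_cap_use+0x39/0xf0
[    3.420170]  [<ffffffff8109cd75>] ns_capable+0x45/0x70
[    3.420170]  [<ffffffff8109cdb7>] capable+0x17/0x20
[    3.420170]  [<ffffffff812a2f50>] oom_score_adj_write+0x150/0x2f0
"""

if __name__ == "__main__":
    print(" <- ".join(reliable_frames(SAMPLE)))
```

On these lines the chain runs from oom_score_adj_write through capable()
and ns_capable() into audit_log_cap_use and audit_log_start, so in this
particular dump the capability check appears to come from a write to an
oom_score_adj file.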

Could the deadlock happen because some I/O towards /sys/fs/cgroup
triggers a capability check while cgroup locks are already held, and the
audit code then tries to take the same locks again to print the cgroup
list?
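
To make the suspected scenario concrete, here is a toy model (Python, not
kernel code) of re-acquiring a non-recursive lock. Only the names
cgroup_mutex, capable and audit_cgroup_list come from this thread;
ToyLock, RecursiveAcquire and cgroup_file_write are hypothetical, and
whether any real cgroup path calls capable() with cgroup_mutex held is an
assumption, not something established here.

```python
class RecursiveAcquire(Exception):
    """Raised at the point where a real non-recursive lock would hang."""


class ToyLock:
    """Non-recursive lock that reports re-acquisition instead of blocking."""

    def __init__(self, name):
        self.name = name
        self.held = False

    def acquire(self):
        if self.held:
            raise RecursiveAcquire(self.name)
        self.held = True

    def release(self):
        self.held = False


cgroup_lock = ToyLock("cgroup_mutex")


def audit_cgroup_list():
    # Stands in for the audit helper, which takes the lock to walk cgroups.
    cgroup_lock.acquire()
    try:
        return "cgroup list"
    finally:
        cgroup_lock.release()


def capable():
    # Assumption of the model: every capability check emits an audit record.
    audit_cgroup_list()
    return True


def cgroup_file_write():
    # Hypothetical cgroup operation that checks a capability under the lock.
    cgroup_lock.acquire()
    try:
        return capable()
    finally:
        cgroup_lock.release()


if __name__ == "__main__":
    print(capable())  # no lock held by the caller: the check succeeds
    try:
        cgroup_file_write()
    except RecursiveAcquire as exc:
        print("would deadlock on", exc)
```

In the model, a capability check made with no lock held succeeds, while
the same check made under the lock trips on the second acquire, which is
where a real mutex would block forever.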

-Topi
