linux-kernel - Re: [RFC PATCH-cgroup 5/6] cgroup: Skip dying css in cgroup_apply_control


Message-ID: <20170621214216.GE14720@htj.duckdns.org>
Date:   Wed, 21 Jun 2017 17:42:16 -0400
From:   Tejun Heo <tj@...nel.org>
To:     Waiman Long <longman@...hat.com>
Cc:     Li Zefan <lizefan@...wei.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org, kernel-team@...com, pjt@...gle.com,
        luto@...capital.net, efault@....de, torvalds@...ux-foundation.org
Subject: Re: [RFC PATCH-cgroup 5/6] cgroup: Skip dying css in
 cgroup_apply_control_{enable,disable}

Hello,

On Wed, Jun 14, 2017 at 11:05:36AM -0400, Waiman Long wrote:
> While constantly turning on and off controllers, it is possible to
> trigger the dying CSS warnings in cgroup_apply_control_enable() and
> cgroup_apply_control_disable(). The current code, however, proceeds
> after the warning leading to other secondary warnings and maybe even
> data corruption, like
> 
>   cgroup: cgroup_addrm_files: failed to add current, err=-17
> 
> To avoid the secondary errors, the dying CSS is now skipped so that
> it does not cause other problems.
> 
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
>  kernel/cgroup/cgroup.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index f0bea32..2a5bd49 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -2846,12 +2846,24 @@ static int cgroup_apply_control_enable(struct cgroup *cgrp)
>  		for_each_subsys(ss, ssid) {
>  			struct cgroup_subsys_state *css = cgroup_css(dsct, ss);
>  
> -			WARN_ON_ONCE(css && percpu_ref_is_dying(&css->refcnt));
> -
>  			if (!(cgroup_ss_mask(dsct, false) & (1 << ss->id)) ||
>  			    (dsct->bypass_ss_mask & (1 << ss->id)))
>  				continue;
>  
> +			/*
> +			 * If the css is dying, we will just skip it after
> +			 * warning.
> +			 */
> +			if (css && (css->flags & CSS_DYING)) {
> +				char name[NAME_MAX+1];
> +
> +				cgroup_name(cgrp, name, NAME_MAX);
> +				pr_warn("%s: %s css of cgroup %s is dying!\n",
> +					__func__, ss->name, name);
> +				WARN_ON_ONCE(1);
> +				continue;
> +			}

Can you trigger this without your patches?  If it triggers, the code
already screwed up before it reached this point.  We should be fixing
that bug rather than masking it here.
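
For illustration only, here is a minimal userspace sketch (not kernel
code) of what the quoted hunk does: a dying entry is reported and
skipped instead of being populated again. Every model_/MODEL_ name is
a hypothetical stand-in, not a kernel definition. The count it returns
is the part that matters for the point above: a non-zero value says
that some earlier step left a dying css in place, and skipping alone
does not explain how it got there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
#define MODEL_CSS_DYING 0x1u

struct model_css {
	const char *name;
	unsigned int flags;
	int visible;	/* set when the "enable" pass populates the css */
};

/*
 * Walk the array the way the patched loop walks subsystems: a dying
 * entry is reported and skipped, a live entry is populated.
 * Returns the number of dying entries seen, so the caller can tell
 * that an earlier step left stale state behind.
 */
static int model_apply_enable(struct model_css *csses, size_t n)
{
	int dying = 0;
	size_t i;

	assert(csses != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (csses[i].flags & MODEL_CSS_DYING) {
			fprintf(stderr, "%s css is dying, skipping\n",
				csses[i].name);
			dying++;
			continue;
		}
		csses[i].visible = 1;
	}
	return dying;
}
```

In this model the skip keeps the pass from touching the dying entry,
but the entry is still there afterwards, which is why the earlier
step that produced it is the thing to track down.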

Thanks.

-- 
tejun