Message-ID: <20210909143720.GA14709@blackbody.suse.cz>
Date:   Thu, 9 Sep 2021 16:37:20 +0200
From:   Michal Koutný <mkoutny@...e.com>
To:     brookxu <brookxu.cn@...il.com>
Cc:     tj@...nel.org, lizefan.x@...edance.com, hannes@...xchg.org,
        vipinsh@...gle.com, linux-kernel@...r.kernel.org,
        cgroups@...r.kernel.org
Subject: Re: [RFC PATCH 3/3] misc_cgroup: remove error log to avoid log flood

On Wed, Sep 08, 2021 at 01:24:36PM +0800, brookxu <brookxu.cn@...il.com> wrote:
> This log provides less information, we can get more detailed failure
> records through
> misc.events, misc.events.local and misc.failcnt.
> From this, perhaps we can remove it.

I hope that a) the message is not widely used and b) no one relies on
parsing it, so that this is an acceptable change.

> @@ -157,13 +157,6 @@ int misc_cg_try_charge(enum misc_res_type type, struct misc_cg *cg,
>  		new_usage = atomic_long_add_return(amount, &res->usage);
>  		if (new_usage > READ_ONCE(res->max) ||
>  		    new_usage > READ_ONCE(misc_res_capacity[type])) {
> -			if (!res->failed) {
> -				pr_info("cgroup: charge rejected by the misc controller for %s resource in ",
> -					misc_res_name[type]);
> -				pr_cont_cgroup_path(i->css.cgroup);
> -				pr_cont("\n");
> -				res->failed = true;
> -			}

`i` is the misc_cg whose limit was hit. In a sense, I think the current
implementation of the log message (before your patch) is not as useful
as it could be. The logged message here should not refer to `i` but `cg`
(i.e. the cgroup where the actual chargee resides). It's basically the
idea from [1].

So there could be two types of events (not referring to the v1-specific
failcnt):
- max - number of times the cgroup's misc.max was hit,
- fail - number of times operation failed (rejected) in the cgroup.

The former would tell you which limit is probably set too low, the
latter would capture which cgroup workload is affected. (The difference
would become apparent with >1 level deep hierarchies.)
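
To illustrate the distinction, here is a minimal userspace sketch (not
kernel code). The names struct model_cg, ev_max, ev_fail and
model_try_charge() are hypothetical and only model the hierarchical walk
done in misc_cg_try_charge(); it assumes one counter of each kind per
cgroup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a cgroup node; not the kernel's struct misc_cg. */
struct model_cg {
	struct model_cg *parent;	/* NULL for the root */
	long usage;
	long max;
	unsigned long ev_max;		/* times this cgroup's limit was hit */
	unsigned long ev_fail;		/* times a charge from this cgroup was rejected */
};

/*
 * Charge `amount` to `cg` and every ancestor.
 * Returns 0 on success, -1 if any level's limit would be exceeded.
 */
static int model_try_charge(struct model_cg *cg, long amount)
{
	struct model_cg *i, *j;

	assert(cg != NULL && amount > 0);

	for (i = cg; i; i = i->parent) {
		i->usage += amount;
		if (i->usage > i->max) {
			i->ev_max++;	/* `i`: the cgroup whose limit was hit */
			cg->ev_fail++;	/* `cg`: the cgroup where the chargee resides */

			/* Undo the partial charge on cg..i (inclusive). */
			for (j = cg; ; j = j->parent) {
				j->usage -= amount;
				if (j == i)
					break;
			}
			return -1;
		}
	}
	return 0;
}
```

With a parent limited to 1 and a child limited to 10, a second charge
from the child is rejected at the parent: the parent records a max
event, while the child records a fail event.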

Regards,
Michal

[1] https://lore.kernel.org/lkml/20191202191100.GF16681@devbig004.ftw2.facebook.com/
