linux-kernel - Re: [RFC PATCH 3/3] misc_cgroup: remove error log to avoid log flood

Message-ID: <20210910092310.GA18084@blackbody.suse.cz>
Date:   Fri, 10 Sep 2021 11:23:10 +0200
From:   Michal Koutný <mkoutny@...e.com>
To:     brookxu <brookxu.cn@...il.com>
Cc:     Vipin Sharma <vipinsh@...gle.com>, tj@...nel.org,
        lizefan.x@...edance.com, hannes@...xchg.org,
        linux-kernel@...r.kernel.org, cgroups@...r.kernel.org
Subject: Re: [RFC PATCH 3/3] misc_cgroup: remove error log to avoid log flood

On Fri, Sep 10, 2021 at 01:30:46PM +0800, brookxu <brookxu.cn@...il.com> wrote:
> I am a bit confused here. For misc_cgroup, we can only be rejected when the count
> touch Limit, but there may be other more reasons for other subsystems.

Sorry, I wasn't clear about that -- the failures I meant to be counted
here were only the ones caused by (an ancestor) limit. Maybe there's a
better name for that.

> Therefore, when we are rejected, does it mean that we have touch
> Limit? If so, do we still need to distinguish between max and fail?
> (for misc_cgroup)

r
`- c1
   `- c2.max
       `- c3
          `- c4.max
             `- task t
          `- c5

Assuming c2.max < c4.max, when a task t calls try_charge and it fails
because of c2.max, then the 'max' event is counted to c2 (telling that
the limit is perhaps low) and the 'fail' event is counted to c4 (telling
you where the troubles originated). That is my idea, although with
short-lived cgroups you'd likely only get the hierarchically
aggregated 'fail' events from c3 or higher, i.e. with lower (spatial)
precision.

What type of information would be useful for your troubleshooting?
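
To make the proposed accounting concrete, here is a minimal userspace
sketch in Python. It is not kernel code and not the misc_cgroup
implementation: the Cgroup class, the try_charge() helper and the
counter names are hypothetical, the limit check is done up front
instead of charge-then-rollback, and keeping 'max' local to the limiting
cgroup while propagating 'fail' upwards is an assumption based on the
description above.

```python
class Cgroup:
    """Hypothetical model of one cgroup in the hierarchy."""

    def __init__(self, name, parent=None, limit=None):
        self.name = name
        self.parent = parent
        self.limit = limit            # None models "no limit set"
        self.usage = 0                # hierarchical usage
        self.max_events = 0           # times this cgroup's limit rejected a charge
        self.fail_local = 0           # failed charges that originated here
        self.fail_hier = 0            # failed charges from here or any descendant


def ancestors_and_self(cg):
    """Yield cg, then its parent, and so on up to the root."""
    while cg is not None:
        yield cg
        cg = cg.parent


def try_charge(cg, amount):
    """Charge `amount` to cg and all ancestors, or count events and fail."""
    for node in ancestors_and_self(cg):
        if node.limit is not None and node.usage + amount > node.limit:
            node.max_events += 1      # 'max': the limit that was hit
            cg.fail_local += 1        # 'fail': where the charge came from
            for anc in ancestors_and_self(cg):
                anc.fail_hier += 1    # aggregated 'fail' up the tree
            return False
    for node in ancestors_and_self(cg):
        node.usage += amount
    return True


# The hierarchy from the diagram, with c2.max < c4.max.
r = Cgroup("r")
c1 = Cgroup("c1", r)
c2 = Cgroup("c2", c1, limit=2)
c3 = Cgroup("c3", c2)
c4 = Cgroup("c4", c3, limit=5)
c5 = Cgroup("c5", c3)

# Task t lives in c4; the third charge exceeds c2's limit.
results = [try_charge(c4, 1) for _ in range(3)]
```

After the third charge, c2 carries the 'max' event and c4 the local
'fail' event. If c4 were removed afterwards, the aggregated 'fail' count
on c3 would be what remains, which is the loss of spatial precision
mentioned above.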

Cheers,
Michal