Message-ID: <62188f37-f816-08e9-cdd5-8df23131746d@openvz.org>
Date:   Tue, 16 Aug 2022 10:47:05 +0300
From:   Vasily Averin <vvs@...nvz.org>
To:     Tejun Heo <tj@...nel.org>,
        Michal Koutný <mkoutny@...e.com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        linux-kernel@...r.kernel.org, kernel@...nvz.org,
        Shakeel Butt <shakeelb@...gle.com>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Muchun Song <songmuchun@...edance.com>,
        Michal Hocko <mhocko@...e.com>,
        Johannes Weiner <hannes@...xchg.org>
Subject: Re: [PATCH 0/3] enable memcg accounting for kernfs objects

On 8/11/22 06:19, Vasily Averin wrote:
> On 8/9/22 20:56, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Aug 09, 2022 at 07:49:34PM +0200, Michal Koutný wrote:
>>> On Tue, Aug 09, 2022 at 07:31:31AM -1000, Tejun Heo <tj@...nel.org> wrote:
>>>> I'm not quite sure whether following the usual "charge it to the allocating
>>>> task's cgroup" is the best way to go about it. I wonder whether it'd be
>>>> better to attach it to the new cgroup's nearest ancestor with memcg enabled.
>>>
>>> See also 
>>> https://lore.kernel.org/r/YnBLge4ZQNbbxufc@blackbook/
>>> and
>>> https://lore.kernel.org/r/20220511163439.GD24172@blackbody.suse.cz/
>>
>> Ah, thanks. Vasily, can you please include some summary of the discussions
>> and the rationales for the path taken in the commit message?
> 
> Dear Tejun,
> thank you for the feedback, I'll do it in next patch set iteration.
> 
> However, I noticed another problem in neighborhood and I planned to
> add new patches into current patch set. One of the new patches is quite simple,
> however second one is quite complex and requires some discussion.

Summing up a private discussion with Tejun, Michal and Roman:
I'm going to create a few new patches:

1) adjust active memcg for objects allocated during creation of new cgroup
  This patch will take the memcg from the parent cgroup and use it for accounting
  all objects allocated during creation of the new cgroup.
  To do that, it moves the set_active_memcg() calls from mem_cgroup_css_alloc()
  to cgroup_mkdir() and creates the missing infrastructure.
  This makes it predictable which memcg is used for object accounting,
  and should simplify debugging of possible problems and corner cases.

2) memcg: enable kernfs accounting: nodes and iattr
  These patches were already discussed and approved.
  These objects consume a significant part of memory in various scenarios,
  including new cgroup creation and new net device creation.

3) adjust active memcg for simple_xattr accounting
  sysfs and tmpfs are in-memory file systems;
  for extended attributes they use the simple_xattr infrastructure.
  The patch forces sys_set[f]xattr calls to account xattr objects
  to a predictable memcg: for kernfs the memcg will be taken from the kernfs node,
  for shmem -- from shmem_info.
  As in case 1), this makes it clear which memcg should be used
  for object accounting and simplifies debugging of possible problems.

4) memcg: enable accounting for simple_xattr: names and values
  This patch enables accounting for the objects described in the previous patch.

5) simple_xattrs: replace list with rb-tree
  This significantly reduces the search time for existing entries.
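
To illustrate the point of 5), here is a rough userspace sketch, not the
planned kernel code. It counts name comparisons for a linear scan (standing in
for the current list walk) and for a lookup through POSIX tsearch()/tfind(),
which glibc implements as a red-black tree. struct xattr_entry, list_find(),
tree_find() and compare_lookups() are hypothetical names made up for this
example, and the "user.attrNNNN" keys are invented test data.

```c
#include <assert.h>
#include <search.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for one xattr entry (not the kernel struct). */
struct xattr_entry {
	char name[32];
	int value;
};

static unsigned long ncmp;	/* number of name comparisons performed */

static int cmp_name(const void *a, const void *b)
{
	ncmp++;
	return strcmp(((const struct xattr_entry *)a)->name,
		      ((const struct xattr_entry *)b)->name);
}

/* Linear scan over an array, standing in for walking a linked list. */
static struct xattr_entry *list_find(struct xattr_entry *v, size_t n,
				     const char *name)
{
	struct xattr_entry key;

	snprintf(key.name, sizeof(key.name), "%s", name);
	for (size_t i = 0; i < n; i++)
		if (cmp_name(&key, &v[i]) == 0)
			return &v[i];
	return NULL;
}

/* Tree lookup through POSIX tfind(); the tree stores pointers to entries. */
static struct xattr_entry *tree_find(void **root, const char *name)
{
	struct xattr_entry key;
	void *node;

	snprintf(key.name, sizeof(key.name), "%s", name);
	node = tfind(&key, root, cmp_name);
	return node ? *(struct xattr_entry **)node : NULL;
}

/*
 * Build n entries, look up the last one both ways and report how many
 * name comparisons each lookup needed.
 */
static void compare_lookups(size_t n, unsigned long *list_cmps,
			    unsigned long *tree_cmps)
{
	struct xattr_entry *v = calloc(n, sizeof(*v));
	struct xattr_entry *a, *b;
	void *root = NULL;
	char name[32];

	if (!v)
		abort();
	for (size_t i = 0; i < n; i++) {
		snprintf(v[i].name, sizeof(v[i].name), "user.attr%04zu", i);
		v[i].value = (int)i;
		if (!tsearch(&v[i], &root, cmp_name))
			abort();
	}

	snprintf(name, sizeof(name), "user.attr%04zu", n - 1);
	ncmp = 0;
	a = list_find(v, n, name);
	*list_cmps = ncmp;
	ncmp = 0;
	b = tree_find(&root, name);
	*tree_cmps = ncmp;
	assert(a == b && a == &v[n - 1]);	/* both lookups agree */

	for (size_t i = 0; i < n; i++)
		tdelete(&v[i], &root, cmp_name);	/* release tree nodes */
	free(v);
}
```

Calling compare_lookups(1000, &l, &t) makes the list scan compare all 1000
names before it reaches the last entry, while the tree lookup needs only a
number of comparisons proportional to the tree height. With a libc whose
tsearch() is not balanced the gap may be smaller.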

Additionally, Roman Gushchin is preparing a patch
"`put`ting the kernfs_node reference earlier in the cgroup removal process".
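
For 1) and 3), the idea of a temporarily overridden "active" memcg can be
modelled in userspace roughly as below. This is only a sketch under my own
assumptions: struct memcg, struct task, charge_alloc() and
model_cgroup_mkdir() are hypothetical, and this set_active_memcg() takes an
explicit task argument, unlike the kernel helper, which works on the current
context. It only shows that objects allocated inside the bracketed region are
charged to the chosen memcg and not to the caller's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are the kernel definitions. */
struct memcg {
	const char *name;
	size_t charged;		/* bytes accounted to this group */
};

struct task {
	struct memcg *memcg;		/* group the task itself belongs to */
	struct memcg *active_memcg;	/* temporary override, NULL when unset */
};

/* Same shape as the kernel helper: install an override, return the old one. */
static struct memcg *set_active_memcg(struct task *t, struct memcg *memcg)
{
	struct memcg *old = t->active_memcg;

	t->active_memcg = memcg;
	return old;
}

/* Charge an accounted allocation: the override wins over the task's group. */
static void charge_alloc(struct task *t, size_t bytes)
{
	struct memcg *target = t->active_memcg ? t->active_memcg : t->memcg;

	target->charged += bytes;
}

/*
 * Model of creating a child cgroup: everything allocated between the two
 * set_active_memcg() calls is charged to the parent's memcg.
 */
static void model_cgroup_mkdir(struct task *t, struct memcg *parent_memcg)
{
	struct memcg *old = set_active_memcg(t, parent_memcg);

	charge_alloc(t, 128);	/* stand-in for a kernfs node */
	charge_alloc(t, 64);	/* stand-in for other per-cgroup objects */
	set_active_memcg(t, old);	/* restore the previous override */
}
```

In this model a task sitting in memcg "caller" that creates a cgroup under
"parent" leaves caller.charged untouched and adds 192 bytes to parent.charged.
For 3) the same bracket would be placed around the xattr allocation, with the
memcg taken from the kernfs node or from shmem_info.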

Thank you,
	Vasily Averin
