Date:   Tue, 31 May 2022 15:29:27 +0800
From:   Muchun Song <songmuchun@...edance.com>
To:     Waiman Long <longman@...hat.com>
Cc:     hannes@...xchg.org, mhocko@...nel.org, roman.gushchin@...ux.dev,
        shakeelb@...gle.com, akpm@...ux-foundation.org,
        cgroups@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, duanxiongchun@...edance.com
Subject: Re: [PATCH v5 00/11] Use obj_cgroup APIs to charge the LRU pages

On Mon, May 30, 2022 at 10:41:30PM -0400, Waiman Long wrote:
> On 5/30/22 03:49, Muchun Song wrote:
> > This version is rebased on v5.18.
> > 
> > Since the following patchsets were applied, all kernel memory is charged
> > with the new obj_cgroup APIs.
> > 
> > 	[v17,00/19] The new cgroup slab memory controller [1]
> > 	[v5,0/7] Use obj_cgroup APIs to charge kmem pages [2]
> > 
> > But user memory allocations (LRU pages) can still pin memcgs for a long time.
> > This happens at a larger scale and is causing recurring problems in the real
> > world: page cache doesn't get reclaimed for a long time, or is used by the
> > second, third, fourth, ... instance of the same job that was restarted into
> > a new cgroup every time. Unreclaimable dying cgroups pile up, waste memory,
> > and make page reclaim very inefficient.
> > 
> > We can convert LRU pages and most other raw memcg pins to the objcg direction
> > to fix this problem, and then the LRU pages will not pin the memcgs.
> > 
> > This patchset aims to make the LRU pages drop their reference to the memory
> > cgroup by using the obj_cgroup APIs. As a result, the number of dying
> > cgroups does not increase when we run the following test script.
> > 
> > ```bash
> > #!/bin/bash
> > 
> > dd if=/dev/zero of=temp bs=4096 count=1
> > cat /proc/cgroups | grep memory
> > 
> > for i in {0..2000}
> > do
> > 	mkdir /sys/fs/cgroup/memory/test$i
> > 	echo $$ > /sys/fs/cgroup/memory/test$i/cgroup.procs
> > 	cat temp >> log
> > 	echo $$ > /sys/fs/cgroup/memory/cgroup.procs
> > 	rmdir /sys/fs/cgroup/memory/test$i
> > done
> > 
> > cat /proc/cgroups | grep memory
> > 
> > rm -f temp log
> > ```
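
The script prints the memory line of /proc/cgroups before and after the loop, so the two counts have to be compared by eye. A minimal sketch for doing that comparison numerically follows. The helper names memcg_count and memcg_delta are hypothetical and not part of the patchset, and the sketch assumes the usual /proc/cgroups layout, where the third column is num_cgroups.

```bash
#!/bin/bash

# Print the num_cgroups column (third field) of the "memory" row.
# Reads /proc/cgroups by default; another file can be passed for testing.
memcg_count() {
	awk '$1 == "memory" { print $3 }' "${1:-/proc/cgroups}"
}

# Print the difference between two samples (after minus before).
memcg_delta() {
	echo $(( $2 - $1 ))
}
```

One possible use is to sample memcg_count before and after the mkdir/rmdir loop and pass both values to memcg_delta. With the series applied, the difference would be expected to stay near zero rather than grow with the number of loop iterations.
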
> > 
> > [1] https://lore.kernel.org/linux-mm/20200623015846.1141975-1-guro@fb.com/
> > [2] https://lore.kernel.org/linux-mm/20210319163821.20704-1-songmuchun@bytedance.com/
> > 
> > v4: https://lore.kernel.org/all/20220524060551.80037-1-songmuchun@bytedance.com/
> > v3: https://lore.kernel.org/all/20220216115132.52602-1-songmuchun@bytedance.com/
> > v2: https://lore.kernel.org/all/20210916134748.67712-1-songmuchun@bytedance.com/
> > v1: https://lore.kernel.org/all/20210814052519.86679-1-songmuchun@bytedance.com/
> > RFC v4: https://lore.kernel.org/all/20210527093336.14895-1-songmuchun@bytedance.com/
> > RFC v3: https://lore.kernel.org/all/20210421070059.69361-1-songmuchun@bytedance.com/
> > RFC v2: https://lore.kernel.org/all/20210409122959.82264-1-songmuchun@bytedance.com/
> > RFC v1: https://lore.kernel.org/all/20210330101531.82752-1-songmuchun@bytedance.com/
> > 
> > v5:
> >   - Lots of improvements from Johannes, Roman and Waiman.
> >   - Fix lockdep warning reported by kernel test robot.
> >   - Add two new patches to do code cleanup.
> >   - Collect Acked-by and Reviewed-by from Johannes and Roman.
> >   - I didn't replace local_irq_disable/enable() with local_lock/unlock_irq() since
> >     local_lock/unlock_irq() takes a parameter; it needs more thinking to transform
> >     it to local_lock.  It could be an improvement in the future.
> 
> My comment about local_lock/unlock is just a note that
> local_irq_disable/enable() have to be eventually replaced. However, we need
> to think carefully where to put the newly added local_lock. It is perfectly
> fine to keep it as is and leave the conversion as a future follow-up.
>

Totally agree.
 
> Thank you very much for your work on this patchset.
>

Thanks.
