Message-ID: <8a2f2644-71d0-05d7-49d8-878aafa99652@huawei.com>
Date: Sat, 26 Nov 2022 21:09:51 +0800
From: Yongqiang Liu <liuyongqiang13@...wei.com>
To: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
CC: "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	<aarcange@...hat.com>, <hughd@...gle.com>, <mgorman@...e.de>,
	<mhocko@...e.cz>, <cl@...two.org>, <n-horiguchi@...jp.nec.com>,
	<zokeefe@...gle.com>, <rientjes@...gle.com>,
	Matthew Wilcox <willy@...radead.org>, <peterx@...hat.com>,
	"Wangkefeng (OS Kernel Lab)" <wangkefeng.wang@...wei.com>,
	"zhangxiaoxu (A)" <zhangxiaoxu5@...wei.com>,
	<kirill.shutemov@...ux.intel.com>,
	Yongqiang Liu <liuyongqiang13@...wei.com>,
	Lu Jialin <lujialin4@...wei.com>
Subject: [QUESTION] memcg page_counter seems broken in MADV_DONTNEED with THP enabled

Hi,

We use the mm_counter to track how much physical memory a process uses,
while the page_counter of a memcg is used to track how much physical
memory a cgroup uses. If a cgroup contains only a single process, the two
should look almost the same. But with THP enabled, memory.usage_in_bytes
of the memcg can sometimes be twice or more the Rss reported in
/proc/[pid]/smaps_rollup, as follows:

[root@...alhost sda]# cat /sys/fs/cgroup/memory/test/memory.usage_in_bytes
1080930304
[root@...alhost sda]# cat /sys/fs/cgroup/memory/test/cgroup.procs
1290
[root@...alhost sda]# cat /proc/1290/smaps_rollup
55ba80600000-ffffffffff601000 ---p 00000000 00:00 0          [rollup]
Rss:              500648 kB
Pss:              498337 kB
Shared_Clean:       2732 kB
Shared_Dirty:          0 kB
Private_Clean:       364 kB
Private_Dirty:    497552 kB
Referenced:       500648 kB
Anonymous:        492016 kB
LazyFree:              0 kB
AnonHugePages:    129024 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:           0

I found that the difference comes from __split_huge_pmd decreasing the
mm_counter, while the page_counter in the memcg is not decreased because
the refcount of the head page has not dropped to zero. The call paths are
as follows:

do_madvise
  madvise_dontneed_free
    zap_page_range
      unmap_single_vma
        zap_pud_range
          zap_pmd_range
            __split_huge_pmd
              __split_huge_pmd_locked
                __mod_lruvec_page_state
            zap_pte_range
              add_mm_rss_vec
                add_mm_counter            -> decreases the mm_counter
      tlb_finish_mmu
        arch_tlb_finish_mmu
          tlb_flush_mmu_free
            free_pages_and_swap_cache
              release_pages
                folio_put_testzero(page)  -> not zero, skip
                  continue;
                __folio_put_large
                  free_transhuge_page
                    free_compound_page
                      mem_cgroup_uncharge
                        page_counter_uncharge -> decreases the page_counter

The node page stats shown in /proc/meminfo are also decreased, yet
__split_huge_pmd seems to free no physical memory unless the whole THP is
freed. I am confused about which of the two reflects the true physical
memory usage of a process.

Kind regards,
Yongqiang Liu
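[Editor's note: the sketch below is not part of the original report. It is a
minimal userspace reproducer of the scenario described above, under the
assumptions that THP is enabled (e.g. "always" or "madvise"), PMD-sized huge
pages are 2 MiB, and the process has already been moved into a v1 memcg such
as /sys/fs/cgroup/memory/test. While the program sleeps, memory.usage_in_bytes
should stay near the full mapping size while Rss in smaps_rollup drops
sharply, similar to the numbers shown above.]

/*
 * Hypothetical reproducer sketch (not from the original mail).
 * Build: gcc -O2 -o thp-memcg-repro thp-memcg-repro.c
 * Before running, move the shell (or the resulting pid) into a memcg,
 * e.g. /sys/fs/cgroup/memory/test/cgroup.procs.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define HPAGE_SIZE (2UL << 20)          /* assumed 2 MiB PMD-sized THP */
#define MAP_SIZE   (256 * HPAGE_SIZE)   /* 512 MiB anonymous mapping   */

int main(void)
{
	size_t psize = (size_t)getpagesize();
	size_t off;

	/* Map an anonymous region large enough to be backed by many THPs. */
	char *buf = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Ask for huge pages, then fault the whole region in. */
	madvise(buf, MAP_SIZE, MADV_HUGEPAGE);
	memset(buf, 0x5a, MAP_SIZE);

	/*
	 * Drop everything except the first 4 KiB of each 2 MiB huge page.
	 * Each madvise() call splits the PMD and zaps the PTEs, so the
	 * process RSS (mm_counter) shrinks, but because one subpage of each
	 * compound page stays mapped, the THP itself is not freed and the
	 * memcg page_counter keeps charging the full huge page.
	 */
	for (off = 0; off < MAP_SIZE; off += HPAGE_SIZE)
		madvise(buf + off + psize, HPAGE_SIZE - psize, MADV_DONTNEED);

	printf("pid %d: compare Rss in /proc/%d/smaps_rollup with "
	       "memory.usage_in_bytes now\n", getpid(), getpid());
	pause();			/* keep the mapping alive */
	return 0;
}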