Message-ID: <b7468bcc-4b75-1190-5eae-9796d35b048c@huawei.com>
Date: Tue, 16 Aug 2022 15:04:08 +0800
From: mawupeng <mawupeng1@...wei.com>
To: <gregkh@...uxfoundation.org>
CC: <mawupeng1@...wei.com>, <rppt@...ux.vnet.ibm.com>,
<hughd@...gle.com>, <aarcange@...hat.com>, <hannes@...xchg.org>,
<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
<wangkefeng.wang@...wei.com>, <willy@...radead.org>
Subject: Re: [PATCH stable 4.14,4.19 1/1] mm: Fix page counter mismatch in
shmem_mfill_atomic_pte
On 2022/8/16 13:31, Greg KH wrote:
> On Tue, Aug 16, 2022 at 11:27:08AM +0800, mawupeng wrote:
>> Cc Greg
>
> Cc Greg for what? I have no context here at all as to what you want me
> to do..
We found a bug in the memory cgroup counter in stable 4.14/4.19.
shmem_mfill_atomic_pte() wrongly calls mem_cgroup_cancel_charge() in the "success"
path; it should call mem_cgroup_uncharge() to decrement the memory counter instead.
mem_cgroup_cancel_charge() should only be used if the transaction is
unsuccessful; mem_cgroup_uncharge() is the call to use if the transaction
succeeds.
Commit 3fea5a499d57 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API")
in v5.8-rc1 changed the charge/uncharge/cancel logic, so later kernels do not have this
problem.
The counter then underflows to the negative maximum value, which triggers the OOM
killer; it kills every process, including sshd, and leaves the system inaccessible.
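To illustrate the accounting rule, here is a minimal user-space sketch. It is a
toy model, not kernel code: every toy_* name is hypothetical, and it assumes the
extra decrement happens because a page whose charge was committed is uncharged
again when it is freed, which is consistent with the call trace but is a
simplification.

```c
#include <assert.h>

/* Toy user-space model of the accounting rule; this is NOT kernel code. */
struct toy_counter {
	long usage;		/* pages currently charged to the group */
};

struct toy_page {
	int committed;		/* nonzero once a charge is bound to the page */
};

/* Reserve one page against the counter (stands in for a try-charge). */
void toy_try_charge(struct toy_counter *c)
{
	c->usage += 1;
}

/* Bind the reservation to the page (stands in for a commit). */
void toy_commit(struct toy_page *p)
{
	p->committed = 1;
}

/* Drop a reservation; knows nothing about the page's committed state. */
void toy_cancel(struct toy_counter *c)
{
	c->usage -= 1;
}

/* Drop the charge of a committed page and clear its committed state. */
void toy_uncharge(struct toy_counter *c, struct toy_page *p)
{
	if (p->committed) {
		c->usage -= 1;
		p->committed = 0;
	}
}

/*
 * Charge and commit one page, back out on a late error, then free the page.
 * use_uncharge != 0 backs out with toy_uncharge(), otherwise with toy_cancel().
 * Returns the final counter value; 0 means the accounting balanced.
 */
long toy_final_usage(int use_uncharge)
{
	struct toy_counter c = { 0 };
	struct toy_page p = { 0 };

	toy_try_charge(&c);
	toy_commit(&p);

	if (use_uncharge)
		toy_uncharge(&c, &p);
	else
		toy_cancel(&c);

	/* Freeing the page uncharges it if it is still marked committed. */
	toy_uncharge(&c, &p);
	return c.usage;
}

/* Sanity check of the model itself. */
void toy_selfcheck(void)
{
	assert(toy_final_usage(1) == 0);	/* balanced */
	assert(toy_final_usage(0) == -1);	/* decremented twice */
}
```

In this model, backing out a committed charge with the cancel helper leaves the
page marked as charged, so the later free decrements the counter a second time
and it goes below zero. Backing out with the uncharge helper keeps it balanced.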
The reason for Cc'ing you is that we would like this bugfix merged into stable 4.14/4.19.
The error call trace:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 17127 at mm/page_counter.c:62 page_counter_cancel+0x57/0x90
RIP: 0010:page_counter_cancel+0x57/0x90
Call Trace:
page_counter_uncharge+0x33/0x60
uncharge_batch+0xb5/0x5f0
mem_cgroup_uncharge_list+0x102/0x170
release_pages+0x814/0xcc0
tlb_flush_mmu_free+0xa9/0x140
arch_tlb_finish_mmu+0xa4/0x140
tlb_finish_mmu+0x90/0xf0
exit_mmap+0x264/0x4b0
>
> totally confused,
>
> greg k-h