Message-ID: <9cc88b1c-8a8c-95ea-2cf7-31be3b771495@omnom.net>
Date: Sun, 3 Apr 2022 13:34:07 +1000
From: Andrew Holmes <aholmes@...om.net>
To: "Maciej W. Rozycki" <macro@...am.me.uk>, yaliang.wang@...driver.com
Cc: rppt@...nel.org, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
huangpei@...ngson.cn, Andrew Morton <akpm@...ux-foundation.org>,
kumba@...too.org, Geert Uytterhoeven <geert@...ux-m68k.org>,
anshuman.khandual@....com, penberg@...nel.org,
linux-mips@...r.kernel.org, linux-kernel@...r.kernel.org,
Greg KH <gregkh@...uxfoundation.org>
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On 3/4/2022 12:48 am, Maciej W. Rozycki wrote:
> On Thu, 10 Mar 2022, yaliang.wang@...driver.com wrote:
>
>> pgd page is freed by generic implementation pgd_free() since commit
>> f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
>> however, there are scenarios in which the system uses more than one page
>> for the pgd table; in such cases the generic pgd_free() implementation
>> is no longer applicable. For example, when PAGE_SIZE_4KB is enabled and
>> MIPS_VA_BITS_48 is not enabled on a 64-bit system, the macro "PGD_ORDER"
>> is set to "1", so two pages are allocated for the pgd table. The
>> generic pgd_free() implementation, however, frees only one pgd page,
>> which results in a memory leak.
>>
>> The memory leak can be easily detected by executing shell command:
>> "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"
>>
>> Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
>> Signed-off-by: Yaliang Wang <Yaliang.Wang@...driver.com>
>
> As a critical regression shouldn't this have been marked for backporting
> to stable branches?

Very yes please - this bug has been driving several of us at OpenWrt
crazy for quite[1] some[2] time now, mostly on Octeon devices. We'd
(wrongly) suspected the octeon-ethernet driver, but this morning finally
bisected it down to f9cb654cb550 and can confirm this patch fixes the
regression.

MIPS64 has essentially been broken/unusable for 8 kernel releases,
including two LTS kernels, since the original commit landed. Should
there not have been CI/tests that caught this? It's pretty major!
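
For anyone following along, the arithmetic behind the leak described in the quoted commit message can be sketched in userspace C. This is only an illustration under stated assumptions, not kernel code: the constants mirror the PAGE_SIZE_4KB / PGD_ORDER = 1 case from the commit message, and the helper names (pgd_alloc_bytes, single_page_free_bytes, leaked_bytes_per_teardown, total_leaked_bytes) are hypothetical names made up for this sketch.

```c
#include <assert.h>

/* Assumed values, taken from the case described in the commit message:
 * 4 KiB pages, and PGD_ORDER == 1 (64-bit, MIPS_VA_BITS_48 disabled). */
#define PAGE_SIZE 4096UL
#define PGD_ORDER 1U

/* Bytes allocated for one pgd table: 2^order contiguous pages. */
static unsigned long pgd_alloc_bytes(unsigned int order)
{
    assert(order < 16); /* keep the shift well inside unsigned long */
    return PAGE_SIZE << order;
}

/* Bytes released by a free path that assumes the pgd is a single page. */
static unsigned long single_page_free_bytes(void)
{
    return PAGE_SIZE;
}

/* Bytes left behind each time an address space is torn down. */
static unsigned long leaked_bytes_per_teardown(unsigned int order)
{
    return pgd_alloc_bytes(order) - single_page_free_bytes();
}

/* Bytes left behind after a given number of teardowns (hypothetical helper). */
static unsigned long total_leaked_bytes(unsigned int order, unsigned long teardowns)
{
    return leaked_bytes_per_teardown(order) * teardowns;
}
```

With PGD_ORDER at 1, leaked_bytes_per_teardown(PGD_ORDER) comes to one 4 KiB page for every process that exits, which fits the steadily falling MemFree seen with the "ls" loop in the commit message. With an order of 0 the same calculation gives zero, which is why single-page pgd configurations would not show the problem. Freeing with the same order that was used for the allocation closes the gap.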

- Andrew

[1] https://forum.openwrt.org/t/oom-killer-dnsmasq-when-physical-free-ram-remains/109351
[2] https://forum.openwrt.org/t/upstream-kernel-memleak-5-10-octeon-ethernet-ko/111827