Message-Id: <20200818201817.351499e75cba2a84e8bf33e6@linux-foundation.org>
Date: Tue, 18 Aug 2020 20:18:17 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Doug Berger <opendmb@...il.com>
Cc: Jason Baron <jbaron@...mai.com>,
David Rientjes <rientjes@...gle.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm: include CMA pages in lowmem_reserve at boot
On Fri, 14 Aug 2020 09:49:26 -0700 Doug Berger <opendmb@...il.com> wrote:
> The lowmem_reserve arrays provide a means of applying pressure
> against allocations from lower zones that were targeted at
> higher zones. Its values are a function of the number of pages
> managed by higher zones and are assigned by a call to the
> setup_per_zone_lowmem_reserve() function.
>
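
For readers following along, a toy model of that calculation (an
assumption-laden sketch, not the kernel source; the zone layout, page
counts and array names below are made up for illustration):

#include <stdio.h>

#define NR_ZONES 3	/* DMA, Normal, HighMem */

int main(void)
{
	/* Managed pages per zone, lowest to highest: made-up values for
	 * 768 MiB DMA, an empty Normal and 256 MiB HighMem, 4 KiB pages. */
	unsigned long managed[NR_ZONES] = { 196608, 0, 65536 };
	/* Divisor per lower zone; the kernel's defaults are 256 for DMA
	 * and 32 for Normal. The HighMem entry is unused since no zone
	 * sits above it. */
	unsigned long ratio[NR_ZONES] = { 256, 32, 1 };
	unsigned long reserve[NR_ZONES][NR_ZONES] = { { 0 } };
	int i, j;

	for (j = 0; j < NR_ZONES; j++) {
		/* Pages managed by the zones above zone i, up to and
		 * including the target zone j. */
		unsigned long pages = managed[j];

		for (i = j; i-- > 0; ) {
			reserve[i][j] = pages / ratio[i];
			pages += managed[i];
		}
	}

	for (i = 0; i < NR_ZONES; i++)
		for (j = i + 1; j < NR_ZONES; j++)
			printf("zone %d protection against zone %d: %lu pages\n",
			       i, j, reserve[i][j]);
	return 0;
}

With the 65536 HighMem pages counted it prints 256 pages of DMA
protection against HighMem allocations (65536 / 256); zero that third
managed[] entry, as at boot before CMA accounting, and the protection
drops to 0.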
> The function is initially called at boot time by the function
> init_per_zone_wmark_min() and may be called later by accesses
> of the /proc/sys/vm/lowmem_reserve_ratio sysctl file.
>
> The function init_per_zone_wmark_min() was moved up from a
> module_init to a core_initcall to resolve a sequencing issue
> with khugepaged. Unfortunately this created a sequencing issue
> with CMA page accounting.
>
> The CMA pages are added to the managed page count of a zone when
> cma_init_reserved_areas() is called at boot, which is also a
> core_initcall. This makes it uncertain whether the CMA pages will
> be added to the managed page counts of their zones before or after
> the call to init_per_zone_wmark_min(), because the relative order
> depends on link order. With the current link order the pages are
> added to the managed count after the lowmem_reserve arrays are
> initialized at boot.
>
> This means the lowmem_reserve values at boot may be lower than
> the values used later if /proc/sys/vm/lowmem_reserve_ratio is
> accessed even if the ratio values are unchanged.
>
> In many cases the difference is not significant, but, for example,
> an ARM platform with 1 GB of memory and the following memory layout
> [    0.000000] cma: Reserved 256 MiB at 0x0000000030000000
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
> [    0.000000]   Normal   empty
> [    0.000000]   HighMem  [mem 0x0000000030000000-0x000000003fffffff]
>
> would result in 0 lowmem_reserve for the DMA zone. This would allow
> userspace to deplete the DMA zone easily.

Sounds fairly serious for those machines. Was a cc:stable considered?

> Funnily enough
> $ cat /proc/sys/vm/lowmem_reserve_ratio
> would fix up the situation because it forces
> setup_per_zone_lowmem_reserve as a side effect.
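
To put numbers on it: with 4 KiB pages the 256 MiB CMA region in
HighMem is 65536 pages, and the default lowmem_reserve_ratio for a
DMA zone is 256, so once the CMA pages are accounted the DMA zone's
protection against HighMem-targeted allocations becomes
65536 / 256 = 256 pages; computed before the accounting, it is
0 / 256 = 0.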
>
> This commit breaks the link order dependency by invoking
> init_per_zone_wmark_min() as a postcore_initcall so that the
> CMA pages have the chance to be properly accounted in their
> zone(s), allowing the lowmem_reserve arrays to receive
> consistent values.
>
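
If I read the description right, the fix is essentially a one-level
bump of the registration in mm/page_alloc.c, along the lines of:

-core_initcall(init_per_zone_wmark_min)
+postcore_initcall(init_per_zone_wmark_min)

Every postcore_initcall runs after all core_initcalls regardless of
link order, so cma_init_reserved_areas() is then guaranteed to have
updated the managed page counts before the watermarks and
lowmem_reserve arrays are computed.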