lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170829133945.at7q7u2vk6qwrhjh@dhcp22.suse.cz>
Date:   Tue, 29 Aug 2017 15:39:45 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:     Vlastimil Babka <vbabka@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Johannes Weiner <hannes@...xchg.org>,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Minchan Kim <minchan@...nel.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for
 ZONE_MOVABLE request

On Tue 29-08-17 09:45:47, Joonsoo Kim wrote:
> On Mon, Aug 28, 2017 at 11:56:16AM +0200, Michal Hocko wrote:
> > On Mon 28-08-17 09:15:52, Joonsoo Kim wrote:
> > > On Fri, Aug 25, 2017 at 09:38:42AM +0200, Michal Hocko wrote:
> > > > On Fri 25-08-17 09:20:31, Joonsoo Kim wrote:
> > > > > On Thu, Aug 24, 2017 at 11:41:58AM +0200, Vlastimil Babka wrote:
> > > > > > On 08/24/2017 07:45 AM, js1304@...il.com wrote:
> > > > > > > From: Joonsoo Kim <iamjoonsoo.kim@....com>
> > > > > > > 
> > > > > > > Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
> > > > > > > important to reserve. When ZONE_MOVABLE is used, this problem would
> > > > > > > theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
> > > > > > > allocation request which is mainly used for page cache and anon page
> > > > > > > allocation. So, fix it.
> > > > > > > 
> > > > > > > And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
> > > > > > > makes code complex. For example, if there is highmem system, following
> > > > > > > reserve ratio is activated for *NORMAL ZONE* which would be easyily
> > > > > > > misleading people.
> > > > > > > 
> > > > > > >  #ifdef CONFIG_HIGHMEM
> > > > > > >  32
> > > > > > >  #endif
> > > > > > > 
> > > > > > > This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
> > > > > > > array by MAX_NR_ZONES and place "#ifdef" to right place.
> > > > > > > 
> > > > > > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
> > > > > > > Acked-by: Vlastimil Babka <vbabka@...e.cz>
> > > > > > 
> > > > > > Looks like I did that almost year ago, so definitely had to refresh my
> > > > > > memory now :)
> > > > > > 
> > > > > > Anyway now I looked more thoroughly and noticed that this change leaks
> > > > > > into the reported sysctl. On a 64bit system with ZONE_MOVABLE:
> > > > > > 
> > > > > > before the patch:
> > > > > > vm.lowmem_reserve_ratio = 256   256     32
> > > > > > 
> > > > > > after the patch:
> > > > > > vm.lowmem_reserve_ratio = 256   256     32      2147483647
> > > > > > 
> > > > > > So if we indeed remove HIGHMEM from protection (c.f. Michal's mail), we
> > > > > > should do that differently than with the INT_MAX trick, IMHO.
> > > > > 
> > > > > Hmm, this is already pointed by Minchan and I have answered that.
> > > > > 
> > > > > lkml.kernel.org/r/<20170421013243.GA13966@...304-desktop>
> > > > > 
> > > > > If you have a better idea, please let me know.
> > > > 
> > > > Why don't we just use 0. In fact we are reserving 0 pages... Using
> > > > INT_MAX is just wrong.
> > > 
> > > The number of reserved pages is calculated by "managed_pages /
> > > ratio". Using INT_MAX, net result would be 0.
> > 
> > Why cannot we simply special case 0?
> > 
> > > There is a logic converting ratio 0 to ratio 1.
> > > 
> > > if (sysctl_lowmem_reserve_ratio[idx] < 1)
> > >         sysctl_lowmem_reserve_ratio[idx] = 1
> > 
> > This code just tries to prevent from division by 0 but I am wondering
> > we should simply set lowmem_reserve to 0 in that case.
> > 
> > > If I use 0 to represent 0 reserved page, there would be a user
> > > who is affected by this change. So, I don't use 0 for this patch.
> > 
> > I am sorry but I do not understand? Could you be more specific please?
> 
> If there is a user that manually set sysctl_lowmem_reserve_ratio and
> he/she uses '0' to set ratio to '1', your suggestion making '0' as
> a special value changes his/her system behaviour. I'm afraid this
> case.

Documentation (Documentation/sysctl/vm.txt) explicitly states that 1
is minimum. So I wouldn't afraid all that much. And you can actually
printk_once if 0 is set and explain that this disables memory reserve
for the particular zone altogether.

> However, if you and Vlastimil agree with this making '0' as a special
> value, I will go this way.

I do agree that INT_MAX is just too ugly.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ