lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170512063815.GC6803@dhcp22.suse.cz>
Date:   Fri, 12 May 2017 08:38:15 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Joonsoo Kim <js1304@...il.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Rik van Riel <riel@...hat.com>,
        Johannes Weiner <hannes@...xchg.org>,
        mgorman@...hsingularity.net, Laura Abbott <lauraa@...eaurora.org>,
        Minchan Kim <minchan@...nel.org>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Michal Nazarewicz <mina86@...a86.com>,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Russell King <linux@...linux.org.uk>,
        Will Deacon <will.deacon@....com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, kernel-team@....com
Subject: Re: [PATCH v7 0/7] Introduce ZONE_CMA

On Fri 12-05-17 11:00:48, Joonsoo Kim wrote:
> On Thu, May 11, 2017 at 11:13:04AM +0200, Michal Hocko wrote:
> > On Thu 11-05-17 11:12:43, Joonsoo Kim wrote:
> > > Sorry for the late response. I was on a vacation.
> > > 
> > > On Tue, May 02, 2017 at 03:32:29PM +0200, Michal Hocko wrote:
> > > > On Tue 02-05-17 13:01:32, Joonsoo Kim wrote:
> > > > > On Thu, Apr 27, 2017 at 05:06:36PM +0200, Michal Hocko wrote:
> > > > [...]
> > > > > > I see this point and I agree that using a specific zone might be a
> > > > > > _nicer_ solution in the end but you have to consider another aspects as
> > > > > > well. The main one I am worried about is a long term maintainability.
> > > > > > We are really out of page flags and consuming one for a rather specific
> > > > > > usecase is not good. Look at ZONE_DMA. I am pretty sure that almost
> > > > > > no sane HW needs 16MB zone anymore, yet we have hard time to get rid
> > > > > > of it and so we have that memory laying around unused all the time
> > > > > > and blocking one page flag bit. CMA falls into a similar category
> > > > > > AFAIU. I wouldn't be all that surprised if a future HW will not need CMA
> > > > > > allocations in few years, yet we will have to fight to get rid of it
> > > > > > like we do with ZONE_DMA. And not only that. We will also have to fight
> > > > > > finding page flags for other more general usecases in the meantime.
> > > > > 
> > > > > This maintenance problem is inherent. This problem exists even if we
> > > > > uses MIGRATETYPE approach. We cannot remove many hooks for CMA if a
> > > > > future HW will not need CMA allocation in few years. The only
> > > > > difference is that one takes single zone bit only for CMA user and the
> > > > > other approach takes many hooks that we need to take care about it all
> > > > > the time.
> > > > 
> > > > And I consider this a big difference. Because while hooks are not nice
> > > > they will affect CMA users (in a sense of bugs/performance etc.). While
> > > > an additional bit consumed will affect potential future and more generic
> > > > features.
> > > 
> > > Because these hooks are so tricky and are spread on many places,
> > > bugs about these hooks can be made by *non-CMA* user and they hurt
> > > *CMA* user. These hooks could also delay non-CMA user's development speed
> > > since there are many hooks about CMA and understanding how CMA is managed
> > > is rather difficult.
> > 
> > Than make those hooks easier to maintain. Seriously this is a
> > non-argument.
> 
> I can't understand what you said here. 

I wanted to say that you can make those hooks so non-intrusive that
nobody outside of the CMA has to even care that CMA exists.

> With zone approach, someone who
> isn't related to CMA don't need to understand how CMA is managed.
> 
> > 
> > [...]
> > 
> > > > And all this can be isolated to CMA specific hooks with mostly minimum
> > > > impact to most users. I hear you saying that zone approach is more natural
> > > > and I would agree if we wouldn't have to care about the number of zones.
> > > 
> > > I attach a solution about one more bit in page flags although I don't
> > > agree with your opinion that additional bit is no-go approach. Just
> > > note that we have already used three bits for zone encoding in some
> > > configuration due to ZONE_DEVICE.
> > 
> > I am absolutely not happy about ZONE_DEVICE but there is _no_ other
> > viable solution right now. I know that people behind this are still
> > considering struct page over direct pfn usage but they are not in the
> > same situation as CMA which _can_ work without additional zone.
> 
> IIUC, ZONE_DEVICE can reuse the other zone and migratetype.

They are not going to migrate anything or define any allocation fallback
policy because those pages are outside of the page allocator completely.
And that is why a zone approach is a reasonable approach. There are
probably other ways and I will certainly push going that way.

[...]

> > If you _really_ insist on using zone for CMA then reuse ZONE_MOVABLE.
> > I absolutely miss why do you push a specialized zone so hard.
> 
> As I said before, there is no fundamental issue to reuse ZONE_MOVABLE.
> I just don't want to reuse it because they have a different
> characteristic. In MM subsystem context, their characteristic is the same.
> However, CMA memory should be used for the device in runtime so more
> allocation guarantee is needed. See the offline_pages() in
> mm/memory_hotplug.c. They can bear in 120 sec to offline the
> memory but CMA memory can't.

This is just an implementation detail. Pinned pages in the CMA ranges
should be easilly checked. Moreover memory hotplug cares only about
hotplugable memory and placing CMA ranges there could be seen as a
configuration bug.

> And, this is a design issue. I don't want to talk about why should we
> pursuit the good design. Originally, ZONE exists to manage different
> type of memory. Migratetype is not for that purpose. Using separate
> zone fits the original purpose. Mixing them would be a bad design and
> we would esaily encounter the unexpected problem in the future.

As I've said earlier. Abusing ZONE_MOVABLE is not ideal either. I would
rather keep the status quo and fix the cluttered code and make it easier
to follow. But if you absolutely insist that a specialized zone is
necessary then ZONE_MOVABLE a) already exists and we do not need to
consume another bit b) most of the CMA zone characteristics overlap
with MOVABLE. So it is the least painful zone to use with the current
restrictions we have.

> > [...]
> > > > No, but I haven't heard any single argument that those bugs are
> > > > impossible to fix with the current approach. They might be harder to fix
> > > > but if I can chose between harder for CMA and harder for other more
> > > > generic HW independent features I will go for the first one. And do not
> > > > take me wrong, I have nothing against CMA as such. It solves a real life
> > > > problem. I just believe it doesn't deserve to consume a new bit in page
> > > > flags because that is just too scarce resource.
> > > 
> > > As I mentioned above, I think that maintenance overhead due to CMA
> > > deserves to consume a new bit in page flags. It also provide us
> > > extendability to introduce more zones in the future.
> > > 
> > > Anyway, this value-judgement is subjective so I guess that we
> > > cannot agree with each other. To solve your concern,
> > > I make following solution. Please let me know your opinion about this.
> > > This patch can be applied on top of my ZONE_CMA series.
> > 
> > I don not think this makes situation any easier or more acceptable for
> > merging.
> 
> Please say the reason. This implementation don't use additional bit in
> page flags that you concerned about. And, there is no performance
> regression at least in my simple test.

I really do not want to question your "simple test" but page_zonenum is
used in many performance sensitive paths and proving it doesn't regress
would require testing many different workload. Are you going to do that?

> > But I feel we are looping without much progress. So let me NAK this
> > until it is _proven_ that the current code is unfixable nor ZONE_MOVABLE
> > can be reused
> 
> I want to open all the possibilty so could you check that ZONE_MOVABLE
> can be overlapped with other zones? IIRC, your rework doesn't allow
> it.

My rework keeps the status quo, which is based on the assumption that
zones cannot overlap. A longer term plan is that this restriction is
removed. As I've said earlier overlapping zones is an interesting
concept which is definitely worth pursuing.

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ