Message-ID: <20190926122208.GI20255@dhcp22.suse.cz>
Date:   Thu, 26 Sep 2019 14:22:08 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Alexander Duyck <alexander.duyck@...il.com>
Cc:     virtio-dev@...ts.oasis-open.org, kvm list <kvm@...r.kernel.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        David Hildenbrand <david@...hat.com>,
        Dave Hansen <dave.hansen@...el.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Matthew Wilcox <willy@...radead.org>,
        linux-mm <linux-mm@...ck.org>, Vlastimil Babka <vbabka@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        linux-arm-kernel@...ts.infradead.org,
        Oscar Salvador <osalvador@...e.de>,
        Yang Zhang <yang.zhang.wz@...il.com>,
        Pankaj Gupta <pagupta@...hat.com>,
        Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
        Nitesh Narayan Lal <nitesh@...hat.com>,
        Rik van Riel <riel@...riel.com>, lcapitulino@...hat.com,
        "Wang, Wei W" <wei.w.wang@...el.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Alexander Duyck <alexander.h.duyck@...ux.intel.com>
Subject: Re: [PATCH v10 0/6] mm / virtio: Provide support for unused page
 reporting

On Tue 24-09-19 08:20:22, Alexander Duyck wrote:
> On Tue, Sep 24, 2019 at 7:23 AM Michal Hocko <mhocko@...nel.org> wrote:
> >
> > On Wed 18-09-19 10:52:25, Alexander Duyck wrote:
> > [...]
> > > In order to try and keep the time needed to find a non-reported page to
> > > a minimum we maintain a "reported_boundary" pointer. This pointer is used
> > > by the get_unreported_pages iterator to determine at what point it should
> > > resume searching for non-reported pages. In order to guarantee pages do
> > > not get past the scan I have modified add_to_free_list_tail so that it
> > > will not insert pages behind the reported_boundary.
> > >
> > > If another process needs to perform a massive manipulation of the free
> > > list, such as compaction, it can either reset a given individual boundary
> > > which will push the boundary back to the list_head, or it can clear the
> > > bit indicating the zone is actively processing which will result in the
> > > reporting process resetting all of the boundaries for a given zone.
> >
> > Is this any different from the previous version? The last review
> > feedback (both from me and Mel) was that we are not happy to have
> > externally imposed constraints on how the page allocator is supposed to
> > maintain its free lists.
> 
> The main change for v10 versus v9 is that I allow the page reporting
> boundary to be overridden. Specifically there are two approaches that
> can be taken.
> 
> The first is to simply reset the iterator for whatever list is
> updated. What this will do is reset the iterator back to list_head and
> then you can do whatever you want with that specific list.

OK, this is slightly better than pushing the allocator into a corner.
The allocator really has to be in control of its data structures.
I would still be happier if the allocator didn't have to care about
somebody snooping on its internal state to do its own thing. So
please make sure to describe why and how much this really matters.
 
> The other option is to simply clear the ZONE_PAGE_REPORTING_ACTIVE
> bit. That will essentially notify the page reporting code that any/all
> hints that were recorded have been discarded and that it needs to
> start over.
> 
> All I am trying to do with this approach is reduce the work. Without
> doing this the code has to walk the entire free page list for the
> higher orders every iteration and that will not be cheap.

How expensive will this be?

> Admittedly
> it is a bit more invasive than the cut/splice logic used in compaction
> which is taking the pages it has already processed and moving them to
> the other end of the list. However, I have reduced things so that we
> only really are limiting where add_to_free_list_tail can place pages,
> and we are having to check/push back the boundaries if a reported page
> is removed from a free_list.
> 
> > If this is really the only way to go forward then I would like to hear
> > very convincing arguments about other approaches not being feasible.
> > There are none in this cover letter unfortunately. This will be really a
> > hard sell without them.
> 
> So I had considered several different approaches.

Thanks, this is certainly useful, and it would have been even more so if
you had given some rough numbers to quantify how much overhead the
different solutions involve.
-- 
Michal Hocko
SUSE Labs
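
For readers following the thread, the boundary scheme described in the quoted cover letter can be modelled in a few lines of C. This is a simplified userspace sketch, not code from the patch set. Only the names reported_boundary and add_to_free_list_tail come from the cover letter; struct free_list, struct page_stub, del_from_free_list, mark_reported, next_unreported and reset_boundary are hypothetical, and the sketch assumes that reported pages are kept grouped at the tail of the list, which the quoted text does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the spirit of the kernel's list_head. */
struct list_node { struct list_node *prev, *next; };

struct free_list {
    struct list_node head;      /* list sentinel */
    struct list_node *boundary; /* models reported_boundary; &head when reset */
};

struct page_stub {              /* hypothetical stand-in for struct page */
    struct list_node lru;       /* must stay the first member (see cast below) */
    bool reported;
    int id;
};

static void list_insert_before(struct list_node *n, struct list_node *pos)
{
    n->prev = pos->prev;
    n->next = pos;
    pos->prev->next = n;
    pos->prev = n;
}

static void list_unlink(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

static void free_list_init(struct free_list *fl)
{
    fl->head.prev = fl->head.next = &fl->head;
    fl->boundary = &fl->head;
}

/* Tail insert that never places a page behind the boundary. */
static void add_to_free_list_tail(struct free_list *fl, struct page_stub *p)
{
    p->reported = false;
    list_insert_before(&p->lru, fl->boundary);
}

/* Removing the boundary page pushes the boundary to the next entry. */
static void del_from_free_list(struct free_list *fl, struct page_stub *p)
{
    if (fl->boundary == &p->lru)
        fl->boundary = p->lru.next;
    list_unlink(&p->lru);
}

/* Move an unreported page into the reported region and grow that region. */
static void mark_reported(struct free_list *fl, struct page_stub *p)
{
    assert(!p->reported);
    del_from_free_list(fl, p);
    p->reported = true;
    list_insert_before(&p->lru, fl->boundary);
    fl->boundary = &p->lru;
}

/* Scan from the list head and stop at the boundary. */
static struct page_stub *next_unreported(struct free_list *fl)
{
    struct list_node *n;

    for (n = fl->head.next; n != fl->boundary; n = n->next) {
        struct page_stub *p = (struct page_stub *)n; /* lru is first member */
        if (!p->reported)
            return p;
    }
    return NULL;
}

/* Override: hand the whole list back to the allocator. */
static void reset_boundary(struct free_list *fl)
{
    fl->boundary = &fl->head;
}
```

In this model, after reset_boundary() the scan has to walk the entire list and check each page's reported flag again. That full walk is the cost the boundary is meant to avoid, and it is the overhead that the questions above ask to have quantified.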
