linux-kernel - Re: [PATCH v12 2/6] mm: Use zone and order instead of free area in free

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <e750dab283e0bc17658486e0f3070c55b7923eb2.camel@linux.intel.com>
Date:   Thu, 24 Oct 2019 08:19:17 -0700
From:   Alexander Duyck <alexander.h.duyck@...ux.intel.com>
To:     David Hildenbrand <david@...hat.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        kvm@...r.kernel.org, mst@...hat.com, linux-kernel@...r.kernel.org,
        willy@...radead.org, mhocko@...nel.org, linux-mm@...ck.org,
        akpm@...ux-foundation.org, mgorman@...hsingularity.net,
        vbabka@...e.cz
Cc:     yang.zhang.wz@...il.com, nitesh@...hat.com, konrad.wilk@...cle.com,
        pagupta@...hat.com, riel@...riel.com, lcapitulino@...hat.com,
        dave.hansen@...el.com, wei.w.wang@...el.com, aarcange@...hat.com,
        pbonzini@...hat.com, dan.j.williams@...el.com, osalvador@...e.de
Subject: Re: [PATCH v12 2/6] mm: Use zone and order instead of free area in
 free_list manipulators

On Thu, 2019-10-24 at 11:32 +0200, David Hildenbrand wrote:
> On 23.10.19 17:16, Alexander Duyck wrote:
> > On Wed, 2019-10-23 at 10:26 +0200, David Hildenbrand wrote:
> > > On 23.10.19 00:28, Alexander Duyck wrote:
> > > > From: Alexander Duyck <alexander.h.duyck@...ux.intel.com>
> > > > 
> > > > In order to enable the use of the zone from the list manipulator functions
> > > > I will need access to the zone pointer. As it turns out most of the
> > > > accessors were always just being directly passed &zone->free_area[order]
> > > > anyway so it would make sense to just fold that into the function itself
> > > > and pass the zone and order as arguments instead of the free area.
> > > > 
> > > > In order to be able to reference the zone we need to move the declaration
> > > > of the functions down so that we have the zone defined before we define the
> > > > list manipulation functions. Since the functions are only used in the file
> > > > mm/page_alloc.c we can just move them there to reduce noise in the header.
> > > > 
> > > > Reviewed-by: Dan Williams <dan.j.williams@...el.com>
> > > > Reviewed-by: David Hildenbrand <david@...hat.com>
> > > > Reviewed-by: Pankaj Gupta <pagupta@...hat.com>
> > > > Signed-off-by: Alexander Duyck <alexander.h.duyck@...ux.intel.com>
> > > > ---
> > > >    include/linux/mmzone.h |   32 -----------------------
> > > >    mm/page_alloc.c        |   67 +++++++++++++++++++++++++++++++++++-------------
> > > >    2 files changed, 49 insertions(+), 50 deletions(-)
> > > 
> > > Did you see
> > > 
> > > https://lore.kernel.org/lkml/20191001152928.27008.8178.stgit@localhost.localdomain/T/#m4d2bc2f37bd7bdc3ae35c4f197905c275d0ad2f9
> > > 
> > > this time?
> > > 
> > > And the difference to the old patch is only an empty line.
> > > 
> > 
> > I saw the report. However I have not had much luck reproducing it in order
> > to get root cause. Here are my results for linux-next 20191021 with that
> > patch running page_fault2 over an average of 3 runs:
> 
> It would have been good if you'd reply to the report or sth. like that. 
> Then people (including me) are aware that you looked into it and what 
> your results of your investigation were.

I'll try to be more careful to remember to do that in the future.

> > Baseline:   3734692.00
> > This patch: 3739878.67
> > 
> > Also I am not so sure about these results as the same patch had passed
> > previously before and instead it was patch 3 that was reported as having a
> > -1.2% regression[1]. All I changed in response to that report was to add
> 
> Well, previously there was also a regression in the successor 
> PageReported() patch, not sure how they bisect in this case.

This is one of the things that has me thinking this is a possible code
alignment type issue. In the past when chasing down network performance
issues I would see these kind of things when a loop went from being cache
line aligned to not being aligned.

> > page_is_reported() which just wrapped the bit test for the reported flag
> > in a #ifdef to avoid testing it for the blocks that were already #ifdef
> > wrapped anyway.
> > 
> > I am still trying to see if I can get access to a system that would be a
> > better match for the one that reported the issue. My working theory is
> 
> I barely see false positives (well, I also barely see reports at all) on 
> MM, that's why I asked.

Like I said, I will dig into this.

> > that maybe it requires a high core count per node to reproduce. Either
> > that or it is some combination of the kernel being tested on and the patch
> > is causing some loop to go out of alignment and become more expensive.
> 
> Yes, double check that the config and the setup roughly matches what has 
> been reported.

I've been testing with the same .config and version of gcc. The last bit I
am trying now is to test with the same exact kernel source. I figure if I
can reproduce the issue with that then I can figure out exact root cause,
even if there isn't much we can do since the issue doesn't appear with the
latest patch set.