Date:   Thu, 29 Sep 2016 12:14:02 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:     Vlastimil Babka <vbabka@...e.cz>, Mel Gorman <mgorman@...e.de>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        kernel-team@...com
Subject: Re: Regression in mobility grouping?

On Thu, Sep 29, 2016 at 03:14:33PM +0900, Joonsoo Kim wrote:
> On Wed, Sep 28, 2016 at 10:25:40PM -0400, Johannes Weiner wrote:
> > On Wed, Sep 28, 2016 at 11:39:25AM -0400, Johannes Weiner wrote:
> > > On Wed, Sep 28, 2016 at 11:00:15AM +0200, Vlastimil Babka wrote:
> > > > I guess testing revert of 9c0415e could give us some idea. Commit
> > > > 3a1086f shouldn't result in pageblock marking differences and as I said
> > > > above, 99592d5 should be just restoring to what 3.10 did.
> > > 
> > > I can give this a shot, but note that this commit makes only unmovable
> > > stealing more aggressive. We see reclaimable blocks up as well.
> > 
> > Quick update, I reverted back to stealing eagerly only on behalf of
> > MIGRATE_RECLAIMABLE allocations in a 4.6 kernel:
> 
> Hello, Johannes.
> 
> I think that it would be better to check 3.10 with above patches.
> Fragmentation depends on not only policy itself but also
> allocation/free pattern. There might be a large probability that
> allocation/free pattern is changed in this large kernel version
> difference.

You mean backport suspicious patches to 3.10 until I can reproduce it
there? I'm not sure. You're correct, the patterns very likely *have*
changed. But that alone cannot explain mobility grouping breaking that
badly. There is a reproducible bad behavior. It should be easier to
track down than to try to recreate it in the last-known-good kernel.

> > This is an UNMOVABLE order-3 allocation falling back to RECLAIMABLE.
> > According to can_steal_fallback(), this allocation shouldn't steal the
> > pageblock, yet change_ownership=1 indicates the block is UNMOVABLE.
> > 
> > Who converted it? I wonder if there is a bug in ownership management,
> > and there was an UNMOVABLE block on the RECLAIMABLE freelist from the
> > beginning. AFAICS we never validate list/mt consistency anywhere.
> 
> According to my code review, it would be possible. When stealing
> happens, we moved those buddy pages to current requested migratetype
> buddy list. If the other migratetype allocation request comes and
> stealing from the buddy list of previous requested migratetype
> happens, change_ownership will show '1' even if there is no ownership
> changing.

These two paths should exclude each other through the zone->lock, no?
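
For illustration only, here is a toy model of the sequence described in
the quoted paragraph above. It is not kernel code: the names FreePage,
steal(), change_ownership() and list_consistent() are hypothetical, and
it assumes the tracepoint field simply reports whether the requesting
migratetype equals the pageblock's recorded migratetype at trace time,
as the discussion implies. The last helper sketches the kind of list/mt
consistency check that, as noted above, does not appear to exist today.

```python
from dataclasses import dataclass
from enum import Enum


class MT(Enum):
    UNMOVABLE = 0
    MOVABLE = 1
    RECLAIMABLE = 2


@dataclass
class FreePage:
    block_mt: MT  # migratetype recorded for the containing pageblock
    list_mt: MT   # free list the page currently sits on


def steal(page, requester, claim_block):
    """Move a free page to the requester's list.

    The pageblock itself is converted only when claim_block is True.
    """
    page.list_mt = requester
    if claim_block:
        page.block_mt = requester


def change_ownership(page, requester):
    # Assumption: the field is 1 when the requesting type already equals
    # the pageblock type, whether or not a conversion just happened.
    return int(page.block_mt == requester)


def list_consistent(page):
    # Hypothetical validation: does the free list match the block type?
    return page.list_mt == page.block_mt


def scenario():
    # An UNMOVABLE block with a free page on the UNMOVABLE list.
    page = FreePage(block_mt=MT.UNMOVABLE, list_mt=MT.UNMOVABLE)

    # Step 1: a RECLAIMABLE request falls back and takes the page onto
    # its own list without converting the pageblock.
    steal(page, MT.RECLAIMABLE, claim_block=False)
    consistent = list_consistent(page)

    # Step 2: an UNMOVABLE request falls back to the RECLAIMABLE list
    # and finds that same page there.
    found_on = page.list_mt
    reported = change_ownership(page, MT.UNMOVABLE)
    return consistent, found_on, reported
```

In this model the two steps run strictly one after the other, and the
second fallback still reports change_ownership=1 for a page found on the
RECLAIMABLE list, with the list/mt check failing after step 1.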
