Message-ID: <60e8df74202e40b28a4d53dbc7fd0b22@IL-EXCH02.marvell.com>
Date: Tue, 31 May 2016 13:10:44 +0000
From: Yehuda Yitschak <yehuday@...vell.com>
To: Marcin Wojtas <mw@...ihalf.com>,
Robin Murphy <robin.murphy@....com>
CC: "linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Lior Amsalem <alior@...vell.com>,
Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
Catalin Marinas <catalin.marinas@....com>,
Arnd Bergmann <arnd@...db.de>,
Grzegorz Jaszczyk <jaz@...ihalf.com>,
Will Deacon <will.deacon@....com>,
Nadav Haklai <nadavh@...vell.com>,
Tomasz Nowicki <tn@...ihalf.com>,
Gregory Clément
<gregory.clement@...e-electrons.com>
Subject: RE: [BUG] Page allocation failures with newest kernels
Hi Robin,
During some of the stress tests we also came across a different warning from the arm64 page management code.
It looks like a race was detected between HW and SW marking a bit in the PTE.
I'm not sure it's really related, but I thought it might give a clue to the issue:
http://pastebin.com/ASv19vZP
Thanks
Yehuda
> -----Original Message-----
> From: Marcin Wojtas [mailto:mw@...ihalf.com]
> Sent: Tuesday, May 31, 2016 13:30
> To: Robin Murphy
> Cc: linux-mm@...ck.org; linux-kernel@...r.kernel.org; linux-arm-
> kernel@...ts.infradead.org; Lior Amsalem; Thomas Petazzoni; Yehuda
> Yitschak; Catalin Marinas; Arnd Bergmann; Grzegorz Jaszczyk; Will Deacon;
> Nadav Haklai; Tomasz Nowicki; Gregory Clément
> Subject: Re: [BUG] Page allocation failures with newest kernels
>
> Hi Robin,
>
> >
> > I remember there were some issues around 4.2 with the revision of the
> > arm64 atomic implementations affecting the cmpxchg_double() in SLUB,
> > but those should all be fixed (and the symptoms tended to be
> > considerably more fatal).
> > A stronger candidate would be 97303480753e (which landed in 4.4),
> > which has various knock-on effects on the layout of SLUB internals -
> > does fiddling with L1_CACHE_SHIFT make any difference?
> >
>
> I'll check the commits, thanks. I forgot to add that L1_CACHE_SHIFT was my
> first suspect - I had spent a long time debugging a network controller, which
> stopped working because of this change - L1_CACHE_BYTES (and hence
> NET_SKB_PAD) not fitting HW constraints. Anyway, reverting it didn't help at
> all with the page alloc issue.
>
> Best regards,
> Marcin