Message-ID: <096ddb2cfbe83309396c48e75648889cae68e672.camel@intel.com>
Date: Fri, 5 Jul 2019 05:09:29 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "mingo@...nel.org" <mingo@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"mroos@...ux.ee" <mroos@...ux.ee>,
"linux-ia64@...ts.kernel.org" <linux-ia64@...ts.kernel.org>
CC: "namit@...are.com" <namit@...are.com>,
"Hansen, Dave" <dave.hansen@...el.com>
Subject: Re: [bisected] "mm/vmalloc: Add flag for freeing of special
permsissions" corrupts memory on ia64

On Thu, 2019-07-04 at 12:53 +0300, Meelis Roos wrote:
> I noticed that while 5.1 works on my HP Integrity RX2620, 5.2-rc6
> crashed on boot nondeterministically.
> Bisecting it took many tries since it does not happen on each boot and
> when it happens, the symptoms are
> different each time. But now the bisection converged to

Thanks for the report.

This arch seems similar to sparc in that there are no set_memory_()
implementations, except that it's even simpler because
flush_tlb_kernel_range() just calls flush_tlb_all() so the range
shouldn't matter either. So this commit *should* have just been adding
a TLB flush, with most of it not affecting ia64.

From these logs, especially the fault stack traces and BUG()s, it
seems like vmalloc allocations might be what is getting corrupted.
After scrutinizing this so much for sparc, only to have the cause be
sparc's TLB flush in the end, I wonder if something similar could be
happening here. If the TLB wasn't getting flushed on all cores or in
the vmalloc range or something like that, the module loader may be
reading and writing through stale entries that point to recycled pages,
causing strange behavior like this.
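
To make the stale-entry scenario concrete, here is a toy user-space
model, not kernel code. Everything in it (toy_tlb, toy_map(),
toy_translate(), toy_flush(), toy_stale_after_remap(), and the frame
numbers) is made up for illustration and does not correspond to any
ia64 or mm/ interface. It only shows that a CPU which cached a
translation before the address was remapped keeps using the old frame
unless its cached entries are flushed.

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES 4	/* virtual pages in the toy "vmalloc" range */

/* Authoritative page table: virtual page -> physical frame. */
static int page_table[NPAGES];

/* One CPU's cached translations (hypothetical, one entry per page). */
struct toy_tlb {
	bool valid[NPAGES];
	int frame[NPAGES];
};

/* Install or replace a mapping in the page table only. */
static void toy_map(int vpage, int frame)
{
	assert(vpage >= 0 && vpage < NPAGES);
	page_table[vpage] = frame;
}

/* Translate through the TLB, filling it from the page table on a miss. */
static int toy_translate(struct toy_tlb *tlb, int vpage)
{
	assert(vpage >= 0 && vpage < NPAGES);
	if (!tlb->valid[vpage]) {
		tlb->frame[vpage] = page_table[vpage];
		tlb->valid[vpage] = true;
	}
	return tlb->frame[vpage];
}

/* Drop every cached translation on this CPU. */
static void toy_flush(struct toy_tlb *tlb)
{
	for (int i = 0; i < NPAGES; i++)
		tlb->valid[i] = false;
}

/*
 * Map vpage 0, let the CPU cache it, then remap vpage 0 to a new frame
 * (as if the old allocation was freed and the address reused).
 * Returns true if the CPU still resolves vpage 0 to the old frame.
 */
static bool toy_stale_after_remap(bool flush)
{
	struct toy_tlb cpu = {0};

	toy_map(0, 10);
	toy_translate(&cpu, 0);		/* CPU caches 0 -> 10 */
	toy_map(0, 20);			/* frame 10 is now recycled */
	if (flush)
		toy_flush(&cpu);
	return toy_translate(&cpu, 0) != page_table[0];
}
```

In this model toy_stale_after_remap(false) reports a stale translation
and toy_stale_after_remap(true) does not. The real question on ia64 is
whether the flush actually reaches every CPU and covers the vmalloc
range, which this sketch does not attempt to model.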

I am out of the office and don't have access to this hardware either. I
will try to find someone at Intel who does to speed this up. In the
meantime I can send you a logging patch to do some sanity checks if you
are able to run it.

I think I found your earlier mail, and it said 5.2-rc1 did not show the
problem. I guess this wasn't the case after further testing, but 5.1
continued to be problem free?

Thanks,
Rick