Date:   Mon, 25 Sep 2017 14:58:25 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Seth Forshee <seth.forshee@...onical.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: Memory hotplug regression in 4.13

On Thu 21-09-17 00:40:34, Seth Forshee wrote:
> On Wed, Sep 20, 2017 at 11:29:31AM +0200, Michal Hocko wrote:
> > Hi,
> > I am currently at a conference, so I will most probably get to this next
> > week, but I will try to look at it ASAP.
> > 
> > On Tue 19-09-17 11:41:14, Seth Forshee wrote:
> > > Hi Michal,
> > > 
> > > I'm seeing oopses in various locations when hotplugging memory in an x86
> > > vm while running a 32-bit kernel. The config I'm using is attached. To
> > > reproduce I'm using kvm with the memory options "-m
> > > size=512M,slots=3,maxmem=2G". Then in the qemu monitor I run:
> > > 
> > >   object_add memory-backend-ram,id=mem1,size=512M
> > >   device_add pc-dimm,id=dimm1,memdev=mem1
> > > 
> > > Not long after that I'll see an oops, not always in the same location
> > > but most often in wp_page_copy, like this one:
> > 
> > This is rather surprising. How do you online the memory?
> 
> The kernel has CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y.

OK, so the memory gets online automagically at the time when it is
hotadded. Could you send the full dmesg?
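
For reference, a minimal sketch of how the per-block online state could be
checked alongside dmesg. It assumes the standard sysfs layout
(/sys/devices/system/memory/memoryN/state); the helper name
memory_block_states and its root parameter are hypothetical, introduced here
only for illustration:

```python
from pathlib import Path


def memory_block_states(root="/sys/devices/system/memory"):
    """Return {block_name: state} for every memoryN directory under root.

    The state file normally holds "online" or "offline". The root argument
    is a hypothetical knob so the helper can be pointed at a copy of sysfs.
    """
    states = {}
    for block in sorted(Path(root).glob("memory[0-9]*")):
        state_file = block / "state"
        if state_file.is_file():
            states[block.name] = state_file.read_text().strip()
    return states
```

Calling memory_block_states() before and after the device_add should show
whether the hot-added blocks really went online on their own.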

> > > [   24.673623] BUG: unable to handle kernel paging request at dffff000
> > > [   24.675569] IP: wp_page_copy+0xa8/0x660
> > 
> > Could you resolve the IP into the source line?
> 
> It seems I don't have that kernel anymore, but I've got a 4.14-rc1 build
> and the problem still occurs there. It's pointing to the call to
> __builtin_memcpy in memcpy (include/linux/string.h line 340), which we
> get to via wp_page_copy -> cow_user_page -> copy_user_highpage.

Hmm, this is interesting. That would mean that we have successfully
mapped the destination page but its memory is still not accessible.
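
As an aside on resolving the IP: the oops reports it as
symbol+offset/size, which is the form the in-tree scripts/faddr2line helper
takes together with a vmlinux built with debug info. A small sketch of
pulling those three fields out of an oops line follows; the function name
parse_oops_ip and the regular expression are assumptions made for this
example, not part of any kernel tooling:

```python
import re

# Matches e.g. "IP: wp_page_copy+0xa8/0x660" (hypothetical pattern for this sketch).
_IP_RE = re.compile(
    r"IP:\s*(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_oops_ip(line):
    """Return (symbol, offset, size) from an oops 'IP:' line, or None if absent."""
    m = _IP_RE.search(line)
    if m is None:
        return None
    return m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16)
```

For the line quoted above this yields the symbol wp_page_copy with offset
0xa8 into a function of size 0x660.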

Right now I do not see how the patch you have bisected to could make any
difference. It only postponed the onlining to make it an independent step,
but your config simply onlines automatically, so there shouldn't be any
semantic change. Maybe there is an off-by-one somewhere.

I will try to investigate some more. Do you think it would be possible
to configure kdump on your system and provide me with the vmcore in some
way?
-- 
Michal Hocko
SUSE Labs
