Message-ID: <20170330162031.GE4326@dhcp22.suse.cz>
Date:   Thu, 30 Mar 2017 18:20:34 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Jiri Kosina <jikos@...nel.org>
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Toshi Kani <toshi.kani@...com>, joeyli <jlee@...e.com>,
        linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>,
        linux-api@...r.kernel.org
Subject: Re: memory hotplug and force_remove

On Thu 30-03-17 10:47:52, Jiri Kosina wrote:
> On Tue, 28 Mar 2017, Rafael J. Wysocki wrote:
> 
> > > > > we have been chasing the following BUG() triggering during the memory
> > > > > hotremove (remove_memory):
> > > > > 	ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL,
> > > > > 				check_memblock_offlined_cb);
> > > > > 	if (ret)
> > > > > 		BUG();
> > > > > 
> > > > > and it took a while to learn that the issue is caused by
> > > > > /sys/firmware/acpi/hotplug/force_remove being enabled. I was really
> > > > > surprised to see such an option because at least for the memory hotplug
> > > > > it cannot work at all. Memory hotplug fails when the memory is still
> > > > > in use. Even if we do not BUG() here enforcing the hotplug operation
> > > > > will lead to problematic behavior later like crash or a silent memory
> > > > > corruption if the memory gets onlined back and reused by somebody else.
> > > > > 
> > > > > I am wondering what was the motivation for introducing this behavior and
> > > > > whether there is a way to disallow it for memory hotplug. Or maybe drop
> > > > > it completely. What would break in such a case?
> > > > 
> > > > Honestly, I don't remember from the top of my head and I haven't looked at
> > > > that code for several months.
> > > > 
> > > > I need some time to recall that.
> > > 
> > > Did you have any chance to look into this?
> > 
> > Well, yes.
> > 
> > It looks like that was added for some people who depended on the old behavior
> > at that time.
> > 
> > I guess we can try to drop it and see what happens. :-)
> 
> I'd agree with that; at the same time, udev rule should be submitted to 
> systemd folks though. I don't think there is anything existing in this 
> area yet (neither do distros ship their own udev rules for this AFAIK).

Another option would be to keep the force_remove knob but make the code
error-handling aware. In other words, rather than ignoring an offline
error, simply propagate it up the chain and do not consider the memory
offlined. Would that be acceptable?
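
To make the idea concrete, here is a rough userspace sketch of what
"propagate the error instead of ignoring it" could look like. This is
not the actual ACPI or memory hotplug code: struct mem_block,
offline_block() and try_remove_blocks() are made-up names used only
for illustration, and the sketch does not roll back blocks that were
already offlined before the failure.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory block; not a kernel structure. */
struct mem_block {
	bool in_use;	/* still has users, so offlining must fail */
	bool online;
	bool removed;
};

/* Try to offline one block; fail with -EBUSY while it is still in use. */
static int offline_block(struct mem_block *blk)
{
	if (blk->in_use)
		return -EBUSY;
	blk->online = false;
	return 0;
}

/*
 * Remove a range of blocks. An offline failure is returned to the
 * caller and nothing is marked as removed, rather than pressing on
 * (or hitting BUG()) with memory that is still in use.
 */
static int try_remove_blocks(struct mem_block *blks, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = offline_block(&blks[i]);
		if (ret)
			return ret;	/* propagate, do not force */
	}

	/* Only reached when every block was offlined successfully. */
	for (i = 0; i < n; i++)
		blks[i].removed = true;

	return 0;
}
```

With this shape, a caller that sees a non-zero return knows the range
is still present and can report the failure instead of tearing the
device down.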
-- 
Michal Hocko
SUSE Labs
