lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 20 Apr 2019 18:04:07 -0400
From:   Pavel Tatashin <pasha.tatashin@...een.com>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     James Morris <jmorris@...ei.org>, Sasha Levin <sashal@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        linux-nvdimm <linux-nvdimm@...ts.01.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Keith Busch <keith.busch@...el.com>,
        Vishal L Verma <vishal.l.verma@...el.com>,
        Dave Jiang <dave.jiang@...el.com>,
        Ross Zwisler <zwisler@...nel.org>,
        Tom Lendacky <thomas.lendacky@....com>,
        "Huang, Ying" <ying.huang@...el.com>,
        Fengguang Wu <fengguang.wu@...el.com>,
        Borislav Petkov <bp@...e.de>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Yaowei Bai <baiyaowei@...s.chinamobile.com>,
        Takashi Iwai <tiwai@...e.de>,
        Jérôme Glisse <jglisse@...hat.com>
Subject: Re: [v1 2/2] device-dax: "Hotremove" persistent memory that is used
 like normal RAM

On Sat, Apr 20, 2019 at 5:02 PM Dan Williams <dan.j.williams@...el.com> wrote:
>
> On Sat, Apr 20, 2019 at 10:02 AM Pavel Tatashin
> <pasha.tatashin@...een.com> wrote:
> >
> > > > Thank you for looking at this.  Are you saying, that if drv.remove()
> > > > returns a failure it is simply ignored, and unbind proceeds?
> > >
> > > Yeah, that's the problem. I've looked at making unbind able to fail,
> > > but that can lead to general bad behavior in device-drivers. I.e. why
> > > spend time unwinding allocated resources when the driver can simply
> > > fail unbind? About the best a driver can do is make unbind wait on
> > > some event, but any return results in device-unbind.
> >
> > Hm, just tested, and it is indeed so.
> >
> > I see the following options:
> >
> > 1. Move hot remove code to some other interface, that can fail. Not
> > sure what that would be, but outside of unbind/remove_id. Any
> > suggestion?
> > 2. Option two is don't attept to offline memory in unbind. Do
> > hot-remove memory in unbind if every section is already offlined.
> > Basically, do a walk through memblocks, and if every section is
> > offlined, also do the cleanup.
>
> I think something like option-2 could work just as long as the user is
> ok with failure and prepared to handle it. It's already the case that
> the request_region() in kmem permanently prevents the memory range
> from being reused by any other driver. So if the hot-unplug fails it
> could skip the corresponding release_region() and effectively it's the
> same as what we have now in terms of reuse protection. In your flow if
> the memory remove failed then the conversion attempt from devdax to
> raw mode would also fail and presumably you could fall back to doing a
> full reboot / rebuild of the application state?

With option two, where we will simply check that every memory_block is
offlined, we will have deterministic behavior:

1. If user did not offline every dax memory section beforehand via
echo offline > /sys/devices/system/memory/memoryN/state

echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
Will be the same as now, will simply return, and user won't be able to
use dax afterwords or hotremove it.

2. If user did offline ever dax memory section beforehand
echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
Will be guaranteed to succeed to hotremove the memory, as there is
nothing that can fail.

So, if user wants to hotremove dax memory, he/she must ensure that
every section is offlined before unbinding.

Pasha

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ