linux-kernel - Re: [PATCH v3 01/11] pagemap: Introduce ->memory

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAPcyv4iyS0EB0zLNxLwML1C0E2Eqk3TweHvmgpNWpZKVPVpz5Q@mail.gmail.com>
Date:   Wed, 24 Mar 2021 11:00:16 -0700
From:   Dan Williams <dan.j.williams@...el.com>
To:     Christoph Hellwig <hch@....de>
Cc:     "ruansy.fnst@...itsu.com" <ruansy.fnst@...itsu.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-xfs <linux-xfs@...r.kernel.org>,
        linux-nvdimm <linux-nvdimm@...ts.01.org>,
        Linux MM <linux-mm@...ck.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        device-mapper development <dm-devel@...hat.com>,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        david <david@...morbit.com>, Alasdair Kergon <agk@...hat.com>,
        Mike Snitzer <snitzer@...hat.com>,
        Goldwyn Rodrigues <rgoldwyn@...e.de>,
        "qi.fuli@...itsu.com" <qi.fuli@...itsu.com>,
        "y-goto@...itsu.com" <y-goto@...itsu.com>
Subject: Re: [PATCH v3 01/11] pagemap: Introduce ->memory_failure()

On Wed, Mar 24, 2021 at 10:39 AM Christoph Hellwig <hch@....de> wrote:
>
> On Wed, Mar 24, 2021 at 09:37:01AM -0700, Dan Williams wrote:
> > > Eww.  As I said I think the right way is that the file system (or
> > > other consumer) can register a set of callbacks for opening the device.
> >
> > How does that solve the problem of the driver being notified of all
> > pfn failure events?
>
> Ok, I probably just showed I need to spend more time looking at
> your proposal vs the actual code..
>
> Don't we have a proper way how one of the nvdimm layers own a
> spefific memory range and call directly into that instead of through
> a notifier?

So that could be a new dev_pagemap operation as Ruan has here. I was
thinking that other agents would be interested in non-dev_pagemap
managed ranges, but we could leave that for later and just make the
current pgmap->memory_failure() callback proposal range based.

>
> > Today pmem only finds out about the ones that are
> > notified via native x86 machine check error handling via a notifier
> > (yes "firmware-first" error handling fails to do the right thing for
> > the pmem driver),
>
> Did any kind of firmware-first error handling ever get anything
> right?  I wish people would have learned that by now.

Part of me wants to say if you use firmware-first you get to keep the
pieces, but it's not always the end user choice as far as I
understand.

> > or the ones that are eventually reported via address
> > range scrub, but only for the nvdimms that implement range scrubbing.
> > memory_failure() seems a reasonable catch all point to route pfn
> > failure events, in an arch independent way, to interested drivers.
>
> Yeah.
>
> > I'm fine swapping out dax_device blocking_notiier chains for your
> > proposal, but that does not address all the proposed reworks in my
> > list which are:
> >
> > - delete "drivers/acpi/nfit/mce.c"
> >
> > - teach memory_failure() to be able to communicate range failure
> >
> > - enable memory_failure() to defer to a filesystem that can say
> > "critical metadata is impacted, no point in trying to do file-by-file
> > isolation, bring the whole fs down".
>
> This all sounds sensible.

Ok, Ruan, I think this means rework your dev_pagemap_ops callback to
be range based. Add a holder concept for dax_devices and then layer
that on Christoph's eventual dax_device callback mechanism that a
dax_device holder can register.