Date:   Thu, 27 Feb 2020 14:03:46 -0400
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     Logan Gunthorpe <logang@...tatee.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        linux-ia64@...r.kernel.org,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
        linux-s390 <linux-s390@...r.kernel.org>,
        Linux-sh <linux-sh@...r.kernel.org>,
        platform-driver-x86@...r.kernel.org, Linux MM <linux-mm@...ck.org>,
        Michal Hocko <mhocko@...nel.org>,
        David Hildenbrand <david@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Christoph Hellwig <hch@....de>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Eric Badger <ebadger@...aio.com>
Subject: Re: [PATCH v3 0/7] Allow setting caching mode in arch_add_memory()
 for P2PDMA

On Thu, Feb 27, 2020 at 09:55:04AM -0800, Dan Williams wrote:
> On Thu, Feb 27, 2020 at 9:43 AM Jason Gunthorpe <jgg@...pe.ca> wrote:
> >
> > On Thu, Feb 27, 2020 at 10:21:50AM -0700, Logan Gunthorpe wrote:
> > >
> > >
> > > On 2020-02-27 10:17 a.m., Jason Gunthorpe wrote:
> > > >> Instead of this, this series proposes a change to arch_add_memory()
> > > >> to take the pgprot required by the mapping which allows us to
> > > >> explicitly set pagetable entries for P2PDMA memory to WC.
> > > >
> > > > Is there a particular reason why WC was selected here? I thought for
> > > > the p2pdma cases there was no kernel user that touched the memory?
> > >
> > > Yes, that's correct. I chose WC here because the existing users are
> > > registering memory blocks without side effects which fit the WC
> > > semantics well.
> >
> > Hm, AFAIK WC memory is not compatible with the spinlocks/mutexes/etc in
> > Linux, so while it is true the memory has no side effects, there would
> > be surprising concurrency risks if anything in the kernel tried to
> > write to it.
> >
> > Not compatible means the locks don't contain stores to WC memory the
> > way you would expect. AFAIK on many CPUs extra barriers are required
> > to keep WC stores ordered, the same way ARM already has extra barriers
> > to keep UC stores ordered with locking.
> >
> > The spinlocks are defined to contain UC stores though.
> 
> How are spinlocks and mutexes getting into p2pdma ranges in the first
> instance? Even with UC, the system has bigger problems if it's trying
> to send bus locks targeting PCI, see the flurry of activity of trying
> to trigger faults on split locks [1].

This is not what I was trying to explain.

Consider

 static DEFINE_SPINLOCK(lock); // CPU DRAM
 static u64 idx = 0;
 u64 *wc_memory = [..];

 spin_lock(&lock);
 wc_memory[0] = idx++;
 spin_unlock(&lock);

You'd expect that the PCI device will observe stores where idx is
strictly increasing, but this is not guaranteed. idx may decrease, idx
may skip. It just won't duplicate.
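
To make the "idx may decrease" case concrete, here is a toy userspace model, not kernel code. It assumes each CPU has a write-combining buffer that holds a store until some later, arbitrary point, and that releasing the lock does not drain that buffer. All toy_* names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: values in the order the "device" observes them. */
struct toy_log {
	uint64_t v[2];
	int n;
};

/* A CPU's WC buffer finally drains, so the device sees the store now. */
static void toy_drain(struct toy_log *log, uint64_t pending)
{
	log->v[log->n++] = pending;
}

/*
 * Two CPUs each run: lock; wc_memory[0] = idx++; unlock.
 * The lock serializes idx++, but (by assumption) not the point at
 * which each CPU's buffered WC store reaches the device.
 */
static struct toy_log toy_two_cpus(bool b_drains_first)
{
	struct toy_log log = { .n = 0 };
	uint64_t idx = 0;

	uint64_t pending_a = idx++;	/* CPU A's critical section */
	uint64_t pending_b = idx++;	/* CPU B's critical section */

	if (b_drains_first) {
		toy_drain(&log, pending_b);
		toy_drain(&log, pending_a);
	} else {
		toy_drain(&log, pending_a);
		toy_drain(&log, pending_b);
	}
	return log;
}
```

With b_drains_first set, the modelled device sees 1 and then 0, even though both increments happened under the lock in order. The "skip" case (two stores merged in one buffer) is not modelled here.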

Or perhaps

 wc_memory[0] = foo;
 writel(doorbell);

foo is not guaranteed observable by the device before doorbell reaches
the device.
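
The doorbell case can be sketched the same way, again as a toy userspace model and not kernel code. It assumes the worst case described above: the WC store sits in a CPU-side buffer, the doorbell write goes straight to the device, and nothing drains the buffer in between unless the code asks for it explicitly. All toy_* names are hypothetical, and toy_wc_flush() merely stands in for whatever barrier a given architecture would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: the device's view of wc_memory[0]. */
struct toy_dev {
	uint64_t mem;			/* value currently visible to the device */
	uint64_t seen_at_doorbell;	/* mem as latched when the doorbell rang */
};

/* Toy model only: a single pending write-combining slot. */
struct toy_cpu {
	bool wc_pending;
	uint64_t wc_value;
};

/* wc_memory[0] = v; the store is buffered, not yet visible. */
static void toy_wc_store(struct toy_cpu *cpu, uint64_t v)
{
	cpu->wc_pending = true;
	cpu->wc_value = v;
}

/* Drain the buffer: the buffered store becomes visible to the device. */
static void toy_wc_flush(struct toy_cpu *cpu, struct toy_dev *dev)
{
	if (cpu->wc_pending) {
		dev->mem = cpu->wc_value;
		cpu->wc_pending = false;
	}
}

/* Doorbell write reaches the device, which reads mem right away. */
static void toy_ring(struct toy_dev *dev)
{
	dev->seen_at_doorbell = dev->mem;
}

/* Returns what the device read when the doorbell arrived. */
static uint64_t toy_run(bool flush_first, uint64_t foo)
{
	struct toy_cpu cpu = { 0 };
	struct toy_dev dev = { 0 };

	toy_wc_store(&cpu, foo);
	if (flush_first)
		toy_wc_flush(&cpu, &dev);
	toy_ring(&dev);
	toy_wc_flush(&cpu, &dev);	/* the buffer drains eventually */
	return dev.seen_at_doorbell;
}
```

In this model toy_run(false, 42) returns 0, meaning the device acted on the doorbell before foo was visible, while toy_run(true, 42) returns 42.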

All of these are things that do not happen with UC or NC memory, and
are surprising violations of our programming model.

Generic kernel code should never touch WC memory unless the code is
specifically designed to handle it.

Jason
