Message-ID: <1440089067.30197.3.camel@linux.intel.com>
Date:	Thu, 20 Aug 2015 10:44:27 -0600
From:	Ross Zwisler <ross.zwisler@...ux.intel.com>
To:	Dan Williams <dan.j.williams@...el.com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	"Luis R. Rodriguez" <mcgrof@...e.com>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Borislav Petkov <bp@...e.de>, Christoph Hellwig <hch@....de>,
	Christoph Jaeger <cj@...ux.com>,
	Dan Streetman <ddstreet@...e.org>,
	Ingo Molnar <mingo@...hat.com>,
	Juergen Gross <jgross@...e.com>, Len Brown <lenb@...nel.org>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Stephen Rothwell <sfr@...b.auug.org.au>,
	Thierry Reding <treding@...dia.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Toshi Kani <toshi.kani@...com>,
	Vishal Verma <vishal.l.verma@...el.com>,
	Will Deacon <will.deacon@....com>,
	Linux ACPI <linux-acpi@...r.kernel.org>,
	"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
	X86 ML <x86@...nel.org>
Subject: Re: [PATCH v2] nd_blk: add support for "read flush" DSM flag

On Wed, 2015-08-19 at 16:06 -0700, Dan Williams wrote:
> On Wed, Aug 19, 2015 at 3:48 PM, Ross Zwisler
> <ross.zwisler@...ux.intel.com> wrote:
> > Add support for the "read flush" _DSM flag, as outlined in the DSM spec:
> >
> > http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> >
> > This flag tells the ND BLK driver that it needs to flush the cache lines
> > associated with the aperture after the aperture is moved but before any
> > new data is read.  This ensures that any stale cache lines from the
> > previous contents of the aperture will be discarded from the processor
> > cache, and the new data will be read properly from the DIMM.  We know
> > that the cache lines are clean and will be discarded without any
> > writeback because either a) the previous aperture operation was a read,
> > and we never modified the contents of the aperture, or b) the previous
> > aperture operation was a write and we must have written back the dirtied
> > contents of the aperture to the DIMM before the I/O was completed.
> >
> > By supporting the "read flush" flag we can also change the ND BLK
> > aperture mapping from write-combining to write-back via memremap().
> >
> > In order to add support for the "read flush" flag I needed to add a
> > generic routine to invalidate cache lines, mmio_flush_range().  This is
> > protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
> > only supported on x86.
> >
> > Signed-off-by: Ross Zwisler <ross.zwisler@...ux.intel.com>
> > Cc: Dan Williams <dan.j.williams@...el.com>
> [..]
> > diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
> > index 7c2638f..56fff01 100644
> > --- a/drivers/acpi/nfit.c
> > +++ b/drivers/acpi/nfit.c
> [..]
> >  static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
> > @@ -1078,11 +1078,16 @@ static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
> >                 }
> >
> >                 if (rw)
> > -                       memcpy_to_pmem(mmio->aperture + offset,
> > +                       memcpy_to_pmem(mmio->addr.aperture + offset,
> >                                         iobuf + copied, c);
> > -               else
> > +               else {
> > +                       if (nfit_blk->dimm_flags & ND_BLK_READ_FLUSH)
> > +                               mmio_flush_range((void __force *)
> > +                                       mmio->addr.aperture + offset, c);
> > +
> >                         memcpy_from_pmem(iobuf + copied,
> > -                                       mmio->aperture + offset, c);
> > +                                       mmio->addr.aperture + offset, c);
> > +               }
> 
> Why is the flush inside the "while (len)" loop?  I think it should be
> done immediately after the call to write_blk_ctl() since that is the
> point at which the aperture becomes invalidated, and not prior to each
> read within a given aperture position.  Taking it a bit further, we
> may be writing the same address into the control register as was there
> previously so we wouldn't need to flush in that case.

The reason I was doing it in the "while (len)" loop is that you have to walk
through the interleave tables, reading each segment until you have read 'len'
bytes.  If we were to invalidate right after the write_blk_ctl(), we would
essentially have to re-create the "while (len)" loop, hop through all the
segments doing the invalidation, then run through the segments again doing the
actual I/O.

It seemed a lot cleaner to just run through the segments once, invalidating
and reading each segment individually.

The downside of the current approach is that we end up doing a lot of extra
mb() fencing: clflush_cache_range() fences twice per segment.

The other option would be to do the double pass, but on the first pass to just
do the flushing without fencing, then fence everything, then do the reads.
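
To make the trade-off concrete, here is a minimal userspace model of the two
strategies. It is only a sketch: the types and helpers (struct segment,
struct io_stats, fence(), flush_lines_nofence(), read_single_pass(),
read_double_pass()) are hypothetical stand-ins, not driver code, and it counts
barriers and cache lines instead of issuing real CLFLUSH or mb() instructions.
It also assumes a 64-byte cache line.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Counters only; a real driver would issue CLFLUSH and mb() instead. */
struct io_stats {
	unsigned int fences;        /* memory barriers issued */
	unsigned int lines_flushed; /* cache lines invalidated */
	unsigned int segments_read; /* segments copied out of the aperture */
	unsigned int walks;         /* passes over the segment list */
};

/* One contiguous piece of the aperture after interleave translation. */
struct segment {
	size_t offset; /* offset into the aperture */
	size_t len;    /* bytes in this segment, must be non-zero */
};

static void fence(struct io_stats *st)
{
	st->fences++;
}

/* Count the cache lines a segment touches, without any fencing. */
static void flush_lines_nofence(const struct segment *seg, struct io_stats *st)
{
	assert(seg->len > 0);
	size_t first = seg->offset / CACHE_LINE;
	size_t last = (seg->offset + seg->len - 1) / CACHE_LINE;

	st->lines_flushed += (unsigned int)(last - first + 1);
}

/* Single pass: fence, flush, fence, then read, once per segment. */
static void read_single_pass(const struct segment *segs, size_t n,
			     struct io_stats *st)
{
	st->walks++;
	for (size_t i = 0; i < n; i++) {
		fence(st);
		flush_lines_nofence(&segs[i], st);
		fence(st);
		st->segments_read++;
	}
}

/* Double pass: flush every segment, fence once, then read every segment. */
static void read_double_pass(const struct segment *segs, size_t n,
			     struct io_stats *st)
{
	st->walks++;
	for (size_t i = 0; i < n; i++)
		flush_lines_nofence(&segs[i], st);

	fence(st);

	st->walks++;
	for (size_t i = 0; i < n; i++)
		st->segments_read++;
}
```

For three 256-byte segments this model counts six fences and one walk for the
single pass versus one fence and two walks for the double pass, with the same
twelve cache lines flushed either way. It says nothing about which is faster
in practice; that still needs measurement.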

I don't have a good feel for how much overhead all this extra fencing adds
compared with the cost of traversing the segments twice.  The code is certainly
simpler the way it's implemented now.  If you feel that the extra fencing is
too expensive I'll implement it as a double pass.  Otherwise we may want to
wait for performance data to justify the change.

Regarding skipping the flush if the control register is unchanged - sure, that
seems like a good idea.
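
One possible shape for that check, as a userspace sketch rather than driver
code: remember the last value written to the control register and report
whether the aperture actually moved. The names here (struct aperture_state,
ctl_write_needs_flush()) are hypothetical, and the sketch assumes that
comparing the raw control value is enough to decide whether the aperture
position changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-aperture bookkeeping; not a structure from the patch. */
struct aperture_state {
	bool     valid;    /* false until the first control write */
	uint64_t last_ctl; /* last value written to the control register */
};

/*
 * Record a control register write and return true if the aperture may now
 * show different data, meaning stale cache lines must be flushed before the
 * next read. The first write always requires a flush.
 */
static bool ctl_write_needs_flush(struct aperture_state *st, uint64_t ctl)
{
	bool moved = !st->valid || st->last_ctl != ctl;

	st->last_ctl = ctl;
	st->valid = true;
	return moved;
}
```

The read path would then flush only when this returns true for the control
write that preceded the read.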

