Message-ID: <CAPcyv4iaeB1q0fttoXcBzYfwkx-ezzOKjK2QrpGieqLr7rrdiQ@mail.gmail.com>
Date:	Thu, 6 Aug 2015 14:26:01 -0700
From:	Dan Williams <dan.j.williams@...el.com>
To:	Ross Zwisler <ross.zwisler@...ux.intel.com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
	Matthew Wilcox <willy@...ux.intel.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH 6/6] dax: update I/O path to do proper PMEM flushing

On Thu, Aug 6, 2015 at 10:43 AM, Ross Zwisler
<ross.zwisler@...ux.intel.com> wrote:
> Update the DAX I/O path so that all operations that store data (I/O
> writes, zeroing blocks, punching holes, etc.) properly synchronize the
> stores to media using the PMEM API.  This ensures that the data DAX is
> writing is durable on media before the operation completes.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@...ux.intel.com>
> ---
>  fs/dax.c | 55 ++++++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 44 insertions(+), 11 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 47c3323..e7595db 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -17,12 +17,14 @@
>  #include <linux/atomic.h>
>  #include <linux/blkdev.h>
>  #include <linux/buffer_head.h>
> +#include <linux/dax.h>
>  #include <linux/fs.h>
>  #include <linux/genhd.h>
>  #include <linux/highmem.h>
>  #include <linux/memcontrol.h>
>  #include <linux/mm.h>
>  #include <linux/mutex.h>
> +#include <linux/pmem.h>
>  #include <linux/sched.h>
>  #include <linux/uio.h>
>  #include <linux/vmstat.h>
> @@ -46,10 +48,13 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
>                         unsigned pgsz = PAGE_SIZE - offset_in_page(addr);
>                         if (pgsz > count)
>                                 pgsz = count;
> -                       if (pgsz < PAGE_SIZE)
> +                       if (pgsz < PAGE_SIZE) {
>                                 memset(addr, 0, pgsz);
> -                       else
> +                               wb_cache_pmem((void __pmem *)addr, pgsz);
> +                       } else {
>                                 clear_page(addr);
> +                               wb_cache_pmem((void __pmem *)addr, PAGE_SIZE);
> +                       }
>                         addr += pgsz;
>                         size -= pgsz;
>                         count -= pgsz;
> @@ -59,6 +64,7 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
>                 }
>         } while (size);
>
> +       wmb_pmem();
>         return 0;
>  }
>  EXPORT_SYMBOL_GPL(dax_clear_blocks);
> @@ -70,15 +76,24 @@ static long dax_get_addr(struct buffer_head *bh, void **addr, unsigned blkbits)
>         return bdev_direct_access(bh->b_bdev, sector, addr, &pfn, bh->b_size);
>  }
>
> +/*
> + * This function's stores and flushes need to be synced to media by a
> + * wmb_pmem() in the caller. We flush the data instead of writing it back
> + * because we don't expect to read this newly zeroed data in the near future.
> + */
>  static void dax_new_buf(void *addr, unsigned size, unsigned first, loff_t pos,
>                         loff_t end)
>  {
>         loff_t final = end - pos + first; /* The final byte of the buffer */
>
> -       if (first > 0)
> +       if (first > 0) {
>                 memset(addr, 0, first);
> -       if (final < size)
> +               flush_cache_pmem((void __pmem *)addr, first);

Why are we invalidating vs just writing back?  Isn't there a
possibility that the cpu will read these zeroes, in which case why
force it to go to memory?  Let the cpu figure out when these writes
get evicted from the cache hierarchy, or else include some performance
numbers showing that forcing the eviction early is a win.
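
The question hinges on the difference between a cache write-back (CLWB,
which leaves the line valid in the cache) and a flush that also
invalidates the line (CLFLUSH/CLFLUSHOPT), forcing the next read to go
back out to memory.  The sketch below is a hypothetical user-space
illustration of the two strategies using the x86 intrinsics, not the
kernel's linux/pmem.h API; the helper names and the 64-byte cache-line
size are assumptions of mine, and it presumes a CPU with
CLWB/CLFLUSHOPT support and compilation with -mclwb -mclflushopt.

#include <immintrin.h>
#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE	64

/*
 * Write dirty lines back to memory but keep them cached: a later read
 * of the just-zeroed buffer can still hit in the cache.
 */
static void writeback_range(const void *addr, size_t len)
{
	uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);

	for (; p < (uintptr_t)addr + len; p += CACHE_LINE)
		_mm_clwb((void *)p);
	_mm_sfence();	/* order the write-backs, analogous to wmb_pmem() */
}

/*
 * Write back *and* invalidate: equally durable, but the next read of
 * these lines must refetch them from memory.
 */
static void flush_range(const void *addr, size_t len)
{
	uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);

	for (; p < (uintptr_t)addr + len; p += CACHE_LINE)
		_mm_clflushopt((void *)p);
	_mm_sfence();
}

Whether skipping the invalidate is a win in practice depends on how
soon the zeroed data is read back, which is what the requested
performance numbers would show.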