Message-ID: <alpine.LRH.2.02.2009140852030.22422@file01.intranet.prod.int.rdu2.redhat.com>
Date: Tue, 15 Sep 2020 08:34:41 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Dan Williams <dan.j.williams@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Dave Jiang <dave.jiang@...el.com>,
Ira Weiny <ira.weiny@...el.com>,
Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
Eric Sandeen <esandeen@...hat.com>,
Dave Chinner <dchinner@...hat.com>,
"Kani, Toshi" <toshi.kani@....com>,
"Norton, Scott J" <scott.norton@....com>,
"Tadakamadla, Rajesh (DCIG/CDI/HPS Perf)"
<rajesh.tadakamadla@....com>
cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-nvdimm@...ts.01.org
Subject: [RFC] nvfs: a filesystem for persistent memory
Hi

I am developing a new filesystem suitable for persistent memory - nvfs.
The goal is to have a small and fast filesystem that can be used on
DAX-based devices. Nvfs maps the whole device into a linear address space
and completely bypasses the overhead of the block layer and the buffer
cache.
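
To illustrate what "mapping the whole device into a linear address space" means, here is a minimal user-space sketch. It is only an analogy and not how nvfs does it inside the kernel: it maps a regular file with mmap(2) so that the contents are reached by plain loads and stores instead of read/write calls. The helper name map_whole is hypothetical, and the size is taken from fstat, which works for regular files but not for a block device such as /dev/pmem0.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map an entire regular file read/write.
 * Returns the base address and stores the length in *len_out,
 * or returns NULL on any failure.
 */
static void *map_whole(const char *path, size_t *len_out)
{
    struct stat st;
    void *base;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    base = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return base;
}
```

After the call, byte N of the file is simply ((char *)base)[N]; the caller releases the mapping with munmap(base, len).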

In the past, there was the NOVA filesystem for pmem, but it was abandoned a
year ago (the last version is for kernel 5.1 -
https://github.com/NVSL/linux-nova ). Nvfs is smaller and performs better.

The design of nvfs is similar to ext2/ext4, so that it fits into the VFS
layer naturally, without too much glue code.

I'd like to ask you to review it.

tarballs:
	http://people.redhat.com/~mpatocka/nvfs/
git:
	git://leontynka.twibright.com/nvfs.git
the description of filesystem internals:
	http://people.redhat.com/~mpatocka/nvfs/INTERNALS
benchmarks:
	http://people.redhat.com/~mpatocka/nvfs/BENCHMARKS

TODO:
- programs run approximately 4% slower when running from Optane-based
  persistent memory. Therefore, programs and libraries should use the page
  cache and not a DAX mapping.
- when the fsck.nvfs tool mmaps the device /dev/pmem0, the kernel uses
  the buffer cache for the mapping. The buffer cache slows down fsck by a
  factor of 5 to 10. Would it be possible to change the kernel so that it
  maps DAX-based block devices directly?
- __copy_from_user_inatomic_nocache doesn't flush the cache for leading and
  trailing bytes.
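
The last item can be made concrete with a small sketch. It assumes that the non-temporal path works in whole 8-byte units aligned to the destination; the real kernel routine may differ in detail. Under that assumption, a copy splits into a leading part, an aligned body and a trailing part, and only the body bypasses the cache. The names nt_split and nt_split_range are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* How a copy of len bytes to address dst splits around 8-byte boundaries. */
struct nt_split {
    size_t head; /* leading bytes before the first 8-byte boundary */
    size_t body; /* bytes covered by whole, aligned 8-byte words */
    size_t tail; /* trailing bytes after the last whole 8-byte word */
};

/* Assumption: non-temporal stores are issued only for the aligned body. */
static struct nt_split nt_split_range(uintptr_t dst, size_t len)
{
    struct nt_split s;
    size_t misalign = dst & 7;

    s.head = misalign ? 8 - misalign : 0;
    if (s.head > len)
        s.head = len; /* short copy that ends before the boundary */
    s.body = (len - s.head) & ~(size_t)7;
    s.tail = len - s.head - s.body;
    return s;
}
```

For example, a 20-byte copy to an address ending in 3 splits into head 5, body 8 and tail 7. The 12 head and tail bytes would go through ordinary cached stores, so they would need an explicit cache flush before they can be considered persistent.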
Mikulas