linux-kernel - Re: [ANNOUNCE] DAXFS: A zero-copy, dmabuf-friendly filesystem for shared memory


Message-ID: <CAGHCLaSA9SnM+rtURV=U=hJ4kxpqUim6t7SgvxMNnAed0XaHMg@mail.gmail.com>
Date: Mon, 26 Jan 2026 16:02:48 -0800
From: Cong Wang <cwang@...tikernel.io>
To: Matthew Wilcox <willy@...radead.org>
Cc: Gao Xiang <hsiangkao@...ux.alibaba.com>, linux-fsdevel@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Cong Wang <xiyou.wangcong@...il.com>, 
	multikernel@...ts.linux.dev
Subject: Re: [ANNOUNCE] DAXFS: A zero-copy, dmabuf-friendly filesystem for
 shared memory

On Mon, Jan 26, 2026 at 12:40 PM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Mon, Jan 26, 2026 at 11:48:20AM -0800, Cong Wang wrote:
> > Specifically for this scenario, struct inode is not compatible. This
> > could rule out a lot of existing filesystems, except read-only ones.
>
> I don't think you understand that there's a difference between *on disk*
> inode and *in core* inode.  Compare and contrast struct ext2_inode and
> struct inode.
>
> > Now back to EROFS, it is still based on a block device, which
> > itself can't be shared among different kernels. ramdax is actually
> > a perfect example here, its label_area can't be shared among
> > different kernels.
> >
> > Let's take one step back: even if we really could share a device
> > with multiple kernels, it still could not share the memory footprint,
> > with DAX + EROFS, we would still get:
> > 1) Each kernel creates its own DAX mappings
> > 2) And faults pages independently
> >
> > There is no cross-kernel page sharing accounting.
> >
> > I hope this makes sense.
>
> No, it doesn't.  I'm not suggesting that you use erofs unchanged, I'm
> suggesting that you modify erofs to support your needs.

I just tried:
https://github.com/multikernel/linux/commit/a6dc3351e78fc2028e4ca0ea02e781ca0bfefea3

Unfortunately, the multi-kernel derivation is still there and is probably
hard to eliminate without re-architecting EROFS. Here is why:

  DAXFS Inode (lines 202-216):

  struct daxfs_base_inode {
      __le32 ino;
      __le32 mode;
      ...
      __le64 size;
      __le64 data_offset;    /* ← INTRINSIC: stored directly in inode */
      ...
  };

 DAXFS Read Path:
  // Pseudocode - what DAXFS does
  void *data = base + inode->data_offset + file_offset;
  copy_to_iter(data, len, to);
  // DONE. No metadata parsing, no derivation.

 EROFS Read Path:
  // What EROFS does (even in memory mode)
  struct erofs_map_blocks map = { .m_la = pos };
  erofs_map_blocks(inode, &map);  // ← DERIVES physical address
      // Inside erofs_map_blocks():
      //   - Check inode layout type (compact? extended? chunk-indexed?)
      //   - For chunk-indexed: walk chunk table
      //   - For plain: compute from inode
      //   - Handle inline data, holes, compression...
  src = base + map.m_pa;
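
To make the contrast between the two read paths concrete, here is a minimal
userspace sketch in C. It is not DAXFS or EROFS code: every struct and
function name below (demo_direct_inode, demo_extent, demo_mapped_inode,
demo_direct_read, demo_mapped_read) is hypothetical, and the linear extent
table is only a stand-in for a layout-dependent mapping step, not the actual
erofs_map_blocks() algorithm. It assumes the whole image is one contiguous
region starting at base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical inode that stores its data location directly. */
struct demo_direct_inode {
    uint64_t size;          /* file size in bytes */
    uint64_t data_offset;   /* offset of file data from the image base */
};

/* Hypothetical extent: maps a logical byte range to an image offset. */
struct demo_extent {
    uint64_t logical;       /* first logical byte covered */
    uint64_t physical;      /* offset from the image base */
    uint64_t len;           /* bytes covered */
};

/* Hypothetical inode whose data location must be looked up. */
struct demo_mapped_inode {
    uint64_t size;
    size_t nr_extents;
    const struct demo_extent *extents;
};

/* Direct path: the source address is base + data_offset + pos. */
static size_t demo_direct_read(const uint8_t *base,
                               const struct demo_direct_inode *ino,
                               uint64_t pos, void *dst, size_t len)
{
    assert(base && ino && dst);
    if (pos >= ino->size)
        return 0;
    if (len > ino->size - pos)
        len = (size_t)(ino->size - pos);
    memcpy(dst, base + ino->data_offset + pos, len);
    return len;
}

/* Derived path: find the extent covering pos, then translate the offset. */
static size_t demo_mapped_read(const uint8_t *base,
                               const struct demo_mapped_inode *ino,
                               uint64_t pos, void *dst, size_t len)
{
    assert(base && ino && dst);
    if (pos >= ino->size)
        return 0;
    if (len > ino->size - pos)
        len = (size_t)(ino->size - pos);
    for (size_t i = 0; i < ino->nr_extents; i++) {
        const struct demo_extent *e = &ino->extents[i];
        if (pos >= e->logical && pos - e->logical < e->len) {
            uint64_t off = pos - e->logical;
            if (len > e->len - off)
                len = (size_t)(e->len - off);   /* stop at extent end */
            memcpy(dst, base + e->physical + off, len);
            return len;
        }
    }
    return 0;   /* no extent covers pos: treat as a hole */
}
```

Both functions return the number of bytes copied. The direct variant needs
only the inode and the base pointer, while the mapped variant has to consult
a per-inode table before it can compute the source address, and a single call
may return a short read at an extent boundary.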

Please let me know if I missed anything here.

Speculative branching support is also harder to implement in EROFS;
please see my updated README here:
https://github.com/multikernel/daxfs/blob/main/README.md
(Skip to the Branching section.)

Thanks.
Cong Wang