Message-ID: <YRupQKbg6uN8INCn@saturne.home>
Date: Tue, 17 Aug 2021 14:19:12 +0200
From: Laurent Stacul <captain.stac@...il.com>
To: linux-kernel@...r.kernel.org
Subject: XFS/mmap reflink file question
Hello,
I have spent a lot of time digging into the mmap mechanism, and I still don't
have a clear view on whether mmap'ing a file and a reflink to this file maps the
data twice in memory (this only applies when the filesystem supports the reflink
feature, like XFS).
To describe my tests, I generate a file stored on an XFS partition and create a
reflink of it:
% dd if=/dev/zero of=./output.dat bs=1M count=24
% cp --reflink -v output.dat output2.dat
% xfs_bmap -v output.dat
output.dat:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..49151]: 3756776..3805927 0 (3756776..3805927) 49152 100000
% xfs_bmap -v output2.dat
output2.dat:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..49151]: 3756776..3805927 0 (3756776..3805927) 49152 100000
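The comparison of the two extent maps above can also be scripted. This is only a minimal sketch: `extent_ranges` is a hypothetical helper name, and it assumes the `xfs_bmap -v` column layout shown above, where the third field of each extent line is the BLOCK-RANGE.

```shell
# Print the BLOCK-RANGE column of every extent line read on stdin.
# Assumes `xfs_bmap -v` output, where extent lines start with "N:".
extent_ranges() {
    awk '$1 ~ /^[0-9]+:$/ { print $3 }'
}
```

With this helper, `[ "$(xfs_bmap -v output.dat | extent_ranges)" = "$(xfs_bmap -v output2.dat | extent_ranges)" ]` succeeds when both files report the same physical block ranges.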
Then I mmap the first file twice using the vmtouch tool:
% vmtouch -l output.dat&
[1] 15870
LOCKED 6144 pages (24M)
% vmtouch -l output.dat&
[2] 15872
LOCKED 6144 pages (24M)
% pmap -X 15872 | grep -e 'Pss' -e 'output' | awk '{if(NR>1)printf("%16s %4s %6s %10s %10s %10s\n", $1, $2, $4, $5, $7, $8)}'
Address Perm Device Inode Rss Pss
7fcbb9eb9000 r--s fc:10 3755268 24576 12288
As we can see, the Proportional Set Size (Pss) is, as expected, half of the
Resident Set Size (Rss) because the memory is shared by the two processes. Now,
I mmap the reflink 'output2.dat' of 'output.dat':
% vmtouch -l output2.dat&
[3] 15892
LOCKED 6144 pages (24M)
% pmap -X 15872 | grep -e 'Pss' -e 'output' | awk '{if(NR>1)printf("%16s %4s %6s %10s %10s %10s\n", $1, $2, $4, $5, $7, $8)}'
Address Perm Device Inode Rss Pss
7fcbb9eb9000 r--s fc:10 3755268 24576 12288
The Pss of the file mmap'ed in process 15872 has not decreased (I expected a
value of Rss / 3 because I hoped the memory would be shared by the 3
processes). If I look at the process map of the last process, it appears that
a new memory area was allocated and locked:
% pmap -X 15892 | grep -e 'Pss' -e 'output' | awk '{if(NR>1)printf("%16s %4s %6s %10s %10s %10s\n", $1, $2, $4, $5, $7, $8)}'
Address Perm Device Inode Rss Pss
7f5adc53f000 r--s fc:10 3755269 24576 24576
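The same figures can be read from /proc/<pid>/smaps by field name instead of by pmap column position. This is a minimal sketch: `mapping_pss` and `expected_pss` are hypothetical helper names, and the parsing assumes the usual smaps layout (a mapping header line followed by `Rss:` and then `Pss:` lines, with a path that contains no spaces).

```shell
# Print "inode rss_kb pss_kb" for every mapping whose path ends in $1.
# Reads smaps-formatted text on stdin, e.g. from /proc/<pid>/smaps.
mapping_pss() {
    awk -v name="$1" '
        # A mapping header starts with an address range such as 7f00-7f01.
        $1 ~ /^[0-9a-f]+-[0-9a-f]+$/ {
            want = (NF >= 6 && substr($6, length($6) - length(name) + 1) == name)
            inode = $5
            next
        }
        want && $1 == "Rss:" { rss = $2 }
        want && $1 == "Pss:" { print inode, rss, $2 }
    '
}

# Pss (in kB) expected when $2 processes fully share a mapping of $1 kB.
expected_pss() {
    echo $(( $1 / $2 ))
}
```

For example, `mapping_pss output.dat < /proc/15872/smaps` should report the inode, Rss and Pss shown above, and `expected_pss 24576 3` gives the 8192 kB I was hoping to see once the third process had mapped the data.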
So my questions:
- Why can't we benefit from memory sharing when reflinked files are mmap'ed?
It would be great because one application, in the context of containers,
would be the possibility to share some read-only areas between containers
that are built from layer diffs that are reproducible between images. We
can imagine a layer that brings some shared libraries into an image from a
reproducible FS diff, so that containers would not load the same library
several times.
- I can think of many tricky cases with the behavior I was expecting (especially
if a process has write access to the mapped area), but if you know a way or an
option to achieve what I am trying to do, I would be glad to hear it.
- Conversely, don't hesitate to tell me my expectation is just crazy.
Anyway, I always look forward to hearing valuable insights from specialists.
Thanks in advance,
stac
PS: Please add me in CC if this message deserves an answer.