lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240819163938.qtsloyko67cqrmb6@quentin>
Date: Mon, 19 Aug 2024 16:39:38 +0000
From: "Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>
To: David Howells <dhowells@...hat.com>
Cc: brauner@...nel.org, akpm@...ux-foundation.org, chandan.babu@...cle.com,
	linux-fsdevel@...r.kernel.org, djwong@...nel.org, hare@...e.de,
	gost.dev@...sung.com, linux-xfs@...r.kernel.org, hch@....de,
	david@...morbit.com, Zi Yan <ziy@...dia.com>,
	yang@...amperecomputing.com, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, willy@...radead.org, john.g.garry@...cle.com,
	cl@...amperecomputing.com, p.raghav@...sung.com, mcgrof@...nel.org,
	ryan.roberts@....com
Subject: Re: [PATCH v12 00/10] enable bs > ps in XFS

> ---
> /* Distillation of the generic/393 xfstest */
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <fcntl.h>
> 
> #define ERR(x, y) do { if ((long)(x) == -1) { perror(y); exit(1); } } while(0)
> 
> static const char xxx[40] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
> static const char yyy[40] = "yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy";
> static const char dropfile[] = "/proc/sys/vm/drop_caches";
> static const char droptype[] = "3";
> static const char file[] = "/xfstest.test/wubble";
> 
> int main(int argc, char *argv[])
> {
>         int fd, drop;
> 
> 	/* Fill in the second 8K block of the file... */
>         fd = open(file, O_CREAT|O_TRUNC|O_WRONLY, 0666);
>         ERR(fd, "open");
>         ERR(ftruncate(fd, 0), "pre-trunc $file");
>         ERR(pwrite(fd, yyy, sizeof(yyy), 0x2000), "write-2000");
>         ERR(close(fd), "close");
> 
> 	/* ... and drop the pagecache so that we get a streaming
> 	 * write, attaching some private data to the folio.
> 	 */
>         drop = open(dropfile, O_WRONLY);
>         ERR(drop, dropfile);
>         ERR(write(drop, droptype, sizeof(droptype) - 1), "write-drop");
>         ERR(close(drop), "close-drop");
> 
>         fd = open(file, O_WRONLY, 0666);
>         ERR(fd, "reopen");
> 	/* Make a streaming write on the first 8K block (needs O_WRONLY). */
>         ERR(pwrite(fd, xxx, sizeof(xxx), 0), "write-0");
> 	/* Now use truncate to shrink and reexpand. */
>         ERR(ftruncate(fd, 4), "trunc-4");
>         ERR(ftruncate(fd, 4096), "trunc-4096");
>         ERR(close(fd), "close-2");
>         exit(0);
> }

I tried this code on XFS, and it is working as expected (I am getting
xxxx).

[nix-shell:~/xfstests]# hexdump -C /media/test/wubble
00000000  78 78 78 78 00 00 00 00  00 00 00 00 00 00 00 00  |xxxx............|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000

I did some tracing as well and here are the results.

$ trace-cmd record -e xfs_file_fsync -e xfs_file_buffered_write -e xfs_setattr -e xfs_zero_eof -F -c ./a.out

[nix-shell:~/xfstests]# trace-cmd report
cpus=4
           a.out-3872  [003] 84120.161472: xfs_setattr:          dev 259:0 ino 0x103 iflags 0x0
           a.out-3872  [003] 84120.172109: xfs_setattr:          dev 259:0 ino 0x103 iflags 0x20 
           a.out-3872  [003] 84120.172151: xfs_zero_eof:         dev 259:0 ino 0x103 isize 0x0 disize 0x0 pos 0x0 bytecount 0x2000 // First truncate
           a.out-3872  [003] 84120.172156: xfs_file_buffered_write: dev 259:0 ino 0x103 disize 0x0 pos 0x2000 bytecount 0x28
           a.out-3872  [003] 84120.185423: xfs_file_buffered_write: dev 259:0 ino 0x103 disize 0x2028 pos 0x0 bytecount 0x28
           a.out-3872  [003] 84120.185477: xfs_setattr:          dev 259:0 ino 0x103 iflags 0x0
           a.out-3872  [003] 84120.186493: xfs_setattr:          dev 259:0 ino 0x103 iflags 0x20
           a.out-3872  [003] 84120.186495: xfs_zero_eof:         dev 259:0 ino 0x103 isize 0x4 disize 0x4 pos 0x4 bytecount 0xffc // Third truncate

First and third truncate result in calling xfs_zero_eof as we are
increasing the size of the file.

When we do the second ftruncate(fd, 4), we call into iomap_truncate_page() with
offset 0:

int
iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
		const struct iomap_ops *ops)
{
	unsigned int blocksize = i_blocksize(inode);
	unsigned int off = pos & (blocksize - 1);

	/* Block boundary? Nothing to do */
	if (!off)
		return 0;
	return iomap_zero_range(inode, pos, blocksize - off, did_zero, ops);
}

As you can see, we take into account the blocksize (which is set as
minorder during inode init) and make sure the sub-block zeroing is done
correctly.

Also if you see iomap_invalidate_folio(), we don't remove the folio
private data until the whole folio is invalidated.

I doubt we are doing anything wrong from the page cache layer with these
patches.

All we do with minorder support is to make sure we always allocate folios
in the page cache that are at least min order in size and aligned to the
min order (PATCH 2 and 3) and we maintain this even we do a split (PATCH
4).

I hope this helps!

--
Pankaj

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ