Message-ID: <2sjhov4poma4o4efvwe2xk474iorxwvf4ifqa5oee74744ke2e@lipjana3f5ti>
Date: Tue, 12 Nov 2024 11:50:46 +0200
From: "Kirill A. Shutemov" <kirill@...temov.name>
To: Dave Chinner <david@...morbit.com>
Cc: Jens Axboe <axboe@...nel.dk>, linux-mm@...ck.org,
	linux-fsdevel@...r.kernel.org, hannes@...xchg.org, clm@...a.com,
	linux-kernel@...r.kernel.org, willy@...radead.org,
	linux-btrfs@...r.kernel.org, linux-ext4@...r.kernel.org,
	linux-xfs@...r.kernel.org
Subject: Re: [PATCH 10/16] mm/filemap: make buffered writes work with RWF_UNCACHED

On Tue, Nov 12, 2024 at 07:02:33PM +1100, Dave Chinner wrote:
> I think the post-IO invalidation that these IOs do is largely
> irrelevant to how the page cache processes the write. Indeed,
> from userspace, the functionality in this patchset would be
> implemented like this:
>
> oneshot_data_write(fd, buf, len, off)
> {
>     /* write into page cache */
>     pwrite(fd, buf, len, off);
>
>     /* force the write through the page cache */
>     sync_file_range(fd, off, len, SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER);
>
>     /* Invalidate the single use data in the cache now it is on disk */
>     posix_fadvise(fd, off, len, POSIX_FADV_DONTNEED);
> }
>
> Allowing the application to control writeback and invalidation
> granularity is a much more flexible solution to the problem here;
> when IO is sequential, delayed allocation will be allowed to ensure
> large contiguous extents are created and that will greatly reduce
> file fragmentation on XFS, btrfs, bcachefs and ext4. For random
> writes, it'll submit async IOs in batches...
>
> Given that io_uring already supports sync_file_range() and
> posix_fadvise(), I'm wondering why we need an new IO API to perform
> this specific write-through behaviour in a way that is less flexible
> than what applications can already implement through existing
> APIs....

Attaching the hint to the IO operation allows kernel to keep the data in
page cache if it is there for other reason. You cannot do it with a
separate syscall.

Consider a scenario of a nightly backup of the data. The same data is in
cache because the actual workload needs it. You don't want backup task to
invalidate the data from cache. Your snippet would do that.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
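
For readers who want to try the existing-API approach quoted above, here is
a minimal compilable sketch of it in C. The return type, the error handling
and the short-write loop are assumptions added for illustration; they are
not part of the quoted snippet. Only pwrite(), sync_file_range() and
posix_fadvise() are used, as in the original. Note that the final step asks
the kernel to drop the range no matter why the pages were cached, which is
the behaviour the reply objects to.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() in <fcntl.h> */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch of the quoted userspace sequence.
 * Returns 0 on success, -1 with errno set on failure (an assumed
 * convention; the quoted snippet does not define one).
 */
int oneshot_data_write(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;
    size_t done = 0;

    /* Write into the page cache, retrying on short writes and EINTR. */
    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, off + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {           /* no progress: avoid looping forever */
            errno = EIO;
            return -1;
        }
        done += (size_t)n;
    }

    /* Start writeback for the range and wait for it to finish. */
    if (sync_file_range(fd, off, (off_t)len,
                        SYNC_FILE_RANGE_WRITE |
                        SYNC_FILE_RANGE_WAIT_AFTER) < 0)
        return -1;

    /*
     * Ask the kernel to drop the now-clean range from the page cache.
     * posix_fadvise() returns the error number directly instead of
     * setting errno, so convert it to the convention used here.
     */
    int err = posix_fadvise(fd, off, (off_t)len, POSIX_FADV_DONTNEED);
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

A backup task calling oneshot_data_write() on a range that another process
keeps hot would evict that range too, because the invalidation is a
separate call that knows nothing about the earlier state of the cache. A
per-IO hint such as the RWF_UNCACHED flag proposed in this patchset is what
would let the kernel tell the two cases apart.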