lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230414153612.GB360881@frogsfrogsfrogs>
Date:   Fri, 14 Apr 2023 08:36:12 -0700
From:   "Darrick J. Wong" <djwong@...nel.org>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     Miklos Szeredi <miklos@...redi.hu>,
        Bernd Schubert <bschubert@....com>, axboe@...nel.dk,
        io-uring@...r.kernel.org, linux-ext4@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-xfs@...r.kernel.org,
        dsingh@....com
Subject: Re: [PATCH 1/2] fs: add FMODE_DIO_PARALLEL_WRITE flag

On Thu, Apr 13, 2023 at 10:11:28PM -0700, Christoph Hellwig wrote:
> On Thu, Apr 13, 2023 at 09:40:29AM +0200, Miklos Szeredi wrote:
> > fuse_direct_write_iter():
> > 
> > bool exclusive_lock =
> >     !(ff->open_flags & FOPEN_PARALLEL_DIRECT_WRITES) ||
> >     iocb->ki_flags & IOCB_APPEND ||
> >     fuse_direct_write_extending_i_size(iocb, from);
> > 
> > If the write is size extending, then it will take the lock exclusive.
> > OTOH, I guess that it would be unusual for lots of  size extending
> > writes to be done in parallel.
> > 
> > What would be the effect of giving the  FMODE_DIO_PARALLEL_WRITE hint
> > and then still serializing the writes?
> 
> I have no idea how this flags work, but XFS also takes i_rwsem
> exclusively for appends, when the positions and size aren't aligned to
> the block size, and a few other cases.

IIUC uring wants to avoid the situation where someone sends 300 writes
to the same file, all of which end up in background workers, and all of
which then contend on exclusive i_rwsem.  Hence it has some hashing
scheme that executes io requests serially if they hash to the same value
(which iirc is the inode number?) to prevent resource waste.

This flag turns off that hashing behavior on the assumption that each of
those 300 writes won't serialize on the other 299 writes, hence it's ok
to start up 300 workers.

(apologies for precoffee garbled response)

--D

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ