Message-ID: <YjtKRAgFmBfgU0al@google.com>
Date:   Wed, 23 Mar 2022 09:26:44 -0700
From:   Jaegeuk Kim <jaegeuk@...nel.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Tim Murray <timmurray@...gle.com>,
        Waiman Long <longman@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux F2FS Dev Mailing List 
        <linux-f2fs-devel@...ts.sourceforge.net>
Subject: Re: [GIT PULL] f2fs for 5.18

On 03/22, Linus Torvalds wrote:
> On Tue, Mar 22, 2022 at 5:34 PM Tim Murray <timmurray@...gle.com> wrote:
> >
> > AFAICT, what's happening is that rwsem_down_read_slowpath
> > modifies sem->count to indicate that there's a pending reader while
> > f2fs_ckpt holds the write lock, and when f2fs_ckpt releases the write
> > lock, it wakes pending readers and hands the lock over to readers.
> > This means that any subsequent attempt to grab the write lock from
> > f2fs_ckpt will stall until the newly-awakened reader releases the read
> > lock, which depends on the readers' arbitrarily long scheduling
> > delays.
> 
> Ugh.
> 
> So I'm looking at some of this, and you have things like this:
> 
>         f2fs_down_read(&F2FS_I(inode)->i_sem);
>         cp_reason = need_do_checkpoint(inode);
>         f2fs_up_read(&F2FS_I(inode)->i_sem);
> 
> which really doesn't seem to want a sleeping lock at all.
> 
> In fact, it's not clear that it has any business serializing with IO
> at all. It seems to just check very basic inode state. Very strange.
> It's the kind of thing that the VFS layer tends to use the i_lock
> *spinlock* for.

Um.. let me check this i_sem, introduced by
d928bfbfe77a ("f2fs: introduce fi->i_sem to protect fi's info").

OTOH, I suspected the major contention would be in
	f2fs_lock_op -> f2fs_down_read(&sbi->cp_rwsem);
which is used by most filesystem operations.

And, when we need to do checkpoint, we'd like to block internal operations by
	f2fs_lock_all -> f2fs_down_write(&sbi->cp_rwsem);

So, what I expected was to give the checkpoint thread the highest priority
by having it grab down_write, which blocks all the other readers.

> 
> And perhaps equally oddly, then when you do f2fs_issue_checkpoint(),
> _that_ code uses fancy lockless lists.
> 
> I'm probably mis-reading it.
> 
>              Linus
