lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKFNMomzHg33SHnp6xGMEZY=+k6Y4t7dvBvgBDbO9H3ujzNDCw@mail.gmail.com>
Date:   Fri, 4 Aug 2023 11:01:39 +0900
From:   Ryusuke Konishi <konishi.ryusuke@...il.com>
To:     Jan Kara <jack@...e.cz>
Cc:     Christoph Hellwig <hch@....de>, Al Viro <viro@...iv.linux.org.uk>,
        Christian Brauner <brauner@...nel.org>,
        Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>,
        "Theodore Ts'o" <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Jaegeuk Kim <jaegeuk@...nel.org>, Chao Yu <chao@...nel.org>,
        "Darrick J. Wong" <djwong@...nel.org>,
        Jens Axboe <axboe@...nel.dk>, linux-btrfs@...r.kernel.org,
        linux-ext4@...r.kernel.org, linux-f2fs-devel@...ts.sourceforge.net,
        linux-nilfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-xfs@...r.kernel.org, linux-block@...r.kernel.org
Subject: Re: [PATCH 02/12] nilfs2: use setup_bdev_super to de-duplicate the
 mount code

On Thu, Aug 3, 2023 at 8:46 PM Jan Kara wrote:
>
> On Wed 02-08-23 17:41:21, Christoph Hellwig wrote:
> > Use the generic setup_bdev_super helper to open the main block device
> > and do various bits of superblock setup instead of duplicating the
> > logic.  This includes moving to the new scheme implemented in common
> > code that only opens the block device after the superblock has allocated.
> >
> > It does not yet convert nilfs2 to the new mount API, but doing so will
> > become a bit simpler after this first step.
> >
> > Signed-off-by: Christoph Hellwig <hch@....de>
>
> AFAICS nilfs2 could *almost* use mount_bdev() directly and then just do its

> snapshot thing after mount_bdev() returns. But it has this weird logic
> that: "if the superblock is already mounted but we can shrink the whole
> dcache, then do remount instead of ignoring mount options". Firstly, this
> looks racy - what prevents someone from say opening a file on the sb just
> after nilfs_tree_is_busy() shrinks dcache? Secondly, it is inconsistent
> with any other filesystem so it's going to surprise sysadmins not
> intimately knowing nilfs2. Thirdly, from userspace you cannot tell what
> your mount call is going to do. Last but not least, what is it really good
> for? Ryusuke, can you explain please?
>
>                                                                 Honza

I think you are referring to the following part:

>        if (!s->s_root) {
...
>        } else if (!sd.cno) {
>                if (nilfs_tree_is_busy(s->s_root)) {
>                        if ((flags ^ s->s_flags) & SB_RDONLY) {
>                                nilfs_err(s,
>                                          "the device already has a %s mount.",
>                                          sb_rdonly(s) ? "read-only" : "read/write");
>                                err = -EBUSY;
>                                goto failed_super;
>                        }
>                } else {
>                        /*
>                         * Try remount to setup mount states if the current
>                         * tree is not mounted and only snapshots use this sb.
>                         */
>                        err = nilfs_remount(s, &flags, data);
>                        if (err)
>                                goto failed_super;
>                }
>        }

What this logic is trying to do is, if there is already a nilfs2 mount
instance for the device, and are trying to mounting the current tree
(sd.cno is 0, so this is not a snapshot mount), then will switch
depending on whether the current tree has a mount:

- If the current tree is mounted, it's just like a normal filesystem.
(A read-only mount and a read/write mount can't coexist, so check
that, and reuse the instance if possible)
- Otherwise, i.e. for snapshot mounts only, do whatever is necessary
to add a new current mount, such as starting a log writer.
   Since it does the same thing that nilfs_remount does, so
nilfs_remount() is used there.

Whether or not there is a current tree mount can be determined by
d_count(s->s_root) > 1 as nilfs_tree_is_busy() does.
Where s->s_root is always the root dentry of the current tree, not
that of the mounted snapshot.

I remember that calling shrink_dcache_parent() before this test was to
do the test correctly if there was garbage left in the dcache from the
past current mount.

If the current tree isn't mounted, it just cleans up the garbage, and
the reference count wouldn't have incremented in parallel.

If the current tree is mounted, d_count(s->s_root) will not decrease
to 1, so it's not a problem.
However, this will cause unexpected dcache shrinkage for the in-use
tree, so it's not a good idea, as you pointed out.  If there is
another way of judging without this side effect, it should be
replaced.

I will reply here once.

Regards,
Ryusuke Konishi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ