lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 14 Sep 2011 16:53:38 -0700
From:	Valerie Aurora <val@...consulting.com>
To:	Jan Kara <jack@...e.cz>
Cc:	linux-fsdevel@...r.kernel.org, Dave Chinner <david@...morbit.com>,
	Masayoshi MIZUMA <m.mizuma@...fujitsu.com>,
	Greg Freemyer <greg.freemyer@...il.com>,
	linux-ext4@...r.kernel.org, Eric Sandeen <sandeen@...hat.com>
Subject: Re: [RFC PATCH 1/3] VFS: Fix s_umount thaw/write deadlock

On Wed, Sep 14, 2011 at 7:00 AM, Jan Kara <jack@...e.cz> wrote:
> On Mon 12-09-11 19:57:11, Valerie Aurora wrote:
>> Val, if you are sending patches as attachments, make them at least
>> text/plain please!

Oops, sorry.

>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>> index 04cf3b9..d1dca03 100644
>> --- a/fs/fs-writeback.c
>> +++ b/fs/fs-writeback.c
>> @@ -537,6 +537,9 @@ static long writeback_sb_inodes(struct super_block *sb,
>>       long write_chunk;
>>       long wrote = 0;  /* count both pages and inodes */
>>
>> +     if (vfs_is_frozen(sb))
>> +             return 0;
>> +
>  Umm, maybe we could make this more robust by skipping the superblock in
> __writeback_inodes_wb() and just explicitely stopping the writeback when
> work->sb is set (i.e. writeback is required only for frozen sb) in
> wb_writeback()?

Sorry, I don't quite understand what the goal is here?  I'm happy to
make the change, just want to make sure I'm accomplishing what you
want.

>>       while (!list_empty(&wb->b_io)) {
>>               struct inode *inode = wb_inode(wb->b_io.prev);
>>
>> @@ -1238,39 +1241,43 @@ EXPORT_SYMBOL(writeback_inodes_sb);
>>   * writeback_inodes_sb_if_idle       -       start writeback if none underway
>>   * @sb: the superblock
>>   *
>> - * Invoke writeback_inodes_sb if no writeback is currently underway.
>> - * Returns 1 if writeback was started, 0 if not.
>> + * Invoke writeback_inodes_sb if no writeback is currently underway
>> + * and no one else holds the s_umount lock.  Returns 1 if writeback
>> + * was started, 0 if not.
>>   */
>>  int writeback_inodes_sb_if_idle(struct super_block *sb)
>>  {
>>       if (!writeback_in_progress(sb->s_bdi)) {
>> -             down_read(&sb->s_umount);
>> -             writeback_inodes_sb(sb);
>> -             up_read(&sb->s_umount);
>> -             return 1;
>> -     } else
>> -             return 0;
>> +             if (down_read_trylock(&sb->s_umount)) {
>  What's exactly the deadlock trylock protects from here? Or is it just an
> optimization?

The trylock is an optimization Dave Chinner suggested.  The first
version I wrote acquired the lock and then checked vfs_is_frozen().

This function and the similar following one each have one caller,
btrfs in one case and ext4 in the other, and they are both trying to
get more writes to go out to disk in the hope of freeing up disk
space.

>> +                     writeback_inodes_sb(sb);
>> +                     up_read(&sb->s_umount);
>> +                     return 1;
>> +             }
>> +     }
>> +     return 0;
>>  }
>>  EXPORT_SYMBOL(writeback_inodes_sb_if_idle);
>>
>>  /**
>> - * writeback_inodes_sb_if_idle       -       start writeback if none underway
>> + * writeback_inodes_sb_nr_if_idle    -       start writeback if none underway
>>   * @sb: the superblock
>>   * @nr: the number of pages to write
>>   *
>> - * Invoke writeback_inodes_sb if no writeback is currently underway.
>> - * Returns 1 if writeback was started, 0 if not.
>> + * Invoke writeback_inodes_sb if no writeback is currently underway
>> + * and no one else holds the s_umount lock.  Returns 1 if writeback
>> + * was started, 0 if not.
>>   */
>>  int writeback_inodes_sb_nr_if_idle(struct super_block *sb,
>>                                  unsigned long nr)
>>  {
>>       if (!writeback_in_progress(sb->s_bdi)) {
>> -             down_read(&sb->s_umount);
>> -             writeback_inodes_sb_nr(sb, nr);
>> -             up_read(&sb->s_umount);
>> -             return 1;
>> -     } else
>> -             return 0;
>> +             if (down_read_trylock(&sb->s_umount)) {
>  The same question here...
>
>> +                     writeback_inodes_sb_nr(sb, nr);
>> +                     up_read(&sb->s_umount);
>> +                     return 1;
>> +             }
>> +     }
>> +     return 0;
>>  }
>>  EXPORT_SYMBOL(writeback_inodes_sb_nr_if_idle);
>>
>> diff --git a/fs/quota/quota.c b/fs/quota/quota.c
>> index b34bdb2..993ce22 100644
>> --- a/fs/quota/quota.c
>> +++ b/fs/quota/quota.c
>> @@ -366,6 +366,18 @@ SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
>>       if (IS_ERR(sb))
>>               return PTR_ERR(sb);
>>
>> +     /*
>> +      * It's not clear which quota functions may write to the file
>> +      * system (all?).  Check for a frozen fs and bail out now.
>> +      */
>> +     if (vfs_is_frozen(sb)) {
>> +             drop_super(sb);
>> +             /* XXX Should quotactl_block() error path do this too? */
>  Yes, it does. Thanks for spotting this.

Fixed, thanks for confirming.

>> +             if (pathp && !IS_ERR(pathp))
>> +                     path_put(pathp);
>> +             return -EBUSY;
>> +     }
>> +
>>       ret = do_quotactl(sb, type, cmds, id, addr, pathp);
>>
>>       drop_super(sb);
> ...
>> diff --git a/fs/sync.c b/fs/sync.c
>> index c98a747..db15b11 100644
>> --- a/fs/sync.c
>> +++ b/fs/sync.c
>> @@ -68,7 +68,7 @@ int sync_filesystem(struct super_block *sb)
>>       /*
>>        * No point in syncing out anything if the filesystem is read-only.
>>        */
>> -     if (sb->s_flags & MS_RDONLY)
>> +     if ((sb->s_flags & MS_RDONLY) && !vfs_is_frozen(sb))
>                                      ^^^^
> The check should be: (sb->s_flags & MS_RDONLY || vfs_is_frozen(sb))

Fixed, and I reviewed the other checks for similar mistakes.  Thanks!
I'll resend soon.

-VAL
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ