lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4DBE746F.3090707@ubuntu.com>
Date:	Mon, 02 May 2011 12:07:59 +0300
From:	Surbhi Palande <surbhi.palande@...ntu.com>
To:	Dave Chinner <david@...morbit.com>
CC:	Jan Kara <jack@...e.cz>,
	Toshiyuki Okajima <toshi.okajima@...fujitsu.com>,
	Ted Ts'o <tytso@....edu>,
	Masayoshi MIZUMA <m.mizuma@...fujitsu.com>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [RFC][PATCH] Re: [BUG] ext4: cannot unfreeze a filesystem due
 to a deadlock

Hi,

On 04/06/2011 02:21 PM, Dave Chinner wrote:
> On Wed, Apr 06, 2011 at 08:18:56AM +0200, Jan Kara wrote:
>> On Wed 06-04-11 15:40:05, Dave Chinner wrote:
>>> On Fri, Apr 01, 2011 at 04:08:56PM +0200, Jan Kara wrote:
>>>> On Fri 01-04-11 10:40:50, Dave Chinner wrote:
>>>>> If you don't allow the page to be dirtied in the fist place, then
>>>>> nothing needs to be done to the writeback path because there is
>>>>> nothing dirty for it to write back.
>>>>    Sure but that's only the problem he was able to hit. But generally,
>>>> there's a problem with needing s_umount for unfreezing because it isn't
>>>> clear there aren't other code paths which can block with s_umount held
>>>> waiting for fs to get unfrozen. And these code paths would cause the same
>>>> deadlock. That's why I chose to get rid of s_umount during thawing.
>>>
>>> Holding the s_umount lock while checking if frozen and sleeping
>>> is essentially an ABBA lock inversion bug that can bite in many more
>>> places that just thawing the filesystem.  Any where this is done should
>>> be fixed, so I don't think just removing the s_umount lock from the thaw
>>> path is sufficient to avoid problems.
>>    That's easily said but hard to do - any transaction start in ext3/4 may
>> block on filesystem being frozen (this seems to be similar for XFS as I'm
>> looking into the code) and transaction start traditionally nests inside
>> s_umount (and basically there's no way around that since sync() calls your
>> fs code with s_umount held).
>
> Sure, but the question must be asked - why is ext3/4 even starting a
> transaction on a clean filesystem during sync? A frozen filesystem,
> by definition, is a clean filesytem, and therefore sync calls of any
> kind should not be trying to write to the FS or start transactions.
> XFS does this just fine, so I'd consider such behaviour on a frozen
> filesystem a bug in ext3/4...

I had a look at the xfs code for seeing how this is done.
xfs_file_aio_write()
   xfs_wait_for_freeze()
     vfs_check_frozen()
So xfs_file_aio_write() writes to buffers when the FS is not frozen.

Now, I want to know what stops the following scenario from happening:
--------------------
xfs_file_aio_write()
   xfs_wait_for_freeze()
     vfs_check_frozen()
At this point F.S was not frozen, so the next instruction in the 
xfs_file_aio_write() will be executed next.
However at this point (i.e after checking if F.S is frozen) the write 
process gets pre-empted and say the _freeze_ process gets control.

Now the F.S freezes and the write process gets the control back. And so 
we end up writing to the page cache when the F.S is frozen.
--------------------

Can anyone please enlighten me on how & why this premption is _not_ 
possible?

If this pre-emption is _possible_, then can we use sb->s_umount to 
prevent a freeze from happening while a write to the page cache buffers 
is going on. Eg:

* Before writing to the buffers in the page cache:

   down_write(sb->s_umount)
     if(sb->s_frozen == SB_FREEZE_WRITE) {
       // do not sleep with the sb->s_umount semaphore.
       up_write(s_umount);
       vfs_check_frozen();
       // if you are here then fs is not thawed.
       down_write(sb->s_umount);
     }


Thanks!


Warm Regards,
Surbhi.




>
>> So I'm afraid we are not going to get rid of
>> this ABBA dependency unless we declare that s_umount ranks above filesystem
>> being frozen - but surely I'm open to suggestions.
>
> Not sure I understand what you are saying there - this is already
> the case, isn't it? i.e. it has to be held exclusive to freeze a
> filesystem...
>
>> Another possibility is just to hide the problem e.g. by checking for frozen
>> filesystem whenever we try to get s_umount. But that looks a bit ugly to
>> me.
>
> And not necessary, AFAICT.
>
> Cheers,
>
> Dave.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ