Message-ID: <Pine.LNX.4.64.1111301123320.5759@hs20-bc2-1.build.redhat.com>
Date:	Wed, 30 Nov 2011 11:34:23 -0500 (EST)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	Alasdair G Kergon <agk@...hat.com>
cc:	Jan Kara <jack@...e.cz>, esandeen@...hat.com,
	linux-kernel@...r.kernel.org, dm-devel@...hat.com,
	linux-fsdevel@...r.kernel.org,
	Christopher Chaltain <christopher.chaltain@...onical.com>,
	Valerie Aurora <val@...consulting.com>
Subject: Re: [dm-devel] [PATCH] deadlock with suspend and quotas

On Wed, 30 Nov 2011, Alasdair G Kergon wrote:

> On Tue, Nov 29, 2011 at 11:19:01AM +0100, Jan Kara wrote:
> > On Mon 28-11-11 18:32:18, Mikulas Patocka wrote:
> > > - skipping sync on frozen filesystem violates sync semantics. 
> > > Applications, such as databases, assume that when sync finishes, data were 
> > > written to stable storage. If we skip sync when the filesystem is frozen, 
> > > we can cause data corruption in these applications (if the system crashes 
> > > after we skipped a sync).
> 
> >   Here I don't agree. Filesystem must guarantee there are no dirty data on
> > a frozen filesystem. Ext4 and XFS do this, ext3 would need proper
> > page_mkwrite() implementation for this but that's the problem of ext3, not
> > freezing code in general. If there are no dirty data, sync code (and also
> > flusher thread) is free to return without doing anything.
>  
> Consider, during a 'create a snapshot' operation:
>    I/O flow:  application -> filesystem -> LV -> disk
> 
> dm lockfs is issued by LVM.
>   When this returns, the filesystem should be locked i.e. not issue any
>   further I/O to the LV.  (But if it did happen to issue I/O, it
>   wouldn't be a problem, as it would just get queued by dm and have no
>   impact on the snapshot creation operation.)
> 
> The application is still running and might still be issuing writes to
> the filesystem and might itself issue 'sync'.  But a 'sync' would only
> be meaningful for already-completed writes and the lockfs process should
> have already seen that they have hit disk.  So a sync issued while a
> device is locked can always be skipped.  Have I missed something in this
> reasoning, Mikulas?
> 
> Alasdair

You can't skip sync.

The problem is this (assume that you have a non-journaled filesystem):

- A process issues a write() call. The call reaches 
__generic_file_aio_write. Suppose the process gets just past 
"vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE);" and is then rescheduled.

- You suspend the filesystem.

- The process that issued write() is scheduled to run again. It has already 
passed "vfs_check_frozen", so it carries on even though the filesystem is 
suspended. This process creates a dirty page in the page cache.

- The process eventually returns to userspace from the write() syscall 
while the filesystem is still suspended. write() doesn't guarantee that the 
data hit disk, so there is no problem so far.

- The application calls sync. Now, if you skip sync on the suspended 
filesystem, you violate sync semantics: when a process calls write() and 
then sync(), it can assume that the data are written to stable storage.
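
The interleaving above can be replayed in a small userspace toy model. 
This is only a sketch, not kernel code: ToyFilesystem, check_frozen, 
dirty_page, sync_skipping_frozen and run_race are hypothetical names made 
up for illustration, and the model assumes that freezing flushes whatever 
is dirty at that moment. It shows that a writer which passed the freeze 
check before the suspend can leave a dirty page behind, and that a sync 
which skips frozen filesystems then returns with that page still unwritten.

```python
class ToyFilesystem:
    """Toy model of a non-journaled filesystem with a freeze flag (not kernel code)."""

    def __init__(self):
        self.frozen = False
        self.dirty_pages = []  # data in the page cache, not yet on disk
        self.on_disk = []      # data that reached stable storage

    def check_frozen(self):
        # Stands in for vfs_check_frozen(). The real call blocks while the
        # filesystem is frozen; the toy version raises instead.
        if self.frozen:
            raise RuntimeError("would block: filesystem is frozen")

    def dirty_page(self, data):
        # Rest of the write path: no further freeze check happens here.
        self.dirty_pages.append(data)

    def writeback(self):
        self.on_disk.extend(self.dirty_pages)
        self.dirty_pages.clear()

    def freeze(self):
        # Assumption of this model: freezing flushes all currently dirty data.
        self.writeback()
        self.frozen = True

    def sync_skipping_frozen(self):
        # The behaviour under discussion: return at once if frozen.
        if self.frozen:
            return
        self.writeback()


def run_race():
    fs = ToyFilesystem()
    fs.check_frozen()          # writer passes the check ...
    fs.freeze()                # ... is rescheduled, and the fs is suspended
    fs.dirty_page("record")    # writer resumes and dirties a page anyway
    fs.sync_skipping_frozen()  # application calls sync; it is skipped
    return fs


fs = run_race()
print("dirty after sync:", fs.dirty_pages)  # dirty after sync: ['record']
print("on disk:", fs.on_disk)               # on disk: []
```

If the system crashed at this point, "record" would be lost even though 
the application's sync had already returned.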

So, in order to keep sync working, you must wait for the filesystem to 
thaw and then write the dirty data.
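
A sync that waits for the thaw might look roughly like the following 
userspace sketch. Again this is a toy model under stated assumptions, not 
the kernel implementation: ToyFilesystem, freeze, thaw, sync and demo are 
hypothetical names, and a threading.Condition stands in for whatever wait 
mechanism the kernel would use.

```python
import threading


class ToyFilesystem:
    """Toy model: sync blocks while frozen and writes back after thaw."""

    def __init__(self):
        self.cond = threading.Condition()
        self.frozen = False
        self.dirty_pages = []  # data in the page cache, not yet on disk
        self.on_disk = []      # data that reached stable storage

    def freeze(self):
        with self.cond:
            self.frozen = True

    def thaw(self):
        with self.cond:
            self.frozen = False
            self.cond.notify_all()  # wake any sync waiting for the thaw

    def sync(self):
        with self.cond:
            # Do not skip: wait until the filesystem is thawed ...
            self.cond.wait_for(lambda: not self.frozen)
            # ... and only then write the dirty data.
            self.on_disk.extend(self.dirty_pages)
            self.dirty_pages.clear()


def demo():
    fs = ToyFilesystem()
    fs.freeze()
    # Page dirtied by a writer that raced with the freeze.
    fs.dirty_pages.append("record")

    syncer = threading.Thread(target=fs.sync)
    syncer.start()
    syncer.join(timeout=0.2)
    blocked_while_frozen = syncer.is_alive()  # sync has not returned yet

    fs.thaw()
    syncer.join()  # sync completes once the filesystem is thawed
    return blocked_while_frozen, fs.dirty_pages, fs.on_disk


print(demo())  # (True, [], ['record'])
```

Here sync returns only after the dirty page has been written, so the 
write() followed by sync() guarantee holds, at the cost of sync blocking 
for as long as the filesystem stays suspended.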

Mikulas
