linux-ext4 - Re: [PATCH v3 2/5] fs: Add inode_update_time_writable

Message-ID: <CALCETrVrSiXCt5+2801C+QA6B1jzb0K3VHT6w8sVf_VXrz16Bw@mail.gmail.com>
Date:	Mon, 19 Aug 2013 21:07:49 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Dave Chinner <david@...morbit.com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
	"Theodore Ts'o" <tytso@....edu>,
	Dave Hansen <dave.hansen@...ux.intel.com>, xfs@....sgi.com,
	Jan Kara <jack@...e.cz>, Tim Chen <tim.c.chen@...ux.intel.com>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH v3 2/5] fs: Add inode_update_time_writable

On Mon, Aug 19, 2013 at 8:33 PM, Dave Chinner <david@...morbit.com> wrote:
> On Mon, Aug 19, 2013 at 08:20:12PM -0700, Andy Lutomirski wrote:
>> On Mon, Aug 19, 2013 at 7:28 PM, Dave Chinner <david@...morbit.com> wrote:
>> > On Fri, Aug 16, 2013 at 04:22:09PM -0700, Andy Lutomirski wrote:
>> >> This is like file_update_time, except that it acts on a struct inode *
>> >> instead of a struct file *.
>> >>
>> >> Signed-off-by: Andy Lutomirski <luto@...capital.net>
>> >> ---
>> >>  fs/inode.c         | 72 ++++++++++++++++++++++++++++++++++++++++++------------
>> >>  include/linux/fs.h |  1 +
>> >>  2 files changed, 58 insertions(+), 15 deletions(-)
>> >>
>>
>> [...]
>>
>> >> +
>> >> +int inode_update_time_writable(struct inode *inode)
>> >> +{
>> >> +     struct timespec now;
>> >> +     int sync_it = prepare_update_cmtime(inode, &now);
>> >> +     int ret;
>> >> +
>> >> +     if (!sync_it)
>> >> +             return 0;
>> >> +
>> >> +     /* sb_start_pagefault and update_time can both sleep. */
>> >> +     sb_start_pagefault(inode->i_sb);
>> >> +     ret = update_time(inode, &now, sync_it);
>> >> +     sb_end_pagefault(inode->i_sb);
>> >
>> > This gets called from the writeback path - you can't use
>> > sb_start_pagefault/sb_end_pagefault in that path.
>>
>> The race I'm worried about is:
>>
>>  - mmap
>>  - write to the mapping
>>  - remount ro
>>  - flush_cmtime -> inode_update_time_writable
>
> sb_start_pagefault() is for filesystem freeze protection, not
> remount-ro protection. If you freeze the filesystem, then we stop
> writes and pagefaults by making sb_start_pagefault/sb_start_write
> block, and then run writeback to clean all the pages.  If writeback
> then blocks on sb_start_pagefault(), we've got a deadlock.
>
>> This may be impossible, in which case I'm okay, but it's nice to have
>> a sanity check.  I'll see if I can figure out how to do that.
>
> The process of remount-ro should flush the dirty pages - the inode
> and page has been marked dirty by page_mkwrite(), after all.

Hmm.  We can land in here from writeback, in which case the time
should be updated unconditionally.  We can also land in here from
msync(MS_ASYNC) or munmap.  munmap at least shouldn't block.

The nasty case is if a page is dirtied, then the frozen level is set
to SB_FREEZE_PAGEFAULT, and then userspace calls munmap or msync
*before* writepages gets called.  In this case, blocking until the fs
is unfrozen is probably impolite, and returning without updating the
time is questionable.
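
To make the window concrete, here is a small userspace model (not
kernel code) of the level comparison that decides whether a
freeze-protection acquisition has to wait. The level names and values
are meant to mirror the enum in include/linux/fs.h; `enum freeze_level`,
`struct model_sb` and `model_would_block` are hypothetical names invented
for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Freeze levels; assumed to match the enum in include/linux/fs.h. */
enum freeze_level {
    SB_UNFROZEN = 0,
    SB_FREEZE_WRITE = 1,
    SB_FREEZE_PAGEFAULT = 2,
    SB_FREEZE_FS = 3,
    SB_FREEZE_COMPLETE = 4,
};

/* Hypothetical stand-in for the freeze state kept in the superblock. */
struct model_sb {
    enum freeze_level frozen;
};

/*
 * Model of the check made when a caller asks for protection at
 * `level`: the caller must wait once the filesystem has been frozen
 * to that level or beyond.
 */
static bool model_would_block(const struct model_sb *sb,
                              enum freeze_level level)
{
    return sb->frozen >= level;
}
```

With `frozen` set to SB_FREEZE_PAGEFAULT, a request at
SB_FREEZE_PAGEFAULT would wait, whether it comes from writeback (the
deadlock described above) or from munmap/msync (the impolite case),
while a request at SB_FREEZE_FS would still go through.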

Removing the check entirely may add a new race, though: what if
.flush_cmtime has called mapping_test_clear_cmtime but hasn't gotten
to updating the time yet when freezing finishes?  This could be
prevented by changing generic_flush_cmtime to do __sb_start_write(sb,
SB_FREEZE_FS, false) and doing nothing if the fs is already frozen.
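
A rough userspace sketch of that idea follows; it is a model, not the
real generic_flush_cmtime. All `model_*` names, the struct layouts and
the `cmtime_pending` flag are hypothetical. Two assumptions are baked
in: the non-blocking acquisition is taken *before* the test-and-clear,
and the pending flag is left set when the fs is already frozen.

```c
#include <assert.h>
#include <stdbool.h>

/* Freeze levels; assumed to match the enum in include/linux/fs.h. */
enum freeze_level {
    SB_UNFROZEN = 0,
    SB_FREEZE_WRITE = 1,
    SB_FREEZE_PAGEFAULT = 2,
    SB_FREEZE_FS = 3,
    SB_FREEZE_COMPLETE = 4,
};

/* Hypothetical stand-ins for the superblock and inode state. */
struct model_sb {
    enum freeze_level frozen;
    int writers;               /* holders of SB_FREEZE_FS protection */
};

struct model_inode {
    struct model_sb *sb;
    bool cmtime_pending;       /* models the per-mapping cmtime flag */
    unsigned int times_updated;
};

/* Models __sb_start_write(sb, level, false): fail instead of waiting. */
static bool model_sb_try_start(struct model_sb *sb, enum freeze_level level)
{
    if (sb->frozen >= level)
        return false;
    sb->writers++;
    return true;
}

static void model_sb_end(struct model_sb *sb)
{
    assert(sb->writers > 0);
    sb->writers--;
}

/* Returns true if the timestamps were updated by this call. */
static bool model_flush_cmtime(struct model_inode *inode)
{
    bool updated = false;

    /* Already frozen to SB_FREEZE_FS or beyond: do nothing. */
    if (!model_sb_try_start(inode->sb, SB_FREEZE_FS))
        return false;

    /* Test-and-clear, then update, all while holding protection. */
    if (inode->cmtime_pending) {
        inode->cmtime_pending = false;
        inode->times_updated++;
        updated = true;
    }

    model_sb_end(inode->sb);
    return updated;
}
```

In this model a flush that arrives while the level is
SB_FREEZE_PAGEFAULT (the writeback pass during a freeze) still updates
the time without blocking, and a flush that arrives after the freeze
has reached SB_FREEZE_FS returns without touching anything.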

--Andy