linux-ext4 - Re: [PATCH 4/7] ext4: fsync should wait for DIO writers

Message-ID: <20120912140218.GC5726@quack.suse.cz>
Date:	Wed, 12 Sep 2012 16:02:18 +0200
From:	Jan Kara <jack@...e.cz>
To:	Dmitry Monakhov <dmonakhov@...nvz.org>
Cc:	Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org, tytso@....edu,
	wenqing.lz@...bao.com
Subject: Re: [PATCH 4/7] ext4: fsync should wait for DIO writers

On Mon 10-09-12 14:56:04, Dmitry Monakhov wrote:
> On Mon, 10 Sep 2012 11:51:35 +0200, Jan Kara <jack@...e.cz> wrote:
> > > Even more, i_mutex is not held during punch_hole, which obviously
> > > results in dangerous data corruption due to write-after-free.
> >   Yes, that's a bug. I also noticed that but didn't get to fixing it (I'm
> > actually working on a more long term fix using range locking but that's
> > more of a research project so having somehow fixed at least the most
> > blatant locking problems is good).
> Yes, you're right. In order to do things right we should block:
> 1) direct io
> 2) pagecache/mmap users (writeback, readpage)
> 
> I assume I've fixed (1), but (2) still exists.
> 
> My current plan is to do something similar to writeback:
> 
>    down_write(&EXT4_I(inode)->i_data_sem);
>    while (index <= end && pagevec_lookup(&pvec, mapping, index,...)) {
>         lock_page(pvec[i]);
  Here you need to use trylock to avoid possible deadlocks...

>         zero_user_page(pvec[i], 0, PAGE_SIZE);
>         ret = try_to_release_page(pvec[i]);
>    }
>    /* At this moment we know that we locked all pages in range,
>     * NOTE!!!! currently ext_remove_space may drop i_data_sem internally
>     * so it should be modified to exit once i_mutex was dropped
>    */
>    ret = ext4_ext_remove_space(inode, from, to, NO_RELOCK);
>    while (pvec_num) {
>          unlock_page(pvec[i]);
>    }
>    up_write(&EXT4_I(inode)->i_data_sem);
> 
> The number of locked pages should not be too large.
> Or, better still, instead of locking pages en masse, we can lock pages
> one by one and simulate fake writeback, so all new writers will
> wait on that bit, but readers will see zeroes.
>    down_write(&EXT4_I(inode)->i_data_sem);
>    while (index <= end && pagevec_lookup(&pvec, mapping, index,...)) {
>         lock_page(pvec[i]);
>         zero_user_page(pvec[i], 0, PAGE_SIZE);
>         ret = try_to_release_page(pvec[i]);
>         set_page_writeback(pvec[i]);
>         unlock_page(pvec[i]);
>    }
>    
>    ret = ext4_ext_remove_space(inode, from, to, NO_RELOCK);
>    while (pvec_num) {
>          end_page_writeback(pvec[i]);
>    }
>    up_write(&EXT4_I(inode)->i_data_sem);
  Oh, that's a hack. Please don't do that. Using page locks is cleaner,
although I agree it's not very good either. That's why I decided not to
lose time with suboptimal solutions and rather look into range locking...
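
  As a rough user-space illustration of the trylock point above (an analogy
built on pthread mutexes, not kernel code; struct fake_page, trylock_range()
and unlock_range() are hypothetical names made up for this sketch), taking a
batch of locks with trylock and backing off on contention could look like
this:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a page and its lock; not a kernel structure. */
struct fake_page {
	pthread_mutex_t lock;
};

/* Release the locks of pages[0..n-1], which the caller must hold. */
static void unlock_range(struct fake_page *pages, size_t n)
{
	while (n > 0) {
		int rc = pthread_mutex_unlock(&pages[--n].lock);

		assert(rc == 0);
		(void)rc;
	}
}

/*
 * Try to lock pages[0..n-1] in order without ever sleeping on a lock.
 * Returns 0 with all n locks held, or -1 with none held if any lock
 * was contended, so the caller can back off and retry.
 */
static int trylock_range(struct fake_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pthread_mutex_trylock(&pages[i].lock) != 0) {
			/* Contended: drop everything taken so far. */
			unlock_range(pages, i);
			return -1;
		}
	}
	return 0;
}
```

  A caller that gets -1 holds none of the locks, so it can release whatever
outer lock it holds, wait, and retry rather than sleeping on a page lock while
another thread waits for that outer lock.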

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR