Date:	Tue, 03 Mar 2009 17:36:05 +0900
From:	Akira Fujita <a-fujita@...jp.nec.com>
To:	Theodore Tso <tytso@....edu>
CC:	linux-ext4@...r.kernel.org
Subject: Re: [RFC][PATCH 2/3] ext4: Exchange the blocks between two inodes

Hi Ted,
Thank you for your comments. :-)

Theodore Tso wrote:
>> +
>> +	up_write(&EXT4_I(org_inode)->i_data_sem);
>> +	ret = a_ops->write_begin(o_filp, mapping, offs, data_len, w_flags,
>> +								&page, &fsdata);
>> +	down_write(&EXT4_I(org_inode)->i_data_sem);
>>     
>
> This is going to be a problem.  Once we release i_data_sem, there is
> the possibility that other processes which might be running and
> accessing the file at the same time that the defragger is running
> could be blocked waiting for i_data_sem to be released.  The moment it
> gets released, they will grab the lock then start to modify extent
> tree --- either allocating new blocks to it, or worse, truncating or
> unlinking the target inode.
>
> This is going to be a mess to fix, since Linux doesn't have recursive
> locking primitives.  We do take i_mutex, which will protect us against
> truncates, but it won't protect against a write() system call.  Also,
> if the inode has delayed allocation blocks pending, those could get
> written out by the page cleaner, and i_mutex won't protect us against
> changes to i_data_sem, either.
>
>   
As you said, we take i_mutex at the start of ext4_defrag()
and hold it until the end of the function,
so the original file is protected against truncation and
can never be shrunk during defrag.

On the other hand, the semaphore is released and retaken around
write_begin() for every page in ext4_defrag_partial(), so it does not
protect the original file against a write() system call from another process.
As a result, the defrag result (fragmentation) might not be optimal,
but at least data corruption does not occur.
So I think it is not a serious problem.

As described above, it is not necessary to hold the semaphore across
the whole of ext4_defrag(); we should lock only the points that need it.
Defrag V1 therefore seems to have unneeded lock points.
I will change the lock points and the semaphore type in the next version.

Am I overlooking something?

Regards,
Akira Fujita


> We could add special-case kludgery to wrap around all of the places
> that take or release the i_data_sem so that we get recursive locking
> support --- but that would be very ugly indeed.
>
> I'm not sure what's the best way to deal with this; maybe if we sleep
> on it someone will come up with a better suggestion --- but it's
> something that we have to figure out.
>
> 	  					- Ted
>
>   

