Date:	Thu, 06 Oct 2011 10:36:46 -0700
From:	Allison Henderson <achender@...ux.vnet.ibm.com>
To:	Lukas Czerner <lczerner@...hat.com>
CC:	Ext4 Developers List <linux-ext4@...r.kernel.org>,
	"Ted Ts'o" <tytso@....edu>, Christoph Hellwig <hch@...radead.org>
Subject: Re: Plan for reducing i_mutex in ext4

On 10/04/2011 01:38 AM, Lukas Czerner wrote:
> On Mon, 3 Oct 2011, Allison Henderson wrote:
>
>> Hi all,
>>
>> I've been working on locating all the existing uses of i_mutex in the current
>> ext4 code because I know we are planning to reduce the usage of i_mutex in
>> ext4.  So I've gone through the ext4 code and also the vfs code and come up
>> with a list of ext4 items that appear to be protected under i_mutex.  I'm
>> thinking about doing a patch to replace i_mutex with a private ext4 mutex, and
>> I wanted to update folks on this idea and pick up any feedback people might
>> have.
>>
>> I'm thinking maybe we can have a separate mutex for functions that only modify
>> metadata like ext4_ioctl and ext4_setattr to help relieve unneeded
>> contention.  And then the rest of the functions that are modifying data can go
>> under a data mutex (including truncate since sometimes ext4_ioctl and
>> ext4_setattr will call ext4_truncate if they modify i_size).
>
> Just the other day I was talking with Christoph (adding him to cc) about
> this. Unfortunately I have not had time to look at it yet, so I am glad
> that someone did.
>
> His suggestion was a bit more general than creating a separate
> ext4-specific mutex. His idea was to change i_mutex to a union of a plain
> mutex for directories and a rwlock for regular files. This union could then
> be used in other file systems as well, for example to replace xfs_iolock in
> xfs.
>
> Also it might be nice to do something smarter than just a rwlock for
> regular files. It would be nice to have a structure of extent locks that
> file systems using extents could share, which would improve scalability
> when a single file is hammered by several processes.
>
> Note that concurrent reads and writes in ext4 are currently atomic only
> with respect to individual pages, not across the system call as a whole.
> This might cause read() to return data mixed from several different
> writes, which does not conform to POSIX. That could be solved with the
> generic rwlock for files, or even better with a system of extent locking.
>
> But Christoph can probably describe his idea a bit better.
>
> Thanks!
> -Lukas

Hi Lukas,

Sorry for the delay, and thanks for the response :)  Alrighty, I will
have to do some prototyping and see if I can work some of these
concepts into a solution.  At the moment, I'm trying to make sure I come
up with something that still provides all the existing functionality so
I don't introduce any new race problems, but there's certainly a lot of
room for optimizing too.  Thx!

Allison Henderson
>
>>
>> So these are ext4 functions that currently lock i_mutex:
>>
>> ext4_sync_file
>> ext4_fallocate
>> ext4_move_extents via two helper routines:
>> 	mext_inode_double_lock and mext_inode_double_unlock
>> ext4_ioctl (for the EXT4_IOC_SETFLAGS ioctl)
>> ext4_quota_write
>> ext4_llseek
>> ext4_end_io_work
>> ext4_evict_inode (only while calling ext4_flush_completed_IO)
>> ext4_ind_direct_IO (only while calling ext4_flush_completed_IO)
>>
>>
>> And these are ext4 functions that have i_mutex locked by the vfs layer.  So we
>> will need to lock the new private mutex here too if we want them to be
>> synchronized with the above functions.
>>
>> ext4_setattr
>> ext4_da_writepages
>> ext4_rmdir
>> ext4_unlink
>> ext4_symlink
>> ext4_link
>> ext4_rename
>>
>> And one unique case:
>> ext4_fiemap calls generic_block_fiemap and passes it a function pointer to
>> ext4_get_block.  generic_block_fiemap will lock i_mutex before calling the
>> pointer.  I don't think ext4_get_block needs i_mutex locked all the time, so I
>> think we can just make a wrapper for ext4_get_block that locks the new private
>> mutex and then we can pass a pointer to the wrapper.
>>
>>
>> That's my list so far.  If anyone knows of one I missed, please let me know,
>> and also if you spot any other places where we can reduce unneeded contention
>> by using a separate lock.  Thx!
>>
>> Allison Henderson
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
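
The wrapper idea from the ext4_fiemap case quoted above can be sketched in
user space as follows. This is a rough illustration under stated
assumptions, not ext4 code: the types and the demo_* functions are
hypothetical stand-ins, the block mapping is a toy, and the callback
signature is simplified (the kernel's get_block_t takes an inode, a block
number, a buffer_head and a create flag).

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for an inode carrying the proposed private lock. */
struct demo_inode {
    pthread_mutex_t private_mutex;  /* stands in for the new ext4 mutex */
    long first_block;               /* toy mapping: physical = first + logical */
};

/* Simplified callback type; the real get_block_t has a different signature. */
typedef int (*demo_get_block_t)(struct demo_inode *inode, long logical,
                                long *physical);

/* Unlocked mapping routine, playing the role of ext4_get_block. */
static int demo_get_block(struct demo_inode *inode, long logical,
                          long *physical)
{
    if (logical < 0)
        return -1;
    *physical = inode->first_block + logical;
    return 0;
}

/* Wrapper that takes the private mutex only around the mapping call. */
static int demo_get_block_locked(struct demo_inode *inode, long logical,
                                 long *physical)
{
    int ret;

    pthread_mutex_lock(&inode->private_mutex);
    ret = demo_get_block(inode, logical, physical);
    pthread_mutex_unlock(&inode->private_mutex);
    return ret;
}

/* Stand-in for a generic helper that only sees the function pointer. */
static int demo_generic_walk(struct demo_inode *inode, long start, long count,
                             demo_get_block_t get_block, long *out)
{
    for (long i = 0; i < count; i++) {
        int ret = get_block(inode, start + i, &out[i]);
        if (ret)
            return ret;
    }
    return 0;
}
```

Passing demo_get_block_locked to the generic helper means the helper needs
no knowledge of the private lock, while callers that already hold it can
keep calling the unlocked demo_get_block directly.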
