lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150418235341.GF25265@thunk.org>
Date:	Sat, 18 Apr 2015 19:53:41 -0400
From:	Theodore Ts'o <tytso@....edu>
To:	Jan Kara <jack@...e.cz>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: [PATCH 2/3] ext4: Speedup ext4 orphan inode handling

On Thu, Apr 16, 2015 at 05:42:56PM +0200, Jan Kara wrote:
> Ext4 orphan inode handling is a bottleneck for workloads which heavily
> truncate / unlink small files since it contends on the global
> s_orphan_mutex lock (and generally it's difficult to improve scalability
> of the ondisk linked list of orphaned inodes).
> 
> This patch implements new way of handling orphan inodes. Instead of
> linking orphaned inode into a linked list, we store it's inode number in
> a new special file which we call "orphan file". Currently we still
> protect the orphan file with a spinlock for simplicity but even in this
> setting we can substantially reduce the length of the critical section
> and thus speedup some workloads.

Do we need to store the inode number of the orphan inodes in a file?
We only need to deal with orphaned inode if the journal exists --- so
why not just define a new journal block type, and simply dump all of
the orphaned inodes into one or more journal blocks, which get written
out as part of the commit process?

We can track the orphaned inodes using an in-memory RCU linked list,
so it can be completely lockless, and then in the transaction commit,
we can simply traverse the linked list and write out all of orphaned
inodes to the journal.  I think this would be faster and simpler, and
the only real issue is that we'll need to plumb this interface down
into the jbd2 layer.  But I don't think that would be too difficult.

What do you think?

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ