Message-ID: <53D8A258.7010904@lge.com>
Date:	Wed, 30 Jul 2014 16:44:24 +0900
From:	Gioh Kim <gioh.kim@....com>
To:	Jan Kara <jack@...e.cz>, Peter Zijlstra <peterz@...radead.org>
CC:	Alexander Viro <viro@...iv.linux.org.uk>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	Theodore Ts'o <tytso@....edu>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	linux-ext4@...r.kernel.org, linux-mm@...ck.org,
	Minchan Kim <minchan@...nel.org>,
	Joonsoo Kim <js1304@...il.com>
Subject: Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in
 non-movable area



On 2014-07-22 6:38 PM, Jan Kara wrote:
> On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
>> On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
>>> Hello,
>>>
>>> This patch tries to solve the problem that a long-lived page cache
>>> page for the ext4 superblock disturbs page migration.
>>>
>>> I've been testing the CMA feature on my ARM-based platform
>>> and found that some page cache pages cannot be migrated.
>>> Some of them are page cache pages for the superblock of an ext4 filesystem.
>>>
>>> Currently ext4 reads the superblock with sb_bread(). sb_bread() allocates the page
>>> from the movable area. But the problem is that ext4 holds the page until
>>> it is unmounted. If the root filesystem is ext4, the page can never be migrated.
>>>
>>> I introduce a new API for allocating a page from the non-movable area.
>>> It is useful for ext4 and others that want to hold a page cache page for a long time.
>>
>> There's no word on why you can't teach ext4 to still migrate that page.
>> For all I know it might be impossible, but at least mention why.

I am sorry for the lack of detail.

In ext4_fill_super() the buffer-head of the superblock is stored in sbi->s_sbh.
The page that belongs to the buffer-head is allocated from the movable area.
To migrate the page, the buffer-head must first be released via brelse().
But brelse() is not called until unmount.

In contrast, fat_fill_super() reads the superblock via sb_bread()
and releases it via brelse() immediately. Therefore the page that stores
the superblock can be migrated.
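
To illustrate the difference between the two patterns, here is a minimal
userspace sketch. It is only a model, not kernel code: every name prefixed
with model_ is hypothetical, and the single b_count field merely stands in
for the buffer-head reference count that sb_bread() raises and brelse() drops.

```c
/* Userspace model only: names prefixed model_ are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

struct model_bh {
    int b_count;    /* stands in for the buffer-head reference count */
};

/* Like sb_bread(): hands back the buffer with one extra reference held. */
static struct model_bh *model_sb_bread(struct model_bh *bh)
{
    bh->b_count++;
    return bh;
}

/* Like brelse(): drops one reference. */
static void model_brelse(struct model_bh *bh)
{
    assert(bh->b_count > 0);
    bh->b_count--;
}

/* In this model the backing page can move only when nobody holds the buffer. */
static bool model_page_can_migrate(const struct model_bh *bh)
{
    return bh->b_count == 0;
}

/* fat-style: read the superblock, use it, release it right away. */
static bool model_fat_style_mount(struct model_bh *sb_buf)
{
    struct model_bh *bh = model_sb_bread(sb_buf);
    /* ... the needed fields would be copied out of the buffer here ... */
    model_brelse(bh);
    return model_page_can_migrate(sb_buf);
}

/* ext4-style: keep the buffer (as sbi->s_sbh does) until unmount. */
static bool model_ext4_style_mount(struct model_bh *sb_buf,
                                   struct model_bh **s_sbh)
{
    *s_sbh = model_sb_bread(sb_buf);
    return model_page_can_migrate(sb_buf);
}
```

In this model the fat-style mount leaves the page migratable, while the
ext4-style mount pins it until model_brelse() is finally called on the
stored pointer, which corresponds to unmount.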



>    It doesn't seem to be worth the effort to make that page movable to me
> (it's reasonably doable since superblock buffer isn't accessed in *that*
> many places but single movable page doesn't seem like a good tradeoff for
> the complexity).
>
> But this made me look into the migration code and it isn't completely clear
> to me what makes the migration code decide that sb buffer isn't movable? We
> seem to be locking the buffers before moving the underlying page but we
> don't do any reference or state checks on the buffers... That seems to be
> assuming that no one looks at bh->b_data without holding buffer lock. That
> is likely true for ordinary data but definitely not true for metadata
> buffers (i.e., buffers for pages from block device mappings).

The sb buffer is not movable because it is never released.
sb_bread() increases the reference counter of the buffer-head, so
the page of the buffer-head cannot be moved.

sb_bread() allocates the page from the movable area, but the page cannot be
moved until the reference counter of the buffer-head drops to zero.
There is no lock held on the buffer, but the reference counter acts like a lock.
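
The "reference counter acts like a lock" behaviour can be sketched as
follows. Again this is a simplified userspace model under my own
assumptions, not the kernel's migration code: the model_ names and the
choice of four buffers per page are hypothetical. The point it shows is
that a single held reference on any buffer is enough to pin the whole page.

```c
/* Userspace model only: not the kernel's migration path. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BUFFERS_PER_PAGE 4    /* assumed value, e.g. 1K blocks in a 4K page */

struct model_buffer {
    int refcount;                   /* stands in for the buffer-head reference count */
};

struct model_page {
    struct model_buffer bufs[MODEL_BUFFERS_PER_PAGE];
    bool moved;                     /* set once the page has been "migrated" */
};

/* Take a reference on one buffer, as a reader holding the buffer would. */
static void model_get(struct model_buffer *b)
{
    b->refcount++;
}

/* Drop a reference on one buffer. */
static void model_put(struct model_buffer *b)
{
    assert(b->refcount > 0);
    b->refcount--;
}

/*
 * Try to migrate the page. The attempt is refused while any buffer on
 * the page is still referenced; otherwise the page is marked as moved.
 */
static bool model_try_migrate(struct model_page *page)
{
    for (size_t i = 0; i < MODEL_BUFFERS_PER_PAGE; i++) {
        if (page->bufs[i].refcount != 0)
            return false;
    }
    page->moved = true;
    return true;
}
```

With one reference held, model_try_migrate() keeps failing no matter how
often it is retried, and it succeeds as soon as model_put() drops the last
reference.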

Actually it is strange that ext4 keeps the buffer-head in the superblock
structure until unmount (which can be a long time).
I think the buffer-head should be released immediately, as fat_fill_super() does.
But I believe ext4 has a reason to keep the buffer-head, which is why I suggest this patch.



>
> Added linux-mm to CC to enlighten me a bit ;)
>
> 								Honza
>