Message-ID: <555CB4B6.8050305@phunq.net>
Date:	Wed, 20 May 2015 09:22:14 -0700
From:	Daniel Phillips <daniel@...nq.net>
To:	Jan Kara <jack@...e.cz>, David Lang <david@...g.hm>
CC:	Rik van Riel <riel@...hat.com>, tux3@...3.org,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
Subject: Re: [FYI] tux3: Core changes



On 05/20/2015 07:44 AM, Jan Kara wrote:
> On Tue 19-05-15 13:33:31, David Lang wrote:
>> On Tue, 19 May 2015, Daniel Phillips wrote:
>>
>>>> I understand that Tux3 may avoid these issues due to some other mechanisms
>>>> it internally has but if page forking should get into mm subsystem, the
>>>> above must work.
>>>
>>> It does work, and by example, it does not need a lot of code to make
>>> it work, but the changes are not trivial. Tux3's delta writeback model
>>> will not suit everyone, so you can't just lift our code and add it to
>>> Ext4. Using it in Ext4 would require a per-inode writeback model, which
>>> looks practical to me but far from a weekend project. Maybe something
>>> to consider for Ext5.
>>>
>>> It is the job of new designs like Tux3 to chase after that final drop
>>> of performance, not our trusty Ext4 workhorse. Though stranger things
>>> have happened - as I recall, Ext4 had O(n) directory operations at one
>>> time. Fixing that was not easy, but we did it because we had to. Fixing
>>> Ext4's write performance is not urgent by comparison, and the barrier
>>> is high: you would want jbd3, for one thing.
>>>
>>> I think the meta-question you are asking is, where is the second user
>>> for this new CoW functionality? With a possible implication that if
>>> there is no second user then Tux3 cannot be merged. Is that the
>>> question?
>>
>> I don't think they are asking for a second user. What they are
>> saying is that for this functionality to be accepted in the mm
>> subsystem, these problem cases need to work reliably, not just work
>> for Tux3 because of your implementation.
>>
>> So for things that you don't use, you need to make it an error if
>> they get used on a page that's been forked (or not be an error and
>> 'do the right thing')
>>
>> For cases where it doesn't matter because Tux3 controls the
>> writeback, and it's undefined in general what happens if writeback
>> is triggered twice on the same page, you will need to figure out how
>> to either prevent the second writeback from triggering if there's
>> one in process, or define how the two writebacks are going to happen
>> so that you can't end up with them re-ordered by some other
>> filesystem.
>>
>> I think that that's what's meant by the top statement that I left in
>> the quote. Even if your implementation details make it safe, these
>> need to be safe even without your implementation details to be
>> acceptable in the core kernel.
>   Yeah, that's what I meant. If you create a function which manipulates
> page cache, you better make it work with other functions manipulating page
> cache. Otherwise it's a landmine waiting to be tripped by some unsuspecting
> developer. Sure you can document all the conditions under which the
> function is safe to use but a function that has several paragraphs in front
> of it explaining when it is safe to use isn't very good API...

Violent agreement, of course. To put it in concrete terms, each of
the page fork support functions must be examined and determined
sane. They are:

 * cow_replace_page_cache
 * cow_delete_from_page_cache
 * cow_clone_page
 * page_cow_one
 * page_cow_file

Would it be useful to drill down into those, starting from the top
of the list?

Regards,

Daniel