Message-ID: <87a8trxboz.fsf@mail.parknet.co.jp>
Date:	Mon, 17 Aug 2015 04:42:04 +0900
From:	OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
To:	Jan Kara <jack@...e.cz>
Cc:	Daniel Phillips <daniel@...nq.net>, David Lang <david@...g.hm>,
	Rik van Riel <riel@...hat.com>, tux3@...3.org,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [FYI] tux3: Core changes

Jan Kara <jack@...e.cz> writes:

> On Sun 09-08-15 22:42:42, OGAWA Hirofumi wrote:
>> Jan Kara <jack@...e.cz> writes:
>> 
>> > I'm not sure about which ENOSPC issue you are speaking BTW. Can you
>> > please elaborate?
>> 
>> 1. GUP simulates a page fault and prepares to modify the page
>> 2. writeback clears dirty and makes the PTE read-only
>> 3. snapshot/reflink makes the block COW
>
> I assume by point 3. you mean that snapshot / reflink happens now and thus
> the page / block is marked as COW. Am I right?

Right.

>> 4. the driver that called GUP modifies the page and dirties it
>>    without simulating a page fault
>
> OK, but this doesn't hit ENOSPC because as you correctly write in point 4.,
> the page gets modified without triggering another page fault so COW for the
> modified page isn't triggered. Modified page contents will be in both the
> original and the reflinked file, won't it?

The result above can also be ENOSPC, depending on the implementation
and the race. Also, if the FS converts zeroed blocks to holes, as
hammerfs does, ENOSPC simply happens: another process uses up all the
free space, and then there is no ->page_mkwrite() callback to check
for ENOSPC.
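
A minimal userspace sketch of steps 1-4 above, as a toy model and not
kernel code. Every struct and function name here (toy_fs, toy_page,
toy_page_mkwrite, and so on) is made up for illustration, and the late
allocation in toy_writeback() is only one assumed implementation:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_fs {
    long free_blocks;     /* blocks still available for allocation */
};

struct toy_page {
    bool pte_writable;    /* PTE allows writes without a fault */
    bool dirty;           /* page has data not yet written back */
    bool block_cow;       /* backing block is shared after snapshot */
    bool space_reserved;  /* a block is reserved for the next writeback */
};

/* Stand-in for ->page_mkwrite(): the only place space is checked early. */
static int toy_page_mkwrite(struct toy_fs *fs, struct toy_page *pg)
{
    if (pg->block_cow && !pg->space_reserved) {
        if (fs->free_blocks == 0)
            return -ENOSPC;          /* reported at fault time */
        fs->free_blocks--;
        pg->space_reserved = true;
    }
    pg->pte_writable = true;
    pg->dirty = true;
    return 0;
}

/* Step 1: GUP simulates a write fault before handing out the page. */
static int toy_gup_pin(struct toy_fs *fs, struct toy_page *pg)
{
    return toy_page_mkwrite(fs, pg);
}

/* Step 2 (and the final flush): clean the page, write-protect the PTE. */
static int toy_writeback(struct toy_fs *fs, struct toy_page *pg)
{
    if (pg->block_cow) {
        if (!pg->space_reserved) {
            /* Nobody checked space up front; try a late allocation. */
            if (fs->free_blocks == 0)
                return -ENOSPC;      /* too late to tell the writer */
            fs->free_blocks--;
        }
        pg->space_reserved = false;
        pg->block_cow = false;       /* data now lives in a private block */
    }
    pg->dirty = false;
    pg->pte_writable = false;
    return 0;
}

/* Step 3: snapshot/reflink marks the block COW. */
static void toy_snapshot(struct toy_page *pg)
{
    pg->block_cow = true;
}

/* Step 4: the driver dirties the page directly, with no fault. */
static void toy_gup_write(struct toy_page *pg)
{
    pg->dirty = true;
}

/* Run steps 1-4 with another process eating all space after step 3. */
static int toy_race(long free_blocks)
{
    struct toy_fs fs = { .free_blocks = free_blocks };
    struct toy_page pg = { 0 };
    int err;

    err = toy_gup_pin(&fs, &pg);     /* 1 */
    if (err)
        return err;
    err = toy_writeback(&fs, &pg);   /* 2 */
    if (err)
        return err;
    toy_snapshot(&pg);               /* 3 */
    fs.free_blocks = 0;              /* other process uses all space */
    toy_gup_write(&pg);              /* 4 */
    assert(pg.dirty && !pg.pte_writable);
    return toy_writeback(&fs, &pg);
}
```

In this model toy_race() fails only in the final writeback, because
step 4 never goes through toy_page_mkwrite(), where the space check
would otherwise have happened.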

> And I agree that the fact that snapshotted file's original contents can
> still get modified is a bug, and one which is difficult to fix.

Yes, that is why I think this logic is an issue even before page
forking.

>> So it sounds like yet another "stable page", i.e. unpredictable
>> performance. (BTW, recalling "stable page", I noticed that "stable
>> page" would not provide stable page data for that logic either.)
>> 
>> Well, assuming "elevated refcount == threshold + waitq/wakeup", IMO
>> it is not attractive. Rather, it is the last option as a design
>> choice, if there are no others.
>
> I agree the performance will be less predictable and that is not good. But
> changing what is visible in the file when writeback races with GUP is a
> worse problem to me.
>
> Maybe if GUP marked pages it got ref for so that we could trigger the slow
> behavior only for them (Peter Zijlstra proposed in [1] an infrastructure so
> that pages pinned by get_user_pages() would be properly accounted and then
> we could use PG_mlocked and elevated refcount as a more reliable indication
> of pages that need special handling).

I haven't read Peter's patchset fully yet, but it looks good, and it
may be similar to the strategy I currently have in mind. I'm also
thinking of adding callbacks for the FS at the start and end of GUP's
pin window. (As just one example, the FS could use the callbacks to
stop writeback if it wants to.)
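
A rough userspace sketch of what such callbacks might look like. This
is purely an assumption for illustration: pin_fs_ops, pin_start,
pin_end and the toy_* helpers are hypothetical names, not an existing
or proposed kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

struct pin_page;

/* Hypothetical per-FS hooks called at the edges of the pin window. */
struct pin_fs_ops {
    void (*pin_start)(struct pin_page *pg);
    void (*pin_end)(struct pin_page *pg);
};

struct pin_page {
    const struct pin_fs_ops *ops;
    int pin_count;         /* number of outstanding GUP-style pins */
    bool dirty;
    bool writeback_held;   /* FS private: writeback deferred while pinned */
};

/* Example FS policy: hold off writeback for the whole pin window. */
static void fs_pin_start(struct pin_page *pg)
{
    pg->writeback_held = true;
}

static void fs_pin_end(struct pin_page *pg)
{
    pg->writeback_held = false;
}

static const struct pin_fs_ops toy_ops = {
    .pin_start = fs_pin_start,
    .pin_end   = fs_pin_end,
};

/* First pin opens the window and notifies the FS. */
static void toy_pin(struct pin_page *pg)
{
    if (pg->pin_count++ == 0 && pg->ops->pin_start)
        pg->ops->pin_start(pg);
}

/* Last unpin closes the window and notifies the FS. */
static void toy_unpin(struct pin_page *pg)
{
    assert(pg->pin_count > 0);
    if (--pg->pin_count == 0 && pg->ops->pin_end)
        pg->ops->pin_end(pg);
}

/* Returns true if writeback ran, false if it was deferred or not needed. */
static bool toy_try_writeback(struct pin_page *pg)
{
    if (pg->writeback_held || !pg->dirty)
        return false;
    pg->dirty = false;
    return true;
}
```

With these hooks, toy_try_writeback() is refused between toy_pin() and
the matching toy_unpin(), and proceeds once the pin window is closed.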

Thanks.
-- 
OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
