lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 23 Jan 2008 13:00:57 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Miklos Szeredi <miklos@...redi.hu>
cc:	a.p.zijlstra@...llo.nl, salikhmetov@...il.com, linux-mm@...ck.org,
	jakob@...hought.net, linux-kernel@...r.kernel.org,
	valdis.kletnieks@...edu, riel@...hat.com, ksm@...dk,
	staubach@...hat.com, jesper.juhl@...il.com,
	akpm@...ux-foundation.org, protasnb@...il.com,
	r.e.wolff@...wizard.nl, hidave.darkstar@...il.com,
	hch@...radead.org
Subject: Re: [PATCH -v8 3/4] Enable the MS_ASYNC functionality in
 sys_msync()



On Wed, 23 Jan 2008, Miklos Szeredi wrote:
>
> > [ Side note: it is quite possible that we should not do the 
> >   SYNC_FILE_RANGE_WAIT_BEFORE on MS_ASYNC, and just skip over pages that 
> >   are busily under writeback already.
> 
> MS_ASYNC is not supposed to wait, so SYNC_FILE_RANGE_WAIT_BEFORE
> probably should not be used in that case.

Well, that would leave the page dirty (including in the page tables) if it 
was under page writeback when the MS_ASYNC happened.

So I agree, we shouldn't necessarily wait, but if we want the page tables 
to be cleaned, right now we need to.

> What would be perfect, is if we had a sync mode, that on encountering
> a page currently under writeback, would just do a page_mkclean() on
> it, so we still receive a page fault next time one of the mappings is
> dirtied, so the times can be updated.
> 
> Would there be any difficulties with that?

It would require fairly invasive changes. Right now the actual page 
writeback does effectively:

			...
                        if (wbc->sync_mode != WB_SYNC_NONE)
                                wait_on_page_writeback(page);

                        if (PageWriteback(page) ||
                            !clear_page_dirty_for_io(page)) {
                                unlock_page(page);
                                continue;
                        }

                        ret = (*writepage)(page, wbc, data);
			...

and that "clear_page_dirty_for_io()" really does clear *all* the dirty 
bits, so we absolutely must start writepage() when we have done that. And 
that, in turn, requires that we're not already under writeback.

Is it possible to fix? Sure. We'd have to split up 
clear_page_dirty_for_io() to do it, and do the

	if (mapping && mapping_cap_account_dirty(mapping))
			..

part first (before the PageWriteback() tests), and then doing the

	if (TestClearPageDirty(page))
			...

parts later (after checking that that we're not under page-writeback).

So it's not horribly hard, but it's kind of a separate issue right now. 
And while the *generic* page-writeback is easy enough to fix, I worry 
about low-level filesystems that have their own "writepages()" 
implementation. They could easily get that wrong.

So right now it seems that waiting for writeback to finish is the right 
and safe thing to do (and even so, I'm not actually willing to commit my 
suggested patch in 2.6.24, I think this needs more thinking about)

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ