lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 26 Aug 2009 18:05:19 +0100
From:	Richard Kennedy <richard@....demon.co.uk>
To:	Miklos Szeredi <miklos@...redi.hu>
Cc:	a.p.zijlstra@...llo.nl, akpm@...ux-foundation.org,
	chris.mason@...cle.com, linux-kernel@...r.kernel.org,
	jens.axboe@...cle.com
Subject: Re: [RFC PATCH] mm: balance_dirty_pages. reduce calls to
 global_page_state to reduce cache references

On Tue, 2009-08-25 at 13:46 +0200, Miklos Szeredi wrote:
> On Fri, 21 Aug 2009, Peter Zijlstra wrote:
> > > I have tried to retain the existing behavior as much as possible, but
> > > have added NR_WRITEBACK_TEMP to nr_writeback. This counter was used in
> > > clip_bdi_dirty_limit but not in balance_dirty_pages, grep suggests this
> > > is only used by FUSE but I haven't done any testing on that. It does
> > > seem logical to count all the WRITEBACK pages when making the throttling
> > > decisions so this change should be more correct ;)
> > 
> > Right, the NR_WRITEBACK_TEMP thing is a FUSE feature, its used in
> > writable mmap() support for FUSE things.
> > 
> > I must admit to forgetting the exact semantics of the things, maybe
> > Miklos can remind us.
> 
> I'll try: fuse is special because it needs writeback to be "very
> asynchronous".  What I mean by this is that writing a dirty page may
> block indefinitely and that shouldn't hold up unrelated filesystem or
> memory operations.
> 
> To satisfy this, fuse copies contents of dirty pages over to
> "temporary pages" and queues the write request with this temporary
> page, not the original page-cache page.
> 
> This has two effects:
> 
>  - the page-cache page does not remain in "writeback" state but is
>    cleaned immediately
> 
>  - the NR_WRITEBACK counter is not incremented for the duration of the
>    writeback
> 
> The first one is important because vmscan and page migration do
> wait_on_page_writeback() in some circumstances, which would block on
> fuse writebacks.
> 
> The second one is important because vmscan will throttle writeout if
> the NR_WRITEBACK counter goes over the dirty threshold
> (throttle_vm_writeout).  There were long discussions about this one,
> but in the end no one could surely tell how this works and why it is
> important.  But NR_WRITEBACK_TEMP must not be counted there, otherwise
> the page scanning can deadlock with fuse filesystems.
> 
> About balance_dirty_pages() I'm not quite sure.  By a similar logic we
> don't want NR_WRITEBACK_TEMP pages to contribute to throttling
> unrelated filesytem writebacks.  That might (through recursion via a
> userspace filesystem) lead to a deadlock.
> 
> So my recommendation is that we should retain the old behavior.
> 
> Thanks,
> Miklos

Thanks for the explanation.

The old behavior was to use NR_WRITEBACK_TEMP to clip the bdi
thresholds.
Should we continue to do this or just ignore NR_WRIEBACK_TEMP completely
in balance_dirty_pages ?

regards
Richard

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ