Date:	Wed, 29 Apr 2009 22:34:07 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	torvalds@...ux-foundation.org, nickpiggin@...oo.com.au,
	mingo@...e.hu, kosaki.motohiro@...fujitsu.com,
	a.p.zijlstra@...llo.nl, thomas.pi@...or.dea, ylalym@...il.com,
	linux-kernel@...r.kernel.org, ltt-dev@...ts.casi.polymtl.ca
Subject: Re: [PATCH] Fix dirty page accounting in
	redirty_page_for_writepage()

* Andrew Morton (akpm@...ux-foundation.org) wrote:
> On Wed, 29 Apr 2009 19:25:46 -0400
> Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
> 
> > Basically, the following execution :
> > 
> > dd if=/dev/zero of=/tmp/testfile
> > 
> > will slowly fill _all_ ram available without taking into account memory
> > pressure.
> > 
> > This is because the dirty page accounting is incorrect in
> > redirty_page_for_writepage.
> > 
> > This patch adds missing dirty page accounting in redirty_page_for_writepage().
> 
> The patch changes __set_page_dirty_nobuffers(), not
> redirty_page_for_writepage().
> 
> __set_page_dirty_nobuffers() has a huge number of callers.
> 

Right.

> > --- linux-2.6-lttng.orig/mm/page-writeback.c	2009-04-29 18:14:48.000000000 -0400
> > +++ linux-2.6-lttng/mm/page-writeback.c	2009-04-29 18:23:59.000000000 -0400
> > @@ -1237,6 +1237,12 @@ int __set_page_dirty_nobuffers(struct pa
> >  		if (!mapping)
> >  			return 1;
> >  
> > +		/*
> > +		 * Take care of setting back page accounting correctly.
> > +		 */
> > +		inc_zone_page_state(page, NR_FILE_DIRTY);
> > +		inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
> > +
> >  		spin_lock_irq(&mapping->tree_lock);
> >  		mapping2 = page_mapping(page);
> >  		if (mapping2) { /* Race with truncate? */
> > 
> 
> But __set_page_dirty_nobuffers() calls account_page_dirtied(), which
> already does the above two operations.  afaict we're now
> double-accounting.
> 

Yes, you are right.
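
To make the double-accounting concrete, here is a rough userspace model.
It is only a sketch: every toy_* name is hypothetical, the struct merely
stands in for the NR_FILE_DIRTY and BDI_RECLAIMABLE counters, and it
assumes each dirtied page is later cleaned with exactly one decrement of
each counter. None of this is the actual kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the two counters discussed; not the kernel types. */
struct toy_counters {
	long nr_file_dirty;	/* models NR_FILE_DIRTY */
	long bdi_reclaimable;	/* models BDI_RECLAIMABLE */
};

/* Models account_page_dirtied(): one increment of each counter. */
static void toy_account_page_dirtied(struct toy_counters *c)
{
	c->nr_file_dirty++;
	c->bdi_reclaimable++;
}

/*
 * Models the set-dirty path. With with_patch set, the two extra
 * increments from the proposed patch run before the normal accounting.
 */
static void toy_set_page_dirty(struct toy_counters *c, bool with_patch)
{
	if (with_patch) {
		c->nr_file_dirty++;
		c->bdi_reclaimable++;
	}
	toy_account_page_dirtied(c);
}

/* Models cleaning the page (assumed: one decrement of each counter). */
static void toy_clear_page_dirty(struct toy_counters *c)
{
	c->nr_file_dirty--;
	c->bdi_reclaimable--;
}

/* Returns the leftover dirty count after a number of dirty/clean cycles. */
static long toy_leak_after(long cycles, bool with_patch)
{
	struct toy_counters c = { 0, 0 };
	long i;

	for (i = 0; i < cycles; i++) {
		toy_set_page_dirty(&c, with_patch);
		toy_clear_page_dirty(&c);
	}
	return c.nr_file_dirty;
}
```

In this model toy_leak_after(n, false) stays balanced at zero, while
toy_leak_after(n, true) drifts upward by one per cycle, which is the
double-accounting you describe.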

> Now, it's possible that the accounting goes wrong very occasionally in
> the "/* Race with truncate?  */" case.  If the truncate path clears the
> page's dirty bit then it will decrement the dirty-page accounting, but
> this code path will fail to perform the increment of the dirty-page
> accounting.  IOW, once this function has set PG_Dirty, it is committed
> to altering some or all of the page-dirty accounting.
> 
> But afaict your test case will not trigger the race-with-truncate anyway?
> 
> Can you determine at approximately what frequency (pages-per-second)
> this accounting leak is occurring in your test?
> 

Zero per minute, actually. I added a printk to the else branch marked
below:

if (mapping2) {

} else {
  <--
}

It never triggered in my tests.

I am currently trying to figure out whether I can reproduce the OOM
problems I experienced with 2.6.29-rc3. I am investigating memory
accounting by turning the accounting code into a slow, cache-line-bouncing
version and by adding assertions that the per-zone global counters never
go below zero. Unbalanced accounting could have some nasty long-term
effects on memory pressure accounting.
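
The idea behind that debugging version can be sketched in userspace C11
as follows. This is only an illustration under my own assumptions:
toy_zone_stat and toy_mod_zone_stat() are hypothetical names, and a
single shared atomic stands in for the per-zone counter (exact on every
update, at the price of bouncing one cache line between CPUs). It is not
the debugging patch itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* One shared counter, updated atomically by every caller: slow but exact. */
static atomic_long toy_zone_stat;

/*
 * Apply delta to the shared counter. Returns false when the update
 * drove the counter below zero, i.e. when the accounting is unbalanced.
 */
static bool toy_mod_zone_stat(long delta)
{
	/* atomic_fetch_add() returns the value held before the addition. */
	long old = atomic_fetch_add(&toy_zone_stat, delta);

	return old + delta >= 0;
}
```

A caller would check the return value after every update, so that an
unbalanced decrement is caught at the point where it happens rather than
much later as odd memory-pressure behavior.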

But so far the memory accounting code looks solid, so this one is my bad.
I cannot reproduce the behavior I noticed with 2.6.29-rc3, so I guess we
should consider this a non-issue (or code 9 if you prefer). ;)

Thanks for looking into this.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68