lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 22 Nov 2011 00:46:59 +0100
From:	Jan Kara <jack@...e.cz>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Wu Fengguang <fengguang.wu@...el.com>,
	linux-fsdevel@...r.kernel.org, Jan Kara <jack@...e.cz>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Christoph Hellwig <hch@....de>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/5] writeback: charge leaked page dirties to active
 tasks

On Mon 21-11-11 13:49:29, Andrew Morton wrote:
> On Mon, 21 Nov 2011 21:03:44 +0800
> Wu Fengguang <fengguang.wu@...el.com> wrote:
> 
> > It's a years long problem that a large number of short-lived dirtiers
> > (eg. gcc instances in a fast kernel build) may starve long-run dirtiers
> > (eg. dd) as well as pushing the dirty pages to the global hard limit.
> > 
> > The solution is to charge the pages dirtied by the exited gcc to the
> > other random dirtying tasks. It sounds not perfect, however should
> > behave good enough in practice, seeing as that throttled tasks aren't
> > actually running so those that are running are more likely to pick it up
> > and get throttled, therefore promoting an equal spread.
> > 
> > --- linux-next.orig/mm/page-writeback.c	2011-11-17 20:57:04.000000000 +0800
> > +++ linux-next/mm/page-writeback.c	2011-11-17 20:57:13.000000000 +0800
> > @@ -1194,6 +1194,7 @@ void set_page_dirty_balance(struct page 
> >  }
> >  
> >  static DEFINE_PER_CPU(int, bdp_ratelimits);
> > +DEFINE_PER_CPU(int, dirty_leaks) = 0;
> 
> This is a poor identifier for a global symbol.  Generally such symbols
> should at least identify what subsystem they belong to.
> 
> Also, this would be a good site at whcih to document the global
> symbol's role.  The writeback code needs a lot of documentation. Of
> the design-level kind.
> 
> >  /**
> >   * balance_dirty_pages_ratelimited_nr - balance dirty memory state
> > @@ -1242,6 +1243,17 @@ void balance_dirty_pages_ratelimited_nr(
> >  			ratelimit = 0;
> >  		}
> >  	}
> > +	/*
> > +	 * Pick up the dirtied pages by the exited tasks. This avoids lots of
> > +	 * short-lived tasks (eg. gcc invocations in a kernel build) escaping
> > +	 * the dirty throttling and livelock other long-run dirtiers.
> > +	 */
> > +	p = &__get_cpu_var(dirty_leaks);
> > +	if (*p > 0 && current->nr_dirtied < ratelimit) {
> > +		nr_pages_dirtied = min(*p, ratelimit - current->nr_dirtied);
> > +		*p -= nr_pages_dirtied;
> > +		current->nr_dirtied += nr_pages_dirtied;
> > +	}
> >  	preempt_enable();
> >  
> >  	if (unlikely(current->nr_dirtied >= ratelimit))
> > --- linux-next.orig/kernel/exit.c	2011-11-17 20:57:02.000000000 +0800
> > +++ linux-next/kernel/exit.c	2011-11-17 20:57:04.000000000 +0800
> > @@ -1037,6 +1037,8 @@ NORET_TYPE void do_exit(long code)
> >  	validate_creds_for_do_exit(tsk);
> >  
> >  	preempt_disable();
> > +	if (tsk->nr_dirtied)
> > +		__this_cpu_add(dirty_leaks, tsk->nr_dirtied);
> 
> Whatever problem this code is solving, it only solved it in certain
> cases.  For example, if tasks are forking, dirtying and exiting at a
> rapid rate on CPU 0 then all the other CPUs don't know anything about
> this and we didn't fix anything.
  I think it will work - dirty_leaks at CPU0 will get leaked pages from the
first task when it dies, the second task will pick them up and as soon as
it dies it puts all the pages back to dirty_leaks at CPU0. This goes on and
on and eventually, dirty_leaks will accumulate enough pages so that a task
dirtying even a single page will enter balance_dirty_pages() as if it
dirtied 'ratelimit' pages.
 
								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ