Message-ID: <20100722091930.GD13117@csn.ul.ie>
Date: Thu, 22 Jul 2010 10:19:30 +0100
From: Mel Gorman <mel@....ul.ie>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: Johannes Weiner <hannes@...xchg.org>, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
Dave Chinner <david@...morbit.com>,
Chris Mason <chris.mason@...cle.com>,
Nick Piggin <npiggin@...e.de>, Rik van Riel <riel@...hat.com>,
Christoph Hellwig <hch@...radead.org>,
Wu Fengguang <fengguang.wu@...el.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>
Subject: Re: [PATCH 4/8] vmscan: Do not writeback filesystem pages in
direct reclaim
On Thu, Jul 22, 2010 at 08:57:34AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 21 Jul 2010 15:27:10 +0100
> Mel Gorman <mel@....ul.ie> wrote:
>
> > On Wed, Jul 21, 2010 at 09:01:11PM +0900, KAMEZAWA Hiroyuki wrote:
>
> > > But, hmm, memcg will have to decide whether to enter this routine based on
> > > the result of the 1st memory reclaim.
> > >
> >
> > It has the option of ignoring pages being dirtied, but I worry that the
> > container could be filled with dirty pages waiting for flushers to do
> > something.
>
> I'll prepare dirty_ratio for memcg. It's not easy but requested by I/O cgroup
> guys, too...
>
I can see why it might be difficult. Dirty pages are not currently counted
on a per-container basis, so it would require either additional
infrastructure to account for them or a lot of scanning.
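
To give a flavour of the kind of infrastructure that would need, here is a
purely hypothetical sketch (the helper and field names below are invented
for illustration and are not an existing memcg interface): the page-dirtying
path would have to bump a per-container counter so that a dirty_ratio could
be enforced without scanning the LRU lists.

    /*
     * Hypothetical sketch only: the page_to_memcg() lookup and the
     * nr_dirty counter are illustrative names, not existing memcg code.
     */
    static void memcg_account_page_dirtied(struct page *page)
    {
            struct mem_cgroup *memcg = page_to_memcg(page);  /* hypothetical lookup */

            if (memcg)
                    atomic_long_inc(&memcg->nr_dirty);       /* hypothetical counter */
    }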
>
> >
> > > >
> > > > - /*
> > > > - * The attempt at page out may have made some
> > > > - * of the pages active, mark them inactive again.
> > > > - */
> > > > - nr_active = clear_active_flags(&page_list, NULL);
> > > > - count_vm_events(PGDEACTIVATE, nr_active);
> > > > + while (nr_reclaimed < nr_taken && nr_dirty && dirty_retry--) {
> > > > + wakeup_flusher_threads(laptop_mode ? 0 : nr_dirty);
> > > > + congestion_wait(BLK_RW_ASYNC, HZ/10);
> > > >
> > >
> > > Is congestion_wait() required? Where does the congestion happen?
> > > I'm sorry if you already have some other trick for this in another patch.
> > >
> >
> > It's to wait for the IO to occur.
> >
>
> A 1-tick penalty seems too large. I hope we can have some waitqueue in the future.
>
If congestion occurs, congestion_wait() sleeps on a waitqueue that is woken
when congestion clears. I didn't measure it this time around, but I doubt it
waits the full HZ/10 much of the time.
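
A minimal sketch of what congestion_wait() does (simplified from the kernel
code of this era, so treat the details as approximate): the caller sleeps on
a per-direction waitqueue with a timeout, and clearing congestion wakes the
queue early, so the full HZ/10 is only slept when no writeback completes in
the meantime.

    long congestion_wait(int sync, long timeout)
    {
            long ret;
            DEFINE_WAIT(wait);
            wait_queue_head_t *wqh = &congestion_wqh[sync];

            /*
             * Sleep until the timeout expires or clear_bdi_congested()
             * wakes this queue, whichever happens first.
             */
            prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
            ret = io_schedule_timeout(timeout);
            finish_wait(wqh, &wait);
            return ret;
    }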
> > > > - nr_reclaimed += shrink_page_list(&page_list, sc, PAGEOUT_IO_SYNC);
> > > > + /*
> > > > + * The attempt at page out may have made some
> > > > + * of the pages active, mark them inactive again.
> > > > + */
> > > > + nr_active = clear_active_flags(&page_list, NULL);
> > > > + count_vm_events(PGDEACTIVATE, nr_active);
> > > > +
> > > > + nr_reclaimed += shrink_page_list(&page_list, sc,
> > > > + PAGEOUT_IO_SYNC, &nr_dirty);
> > > > + }
> > >
> > > Just a question: does this PAGEOUT_IO_SYNC have some meaning here?
> > >
> >
> > Yes, in pageout it will wait on pages currently being written back to be
> > cleaned before trying to reclaim them.
> >
> Hmm. IIUC, this routine is called only when !current_is_kswapd() and
> pageout is done only when current_is_kswapd(). So, this seems ....
> Wrong?
>
Both direct reclaim and kswapd can reach shrink_inactive_list:

Direct reclaim
  do_try_to_free_pages
    -> shrink_zones
      -> shrink_zone
        -> shrink_list
          -> shrink_inactive_list  <--- the routine in question

Kswapd
  balance_pgdat
    -> shrink_zone
      -> shrink_list
        -> shrink_inactive_list
pageout() is still called by direct reclaim if the page is anon, so it will
synchronously wait on those pages when PAGEOUT_IO_SYNC is set. For either
anon or file pages that are currently under writeback, the writeback is
waited on in shrink_page_list() when PAGEOUT_IO_SYNC is set.

So it still has meaning. Did I miss something?
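
For reference, the writeback handling in shrink_page_list() is roughly the
following (paraphrased from the kernel code of this era, so treat it as a
sketch rather than the exact code):

    if (PageWriteback(page)) {
            /*
             * Only synchronous (lumpy) reclaim waits for writeback to
             * complete; asynchronous reclaim skips the page and keeps
             * it on the list.
             */
            if (sync_writeback == PAGEOUT_IO_SYNC && may_enter_fs)
                    wait_on_page_writeback(page);
            else
                    goto keep_locked;
    }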
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab