Message-Id: <20080705150401.8bd28b71.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Sat, 5 Jul 2008 15:04:01 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	YAMAMOTO Takashi <yamamoto@...inux.co.jp>,
	linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Nick Piggin <nickpiggin@...oo.com.au>
Subject: Re: [PATCH] fix task dirty balancing
On Wed, 02 Jul 2008 22:27:18 +0200
Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> On Wed, 2008-07-02 at 17:26 +0900, YAMAMOTO Takashi wrote:
> > hi,
> > 
> > task_dirty_inc doesn't seem to be called properly for
> > filesystems which don't use set_page_dirty for write(2).
> > eg. ext2 w/o nobh option.
> 
> I'm thinking this is an ext2 bug. So I'd rather it'd just call
> set_page_dirty() like a proper filesystem instead of doing things like
> this.
> 
> And I certainly don't like exporting task_dirty_inc() - filesystems and
> the like should not have to know about things like that.
> 
Hmm, this is a bit complicated for me.
First, there are two __set_page_dirty() functions in the kernel:
  - mm/page-writeback.c: __set_page_dirty()
              .... set_page_dirty() calls this.
  - fs/buffer.c: __set_page_dirty()
              .... __set_page_dirty_buffers() and mark_buffer_dirty() call this.
Why is per-task dirty accounting done in mm/page-writeback.c: set_page_dirty()?
The other accounting seems to be done in fs/buffer.c: __set_page_dirty().
Is the purpose of task-dirty accounting different from the others?
= fs/buffer.c
static int __set_page_dirty(struct page *page,
                struct address_space *mapping, int warn)
{
        if (unlikely(!mapping))
                return !TestSetPageDirty(page);

        if (TestSetPageDirty(page))
                return 0;

        write_lock_irq(&mapping->tree_lock);
        if (page->mapping) {    /* Race with truncate? */
                WARN_ON_ONCE(warn && !PageUptodate(page));

                if (mapping_cap_account_dirty(mapping)) {
                        __inc_zone_page_state(page, NR_FILE_DIRTY);
                        __inc_bdi_stat(mapping->backing_dev_info,
                                        BDI_RECLAIMABLE);
                        task_io_account_write(PAGE_CACHE_SIZE);
                }
                radix_tree_tag_set(&mapping->page_tree,
                                page_index(page), PAGECACHE_TAG_DIRTY);
        }
        write_unlock_irq(&mapping->tree_lock);
        __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);

        return 1;
}
==
And doesn't the task dirty limit have to take care of the following two cases?
  - __set_page_dirty_nobuffers(struct page *page) (increments BDI_RECLAIMABLE)
  - test_set_page_writeback() (increments BDI_RECLAIMABLE)
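
To make the question concrete, here is a small userspace model of the two
dirtying paths. It is only a sketch, not kernel code: every model_* name is
hypothetical, the counters stand in for tsk->dirties and NR_FILE_DIRTY, and
locking, the buffer-level dirty bit and the BDI counters are left out. The
"patched" flag is my assumption of how to model the change in the patch
quoted below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel types. */
struct model_task { unsigned long dirties; };   /* models tsk->dirties */
struct model_page { bool dirty; };              /* models the PG_dirty bit */

static unsigned long nr_file_dirty;             /* models NR_FILE_DIRTY */

/*
 * Models the part both __set_page_dirty() variants share: test-and-set the
 * dirty bit and bump the global counter. Returns 1 only on a clean -> dirty
 * transition, 0 if the page was already dirty.
 */
static int model_account_page_dirty(struct model_page *page)
{
        assert(page != NULL);
        if (page->dirty)
                return 0;
        page->dirty = true;
        nr_file_dirty++;
        return 1;
}

/* Models mm/page-writeback.c: set_page_dirty(), which also charges the task. */
static int model_set_page_dirty(struct model_task *tsk, struct model_page *page)
{
        int ret;

        assert(tsk != NULL);
        ret = model_account_page_dirty(page);
        if (ret)
                tsk->dirties++;         /* stands in for task_dirty_inc(current) */
        return ret;
}

/*
 * Models fs/buffer.c: mark_buffer_dirty(). With patched == false the page is
 * accounted globally but the task is not charged; with patched == true the
 * task is charged when the page is newly dirtied, as in the quoted patch.
 */
static void model_mark_buffer_dirty(struct model_task *tsk,
                                    struct model_page *page, bool patched)
{
        assert(tsk != NULL);
        if (model_account_page_dirty(page) && patched)
                tsk->dirties++;
}
```

In this model, a page dirtied through model_mark_buffer_dirty() with
patched == false raises nr_file_dirty but leaves the task's count alone,
which is the gap the original report describes.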
Thanks,
-Kame
> Of course I'm utterly ignorant of filesystems, hence lets include more
> clue-full people.
> 
> > YAMAMOTO Takashi
> > 
> > 
> > Signed-off-by: YAMAMOTO Takashi <yamamoto@...inux.co.jp>
> > ---
> > 
> > commit e68f05bf56d0652c107bba1cff3f8491e41a2117
> > Author: YAMAMOTO Takashi <yamamoto@...inux.co.jp>
> > Date:   Wed Jul 2 16:17:33 2008 +0900
> > 
> >     fix dirty balancing for tasks.
> >     
> >     call task_dirty_inc when dirtying a page with mark_buffer_dirty.
> > 
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index 4788a9e..2f1c7c6 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -1219,8 +1219,9 @@ void mark_buffer_dirty(struct buffer_head *bh)
> >  			return;
> >  	}
> >  
> > -	if (!test_set_buffer_dirty(bh))
> > -		__set_page_dirty(bh->b_page, page_mapping(bh->b_page), 0);
> > +	if (!test_set_buffer_dirty(bh) &&
> > +	    __set_page_dirty(bh->b_page, page_mapping(bh->b_page), 0))
> > +		task_dirty_inc(current);
> >  }
> >  
> >  /*
> > diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> > index bd91987..61d0aec 100644
> > --- a/include/linux/writeback.h
> > +++ b/include/linux/writeback.h
> > @@ -95,6 +95,7 @@ int wakeup_pdflush(long nr_pages);
> >  void laptop_io_completion(void);
> >  void laptop_sync_completion(void);
> >  void throttle_vm_writeout(gfp_t gfp_mask);
> > +void task_dirty_inc(struct task_struct *);
> >  
> >  /* These are exported to sysctl. */
> >  extern int dirty_background_ratio;
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index 29b1d1e..4dc85d0 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -176,10 +176,11 @@ void bdi_writeout_inc(struct backing_dev_info *bdi)
> >  }
> >  EXPORT_SYMBOL_GPL(bdi_writeout_inc);
> >  
> > -static inline void task_dirty_inc(struct task_struct *tsk)
> > +void task_dirty_inc(struct task_struct *tsk)
> >  {
> >  	prop_inc_single(&vm_dirties, &tsk->dirties);
> >  }
> > +EXPORT_SYMBOL_GPL(task_dirty_inc);
> >  
> >  /*
> >   * Obtain an accurate fraction of the BDI's portion.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
