De-account the accumulative dirty counters on page redirty. Page redirties (very common in ext4) will introduce mismatch between counters (a) and (b) a) NR_DIRTIED, BDI_DIRTIED, tsk->nr_dirtied b) NR_WRITTEN, BDI_WRITTEN This will introduce systematic errors in balanced_rate and result in dirty page position errors (ie. the dirty pages are no longer balanced around the global/bdi setpoints). Acked-by: Peter Zijlstra Signed-off-by: Wu Fengguang --- include/linux/writeback.h | 2 ++ mm/page-writeback.c | 19 +++++++++++++++++++ 2 files changed, 21 insertions(+) --- linux-next.orig/mm/page-writeback.c 2011-11-28 21:23:23.000000000 +0800 +++ linux-next/mm/page-writeback.c 2011-11-28 21:23:24.000000000 +0800 @@ -1806,6 +1806,24 @@ int __set_page_dirty_nobuffers(struct pa EXPORT_SYMBOL(__set_page_dirty_nobuffers); /* + * Call this whenever redirtying a page, to de-account the dirty counters + * (NR_DIRTIED, BDI_DIRTIED, tsk->nr_dirtied), so that they match the written + * counters (NR_WRITTEN, BDI_WRITTEN) in long term. The mismatches will lead to + * systematic errors in balanced_dirty_ratelimit and the dirty pages position + * control. + */ +void account_page_redirty(struct page *page) +{ + struct address_space *mapping = page->mapping; + if (mapping && mapping_cap_account_dirty(mapping)) { + current->nr_dirtied--; + dec_zone_page_state(page, NR_DIRTIED); + dec_bdi_stat(mapping->backing_dev_info, BDI_DIRTIED); + } +} +EXPORT_SYMBOL(account_page_redirty); + +/* * When a writepage implementation decides that it doesn't want to write this * page for some reason, it should redirty the locked page via * redirty_page_for_writepage() and it should then unlock the page and return 0 @@ -1813,6 +1831,7 @@ EXPORT_SYMBOL(__set_page_dirty_nobuffers int redirty_page_for_writepage(struct writeback_control *wbc, struct page *page) { wbc->pages_skipped++; + account_page_redirty(page); return __set_page_dirty_nobuffers(page); } EXPORT_SYMBOL(redirty_page_for_writepage); --- linux-next.orig/include/linux/writeback.h 2011-11-28 21:23:20.000000000 +0800 +++ linux-next/include/linux/writeback.h 2011-11-28 21:23:24.000000000 +0800 @@ -197,6 +197,8 @@ void writeback_set_ratelimit(void); void tag_pages_for_writeback(struct address_space *mapping, pgoff_t start, pgoff_t end); +void account_page_redirty(struct page *page); + /* pdflush.c */ extern int nr_pdflush_threads; /* Global so it can be exported to sysctl read-only. */ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/