Date: Thu, 30 Jul 2009 11:19:27 +0800
From: Wu Fengguang <fengguang.wu@...el.com>
To: Martin Bligh <mbligh@...gle.com>
Cc: Chad Talbott <ctalbott@...gle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Michael Rubin <mrubin@...gle.com>,
	Andrew Morton <akpm@...gle.com>,
	"sandeen@...hat.com" <sandeen@...hat.com>,
	Michael Davidson <md@...gle.com>
Subject: Re: Bug in kernel 2.6.31, Slow wb_kupdate writeout

On Thu, Jul 30, 2009 at 10:57:35AM +0800, Martin Bligh wrote:
> > On closer looks I found this line:
> >
> >                 if (inode_dirtied_after(inode, start))
> >                         break;
>
> Ah, OK.
>
> > In this case "list_empty(&sb->s_io)" is not a good criteria:
> > here we are breaking away for some other reasons, and shall
> > not touch wbc.more_io.
> >
> > So let's stick with the current code?
>
> Well, I see two problems. One is that we set more_io based on
> whether s_more_io is empty or not before we finish the loop.
> I can't see how this can be correct, especially as there can be
> other concurrent writers. So somehow we need to check when
> we exit the loop, not during it.

It is correct inside the loop, however with some overheads.

We put it inside the loop because sometimes the whole filesystem is
skipped and we shall not set more_io on them whether or not s_more_io
is empty.

> The other is that we're saying we are setting more_io when
> nr_to_write is <=0 ... but we only really check it when
> nr_to_write is > 0 ... I can't see how this can be useful?

That's the caller's fault - I guess the logic was changed a bit by
Jens in linux-next. I noticed this just now. It shall be fixed.

> I'll admit there is one corner case when page_skipped it set
> from one of the branches, but I am really not sure what the
> intended logic is here, given the above?
>
> In the case where we hit the inode_dirtied_after break
> condition, is it bad to set more_io ? There is more to do
> on that inode after all. Is there a definition somewhere for
> exactly what the more_io flag means?

"More dirty pages to be put to io"?

The exact semantics of more_io is determined by the caller, which used
to be (in 2.6.31):

background_writeout():

        if (wbc.nr_to_write > 0 || wbc.pages_skipped > 0) {
                /* Wrote less than expected */
                if (wbc.encountered_congestion || wbc.more_io)
                        congestion_wait(BLK_RW_ASYNC, HZ/10);
                else
                        break;
        }

wb_kupdate() is same except that it does not check pages_skipped.

Note that in 2.6.31, more_io is not used at all for sync().

Thanks,
Fengguang
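For readers following the thread, the caller-side check quoted above can be modelled in user space. This is only a sketch and not kernel code: struct wbc_model, enum wb_action and both *_decision() helpers are hypothetical names made up for illustration, and WB_CONTINUE simply stands for "the quoted if-block was not entered" (what the real loop does next is not shown in the message). It makes Martin's second point visible: more_io is only consulted once nr_to_write (or, for background_writeout(), pages_skipped) is above zero.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for struct writeback_control:
 * only the four fields named in the message are modelled. */
struct wbc_model {
	long nr_to_write;
	long pages_skipped;
	bool encountered_congestion;
	bool more_io;
};

/* Hypothetical labels for the three outcomes of the quoted check. */
enum wb_action {
	WB_CONTINUE,       /* quoted if-block not entered */
	WB_WAIT_AND_RETRY, /* congestion_wait(BLK_RW_ASYNC, HZ/10) branch */
	WB_STOP            /* "break" branch */
};

/* Models the 2.6.31 background_writeout() check as quoted above. */
enum wb_action background_writeout_decision(const struct wbc_model *wbc)
{
	if (wbc->nr_to_write > 0 || wbc->pages_skipped > 0) {
		/* Wrote less than expected */
		if (wbc->encountered_congestion || wbc->more_io)
			return WB_WAIT_AND_RETRY;
		return WB_STOP;
	}
	return WB_CONTINUE;
}

/* Per the message, wb_kupdate() is the same minus the pages_skipped test. */
enum wb_action wb_kupdate_decision(const struct wbc_model *wbc)
{
	if (wbc->nr_to_write > 0) {
		if (wbc->encountered_congestion || wbc->more_io)
			return WB_WAIT_AND_RETRY;
		return WB_STOP;
	}
	return WB_CONTINUE;
}
```

With nr_to_write == 0 and pages_skipped == 0, both helpers return WB_CONTINUE regardless of more_io, so in this model the flag has no effect when the whole quota was written.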