

Message-ID: <33307c790907301333i28b571eat29460164d558d370@mail.gmail.com>
Date:	Thu, 30 Jul 2009 13:33:09 -0700
From:	Martin Bligh <mbligh@...gle.com>
To:	Wu Fengguang <fengguang.wu@...el.com>
Cc:	Chad Talbott <ctalbott@...gle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Michael Rubin <mrubin@...gle.com>,
	Andrew Morton <akpm@...gle.com>,
	"sandeen@...hat.com" <sandeen@...hat.com>,
	Michael Davidson <md@...gle.com>
Subject: Re: Bug in kernel 2.6.31, Slow wb_kupdate writeout

(BTW: background ... I'm not picking through this code for fun, I'm
trying to debug writeback problems introduced in our new kernel
that are affecting Google production workloads ;-))

>> Well, I see two problems. One is that we set more_io based on
>> whether s_more_io is empty or not before we finish the loop.
>> I can't see how this can be correct, especially as there can be
>> other concurrent writers. So somehow we need to check when
>> we exit the loop, not during it.
>
> It is correct inside the loop, however with some overheads.
>
> We put it inside the loop because sometimes the whole filesystem is
> skipped and we shall not set more_io on them whether or not s_more_io
> is empty.

My point was that you're setting more_io based on a condition
at a point in time that isn't when you return to the caller.

By the time you return to the caller (after several more loop
iterations), that condition may no longer be true.

One other way to address that would be to set it only if we're
about to fall off the end of the loop, i.e. change it to:

if (!list_empty(&sb->s_more_io) && list_empty(&sb->s_io))
       wbc->more_io = 1;
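
To make the timing issue concrete, here is a minimal userspace sketch, not
the kernel code. It assumes the two lists can be reduced to counters and
that a concurrent writer empties s_more_io at a chosen iteration; the names
sb_model, run_loop and drain_at are hypothetical and exist only for this
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two per-superblock queues,
 * reduced to entry counts so the example runs in userspace. */
struct sb_model {
    size_t n_io;       /* entries still on s_io */
    size_t n_more_io;  /* entries on s_more_io */
};

/*
 * Walk s_io one entry per iteration and report the more_io value
 * the caller would see.
 *
 * drain_at:      iteration at which an (assumed) concurrent writer
 *                empties s_more_io; pass a large value for "never".
 * check_at_exit: false = sample s_more_io on every iteration,
 *                true  = only set more_io when s_io has just run dry.
 */
static bool run_loop(struct sb_model sb, size_t drain_at, bool check_at_exit)
{
    bool more_io = false;
    size_t iter = 0;

    while (sb.n_io > 0) {
        if (iter == drain_at)
            sb.n_more_io = 0;          /* someone else handled it */

        /* Per-iteration check: can latch a stale "true". */
        if (!check_at_exit && sb.n_more_io > 0)
            more_io = true;

        sb.n_io--;                     /* this inode is done */

        /* Proposed check: s_more_io non-empty AND s_io empty. */
        if (check_at_exit && sb.n_more_io > 0 && sb.n_io == 0)
            more_io = true;

        iter++;
    }
    return more_io;
}
```

With three entries on s_io, one on s_more_io, and the drain happening on the
last iteration, the per-iteration variant still returns true while the
exit-time variant returns false. If nothing drains s_more_io, both agree.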

>> The other is that we're saying we are setting more_io when
>> nr_to_write is <=0 ... but we only really check it when
>> nr_to_write is > 0 ... I can't see how this can be useful?
>
> That's the caller's fault - I guess the logic was changed a bit by
> Jens in linux-next. I noticed this just now. It shall be fixed.

I am guessing you're setting more_io here because we're stopping
because our slice expired, presumably without us completing
all the io there was to do? That doesn't seem entirely accurate:
we could have finished all the pending IO (particularly given that
we can go over nr_to_write somewhat and send it negative).
Hence, I thought that checking whether s_more_io and s_io were
empty at the time of return might be a more accurate check,
but on the other hand they are shared lists.
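
The difference between the two signals can be sketched in userspace as
follows. This is an illustration under stated assumptions, not the kernel
implementation: each inode is assumed to be written out whole, so the budget
can overshoot and go negative, and the names wb_result and write_inodes are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct wb_result {
    long nr_to_write;    /* remaining budget; may end up negative */
    bool more_io;        /* what the caller is told */
    size_t inodes_left;  /* entries still queued at return */
};

/*
 * Write inodes until the budget is used up.
 *
 * check_lists_at_return: false = set more_io whenever the budget
 *                        expires (nr_to_write <= 0),
 *                        true  = set more_io only if work is still
 *                        queued when we return.
 */
static struct wb_result write_inodes(const long *dirty_pages, size_t n_inodes,
                                     long nr_to_write,
                                     bool check_lists_at_return)
{
    struct wb_result r = { nr_to_write, false, n_inodes };

    for (size_t i = 0; i < n_inodes; i++) {
        r.nr_to_write -= dirty_pages[i];  /* whole inode, may overshoot */
        r.inodes_left--;

        if (r.nr_to_write <= 0) {
            if (!check_lists_at_return)
                r.more_io = true;         /* budget-based signal */
            break;
        }
    }

    if (check_lists_at_return)
        r.more_io = r.inodes_left > 0;    /* list-based signal */

    return r;
}
```

With two inodes of 600 dirty pages each and a budget of 1024, the budget goes
to -176 on the second inode and nothing is left queued: the budget-based
signal reports more_io even though all pending IO was issued, while the
list-based signal does not. The sketch ignores the fact that the real lists
are shared with other writers, which is exactly the caveat above.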