Message-ID: <87h86vjhv0.fsf@notabene.neil.brown.name>
Date: Tue, 06 Aug 2019 09:46:27 +1000
From: NeilBrown <neilb@...e.com>
To: Jinpu Wang <jinpu.wang@...ud.ionos.com>,
linux-raid <linux-raid@...r.kernel.org>
Cc: Alexandr Iarygin <alexandr.iarygin@...ud.ionos.com>,
Guoqing Jiang <guoqing.jiang@...ud.ionos.com>,
Paul Menzel <pmenzel@...gen.mpg.de>,
linux-kernel@...r.kernel.org
Subject: Re: Bisected: Kernel 4.14 + has 3 times higher write IO latency than Kernel 4.4 with raid1
On Mon, Aug 05 2019, Jinpu Wang wrote:
> Hi Neil,
>
> For the md higher write IO latency problem, I bisected it to these commits:
>
> 4ad23a97 MD: use per-cpu counter for writes_pending
> 210f7cd percpu-refcount: support synchronous switch to atomic mode.
>
> Do you maybe have an idea? How can we fix it?
Hmmm.... not sure.
My guess is that the set_in_sync() call from md_check_recovery()
is taking a long time, and is being called too often.
Could you try two experiments please.
1/ set /sys/block/md0/md/safe_mode_delay
to 20 or more. It defaults to about 0.2.
2/ comment out the call to set_in_sync() in md_check_recovery().
Then run the test separately after each of these changes.
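For the first experiment, writing the value from a shell is the usual route. If the test harness is a C program, a minimal sketch of the same write might look like the following. The helper name write_attr() is hypothetical, and the md0 path is just the one mentioned above; adjust it for the array under test.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a short string to a sysfs-style attribute
 * file. Returns 0 on success and -1 on any failure (open, write, close).
 */
int write_attr(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;

    ok = fputs(value, f) >= 0;
    /* Buffered writes can fail at close time, so check fclose() too. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling write_attr("/sys/block/md0/md/safe_mode_delay", "20") as root would correspond to experiment 1. Reading the attribute back afterwards confirms that the value was accepted.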
If the second one makes a difference, I'd like to know how often it gets
called - and why. The test
	if ( ! (
		(mddev->sb_flags & ~ (1<<MD_SB_CHANGE_PENDING)) ||
		test_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
		test_bit(MD_RECOVERY_DONE, &mddev->recovery) ||
		(mddev->external == 0 && mddev->safemode == 1) ||
		(mddev->safemode == 2
		 && !mddev->in_sync && mddev->recovery_cp == MaxSector)
		))
		return;
should normally return when doing lots of IO - I'd like to know
which condition causes it to not return.
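One way to find out is to split the test into its five sub-conditions and record which ones are true. The sketch below is a user-space model only, not kernel code: struct mddev_model, the WHY_* names, and the bit number used for MODEL_SB_CHANGE_PENDING are assumptions made for illustration, and the two test_bit() checks are represented by plain booleans.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones live in the md headers. */
#define MODEL_SB_CHANGE_PENDING 2
#define MODEL_MAX_SECTOR (~(uint64_t)0)

/* Hypothetical stand-in for the struct mddev fields the test reads. */
struct mddev_model {
    unsigned long sb_flags;
    bool recovery_needed;  /* models test_bit(MD_RECOVERY_NEEDED, ...) */
    bool recovery_done;    /* models test_bit(MD_RECOVERY_DONE, ...) */
    int external;
    int safemode;
    int in_sync;
    uint64_t recovery_cp;
};

/* One bit per sub-condition of the test. */
enum {
    WHY_SB_FLAGS        = 1 << 0,
    WHY_RECOVERY_NEEDED = 1 << 1,
    WHY_RECOVERY_DONE   = 1 << 2,
    WHY_SAFEMODE_1      = 1 << 3,
    WHY_SAFEMODE_2      = 1 << 4,
};

/*
 * Returns 0 when the modelled test would take the early return,
 * otherwise a mask of the sub-conditions that prevented it.
 */
unsigned int why_not_return(const struct mddev_model *m)
{
    unsigned int why = 0;

    if (m->sb_flags & ~(1UL << MODEL_SB_CHANGE_PENDING))
        why |= WHY_SB_FLAGS;
    if (m->recovery_needed)
        why |= WHY_RECOVERY_NEEDED;
    if (m->recovery_done)
        why |= WHY_RECOVERY_DONE;
    if (m->external == 0 && m->safemode == 1)
        why |= WHY_SAFEMODE_1;
    if (m->safemode == 2 && !m->in_sync &&
        m->recovery_cp == MODEL_MAX_SECTOR)
        why |= WHY_SAFEMODE_2;

    return why;
}
```

In the kernel itself the same breakdown could be done by computing such a mask just before the test and reporting it with pr_info() or trace_printk(), ideally rate-limited so the logging does not distort the latency being measured.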
Thanks,
NeilBrown