Message-ID: <8a90744a-8e2d-21b9-c88a-818991b3b753@pantelija.rs>
Date: Wed, 6 Feb 2019 13:45:31 +0100
From: Dragan Milenkovic <dragan.milenkovic@...telija.rs>
To: linux-kernel@...r.kernel.org
Subject: [BUG] Deadlock in block/blk-flush.c, with resolution
The bug manifests as mdX_raid1 and other related tasks being blocked.
It is triggered by LVM RAID, but is not caused by it. I have also
triggered it with LVM + mdraid, but only once. It is more frequent with
LVM RAID.
It does not occur in the master branch, but it does occur in 4.20.y,
4.19.y, and 4.18.y. Here is a Debian bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=913119
I have tracked it to this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=344e9ffcbd1898e1dc04085564a6e05c30ea8199
Specifically to this line:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/block/blk-flush.c?id=344e9ffcbd1898e1dc04085564a6e05c30ea8199
The commit log message makes it appear as if this is a pure refactoring
change, but the check for q->elevator was inverted.
The line has not been changed between that commit and the current master
branch. Since I applied this change to my distribution's kernel (4.19),
my system has been completely stable.
Let me know if you need me to do anything else, but this seems like a
straightforward cherry-pick.
Dragan