Message-ID: <20171211092755.uptoa23wlukuryie@hirez.programming.kicks-ass.net>
Date: Mon, 11 Dec 2017 10:27:55 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Tejun Heo <tj@...nel.org>
Cc: axboe@...nel.dk, linux-kernel@...r.kernel.org, oleg@...hat.com,
	kernel-team@...com, osandov@...com
Subject: Re: [PATCHSET] blk-mq: reimplement timeout handling

On Sat, Dec 09, 2017 at 11:25:19AM -0800, Tejun Heo wrote:
> Currently, blk-mq timeout path synchronizes against the usual
> issue/completion path using a complex scheme involving atomic
> bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
> rules.  Unfortunately, it contains quite a few holes.
>
> It's pretty easy to make blk_mq_check_expired() terminate a later
> instance of a request.  If we induce 5 sec delay before
> time_after_eq() test in blk_mq_check_expired(), shorten the timeout to
> 2s, and issue back-to-back large IOs, blk-mq starts timing out
> requests spuriously pretty quickly.  Nothing actually timed out.  It
> just made the call on a recycled instance of a request and then
> terminated a later instance long after the original instance finished.
> The scenario isn't theoretical either.
>
> This patchset replaces the broken synchronization mechanism with an RCU
> and generation number based one.  Please read the patch description of
> the second patch for more details.
>
> Oleg, Peter, I'd really appreciate if you guys can go over the
> reported breakages and the new implementation.

Great, yes that code seemed very suspicious when I looked at it; thanks
for making it go away. I'll try and find a spot to stare at the patches.
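
For readers following the thread, the failure mode quoted above (a timeout decision made about one instance of a request being applied to a later instance in the same recycled slot) can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the code from the patchset: `struct req_slot`, `slot_issue()`, `slot_complete()`, `expire_naive()`, `expire_checked()` and `spurious_timeout()` are all hypothetical names, the clock is a plain integer, and the RCU part of the real scheme is left out entirely. It only shows why tagging each instance with a generation number lets the timeout path notice that the slot has been reused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request slot; NOT the kernel's struct request. */
struct req_slot {
	uint64_t gen;       /* bumped each time the slot is (re)issued */
	uint64_t deadline;  /* expiry time of the current instance */
	bool in_flight;     /* current instance issued, not yet completed */
	bool timed_out;     /* current instance was terminated by timeout */
};

/* Start a new instance in the slot (models request recycling). */
static void slot_issue(struct req_slot *s, uint64_t now, uint64_t timeout)
{
	s->gen++;
	s->deadline = now + timeout;
	s->in_flight = true;
	s->timed_out = false;
}

/* Normal completion of the current instance. */
static void slot_complete(struct req_slot *s)
{
	assert(s->in_flight);
	s->in_flight = false;
}

/* Naive expiry: acts on whatever instance occupies the slot right now. */
static bool expire_naive(struct req_slot *s, bool decided_expired)
{
	if (decided_expired && s->in_flight) {
		s->timed_out = true;
		return true;
	}
	return false;
}

/* Generation-checked expiry: acts only if the slot still holds the
 * instance the decision was made about. */
static bool expire_checked(struct req_slot *s, uint64_t seen_gen,
			   bool decided_expired)
{
	if (decided_expired && s->in_flight && s->gen == seen_gen) {
		s->timed_out = true;
		return true;
	}
	return false;
}

/* Replays the delayed-checker scenario; returns true if the second
 * instance was terminated even though it had not expired. */
static bool spurious_timeout(bool use_generation)
{
	struct req_slot s = {0};

	slot_issue(&s, 0, 2);                /* instance 1, deadline t=2 */
	uint64_t seen_gen = s.gen;           /* checker samples instance 1 */
	uint64_t seen_deadline = s.deadline;

	slot_complete(&s);                   /* instance 1 finishes in time */
	slot_issue(&s, 5, 2);                /* slot recycled, deadline t=7 */

	uint64_t now = 5;                    /* checker resumes after a delay */
	bool decided_expired = now >= seen_deadline;

	return use_generation
		? expire_checked(&s, seen_gen, decided_expired)
		: expire_naive(&s, decided_expired);
}
```

In this model the naive path terminates the second instance at t=5 although its deadline is t=7, while the generation-checked path sees that `gen` no longer matches the sampled value and leaves it alone. A real implementation also has to make the sample and the check safe against concurrent updates, which this sketch does not attempt.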