Date:	Tue, 27 May 2014 10:24:43 -0400
From:	Tejun Heo <tj@...nel.org>
To:	Lai Jiangshan <laijs@...fujitsu.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] workqueue: always pass cascading responsibility to
 the next flusher

Hello, Lai.

On Mon, May 26, 2014 at 07:58:12PM +0800, Lai Jiangshan wrote:
> This optimization saves one mutex_lock(&wq->mutex), since the current
> first-flusher already holds it.
> 
> This optimization also reduces cascading latency: the next flusher is
> not currently running, so passing the cascading responsibility to it
> would add a small wakeup delay.
> 
> This optimization may also have other benefits.  However, this is a
> slow path that is hit with low probability, and the optimization has
> real downsides:
>     1) it adds a special case and makes the code more complex, which
>        is bad for review.
>     2) it adds a special state in which the first-flusher may be
>        deprived of its role.  That opens a race, so wq->first_flusher
>        has to be checked again with the mutex held: 4ce48b37bfed
>        ("workqueue: fix race condition in flush_workqueue()").
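
A minimal user-space sketch of the re-check pattern point 2 describes,
assuming pthreads in place of kernel locking; the names only mirror the
workqueue code, this is not the actual implementation:

/* recheck_sketch.c - build with: gcc -O2 -pthread recheck_sketch.c */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t wq_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_t first_flusher;	/* stand-in for wq->first_flusher */

static void *flusher(void *arg)
{
	pthread_t me = pthread_self();

	(void)arg;

	/* Claim the role; a later flusher may take it over ("deprive" us). */
	pthread_mutex_lock(&wq_mutex);
	first_flusher = me;
	pthread_mutex_unlock(&wq_mutex);

	/* ... the flush itself would run here, mutex not held ... */

	/*
	 * Re-check with the mutex held: between the unlocked section and
	 * this point another thread may have taken over the role, so
	 * cascading based on the earlier claim alone would be a race.
	 */
	pthread_mutex_lock(&wq_mutex);
	if (pthread_equal(first_flusher, me))
		printf("still first flusher, safe to cascade\n");
	else
		printf("deprived of the role, must not cascade\n");
	pthread_mutex_unlock(&wq_mutex);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, flusher, NULL);
	pthread_create(&b, NULL, flusher, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}

Depending on the interleaving, one thread sees its claim overwritten;
without the second check it would cascade anyway.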

The original goal was replicating the wake-up behavior of the existing
implementation.  It doesn't matter whether we have a couple more
lockings or somewhat more complex logic there, but it *does* make a
noticeable difference once scheduling latencies get involved.  They are
multiple orders of magnitude longer after all.  Not quite the same, but
synchronize_rcu() is similar, and we've had several cases where blind
synchronize_rcu() invocations in userland-visible paths caused
crippling latencies (e.g. SCSI scanning through non-existent LUNs
ended up taking tens of seconds in pathological cases).
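
To put rough numbers on the difference, here is a hypothetical
user-space micro-benchmark (pthreads standing in for the kernel
primitives, so the absolute numbers are only illustrative) comparing a
plain lock/unlock pair against one wake-a-sleeping-thread round-trip:

/* latency_sketch.c - build with: gcc -O2 -pthread latency_sketch.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define ITERS 100000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool work_ready, work_done, stop;

static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleeps until handed the "cascade" step, performs it, acknowledges. */
static void *waiter(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	for (;;) {
		while (!work_ready && !stop)
			pthread_cond_wait(&cond, &lock);
		if (stop)
			break;
		work_ready = false;
		work_done = true;	/* the cascade step would run here */
		pthread_cond_signal(&cond);
	}
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t tid;
	long long t0;
	int i;

	/* Case 1: do the step inline, paying only for the mutex ops. */
	t0 = now_ns();
	for (i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&lock);
		/* the cascade step would run here */
		pthread_mutex_unlock(&lock);
	}
	printf("inline (mutex only):   %lld ns/iter\n",
	       (now_ns() - t0) / ITERS);

	/* Case 2: hand the step to a sleeping thread and wait for it. */
	pthread_create(&tid, NULL, waiter, NULL);
	t0 = now_ns();
	for (i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&lock);
		work_done = false;
		work_ready = true;
		pthread_cond_signal(&cond);
		while (!work_done)
			pthread_cond_wait(&cond, &lock);
		pthread_mutex_unlock(&lock);
	}
	printf("handoff (wakeup trip): %lld ns/iter\n",
	       (now_ns() - t0) / ITERS);

	pthread_mutex_lock(&lock);
	stop = true;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
	pthread_join(tid, NULL);
	return 0;
}

On a typical machine the uncontended mutex pair costs tens of
nanoseconds while the handoff round-trip lands in the microsecond
range, and under load the gap widens toward the scheduling latencies
above.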

So, I think hiding latencies which can easily be in the milliseconds
range is important.  It isn't a performance optimization.  It almost
becomes a correctness issue when the problem is hit severely, and the
amount of code simplification we get from dropping this doesn't seem
like much.  As such, I'm not quite convinced I wanna apply this one.

Thanks.

-- 
tejun