lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 9 Dec 2022 10:25:14 +0800 From: richard clark <richard.xnu.clark@...il.com> To: Lai Jiangshan <jiangshanlai@...il.com> Cc: tj@...nel.org, linux-kernel@...r.kernel.org Subject: Re: work item still be scheduled to execute after destroy_workqueue? On Thu, Dec 8, 2022 at 3:46 PM Lai Jiangshan <jiangshanlai@...il.com> wrote: > > On Thu, Dec 8, 2022 at 10:44 AM richard clark > <richard.xnu.clark@...il.com> wrote: > > > > On Wed, Dec 7, 2022 at 10:38 AM Lai Jiangshan <jiangshanlai@...il.com> wrote: > > > > > > On Tue, Dec 6, 2022 at 5:20 PM richard clark > > > <richard.xnu.clark@...il.com> wrote: > > > > > > > > On Tue, Dec 6, 2022 at 2:23 PM Lai Jiangshan <jiangshanlai@...il.com> wrote: > > > > > > > > > > On Tue, Dec 6, 2022 at 12:35 PM richard clark > > > > > <richard.xnu.clark@...il.com> wrote: > > > > > > > > > > > > > > > > > > A WARN is definitely reasonable and has its benefits. Can I try to > > > > > > submit the patch and you're nice to review as maintainer? > > > > > > > > > > > > Thanks, > > > > > > Richard > > > > > > > > > > > > > > > > > Sure, go ahead. > > > > > > > > > > What's in my mind is that the following code is wrapped in a new function: > > > > > > > > > > mutex_lock(&wq->mutex); > > > > > if (!wq->nr_drainers++) > > > > > wq->flags |= __WQ_DRAINING; > > > > > mutex_unlock(&wq->mutex); > > > > > > > > > > > > > > > and the new function replaces the open code drain_workqueue() and > > > > > is also called in destroy_workqueue() (before calling drain_workqueue()). > > > > > > > > > Except that, do we need to defer the __WQ_DRAINING clean to the > > > > rcu_call, thus we still have a close-loop of the drainer's count, like > > > > this? > > > > > > No, I don't think we need it. The wq is totally freed in rcu_free_wq. > > > > > > Or we can just introduce __WQ_DESTROYING. > > > > > > It seems using __WQ_DESTROYING is better. > > > > The wq->flags will be unreliable after kfree(wq), for example, in my > > machine, the wq->flags can be 0x7ec1e1a3, 0x37cff1a3 or 0x7fa23da3 ... > > after wq be kfreed, consequently the result of queueing a new work > > item to a kfreed wq is undetermined, sometimes it's ok because the > > queue_work will return directly(e.g, the wq->flags = 0x7ec1e1a3, a > > fake __WQ_DRAINING state), sometimes it will trigger a kernel NULL > > pointer dereference BUG when the wq->flags = 0x7fa23da3(fake > > !__WQ_DRAINING state). > > The whole wq is unreliable after destroy_workqueue(). > > All we need is just adding something to help identify any > wrong usage while the wq is in RCU grace period. > OK, understood! > > > > IMO, given the above condition, we can handle this in 2 phases: > > before the rcu_call and after. > > a. before rcu_call. Using __WQ_DESTROYING to allow the chained work > > queued in or not in destroy_workqueue(...) level, __WQ_DRAINING is > > used to make the drain_workqueue(...) still can be standalone. The > > code snippet like this: > > destroy_workqueue(...) > > { > > mutex_lock(&wq->mutex); > > wq->flags |= __WQ_DESTROYING; > > mutex_lock(&wq->mutex); > > Ok, put it before calling drain_workqueue() > > > ... > > } > > > > __queue_work(...) > > { > > if (unlikely((wq->flags & __WQ_DESTROYING) || (wq->flags & > > __WQ_DRAINING)) && > > WARN_ON_ONCE(!is_chained_work(wq))) > > Ok, combine __WQ_DESTROYING and __WQ_DRAINING together as: > if (unlikely((wq->flags & (__WQ_DESTROYING | __WQ_DRAINING)) && > > > > return; > > } > > > > b. after rcu_call. What in my mind is: > > rcu_free_wq(struct rcu_head *rcu) > > { > > ... > > kfree(wq); > > wq = NULL; > > It is useless code. > > > } > > > > __queue_work(...) > > { > > if (!wq) > > return; > > It is useless code. OK, will remove the above codes in the patch... > > > ... > > } > > > > Any comments? > > > > > > > > > > > > > --- a/kernel/workqueue.c > > > > +++ b/kernel/workqueue.c > > > > > > > > @@ -3528,6 +3526,9 @@ static void rcu_free_wq(struct rcu_head *rcu) > > > > > > > > else > > > > free_workqueue_attrs(wq->unbound_attrs); > > > > > > > > + if (!--wq->nr_drainers) > > > > + wq->flags &= ~__WQ_DRAINING; > > > > + > > > > kfree(wq); > > > > > > > > > > > > > > __WQ_DRAINING will cause the needed WARN on illegally queuing items on > > > > > destroyed workqueue. > > > > > > > > I will re-test it if there are no concerns about the above fix... > > > > > > > > > > > > > > Thanks > > > > > Lai
Powered by blists - more mailing lists