lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Thu, 21 Mar 2013 10:41:33 -0700
From:	Tejun Heo <tj@...nel.org>
To:	Lai Jiangshan <eag0628@...il.com>
Cc:	Lai Jiangshan <laijs@...fujitsu.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 15/21] workqueue: remove worker_maybe_bind_and_lock()

Hello, Lai.

On Thu, Mar 21, 2013 at 07:03:30PM +0800, Lai Jiangshan wrote:
> On Thu, Mar 21, 2013 at 2:10 AM, Tejun Heo <tj@...nel.org> wrote:
> > On Wed, Mar 20, 2013 at 03:28:15AM +0800, Lai Jiangshan wrote:
> >>  static struct worker *alloc_worker(void)
> >>  {
> >>       struct worker *worker;
> >> @@ -2326,7 +2262,8 @@ repeat:
> >>               spin_unlock_irq(&wq_mayday_lock);
> >>
> >>               /* migrate to the target cpu if possible */
> >> -             worker_maybe_bind_and_lock(pool);
> >> +             set_cpus_allowed_ptr(current, pool->attrs->cpumask);
> >> +             spin_lock_irq(&pool->lock);
> >>               rescuer->pool = pool;
> >
> > You can't do this.  The CPU might go up between set_cpus_allowed_ptr()
> > and spin_lock_irq() and the rescuer can end up executing a work item
> > which was issued while the target CPU is up on the wrong CPU.
> 
> 
> Why it is wrong?
> It can be allowed if the wq has rescuer. (wq of work_on_cpu() don't has rescuer,
> it does not hurt to work_on_cpu() of something else.
> 
> Actually it was allowed by recent patches.
> If a cpu is up while a rescuer do process_scheduled_workers(),
> all the later work item will be running the wrong CPU while
>  the target CPU is up.(due to for_cpu_pool_workers() don't include
> active rescuer).

So,

* If the user wraps queueing and flushing inside get_online_cpus(), it
  better gets executed and finishes on the target CPU.

* Similarly, it must guaranteed that any work items queued and flushed
  by CPU_ONLINE or CPU_DOWN_PREP notifiers should start and finish
  execution on the target CPU.

Your patch breaks the above.

> I don't want to go back to make cpuhotplug nor rescuer_theread become
> complicated.
> so I prefer to allow work item can run on wrong cpu if the wq has rescuer.
> All just needs a comments.

It is explained in the comment.

> I guess you will agaist me...... let me rewrite it .....

It's not because I'm "against" you.  It's because it's buggy and we
wouldn't have any affinity guarantee with the proposed changes.  If
you can maintain the guarantees and remove
worker_maybe_bind_and_lock(), please be my guest.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ