Message-ID: <20170621175934.GB10139@htj.duckdns.org>
Date:   Wed, 21 Jun 2017 13:59:34 -0400
From:   Tejun Heo <tj@...nel.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        linux-kernel@...r.kernel.org,
        Lai Jiangshan <jiangshanlai@...il.com>, kernel-team@...com
Subject: Re: simple repro case

Hello, Steven.

On Wed, Jun 21, 2017 at 10:24:57AM -0400, Steven Rostedt wrote:
> On Sat, 17 Jun 2017 08:11:49 -0400
> Tejun Heo <tj@...nel.org> wrote:
> 
> > Here's a simple repro.  The test code runs whenever a CPU goes
> > off/online.  The test kthread is created on a different CPU and
> > migrated to the target CPU while running.  Without the previous patch
> > applied, the kthread ends up running on the wrong CPU.
> > 
> 
> Hmm, I'm not able to trigger the warn_on, with this patch applied.
> 
> Adding a trace_printk("here!\n") just above the warn_on in
> wq_worker_sleeping(), and doing the following:
> 
>          cpuhp/2-20    [002] d..1   751.204894: console: [  751.018261] TEST: cpu 2 inactive, starting on 0 and migrating (active/online=0-1,3/0-3)
>          cpuhp/2-20    [002] d..1   751.318375: console: [  751.131745] TEST: test_last_cpu=0 cpus_allowed=0
>          cpuhp/2-20    [002] d..1   751.324249: console: [  751.137621] TEST: migrating to inactve cpu 2
>          cpuhp/2-20    [002] d..1   751.438368: console: [  751.251738] TEST: test_last_cpu=0 cpus_allowed=2

Ah, sorry for not being clear.  The repro is that test_last_cpu isn't 2
on the last line.  The test created a kthread on CPU 0 and tried to
migrate it to CPU 2, which was online but inactive.  The kthread never
got onto that CPU because the migration code disallowed kthreads from
moving to an inactive CPU.
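
To make that criterion concrete, here is a minimal sketch of a check
over the TEST lines quoted above.  It is not part of the test patch;
the helper names parse_cpulist and ran_on_wrong_cpu are hypothetical,
and it assumes cpus_allowed is printed in the usual kernel CPU-list
form (e.g. "0-1,3").

```python
import re

# Matches the status lines printed by the test, e.g.
# "TEST: test_last_cpu=0 cpus_allowed=2"
TEST_LINE = re.compile(r"test_last_cpu=(\d+)\s+cpus_allowed=([\d,\-]+)")


def parse_cpulist(text):
    """Expand a kernel-style CPU list such as "0-1,3" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def ran_on_wrong_cpu(line):
    """Classify one trace line.

    Returns True if the reported last CPU is outside cpus_allowed,
    False if it is inside, and None if the line is not a status line.
    """
    m = TEST_LINE.search(line)
    if m is None:
        return None
    return int(m.group(1)) not in parse_cpulist(m.group(2))
```

Feeding it the last line of the trace (test_last_cpu=0 cpus_allowed=2)
flags the mismatch, while the earlier line (test_last_cpu=0
cpus_allowed=0) does not.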

The same problem affects the workqueue rescuer.  It tries to migrate to
an inactive CPU to service the workqueue there, silently fails, and
ends up running the work item on the wrong CPU.

Thanks.

-- 
tejun
