Message-ID: <alpine.DEB.2.20.1701171059150.3495@nanos>
Date:   Tue, 17 Jan 2017 11:05:46 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Stephane Eranian <eranian@...gle.com>
cc:     zhouchengming <zhouchengming1@...wei.com>,
        LKML <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "Liang, Kan" <kan.liang@...el.com>,
        David Carrillo Cisneros <davidcc@...gle.com>,
        dave.hansen@...ux.intel.com, qiaonuohan@...wei.com,
        guohanjun@...wei.com
Subject: Re: [PATCH] fix race caused by hyperthreads when online an offline
 cpu

On Mon, 16 Jan 2017, Stephane Eranian wrote:
> On Mon, Jan 16, 2017 at 1:53 AM, zhouchengming
> <zhouchengming1@...wei.com> wrote:
> > On 2017/1/16 17:05, Thomas Gleixner wrote:
> >>
> >> On Mon, 16 Jan 2017, Zhou Chengming wrote:
> >>
> >> Can you please stop sending the same patch over and over every other day?
> >>
> >> Granted, things get forgotten, but sending a polite reminder after a week
> >> is definitely enough.
> >>
> >> Maintainers are not machines responding within a split second on every
> >> mail they get. And that patch is not so substantial that it justifies
> >> that kind of spam.
> >>
> >
> > Very sorry for the noise. We are just not sure this is the right fix
> > because it's hard to reproduce.
> >
> I believe this is the right fix. I tried it and instrumented the
> code to verify thread_id assignment. The problem is easy to reproduce.
> 
> $ echo 0 >/sys/devices/system/cpu/cpu2/online
> $ echo 1 >/sys/devices/system/cpu/cpu2/online
> 
> Normally, on a Haswell desktop part, CPU2 gets thread_id 0 on boot and
> CPU6 gets thread_id 1.
> If you offline CPU2 and bring it back in, it will get thread_id 1, and
> thus both siblings will point to the same exclusive state. The fix is,
> indeed, to check whether the sibling is already assigned 1, and if so
> to keep 0 for the CPU being onlined.

Right. So it's a simple, static, fully reproducible problem and not a race
of some sort. I'll amend the changelog ....
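
For illustration only, the thread_id selection described in the quoted text
can be modelled in a few lines of C. This is a simplified sketch and not the
actual kernel code: struct ht_thread, has_excl_state and both
pick_thread_id_*() helpers are hypothetical names, and the model assumes
exactly two threads per core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one hyperthread's view of the shared state. */
struct ht_thread {
	bool has_excl_state;	/* sibling already owns the shared exclusive state */
	int  thread_id;		/* 0 or 1 within the core */
};

/*
 * Behaviour as described for the unpatched code: if the sibling already
 * has shared state, the CPU coming online always takes thread_id 1.
 */
static int pick_thread_id_buggy(const struct ht_thread *sibling)
{
	assert(sibling != NULL);
	return sibling->has_excl_state ? 1 : 0;
}

/*
 * Behaviour as described for the fix: take thread_id 1 only if the
 * sibling does not already hold 1, otherwise keep 0.
 */
static int pick_thread_id_fixed(const struct ht_thread *sibling)
{
	assert(sibling != NULL);
	if (!sibling->has_excl_state)
		return 0;
	return sibling->thread_id == 1 ? 0 : 1;
}
```

In the CPU2/CPU6 scenario, CPU6 stays online with thread_id 1 while CPU2 is
cycled. The buggy helper then hands CPU2 thread_id 1 as well, while the fixed
helper hands it 0.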

Btw, this code has the hardcoded assumption of two threads per core. So
anything which has more than two threads is broken vs. that exclusive
access. No idea whether that matters in practice, but I just noticed.

Thanks,

	tglx
