Date:	Fri, 15 Mar 2013 11:50:18 +0100
From:	Stephane Eranian <eranian@...gle.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] perf fixes

On Fri, Mar 15, 2013 at 9:01 AM, Stephane Eranian <eranian@...gle.com> wrote:
> On Fri, Mar 15, 2013 at 2:06 AM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>> On Thu, Mar 14, 2013 at 5:24 PM, Stephane Eranian <eranian@...gle.com> wrote:
>>>
>>> I bet if you force the affinity of your perf record to be on
>>> a CPU other than CPU0, you will not get the crash.
>>>
>>> This is what I am seeing now. It appears that on resume, the
>>> CPU0 hotplug callbacks for perf_events are not invoked,
>>> leaving the DS_AREA MSR at 0.
>>>
>>> Can you confirm on your machine?
>>
>> I'm not even going to bother confirming it, because I think you're
>> right, and I think the reason is clear: the DS initialization code
>> uses the CPU_UP notifiers.
>>
> Ok, I instrumented the pebs_enable() function and I confirm that
> DS_AREA=0 on resume.
>
> So what seems broken here for me is that on suspend, the cpu notifier
> ends up calling fini_debug_store() to clear DS_AREA for CPU0. But
> on resume, the same notifier does NOT call the init_debug_store().
> I don't understand this asymmetry. You either do neither or you do
> both.
>
Ok, a correction: I ran some more tests. On the suspend path, the cpu
notifier is not called for CPU0. However, when the machine comes back
up, DS_AREA is 0, which is the power-up default value. Since the
notifier is not called for CPU0, that is the value we inherit later on,
and it is what causes the crash.
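
To make the asymmetry concrete, here is a small user-space model of what
I believe is happening. It is only a sketch, not kernel code: the names
(model_*, ds_area[], NR_CPUS=4, the buffer addresses) are made up for
illustration, and it assumes that every CPU, including CPU0, gets its
DS_AREA programmed once at boot.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS  4   /* hypothetical CPU count for this model */
#define BOOT_CPU 0

/* Models the per-CPU DS_AREA MSR; 0 is the power-up default. */
static uint64_t ds_area[NR_CPUS];

/* Made-up stand-in for the address of a CPU's debug store buffer. */
static uint64_t model_ds_buffer(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return 0x1000u * (uint64_t)(cpu + 1);
}

/* What the CPU_UP-style notifier does: point DS_AREA at the buffer. */
static void model_init_debug_store(int cpu)
{
	ds_area[cpu] = model_ds_buffer(cpu);
}

/* Assumption: at boot every CPU, CPU0 included, is initialized. */
static void model_boot(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		model_init_debug_store(cpu);
}

/*
 * Suspend/resume: every CPU loses its MSR contents, but only the
 * non-boot CPUs are hotplugged, so only they see the notifier again.
 */
static void model_suspend_resume(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		ds_area[cpu] = 0;               /* power-up default */

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu != BOOT_CPU)
			model_init_debug_store(cpu);
}

/* PEBS/BTS can only work on a CPU whose DS_AREA is non-zero. */
static int model_ds_area_valid(int cpu)
{
	return ds_area[cpu] != 0;
}
```

After model_boot() all CPUs have a valid DS_AREA. After
model_suspend_resume() CPU1..CPU3 are valid again but CPU0 is left at 0,
which matches the observation that pinning perf record to a CPU other
than CPU0 avoids the crash.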

>
>> And that's sufficient for CPU hotplug, which is what suspend/resume
>> ends up doing for all but the boot CPU. But the boot CPU is not
>> hotplugged.
>>
>> Using CPU_UP notifiers is wrong, and they get called too late anyway.
>>
>> The code should use a real resume method. Or, better yet, just do it
>> right, and do it from __restore_processor_state().
>>
I will produce a patch to use this function; it's simple enough.
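
Roughly, the idea would look like the user-space sketch below. This is
not the patch itself: struct model_cpu_state, model_power_cycle() and
model_restore_debug_store() are hypothetical names, and the MSR is
modeled as a plain variable. The point is only that the buffer address
survives in memory across suspend, so the boot CPU's resume path can
write it back into DS_AREA without involving the CPU notifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software state: lives in RAM, so it survives suspend. */
struct model_cpu_state {
	uint64_t ds_buffer;   /* debug store buffer address, 0 if none */
};

/* Hardware state: models the DS_AREA MSR of the boot CPU. */
static uint64_t model_boot_cpu_ds_area;

/* Models the MSR falling back to its power-up default of 0. */
static void model_power_cycle(void)
{
	model_boot_cpu_ds_area = 0;
}

/*
 * Hypothetical hook, meant to be called from the boot CPU's
 * processor-state restore path on resume.
 */
static void model_restore_debug_store(const struct model_cpu_state *st)
{
	assert(st != NULL);

	if (st->ds_buffer == 0)
		return;   /* nothing was allocated; keep the default */

	model_boot_cpu_ds_area = st->ds_buffer;
}
```

With ds_buffer set to some non-zero address, calling
model_power_cycle() followed by model_restore_debug_store() brings the
modeled DS_AREA back to that address; with ds_buffer == 0 the hook is a
no-op.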

>> Those f*cking CPU notifiers are a pain in the ass, and they tend to be
>> invariably broken, and they have their own idiotic hacks that are
>> equally broken (ie that x86_pmu_notifier() thing seems to make up its
>> own suspend/resume with
>> "x86_pmu.cpu_prepare/cpu_starting/cpu_dying/cpu_dead" things).
>>
>> I guess we could make the BP do a fake cpu notifier thing around the
>> suspend of the boot processor as well, but most of the per-CPU stuff
>> seems to be perfectly fine without it (ie mtrr, apic, etc etc all use
>> the suspend/resume infrastructure) and doesn't need that kind of
>> stuff.
>>
>>                 Linus