Message-ID: <20111007180534.GA671@aftab>
Date: Fri, 7 Oct 2011 20:05:34 +0200
From: Borislav Petkov <bp@...64.org>
To: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Cc: Borislav Petkov <bp@...64.org>, Tejun Heo <tj@...nel.org>,
"Rafael J. Wysocki" <rjw@...k.pl>, Borislav Petkov <bp@...en8.de>,
"tigran@...azian.fsnet.co.uk" <tigran@...azian.fsnet.co.uk>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...e.hu" <mingo@...e.hu>, "hpa@...or.com" <hpa@...or.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linux PM mailing list <linux-pm@...ts.linux-foundation.org>
Subject: Re: [BUGFIX][PATCH RESEND] Freezer, CPU hotplug, x86 Microcode: Fix
task freezing failures
On Fri, Oct 07, 2011 at 12:48:17PM -0400, Srivatsa S. Bhat wrote:
> Boris, it is only now (after you explained) that I really understood why you
> saw value in this patch (even though it was not the proper fix). So actually
> this patch is just a good-to-have cpu hotplug optimization, but the real fix
> would be the exclusion approach. More than that, this patch has got nothing
> intentional to do with freezer, but its motivation is just to avoid doing
> something needless in the cpu hotplug path. And an entirely different patch
> (having the exclusion stuff) is needed to properly fix the problem we are
> facing. This is what you mean right?
>
> If so, then in a way we are trying to reposition why we need this patch. And
> since we don't want to position this as a fix to this problem, I should
> probably submit this patch with a different patch description and subject,
> to explain the new usecase/motivation for this patch. Am I right?
Absolutely, right on the money. Just write a short commit message
explaining why it does what it does and send it to x86 guys. Thanks for
that.
> By the way, I too believe that the exclusion approach is the best fix for the
> problem. (I have been mulling this over in some of my previous mails as well).
> At least we can see 3 call paths that get into trouble when racing with
> freezer:
> 1. CPU hotplug.
> 2. Microcode module load/unload.
> 3. Reloading the microcode by controlling the sysfs file
> /sys/devices/system/cpu/cpu*/microcode/reload. See below for log for this new
> scenario.
Please note that microcode is not supposed to be reloaded that often, and
certainly not while the box is being suspended, which is what your test
does. So I don't really consider this a relevant case - normally, you
either update your microcode XOR hibernate the box. Besides, microcode
updates don't come out on a monthly basis, so there is no need for such
frequent reloads.
> At least this is what I got from looking at the microcode call paths that
> involve a call to request_firmware.
>
> I am still working on implementing the mutual exclusion at appropriate
> places. However, since any of this would involve locking, with the
> freezer/suspend involved as well (and especially since cpu hotplug is
> used by the suspend code itself), I am trying to tread cautiously
> (read: needing more time) to ensure that I don't introduce incorrect locking
> scenarios and hence task freezing failures myself, while intending to fix it.
IMO, you should concentrate on fixing _your_ use case, i.e. where your
testing fails, instead of trying to cover all hypothetical failure
scenarios. Covering all of them is practically impossible and, besides,
doing the ucode loading through the boot loader should take care of all
of those later.
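
For reference, the exclusion idea mentioned above can be modelled in
userspace roughly as follows. This is only a sketch and not kernel code:
the lock `pm_exclusion` and the functions `reload_microcode()` and
`freeze_tasks()` are hypothetical names made up for illustration, and the
real fix may well look different. The point is just that the reload path
backs off instead of running while a freeze is in progress.

```python
import threading

# Hypothetical lock shared by the freezer and the microcode reload path.
pm_exclusion = threading.Lock()

# Ordered record of what happened, for inspection.
events = []


def reload_microcode(cpu):
    """Model of a microcode reload that must not overlap with freezing."""
    # Non-blocking attempt: back off if a freeze/suspend is in progress.
    if not pm_exclusion.acquire(blocking=False):
        events.append(("reload-skipped", cpu))
        return False
    try:
        # Stands in for the path that ends up calling request_firmware().
        events.append(("reload", cpu))
        return True
    finally:
        pm_exclusion.release()


def freeze_tasks(work):
    """Model of the freezer holding the lock for the whole freeze window."""
    with pm_exclusion:
        events.append(("freeze-begin", None))
        result = work()  # anything attempted while tasks are being frozen
        events.append(("freeze-end", None))
        return result
```

A reload attempted outside of the freeze window goes through, while one
attempted inside it (e.g. `freeze_tasks(lambda: reload_microcode(1))`) is
skipped and reports failure to the caller.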
Thanks.
--
Regards/Gruss,
Boris.
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551