linux-kernel - Re: [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-Id: <b9136bed-c5c1-3c8d-a571-f0f1242d2f5b@linux.vnet.ibm.com>
Date:   Mon, 17 Apr 2017 16:49:41 +0530
From:   Mahesh Jagannath Salgaonkar <mahesh@...ux.vnet.ibm.com>
To:     Daniel Axtens <dja@...ens.net>,
        linuxppc-dev <linuxppc-dev@...abs.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        Michael Ellerman <mpe@...erman.id.au>
Subject: Re: [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in
 machine_check_early().

On 04/17/2017 04:09 PM, Daniel Axtens wrote:
> Hi Mahesh,
> 
>> Fixes: 27ea2c420cad powerpc: Set the correct kernel taint on machine check errors.
> 
> I notice this Fixes a commit I introduced. Please could you cc me when
> you do this? I am likely to miss it otherwise, especially since I have
> now left IBM.

Sure will do. :-)

> 
> Being cced allows me to provide an Ack or a review. And getting feedback
> on my changes is very helpful in becoming a better programmer.
> 
> In this case, as per Michael's comment, why don't we just move the
> add_taint from machine_check_early to
> machine_check_process_queued_event - the other side of the work queue.

Yes. That is what my plan is. Also, that is not the only place.
add_taint() need to be called from machine_check_exception() as well. So
it will be called from two places.

Thanks,
-Mahesh.

> 
> The work queue system is supposed to provide us with a safe place to do
> printing, etc., so it's an appropriate place. Also, we already do
> machine_check_print_event_info there, and adding the taint doesn't need
> to be done synchronously.
> 
> Regards,
> Daniel
> 
> Mahesh J Salgaonkar <mahesh@...ux.vnet.ibm.com> writes:
> 
>> From: Mahesh Salgaonkar <mahesh@...ux.vnet.ibm.com>
>>
>> machine_check_early() gets called in real mode. The very first time when
>> add_taint() is called, it prints a warning which ends up calling opal
>> call (that uses OPAL_CALL wrapper) for writing it to console. If we get a
>> very first machine check while we are in opal we are doomed. OPAL_CALL
>> overwrites the PACASAVEDMSR in r13 and in this case when we are done with
>> MCE handling the original opal call will use this new MSR on it's way
>> back to opal_return. This usually leads unexpected behaviour or kernel
>> to panic. Instead use the add_taint_no_warn() that does not call printk.
>>
>> This is broken with current FW level. We got lucky so far for not getting
>> very first MCE hit while in OPAL. But easily reproducible on Mambo.
>> This should go to stable as well alongwith patch 1/2.
>>
>> Signed-off-by: Mahesh Salgaonkar <mahesh@...ux.vnet.ibm.com>
>> ---
>>  arch/powerpc/kernel/traps.c |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
>> index 62b587f..4a048dc 100644
>> --- a/arch/powerpc/kernel/traps.c
>> +++ b/arch/powerpc/kernel/traps.c
>> @@ -306,7 +306,7 @@ long machine_check_early(struct pt_regs *regs)
>>  
>>  	__this_cpu_inc(irq_stat.mce_exceptions);
>>  
>> -	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
>> +	add_taint_no_warn(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
>>  
>>  	/*
>>  	 * See if platform is capable of handling machine check. (e.g. PowerNV
>