Message-ID: <b1591f1d-ddd6-1cd5-afd6-c42eb4671a03@linux.ibm.com>
Date:   Wed, 20 Nov 2019 13:34:51 -0800
From:   Tyrel Datwyler <tyreld@...ux.ibm.com>
To:     Michael Ellerman <mpe@...erman.id.au>,
        Chen Wandun <chenwandun@...wei.com>,
        linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
        mahesh@...ux.vnet.ibm.com, paulus@...ba.org
Subject: Re: [PATCH] powerpc/pseries: remove variable 'status' set but not
 used

On 11/18/19 9:53 PM, Michael Ellerman wrote:
> Chen Wandun <chenwandun@...wei.com> writes:
>> Fixes gcc '-Wunused-but-set-variable' warning:
>>
>> arch/powerpc/platforms/pseries/ras.c: In function ras_epow_interrupt:
>> arch/powerpc/platforms/pseries/ras.c:319:6: warning: variable status set but not used [-Wunused-but-set-variable]
> 
> Thanks for the patch.
> 
> But it almost certainly is wrong to not check the status.

Agreed, I started drafting a NACK response, but got sidetracked.

> 
> It's calling firmware and just assuming that the call succeeded. It then
> goes on to use the result that should have been written by firmware, but
> is now potentially random junk.
> 
> So I'd much rather a patch to change it to check the status.

+1

> 
>> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
>> index 1d7f973..4a61d0f 100644
>> --- a/arch/powerpc/platforms/pseries/ras.c
>> +++ b/arch/powerpc/platforms/pseries/ras.c
>> @@ -316,12 +316,11 @@ static irqreturn_t ras_hotplug_interrupt(int irq, void *dev_id)
>>  /* Handle environmental and power warning (EPOW) interrupts. */
>>  static irqreturn_t ras_epow_interrupt(int irq, void *dev_id)
>>  {
>> -	int status;
>>  	int state;
>>  	int critical;
>>  
>> -	status = rtas_get_sensor_fast(EPOW_SENSOR_TOKEN, EPOW_SENSOR_INDEX,
>> -				      &state);
>> +	rtas_get_sensor_fast(EPOW_SENSOR_TOKEN, EPOW_SENSOR_INDEX,
>> +			     &state);
> 
> This is calling a helper which already does some translation of the
> return value, any value < 0 indicates an error.

There are three possible architected failures here: hardware error, non-existent
sensor, and a DR isolation error, which would be reported in the status as -EIO,
-EINVAL, and -EFAULT respectively. Further, the EPOW sensor is required and is
not a DR entity, so we can never get -EINVAL or -EFAULT (barring broken
firmware). This leaves -EIO (HARDWARE_ERROR), and as I mention further down,
that will generate its own error log in response. So, I don't think we need to
do any reporting here; we can just return.
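
To make that concrete, here is a minimal userspace sketch of the early-return
pattern. It is not the actual kernel change: fake_get_sensor_fast() and the
fake_status knob are made up for illustration and only stand in for
rtas_get_sensor_fast(), and the bool return replaces the irqreturn_t that the
real handler would return.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical knob: the status the stubbed firmware call will report. */
int fake_status;

/*
 * Hypothetical stand-in for rtas_get_sensor_fast(): *state is written
 * only when the call succeeds.
 */
int fake_get_sensor_fast(int *state)
{
	if (fake_status < 0)
		return fake_status;	/* e.g. -EIO for a hardware error */
	*state = 3;			/* arbitrary sensor value */
	return 0;
}

/* Returns true if the handler went on to use the sensor state. */
bool epow_handler_sketch(void)
{
	int state;
	int status;

	status = fake_get_sensor_fast(&state);
	if (status < 0)
		return false;	/* state is not valid: no reporting, just bail */

	/* ... the real handler would consume state here ... */
	(void)state;
	return true;
}
```

The point of the sketch is only that state is never read on the failure path.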

> 
>> @@ -330,12 +329,12 @@ static irqreturn_t ras_epow_interrupt(int irq, void *dev_id)
>>  
>>  	spin_lock(&ras_log_buf_lock);
>>  
>> -	status = rtas_call(ras_check_exception_token, 6, 1, NULL,
>> -			   RTAS_VECTOR_EXTERNAL_INTERRUPT,
>> -			   virq_to_hw(irq),
>> -			   RTAS_EPOW_WARNING,
>> -			   critical, __pa(&ras_log_buf),
>> -				rtas_get_error_log_max());
>> +	rtas_call(ras_check_exception_token, 6, 1, NULL,
>> +		  RTAS_VECTOR_EXTERNAL_INTERRUPT,
>> +		  virq_to_hw(irq),
>> +		  RTAS_EPOW_WARNING,
>> +		  critical, __pa(&ras_log_buf),
>> +		  rtas_get_error_log_max());
> 
> This is directly calling firmware.
> 
> As documented in LoPAPR, a negative status indicates an error, 0
> indicates a new error log was found (ie. the function should continue),
> or 1 there was no error log (ie. nothing to do).

It is highly unlikely that we will find no new error log, since we are processing
an interrupt that supposedly fired to tell us there is a new one. However,
ras_log_buf is never zeroed, so in the unlikely case that there is no new error
log we would parse stale data from the previous log. Better safe than sorry:
just return in that case.

In the case of an error the only error code we supposedly can get here is -1
(HARDWARE_ERROR), and the RTAS handling will generate an error log in response
to that. So, I don't think we need to report anything here. I would suggest for
the (status != 0) case that you just return.
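
As a rough illustration of the (status != 0) handling, here is a small userspace
model, again not the real kernel code. sketch_log_buf, sketch_logged,
sketch_check_exception() and sketch_epow_log() are hypothetical names standing in
for ras_log_buf, the log_error() call, the check-exception rtas_call(), and the
tail of ras_epow_interrupt().

```c
#include <string.h>

#define SKETCH_LOG_MAX 32

/* Hypothetical stand-in for ras_log_buf; never zeroed between calls. */
char sketch_log_buf[SKETCH_LOG_MAX];
/* Counts how often the log_error() stand-in would have run. */
int sketch_logged;

/*
 * Hypothetical stand-in for the check-exception call.
 * status < 0: error, 0: new error log written, 1: no error log found.
 * The buffer is only updated when a new log is delivered.
 */
int sketch_check_exception(int status, const char *new_log)
{
	if (status == 0) {
		strncpy(sketch_log_buf, new_log, SKETCH_LOG_MAX - 1);
		sketch_log_buf[SKETCH_LOG_MAX - 1] = '\0';
	}
	return status;
}

void sketch_epow_log(int fw_status, const char *new_log)
{
	int status = sketch_check_exception(fw_status, new_log);

	if (status != 0)
		return;	/* error or no new log: buffer may hold stale data */

	sketch_logged++;	/* stands in for log_error(ras_log_buf, ...) */
}
```

In the real handler the early return would also have to drop ras_log_buf_lock,
which is taken just before the call.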

-Tyrel

>
> cheers
> 
>>  	log_error(ras_log_buf, ERR_TYPE_RTAS_LOG, 0);
>>  
>> -- 
>> 2.7.4
