Date:   Tue, 26 Jun 2018 11:32:44 -0700
From:   Guenter Roeck <linux@...ck-us.net>
To:     Vadim Pasternak <vadimp@...lanox.com>
Cc:     Andrew Lunn <andrew@...n.ch>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "rui.zhang@...el.com" <rui.zhang@...el.com>,
        "edubezval@...il.com" <edubezval@...il.com>,
        "jiri@...nulli.us" <jiri@...nulli.us>, mlxsw <mlxsw@...lanox.com>,
        Michael Shych <michaelsh@...lanox.com>
Subject: Re: [patch net-next RFC 11/12] mlxsw: core: Extend hwmon interface
 with FAN fault attribute

On Tue, Jun 26, 2018 at 04:47:05PM +0000, Vadim Pasternak wrote:
> 
> 
> > -----Original Message-----
> > From: Guenter Roeck [mailto:linux@...ck-us.net]
> > Sent: Tuesday, June 26, 2018 7:33 PM
> > To: Vadim Pasternak <vadimp@...lanox.com>
> > Cc: Andrew Lunn <andrew@...n.ch>; davem@...emloft.net;
> > netdev@...r.kernel.org; rui.zhang@...el.com; edubezval@...il.com;
> > jiri@...nulli.us; mlxsw <mlxsw@...lanox.com>; Michael Shych
> > <michaelsh@...lanox.com>
> > Subject: Re: [patch net-next RFC 11/12] mlxsw: core: Extend hwmon interface
> > with FAN fault attribute
> > 
> > On Tue, Jun 26, 2018 at 02:47:01PM +0000, Vadim Pasternak wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Andrew Lunn [mailto:andrew@...n.ch]
> > > > Sent: Tuesday, June 26, 2018 5:29 PM
> > > > To: Vadim Pasternak <vadimp@...lanox.com>
> > > > Cc: davem@...emloft.net; netdev@...r.kernel.org; linux@...ck-us.net;
> > > > rui.zhang@...el.com; edubezval@...il.com; jiri@...nulli.us; mlxsw
> > > > <mlxsw@...lanox.com>; Michael Shych <michaelsh@...lanox.com>
> > > > Subject: Re: [patch net-next RFC 11/12] mlxsw: core: Extend hwmon
> > > > interface with FAN fault attribute
> > > >
> > > > > +static ssize_t mlxsw_hwmon_fan_fault_show(struct device *dev,
> > > > > +					  struct device_attribute *attr,
> > > > > +					  char *buf)
> > > > > +{
> > > > > +	struct mlxsw_hwmon_attr *mlwsw_hwmon_attr =
> > > > > +			container_of(attr, struct mlxsw_hwmon_attr,
> > > > dev_attr);
> > > > > +	struct mlxsw_hwmon *mlxsw_hwmon = mlwsw_hwmon_attr->hwmon;
> > > > > +	char mfsm_pl[MLXSW_REG_MFSM_LEN];
> > > > > +	u16 tach;
> > > > > +	int err;
> > > > > +
> > > > > +	mlxsw_reg_mfsm_pack(mfsm_pl, mlwsw_hwmon_attr->type_index);
> > > > > +	err = mlxsw_reg_query(mlxsw_hwmon->core, MLXSW_REG(mfsm),
> > > > mfsm_pl);
> > > > > +	if (err) {
> > > > > +		dev_err(mlxsw_hwmon->bus_info->dev, "Failed to query
> > > > fan\n");
> > > > > +		return err;
> > > > > +	}
> > > > > +	tach = mlxsw_reg_mfsm_rpm_get(mfsm_pl);
> > > > > +
> > > > > +	return sprintf(buf, "%u\n", (tach < mlxsw_hwmon->tach_min) ? 1 :
> > > > > +0); }
> > > >
> > > > Documentation/hwmon/sysfs-interface says:
> > > >
> > > > Alarms are direct indications read from the chips. The drivers do
> > > > NOT make comparisons of readings to thresholds. This allows
> > > > violations between readings to be caught and alarmed. The exact
> > > > definition of an alarm (for example, whether a threshold must be met
> > > > or must be exceeded to cause an alarm) is chip-dependent.
> > > >
> > > > Now, this is a fault, not an alarm. But does the same apply?
> > >
> > Yes, it does. There are no "soft" alarms / faults.
> > 
> > > Hi Andrew,
> > >
> > > Hardware provides a minimum value for the tachometer.
> > > A tachometer is considered faulty if its reading is below this value.
> > 
> > This is for user space to decide, not for the kernel.
> 
> Hi Guenter,
> 
> Do you suggest exposing fan{x}_min instead of fan{x}_fault, and
> letting the user compare fan{x}_input against fan{x}_min to make the
> fault decision?
> 

fanX_min only makes sense if programmed into or reported by the chip
or controller (that is what the attribute is for), usually to enable
the chip/controller to set an alarm. If the chip or controller does
not have a minimum speed register, the attribute should not exist,
and any decision based on a comparison between a minimum fan speed
and the actual fan speed is a user space problem.

I don't know what the tach_min calculation is about, but setting
it to the minimum of all tachometer speeds (or of all reported
minimums?) is not the task of a hwmon driver. A hwmon driver
reports what it gets from hardware; the interpretation is up
to other parts of the system (e.g. userspace or the thermal
subsystem). That includes any software-based decision on whether
an alarm or fault should be reported.
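
As a minimal sketch of what "interpretation is up to userspace" could
look like: a helper reads the raw value the driver exports through a
fanX_input style attribute file, and a separate helper applies a locally
chosen policy threshold. The function names read_attr_long() and
fan_below_policy() and the threshold parameter are hypothetical,
introduced only for illustration; they are not part of the patch or of
any library.

```c
#include <stdio.h>

/*
 * Read one integer from a sysfs-style attribute file, e.g. a
 * fanX_input file under the hwmon device directory.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_attr_long(const char *path, long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * The policy lives in user space: policy_min_rpm is a local choice
 * (an assumption of this sketch), not a value reported by the driver.
 * Returns 1 if the fan is slower than the policy allows, else 0.
 */
static int fan_below_policy(long rpm, long policy_min_rpm)
{
	return rpm < policy_min_rpm;
}
```

A monitoring daemon could call read_attr_long() periodically on each
fan attribute and react when fan_below_policy() returns 1, keeping the
driver itself free of any such comparison.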

> > 
> > > In case any tachometer is faulty, PWM according to the system
> > > requirements should be set to 100% until the fault
> > 
> > system requirements. Again, this is for user space to decide.
> 
> 
> Yes, the user should decide in this case, and I wanted to provide
> fan{x}_fault to the user for this purpose. But the user could of course
> decide based on the input and min attributes instead.
> 
Note that "fault" and "alarm" have distinct meanings.
Many fan controllers can detect whether a fan is faulty (e.g. no sensor
connected or it is deemed faulty) or whether it just runs too slowly.
The typical remedy is also different: a slow fan may just need
more pwm or voltage, while a faulty fan needs to be replaced.
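
To illustrate the distinction, here is a rough userspace sketch that
maps the two indications to different remedies. It assumes the values
were already read from fanX_fault and fanX_alarm attributes reported by
the hardware; the enum and the function fan_choose_action() are
hypothetical names used only for this example.

```c
/* Possible reactions of a (hypothetical) userspace fan policy. */
enum fan_action {
	FAN_OK,        /* nothing to do */
	FAN_RAISE_PWM, /* fan runs too slowly: give it more pwm or voltage */
	FAN_REPLACE,   /* fan or sensor is faulty: needs replacement */
};

/*
 * Choose a remedy from the two hardware-reported flags.
 * A fault takes precedence, since raising pwm does not fix a broken fan.
 */
static enum fan_action fan_choose_action(int fault, int alarm)
{
	if (fault)
		return FAN_REPLACE;
	if (alarm)
		return FAN_RAISE_PWM;
	return FAN_OK;
}
```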

Guenter

> > 
> > > is not recovered (f.e. by physical replacing of bad unit).
> > > This is the motivation to expose fan{x}_fault in the way it's exposed.
> > >
> > > Thanks,
> > > Vadim.
> > >
> > > >
> > > >      Andrew
