lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0BA3FCBA62E2DC44AF3030971E174FB346690100@hasmsx107.ger.corp.intel.com>
Date:	Thu, 14 Jul 2016 09:08:51 +0000
From:	"Grumbach, Emmanuel" <emmanuel.grumbach@...el.com>
To:	Kalle Valo <kvalo@...eaurora.org>,
	Prarit Bhargava <prarit@...hat.com>
CC:	Luca Coelho <luca@...lho.fi>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	linuxwifi <linuxwifi@...el.com>,
	"Berg, Johannes" <johannes.berg@...el.com>,
	"Ivgi, Chaya Rachel" <chaya.rachel.ivgi@...el.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"Sharon, Sara" <sara.sharon@...el.com>,
	"linux-wireless@...r.kernel.org" <linux-wireless@...r.kernel.org>
Subject: RE: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless
 ucode is loaded

> 
> Prarit Bhargava <prarit@...hat.com> writes:
> 
> > On 07/13/2016 03:24 AM, Luca Coelho wrote:
> >
> >> I totally agree with Emmanuel and Kalle. We should not change this.
> >> It is a design decision to return an error when the interface is
> >> down, this is very common with other subsystems as well.
> >
> > Please show me another subsystem or driver that does this.  I've
> > looked around the kernel but cannot find one that updates the firmware
> > and implements new features on the fly like this.  I have come across
> > several drivers that allow for an update, but they do not implement
> > new features based on the firmware.
> >
> > Additionally, what happens when someone back revs firmware versions
> > (which happens far more than you and I would expect)?  Does that mean
> > I now go from a functional system to a non-functional system wrt to
> userspace?
> 
> I'm not following, what do you mean exactly? Why are you talking updating
> the firmware?
> 
> So when we talk about "loading firmware" we mean that the driver pushes
> the firmware image to to the chipset. And then the interface is down the
> chipset is powered down and the RAM on it will be erased. That's the general
> idea anyway, I haven't checked how iwlwifi exactly works in this case but
> Luca or Emmanuel can correct me.

This is correct.


> 
> >> The userspace should be able to handle errors and report something
> >> like "unavailable" when this kind of error is returned.
> >
> > I myself have made the same arguments wrt to cpufreq code & bad
> > userspace choices.  I just went through this a few months back with
> > what went from a simple patch and turned out to be a hideous patch in
> > cpufreq.  You cannot break userspace like this.
> 
> Don't get me wrong, I'm a strong supporter of stable user space interfaces
> and I always try to adher to that. But there's a limit for everything. If I'm
> understanding correctly, what you mean is that the kernel should never
> return an error because an application doesn't handle errors gracefully.
> Sorry, but that doesn't make sense to me.
> 
> > See commit 51443fbf3d2c ("cpufreq: intel_pstate: Fix intel_pstate
> > powersave min_perf_pct value").  What should have been a trivial
> > change resulted in a massive change because of broken userspace.
> 
> In that cpufreq case I understand, it was about a combination of
> configuration values which broke the user space. But here we are just
> dealing with a simple error value, nothing fancy.
> 
> >> I'm not sure EIO is the best we can have, but for me that's exactly
> >> what it is.  The thermal zone *is* there, but cannot be accessed
> >> because the firmware is not available.  I'm okay to change it to
> >> EBUSY, if that would help userspace, but I think that's a bit
> >> misleading.  The device is not busy, on the contrary, it's not even running
> at all.
> >>
> >
> > I understand that, but by returning -EIO we end up with an error.
> >
> >> Furthermore, I don't think this is "breaking userspace" in the sense
> >> of being a regression.
> >
> > I run (let's say 4.5 kernel).  sensors works.  I update to 4.7.
> > sensors doesn't work.  How is that not a regression?  That's _exactly_
> > what it should be reported as.
> 
> Sure, it's a regression in a way. But that's how the user space app you are
> using is implemented, the same problem would happen with any driver
> returning errors.
> 
> >> The userspace API has always been implemented with the possibility of
> >> returning errors.  It's not a good design if a single device
> >> returning an error causes all the other devices to also fail.
> >>
> >
> > If that were the case we would never have to worry about "breaking
> userspace"?
> > For any kernel change I could just say that the userspace design was
> > bad and be done with it.  Why fix anything then?
> 
> Because we are talking about a simple error value.
> 
> > I don't see any harm in waiting to register the sysfs files for hwmon
> > until the firmware has been validated.
> 
> I'm against of that because it's bad software design. It's standard practise in
> Linux that drivers register their capabilities during driver probe time so that
> user space can query them whenever needed. I assume a properly behaving
> user space app would want to know about all the available sensors once the
> driver is initialised and your suggestion would break that.
> 
> > IIUC, the up/down'ing of the device doesn't happen that often (during
> > initial boot, and suspend/resume, switching wifi connections,
> > shutdown?).
> 
> Basically it can happen anytime, this is fully controlled by user space.
> There's no point of trying to make any assumptions as they won't hold
> anyway.
> 
> > This would make the iwlwifi community happy (IMO) and sensors would
> > still work. At the same time I could write a patch for lm-sensors to
> > fix this issue if it comes up in future versions.
> > [Aside: I'm going to have the reproducing system available today and
> > will test this out. It looks like just moving some code around.]
> 
> Another option, but still a bad one I don't like, is that you change the kernel
> interface to ignore all errors from drivers (like iwlwifi). This way drivers don't
> need to make ugly workarounds.
> 
> > The bottom line is that lm-sensors is currently broken with this
> > change in iwlwifi.  AFAICT, no other thermal device returns an error
> > this way, and IMO that means the iwlwifi driver is doing something new
> > and unexpected wrt to userspace.
> 
> I haven't checked but I suspect ath10k has a similar problem when interface
> is down.
> 
> --
> Kalle Valo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ