linux-kernel - Re: [PATCH v9 1/5] firmware: add extensible driver data params

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170613194011.GI27288@wotan.suse.de>
Date:   Tue, 13 Jun 2017 21:40:11 +0200
From:   "Luis R. Rodriguez" <mcgrof@...nel.org>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     "Luis R. Rodriguez" <mcgrof@...nel.org>, wagi@...om.org,
        dwmw2@...radead.org, rafal@...ecki.pl,
        arend.vanspriel@...adcom.com, rjw@...ysocki.net,
        yi1.li@...ux.intel.com, atull@...nsource.altera.com,
        moritz.fischer@...us.com, pmladek@...e.com,
        johannes.berg@...el.com, emmanuel.grumbach@...el.com,
        luciano.coelho@...el.com, kvalo@...eaurora.org, luto@...nel.org,
        torvalds@...ux-foundation.org, keescook@...omium.org,
        takahiro.akashi@...aro.org, dhowells@...hat.com, pjones@...hat.com,
        hdegoede@...hat.com, alan@...ux.intel.com, tytso@....edu,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 1/5] firmware: add extensible driver data params

On Tue, Jun 13, 2017 at 11:05:48AM +0200, Greg KH wrote:
> On Mon, Jun 05, 2017 at 02:39:33PM -0700, Luis R. Rodriguez wrote:
> > As the firmware API evolves we keep extending functions with more arguments.
> > Stop this nonsense by proving an extensible data structure which can be used
> > to represent both user parameters and private internal parameters.
> 
> Let's take a simple C function interface and make it a more complex
> data-driven interface that is impossible to understand and obviously
> understand how it is to be used and works!

The firmware codebase was already complex!

What you have to ask yourself really is if this makes it *less complex* and
helps *clean things up* in a much better way than it was before. Also does it
allow us to *pave the way for new functionality easily*, without creating
further mess?

If not, what concrete alternatives do you suggest?

> :(
> 
> Seriously, why?  Why are we extending any of this at all? 

Addressing easy extensibility was a prerequisite of considering firmware signing
support. Its really the *only* reason I started reviewing the firmware API, to the
point I started helping fix quite a bit of bugs and now just became the maintainer.

So my original firmware signing effort, now driven by AKASHI Takahiro, was one
of the original motivations here!

I asked:

	How we should properly extend the API for new functionality in *a real
	clean way* ? This also means cleaning under the rug!

Ease of extensibility is a need we have as otherwise we end up in the same
situation as before, we end up having to add new functionality by extending the
number of arguments to an sync or async call, and as collateral have to changes
*tons* of callers. I think you and I agree on this, correct me if I'm wrong?

Another issue to address when deciding when there is merit for a new API call,
instead of just folding functionality into flags. For this its best consider
not only the future but also what was done in the past given some new desirable
functionality relies on existing mechanisms.

The approach Rafał took does what I do just that it forgets about cleaning
underneath the rug, so I really cannot see how its any different. Case in
point, we have functionality in place today to support no-cache, yet this is a
hidden feature, used implicitly only for one API call, there has been requests
to support the no-cache functionality by Johannes for iwlwifi long ago for the
async call given *they* do their own caching.  How do you suggest do we take
the no-cache functionality used only internally for another API call and apply
it for async?  Do we make a new exported symbol ? Or do we expose this as a
flag on a regular async call?  With the approach I take we expose the internal
functionality clearly and later Li Yi folds it as public API through a
parameter for the async API. Later he extends iwlwifi with just 2 lines of code
to get this functionality. This would later also be used by some other
functionality he's interested in!

Another good example is that the optional call request_firmware_direct() is for
sync, but async also has the same need --  do we add a new API call just for
the same purpose or do we fold some of this functionality as flags?

How do we properly keep evolving functionality for the firmware API? We have
to consider existing features, perhaps some hidden, and new features coming
down the pipe line.

Please take the time to think about all this for a bit, not only the past but
also all the incoming set of features in the pipeline! The FPGA streaming
support for instance reuses the no-cache "internal hidden feature", as well as
the existing request_firmware_into_buf() with some slight variations!  We could
just fold both requirements as parameters.

> This series adds a ton of new "features"

Actually, it *purposely* only folded the optional nature of firmware for async
and added support for enabling drivers to daisy chain requests using a simple
api numbering scheme as a counter to complex internal driver mechanisms --
Intel's solution actually used recursion. I split out this work on its own
series after a long history of working on firmware singing support. I'm now
also cleaning under the rug.

I purposely limited the amount of "features" to get proper review.

> and complexity,

The difficulty to properly understand the firmware code *was there before*, so
I really do believe its unfair for you to say what I have come up with is
complex, question is if its less complex, and *easier* to understand than what
we had before. I'd argue it is easier to understand given 2 developers have
been working off on top of it now and seem to grok it rather well.

Today's code base's complexity is one of the reasons we also we have had quite
a bit of regressions. One of the other issues was lack of proper documentation,
which I have been fixing as of late, but also clearly documenting the purpose
and limitations each new feature. The confusing #ifef'ery over some of the flag
options was also not helpful. This series did away with all that in preference
for annotating we have really 2 modes of operations with a set of features
which can be described in a proper structure, while also driving functionality
on par with functional testing.

> but for absolutely no gain.

I thought I had made the above problem pretty clear before, during some of the
last 9 iteration of the driver data API. Apologies if it was not.

The goal is to enable us to provide an easily extensible API to avoid
unnecessary collateral evolutions. It was only a subset goal of the firmware
signing effort. This in turn also drove me to consider how to clean under the
rug.

Where does one draws the line of a split of an API? After studying the code a
lot I chose that fine line between sync and async. Some even suggested we do
away with this and go with *one* API call, after all an async call really is
nothing more than a sync call with a schedule worker, however knowing easily if
a piece of code blocks or not in a more direct way in the kernel through seemed
important enough for me to draw a clear API distinction.

> Oh, I take it back, you removed 29 lines from the iwlwifi driver.

This I did simply to demo the gain of *one* possible feature.

I could have done something else. For instance, iwlwifi also can use the
no-cache feature. Do we expose a new async call for that alone? There is
also the firmware signing work which tons of people keep asking for.

If I would have bundled all pending features up into one series you would
have told me to break them down. This is why I just demo'd *one* use case.

The rest is in the pipeline, and folks are posting patches based on them.  I
ask you consider all that functionality in light of what we have today and what
that would look like today with our historical approach.

If you look at Yi Li's patchset for the no-cache feature you'd see that when
exposing *old internal functionality* its rather easy and straight forward to
review and understand [0], as another example consider then Yi LI's next
patchset which adds streaming support for FPGAs using the driver data API [1].

So the rest of the "features" you should consider would come with time as with
Yi Li or AKASHI's firmware signing patches [2]. I just added *two* example new
features to demo this. There rest should be considered by getting an idea
of how the API grows and compares to alternatives we have been practicing over
time.

[0] https://lkml.kernel.org/r/1495262819-981-1-git-send-email-yi1.li@linux.intel.com
[1] https://lkml.kernel.org/r/1495262948-1106-4-git-send-email-yi1.li@linux.intel.com
[2] https://lkml.kernel.org/r/20170526030609.1414-1-takahiro.akashi@linaro.org

> That's still not worth it at all, you have yet to sell me on this whole
> complex beast.  I can't see why we need it, and if I, one of the few
> people who thinks they actually understand this kernel interface, can't
> see it, how can you sell it to someone else?

Perhaps you have not tried to clean the code and also not cleanly tried to
extend it with all the functionality I have added. I invite you to do so, and
try accomplish the same I have in this series. Naturally I believe some part of
a different outcome it will be due to subjectivity, and if not I'd like to hear
very specific issues with what I propose.

> Sorry, but no, I'm still not going to take this series until you show
> some _REAL_ benefit for it.

I hope what I describe above helps.

  Luis