linux-kernel - Re: [PATCH v1] firmware_class: encapsulate firmware loading status

Message-ID: <7db50bd3-40de-5a5f-336e-8603df702746@monom.org>
Date:   Thu, 18 Aug 2016 20:55:54 +0200
From:   Daniel Wagner <wagi@...om.org>
To:     "Luis R. Rodriguez" <mcgrof@...nel.org>
Cc:     Ming Lei <ming.lei@...onical.com>, linux-kernel@...r.kernel.org,
        Daniel Wagner <daniel.wagner@...-carit.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Takashi Iwai <tiwai@...e.de>,
        Kees Cook <keescook@...omium.org>,
        Dmitry Torokhov <dmitry.torokhov@...il.com>,
        Julia Lawall <julia.lawall@...6.fr>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jessica Yu <jeyu@...hat.com>, Jiri Kosina <jikos@...nel.org>,
        Miroslav Benes <mbenes@...e.cz>,
        Petr Mladek <pmladek@...e.com>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v1] firmware_class: encapsulate firmware loading status

On 18.08.2016 18:30, Luis R. Rodriguez wrote:
> On Wed, Aug 17, 2016 at 08:47:24AM +0200, Daniel Wagner wrote:
>> On 08/10/2016 08:52 PM, Luis R. Rodriguez wrote:
>> The current 'state machine' uses three variables to handle the state
>> and the transitions.
>>
>> struct completion {
>> 	unsigned int done;
>> 	wait_queue_head_t wait;
>> };
>>
>> struct firmware_buf {
>> 	...
>> 	struct completion completion;
>> 	unsigned long status;
>> 	...
>> };
>>
>> Obviously, the variable 'status' holds the state. 'wait' and 'done'
>> handle the synchronization. 'done' remembers how many waiters will
>> be woken at most. complete_all() sets it to UMAX/2. That should be
>> enough in most cases.
> 
> Thanks, this helps and makes sense. How many data structures
> in comparison does the new swait require ? Is it smaller ? If
> so that is a nice simplification indeed, however we should make
> sure we have no compromises then.

Yes, we save one 'unsigned int', namely the 'done' member of struct
completion. For an earlier version of this patch I did check the size
changes. While we save a little in the data section, the code section
increased slightly; IIRC it was around 60 bytes. I will do another
measurement.

>> So any future wait_for_completion() call will not block.
> 
> This I don't get, do you mean that if we have already UMAX/2
> waiters on a completion and another one comes in, it will not
> wait at all ?

Sorry, I think I just confused you here with an implementation detail.
Whenever wait_for_completion() is woken it checks if done > 0. Then it
will decrement the counter. complete() increases the counter and then
wakes the waiter. Basically it is comparable to a semaphore's put and get
operations. complete_all() just sets done to a large, almost infinite
value :)
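
To make the counter semantics concrete, here is a minimal userspace sketch
of the 'done' bookkeeping described above. It is only a model: the
model_* names are made up for illustration, there is no wait queue and no
real sleeping, and complete_all() is modeled as described in this mail
rather than copied from the kernel source.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of the 'done' counter only; no wait queue, no sleeping. */
struct model_completion {
	unsigned int done;
};

/* Like complete(): hand out one token, which lets exactly one waiter pass. */
static void model_complete(struct model_completion *c)
{
	assert(c->done < UINT_MAX);	/* guard against counter wrap-around */
	c->done++;
}

/* Like complete_all() as described above: an "almost infinite" token count. */
static void model_complete_all(struct model_completion *c)
{
	c->done = UINT_MAX / 2;
}

/*
 * The check a woken waiter performs: consume one token if there is one.
 * Returns true if the caller may proceed, false if it would go back to sleep.
 */
static bool model_try_wait(struct model_completion *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}
```

After model_complete() exactly one model_try_wait() succeeds; after
model_complete_all() roughly two billion of them do, which is the UMAX/2
limit mentioned above.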

> Is this documented well ? Either way clarifying exactly what is
> done here would be of huge help understanding the striking
> differences between a switch to the new API.

Obviously, there is Documentation/scheduler/completion.txt but the small
detail on UMAX/2 is not mentioned. I don't think it was considered to be
a real problem. I guess before you run into the problem of waking 2
billion threads you see other scaling issues first.

Note this has nothing to do with wait vs swait.

>> The patch just drops the 'done' completely because it is not
>> necessary. We have a waiter queue for all those pending waiters 
> 
> So there is no limit to waiters with the new API ?

Correct, the limit is gone, though I don't expect that there will ever be
so many firmware user helpers waiting that we hit the UMAX/2 limit.

>> and
>> as soon the final state is reached we just wake them up. The future
>> waiters will never be queued because we just check for the state
>> first.
> 
> I do not follow what this means, I take it here we are talking about
> possible race conditions between a wait and some work about to be
> done?

Let me reword that. I was not really concerned about race conditions
here. I was just trying to point out that we only check for the condition.

Either we have reached FW_STATUS_{DONE|ABORTED} and just continue, or we
put the thread to sleep and wait for the wake call. Because we check for
a single condition (status == FW_STATUS_{DONE|ABORTED}) in
swait_event_interruptible_timeout() we don't need any additional
synchronization. Come to think of it, that is why the mutex can be
removed.
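
A rough userspace sketch of that "check the condition first, sleep only
if it is not yet true" pattern follows. Everything prefixed with model_
is hypothetical and only stands in for the real firmware_class code; in
particular the "sleep" is simulated by a helper that counts the call and
pretends the loader reached a final state and woke us up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the FW_STATUS_* values. */
enum model_fw_status {
	MODEL_FW_STATUS_UNKNOWN,
	MODEL_FW_STATUS_LOADING,
	MODEL_FW_STATUS_DONE,
	MODEL_FW_STATUS_ABORTED,
};

struct model_fw_state {
	enum model_fw_status status;
	unsigned int sleeps;	/* how often a waiter actually had to block */
};

/* The single wait condition: status == DONE or ABORTED. */
static bool model_fw_is_final(const struct model_fw_state *s)
{
	return s->status == MODEL_FW_STATUS_DONE ||
	       s->status == MODEL_FW_STATUS_ABORTED;
}

/* Simulated sleep: count it, then act as if the loader finished and woke us. */
static void model_sleep_until_woken(struct model_fw_state *s,
				    enum model_fw_status wake_with)
{
	s->sleeps++;
	s->status = wake_with;
}

/* Check the condition first; only block while no final state is reached. */
static enum model_fw_status model_fw_wait(struct model_fw_state *s,
					  enum model_fw_status wake_with)
{
	assert(wake_with == MODEL_FW_STATUS_DONE ||
	       wake_with == MODEL_FW_STATUS_ABORTED);
	while (!model_fw_is_final(s))
		model_sleep_until_woken(s, wake_with);
	return s->status;
}
```

A second call to model_fw_wait() after the state is final returns
immediately without touching the sleep counter, which is the "future
waiters will never be queued" behaviour from my earlier mail.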

>> wait vs swait: The main difference between the two APIs is the
>> implementation. So it is pretty simple to switch from one to the
>> other. So why swait, I hear you asking. The swait implementation is
>> pretty simple, at the price that you can't do all the stuff that
>> wait offers. As long as you don't need the extra features of wait, just
>> go with swait.
> 
> OK so wait offers more features and its a kitchen sink of stuff,
> we only require a simple wait and swait is better and more light
> weight.

Yes, that summarizes it pretty well.

> The above number of waiters is still something I'd like
> a bit clarification on.

As I understand the firmware loader helper userland API there is only
one waiter.

>> While the above points are nice side effect the real reason is the
>> cleanup of the code and getting rid of the mutex operations.
> 
> This indeed is huge and this can better be reflected on the commit log.
> In fact I wonder if its possible to do the switch without the change
> to swait, and do the conversion to swait as a secondary step.

I'm not sure about it, because 'status' and the completion operations need
to be synchronized. I'll give it a try; I just haven't had time yet. It is
not about wait or swait, it's about completion vs s/wait.

>> I can try to split the patch into two steps. Let's see how this
>> works out. But I wouldn't mind if we go with this version :)
> 
> I understand -- however I have to ask as if its possible it makes
> things easier to review and makes two logical changes split up. This
> would in turn be easier to debug if there are issues.

Sure, I completely understand. BTW, I just updated the patch and avoided
moving loading_timeout. Now it no longer contains any hard-to-read
sections.

>>> o once you have only a conversion from old wait to new swait you can
>>>   inspect the delta and try to write SmPL grammar to see if you can
>>>   generalize the change, so grammar can do the change for other
>>>   use cases. Of course, you'd need first to look for the IRQ context,
>>>   and I wonder if that's possible. If there are however generic
>>>   benefits of swait over old wait when complete_all() is used (is
>>>   live patching one?) then this will be very handy.
>>
>> From my attempts to figure out the execution context with SmPL I
>> fear that is rather hard to achieve because you need to create a
>> call graph and track the state.
> 
> OK..

I know you have a far better understanding. We need to discuss this over
a beer :)

cheers,
daniel