lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87sgibeqcs.fsf@nanos.tec.linutronix.de>
Date:   Sat, 14 Mar 2020 01:23:47 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Logan Gunthorpe <logang@...tatee.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        linux-kernel@...r.kernel.org
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>, Will Deacon <will@...nel.org>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Kurt Schwemmer <kurt.schwemmer@...rosemi.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org
Subject: Re: [PATCH 3/9] pci/switchtec: Don't abuse completion wait queue for poll

Logan Gunthorpe <logang@...tatee.com> writes:
> On 2020-03-13 11:46 a.m., Sebastian Andrzej Siewior wrote:
>> The poll callback is abusing the completion wait queue and sticks it into
>> poll_wait() to wake up pollers after a command has completed.
>> 
>> First of all it's a layering violation as it imposes restrictions on the
>> inner workings of completions. Just because C allows to do so does not
>> justify that in any way. The proper way to do such things is to post
>> patches which extend the core infrastructure and not by silently abusing
>> it.
>
> As I've said previously, I disagree with this approach.

Feel free to do s.

> Open coding standard primitives sweeps issues under the rug and is a
> step backwards for code quality. Calling it a layering violation is
> just one opinion and if it is, the better solution would be to create
> an interface you find appropriate so that it isn't one.

There is no standard primitive which allows to poll on a completion.

You decided that this is smart to do and just because C does not
allow to hide implementation details this is not a justification for
this at all.

Due to the limitations of C, the kernel has to rely on the assumption
that developers know and respect the difference between API and
implementation.

Relying on implementation details of an interface and then arguing that
this is a standard primitive for the chosen purpose is just backwards.

What's even more hillarious is that you now request that we give you a
replacement interface for something which was not an interface to use in
the first place.

>>  1) It cannot work with EPOLLEXCLUSIVE
>
> Why? You don't explain this. And I don't see how this patch would change
> anything to do with the call to poll_wait(). All you've done is
> open-code the completion.
>
> Not that it matters that much, having multiple waiters poll on this
> interface can pretty much never happen. It only makes sense for the
> process who submitted the write to poll on the interface.

It does not matter whether your envisioned usage implies that it cannot
happen. Fact is that there is no restriction. That means using this with
the well documented semantics of poll(2) will result in failure. 

>>  2) It's racy:
>> 
>>   poll()	      	  	 write()
>>    switchtec_dev_poll()		   switchtec_dev_write()
>>     poll_wait(&s->comp.wait);        mrpc_queue_cmd()
>>     				       init_completion(&s->comp)
>> 					 init_waitqueue_head(&s->comp.wait)
>
> That's a nice catch! But wouldn't an easier solution be to change the
> code to use reinit_completion() instead of using the bug to justify a
> different change?

Sure taht can be cured by changing it to reinit, but that does not cure
the abuse of implementation details. As Peter, who maintains the
completion code says:

  Relying on implementation details of locking primitives like that is
  yuck.

Thanks,

        tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ