lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 14 Mar 2018 12:37:52 -0700
From:   Vinicius Costa Gomes <vinicius.gomes@...el.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        randy.e.witt@...el.com, David Miller <davem@...emloft.net>,
        Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: [PATCH net-next 0/2] skbuff: Fix applications not being woken for errors

Hi,

Willem de Bruijn <willemdebruijn.kernel@...il.com> writes:

>> Another interesting fact is that if the POLLIN event is added to the
>> poll() .events, poll() no longer becomes stuck,
>
> The process has registered interest only in POLLIN, which the call to
> sk_data_read (sock_def_readable) will trigger.
>
>> and more interestingly
>> the returned event in .revents is only POLLERR.
>
> datagram_poll will set (E)POLLERR based on non-empty sk_error_queue.
>
>> After a few debugging sessions, we got to 'sock_queue_err_skb()' and
>> how it notifies applications of the error just enqueued. Changing it
>> to use 'sk->sk_error_report()', fixes the issue for hardware and
>> software timestamping. That is patch (2).
>>
>> The "solution" proposed in patch (2) looks like too big a hammer,
>
> It looks fine to me. POLLERR is returned regardless of the mask a
> process sets up in pollfd.events. So waking with sk_error_report
> will fix this while still waking callers waiting on POLLIN.
>
> Note that on sock_dequeue_err_skb, if another notification (of the
> right kind) is waiting, sk_error_report is already called instead of
> sk_data_ready.

Thank you all who did confirm that this was the right fix.

>
> This should perhaps go to net, instead of net-next (though not the
> test).

Will propose it to net, what I am thinking is that now that a few TSN
features are upstream, applications that use hardware timestamping (the
easiest way to trigger this bug) could be more common.

>
> If resending, a small nit in the test: please keep the alphabetical
> order in getopt. The filepath also looks a bit fishy, but git am applied
> the mbox from patchwork without issue.

Will send a second version.


Thanks,
--
Vinicius

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ