lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 20 Jul 2022 10:52:25 +0000
From:   Arseniy Krasnov <>
To:     Stefano Garzarella <>
CC:     "David S. Miller" <>,
        Jakub Kicinski <>,
        Paolo Abeni <>,
        Stefan Hajnoczi <>,
        "Michael S. Tsirkin" <>,
        Krasnov Arseniy <>,
        "" <>,
        "" <>,
        "" <>,
        kernel <>
Subject: Re: [RFC PATCH v1 0/3] virtio/vsock: use SO_RCVLOWAT to set

On 20.07.2022 12:30, Stefano Garzarella wrote:
> On Wed, Jul 20, 2022 at 06:07:47AM +0000, Arseniy Krasnov wrote:
>> On 19.07.2022 15:58, Stefano Garzarella wrote:
>>> On Mon, Jul 18, 2022 at 08:12:52AM +0000, Arseniy Krasnov wrote:
>>>> Hello,
>>>> during my experiments with zerocopy receive, i found, that in some
>>>> cases, poll() implementation violates POSIX: when socket has non-
>>>> default SO_RCVLOWAT(e.g. not 1), poll() will always set POLLIN and
>>>> POLLRDNORM bits in 'revents' even number of bytes available to read
>>>> on socket is smaller than SO_RCVLOWAT value. In this case,user sees
>>>> POLLIN flag and then tries to read data(for example using  'read()'
>>>> call), but read call will be blocked, because  SO_RCVLOWAT logic is
>>>> supported in dequeue loop in af_vsock.c. But the same time,  POSIX
>>>> requires that:
>>>> "POLLIN     Data other than high-priority data may be read without
>>>>            blocking.
>>>> POLLRDNORM Normal data may be read without blocking."
>>>> See, page 293.
>>>> So, we have, that poll() syscall returns POLLIN, but read call will
>>>> be blocked.
>>>> Also in man page socket(7) i found that:
>>>> "Since Linux 2.6.28, select(2), poll(2), and epoll(7) indicate a
>>>> socket as readable only if at least SO_RCVLOWAT bytes are available."
>>>> I checked TCP callback for poll()(net/ipv4/tcp.c, tcp_poll()), it
>>>> uses SO_RCVLOWAT value to set POLLIN bit, also i've tested TCP with
>>>> this case for TCP socket, it works as POSIX required.
>>> I tried to look at the code and it seems that only TCP complies with it or am I wrong?
>> Yes, i checked AF_UNIX, it also don't care about that. It calls skb_queue_empty() that of
>> course ignores SO_RCVLOWAT.
>>>> I've added some fixes to af_vsock.c and virtio_transport_common.c,
>>>> test is also implemented.
>>>> What do You think guys?
>>> Nice, thanks for fixing this and for the test!
>>> I left some comments, but I think the series is fine if we will support it in all transports.
>> Ack
>>> I'd just like to understand if it's just TCP complying with it or I'm missing some check included in the socket layer that we could reuse.
>> Seems sock_poll() which is socket layer entry point for poll() doesn't contain any such checks
>>> @David, @Jakub, @Paolo, any advice?
>>> Thanks,
>>> Stefano
>> PS: moreover, i found one more interesting thing with TCP and poll: TCP receive logic wakes up poll waiter
>> only when number of available bytes > SO_RCVLOWAT. E.g. it prevents "spurious" wake ups, when poll will be
>> woken up because new data arrived, but POLLIN to allow user dequeue this data won't be set(as amount of data
>> is too small).
>> See tcp_data_ready() in net/ipv4/tcp_input.c
> Do you mean that we should call sk->sk_data_ready(sk) checking SO_RCVLOWAT?
Yes, like tcp_data_read().
> It seems fine, maybe we can add vsock_data_ready() in af_vsock.c that transports should call instead of calling sk->sk_data_ready(sk) directly.
Yes, this will also update logic in vmci and hyperv transports
> Then we can something similar to tcp_data_ready().
> Thanks,
> Stefano

Powered by blists - more mailing lists