Message-ID: <CAF=yD-K4auN9L=ijJpq+72XoUsmWiwiz2zCxkE7_7EJPBP=mjg@mail.gmail.com>
Date:   Thu, 18 Jan 2018 18:09:10 -0500
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Sowmini Varadhan <sowmini.varadhan@...cle.com>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        Network Development <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>, rds-devel@....oracle.com,
        santosh.shilimkar@...cle.com
Subject: Re: [PATCH RFC net-next 1/6] sock: MSG_PEEK support for sk_error_queue

On Thu, Jan 18, 2018 at 6:03 PM, Sowmini Varadhan
<sowmini.varadhan@...cle.com> wrote:
> On (01/18/18 17:54), Willem de Bruijn wrote:
>> > 2. If we have the option of passing completion-notification up as ancillary
>> >    data on the pollin/recvmsg channel itself (instead of MSG_ERRQUEUE)
>>
>> This assumes a somewhat symmetric workload, where there are enough recv
>> calls to reap the notification associated with the send calls.
>
> Your comment about the assumption is true, but at least for the database
> use cases, we have a request-response model, so the assumption works out.
> I don't know whether many other workloads that send large buffers have this
> pattern.

If that is true in general for PF_RDS, then it is a reasonable approach.
How about treating it as a (follow-on) optimization path? Opportunistic
piggybacking of notifications on data reads is more widely applicable.

>
>> I would stay with MSG_ERRQUEUE processing. One option is to pass data
>> up to userspace in the data portion of the notification skb instead of
>> encoding it in ancillary data, like tcp_get_timestamping_opt_stats.
>
> that's similar to what I have, except that it does not have the
> MSG_PEEK part (you'd need to enforce that the data portion
> is upper-bounded, and that the application has the responsibility
> of sending down "enough" buffer with recvmsg).

Right. I think that an upper bound is the simplest solution here.
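
To make the upper bound concrete, here is a rough userspace sketch of the
reader side. It assumes the notification payload arrives in the data
portion of the errqueue message as an array of 32-bit cookies; that layout,
the NOTIF_MAX_COOKIES bound, and the reap_notifications() helper are all
hypothetical and only illustrate the proposal, not any existing PF_RDS
interface. The caller supplies a buffer sized for the bound and treats
MSG_TRUNC as an error.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical upper bound on cookies per notification message. */
#define NOTIF_MAX_COOKIES 8

/*
 * Read one notification from the socket error queue without blocking.
 * Returns the number of cookies copied, 0 if the queue is empty, or -1
 * on error (errno is EMSGSIZE if the caller's buffer was too small).
 * The "array of uint32_t cookies in the data portion" layout is an
 * assumption made for this sketch.
 */
ssize_t reap_notifications(int fd, uint32_t *cookies, size_t max_cookies)
{
    char control[128];          /* room for ancillary data, unused here */
    struct iovec iov;
    struct msghdr msg;
    ssize_t n;

    iov.iov_base = cookies;
    iov.iov_len = max_cookies * sizeof(*cookies);

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    n = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
    if (n < 0)
        return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;

    /* The payload did not fit: the buffer was below the upper bound. */
    if (msg.msg_flags & MSG_TRUNC) {
        errno = EMSGSIZE;
        return -1;
    }
    return (ssize_t)((size_t)n / sizeof(*cookies));
}
```

With a fixed bound, the application can always pass a buffer of
NOTIF_MAX_COOKIES entries and never needs MSG_PEEK to learn the size first.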

By the way, if you allocate an skb immediately on page pinning, then
there are always sufficient skbs to store all notifications. On errqueue
enqueue, just drop the new skb and copy its notification to the body of
the skb already on the queue, if one exists and it has room. That is
essentially what the tcp zerocopy code does with the [data, info] range.
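
A minimal userspace model of that enqueue logic, for illustration only.
Plain arrays stand in for skbs and the error queue, and every name here
(notif_buf, err_queue, errq_enqueue, NOTIF_CAP, QUEUE_LEN) is hypothetical.
It appends cookies to the tail buffer, whereas the tcp zerocopy code
extends a range instead, so treat it as a sketch of the idea rather than
of that code.

```c
#include <stddef.h>
#include <stdint.h>

#define NOTIF_CAP 4   /* hypothetical upper bound on cookies per buffer */
#define QUEUE_LEN 8   /* hypothetical queue depth for this model */

/* Stands in for a notification skb whose body holds the cookies. */
struct notif_buf {
    uint32_t cookies[NOTIF_CAP];
    size_t count;
};

/* Stands in for sk_error_queue; bufs[len - 1] is the tail. */
struct err_queue {
    struct notif_buf bufs[QUEUE_LEN];
    size_t len;
};

/*
 * Queue one completion cookie.
 * Returns 1 if it was copied into the tail buffer (the new buffer would
 * be dropped), 0 if a new buffer was queued, -1 if the model queue is full.
 */
int errq_enqueue(struct err_queue *q, uint32_t cookie)
{
    struct notif_buf *nb;

    if (q->len > 0) {
        struct notif_buf *tail = &q->bufs[q->len - 1];

        if (tail->count < NOTIF_CAP) {
            /* Tail exists and has room: coalesce into it. */
            tail->cookies[tail->count++] = cookie;
            return 1;
        }
    }

    if (q->len == QUEUE_LEN)
        return -1;

    /* No tail, or tail is full: the new buffer goes on the queue. */
    nb = &q->bufs[q->len++];
    nb->count = 0;
    nb->cookies[nb->count++] = cookie;
    return 0;
}
```

Because each queued buffer holds at most NOTIF_CAP cookies, this also gives
the reader the upper bound discussed above.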

> Note that any one of these choices is OK with me; I have no
> special attachments to any of them.
