netdev - Re: [PATCH v2 net-next 4/5] xps_flows: XPS for packets that don't have a socket

Message-ID: <CALx6S34xHzoSNHvmFLsUyge=eptWpv2rRPspd22C=97vKRmHrA@mail.gmail.com>
Date:   Thu, 29 Sep 2016 16:26:30 -0400
From:   Tom Herbert <tom@...bertland.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Kernel Team <kernel-team@...com>,
        Rick Jones <rick.jones2@....com>,
        Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: [PATCH v2 net-next 4/5] xps_flows: XPS for packets that don't
 have a socket

On Thu, Sep 29, 2016 at 11:15 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Thu, 2016-09-29 at 07:51 -0700, Eric Dumazet wrote:
>> On Thu, 2016-09-29 at 10:08 -0400, Tom Herbert wrote:
>>
>> > It addresses the issue that Rick Jones pointed out was happening with
>> > XPS. When packets are sent for a flow that has no socket and XPS is
>> > enabled, each packet uses the XPS queue based on the running CPU.
>> > Since the thread sending on a flow can be rescheduled on different
>> > CPUs, this creates out-of-order (ooo) packets. In this case the ooo is
>> > caused by interaction with XPS.
>> >
>>
>> Nope, your patch does not address the problem properly.
>>
>> I am not sure I want to spend more time explaining the issue.
>>
>> Let's talk about this in Tokyo next week.
>>
>
> Just as a reminder, sorry to bother you, stating some obvious facts for
> both of us. We have public exchanges, so we also need to re-explain how
> things work.
>
> Queue selection on xmit happens before we hit the qdisc and its delays.
>
> So when you access txq->dql.num_completed_ops and
> txq->dql.num_enqueue_ops you can observe values that do not change for a
> while.
>
> Say a thread runs on a VM, and sends 2 packets P1, P2 on the same flow
> (skb_get_hash() returns the same value for these 2 packets)
>
> P1 is sent on behalf of CPU 1, we pick up queue txq1, and queue the
> packet on its qdisc. Transmit does not happen because of some
> constraints like rate limiting or scheduling constraints.
>
> P2 is sent on behalf of CPU 2, we pick up queue txq2, notice that the
> prior packet chose txq1. We check txq1->dql and decide it is fine to use
> txq2, since the dql params of txq1 were not changed yet.
>
> ( txq->dql.num_completed_ops == ent.queue_ptr )
>
> Note that in RFS case, we have the guarantee that we observe 'live
> queues' since they are the per cpu backlog.
>
> So input_queue_head_incr() and input_queue_tail_incr_save() are
> correctly doing the OOO prevention, because a queued packet immediately
> changes the state.
>
> So really your patch works if you have no qdisc, or a non-congested
> qdisc. (Think if P1 is dropped by a full pfifo or pfifo_fast: we really
> want to avoid steering P2, P3, ..., PN on this full pfifo while maybe
> other txq are idle). Strange attractors are back (check commit
> 9b462d02d6dd6)
>
Understood.
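
To make the P1/P2 scenario above concrete, here is a toy userspace model.
It is only an illustration, not code from the patch: every toy_* name is
hypothetical, and the migration test is an assumption loosely modeled on
the quoted (num_completed_ops == queue_ptr) check. The point it shows is
that a packet parked in the qdisc leaves the dql-style counters untouched,
so the check cannot see it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_TXQ 2

/* Toy stand-in for a TX queue: the qdisc backlog is invisible to selection. */
struct toy_txq {
	unsigned int qdisc_backlog;     /* packets parked in the qdisc */
	unsigned int num_enqueue_ops;   /* bumped only when the driver gets a packet */
	unsigned int num_completed_ops; /* bumped on TX completion */
};

/* Toy per-flow entry, indexed by flow hash in a real table. */
struct toy_flow_ent {
	int queue_index;        /* queue used by the previous packet, -1 if none */
	unsigned int queue_ptr; /* num_enqueue_ops sampled at selection time */
};

/* Queue selection runs before the qdisc, so it only sees the counters. */
static int toy_pick_queue(struct toy_flow_ent *ent, const struct toy_txq *txqs,
			  int cpu_queue)
{
	int prev = ent->queue_index;
	int chosen = cpu_queue;

	assert(cpu_queue >= 0 && cpu_queue < TOY_NUM_TXQ);

	if (prev >= 0 && prev != cpu_queue) {
		const struct toy_txq *old = &txqs[prev];

		/* Stay on the old queue only if recorded work is still pending. */
		if ((int)(old->num_completed_ops - ent->queue_ptr) < 0)
			chosen = prev;
	}

	ent->queue_index = chosen;
	ent->queue_ptr = txqs[chosen].num_enqueue_ops;
	return chosen;
}

/* Returns true if P2 moves to another queue while P1 is still in the qdisc. */
static bool toy_scenario_reorders(void)
{
	struct toy_txq txqs[TOY_NUM_TXQ] = { { 0 }, { 0 } };
	struct toy_flow_ent ent = { .queue_index = -1, .queue_ptr = 0 };
	int q1, q2;

	q1 = toy_pick_queue(&ent, txqs, 0); /* P1 sent on behalf of CPU 1 */
	txqs[q1].qdisc_backlog++;           /* held by the qdisc, counters unchanged */

	q2 = toy_pick_queue(&ent, txqs, 1); /* P2 sent on behalf of CPU 2 */

	return q1 != q2 && txqs[q1].qdisc_backlog > 0;
}
```

In this model toy_scenario_reorders() returns true: P2 is allowed onto the
second queue while P1 is still waiting in the first queue's qdisc, which
is the reordering window described above.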

> You could avoid (ab)using BQL with a different method, grabbing
> skb->destructor for the packets that are socketless
>
> The hash table would simply track the sum of skb->truesize to allow flow
> migration. This would be self contained and not intrusive.
>
Okay, will look at that.
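
For reference, a rough userspace sketch of the skb->destructor idea
suggested above, under stated assumptions. All toy_* names, the table
size, and the "migrate only when the sum is zero" rule are hypothetical
choices for illustration, not the kernel API and not a proposed patch.
Each flow entry tracks the sum of truesize for packets still in flight,
and the destructor subtracts it when the packet is freed, wherever that
happens (qdisc drop or TX completion).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_FLOW_TABLE_SIZE 256 /* power of two, so the hash can be masked */

/* Toy per-flow entry: in-flight bytes gate migration to a new queue. */
struct toy_byte_ent {
	int queue_index;
	unsigned int inflight_bytes; /* sum of truesize not yet freed */
};

/* Toy stand-in for the few sk_buff fields the idea needs. */
struct toy_skb {
	unsigned int hash;
	unsigned int truesize;
	struct toy_byte_ent *flow;
	void (*destructor)(struct toy_skb *skb);
};

static struct toy_byte_ent toy_table[TOY_FLOW_TABLE_SIZE];

/* Runs when the packet is freed, whether it was sent or dropped. */
static void toy_flow_destructor(struct toy_skb *skb)
{
	assert(skb->flow && skb->flow->inflight_bytes >= skb->truesize);
	skb->flow->inflight_bytes -= skb->truesize;
	skb->flow = NULL;
}

/* Pick a queue for a socketless packet; cpu_queue is what XPS would choose. */
static int toy_select_queue(struct toy_skb *skb, int cpu_queue)
{
	struct toy_byte_ent *ent =
		&toy_table[skb->hash & (TOY_FLOW_TABLE_SIZE - 1)];

	/* Nothing of this flow is in flight: safe to follow the current CPU. */
	if (ent->inflight_bytes == 0)
		ent->queue_index = cpu_queue;

	ent->inflight_bytes += skb->truesize;
	skb->flow = ent;
	skb->destructor = toy_flow_destructor;
	return ent->queue_index;
}

/* Toy equivalent of freeing the packet. */
static void toy_free_skb(struct toy_skb *skb)
{
	if (skb->destructor)
		skb->destructor(skb);
}
```

With this model, a second packet of the same flow stays on the first
packet's queue for as long as the first one is queued anywhere, including
in a qdisc, and the flow may move only after everything has been freed.
Flows that collide on the same table entry would just delay each other's
migration in this sketch.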
