lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALx6S344d16SVhO+FhXpyL09k4vSuv45F-bff-nMeqS4H=p+tw@mail.gmail.com>
Date:   Thu, 29 Sep 2016 10:08:14 -0400
From:   Tom Herbert <tom@...bertland.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Kernel Team <kernel-team@...com>,
        Rick Jones <rick.jones2@....com>,
        Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: [PATCH v2 net-next 4/5] xps_flows: XPS for packets that don't
 have a socket

On Thu, Sep 29, 2016 at 9:18 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Thu, 2016-09-29 at 08:53 -0400, Tom Herbert wrote:
>> On Thu, Sep 29, 2016 at 12:54 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
>> > On Wed, 2016-09-28 at 20:54 -0700, Tom Herbert wrote:
>> >> xps_flows maintains a per device flow table that is indexed by the
>> >> skbuff hash. The table is only consulted when there is no queue saved in
>> >> a transmit socket for an skbuff.
>> >>
>> >> Each entry in the flow table contains a queue index and a queue
>> >> pointer. The queue pointer is set when a queue is chosen using a
>> >> flow table entry. This pointer is set to the head pointer in the
>> >> transmit queue (which is maintained by BQL).
>> >>
>> >> The new function get_xps_flows_index that looks up flows in the
>> >> xps_flows table. The entry returned gives the last queue a matching flow
>> >> used. The returned queue is compared against the normal XPS queue. If
>> >> they are different, then we only switch if the tail pointer in the TX
>> >> queue has advanced past the pointer saved in the entry. In this
>> >> way OOO should be avoided when XPS wants to use a different queue.
>> >
>> > There is something I do not understand with this.
>> >
>> > If this OOO avoidance is tied to BQL, it means that packets sitting in a
>> > qdisc wont be part of the detection.
>> >
>> > So packets of flow X could possibly be queued on multiple qdiscs.
>> >
>> Hi Eric,
>>
>> I'm not sure I understand your concern. If packets for flow X can be
>> queued on multiple qdiscs wouldn't that mean that flow will be subject
>> to ooo transmission regardless of this patch? In other words here
>> we're trying to keep packets for the flow in order as they are emitted
>> from the qdiscs which we assume maintains ordering of packets in a
>> flow.
>
> Well, then what this patch series is solving ?
>
It addresses  the issue that Rick Jones pointed out was happening with
XPS. When packets are sent for a flow that has no socket and XPS is
enabled then each packet uses the XPS queue based on the running CPU.
Since the thread sending on a flow can be rescheduled on different
CPUs this is creating ooo packets. In this case the ooo is being
caused by interaction with XPS.

> You have a producer of packets running on 8 vcpus in a VM.
>
> Packets are exiting the VM and need to be queued on a mq NIC in the
> hypervisor.
>
> Flow X can be scheduled on any of these 8 vcpus, so XPS is currently
> selecting different TXQ.
>
> Your patch aims to detect this and steer packets to one TXQ, but not
> using a static hash() % num_real_queues (as RSS would have done on a NIC
> at RX), but trying to please XPS (as : selecting a queue close to the
> CPU producing the packet. VM with one vcpu, and cpu bounded, would
> really be happy to not spread packets all over the queues)
>
You're not describing this patch, you're describing how XPS works in
the first place. This patch is done to make socketless packets work
well with XPS. If all you want is a static hash() % num_real_queues
then just disable XPS to get that.

> Your mechanism relies on counters updated at the time packets are given
> to the NIC, not at the time packets are given to the txq (and eventually
> sit for a while in the qdisc right before BQL layer)
>
> dev_queue_xmit() is not talking to the device directly...
>
>
> All I am saying is that the producer/consumer counters you added are not
> at the right place.
>
> You tried hard to not change the drivers, by adding something to
> existing BQL. But packets can sit in a qdisc for a while...
>
But again, this patch is to prevent ooo being caused by an interaction
with XPS. It does not address ooo being caused by qdiscs or VMs, but
then I don't see that as being the issue reported by Rick.

> So you added 2 potential cache lines misses per outgoing packet, but
> with no guarantee that OOO are really avoided.
>
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ