Message-ID: <CAF=yD-JXpiJwxM_mHvAgJ6qhsgq4uOZYbsMBVvcOmZawbueayQ@mail.gmail.com>
Date:   Sat, 19 May 2018 16:13:44 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Tom Herbert <tom@...bertland.com>
Cc:     Amritha Nambiar <amritha.nambiar@...el.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Alexander Duyck <alexander.h.duyck@...el.com>,
        Sridhar Samudrala <sridhar.samudrala@...el.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Hannes Frederic Sowa <hannes@...essinduktion.org>
Subject: Re: [net-next PATCH v2 2/4] net: Enable Tx queue selection based on
 Rx queues

On Fri, May 18, 2018 at 12:03 AM, Tom Herbert <tom@...bertland.com> wrote:
> On Tue, May 15, 2018 at 6:26 PM, Amritha Nambiar
> <amritha.nambiar@...el.com> wrote:
>> This patch adds support to pick Tx queue based on the Rx queue map
>> configuration set by the admin through the sysfs attribute
>> for each Tx queue. If the user configuration for receive
>> queue map does not apply, then the Tx queue selection falls back
>> to CPU map based selection and finally to hashing.
>>
>> Signed-off-by: Amritha Nambiar <amritha.nambiar@...el.com>
>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@...el.com>
>> ---
>>  include/net/sock.h       |   18 ++++++++++++++++++
>>  net/core/dev.c           |   36 +++++++++++++++++++++++++++++-------
>>  net/core/sock.c          |    5 +++++
>>  net/ipv4/tcp_input.c     |    7 +++++++
>>  net/ipv4/tcp_ipv4.c      |    1 +
>>  net/ipv4/tcp_minisocks.c |    1 +
>>  6 files changed, 61 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index 4f7c584..0613f63 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -139,6 +139,8 @@ typedef __u64 __bitwise __addrpair;
>>   *     @skc_node: main hash linkage for various protocol lookup tables
>>   *     @skc_nulls_node: main hash linkage for TCP/UDP/UDP-Lite protocol
>>   *     @skc_tx_queue_mapping: tx queue number for this connection
>> + *     @skc_rx_queue_mapping: rx queue number for this connection
>> + *     @skc_rx_ifindex: rx ifindex for this connection
>>   *     @skc_flags: place holder for sk_flags
>>   *             %SO_LINGER (l_onoff), %SO_BROADCAST, %SO_KEEPALIVE,
>>   *             %SO_OOBINLINE settings, %SO_TIMESTAMPING settings
>> @@ -215,6 +217,10 @@ struct sock_common {
>>                 struct hlist_nulls_node skc_nulls_node;
>>         };
>>         int                     skc_tx_queue_mapping;
>> +#ifdef CONFIG_XPS
>> +       int                     skc_rx_queue_mapping;
>> +       int                     skc_rx_ifindex;
>
> Isn't this increasing size of sock_common for a narrow use case functionality?

You can get the device from the already recorded sk_napi_id.
Sadly, not the queue number as far as I can see.


>> +static inline void sk_mark_rx_queue(struct sock *sk, struct sk_buff *skb)
>> +{
>> +#ifdef CONFIG_XPS
>> +       sk->sk_rx_ifindex = skb->skb_iif;
>> +       sk->sk_rx_queue_mapping = skb_get_rx_queue(skb);
>> +#endif
>> +}
>> +

Instead of adding this function and calls to it in many locations in
the stack, you can expand sk_mark_napi_id.

Also, it is not clear why this should be called in locations where
sk_mark_napi_id is not.
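
To make the suggestion concrete, here is a rough userspace sketch of what folding the Rx queue recording into sk_mark_napi_id could look like. It is not kernel code: struct skb_model, struct sock_model and sk_mark_napi_id_model are hypothetical stand-ins, and the check that skips the update when no Rx queue was recorded is an assumption added for illustration, not something in the patch. The sketch mirrors the kernel convention that skb->queue_mapping holds the Rx queue number plus one, with zero meaning "not recorded".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the sk_buff fields involved. */
struct skb_model {
	unsigned int napi_id;
	int skb_iif;            /* ifindex the packet arrived on */
	uint16_t queue_mapping; /* 0 = not recorded, else Rx queue + 1 */
};

/* Hypothetical, simplified stand-in for the sock fields involved. */
struct sock_model {
	unsigned int sk_napi_id;
	int sk_rx_ifindex;
	int sk_rx_queue_mapping; /* -1 while unknown */
};

/*
 * One helper records both the NAPI id and the Rx queue, so callers
 * that already mark the NAPI id need no additional call site.
 */
static void sk_mark_napi_id_model(struct sock_model *sk,
				  const struct skb_model *skb)
{
	assert(sk && skb);

	sk->sk_napi_id = skb->napi_id;

	/* Assumption: leave the old mapping alone if no queue was recorded. */
	if (skb->queue_mapping != 0) {
		sk->sk_rx_ifindex = skb->skb_iif;
		sk->sk_rx_queue_mapping = skb->queue_mapping - 1;
	}
}
```

With a single helper, the question of which call sites need the Rx queue recorded reduces to the existing question of where sk_mark_napi_id is called.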


>> +static int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
>> +{
>> +#ifdef CONFIG_XPS
>> +       enum xps_map_type i = XPS_MAP_RXQS;
>> +       struct xps_dev_maps *dev_maps;
>> +       struct sock *sk = skb->sk;
>> +       int queue_index = -1;
>> +       unsigned int tci = 0;
>> +
>> +       if (sk && sk->sk_rx_queue_mapping <= dev->real_num_rx_queues &&
>> +           dev->ifindex == sk->sk_rx_ifindex)
>> +               tci = sk->sk_rx_queue_mapping;
>> +
>> +       rcu_read_lock();
>> +       while (queue_index < 0 && i < __XPS_MAP_MAX) {
>> +               if (i == XPS_MAP_CPUS)
>
> This while loop typifies exactly why I don't think the XPS maps should
> be an array.

+1
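
For illustration only, the fallback order described in the commit message (Rx queue map, then CPU map, then hashing) can be written as straight-line code when the two maps are separate named members rather than elements of an array. This is a userspace sketch under assumptions: struct xps_map_model, struct dev_model, map_lookup and pick_tx_queue are hypothetical names, each map is reduced to a flat table with one Tx queue per index, and the modulo stands in for the kernel's real hash-based selection.

```c
#include <assert.h>
#include <stddef.h>

#define NO_QUEUE (-1)

/* Hypothetical flat map: entry[i] is a Tx queue, or NO_QUEUE if unset. */
struct xps_map_model {
	const int *entry;
	size_t len;
};

/* Hypothetical device model with the two maps as separate members. */
struct dev_model {
	struct xps_map_model rxqs_map; /* indexed by Rx queue */
	struct xps_map_model cpus_map; /* indexed by CPU */
	unsigned int real_num_tx_queues;
};

static int map_lookup(const struct xps_map_model *map, size_t idx)
{
	if (!map->entry || idx >= map->len)
		return NO_QUEUE;
	return map->entry[idx];
}

/* Try the Rx queue map, then the CPU map, then fall back to hashing. */
static int pick_tx_queue(const struct dev_model *dev, int rx_queue,
			 unsigned int cpu, unsigned int flow_hash)
{
	int q = NO_QUEUE;

	assert(dev && dev->real_num_tx_queues > 0);

	if (rx_queue >= 0)
		q = map_lookup(&dev->rxqs_map, (size_t)rx_queue);
	if (q == NO_QUEUE)
		q = map_lookup(&dev->cpus_map, cpu);
	if (q == NO_QUEUE)
		q = (int)(flow_hash % dev->real_num_tx_queues);

	return q;
}
```

Each fallback step is a plain conditional here, so there is no loop counter to compare against a map type inside the loop body.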
