lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <91eb2720-d933-f1fd-8d50-e9a81434545b@gmail.com>
Date:   Fri, 3 Jan 2020 12:50:35 -0800
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     netdev <netdev@...r.kernel.org>,
        Alexander Lobakin <alobakin@...nk.ru>,
        Russell King <rmk+kernel@...linux.org.uk>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Hauke Mehrtens <hauke@...ke-m.de>,
        Woojung Huh <woojung.huh@...rochip.com>,
        Microchip Linux Driver Support <UNGLinuxDriver@...rochip.com>,
        Sean Wang <sean.wang@...iatek.com>,
        Matthias Brugger <matthias.bgg@...il.com>,
        Vladimir Oltean <vladimir.oltean@....com>,
        Claudiu Manoil <claudiu.manoil@....com>,
        Stanislav Fomichev <sdf@...gle.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Song Liu <songliubraving@...com>,
        Petar Penkov <ppenkov@...gle.com>,
        Matteo Croce <mcroce@...hat.com>,
        Jakub Sitnicki <jakub@...udflare.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Paul Blakey <paulb@...lanox.com>,
        Yoshiki Komachi <komachi.yoshiki@...il.com>,
        open list <linux-kernel@...r.kernel.org>,
        "moderated list:ARM/Mediatek SoC support" 
        <linux-arm-kernel@...ts.infradead.org>,
        "moderated list:ARM/Mediatek SoC support" 
        <linux-mediatek@...ts.infradead.org>
Subject: Re: [PATCH net-next] net: dsa: Remove indirect function call for flow
 dissection

On 1/2/20 4:19 PM, Vladimir Oltean wrote:
> Hi Florian,
> 
> On Fri, 3 Jan 2020 at 01:39, Florian Fainelli <f.fainelli@...il.com> wrote:
>>
>> We only need "static" information to be given for DSA flow dissection,
>> so replace the expensive call to .flow_dissect() with an integer giving
>> us the offset into the packet array of bytes that we must de-reference
> 
> packet array? packed array?

Yes, packet array skb->data[] if you prefe

> 
>> to obtain the protocol number. The overhead was alreayd available from
> 
> already
> 
>> the dsa_device_ops structure so use that directly.
>>
>> The presence of a flow_dissect callback used to indicate that the DSA
>> tagger supported returning that information,we now encode this with a
>> proto_off value of DSA_PROTO_OFF_UNPSEC if the tagger does not support
> 
> UNSPEC
> 
>> providing that information yet.
>>
>> Signed-off-by: Florian Fainelli <f.fainelli@...il.com>
>> ---
> 
> Unfortunately I don't really understand the DSA implementations of flow_dissect.
> Is proto_off supposed to mean "the __be16 pointer difference A - B
> between A. the position of the real EtherType and B. the current
> skb->data (aka ETH_HLEN bytes into the frame, aka 2 bytes after the
> normal EtherType was supposed to be)"?
> Otherwise said, the offset in bytes between the real EtherType
> position and skb->data is 2 * (proto_off + 1).> Furthermore, the offset in bytes is exactly equal to the tagger
> overhead in bytes, unless it's a tag that doesn't push the EtherType
> to the right, such as the trailer tag.

The call path is the following on TX (e.g.: when you run a DHCP client),
so this is always hit for raw packets:

__sys_sendto
  packet_sendmsg
    packet_parse_headers
       __skb_flow_dissect

and on RX this is not hit by default until you configure a RFS map on
your DSA master network device (more on that below), then the call stack is:

napi_complete_done
  gro_normal_list
     netif_receive_skb_list_internal
        get_rps_cpu
           skb_get_hash
              __skb_get_hash
                  __skb_flow_dissect

and this is called from the DSA master's RX path, so with an Ethernet
frame that has the DSA switch tag, and for which eth_type_trans() has
already been called so the SKB has already been pulled by ETH_HLEN and
skb->protocol is ETH_P_XDSA.

I don't think your formula works for EDSA which has an EtherType, but
this would probably work for all tags we currently support except trailer.

proto = (__be16 *)(skb->data)[overhead / 2 - 1];

> 
> If the above is indeed correct, can you just skip DSA_PROTO_OFF_UNSPEC
> and add proper proto_off values "in blind" for all taggers? I think
> it's rather safe to assume that they all push the EtherType to the
> right with the exception of the trailer tag, which will have an offset
> of -1 in terms of __be16 pointers, by the way (so your -1 encoding of
> DSA_PROTO_OFF_UNSPEC won't work for it anyway).
> 
> Also, documenting the unit of measurement for proto_off would really
> go a long way.
> 
> What is a good test that the flow_dissector does what it's supposed to
> do with DSA?

The commit that introduced flow dissection meant to fix it for the DSA
master network device (as you can see from the call trace), and this was
presumably meant to be used to steer traffic onto different RX or TX
queues on the DSA master, which is IMHO the wrong location where it
should be done for a number of reasons, but mainly because DSA slave
network devices can inherit the number of RX queues of their DSA master
and you can perform RFS there, in a standard way, without requiring
further changes down net/core/flow_dissect.c.

I don't think anyone except Alexander did serious investigation this.
For now, what I am interested in is reducing the amount of technical
debt and expensive function calls.
-- 
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ