Message-ID: <20181126202844.GD30481@localhost.localdomain>
Date: Mon, 26 Nov 2018 18:28:44 -0200
From: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
To: Pablo Neira Ayuso <pablo@...filter.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net,
thomas.lendacky@....com, f.fainelli@...il.com,
ariel.elior@...ium.com, michael.chan@...adcom.com,
santosh@...lsio.com, madalin.bucur@....com,
yisen.zhuang@...wei.com, salil.mehta@...wei.com,
jeffrey.t.kirsher@...el.com, tariqt@...lanox.com,
saeedm@...lanox.com, jiri@...lanox.com, idosch@...lanox.com,
jakub.kicinski@...ronome.com, peppe.cavallaro@...com,
grygorii.strashko@...com, andrew@...n.ch,
vivien.didelot@...oirfairelinux.com, alexandre.torgue@...com,
joabreu@...opsys.com, linux-net-drivers@...arflare.com,
ganeshgr@...lsio.com, ogerlitz@...lanox.com,
Manish.Chopra@...ium.com
Subject: Re: [PATCH net-next,v3 00/12] add flow_rule infrastructure
On Mon, Nov 26, 2018 at 08:33:36PM +0100, Pablo Neira Ayuso wrote:
> Hi Marcelo,
Hello!
>
> On Thu, Nov 22, 2018 at 07:08:32PM -0200, Marcelo Ricardo Leitner wrote:
> > On Thu, Nov 22, 2018 at 02:22:20PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Wed, Nov 21, 2018 at 03:51:20AM +0100, Pablo Neira Ayuso wrote:
> > > > Hi,
> > > >
> > > > This patchset is the third iteration [1] [2] [3] to introduce a kernel
> > > > intermediate representation (IR) to express ACL hardware offloads.
> > >
> > > On v2 cover letter you had:
> > >
> > > """
> > > However, cost of this layer is very small, adding 1 million rules via
> > > tc -batch, perf shows:
> > >
> > > 0.06% tc [kernel.vmlinux] [k] tc_setup_flow_action
> > > """
> > >
> > > The above doesn't include time spent on children calls and I'm worried
> > > about the new allocation done by flow_rule_alloc(), as it can impact
> > > rule insertion rate. I'll run some tests here and report back.
> >
> > I'm seeing +60ms on 1.75s (~3.4%) to add 40k flower rules on ingress
> > with skip_hw and tc in batch mode, with flows like:
> >
> > filter add dev p6p2 parent ffff: protocol ip prio 1 flower skip_hw
> >   src_mac ec:13:db:00:00:00 dst_mac ec:14:c2:00:00:00
> >   src_ip 56.0.0.0 dst_ip 55.0.0.0 action drop
> >
> > Only 20ms out of those 60ms were consumed within fl_change() calls
> > (considering children calls), though.
> >
> > Do you see something similar? I used current net-next (d59da3fbfe3f)
> > and with this patchset applied.
>
> I see lots of send() and recv() in tc -batch via strace, using this
> example rule, repeating it N times:
>
> filter add dev eth0 parent ffff: protocol ip pref 1 flower dst_mac f4:52:14:10:df:92 action mirred egress redirect dev eth1
>
> This is taking ~8 seconds for 40k rules on my old laptop [*], which
> is already not too fast (without my patchset).
On an E5-2643 v3 @ 3.40GHz I see a total of 1.17s with an old iproute2
(4.11) (more below).
>
> I remember we discussed adding support for real batching for tc -
> we can probably do this transparently by assuming that if the skbuff
> length mismatches the nlmsghdr->len field, then we enter batch mode
> from the kernel. This would require updating iproute2 to use libmnl
> batching routines, or code that follows a similar approach otherwise.
Yes, I believe you're referring to:

commit 485d0c6001c4aa134b99c86913d6a7089b7b2ab0
Author: Chris Mi <chrism@...lanox.com>
Date:   Fri Jan 12 14:13:16 2018 +0900

    tc: Add batchsize feature for filter and actions

which is present in iproute2 4.16. It does transparent batching on the
app side.
With tc from today's tip, I get 1.05s for 40k rules (both measurements
taken with this patchset applied).
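
FWIW, in case iproute2 ever moves to the libmnl batching routines you
mention, the userspace side would look roughly like the sketch below.
It is only an illustration of the libmnl batch helpers: the
build_one_filter() helper, the 40k count, the buffer sizing and the
omitted flower attributes are made up for the example, and error
handling is skipped; it is not actual tc/iproute2 code.

/* Rough sketch of userspace netlink batching with libmnl.
 * build_one_filter(), buffer sizing and the rule count are
 * illustrative only; error handling omitted for brevity.
 */
#include <libmnl/libmnl.h>
#include <linux/rtnetlink.h>
#include <time.h>

static void build_one_filter(void *buf, unsigned int seq)
{
	/* Skeleton RTM_NEWTFILTER request; the flower attributes
	 * (TCA_KIND, TCA_OPTIONS, ...) are omitted here. */
	struct nlmsghdr *nlh = mnl_nlmsg_put_header(buf);
	struct tcmsg *tcm;

	nlh->nlmsg_type = RTM_NEWTFILTER;
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
	nlh->nlmsg_seq = seq;
	tcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm));
	tcm->tcm_family = AF_UNSPEC;
}

int main(void)
{
	/* Buffer is twice the batch limit so a message that overflows
	 * the limit still fits and can be carried over after a flush. */
	char buf[MNL_SOCKET_BUFFER_SIZE * 2];
	struct mnl_nlmsg_batch *b;
	struct mnl_socket *nl;
	unsigned int seq = time(NULL);
	int i;

	nl = mnl_socket_open(NETLINK_ROUTE);
	mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID);

	b = mnl_nlmsg_batch_start(buf, MNL_SOCKET_BUFFER_SIZE);

	for (i = 0; i < 40000; i++) {
		build_one_filter(mnl_nlmsg_batch_current(b), seq++);

		/* Room left in the batch? Keep accumulating. */
		if (mnl_nlmsg_batch_next(b))
			continue;

		/* Batch is full: one send() covers many rules. */
		mnl_socket_sendto(nl, mnl_nlmsg_batch_head(b),
				  mnl_nlmsg_batch_size(b));
		/* Moves the message that did not fit to the head. */
		mnl_nlmsg_batch_reset(b);
	}

	/* Flush whatever is left. */
	if (!mnl_nlmsg_batch_is_empty(b))
		mnl_socket_sendto(nl, mnl_nlmsg_batch_head(b),
				  mnl_nlmsg_batch_size(b));

	mnl_nlmsg_batch_stop(b);
	mnl_socket_close(nl);
	return 0;
}

The end result is the same idea as the batchsize commit above: many
rule messages coalesced into a single send() instead of one syscall
round-trip per rule.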
>
> [*] 0.5 seconds in nft (similar ruleset); this is using netlink batching.
Nice.
Cheers,
Marcelo