Date:   Mon, 13 Mar 2017 21:57:32 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     Eric Dumazet <edumazet@...gle.com>,
        "David S . Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Tariq Toukan <tariqt@...lanox.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Willem de Bruijn <willemb@...gle.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Alexander Duyck <alexander.duyck@...il.com>
Subject: Re: [PATCH net-next] mlx4: Better use of order-0 pages in RX path

On Mon, Mar 13, 2017 at 06:02:11PM -0700, Eric Dumazet wrote:
> On Mon, 2017-03-13 at 16:40 -0700, Alexei Starovoitov wrote:
> 
> > that's not how it works. It's the job of the submitter to prove
> > that the additional code doesn't cause regressions, especially
> > when there are legitimate concerns.
> 
> This test was moved out of the mlx4_en_prepare_rx_desc() section into
> the XDP_TX code path.
> 
> 
>         if (ring->page_cache.index > 0) {
>                 /* XDP uses a single page per frame */
>                 if (!frags->page) {
>                         ring->page_cache.index--;
>                         frags->page = ring->page_cache.buf[ring->page_cache.index].page;
>                         frags->dma  = ring->page_cache.buf[ring->page_cache.index].dma;
>                 }
>                 frags->page_offset = XDP_PACKET_HEADROOM;
>                 rx_desc->data[0].addr = cpu_to_be64(frags->dma +
>                                                     XDP_PACKET_HEADROOM);
>                 return 0;
>         }
> 
> Can you check your claim again? I see no additional cost
> for XDP_TX.

Let's look at what it was:
- xdp_tx xmits the page regardless of whether the driver can replenish
- at the end of the napi poll, mlx4_en_refill_rx_buffers() will replenish
rx in bulk, either from the page_cache or by allocating one page at a time
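
Roughly, in code (a simplified sketch of that flow, not the actual
driver source; the two function names are the driver entry points
mentioned above, everything else is paraphrased):

        case XDP_TX:
                /* xmit the page unconditionally; nothing is
                 * allocated on this path
                 */
                if (likely(!mlx4_en_xmit_frame(...)))
                        goto consumed;
                break;
        ...
        /* end of the napi poll: refill rx slots in bulk, first
         * from page_cache, then with order-0 allocations
         */
        mlx4_en_refill_rx_buffers(priv, ring);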

After the changes:
- xdp_tx will check the page_cache; if it's empty, it will try an
order-10 (ten!!) alloc, fail, try to alloc a single page, xmit the
packet, and place the just-allocated page into the rx ring.
On the next packet in the same napi loop it will try an order-9 alloc
(since the cache is still empty), fail, try a single page, succeed...
the next packet will try order 8, and so on.
And those spiky order-10 allocations will happen 4 times a second
due to the new logic in mlx4_en_recover_from_oom().
We may get lucky and an order-2 alloc will succeed, but before that
we will waste tons of cycles.
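
So each refill attempt while the cache is empty behaves roughly like
this (my sketch of the behavior described above, not a quote from the
patch; rx_alloc_order is an illustrative name for whatever field
tracks the current order):

        /* start at order 10; step down one order per failed attempt.
         * mlx4_en_recover_from_oom() resets the order back up roughly
         * 4 times a second, restarting the whole spike.
         */
        page = alloc_pages(gfp | __GFP_NOWARN, ring->rx_alloc_order);
        if (!page) {
                if (ring->rx_alloc_order)
                        ring->rx_alloc_order--; /* 10 -> 9 -> 8 ... */
                page = alloc_page(gfp); /* order-0 fallback, per packet */
        }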
If an attacker somehow makes the existing page-recycling logic
ineffective, xdp performance will be limited by the order-0 page
allocator.
Not great, but still acceptable.
After this patch it will just tank due to this crazy scheme.
Yet you're not talking about this new behavior in the commit log.
You didn't test XDP at all and are still claiming that everything is fine?!
NACK
