Date:   Sun, 12 Feb 2017 16:33:10 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     Tariq Toukan <ttoukan.linux@...il.com>,
        Eric Dumazet <edumazet@...gle.com>,
        "David S . Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Tariq Toukan <tariqt@...lanox.com>,
        Martin KaFai Lau <kafai@...com>,
        Willem de Bruijn <willemb@...gle.com>,
        Brenden Blanco <bblanco@...mgrid.com>,
        Alexei Starovoitov <ast@...nel.org>
Subject: Re: [PATCH v2 net-next 00/14] mlx4: order-0 allocations and page recycling

On Sun, 2017-02-12 at 23:38 +0100, Jesper Dangaard Brouer wrote:

> Just so others understand this: The number of RX queue slots is
> indirectly the size of the page-recycle "cache" in this scheme (that
> depend on refcnt tricks to see if page can be reused).

Note that the page recycle tricks only work on some occasions.
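For readers not familiar with the refcnt trick the quoted text refers to,
here is a simplified sketch of such a recycle test (not the exact mlx4
code; rx_page_can_be_recycled is a hypothetical helper used only for
illustration):

#include <linux/mm.h>		/* page_ref_count(), page_to_nid() */
#include <linux/topology.h>	/* numa_mem_id() */

/* Hypothetical helper, for illustration only: the driver keeps one
 * reference on each RX page; the page can be refilled in place only
 * when the stack has already dropped all of its own references. */
static bool rx_page_can_be_recycled(struct page *page)
{
	/* skb(s) built on this page have not been freed yet */
	if (page_ref_count(page) != 1)
		return false;

	/* a page from a remote NUMA node is usually not worth keeping */
	if (page_to_nid(page) != numa_mem_id())
		return false;

	return true;
}

When the test fails, the driver falls back to a fresh page allocation,
which is exactly why the recycling is opportunistic rather than guaranteed.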

To correctly provision hosts dealing with TCP flows, one should not rely
on page recycling or any other opportunistic (non-guaranteed) behavior.

Page recycling, _if_ possible, will help to reduce system load
and thus lower latencies.

> 
> 
> > A single TCP flow easily can have more than 1024 MSS waiting in its
> > receive queue (typical receive window on linux is 6MB/2 )
> 
> So, you do need to increase the page-"cache" size, and need this for
> real-life cases, interesting.
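(As a rough illustration of the quoted numbers, assuming a 1500-byte MSS:

	6 MB / 2           ~= 3 MB of advertised receive window
	3 MB / 1500 bytes  ~= 2000 full-sized segments

so a single flow can park more segments than a 1024-entry RX ring holds,
and none of those pages can be recycled until the application drains its
receive queue.)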

I believe this sizing was done mostly to cope with normal system
scheduling constraints [1], reducing packet losses under incast blasts.

Sizing happened before I did my patches to switch to order-0 pages
anyway.

The fact that it allowed page-recycling to happen more often was nice of
course.


[1]
- One cannot really assume the host will always have the ability to process
the RX ring in time, unless maybe CPUs are fully dedicated to the napi
polling logic.
- Recent work to shift softirqs to ksoftirqd is potentially magnifying
the problem.



