Message-Id: <20170214.120426.2032015522492111544.davem@davemloft.net>
Date:   Tue, 14 Feb 2017 12:04:26 -0500 (EST)
From:   David Miller <davem@...emloft.net>
To:     ttoukan.linux@...il.com
Cc:     edumazet@...gle.com, brouer@...hat.com, alexander.duyck@...il.com,
        netdev@...r.kernel.org, tariqt@...lanox.com, kafai@...com,
        saeedm@...lanox.com, willemb@...gle.com, bblanco@...mgrid.com,
        ast@...nel.org, eric.dumazet@...il.com, linux-mm@...ck.org
Subject: Re: [PATCH v3 net-next 08/14] mlx4: use order-0 pages for RX

From: Tariq Toukan <ttoukan.linux@...il.com>
Date: Tue, 14 Feb 2017 16:56:49 +0200

> Internally, I already implemented "dynamic page-cache" and
> "page-reuse" mechanisms in the driver, and together they totally
> bridge the performance gap.

I worry about a dynamically growing page cache inside of drivers
because it is invisible to the rest of the kernel.

It responds only to local needs.

Part of the cost of the real page allocator comes from the fact that
it can respond to global needs.

If a driver consumes some unreasonable percentage of system memory, it
is keeping that memory from being used by other parts of the system.
That is true even when it would be better for networking to be slightly
slower with less cache, because the other thing that needs the memory
is more important.

I think this is one of the primary reasons that the MM guys severely
chastise us when we build special purpose local caches into networking
facilities.

And the more I think about it the more I think they are right.

One path I see around all of this is full integration.  Meaning that
we can free pages into the page allocator which are still DMA mapped.
And future allocations from that device are prioritized to take still
DMA mapped objects.
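
To make the idea concrete, here is a rough userspace sketch of that
allocation policy.  It is a toy model, not kernel code: struct toy_page,
toy_alloc(), toy_free() and the fixed-size pool are all hypothetical names
invented for illustration, and no real DMA mapping takes place.  The point
is only that freed pages stay visible to one global pool, keep their
mapping, and are handed back to the same device first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8
#define NO_DEV (-1)

/* Hypothetical stand-in for a page: records whether it is free and
 * which device, if any, still holds a DMA mapping for it. */
struct toy_page {
	bool free;
	int mapped_dev;		/* NO_DEV when no mapping is live */
};

static struct toy_page pool[NPAGES];

static void pool_init(void)
{
	for (size_t i = 0; i < NPAGES; i++) {
		pool[i].free = true;
		pool[i].mapped_dev = NO_DEV;
	}
}

/* Allocate a page for 'dev' (NO_DEV for a non-device user).
 * Preference order: a free page whose mapping state already matches
 * 'dev', then a free unmapped page, then a free page mapped for some
 * other device (a real system would have to unmap it first).
 * *needs_map tells the caller whether a new mapping is required. */
static struct toy_page *toy_alloc(int dev, bool *needs_map)
{
	struct toy_page *unmapped = NULL, *other = NULL;

	for (size_t i = 0; i < NPAGES; i++) {
		struct toy_page *p = &pool[i];

		if (!p->free)
			continue;
		if (p->mapped_dev == dev) {
			p->free = false;
			*needs_map = false;	/* mapping is reused as-is */
			return p;
		}
		if (p->mapped_dev == NO_DEV) {
			if (!unmapped)
				unmapped = p;
		} else if (!other) {
			other = p;
		}
	}

	struct toy_page *p = unmapped ? unmapped : other;

	if (!p)
		return NULL;		/* pool exhausted */
	p->free = false;
	p->mapped_dev = dev;		/* any stale mapping is dropped */
	*needs_map = (dev != NO_DEV);
	return p;
}

/* Return the page to the global pool but leave its mapping intact. */
static void toy_free(struct toy_page *p)
{
	assert(p != NULL && !p->free);
	p->free = true;
}
```

In this model a device that frees a page and allocates again gets the same
page back with needs_map == false, yet any other user can still claim that
page once the unmapped ones run out.  That is the difference from a cache
kept privately inside the driver.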

Yes, we still need to make the page allocator faster, but this kind of
work helps everyone, not just 100Gb Ethernet NICs.
