lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 20 Jun 2018 23:41:45 +0000
From:   Saeed Mahameed <saeedm@...lanox.com>
To:     "eric.dumazet@...il.com" <eric.dumazet@...il.com>,
        "kafai@...com" <kafai@...com>, Tariq Toukan <tariqt@...lanox.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "edumazet@...gle.com" <edumazet@...gle.com>
Subject: Re: [net RFC] net/mlx4_en: Use frag stride in crossing page boundary
 condition

On Tue, 2018-06-19 at 17:25 -0700, Eric Dumazet wrote:
> 
> On 06/19/2018 11:05 AM, Saeed Mahameed wrote:
> 
> > this is only true for XDP setup, for non XDP max stride_size can
> > only
> > be around ~3k and only for mtu > ~6k
> > 
> > For XDP setup you suggested:
> > -               priv->frag_info[0].frag_size = eff_mtu;
> > +               priv->frag_info[0].frag_size = PAGE_SIZE;
> > 
> > currently the condition is:
> > 
> > release = frags->page_offset + frag_info->frag_size > PAGE_SIZE;
> > 
> > so my solution and yours have the same problem you described above.
> > 
> > the problem is not with the initial values or with stride/farg size
> > math, it just that in XDP we shouldn't reuse at ALL. I agree with
> > you
> > that we need to optimize and maybe for PAGE_SIZE > 8k we need to
> > allow
> > XDP setup to reuses. but for now there is a data corruption to
> > handle.
> 
> 
> Sure, we all agree there is a bug to fix.
> 
> The way you are fixing it is kind of illogical.
> 
> The NIC can use a frag if its _size_ is big enough to receive the
> frame.
> 
> The _stride_  is an abstraction created by the driver to report an
> estimation of the _truesize_,
> or memory consumption, so that linux can better track overall memory
> usage.
> 
> For example, if MTU=1500, the size of the fragment is 1536 bytes, but
> since we can put only
> 2 fragments per 4KB page (on x86), we declare the _stride_ to be 2048
> bytes.
> 
> Declaring that a final blob of a page, being 1600 bytes, not able to
> receive a frame because
> _stride_ is 2048 is illogical and waste resources.
> 
> 

I see, I wanted to use _stride_ as grantee for how much a page frag can
grow, for example in mlx5 we need the whole stride to build_skb  around
the frag, since we always need the trailer, but it is different in here
and we can avoid resource waste.

so how a bout this: (As suggested by Martin).
currently as mlx4_en_complete_rx_desc assumes that priv->rx_headroom
is always 0 in non-XDP setup, hence:

frags->page_offset += sz_align;

where it really should be:
frags->page_offset += sz_align + priv->rx_headroom;

we can use it as a hint to not reuse as below:
what do you think ?


diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 9f54ccbddea7..f14c7a574cc8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -474,10 +474,10 @@ static int mlx4_en_complete_rx_desc(struct
mlx4_en_priv *priv,
 {
        const struct mlx4_en_frag_info *frag_info = priv->frag_info;
        unsigned int truesize = 0;
+       bool release = true;
        int nr, frag_size;
        struct page *page;
        dma_addr_t dma;
-       bool release;
index 9f54ccbddea7..f14c7a574cc8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c

        /* Collect used fragments while replacing them in the HW
descriptors */
        for (nr = 0;; frags++) {
@@ -500,7 +500,7 @@ static int mlx4_en_complete_rx_desc(struct
mlx4_en_priv *priv,
                        release = page_count(page) != 1 ||
                                  page_is_pfmemalloc(page) ||
                                  page_to_nid(page) != numa_mem_id();
-               } else {
+               } elseif(!priv->rx_headroom) {
                        u32 sz_align = ALIGN(frag_size,
SMP_CACHE_BYTES);

                        frags->page_offset += sz_align;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ