netdev - Re: Alignment issues with freescale FEC driver


Message-ID: <d0d6f333-c6fc-6572-0633-d7c2c29b8b3f@nelint.com>
Date:   Fri, 23 Sep 2016 11:26:18 -0700
From:   Eric Nelson <eric@...int.com>
To:     Russell King - ARM Linux <linux@...linux.org.uk>
Cc:     Eric Dumazet <edumazet@...gle.com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Fugang Duan <fugang.duan@....com>,
        Troy Kisky <troy.kisky@...ndarydevices.com>,
        Otavio Salvador <otavio@...ystems.com.br>,
        Simone <cjb.sw.nospam@...il.com>
Subject: Re: Alignment issues with freescale FEC driver

Thanks Russell,

On 09/23/2016 10:37 AM, Russell King - ARM Linux wrote:
> On Fri, Sep 23, 2016 at 10:19:50AM -0700, Eric Nelson wrote:
>> Oddly, it does prevent the vast majority (90%+) of the alignment errors.
>>
>> I believe this is because the compiler is generating an ldm instruction
>> when the ntohl() call is used, but I'm stumped about why these aren't
>> generating faults:

After looking at it, I have to think that the code that reads iph->id
is just hit more frequently than the other code in this routine.

> 
> ldm generates alignment faults when the address is not aligned to a
> 32-bit boundary.  ldr on ARMv6+ does not.
> 
>> I don't think that's the case.
>>
>> # CONFIG_IPV6_GRE is not set
>>
>> Hmm... Instrumenting the kernel, it seems that iphdr **is** aligned on
>> a 4-byte boundary.
>>
>> Does the ldm instruction require 8-byte alignment?
>>
>> There's definitely a compiler-version dependency involved here,
>> since using gcc 4.9 also reduced the number of faults dramatically.
> 
> Well, I don't think it's that gcc related:
> 

I can only say that I noticed a dramatic drop in the number of faults with
gcc 4.9, and that inet_gro_receive no longer showed up in /proc/cpu/alignment
while I was trying to identify the issue.

> User:           0
> System:         312855 (ip6_route_input+0x6c/0x1e0)
> Skipped:        0
> Half:           0
> Word:           0
> DWord:          2
> Multi:          312853
> 
> c06d8998 <ip6_route_input>:
> c06d89ac:       e1a04000        mov     r4, r0
> c06d89b0:       e1d489b4        ldrh    r8, [r4, #148]  ; 0x94
> c06d89b8:       e594a0a0        ldr     sl, [r4, #160]  ; 0xa0
> c06d89cc:       e08ac008        add     ip, sl, r8
> c06d89d4:       e28c3018        add     r3, ip, #24
> c06d89dc:       e28c7008        add     r7, ip, #8
> c06d89e4:       e893000f        ldm     r3, {r0, r1, r2, r3}
> c06d89ec:       e24be044        sub     lr, fp, #68     ; 0x44
> c06d89f4:       e24b5054        sub     r5, fp, #84     ; 0x54
> c06d89fc:       e885000f        stm     r5, {r0, r1, r2, r3}
> c06d8a04:       e897000f        ldm     r7, {r0, r1, r2, r3}
> c06d8a10:       e88e000f        stm     lr, {r0, r1, r2, r3}
> 
> This is from:
> 
>         struct flowi6 fl6 = {
>                 .flowi6_iif = l3mdev_fib_oif(skb->dev),
>                 .daddr = iph->daddr,
>                 .saddr = iph->saddr,
>                 .flowlabel = ip6_flowinfo(iph),
>                 .flowi6_mark = skb->mark,
>                 .flowi6_proto = iph->nexthdr,
>         };
> 
> specifically, I suspect, the saddr and daddr initialisations.
> 
> There's not much to get away from this - the FEC on iMX requires a
> 16-byte alignment for DMA addresses, which violates the network
> stack's requirement for the ethernet packet to be received with a
> two byte offset.  So the IP header (and IPv6 headers) will always
> be mis-aligned in memory, which leads to a huge number of alignment
> faults.
> 
> There's not much getting away from this - the problem is not in the
> networking stack, but the FEC hardware/network driver.  See:
> 
>         struct  fec_enet_private *fep = netdev_priv(ndev);
>         int off;
> 
>         off = ((unsigned long)skb->data) & fep->rx_align;
>         if (off)
>                 skb_reserve(skb, fep->rx_align + 1 - off);
> 
>         bdp->cbd_bufaddr = cpu_to_fec32(dma_map_single(&fep->pdev->dev, skb->data, FEC_ENET_RX_FRSIZE - fep->rx_align, DMA_FROM_DEVICE));
> 
> in fec_enet_new_rxbdp().
> 

So the question is: should we just live with this and accept the
performance penalty of the misalignment, or do something about it?

I'm not sure of the cost (or the details) of Eric's proposed fix of
allocating a new skb and copying the header into it.

The original report was of bad network performance, but I haven't
been able to see an impact doing some simple tests using wget
and SSH.