Date:	Tue, 23 Sep 2008 09:34:22 -0700 (PDT)
From:	Gertjan Hofman <gertjan_hofman@...oo.com>
To:	Patrick McHardy <kaber@...sh.net>
Cc:	netdev@...r.kernel.org
Subject: Re: VLAN & ARP requests fail for ARM EABI (2.6.24)


This e-mail is for completeness only, and to stop anyone from wrongly going down this debugging route.

The ARM EABI/OABI VLAN & ARP bug discussed was real - however, it has since been resolved.
A new multicast address structure had been introduced without proper initialization. See
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=12aa343add3eced38a44bdb612b35fdf634d918c
Not entirely sure why this happened to cause an issue only with EABI compilers, but it did.
Unfortunately, the 3 months during which this bug existed in 2.6.24 were exactly when we froze our kernel. Perhaps our fault - I should have included patches as they came out.

Cheers

G

--- On Sat, 4/12/08, Gertjan Hofman <gertjan_hofman@...oo.com> wrote:

> From: Gertjan Hofman <gertjan_hofman@...oo.com>
> Subject: Re: VLAN & ARP requests fail for ARM EABI (2.6.24)
> To: "Patrick McHardy" <kaber@...sh.net>
> Cc: netdev@...r.kernel.org
> Date: Saturday, April 12, 2008, 10:58 AM
> Patrick,
> 
> Ben mentioned you might be the person to talk to. Just to
> make sure I did what you suggested:
> 
> From: 
> http://devresources.linux-foundation.org/dev/iproute2/download/
>   I downloaded:
> iproute2-2.6.24-rc7.tar.bz2       08-Jan-2008 09:06  336K 
> and cross compiled EABI.
> 
> I created the VLAN with:
> 
>  ./ip link add link eth0 eth0.0  type vlan id 0   (did I
> get the syntax correct  ?)
> 
> /proc/net/vlan/ indicated eth0.0 is there and looks fine.
> 
> 
> Unfortunately pinging through a VLAN to this VLAN fails as
> before with the same symptoms - ARP requests are received but
> not answered.
> 
> About OABI/EABI incompatibilities - I didn't explicitly
> mention it, but when testing EABI the entire file system
> is EABI, and when testing OABI the entire file system is also
> OABI - so that should not be the problem.
> 
> 
> We spent quite a bit of time tracking this problem deeper
> down the stack, but with limited results.  It looks like the
> calling sequence is:
> 
> driver-->
>   -- ?
>    - ---> vlan.c 
>         ---> ifnet_tx
>            --->  ?
>               ---->  arp.c
>                     ---> (arp_process)
>                       ----> ip_route_input 
>                         ----> ip_route_input_slow
>                           ----> fib_validate_source
> 
> 
> It's in fib_validate_source that things go wrong.
> 
> In the EABI (faulty kernel), we print values of the device
> pointers, which are considered in fib_validate_source()
>  FIB_RES_DEV(res) : 0xC3C77000 
>  dev                          : 0xC3E2E800
> 
> These are not the same, so the variable rpf is checked and
> it bails out, returning -EINVAL. You can work around this by
> setting rpf to 0 with
>   echo 0 > /proc/sys/net/ipv4/conf/eth2.0/rp_filter
> and then pings from the foreign PC to the ARM work. Still, pings
> from the ARM to the PC don't work - the ARP request goes out, but
> the response (which gets to arp.c) is ignored. Presumably for a
> similar reason - some device pointer check fails.
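
For anyone following along, the decision described above can be modelled roughly as below. This is only a user-space sketch, not the kernel's fib_validate_source(), which does considerably more. The names struct model_dev and model_validate_source() are made up for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; only pointer identity matters. */
struct model_dev {
    const char *name;
};

/*
 * Simplified model of the check described in the mail:
 *   route_dev - device the route back to the sender points at
 *               (the role FIB_RES_DEV(res) plays in the kernel)
 *   in_dev    - device the packet actually arrived on
 *   rpf       - reverse-path filter setting for the incoming device
 * Returns 0 to accept the packet, -EINVAL to reject it.
 */
int model_validate_source(const struct model_dev *route_dev,
                          const struct model_dev *in_dev, int rpf)
{
    if (route_dev == in_dev)
        return 0;        /* reverse path matches the incoming device */
    if (rpf)
        return -EINVAL;  /* mismatch with filtering enabled: reject */
    return 0;            /* mismatch tolerated when rp_filter is 0 */
}
```

With the two differing pointers printed above, the model rejects the packet when rpf is 1 and accepts it when rpf is 0, which matches the observed behaviour with rp_filter.
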
> 
> My guess is that there is a problem with the dev pointer
> all the way back in the vlan.c code, which only manifests
> itself with the EABI compiler.
> If you run the working kernel version, in
> fib_validate_source:
> 
> if (in_dev) {
>         no_addr = in_dev->ifa_list == NULL;
>         rpf = IN_DEV_RPFILTER(in_dev);
>         /* ----> rpf returns 0 here even though
>          * /proc/sys/net/ipv4/conf/eth2.0/rp_filter is set to 1. */
>
>         if (DEBUG_XXX == 0xDEADBEEF)
>                 printk(KERN_INFO "*********rpf = 0x%X\n", rpf);
> }
> 
> 
> With EABI rpf = 1; with OABI rpf = 0.  So there is something
> different about the in_dev pointer.
> 
> Do you know what IN_DEV_RPFILTER(in_dev) does exactly?
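
As far as I recall, include/linux/inetdevice.h in kernels of this era defines IN_DEV_RPFILTER(in_dev) as an AND of the global conf/all/rp_filter setting and the per-device one. Please check this against your own tree. Under that assumption the macro behaves roughly like the sketch below, where struct model_rp_conf and model_in_dev_rpfilter() are hypothetical names, not kernel symbols.

```c
/*
 * Hypothetical model: two integer knobs mirroring
 * /proc/sys/net/ipv4/conf/all/rp_filter and
 * /proc/sys/net/ipv4/conf/<dev>/rp_filter.
 */
struct model_rp_conf {
    int all_rp_filter;
    int dev_rp_filter;
};

/* Assumed AND semantics: active only when both knobs are non-zero. */
int model_in_dev_rpfilter(const struct model_rp_conf *conf)
{
    return conf->all_rp_filter && conf->dev_rp_filter;
}
```

If that reading is right, a per-device value of 1 still yields rpf = 0 whenever conf/all/rp_filter is 0. Whether that is what happened in the working kernel is not something this thread establishes.
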
> 
> I think I need to check the validity of the device pointer
> already at the VLAN level, but I am not sure how to do this.
> Any tips ?
> 
> Thanks
> 
> Gertjan
> 
> 
> 
> 
> 
> 
> ----- Original Message ----
> From: Patrick McHardy <kaber@...sh.net>
> To: Gertjan Hofman <gertjan_hofman@...oo.com>
> Cc: netdev@...r.kernel.org
> Sent: Wednesday, April 9, 2008 5:40:45 PM
> Subject: Re: VLAN & ARP requests fail for ARM EABI
> (2.6.24)
> 
> Gertjan Hofman wrote:
> > Dear Sirs,
> > 
> > Since the VLAN mailing list is closed, its author
> suggested I post here. 
> > We have an ARM920T processor based system. When
> > compiling the kernel 2.6.24 using OABI (and the appropriate 4.1.1
> > cross toolchain), VLAN functionality is fine. When setting
> > the CONFIG_EABI flag and using the 4.2.2 toolchain (created
> > by the OpenEmbedded project) a VLAN device fails to respond.
> > 
> > When pinging through the ARM VLAN device to a (PC
> based) VLAN device, the following is seen in the vlan
> driver:
> > The ping request is sent out, followed by an ARP
> request. The PC returns the ARP reply and it is seen by the
> VLAN driver (vlan_skb_recv) which calls netif_rx(). This
> repeats a couple of pings later i.e. the arp reply is not
> used or received properly.
> > 
> > Similarly, when pinging from the PC, the ARP request
> is seen by vlan_skb_recv() but there is no ARP reply from
> the ARM cascading through the vlan driver.
> > 
> > It seems to me that either the issue is with the code
> > that handles the ARP request when compiling in EABI format,
> > or that VLAN doesn't process the frame properly and sends it
> > on incorrectly. Recompile the kernel with OABI and
> > everything is fine.
> > 
> > Note that communication works fine on either OABI or
> EABI when using 'normal' devices (eth0 etc). This
> puts the suspicion back on vlan.
> > 
> > 
> > Since EABI changes structure packing and other things,
> I suspect the cause is some networking code that knows a bit
> too much about its size & packing.
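
One way to see the kind of layout difference suspected here is to build a small probe with each toolchain and compare the results. As I understand it, the old ARM ABI pads every structure to a multiple of 4 bytes and aligns 64-bit members to 4 bytes, while EABI uses natural sizes and 8-byte alignment for 64-bit members. The structs and layout_report() below are hypothetical and chosen only to expose those two cases.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical probe structs, not taken from the kernel. */
struct probe_small {
    char c;                 /* may be padded out to 4 bytes on OABI */
};

struct probe_mixed {
    char tag;
    long long value;        /* 64-bit member: alignment differs by ABI */
};

/* Writes a one-line layout summary into buf; returns snprintf's result. */
int layout_report(char *buf, size_t len)
{
    return snprintf(buf, len,
                    "sizeof(small)=%zu sizeof(mixed)=%zu offsetof(value)=%zu",
                    sizeof(struct probe_small),
                    sizeof(struct probe_mixed),
                    offsetof(struct probe_mixed, value));
}
```

Calling layout_report() from a tiny main() built once per toolchain and diffing the two lines shows whether the compilers disagree on sizes or offsets. Any kernel structure that is memcpy'd, cast, or shared with userspace on the basis of a hard-coded size would be a candidate for this sort of breakage.
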
> > 
> > I am happy to troubleshoot, but I am no kernel expert.
> > Tips would be appreciated. Like how to dump the skb buffer
> > in both cases..
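
On dumping the buffer: a plain hex dump of the packet bytes from both builds is usually enough to diff. The sketch below is a user-space formatter, not kernel code, and hex_dump() is a hypothetical helper name. Inside the kernel I believe print_hex_dump() can do the same job on skb->data for skb->len bytes, but check that it exists in your tree.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Formats len bytes from data as hex into out (capacity cap),
 * 16 bytes per line, each line prefixed with its offset.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if out is too small.
 */
int hex_dump(const unsigned char *data, size_t len, char *out, size_t cap)
{
    size_t used = 0;

    for (size_t i = 0; i < len; i++) {
        int n;

        if (i % 16 == 0) {
            /* Start a new line with the offset of its first byte. */
            n = snprintf(out + used, cap - used, "%s%04zx:",
                         i ? "\n" : "", i);
            if (n < 0 || (size_t)n >= cap - used)
                return -1;
            used += (size_t)n;
        }
        n = snprintf(out + used, cap - used, " %02x", data[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    if (used >= cap)
        return -1;          /* no room for the terminating NUL */
    out[used] = '\0';
    return (int)used;
}
```

Dumping the same ARP frame under OABI and EABI and comparing the two outputs would show whether the bytes themselves differ or only the way the stack interprets them.
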
> 
> 
> I actually have no idea about the differences between
> OABI and EABI, but I know a mix of both broke some
> iptables setups (kernel EABI/userspace OABI or something
> like that). Could you fetch the latest iproute and try
> again with adding your VLANs using iproute?
> 
> The syntax is:
> 
> ip link add link <lowerdev> [name] <name> type vlan id VID
> 
> If that works the problem is most likely an inappropriate
> ABI mix.
> 
> 
> 
> 
> 
> 


      