lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 17 Jan 2019 22:58:37 -0800
From:   Maciej Żenczykowski <zenczykowski@...il.com>
To:     Michal Soltys <soltys@....info>
Cc:     David Miller <davem@...emloft.net>,
        Linux NetDev <netdev@...r.kernel.org>,
        Jay Vosburgh <jay.vosburgh@...onical.com>,
        Vincent Bernat <vincent@...nat.ch>,
        Mahesh Bandewar <maheshb@...gle.com>,
        Chonggang Li <chonggangli@...gle.com>
Subject: Re: [PATCH net 1/1] bonding: fix PACKET_ORIGDEV regression on bonding masters

I'm not sure there's a truly good answer.

Also note, that there's subtle differences between af_packet sockets
tied to ALL protocols vs tied to specific protocols (vs none/0 of
course).
However ETH_P_ALL protocol raw sockets need to be avoided if at all
possible due to perf impact.
Certainly we don't want any of these created outside of debugging.

I believe we only cared about the utterly unbound to any device case
working (ie. delivering all LLDP packets all nics are receiving).
So that a single simple daemon could collect and export all the link
information.
[btw. now that I know about the PACKET_ORIGDEV option, I do -
unsurprisingly - see our daemon sets it]
We really didn't want the complexity of having to bind to individual
interfaces and having to try to dynamically adjust the set of raw
sockets.
(not to mention that less raw sockets is always good, also it's never
actually entirely clear which interfaces would need to be monitored
due to the preponderance of tunnels/macvlan/ipvlan/veth/dummy and
other virtual interface types)

I think from a logic perspective:

- if you bind to slave (physical interface) you should see *all*
packets on that interface,
regardless of whether the slave is active or not, and whether the
packet is link-local or not.
ie. this should show you exactly what nic is receiving (& sending:
PACKET_OUTGOING)
I should see *exactly* one copy of any packet.
This mode has to be useful for debugging.
This includes seeing packets which aren't even destined for me if nic
receives them.
(ie. destined to unicast/multicast mac we'd filter out: PACKET_OTHERHOST)
Although by itself this socket's existence shouldn't affect nic mac
filters and promiscuous mode
(I believe there are all sorts of various additional socket options to
change those).

- if you don't bind to an interface then I think I'd expect to see
packets delivered to stack + link local packets
  - ie. should not see non-link-local packets discarded due to being
on inactive slaves
    similarly to how I should not see packets filtered out by virtue
of mac not matching interface mac filters
  - IMHO you should see *all* link local packets arriving at system
(ie. the original change's purpose)
    [and I think, but am not certain, I shouldn't need to use any
socket options to register the link local macs,
     although glancing at code I do think the daemon uses
PACKET_ADD_MEMBERSHIP to register lldp mac,
     so perhaps filtering should apply as normal?]
  - with PACKET_ORIGDEV they should always show up as from the
physical interfaces (ie. slaves)
  - without PACKET_ORIGDEV:
     - non link local packets should show up as from bonding/team master
     - link local packets on active slaves could show up on master or
slave (probably master is required ifirc some earlier fix???)
     - link local packets on inactive slaves should show up on slave -
and definitely not on master
  - I don't think I should ever see any PACKET_OTHERHOST packets

- if you bind to master you should see packets from active slaves (ie.
those that will get delivered to stack)
  - clearly you should not see non-link-local inactive slave packets
(they'll be dropped)
  - behaviour wrt. link local packets is more dicey... (I believe
somewhere in these threads patches there was some description of what
standard requires, but I don't off the top of my head remember)
  - for an active-backup bond it would seem logical to see only active
links link local
  - for a multiple-active aggregate bond I'd be fine with seeing none,
or all active slaves link local packets - I guess even though the
later is confusing it makes sense
  - it might be okay to see link-local inactive slave packets, but I
think I'd prefer not to (could be configurable though) - this would
seem confusing/wrong to me.
  - and again I don't think I should see any PACKET_OTHERHOST packets here...
  - PACKET_ORIGDEV as above...

I gave the above a fair amount of thought... but I'm not guaranteeing
I didn't make typos, or write something utterly stupid or
unimplementable or not how stuff should work for other reasons...
Comments welcome.

I think this continues to be in line with my proposal from earlier in
the thread?

On Thu, Jan 17, 2019 at 4:27 PM Michal Soltys <soltys@....info> wrote:
>
> On 19/01/14 03:01, Maciej Żenczykowski wrote:
> > So I don't remember the specifics...
> >
> > (note I'm writing this all from memory without looking it up/testing
> > it - I may be utterly wrong or dreaming)
> >
> > But I seem to recall that the core problem we were trying to solve was
> > that a daemon listening
> > on an AF_PACKET ethertype 88CC [LLDP] socket not bound to any device
> > would not receive LLDP packets
> > arriving on inactive bond slaves (either active-backup or lag).
> >
> > [inactive = link/carrier up, but not part of active aggregator]
> >
> > This made monitoring for miscabling harder (IFIRC the only non kernel
> > fix was to get the daemon to create
> > a separate AF_PACKET/88CC socket bound to every physical interface in
> > the system, or monitor for
> > inactive slaves and add extra packet sockets as needed).
> >
> > They would get re-parented to the master and then since the slave was
> > inactive they would be considered RX_HANDLER_EXACT match only and not
> > match the * interface.
> >
> > Honestly I wasn't aware of PACKET_ORIGDEV, although I don't think it
> > helps in this case - AFAICR the packets never made it to the packet
> > socket.
> >
> > Perhaps going from:
> >    /* don't change skb->dev for link-local packets */
> >    if (is_link_local_ether_addr(eth_hdr(skb)->h_dest)) return RX_HANDLER_PASS;
> >    if (bond_should_deliver_exact_match(skb, slave, bond)) return
> > RX_HANDLER_EXACT;
> >
> > to something more like:
> >    if (bond_should_deliver_exact_match(skb, slave, bond)) {
> >      /* don't change skb->dev for link-local packets on inactive slaves */
> >      if (is_link_local_ether_addr(eth_hdr(skb)->h_dest)) return RX_HANDLER_PASS;
> >      return RX_HANDLER_EXACT;
> >    }
>
> Having checked the code (if I get the flow correctly), one
> thing/question - currently with Mahesh's fixes, not bound LLDP listener
> will receive all packets - both from active and inactive slaves directly
> (as the check for suppression is done after the link-local check).
>
> The version above will do the suppression check first - so all inactive
> slaves - excluding non-multi/non-broad ALB - will pass it and return
> RX_HANDLER_PASS if the packet is link-local. So those will be available
> w/o binding, but active slaves' packets will be available via master
> device (but with working PACKET_ORIGDEV now - so slave device can be
> retrieved easily). This is fine in your scenario I presume ?
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ