lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <505CB607.7080207@msgid.tls.msk.ru>
Date:	Fri, 21 Sep 2012 22:46:31 +0400
From:	Michael Tokarev <mjt@....msk.ru>
To:	netdev <netdev@...r.kernel.org>
Subject: multicast, interfaces, kernel 3.0+...

Hello.

We found some, well, interesting behavour of kernels
3.0 and later, while 2.6.32 (previous long-stable
series) worked fine.  I'm not sure when it "broke",
since this is a production machine and we've difficult
time diagnosing it, and the app causing it is, well,
large.

The short story.  A big java app uses multicast group
to register one component and find it later.

The machine in question has 3 active network interfaces:
usual lo, eth0, and virtual (tap, pointopoint) tinc.
Tinc interface is marked as "multicast off".

When the app starts on 2.6.32 kernel, netstat -g shows
that multicast group on 2 interfaces: lo and eth0, but
not on tinc, which is sort of expected:

$ netstat -g
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              4      228.5.6.7
lo              1      all-systems.mcast.net
eth0            4      228.5.6.7
eth0            1      all-systems.mcast.net
tinc            1      all-systems.mcast.net


But when the same app (actually the same userspace) is
booted on the same machine but on 3.0+ kernel, the same
multicast group is registered also on 2 interfaces, but
this time these are lo (as before) and tinc, but not eth0:

$ netstat -g
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              4      228.5.6.7
lo              1      all-systems.mcast.net
eth0            1      all-systems.mcast.net
tinc            4      228.5.6.7
tinc            1      all-systems.mcast.net

Now, on 3.0+ kernel, parts of this app can't find each
other.  The "client" tries to send a datagram packet
to this address, 228.5.6.7, but receives no reply.

On 2.6.32 kernel, when eth0 is used instead of tinc,
it all works as expected.

Now, my knowlege of this multicast stuff is very limited
(reading about it now), so I don't really know what it
all means.  At least the fact that it somehow registers
tinc (which is multicast-off!) is already somewhat strange.
I tried removing this multicast setting from this iface,
but that didn't help.  I also tried enabling multicast on
lo (which was disabled!) and disabling it on others, but
that didn't help either.

According to strace, the app does not try to change iface
group membership, it does bind of a udp socket to 0.0.0.0:port,
and uses SOL_IP, IP_ADD_MEMBERSHIP to add this socket to a
multicast group.

Note: there's just ONE machine involved, and two applications
running on it.

Why with 3.0+, the non-multicast "tinc" interface is shown
as a member of 228.5.6.7 group, but not eth0 which actually
*is* multicast?

For the record, this "big java app" is Oracle reports server.
I've no idea why they use multicast to find two components
of one thing running on the same machine, and does not provide
any usable unicast solution...

Thanks!

/mjt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ