lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sun, 22 Apr 2012 09:07:11 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Jim Paris <jim@...n.com>
Cc:	linux-hotplug@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: udev missing events?

Jim Paris <jim@...n.com> writes:

> Eric W. Biederman wrote:
>> Jim Paris <jim@...n.com> wrote:
>> 
>> >(cc'd Eric and Milan because 5f71a296 and ebf4127c might be related)
>> >
>> >I'm trying to track down a problem that started with virt-manager not
>> >being able to connect to libvirtd.  Long story short, the problem is
>> >that "udevadm settle" is timing out, with no events in the udev queue.
>> >
>> >(I know it's bad for libvirt to rely on "udevadm settle", but it seems
>> >that this is exposing some other problem).
>> >
>> >This is with 3.2.14 from Debian.
>> >
>> >I can trigger the problem easily by just starting libvirtd and killing
>> >it.  Then, the counters are stuck here:
>> >
>> >  # cat /sys/kernel/uevent_seqnum
>> >  13593
>> >  # od -t d2 /dev/.udev/queue.bin | head -1
>> >  0000000  13592
>> >
>> >During the libvirtd startup, I ran "udevadm monitor", which seems to
>> >indicate that 13593 never made it to udev:
>> >
>> > # udevadm monitor --kernel --property
>> >  monitor will print the received events for:
>> >  KERNEL - the kernel uevent
>> >  
>> >KERNEL[537903.765581] add      /devices/virtual/net/lo/queues/rx-0
>> >(queues)
>> >  ACTION=add
>> >  DEVPATH=/devices/virtual/net/lo/queues/rx-0
>> >  SEQNUM=13589
>> >  SUBSYSTEM=queues
>> >  UDEV_LOG=3
>> >  
>> >KERNEL[537903.765683] add      /devices/virtual/net/lo/queues/tx-0
>> >(queues)
>> >  ACTION=add
>> >  DEVPATH=/devices/virtual/net/lo/queues/tx-0
>> >  SEQNUM=13590
>> >  SUBSYSTEM=queues
>> >  UDEV_LOG=3
>> >  
>> >KERNEL[537903.844154] remove   /devices/virtual/net/lo/queues/rx-0
>> >(queues)
>> >  ACTION=remove
>> >  DEVPATH=/devices/virtual/net/lo/queues/rx-0
>> >  SEQNUM=13591
>> >  SUBSYSTEM=queues
>> >  UDEV_LOG=3
>> >  
>> >KERNEL[537903.844192] remove   /devices/virtual/net/lo/queues/tx-0
>> >(queues)
>> >  ACTION=remove
>> >  DEVPATH=/devices/virtual/net/lo/queues/tx-0
>> >  SEQNUM=13592
>> >  SUBSYSTEM=queues
>> >  UDEV_LOG=3
>> >
>> >To try to track it further, I wrote a systemtap script (available at
>> >http://jim.sh/~jim/tmp/monitor.stp).  It dumps the result of
>> >netlink_broadcast_filtered at lib/kobject_uevent.c:248, and then dumps
>> >the uevent_seqnum and environment at the end of the same function.
>> >It seems to show that netlink_broadcast_filter is failing to send out
>> >the event, with -ESRCH:
>> >
>> >  $ sudo ./monitor.stp 
>> >  begin
>> >  -----
>> >  netlink_broadcast_filtered: -3 (ESRCH)
>> >  netlink_broadcast_filtered: -3 (ESRCH)
>> >  event was:
>> >    uevent_seqnum: 13588
>> >    "ACTION=add"
>> >    "DEVPATH=/devices/virtual/net/lo"
>> >    "SUBSYSTEM=net"
>> >    "INTERFACE=lo"
>> >    "IFINDEX=68"
>> >    "SEQNUM=13588"
>> >  -----
>> >  netlink_broadcast_filtered: 0x0
>> >  netlink_broadcast_filtered: -3 (ESRCH)
>> >  event was:
>> >    uevent_seqnum: 13589
>> >    "ACTION=add"
>> >    "DEVPATH=/devices/virtual/net/lo/queues/rx-0"
>> >    "SUBSYSTEM=queues"
>> >    "SEQNUM=13589"
>> >  -----
>> >  netlink_broadcast_filtered: 0x0
>> >  netlink_broadcast_filtered: -3 (ESRCH)
>> >  event was:
>> >    uevent_seqnum: 13590
>> >    "ACTION=add"
>> >    "DEVPATH=/devices/virtual/net/lo/queues/tx-0"
>> >    "SUBSYSTEM=queues"
>> >    "SEQNUM=13590"
>> >  -----
>> >  netlink_broadcast_filtered: 0x0
>> >  netlink_broadcast_filtered: -3 (ESRCH)
>> >  event was:
>> >    uevent_seqnum: 13591
>> >    "ACTION=remove"
>> >    "DEVPATH=/devices/virtual/net/lo/queues/rx-0"
>> >    "SUBSYSTEM=queues"
>> >    "SEQNUM=13591"
>> >  -----
>> >  netlink_broadcast_filtered: 0x0
>> >  netlink_broadcast_filtered: -3 (ESRCH)
>> >  event was:
>> >    uevent_seqnum: 13592
>> >    "ACTION=remove"
>> >    "DEVPATH=/devices/virtual/net/lo/queues/tx-0"
>> >    "SUBSYSTEM=queues"
>> >    "SEQNUM=13592"
>> >  -----
>> >  netlink_broadcast_filtered: -3 (ESRCH)
>> >  netlink_broadcast_filtered: -3 (ESRCH)
>> >  event was:
>> >    uevent_seqnum: 13593
>> >    "ACTION=remove"
>> >    "DEVPATH=/devices/virtual/net/lo"
>> >    "SUBSYSTEM=net"
>> >    "INTERFACE=lo"
>> >    "IFINDEX=68"
>> >    "SEQNUM=13593"
>> >  -----
>> >  ^Cend
>> >
>> >
>> >Is it expected that those events would not be sent like that?
>> 
>> Roughly. Events for network devices are only delivered to sockets in the same netwoerk namespace.
>> 
>> So you really shouldn't see much of an event for the creationof a network namespace.
>> 
>> >If not, is there something else I can do to track this further?
>> 
>> >Or, if that's expected behavior, is there anything "udevadm settle"
>> >can do to avoid the timeout?
>> 
>> I don't think so.  Udev for the host should not see events for things that happen inside a container.
>>
>> I am a little mystified that you see as much of the add of lo as you do in the udem monitor command.
>> 
>> I am infering what is going on with libvirt and your machine and assuming containers are involved because of the additions and removals of lo.
>
> I'm not using them on purpose, but it's caused by a probing clone() in
> libvirt's lxcContainerAavilable, followed by a "ip link set lo netns -1"
> in lxcCheckNetNsSupport.  It can be recreated with just:
>
>   #include <unistd.h>
>   #include <sched.h>
>   static int dummy(void *argv) { _exit(0); }
>   main() {
>           char stack[4096];
>           clone(dummy, stack+4096, CLONE_NEWNET, NULL);
>           wait();
>           system("ip link set lo netns -1");
>   }
>
> Running that program causes "udevadm settle" to stop working due to
> the filtered events.

By stop working you mean "udevadm settle" waits because it thinks there
are unprocessed events but those events are never found, and eventually
"udevadm settle" times out?

I'm sorry I don't see the problem.

I think the most we could do would be to have a per network namespace
sequence number, but I don't get see the problem with skipping a sequence
number or two.  Does that cause udev to explode or something?  It
doesn't sound like udev has problems.

It does look like we aren't filtering all of the events for the new
loopback device properly.  We should be filtering the events for
/sys/devices/virtual/net/lo/queues/rx-0 and friends.  But fixing that
problem looks like it will only make the gap in sequence numbers
larger, and if anything make the "udevadm settle" problem worse.

What precisely is the problem you are seeing?

Of course in this instance I can't see how the version of libvirt you
are playing with would ever have worked.  It is relying on features to
act in a way the simply don't act.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ