Message-ID: <20080617072658.GA12535@elte.hu>
Date:	Tue, 17 Jun 2008 09:26:58 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	David Miller <davem@...emloft.net>
Cc:	kuznet@....inr.ac.ru, vgusev@...nvz.org, mcmanus@...ksong.com,
	xemul@...nvz.org, netdev@...r.kernel.org,
	ilpo.jarvinen@...sinki.fi, linux-kernel@...r.kernel.org
Subject: Re: [TCP]: TCP_DEFER_ACCEPT causes leak sockets


* David Miller <davem@...emloft.net> wrote:

> From: Ingo Molnar <mingo@...e.hu>
> Date: Fri, 13 Jun 2008 13:47:46 +0200
> 
> > this threw the warning below - never saw that before in thousands of 
> > bootups and this was the only networking change that happened. 
> > config and bootlog attached. Might be unlucky coincidence.
> 
> So that we can make forward progress here, please confirm that the 
> following patch against -tip makes your problems go away for good.
> 
> Once you can confirm I will push it to Linus.

i triggered the net/sched/sch_generic.c:222 warning once more meanwhile 
(yesterday) with the full revert applied (which i think is the same as 
the patch below).

So i think it's either some unlucky coincidence or some timing 
relationship - perhaps the change impacts packet ordering for certain 
workload patterns? [but that same condition can occur without that patch 
too]

I also checked kerneloops.org and this warning seems to have been 
reported by others as well - although it's not triggering heavily. In 
some of those other reports the warning came together with a dead 
interface, while in my case it's just a warning with still working 
networking.

So since there's no clear bug pattern and no sure reproducibility on my 
side i'd suggest we track this problem separately and "do nothing" right 
now. I've excluded this warning from my 'is the freshly booted kernel 
buggy' list of conditions of -tip testing so it's not holding me up.

and i can apply any test-patch if that would be helpful - if it does a 
WARN_ON() i'll notice it. (pure extra debug printks with no stack trace 
are much harder to notice in automated tests)
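
To illustrate why a WARN_ON() is easier to catch than a bare printk, here is a minimal sketch of a boot-log check. It is not the actual -tip test tooling: the function names, the ignore list, and the assumed "WARNING: at file:line" header format are all illustrative assumptions.

```python
import re

# Assumed shape of a WARN_ON() header line in the kernel log, e.g.
# "WARNING: at net/sched/sch_generic.c:222 dev_watchdog+0x.../0x...()".
# The exact format varies between kernel versions.
WARN_RE = re.compile(r"WARNING: at (?P<file>\S+):(?P<line>\d+)")

# Hypothetical ignore list: warnings that are tracked separately and
# should not mark a freshly booted kernel as buggy.
IGNORED = {("net/sched/sch_generic.c", 222)}


def scan_bootlog(text, ignored=IGNORED):
    """Return (file, line) for every WARN_ON-style hit not on the ignore list."""
    hits = []
    for row in text.splitlines():
        m = WARN_RE.search(row)
        if m is None:
            continue  # plain debug printks carry no marker to match on
        key = (m.group("file"), int(m.group("line")))
        if key not in ignored:
            hits.append(key)
    return hits


def kernel_looks_buggy(text, ignored=IGNORED):
    """True if the log contains at least one warning that is not ignored."""
    return bool(scan_bootlog(text, ignored))
```

A free-form debug printk has no fixed marker like this, so a test would need a separate pattern for every message it should catch.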

btw., it would be nice if there was some .config driven networking debug 
option that randomized packet ordering in the tx and rx queue. 
(transparently enabled, with zero-config on the userspace side)

I.e. it would have an (expensive, because not O(1)) debug mechanism that 
randomized things - it would insert new packets into a random place 
within the queue where it gets queued. We could hit races and rarer 
codepaths much sooner that way - as especially in LAN based testing 
there's a strong natural ordering of packets so randomizing it 
artificially looks promising to me.
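
A minimal user-space sketch of the idea, assuming a simple list-backed queue. The class and helper names are hypothetical and this is not kernel code; it only shows each new packet being inserted at a uniformly random position instead of at the tail.

```python
import random


class RandomizingQueue:
    """FIFO queue that, in debug mode, enqueues each packet at a random slot."""

    def __init__(self, randomize=False, seed=None):
        self.randomize = randomize
        self.rng = random.Random(seed)  # private RNG so runs can be replayed
        self.packets = []

    def enqueue(self, pkt):
        if self.randomize:
            # Every slot from the head (0) to the tail (len) is equally likely.
            pos = self.rng.randint(0, len(self.packets))
            self.packets.insert(pos, pkt)  # linear in queue length here
        else:
            self.packets.append(pkt)  # normal in-order queueing

    def dequeue(self):
        return self.packets.pop(0)  # always taken from the head

    def __len__(self):
        return len(self.packets)


def drain_order(n, randomize, seed=None):
    """Enqueue packets 0..n-1, then return the order they are dequeued in."""
    q = RandomizingQueue(randomize=randomize, seed=seed)
    for pkt in range(n):
        q.enqueue(pkt)
    return [q.dequeue() for _ in range(len(q))]
```

With `randomize=False` the drain order is the arrival order. With `randomize=True` it is a permutation of the arrival order, which is what would expose ordering-dependent code paths. A fixed seed keeps a failing order reproducible.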

If you make that new option =y enable-able in the .config (dependent on 
DEBUG_KERNEL && default off, etc.), and as long as it does not have to 
be configured on the userspace side (i'm testing unmodified userspace 
images with default distro installs, etc.) the randconfig test will 
still be able to reach it in a percentage of the tests and i think we'll 
be able to hit a lot of exciting races much sooner than with the normal 
in-order/FIFO queueing methods.

it's basically massively parallel coverage testing. It doesn't matter how 
unbelievably slow packet ordering randomization might be, the coverage 
testing it would do would be worth gold i'm sure. (I'd love to test 
something like that in -tip, if it comes in form of some standalone 
patch against a mainline-ish tree.)

	Ingo
