lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAB9dFdv8zzODKMr2Ane8jSKxNhJdDMyPZ8YaYJqL1cW0TFOO4g@mail.gmail.com>
Date:	Mon, 11 Jul 2016 18:17:39 -0300
From:	Marc Dionne <marc.c.dionne@...il.com>
To:	Pablo Neira Ayuso <pablo@...filter.org>
Cc:	Florian Westphal <fw@...len.de>, netdev <netdev@...r.kernel.org>,
	Thorsten Leemhuis <regressions@...mhuis.info>,
	netfilter-devel@...r.kernel.org
Subject: Re: Multi-thread udp 4.7 regression, bisected to 71d8c47fc653

On Mon, Jul 11, 2016 at 1:26 PM, Pablo Neira Ayuso <pablo@...filter.org> wrote:
> On Sun, Jul 10, 2016 at 04:48:26PM -0300, Marc Dionne wrote:
>> An update here since I've had some interactions with Pablo off list.
>>
>> Further testing shows that the underlying cause of the different test
>> results is a udp packet that has a bogus source port number.  In the
>> test the server process tries to send an ack to the bogus port and the
>> flow is disrupted.
>>
>> Notes:
>> - The packet with the bad source port is from a sendmsg() call that
>> has hit the connection tracker clash code introduced by 71d8c47fc653
>> - Packets are successfully sent after the bad one, from the same
>> socket, with the correct source port number
>> - The problem does not reproduce with 71d8c47fc653 reverted, or
>> without nf_conntrack loaded
>> - The bogus port numbers start at 1024, bumping up by 1 every few
>> times the problem occurs (1025, 1026, etc.)
>> - The patch above does not change the behaviour
>> - Enabling lockdep does not show anything
>>
>> Our workaround for the original race was to retry sendmsg() once on
>> EPERM errors, and that had been effective.
>> I can trigger the insertion clash easily with some simple test code,
>> but I have not been able so far to reproduce the packets with bad
>> source port numbers with some simpler code that I could share.
>
> The NAT nul-binding is triggering the source port mangling, even if
> there is no NAT rules in place. The following patch skips the clash
> resolution for NAT by now since we don't see a simple solution for
> this case at the moment.
>
> Could you give a try to this patch in these two cases?
>
> 1) No NAT in place: Make sure iptable_nat module is not there. Or if
>    you're using nf_tables, just make sure you have no nat chains at
>    all.
>
> 2) With NAT in place, you hit back the EPERM errors that you've
>    observed so far.
>
> Please, test both scenarios and report back. Thanks.

Hi Pablo,

Testing out your patch:

1) With no NAT in place, the clash resolution happens, with no side
effects.  No EPERM errors are seen.

2) With ip(6)table_nat loaded, the clash resolution fails and I get
some EPERM errors from sendmsg(), same as before 71d8c47fc653.

Turns out that even though I have no NAT rules in my iptables config,
the system also had firewalld active and that caused the modules to be
loaded.

So the bottom line is that the patch looks good to me..

Thanks,
Marc

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ