netdev - Re: [PATCH 3/9] [TCP]: Tighten tcp_sock's belt, drop left

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.64.0705311846390.14736@kivilampi-30.cs.helsinki.fi>
Date:	Thu, 31 May 2007 19:31:21 +0300 (EEST)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	David Miller <davem@...emloft.net>
cc:	Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH 3/9] [TCP]: Tighten tcp_sock's belt, drop left_out

Thanks for you comments...

On Thu, 31 May 2007, David Miller wrote:

> From: "Ilpo_Järvinen" <ilpo.jarvinen@...sinki.fi>
> Date: Sat, 26 May 2007 11:35:56 +0300
> 
> > It is easily calculable when needed and user are not that many
> > after all.
>
> This looks good, but are you absolutely sure we never used
> a stale value of tp->left_out on purpose?
> 
> I tried to audit this but there are so many cases :-)

Yes I know... So often complexity in these kind of things really
perplexes me too, in a way this is somewhat similar to the case with FRTO 
and no SACKed bits touched (which just looked very obvious and safe) but 
would have caused the dreaded S+L situation unless special handling for 
IsReno(tp) case would have been added.

> What I mean here is, was there a case where we updated
> sacked_out or lost_out, but then used left_out before
> it was updated, and we were doing this on purpose?

I understand what you mean even though I think that such approach would be 
very fragile if being done (and should be fixed). I was actually wondering 
earlier that could it be the only reason for the separate variable in the 
first place. Though so far I don't know any cases... ...but the code ain't 
always that trivial like you say :-), however, this after proven safe, 
will make it cleaner and will save next one to come from the same
wondering... ;-)

Anyway, I'll be try to do some grep tricks in order to reduce the number 
of cases that need an expensive manual audit to a smaller set where 
tcp_sync_left_out is not right after the adjustment. As plenty of 
functions end to tcp_sync_left_out without returning in between, it could 
help a lot, of course the calls within that function could have issues too 
but they're relatively easy to check then (common pattern is a trivial 
loop + sync as you know)... Usually only when the "stale state" leaks from 
a function, we're in troubles with the number of possible paths. ...I'll 
come back to this later on after verifying it and perhaps provide the 
same shortened list for you too so that you can see if I overlooked 
something... 

It might actually end up being easier to audit all tcp_packets_in_flight 
callers instead to see if there's always sync_left_out before them.

Another thing, I was thinking of renaming tcp_sync_left_out to 
tcp_verify_left_out now that it does only BUG traps... Do you have a hunch 
about this (from patch 8 msg): I wonder if we could get that BUG_ON place 
pinpointed more exactly in oops because now inlining makes it lose its 
original context as they always seem to be in tcp_ack, #define helps?
(i.e, how does it show up in oops if tcp_verify_left_out looks like this):

#define tcp_verify_left_out(tp) BUG_ON(...)

...does it still point to the tcp.h line xxxx then or to the
tcp_sync_left_out(tp) @ line yyyy in tcp_input.c? ...I'll test that
later on by myself in case you don't know the answer.

-- 
 i.