Date:	Thu, 20 Nov 2008 00:33:06 -0800 (PST)
From:	David Miller <davem@...emloft.net>
To:	greearb@...delatech.com
Cc:	rick.jones2@...com, netdev@...r.kernel.org, kaber@...sh.net
Subject: Re: ARP table question

From: Ben Greear <greearb@...delatech.com>
Date: Mon, 17 Nov 2008 17:50:50 -0800

> Rick Jones wrote:
> > Ben Greear wrote:
> >> Rick Jones wrote:
> >>
> >>>> +static unsigned long neigh_rand_retry(struct neighbour* neigh) {
> >>>> +    if (neigh->parms->retrans_rand_backoff) {
> >>>> +        return net_random() % neigh->parms->retrans_rand_backoff;
> >>>> +    }
> >>>> +    return 0;
> >>>> +}
> >>>> +
> >>>>  /* Called when a timer expires for a neighbour entry. */
> >>>
> >>>
> >>> I thought that mod was something we tried to avoid?  Could you instead use something that isn't random but perhaps varies among all the requests?  Say some of the low-order bits of the IP being resolved?
> >>
> >>
> >> This is only called when we are going to retransmit an ARP, which shouldn't
> >> be in any sort of hot path, so I figured MOD was fine.
> >>
> >> The net_random is a very cheap method (last I checked), as well.
> >>
> >> So, I think that part is OK as it is, but I'm open to
> >> persuasion :)
> > Perhaps I'm confused, or simply channeling Emily Litella again, but if you only do this on the 1st through Nth retransmissions (i.e., after the first retransmission timer has popped), don't you still have a thundering herd problem on the first transmission and the first retransmission of ARP requests?
> 
> You'd certainly have it on the first transmission, but I think from there on
> the randomness should kick in.  This is a pretty rare case, and I'd rather
> not slow down the initial ARP.  If we *are* in the overload situation, then
> the network can just purge/drop/whatever the initial flood and then the
> retransmits should start doing their random thing.  On my system, it still
> takes maybe 30 seconds for all the ARPs to resolve since a good deal of
> the requests and/or responses are being lost.
> 
> After some more testing, I can still get it into a bad
> state if I have a retrans timer of 1 sec and a randomness of 5 secs
> and manage to cause all 1000 arp entries to go stale at once (by
> yanking a cable, for instance).
> 
> It seems I have to bump up the base timer to 3-5 seconds (I'm
> leaving the random backoff at 5 secs as well).

This scheme still seems hackish to me, so I'm going to defer on this
for now.
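
For reference, the two jitter ideas discussed in the quoted text can be
sketched in user-space C as below. This is only an illustration under
stated assumptions, not the patch itself: struct retrans_parms,
rand_jitter(), addr_jitter() and next_retrans_delay() are hypothetical
names, rand() stands in for the kernel's net_random(), times are taken
to be in milliseconds, and the address is assumed to be in host byte
order.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-neighbour parameters in the patch. */
struct retrans_parms {
    unsigned long retrans_time;         /* base retransmit interval (ms) */
    unsigned long retrans_rand_backoff; /* exclusive upper bound on jitter (ms); 0 disables */
};

/*
 * Random jitter in [0, retrans_rand_backoff), following the shape of
 * neigh_rand_retry() in the quoted patch. rand() is an assumption here,
 * used in place of net_random().
 */
static unsigned long rand_jitter(const struct retrans_parms *p)
{
    if (p->retrans_rand_backoff)
        return (unsigned long)rand() % p->retrans_rand_backoff;
    return 0;
}

/*
 * Deterministic alternative along the lines Rick suggested: take some
 * low-order bits of the IPv4 address being resolved. The mask is a
 * hypothetical parameter and should be of the form 2^n - 1, which also
 * avoids the modulo.
 */
static unsigned long addr_jitter(uint32_t ip_host_order, uint32_t mask)
{
    return (unsigned long)(ip_host_order & mask);
}

/* Delay until the next retransmission: base interval plus jitter. */
static unsigned long next_retrans_delay(const struct retrans_parms *p,
                                        unsigned long jitter)
{
    return p->retrans_time + jitter;
}
```

With retrans_time = 1000 and retrans_rand_backoff = 5000 (the 1 sec /
5 sec case from the quoted test), next_retrans_delay(&p, rand_jitter(&p))
falls somewhere in [1000, 6000). As in the patch, neither variant
spreads out the very first transmission.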