Message-ID: <1273210774.2222.45.camel@edumazet-laptop>
Date: Fri, 07 May 2010 07:39:34 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Bhaskar Dutta <bhaskie@...il.com>
Cc: Stephen Hemminger <shemminger@...tta.com>,
Ben Hutchings <bhutchings@...arflare.com>,
netdev@...r.kernel.org
Subject: Re: TCP-MD5 checksum failure on x86_64 SMP
On Thursday, 06 May 2010 at 17:25 +0530, Bhaskar Dutta wrote:
> On Thu, May 6, 2010 at 12:23 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> > I am not familiar with this code, but I suspect the same per_cpu data can be
> > used at the same time by a sender (process context) and by a receiver
> > (softirq context).
> >
> > To trigger this, you need at least two active md5 sockets.
> >
> > tcp_get_md5sig_pool() should probably disable bh to make sure the current
> > cpu won't be preempted by softirq processing.
> >
> >
> > Something like :
> >
> > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > index fb5c66b..e232123 100644
> > --- a/include/net/tcp.h
> > +++ b/include/net/tcp.h
> > @@ -1221,12 +1221,15 @@ struct tcp_md5sig_pool *tcp_get_md5sig_pool(void)
> >  	struct tcp_md5sig_pool *ret = __tcp_get_md5sig_pool(cpu);
> >  	if (!ret)
> >  		put_cpu();
> > +	else
> > +		local_bh_disable();
> >  	return ret;
> >  }
> >
> >  static inline void tcp_put_md5sig_pool(void)
> >  {
> >  	__tcp_put_md5sig_pool();
> > +	local_bh_enable();
> >  	put_cpu();
> >  }
> >
> >
> >
>
> I put in the above change and ran some load tests with around 50
> active TCP connections doing MD5.
> I could see only 1 bad packet in 30 min (earlier the problem used to
> occur instantaneously and repeatedly).
>
> I think there is another possibility of being preempted when calling
> tcp_alloc_md5sig_pool(): this function releases the spinlock when
> calling __tcp_alloc_md5sig_pool().
>
> I will run some more tests after changing the tcp_alloc_md5sig_pool
> and see if the problem is completely resolved.

I can't see a race involving the spinlock in
tcp_alloc_md5sig_pool()/__tcp_alloc_md5sig_pool().

We allocate structures for all cpus, so preemption/migration should be
OK.

Could you elaborate, please?
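
For illustration only, the softirq race suspected earlier in the thread
can be modelled in userspace C. This is a sketch under stated assumptions,
not kernel code: toy_pool, toy_hash(), toy_irq(), toy_softirq() and the
hash_*() helpers are all hypothetical names, the "hash" is a trivial
polynomial rather than MD5, and the interrupt is simulated by a plain
function call in the middle of the computation. It only shows why scratch
state shared between process context and softirq context on one cpu needs
bh disabled while it is in use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one CPU's tcp_md5sig_pool scratch state. */
struct toy_pool {
	uint64_t acc;
};

static struct toy_pool cpu_pool;   /* the model has a single CPU */
static int bh_disabled;            /* models local_bh_disable()/local_bh_enable() */
static int softirq_pending;

static uint64_t toy_hash(const char *s, int interruptible);

/* Receive path: hashes an incoming segment using the same per-cpu pool. */
static void toy_softirq(void)
{
	toy_hash("rx-segment", 0);
}

/* An interrupt arrives: run the softirq now, or defer it if bh is disabled. */
static void toy_irq(void)
{
	if (bh_disabled)
		softirq_pending = 1;
	else
		toy_softirq();
}

/* Polynomial toy hash that keeps its running state in the shared pool. */
static uint64_t toy_hash(const char *s, int interruptible)
{
	struct toy_pool *p = &cpu_pool;
	size_t i, n = strlen(s);

	p->acc = 0;
	for (i = 0; i < n; i++) {
		p->acc = p->acc * 31 + (unsigned char)s[i];
		if (interruptible && i == n / 2)
			toy_irq();      /* interrupt lands mid-computation */
	}
	return p->acc;
}

/* Sender without protection: the softirq can clobber the pool. */
uint64_t hash_unprotected(const char *s)
{
	return toy_hash(s, 1);
}

/* Sender with bh disabled around the pool use, as in the proposed patch. */
uint64_t hash_protected(const char *s)
{
	uint64_t r;

	bh_disabled = 1;
	r = toy_hash(s, 1);
	bh_disabled = 0;
	if (softirq_pending) {          /* deferred work runs after "bh enable" */
		softirq_pending = 0;
		toy_softirq();
	}
	return r;
}

/* Reference value computed with no interrupt at all. */
uint64_t hash_reference(const char *s)
{
	return toy_hash(s, 0);
}
```

In this model, hash_protected("tx-segment") equals
hash_reference("tx-segment"), while hash_unprotected("tx-segment") does
not, because the simulated softirq overwrites the shared accumulator
halfway through the sender's computation.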