[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5073AACF.5030608@oracle.com>
Date: Tue, 09 Oct 2012 12:40:47 +0800
From: Jie Liu <jeff.liu@...cle.com>
To: Venkat Venkatsubra <venkat.x.venkatsubra@...cle.com>
CC: rds-devel@....oracle.com, Dan Carpenter <dan.carpenter@...cle.com>,
davem@...emloft.net, James Morris <james.l.morris@...cle.com>,
netdev@...r.kernel.org
Subject: Re: [PATCH] RDS: Fix spinlock recursion for rds over tcp transmit
Hi Venkat,
On 10/09/12 00:47, Venkat Venkatsubra wrote:
> On 10/6/2012 12:42 AM, Jeff Liu wrote:
>> Hello,
>>
>> RDS ping/pong over TCP feature has broke for years(2.6.39 to 3.6.0)
>> since we have to set TCP cork and
>> call kerenel_sendmsg() to reply a ping requirement which both need to
>> lock "struct sock *sk".
>> However, this lock has already been hold before our
>> rda_tcp_data_ready() callback is triggerred.
>> As a result, we always facing spinlock recursion which would
>> resulting in system panic...
>>
>> Given that RDS ping is a special kind of message, we don't need to
>> reply it as
>> soon as possible, IMHO, we can schedule it to work queue as a delayed
>> response to
>> make TCP transport totally works. Also, I think we can using the
>> system default
>> work queue to serve it to reduce the possible impact on general TCP
>> transmit.
>>
> Hi Jeff,
>
> I was looking at the history of changes to rds_send_pong.
> At one time rds_send_pong did this to transmit the pong message:
> queue_delayed_work(rds_wq, &conn->c_send_w, 0);
> instead of the current
> ret = rds_send_xmit(conn);
> i.e. the older versions did not have the deadlock problem and used to
> work once. ;-)
>
> I have suggestions for fixing it in a couple of other ways which you
> may want to consider
> to reduce the amount of code changes in a transport independent layer
> such as "send.c" for a specific underlying transport (tcp in this case).
Thanks for the feedback!
>
> 1. One option is to move back to the old way for all transports (IB,
> tcp, loopback) since
> queuing delay shouldn't be an issue for a diagnostic tool like
> rds-ping which is typically used just to test the connectivity
> and not for serious performance measurements.
So I prefer to your first suggestions since I have also tried to fix
this issue in this way at that time, of course, it really works fine. :)
And also, it could keep the code change as little as possible.
I changed my mind to queue pong message to the system default queue as
it might reduce the impact on general RDS TCP transmits,
but I turned out to be a bit overkill and with more code changes.
I'll send out the V2 patch for your review after a little while.
Thanks,
-Jeff
>
> 2. The underlying transport such as IB, loopback,TCP tells which
> method it wants to send the pong: queued way or send immediately.
> And the code change in rds_send_pong could then simply be:
> if (conn->c_flags & QUEUE_PONG)
> queue_delayed_work(rds_wq, &conn->c_send_w,0);
> else
> ret = rds_send_xmit(conn);
> (The above example codes are not complete. You will need to propagate
> this new flag to "conn" from "rds_transport", etc. at connection setup
> time)
>
> Venkat
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists