Message-ID: <CALx6S369T_hvoTgyHbmeSiR2p3d68h+0tMKqMmcGYLrKiN3JMA@mail.gmail.com>
Date: Tue, 13 Dec 2016 15:32:09 -0800
From: Tom Herbert <tom@...bertland.com>
To: Craig Gallek <kraigatgoog@...il.com>
Cc: Josef Bacik <jbacik@...com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Eric Dumazet <eric.dumazet@...il.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Soft lockup in inet_put_port on 4.6
On Tue, Dec 13, 2016 at 3:03 PM, Craig Gallek <kraigatgoog@...il.com> wrote:
> On Tue, Dec 13, 2016 at 3:51 PM, Tom Herbert <tom@...bertland.com> wrote:
>> I think there may be some suspicious code in inet_csk_get_port. At
>> tb_found there is:
>>
>>     if (((tb->fastreuse > 0 && reuse) ||
>>          (tb->fastreuseport > 0 &&
>>           !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>           sk->sk_reuseport && uid_eq(tb->fastuid, uid))) &&
>>         smallest_size == -1)
>>             goto success;
>>     if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, true)) {
>>             if ((reuse ||
>>                  (tb->fastreuseport > 0 &&
>>                   sk->sk_reuseport &&
>>                   !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>                   uid_eq(tb->fastuid, uid))) &&
>>                 smallest_size != -1 && --attempts >= 0) {
>>                     spin_unlock_bh(&head->lock);
>>                     goto again;
>>             }
>>             goto fail_unlock;
>>     }
>>
>> AFAICT there is redundancy in these two conditionals. The same clause
>> is being checked in both: (tb->fastreuseport > 0 &&
>> !rcu_access_pointer(sk->sk_reuseport_cb) && sk->sk_reuseport &&
>> uid_eq(tb->fastuid, uid)) && smallest_size == -1. If this is true, the
>> first conditional should be hit, goto success, and the second will never
>> evaluate that part to true, unless the sk is changed (do we need
>> READ_ONCE for sk->sk_reuseport_cb?).
> That's an interesting point... It looks like this function also
> changed in 4.6 from using a single local_bh_disable() at the beginning
> with several spin_lock(&head->lock) to exclusively
> spin_lock_bh(&head->lock) at each locking point. Perhaps the full bh
> disable variant was preventing the timers in your stack trace from
> running interleaved with this function before?
Could be, although dropping the lock shouldn't be able to affect the
search state. TBH, I'm a little lost reading the function; the
SO_REUSEPORT handling is pretty complicated. For instance,
rcu_access_pointer(sk->sk_reuseport_cb) is checked three times in that
function and also in every call to inet_csk_bind_conflict. I wonder if
we can simplify this under the assumption that SO_REUSEPORT is only
allowed if the port number (snum) is explicitly specified.
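
To make the redundancy concrete, here is a rough userspace sketch (not
kernel code, and not a proposed patch) that evaluates the repeated
reuseport clause once and feeds it to both conditionals. Everything in
it is a stand-in: mock_sock, mock_bucket, reuseport_fast_ok(),
tb_found_decide() and the TB_* values are hypothetical names, the
bind_conflict() result is passed in as a plain bool, and TB_CONTINUE
only means "fall through to whatever follows the quoted snippet". It
also assumes sk does not change between the two checks, which is
exactly the open question above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, NOT the kernel's struct sock or
 * struct inet_bind_bucket. */
struct mock_sock {
	bool reuseport;        /* models sk->sk_reuseport */
	void *reuseport_cb;    /* models sk->sk_reuseport_cb */
	unsigned int uid;      /* models the uid compared via uid_eq() */
};

struct mock_bucket {
	int fastreuse;         /* models tb->fastreuse */
	int fastreuseport;     /* models tb->fastreuseport */
	unsigned int fastuid;  /* models tb->fastuid */
};

/* The clause that appears in both conditionals of the quoted code. */
static bool reuseport_fast_ok(const struct mock_sock *sk,
			      const struct mock_bucket *tb)
{
	return tb->fastreuseport > 0 &&
	       sk->reuseport &&
	       sk->reuseport_cb == NULL &&
	       tb->fastuid == sk->uid;
}

/* Hypothetical outcomes standing in for the goto targets. */
enum tb_found_action {
	TB_SUCCESS,   /* goto success */
	TB_RETRY,     /* unlock and goto again */
	TB_FAIL,      /* goto fail_unlock */
	TB_CONTINUE   /* neither branch taken; code after the snippet runs */
};

/* Mirrors the two quoted conditionals with the clause evaluated once.
 * "conflict" stands in for the bind_conflict() return value. */
static enum tb_found_action
tb_found_decide(const struct mock_sock *sk, const struct mock_bucket *tb,
		bool reuse, bool conflict, int smallest_size, int *attempts)
{
	bool rp_ok = reuseport_fast_ok(sk, tb);

	if (((tb->fastreuse > 0 && reuse) || rp_ok) && smallest_size == -1)
		return TB_SUCCESS;

	if (conflict) {
		/* Short-circuit order kept, so *attempts is decremented
		 * only when the earlier terms are true, as in the original. */
		if ((reuse || rp_ok) && smallest_size != -1 &&
		    --(*attempts) >= 0)
			return TB_RETRY;
		return TB_FAIL;
	}
	return TB_CONTINUE;
}
```

With this shape, rp_ok can only matter in the second conditional when
smallest_size != -1, since the smallest_size == -1 case has already
returned TB_SUCCESS.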
Tom