lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190628154841.32b96fb1@cakuba.netronome.com>
Date:   Fri, 28 Jun 2019 15:48:41 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     John Fastabend <john.fastabend@...il.com>
Cc:     daniel@...earbox.io, ast@...nel.org, netdev@...r.kernel.org,
        edumazet@...gle.com, bpf@...r.kernel.org
Subject: Re: [PATCH 1/2] tls: remove close callback sock unlock/lock and
 flush_sync

On Fri, 28 Jun 2019 12:40:29 -0700, John Fastabend wrote:
> The lock() is already held when entering unhash() side so need to
> handle this case as well,
> 
> CPU 0 (free)          CPU 1 (wq)
> 
> lock(sk)              ctx = tls_get_ctx(sk) <- need to be check null ptr
> sk_prot->unhash()
>   set_bit()
>   cancel_work()
>   ...
>   kfree(ctx)
> unlock(sk)
> 
> but using cancel and doing an unlikely(!ctx) check should be
> sufficient to handle wq. 

I'm not sure we can kfree ctx, the work struct itself is in it, no?

> What I'm not sure how to solve now is
> in patch 2 of this series unhash is still calling strp_done
> with the sock lock. Maybe we need to do a deferred release
> like sockmap side?

Right, we can't do anything that sleeps in unhash, since we're holding
the spinlock there, not the "owner" lock.

> Trying to drop the lock and then grabbing it again doesn't
> seem right to me seems based on comment in tcp_abort we
> could potentially "race with userspace socket closes such
> as tcp_close". iirc I think one of the tls splats from syzbot
> looked something like this may have happened.
> 
> For now I'm considering adding a strp_cancel() op. Seeing
> we are closing() the socket and tearkng down we can probably
> be OK with throwing out strp results.

But don't we have to flush the work queues before we free ctx?  We'd
need to alloc a workqueue and schedule a work to flush the other works
and then free?

Why can't tls sockets exist outside of established state?  If shutdown
doesn't call close, perhaps we can add a shutdown callback?  It doesn't
seem to be called from BH?

Sorry for all the questions, I'm not really able to fully wrap my head
around this. I also feel like I'm missing the sockmap piece that may
be why you prefer unhash over disconnect.

FWIW Davide's ULP diag support patches will require us to most likely
free ctx with kfree_rcu(). diag only has a ref on struct sock, so if 
we want to access ctx we need RCU or to lock every socket. It's a
little bit of an abuse of RCU, because the data under our feet may
actually change, but the fields we dump will only get inited once
after ulp is installed.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ