Message-Id: <1448557398.912656.450874953.4CA824F0@webmail.messagingengine.com>
Date: Thu, 26 Nov 2015 18:03:18 +0100
From: Hannes Frederic Sowa <hannes@...essinduktion.org>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Rainer Weikusat <rweikusat@...ileactivedefense.com>,
Eric Dumazet <edumazet@...gle.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Benjamin LaHaise <bcrl@...ck.org>,
"David S. Miller" <davem@...emloft.net>,
Al Viro <viro@...iv.linux.org.uk>,
David Howells <dhowells@...hat.com>,
Ying Xue <ying.xue@...driver.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
syzkaller <syzkaller@...glegroups.com>,
Kostya Serebryany <kcc@...gle.com>,
Alexander Potapenko <glider@...gle.com>,
Sasha Levin <sasha.levin@...cle.com>
Subject: Re: use-after-free in sock_wake_async
On Thu, Nov 26, 2015, at 16:51, Eric Dumazet wrote:
> On Thu, 2015-11-26 at 14:32 +0100, Hannes Frederic Sowa wrote:
> > Hannes Frederic Sowa <hannes@...essinduktion.org> writes:
> >
> >
> > > I have seen filesystems already doing so in .destroy_inode, that's why I
> > > am asking. The allocation happens the same way as we do with sock_alloc,
> > > e.g. shmem. I actually thought that struct inode already provides an
> > > rcu_head for exactly that reason.
> >
> > E.g.:
>
> > +static void sock_destroy_inode(struct inode *inode)
> > +{
> > + call_rcu(&inode->i_rcu, sock_cache_free_rcu);
> > +}
>
> I guess you missed few years back why we had to implement
> SLAB_DESTROY_BY_RCU for TCP sockets to not destroy performance.
I think I wasn't even subscribed to netdev@ at that time, so I probably
missed it. A "few years back" is 7 by now. :}
> By adding RCU grace period before reuse of this inode (about 640 bytes
> today), you are asking the CPU to evict from its cache precious content,
> and slow down some workloads, adding lot of ram pressure, as the cpu
> allocating a TCP socket will have to populate its cache for a cold
> inode.
My rationale was like this: we already use RCU to free the wq, so we
wouldn't add any more callbacks than the current code does. sock_alloc is
currently 1136 bytes, which is huge, about 18 cachelines. I wouldn't think
it matters a lot, as we thrash the cache anyway. tcp_sock is about 45
cachelines right now, whew.
Also, isn't the reason SLUB exists that it can track memory regions
per-cpu?
Anyway, I am only speculating why it could be tried. I probably need to
do some performance experiments.
> The reason we put in a small object the RCU protected fields should be
> pretty clear.
Yes, I thought about that.
> Do not copy code that people wrote in other layers without understanding
> the performance implications.
Duuh. :)
Bye,
Hannes