Message-ID: <20200321010246.GC3828@localhost.localdomain>
Date:   Fri, 20 Mar 2020 22:02:46 -0300
From:   Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
To:     Qiujun Huang <hqjagain@...il.com>
Cc:     "David S. Miller" <davem@...emloft.net>, vyasevich@...il.com,
        nhorman@...driver.com, Jakub Kicinski <kuba@...nel.org>,
        linux-sctp@...r.kernel.org, netdev <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>, anenbupt@...il.com
Subject: Re: [PATCH v3] sctp: fix refcount bug in sctp_wfree

On Sat, Mar 21, 2020 at 07:53:29AM +0800, Qiujun Huang wrote:
...
> > > So, sctp_wfree was not called to destroy SKB)
> > >
> > > then migrate happened
> > >
> > >       sctp_for_each_tx_datachunk(
> > >       sctp_clear_owner_w);
> > >       sctp_assoc_migrate();
> > >       sctp_for_each_tx_datachunk(
> > >       sctp_set_owner_w);
> > > SKB was not in the outq, and was not changed to newsk
> >
> > The real fix is to fix the migration to the new socket, though the
> > situation in which it is happening is still not clear.
> >
> > The 2nd sendto() call on the reproducer is sending 212992 bytes on a
> > single call. That's usually the whole sndbuf size, and will cause
> > fragmentation to happen. That means the datamsg will contain several
> > skbs. But still, the sacked chunks should be freed if needed while the
> > remaining ones are left on the queues they are currently on.
> 
> In sctp_sendmsg_to_asoc(), the datamsg holds a reference on each of its
> chunks, with the result that the sacked chunks can't be freed.

Right! Now I see it, thanks.
So in the end it's not a locking race condition. The migration code just
isn't iterating over the lists properly.
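
To make the point about the extra reference concrete, here is a toy
userspace model of a reference-counted chunk. It is only a sketch:
toy_chunk, toy_hold() and toy_put() are made-up names for illustration,
not the kernel's struct sctp_chunk, sctp_chunk_hold() or sctp_chunk_put().

```c
#include <assert.h>

/* Toy stand-in for a reference-counted chunk (hypothetical, not kernel code). */
struct toy_chunk {
	int refcnt;	/* number of holders */
	int freed;	/* set once the last reference is dropped */
};

/* Take an extra reference, as a sender might while it still uses the chunk. */
static void toy_hold(struct toy_chunk *c)
{
	c->refcnt++;
}

/* Drop a reference; the chunk is only "freed" when the count reaches zero. */
static void toy_put(struct toy_chunk *c)
{
	if (--c->refcnt == 0)
		c->freed = 1;
}
```

Starting from one reference owned by the queue, a toy_hold() followed by
a single toy_put() (the queue letting go after a SACK) leaves the chunk
alive. It only goes away once the other holder calls toy_put() as well.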

> 
> list_for_each_entry(chunk, &datamsg->chunks, frag_list) {
> sctp_chunk_hold(chunk);
> sctp_set_owner_w(chunk);
> chunk->transport = transport;
> }
> 
> any ideas to handle it?

sctp_for_each_tx_datachunk() needs to be aware of this situation.
Rather than iterating only over the chunk lists, it should iterate
over the datamsgs. Something like the below (only compile tested).

With that, the old socket can be freed independently of the new one.
Otherwise, if this association gets stuck on retransmissions or
similar, the old socket would not be freed until that clears.

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index fed26a1e9518..85c742310d26 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -151,9 +151,10 @@ static void sctp_for_each_tx_datachunk(struct sctp_association *asoc,
 				       void (*cb)(struct sctp_chunk *))
 
 {
+	struct sctp_datamsg *msg, *prev_msg = NULL;
 	struct sctp_outq *q = &asoc->outqueue;
 	struct sctp_transport *t;
-	struct sctp_chunk *chunk;
+	struct sctp_chunk *chunk, *c;
 
 	list_for_each_entry(t, &asoc->peer.transport_addr_list, transports)
 		list_for_each_entry(chunk, &t->transmitted, transmitted_list)
@@ -162,8 +163,14 @@ static void sctp_for_each_tx_datachunk(struct sctp_association *asoc,
 	list_for_each_entry(chunk, &q->retransmit, transmitted_list)
 		cb(chunk);
 
-	list_for_each_entry(chunk, &q->sacked, transmitted_list)
-		cb(chunk);
+	list_for_each_entry(chunk, &q->sacked, transmitted_list) {
+		msg = chunk->msg;
+		if (msg == prev_msg)
+			continue;
+		list_for_each_entry(c, &msg->chunks, frag_list)
+			cb(c);
+		prev_msg = msg;
+	}
 
 	list_for_each_entry(chunk, &q->abandoned, transmitted_list)
 		cb(chunk);
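
For anyone who wants to play with the idea outside the kernel, below is
a minimal userspace sketch of the same walk: for each chunk on a
"sacked" list, visit every fragment of its message once, skipping a
chunk whose message is the same as the previous one. All names
(toy_msg, toy_chunk, toy_for_each_sacked, toy_count_visit) are
hypothetical, and plain arrays stand in for the kernel's linked lists.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FRAGS 8

struct toy_chunk;

/* Hypothetical stand-in for a datamsg: it knows all of its fragments. */
struct toy_msg {
	struct toy_chunk *frags[TOY_MAX_FRAGS];
	size_t nfrags;
};

/* Hypothetical stand-in for a chunk: points back at its message. */
struct toy_chunk {
	struct toy_msg *msg;
	int visits;	/* how many times the callback saw this chunk */
};

/* Example callback: just count the visit. */
static void toy_count_visit(struct toy_chunk *c)
{
	c->visits++;
}

/*
 * Walk the sacked chunks, but call cb on every fragment of each message,
 * including fragments that are not on the sacked list themselves.
 * Consecutive chunks of the same message are skipped via prev_msg.
 */
static void toy_for_each_sacked(struct toy_chunk **sacked, size_t n,
				void (*cb)(struct toy_chunk *))
{
	struct toy_msg *prev_msg = NULL;

	for (size_t i = 0; i < n; i++) {
		struct toy_msg *msg = sacked[i]->msg;

		if (msg == prev_msg)
			continue;
		for (size_t j = 0; j < msg->nfrags; j++)
			cb(msg->frags[j]);
		prev_msg = msg;
	}
}
```

Note that comparing only against prev_msg de-duplicates correctly only
when chunks of the same message sit next to each other on the list. If
they can be interleaved with chunks of other messages, a fragment could
be visited more than once.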
