netdev - Re: [PATCH -next v2] unix stream: Fix use-after-free crashes

Message-ID: <CAAM7YAnqLyK6JWPW_Y8wD=ykqWMn4fPdJ3_7yUUB+TQZWfDJzQ@mail.gmail.com>
Date:	Wed, 7 Sep 2011 07:09:17 +0800
From:	"Yan, Zheng" <zheng.z.yan@...ux.intel.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Tim Chen <tim.c.chen@...ux.intel.com>,
	"Yan, Zheng" <zheng.z.yan@...el.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"sfr@...b.auug.org.au" <sfr@...b.auug.org.au>,
	"jirislaby@...il.com" <jirislaby@...il.com>,
	"sedat.dilek@...il.com" <sedat.dilek@...il.com>, alex.shi@...el.com
Subject: Re: [PATCH -next v2] unix stream: Fix use-after-free crashes

On Wed, Sep 7, 2011 at 4:19 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Tuesday, 06 September 2011 at 12:59 -0700, Tim Chen wrote:
>> On Tue, 2011-09-06 at 21:43 +0200, Eric Dumazet wrote:
>> > On Tuesday, 06 September 2011 at 12:33 -0700, Tim Chen wrote:
>> >
>> > > Yes, I think locking the sendmsg for the entire duration of
>> > > unix_stream_sendmsg makes a lot of sense.  It simplifies the logic a lot
>> > > more.  I'll try to cook something up in the next couple of days.
>> >
>> > That's not really possible: we can't hold a spinlock and call
>> > sock_alloc_send_skb() and/or memcpy_fromiovec(), which might sleep.
>> >
>> > You would need to prepare the full skb list, then:
>> > - stick the ref on the last skb of the list.
>> >
>> > Transfer the whole skb list into other->sk_receive_queue in one go,
>> > instead of one after another.
>> >
>> > Unfortunately, this would break streaming (big send(), and another
>> > thread doing the receive)
>> >
>> > Listen, I am wondering why hackbench even triggers SCM code. This is
>> > really odd. We should not have a _single_ pid/cred ref/unref at all.
>> >
>>
>> Hackbench triggers the code because it has a bunch of threads sending
>> msgs on UNIX socket.
>> >
>>
>> Well, if the lock socket approach doesn't work, then my original patch
>> plus Yan Zheng's fix should still work.  I'll try to answer your
>> objections below:
>>
>>
>> > I was discussing things after the proposed patch, not current net-next.
>> >
>> > This reads :
>> >
>> > err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, scm_ref);
>> >
>> > So first skb is sent without ref taken, as mentioned in Changelog ?
>> >
>>
>> No. the first skb is sent *with* ref taken, as scm_ref is set to true for
>> first skb.
>>
>> >
>> > If the second skb cannot be built, we exit this system call with an already
>> > queued skb. The receiver can then access freed memory.
>> >
>>
>> No, we do have a reference set.  For the first skb, in unix_scm_to_skb.  For the
>> second skb (which is the last skb), in scm_send.  Should the second skb alloc fail,
>> we'll release the ref in scm_destroy.  Otherwise, the receiver will release
>> the references while consuming the skb.
>>
>
> This is crap. This is not the intent of the code I read from the patch.
>
> unless scm_ref really means scm_noref ?
>
> I really hate this patch. I mean it.
>
> I read it 10 times, spent 2 hours and still don't understand it.
>

Sorry, scm_ref means "the sender holds an scm reference". I should add a comment for it.
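
To make that ownership rule concrete, here is a toy user-space model in C. It is only a sketch of the intent as described in this thread, not kernel code: every toy_* name is hypothetical, the refcount is a plain int with no locking, and fail_at merely simulates an skb allocation failure. The sender takes one reference up front; every skb except the last takes its own; the last skb inherits the sender's reference; if the loop exits early, the sender still holds its reference and drops it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted pid/cred object. */
struct toy_ref {
    int count;
};

/* Hypothetical stand-in for an skb that owns one reference once queued. */
struct toy_skb {
    struct toy_ref *owned;
};

static void toy_get(struct toy_ref *r) { r->count++; }

static void toy_put(struct toy_ref *r)
{
    assert(r->count > 0);   /* dropping below zero would be a use-after-free */
    r->count--;
}

/*
 * Attach the reference to an skb.
 * sender_holds_ref == true : the sender keeps its own reference, so the
 *                            skb takes an additional one.
 * sender_holds_ref == false: the sender's reference is handed to the skb.
 */
static void toy_scm_to_skb(struct toy_ref *r, struct toy_skb *skb,
                           bool sender_holds_ref)
{
    if (sender_holds_ref)
        toy_get(r);
    skb->owned = r;
}

/* Receiver side: consuming an skb drops the reference it owns. */
static void toy_consume(struct toy_skb *skb)
{
    toy_put(skb->owned);
    skb->owned = NULL;
}

/*
 * Queue up to nskbs buffers. fail_at is the index at which the simulated
 * allocation fails (-1 for no failure). Returns the number queued.
 */
static int toy_stream_send(struct toy_ref *r, struct toy_skb *queue,
                           int nskbs, int fail_at)
{
    bool scm_ref = true;    /* "the sender holds an scm reference" */
    int queued = 0;

    toy_get(r);             /* sender's reference, taken on entry */

    for (int i = 0; i < nskbs; i++) {
        if (i == fail_at)
            break;          /* simulated allocation failure */
        if (i == nskbs - 1)
            scm_ref = false;    /* last skb inherits the sender's ref */
        toy_scm_to_skb(r, &queue[queued], scm_ref);
        queued++;
    }

    if (scm_ref)
        toy_put(r);         /* sender never handed its ref over */
    return queued;
}
```

In this model, starting from a count of 1, a three-skb send with no failure leaves the count at 4 until the receiver has consumed all three skbs, after which it is back to 1. If the second allocation fails, one skb is queued, the sender drops its own reference, and the count returns to 1 once that skb is consumed.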

>
> @@ -1577,6 +1577,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>        int sent = 0;
>        struct scm_cookie tmp_scm;
>        bool fds_sent = false;
> +       bool scm_ref = true;
>        int max_level;
>
>        if (NULL == siocb->scm)
> @@ -1637,12 +1638,15 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>                 */
>                size = min_t(int, size, skb_tailroom(skb));
>
> +               /* pass the scm reference to the very last skb */
>
> HERE: I understand : on the last skb, set scm_ref to false.
> So comment is wrong.
>
> +               if (sent + size >= len)
> +                       scm_ref = false;
>
> -               /* Only send the fds and no ref to pid in the first buffer */
> -               err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
> +               /* Only send the fds in the first buffer */
> +               err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, scm_ref);
>                if (err < 0) {
>                        kfree_skb(skb);
> -                       goto out;
> +                       goto out_err;
>                }
>
>
>
> As I said, we should revert the buggy patch, and rewrite a performance
> fix from scratch, with not a single get_pid()/put_pid() in fast path.
>
> read()/write() on AF_UNIX sockets should not use a single
> get_pid()/put_pid().
>
> This is a serious regression we should fix at 100%, not 50% or even 75%,
> adding serious bugs.
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>