lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <YL31i98CpOO/AjKY@kroah.com>
Date:   Mon, 7 Jun 2021 12:31:39 +0200
From:   Greg KH <gregkh@...uxfoundation.org>
To:     Hillf Danton <hdanton@...a.com>
Cc:     Leon Romanovsky <leon@...nel.org>, SyzScope <syzscope@...il.com>,
        davem@...emloft.net, johan.hedberg@...il.com, kuba@...nel.org,
        linux-bluetooth@...r.kernel.org, linux-kernel@...r.kernel.org,
        Luiz Augusto von Dentz <luiz.dentz@...il.com>,
        marcel@...tmann.org, netdev@...r.kernel.org,
        syzkaller-bugs@...glegroups.com
Subject: Re: KASAN: use-after-free Read in hci_chan_del

On Mon, Jun 07, 2021 at 06:02:01PM +0800, Hillf Danton wrote:
> On Mon, 7 Jun 2021 09:55:31 +0200 Greg KH wrote:
> >On Mon, Jun 07, 2021 at 03:48:28PM +0800, Hillf Danton wrote:
> >> On Sun, 6 Jun 2021 11:54:22 +0200 Greg KH wrote:
> >> >On Sun, Jun 06, 2021 at 04:50:04PM +0800, Hillf Danton wrote:
> >> >> 
> >> >> To fix the uaf reported, add reference count to hci channel to track users.
> >> >> Then only channels with zero users will be released.
> >> >> 
> >> >> It is now only for thoughts.
> >> >> 
> >> >> +++ x/include/net/bluetooth/hci_core.h
> >> >> @@ -704,6 +704,7 @@ struct hci_chan {
> >> >>  	struct sk_buff_head data_q;
> >> >>  	unsigned int	sent;
> >> >>  	__u8		state;
> >> >> +	atomic_t ref;
> >> >
> >> >Please no, never use "raw" atomic variables.  Especially for something
> >> >like this, use a kref.
> >> 
> >> Fair, thanks for taking a look at it.
> >> 
> >> Spin with care for the race the added ref fails to cut.
> >
> >I do not understand what you mean here.
> >
> >> To ease review the full syzreport is also attached.
> >> 
> >> To fix uaf, add user track to hci channel and we will only release channel if
> >> its user hits zero. And a dryrun mechanism is also added to take care of the
> >> race user track fails to cut.
> >> 
> >> 	CPU0			CPU1
> >> 	----			----
> >> 	hci_chan_del		l2cap_conn_del
> >> 				chan->user = 0;
> >> 
> >> 	if (chan->user != 0)
> >> 		return;
> >> 	synchronize_rcu();
> >> 	kfree(chan);
> >> 
> >> 				hci_chan_del();
> >> 
> >> It is now only for thoughts.
> >> 
> >> +++ x/include/net/bluetooth/hci_core.h
> >> @@ -704,6 +704,10 @@ struct hci_chan {
> >>  	struct sk_buff_head data_q;
> >>  	unsigned int	sent;
> >>  	__u8		state;
> >> +	__u8		user;
> >
> >No.
> >
> >> +	__u8		release;
> >
> >No please no.
> >
> >> +
> >> +#define HCHAN_RELEASE_DRYRUN 1
> >>  };
> >>  
> >>  struct hci_conn_params {
> >> +++ x/net/bluetooth/l2cap_core.c
> >> @@ -1903,6 +1903,12 @@ static void l2cap_conn_del(struct hci_co
> >>  
> >>  	mutex_unlock(&conn->chan_lock);
> >>  
> >> +	/* see comment in hci_chan_del() */
> >> +	conn->hchan->release = HCHAN_RELEASE_DRYRUN;
> >> +	smp_wmb();
> >> +	conn->hchan->user--;
> >
> >And the reason you are open-coding a kref is why???
> >
> >Please again no.
> >
> >> +	hci_chan_del(conn->hchan);
> >> +	conn->hchan->release = 0;
> >>  	hci_chan_del(conn->hchan);
> >>  
> >>  	if (conn->info_state & L2CAP_INFO_FEAT_MASK_REQ_SENT)
> >> @@ -7716,6 +7722,8 @@ static struct l2cap_conn *l2cap_conn_add
> >>  	kref_init(&conn->ref);
> >>  	hcon->l2cap_data = conn;
> >>  	conn->hcon = hci_conn_get(hcon);
> >> +	/* dec in l2cap_conn_del() */
> >> +	hchan->user++;
> >
> >{sigh}
> >
> >No, there is a reason we wrote kref many _decades_ ago.  Please use it,
> >your original attempt with an atomic was just fine, just use the proper
> >data structures the kernel provides you as this is obviously a reference
> >counted object.
> 
> I see your concern. I thought this is a simpler user track than kref and
> open coded a couple of lines. I see it is incorrect. Sorry for that.
> 
> After taking another look at the added user track, I realised that it serves
> no more than a one-off state word that prevents channel from being released.
> Then the race behind the uaf can be fixed by adding a state on top of the
> dryrun introduced even without tracking users.
> 
> The state machine works as the following,
> 1) it is initialised to be backoff that means channel cannot be released
>    at the moment.
> 2) it is changed to be dryrun on releasing to cut the race that survived
>    backoff.
> 3) it is finally set to zero for release after cutting the chance for race.

Adding another state on top of this feels rough, does it really solve
the race here?  Normally a reference count should be enough to properly
tear things down when needed, rolling back from a "can I try this now"
state still seems racy without the needed lock somewhere.


> 
> 
> +++ x/include/net/bluetooth/hci_core.h
> @@ -704,6 +704,10 @@ struct hci_chan {
>  	struct sk_buff_head data_q;
>  	unsigned int	sent;
>  	__u8		state;
> +	__u8		release;
> +
> +#define HCHAN_RELEASE_DRYRUN  1
> +#define HCHAN_RELEASE_BACKOFF 2
>  };
>  
>  struct hci_conn_params {
> +++ x/net/bluetooth/l2cap_core.c
> @@ -1903,6 +1903,10 @@ static void l2cap_conn_del(struct hci_co
>  
>  	mutex_unlock(&conn->chan_lock);
>  
> +	/* see comment in hci_chan_del() */
> +	conn->hchan->release = HCHAN_RELEASE_DRYRUN;
> +	hci_chan_del(conn->hchan);
> +	conn->hchan->release = 0;
>  	hci_chan_del(conn->hchan);
>  
>  	if (conn->info_state & L2CAP_INFO_FEAT_MASK_REQ_SENT)
> @@ -7716,6 +7720,8 @@ static struct l2cap_conn *l2cap_conn_add
>  	kref_init(&conn->ref);
>  	hcon->l2cap_data = conn;
>  	conn->hcon = hci_conn_get(hcon);
> +	/* release is changed in l2cap_conn_del() */
> +	hchan->release = HCHAN_RELEASE_BACKOFF;
>  	conn->hchan = hchan;
>  
>  	BT_DBG("hcon %p conn %p hchan %p", hcon, conn, hchan);
> +++ x/net/bluetooth/hci_conn.c
> @@ -1769,12 +1769,26 @@ void hci_chan_del(struct hci_chan *chan)
>  	struct hci_conn *conn = chan->conn;
>  	struct hci_dev *hdev = conn->hdev;
>  
> +	if (chan->release == HCHAN_RELEASE_BACKOFF)
> +		return;
> +
>  	BT_DBG("%s hcon %p chan %p", hdev->name, conn, chan);
>  
> -	list_del_rcu(&chan->list);
> +	if (!list_empty(&chan->list))
> +		list_del_rcu(&chan->list);
>  
>  	synchronize_rcu();
>  
> +	if (chan->release == HCHAN_RELEASE_DRYRUN) {
> +		/*
> +		 * after list_del chan is only visible to the owner of dryrun,
> +		 * which is needed to cut the race that survived backoff,
> +		 * and simply return to owner. 

checkpatch will ding you for this not being the proper format for
networking, it hit me last week as well :)

thanks,

greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ