linux-kernel - Re: [PATCH v2 11/13] xen/pvcalls: implement release command

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <7ace9427-5215-6be7-907a-46dd15ea2a8f@suse.com>
Date:   Tue, 1 Aug 2017 17:34:44 +0200
From:   Juergen Gross <jgross@...e.com>
To:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Stefano Stabellini <sstabellini@...nel.org>
Cc:     xen-devel@...ts.xen.org, linux-kernel@...r.kernel.org,
        Stefano Stabellini <stefano@...reto.com>
Subject: Re: [PATCH v2 11/13] xen/pvcalls: implement release command

On 01/08/17 17:23, Boris Ostrovsky wrote:
> On 07/31/2017 06:34 PM, Stefano Stabellini wrote:
>> On Thu, 27 Jul 2017, Boris Ostrovsky wrote:
>>>> +int pvcalls_front_release(struct socket *sock)
>>>> +{
>>>> +	struct pvcalls_bedata *bedata;
>>>> +	struct sock_mapping *map;
>>>> +	int req_id, notify;
>>>> +	struct xen_pvcalls_request *req;
>>>> +
>>>> +	if (!pvcalls_front_dev)
>>>> +		return -EIO;
>>>> +	bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
>>>> +	if (!bedata)
>>>> +		return -EIO;
>>> Some (all?) other ops don't check bedata validity. Should they all do?
>> No, I don't think they should: dev_set_drvdata is called in the probe
>> function (pvcalls_front_probe). I'll remove it.
>>
>>
>>>> +
>>>> +	if (sock->sk == NULL)
>>>> +		return 0;
>>>> +
>>>> +	map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
>>>> +	if (map == NULL)
>>>> +		return 0;
>>>> +
>>>> +	spin_lock(&bedata->pvcallss_lock);
>>>> +	req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
>>>> +	if (RING_FULL(&bedata->ring) ||
>>>> +	    READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
>>>> +		spin_unlock(&bedata->pvcallss_lock);
>>>> +		return -EAGAIN;
>>>> +	}
>>>> +	WRITE_ONCE(sock->sk->sk_send_head, NULL);
>>>> +
>>>> +	req = RING_GET_REQUEST(&bedata->ring, req_id);
>>>> +	req->req_id = req_id;
>>>> +	req->cmd = PVCALLS_RELEASE;
>>>> +	req->u.release.id = (uint64_t)sock;
>>>> +
>>>> +	bedata->ring.req_prod_pvt++;
>>>> +	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
>>>> +	spin_unlock(&bedata->pvcallss_lock);
>>>> +	if (notify)
>>>> +		notify_remote_via_irq(bedata->irq);
>>>> +
>>>> +	wait_event(bedata->inflight_req,
>>>> +		READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
>>>> +
>>>> +	if (map->active_socket) {
>>>> +		/* 
>>>> +		 * Set in_error and wake up inflight_conn_req to force
>>>> +		 * recvmsg waiters to exit.
>>>> +		 */
>>>> +		map->active.ring->in_error = -EBADF;
>>>> +		wake_up_interruptible(&map->active.inflight_conn_req);
>>>> +
>>>> +		mutex_lock(&map->active.in_mutex);
>>>> +		mutex_lock(&map->active.out_mutex);
>>>> +		pvcalls_front_free_map(bedata, map);
>>>> +		mutex_unlock(&map->active.out_mutex);
>>>> +		mutex_unlock(&map->active.in_mutex);
>>>> +		kfree(map);
>>> Since you are locking here I assume you expect that someone else might
>>> also be trying to lock the map. But you are freeing it immediately after
>>> unlocking. Wouldn't that mean that whoever is trying to grab the lock
>>> might then dereference freed memory?
>> The lock is to make sure there are no recvmsg or sendmsg in progress. We
>> are sure that no newer sendmsg or recvmsg are waiting for
>> pvcalls_front_release to release the lock because before send a message
>> to the backend we set sk_send_head to NULL.
> 
> Is there a chance that whoever is potentially calling send/rcvmsg has
> checked that sk_send_head is non-NULL but hasn't grabbed the lock yet?
> 
> Freeing a structure containing a lock right after releasing the lock
> looks weird (to me). Is there any other way to synchronize with
> sender/receiver? Any other lock?

Right. This looks fishy. Either you don't need the locks or you can't
just free the area right after releasing the lock.

> BTW, I also noticed that in rcvmsg you are calling
> wait_event_interruptible() while holding the lock. Have you tested with
> CONFIG_DEBUG_ATOMIC_SLEEP? (or maybe it's some other config  option that
> would complain about those sorts of thing)

I believe sleeping while holding a mutex is allowed. Sleeping in
spinlocked paths is bad.

BTW: You are looking for CONFIG_DEBUG_MUTEXES (see
Documentation/locking/mutex-design.txt ).


Juergen