lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.10.1707261630390.22381@sstabellini-ThinkPad-X260>
Date:   Wed, 26 Jul 2017 16:59:29 -0700 (PDT)
From:   Stefano Stabellini <sstabellini@...nel.org>
To:     Boris Ostrovsky <boris.ostrovsky@...cle.com>
cc:     Stefano Stabellini <sstabellini@...nel.org>,
        xen-devel@...ts.xen.org, linux-kernel@...r.kernel.org,
        jgross@...e.com, Stefano Stabellini <stefano@...reto.com>
Subject: Re: [PATCH v2 05/13] xen/pvcalls: implement bind command

On Wed, 26 Jul 2017, Boris Ostrovsky wrote:
> On 7/25/2017 5:22 PM, Stefano Stabellini wrote:
> > Send PVCALLS_BIND to the backend. Introduce a new structure, part of
> > struct sock_mapping, to store information specific to passive sockets.
> > 
> > Introduce a status field to keep track of the status of the passive
> > socket.
> > 
> > Introduce a waitqueue for the "accept" command (see the accept command
> > implementation): it is used to allow only one outstanding accept
> > command at any given time and to implement polling on the passive
> > socket. Introduce a flags field to keep track of in-flight accept and
> > poll commands.
> > 
> > sock->sk->sk_send_head is not used for ip sockets: reuse the field to
> > store a pointer to the struct sock_mapping corresponding to the socket.
> > 
> > Convert the struct socket pointer into an uint64_t and use it as id for
> > the socket to pass to the backend.
> > 
> > Signed-off-by: Stefano Stabellini <stefano@...reto.com>
> > CC: boris.ostrovsky@...cle.com
> > CC: jgross@...e.com
> > ---
> >   drivers/xen/pvcalls-front.c | 73
> > +++++++++++++++++++++++++++++++++++++++++++++
> >   drivers/xen/pvcalls-front.h |  3 ++
> >   2 files changed, 76 insertions(+)
> > 
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index d0f5f42..af2ce20 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -59,6 +59,23 @@ struct sock_mapping {
> >     			wait_queue_head_t inflight_conn_req;
> >   		} active;
> > +		struct {
> > +		/* Socket status */
> > +#define PVCALLS_STATUS_UNINITALIZED  0
> > +#define PVCALLS_STATUS_BIND          1
> > +#define PVCALLS_STATUS_LISTEN        2
> > +			uint8_t status;
> > +		/*
> > +		 * Internal state-machine flags.
> > +		 * Only one accept operation can be inflight for a socket.
> > +		 * Only one poll operation can be inflight for a given socket.
> > +		 */
> > +#define PVCALLS_FLAG_ACCEPT_INFLIGHT 0
> > +#define PVCALLS_FLAG_POLL_INFLIGHT   1
> > +#define PVCALLS_FLAG_POLL_RET        2
> > +			uint8_t flags;
> > +			wait_queue_head_t inflight_accept_req;
> > +		} passive;
> >   	};
> >   };
> >   @@ -292,6 +309,62 @@ int pvcalls_front_connect(struct socket *sock, struct
> > sockaddr *addr,
> >   	return ret;
> >   }
> >   +int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int
> > addr_len)
> > +{
> > +	struct pvcalls_bedata *bedata;
> > +	struct sock_mapping *map = NULL;
> > +	struct xen_pvcalls_request *req;
> > +	int notify, req_id, ret;
> > +
> > +	if (!pvcalls_front_dev)
> > +		return -ENOTCONN;
> > +	if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM)
> > +		return -ENOTSUPP;
> > +	bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
> > +
> > +	map = kzalloc(sizeof(*map), GFP_KERNEL);
> > +	if (map == NULL)
> > +		return -ENOMEM;
> > +
> > +	spin_lock(&bedata->pvcallss_lock);
> > +	req_id = bedata->ring.req_prod_pvt & (RING_SIZE(&bedata->ring) - 1);
> > +	if (RING_FULL(&bedata->ring) ||
> > +	    READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
> > +		kfree(map);
> > +		spin_unlock(&bedata->pvcallss_lock);
> > +		return -EAGAIN;
> > +	}
> > +	req = RING_GET_REQUEST(&bedata->ring, req_id);
> > +	req->req_id = req_id;
> > +	map->sock = sock;
> > +	req->cmd = PVCALLS_BIND;
> > +	req->u.bind.id = (uint64_t) sock;
> > +	memcpy(req->u.bind.addr, addr, sizeof(*addr));
> > +	req->u.bind.len = addr_len;
> > +
> > +	init_waitqueue_head(&map->passive.inflight_accept_req);
> > +
> > +	list_add_tail(&map->list, &bedata->socketpass_mappings);
> > +	WRITE_ONCE(sock->sk->sk_send_head, (void *)map);
> > +	map->active_socket = false;
> > +
> > +	bedata->ring.req_prod_pvt++;
> > +	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
> > +	spin_unlock(&bedata->pvcallss_lock);
> > +	if (notify)
> > +		notify_remote_via_irq(bedata->irq);
> > +
> > +	wait_event(bedata->inflight_req,
> > +		   READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
> 
> This all looks very similar to previous patches. Can it be factored out?

You are right that the pattern is the same for all commands:
- get a request
- fill the request
- possibly do something else
- wait
however each request is different, the struct and fields are different.
There are spin_lock and spin_unlock calls intermingled. I am not sure I
can factor out much of this. Maybe I could create a static inline or
macro as a syntactic sugar to replace the wait call, but that's pretty
much it I think.


> Also, you've used wait_event_interruptible in socket() implementation. Why not
> here (and connect())?

My intention was to use wait_event to wait for replies everywhere but I
missed some of them in the conversion (I used to use
wait_event_interruptible in early versions of the code).

The reason to use wait_event is that it makes it easier to handle the
rsp slot in bedata (bedata->rsp[req_id]): in case of EINTR the response
in bedata->rsp would not be cleared by anybody. If we use wait_event
there is no such problem, and the backend could still return EINTR and
we would handle it just fine as any other responses.

I'll make sure to use wait_event to wait for a response (like here), and
wait_event_interruptible elsewhere (like in recvmsg, where we don't risk
leaking a rsp slot).

 
> > +
> > +	map->passive.status = PVCALLS_STATUS_BIND;
> > +	ret = bedata->rsp[req_id].ret;
> > +	/* read ret, then set this rsp slot to be reused */
> > +	smp_mb();
> > +	WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
> > +	return 0;
> > +}
> > +
> >   static const struct xenbus_device_id pvcalls_front_ids[] = {
> >   	{ "pvcalls" },
> >   	{ "" }
> > diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
> > index 63b0417..8b0a274 100644
> > --- a/drivers/xen/pvcalls-front.h
> > +++ b/drivers/xen/pvcalls-front.h
> > @@ -6,5 +6,8 @@
> >   int pvcalls_front_socket(struct socket *sock);
> >   int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr,
> >   			  int addr_len, int flags);
> > +int pvcalls_front_bind(struct socket *sock,
> > +		       struct sockaddr *addr,
> > +		       int addr_len);
> >     #endif

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ