Open Source and information security mailing list archives
Date: Mon, 17 Mar 2014 11:53:35 +0000
From: Andrew Bennieston <andrew.bennieston@...rix.com>
To: Ian Campbell <Ian.Campbell@...rix.com>
CC: <xen-devel@...ts.xenproject.org>, <wei.liu2@...rix.com>,
    <paul.durrant@...rix.com>, <netdev@...r.kernel.org>,
    <david.vrabel@...rix.com>
Subject: Re: [PATCH V6 net-next 1/5] xen-netback: Factor queue-specific data into queue struct.

On 14/03/14 15:55, Ian Campbell wrote:
> On Mon, 2014-03-03 at 11:47 +0000, Andrew J. Bennieston wrote:
>> From: "Andrew J. Bennieston" <andrew.bennieston@...rix.com>
>>
>> In preparation for multi-queue support in xen-netback, move the
>> queue-specific data from struct xenvif into struct xenvif_queue, and
>> update the rest of the code to use this.
>>
>> Also[...]
>>
>> Finally,[...]
>
> This is already quite a big patch, and I don't think the commit log
> covers everything it changes/refactors, does it?
>
> It's always a good idea to break these things apart but in particular
> separating the mechanical stuff (s/vif/queue/g) from the non-mechanical
> stuff, since the mechanical stuff is essentially trivial to review and
> getting it out the way makes the non-mechanical stuff much easier to
> check (or even spot).

The vast majority of changes in this patch are s/vif/queue/g. The rest
are related changes, such as inserting loops over queues and moving
queue-specific initialisation out of the vif-wide initialisation, so
that it can be done once per queue. I consider these things to be
logically related and well within the purview of this single patch.
Without them, it is difficult to produce a patch that even compiles
without adding a bunch of placeholder code that would be removed in the
very next patch.

When I split this feature into multiple patches, I took care to group as
little as possible into this first patch (and the same for netfront). It
is still a large patch, but by my count most of it is a simple
replacement of vif with queue...
A first-order approximation, searching for line pairs where the first
has 'vif' and the second has 'queue', yields:

➜  xen-netback git:(saturn) git show HEAD~4 | grep -A 1 vif | grep queue | wc -l
380

i.e. 760 (= 380 * 2) of the 2240 changed lines (~34%) are trivial
replacements of vif with queue, and this does not count multi-line
replacements, of which there are many. What remains is mostly the
addition of loops over these queues. That could, in principle, be done
in a second patch, but the benefit of doing so would be small.

>
>> Signed-off-by: Andrew J. Bennieston <andrew.bennieston@...rix.com>
>> Reviewed-by: Paul Durrant <paul.durrant@...rix.com>
>> ---
>>   drivers/net/xen-netback/common.h    |   85 ++++--
>>   drivers/net/xen-netback/interface.c |  329 ++++++++++++++--------
>>   drivers/net/xen-netback/netback.c   |  530 ++++++++++++++++++-----------------
>>   drivers/net/xen-netback/xenbus.c    |   87 ++++--
>>   4 files changed, 608 insertions(+), 423 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index ae413a2..4176539 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -108,17 +108,39 @@ struct xenvif_rx_meta {
>>    */
>>   #define MAX_GRANT_COPY_OPS (MAX_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE)
>>
>> -struct xenvif {
>> -	/* Unique identifier for this interface. */
>> -	domid_t domid;
>> -	unsigned int handle;
>> +/* Queue name is interface name with "-qNNN" appended */
>> +#define QUEUE_NAME_SIZE (IFNAMSIZ + 6)
>
> One more than necessary? Or does IFNAMSIZ not include the NULL? (I can't
> figure out if it does or not!)

interface.c contains the line:

	snprintf(name, IFNAMSIZ - 1, "vif%u.%u", domid, handle);

This suggests that IFNAMSIZ counts the trailing NULL, so I can reduce
this count by 1 on that basis.

>
>> [...]
>> -	/* This array is allocated seperately as it is large */
>> -	struct gnttab_copy *grant_copy_op;
>> +	struct gnttab_copy grant_copy_op[MAX_GRANT_COPY_OPS];
>
> Is this deliberate?
> It seems like a retrograde step reverting parts of
> ac3d5ac27735 "xen-netback: fix guest-receive-side array sizes" from Paul
> (at least you are nuking a speeling erorr)

Yes, this was deliberate. These arrays were moved out to avoid problems
with kmalloc for the struct net_device (which contains the struct xenvif
in its netdev_priv space). Since the queues are now allocated via
vzalloc, there is no need for separate allocations (with the attendant
requirement to separately free them on every error/teardown path), so I
moved these back into the main queue structure.

>
> How does this series interact with Zoltan's foreign mapping one? Badly I
> should imagine, are you going to rebase?

I'm working on the rebase right now.

>
>> +	/* First, check if there is only one queue to optimise the
>> +	 * single-queue or old frontend scenario.
>> +	 */
>> +	if (vif->num_queues == 1) {
>> +		queue_index = 0;
>> +	} else {
>> +		/* Use skb_get_hash to obtain an L4 hash if available */
>> +		hash = skb_get_hash(skb);
>> +		queue_index = (u16) (((u64)hash * vif->num_queues) >> 32);
>
> No modulo num_queues here?
>
> Is the multiply and shift from some best practice somewhere? Or else
> what is it doing?

It seems to be what a bunch of other net drivers do in this scenario. I
guess the reasoning is that it'll be faster than a mod num_queues.

>
>> +	/* Obtain the queue to be used to transmit this packet */
>> +	index = skb_get_queue_mapping(skb);
>> +	if (index >= vif->num_queues)
>> +		index = 0; /* Fall back to queue 0 if out of range */
>
> Is this actually allowed to happen?
>
> Even if yes, not modulo num_queue so spread it around a bit?

This probably isn't allowed to happen. I figured it didn't hurt to be a
little defensive with the code here, and falling back to queue 0 is a
fairly safe thing to do.
>> static void xenvif_up(struct xenvif *vif)
>>   {
>> -	napi_enable(&vif->napi);
>> -	enable_irq(vif->tx_irq);
>> -	if (vif->tx_irq != vif->rx_irq)
>> -		enable_irq(vif->rx_irq);
>> -	xenvif_check_rx_xenvif(vif);
>> +	struct xenvif_queue *queue = NULL;
>> +	unsigned int queue_index;
>> +
>> +	for (queue_index = 0; queue_index < vif->num_queues; ++queue_index) {
>
> This vif->num_queues -- is it the same as dev->num_tx_queues? Or are
> there differing concepts of queue around?

It should be the same as dev->real_num_tx_queues, which may be less than
dev->num_tx_queues.

>> +		queue = &vif->queues[queue_index];
>> +		napi_enable(&queue->napi);
>> +		enable_irq(queue->tx_irq);
>> +		if (queue->tx_irq != queue->rx_irq)
>> +			enable_irq(queue->rx_irq);
>> +		xenvif_check_rx_xenvif(queue);
>> +	}
>>   }
>>
>>   static void xenvif_down(struct xenvif *vif)
>>   {
>> -	napi_disable(&vif->napi);
>> -	disable_irq(vif->tx_irq);
>> -	if (vif->tx_irq != vif->rx_irq)
>> -		disable_irq(vif->rx_irq);
>> -	del_timer_sync(&vif->credit_timeout);
>> +	struct xenvif_queue *queue = NULL;
>> +	unsigned int queue_index;
>
> Why unsigned?

Why not? You can't have a negative number of queues; zero indicates "I
don't have any set up yet". I'm not expecting anyone to have 4 billion
or so queues, but equally I can't see a valid use for negative values
here.
>
>> @@ -496,9 +497,30 @@ static void connect(struct backend_info *be)
>>   		return;
>>   	}
>>
>> -	xen_net_read_rate(dev, &be->vif->credit_bytes,
>> -			  &be->vif->credit_usec);
>> -	be->vif->remaining_credit = be->vif->credit_bytes;
>> +	xen_net_read_rate(dev, &credit_bytes, &credit_usec);
>> +	read_xenbus_vif_flags(be);
>> +
>> +	be->vif->num_queues = 1;
>> +	be->vif->queues = vzalloc(be->vif->num_queues *
>> +				  sizeof(struct xenvif_queue));
>> +
>> +	for (queue_index = 0; queue_index < be->vif->num_queues; ++queue_index) {
>> +		queue = &be->vif->queues[queue_index];
>> +		queue->vif = be->vif;
>> +		queue->id = queue_index;
>> +		snprintf(queue->name, sizeof(queue->name), "%s-q%u",
>> +			 be->vif->dev->name, queue->id);
>> +
>> +		xenvif_init_queue(queue);
>> +
>> +		queue->remaining_credit = credit_bytes;
>> +
>> +		err = connect_rings(be, queue);
>> +		if (err)
>> +			goto err;
>> +	}
>> +
>> +	xenvif_carrier_on(be->vif);
>>
>>   	unregister_hotplug_status_watch(be);
>>   	err = xenbus_watch_pathfmt(dev, &be->hotplug_status_watch,
>> @@ -507,18 +529,24 @@ static void connect(struct backend_info *be)
>>   	if (!err)
>>   		be->have_hotplug_status_watch = 1;
>>
>> -	netif_wake_queue(be->vif->dev);
>> +	netif_tx_wake_all_queues(be->vif->dev);
>> +
>> +	return;
>> +
>> +err:
>> +	vfree(be->vif->queues);
>> +	be->vif->queues = NULL;
>> +	be->vif->num_queues = 0;
>> +	return;
>
> Do you not need to unwind the setup already done on the previous queues
> before the failure?

Err... yes. I was sure that code existed at some point, but I can't find
it now. Oops!
-Andrew

>
>> }
>>
>>
>> -static int connect_rings(struct backend_info *be)
>> +static int connect_rings(struct backend_info *be, struct xenvif_queue *queue)
>>   {
>> -	struct xenvif *vif = be->vif;
>>   	struct xenbus_device *dev = be->dev;
>>   	unsigned long tx_ring_ref, rx_ring_ref;
>> -	unsigned int tx_evtchn, rx_evtchn, rx_copy;
>> +	unsigned int tx_evtchn, rx_evtchn;
>>   	int err;
>> -	int val;
>>
>>   	err = xenbus_gather(XBT_NIL, dev->otherend,
>>   			    "tx-ring-ref", "%lu", &tx_ring_ref,
>> @@ -546,6 +574,27 @@ static int connect_rings(struct backend_info *be)
>>   		rx_evtchn = tx_evtchn;
>>   	}
>>
>> +	/* Map the shared frame, irq etc. */
>> +	err = xenvif_connect(queue, tx_ring_ref, rx_ring_ref,
>> +			     tx_evtchn, rx_evtchn);
>> +	if (err) {
>> +		xenbus_dev_fatal(dev, err,
>> +				 "mapping shared-frames %lu/%lu port tx %u rx %u",
>> +				 tx_ring_ref, rx_ring_ref,
>> +				 tx_evtchn, rx_evtchn);
>> +		return err;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html