Date:	Tue, 12 Jun 2007 02:58:05 +0200
From:	Patrick McHardy <>
CC:	"Waskiewicz Jr, Peter P" <>,
	"Kok, Auke-jan H" <>
Subject: Re: [PATCH] NET: Multiqueue network device support.

jamal wrote:
> On Mon, 2007-11-06 at 17:44 +0200, Patrick McHardy wrote:
>>At this point the qdisc might send new packets. What do you do when a
>>packet for a full ring arrives?
> Hrm... ok, is this a trick question or am I missing the obvious? ;->
> What is wrong with what any driver would do today - which is:
> netif_stop and return BUSY; the core requeues the packet?
>>I see three choices:
>>- drop it, even though it's still within the qdisc's configured limits
>>- requeue it, which does not work because the qdisc is still active
>>  and might just hand you the same packet over and over again in a
>>  busy loop, until the ring has more room (which has the same worst
>>  case, just that we're sitting in a busy loop now).
>>- requeue and stop the queue: we're back to where we started since
>>  now higher priority packets will not get passed to the driver.
> Refer to choice #4 above. 

Replying again so we can hopefully move forward soon. Your choice #4
is exactly what I proposed as my choice #3.

Let me repeat my example of why it doesn't work (well) without multiple
queue states (with typos etc. fixed) and describe the possibilities.
If you still disagree, I hope you can just change my example to show
how it gets fixed. As a thank you, I will actually understand that
your solution works as well :)

We have n empty HW queues served in ascending priority order
with a maximum length of m packets per queue:

[0] empty
[1] empty
[2] empty
...
[n-1] empty

Now we receive m - 1 packets for all priorities >= 1 and < n - 1,
so we have:

[0] empty
[1] m - 1 packets
[2] m - 1 packets
...
[n-2] m - 1 packets
[n-1] empty

Since no HW queue is completely full, the queue is still active.
Now we receive m packets of priority n - 1:

[0] empty
[1] m - 1 packets
[2] m - 1 packets
...
[n-2] m - 1 packets
[n-1] m packets

At this point the queue needs to be stopped, since the highest
priority queue is entirely full. To start it again, at least
one packet of queue n - 1 needs to be sent, which requires
that queues 1 to n - 2 are serviced first. So any prio 0 packet
arriving during this period will sit in the qdisc and will not
reach the device for up to the time of (n - 2) * (m - 1) + 1
full-sized packet transmissions. With multiple queue states
we'd know that queue 0 can still take packets.

You agreed that this is a problem and instead of keeping the
queue stopped until all rings can take at least one packet
again you proposed:

> - let the driver shutdown whenever a ring is full. Remember which
>   ring X shut it down.
> - when you get a tx interrupt or prune tx descriptors, if a
>   ring <= X has transmitted a packet (or threshold of packets),
>   then wake up the driver (i.e open up).

At this point the queue is active, but at least one ring is already
full and the qdisc can still pass packets for it to the driver.
When this happens we can:

- drop it. This makes the qdisc's configured limits meaningless,
  since the qdisc can't anticipate whether the packet will make it
  through or get dropped.

- requeue it: this might result in a busy loop if the qdisc
  decides to hand out the packet again. The loop only
  terminates once the ring has more room available and can
  accept the packet, which has the same worst-case behaviour
  I described above.

- requeue (return BUSY) and stop the queue: that's what you
  proposed as option #4. The question is when to wake the
  queue again. Your suggestion was to wake it when some
  other queue with equal or higher priority got dequeued.
  Correcting my previous statement, you are right that
  this fixes the starvation of higher priority queues,
  because the qdisc gets a chance to hand out a packet of
  the same or higher priority, but at the cost of up to
  (n - 1) * m unnecessary dequeues+requeues in the worst
  case, when only a packet of lowest priority is queued
  and all higher priority HW queues have to be fully
  served before it can actually be dequeued. The other
  possibility would be to activate the queue again only
  once all rings can take packets, but that wouldn't fix
  the problem, as you can easily see if you go back to my
  example and assume we still have a low priority packet
  within the qdisc when the lowest priority ring fills up
  (and the queue is stopped), and that the higher priority
  packet arrives after we tried to wake the queue and
  stopped it again.

Considering your proposal in combination with RR, you can see
the same problem of unnecessary dequeues+requeues. Since there
is no priority for waking the queue when an equal or higher
priority ring got dequeued, as in the prio case, I presume you
would wake the queue whenever any packet was sent. For the RR
qdisc, a dequeue after a requeue should hand out the same packet,
independently of newly enqueued packets (which doesn't happen
and is a bug in Peter's RR version), so in the worst case the
HW has to make the entire round before a packet can get
dequeued, in case the corresponding HW queue is full. This is
a bit better than prio, but still up to n - 1 unnecessary
requeues+dequeues. I think it can happen more often than
for prio though.

> The patches are trivial - really; as soon as Peter posts the e1000
> change for his version i should be able to cutnpaste and produce one
> that will work with what i am saying.

Forgetting about things like multiple qdisc locks and just
looking at queueing behaviour, the question seems to come
down to whether the unnecessary dequeues/requeues are acceptable
(which I don't think they are, since they are easily avoidable).
OTOH you could turn it around and argue that the patches won't do
much harm, since ripping them out again (modulo queue mapping)
should result in the same behaviour with just more overhead.
