netdev - Re: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs. dev

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Sat, 20 Sep 2008 00:21:37 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	jarkao2@...il.com
Cc:	herbert@...dor.apana.org.au, netdev@...r.kernel.org,
	kaber@...sh.net
Subject: Re: [PATCH take 2] pkt_sched: Fix qdisc_watchdog() vs.
 dev_deactivate() race

From: Jarek Poplawski <jarkao2@...il.com>
Date: Sun, 14 Sep 2008 22:27:15 +0200

[ I'm just picking one posting out of several in this thread.  ]

> Anyway, the main problem here was a high cpu load despite stopped
> queue. Are you sure this peek, which is almost full dequeue, can
> really help for this? BTW, since after current fix there were no
> later complains I guess it's just about full netif_stop or non-mq
> device.

I think we are overengineering this situation.

Let's look at what actually matters for cpu utilization.  These
__qdisc_run() things are invoked in two situations where we might
block on the hw queue being stopped:

1) When feeding packets into the qdisc in dev_queue_xmit().

   Guess what?  We _know_ the queue this packet is going to
   hit.

   The only new thing we can possible trigger and be interested
   in at this specific point is if _this_ packet can be sent at
   this time.

   And we can check that queue mapping after the qdisc_enqueue_root()
   call, so that multiq aware qdiscs can have made their changes.

2) When waking up a queue.  And here we should schedule the qdisc_run
   _unconditionally_.

   If the queue was full, it is extremely likely that new packets
   are bound for that device queue.  There is no real savings to
   be had by doing this peek/requeue/dequeue stuff.

The cpu utilization savings exist for case #1 only, and we can
implement the bypass logic _perfectly_ as described above.

For #2 there is nothing to check, just do it and see what comes
out of the qdisc.

I would suggest adding an skb pointer argument to qdisc_run().
If it's NULL, unconditionally schedule __qdisc_run().  Else,
only schedule if the TX queue indicated by skb_queue_mapping()
is not stopped.

dev_queue_xmit() will use the "pass the skb" case, but only if
qdisc_enqueue_root()'s return value doesn't indicate that there
is a potential drop.  On potential drop, we'll pass NULL to
make sure we don't potentially reference a free'd SKB.

The other case in net_tx_action() can always pass NULL to qdisc_run().
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html