netdev - Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20080812070005.GB5066@ff.dom.local>
Date:	Tue, 12 Aug 2008 07:00:05 +0000
From:	Jarek Poplawski <jarkao2@...il.com>
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org
Subject: Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock().

On Mon, Aug 11, 2008 at 10:40:47PM -0700, David Miller wrote:
...
> Those comments are out of date and I need to update them.
> In fact this whole loop is now largely pointless.
> 
> The rcu_dereference() on dev_queue->qdisc happens before the
> QDISC_RUNNING bit is set.
> 
> We no longer resample the qdisc under any kind of lock.  Because we no
> longer have a top-level lock that synchronizes the setting of
> dev_queue->qdisc
> 
> Rather, the lock we use for calling ->enqueue() and ->dequeue() is
> inside of the root qdisc itself.
> 
> That's why all of the real destruction has to occur in the RCU handler.
> 
> Anyways, this is part of the problem I think is causing the crash the
> Intel folks are triggering.
> 
> We sample the qdisc in dev_queue_xmit() or wherever, then we attach
> that to the per-cpu ->output_queue to process it via qdisc_run()
> in the software interrupt handler.
> 
> The RCU quiesce period extends to the next scheduling point and this
> is enough if we do normal direct softirq processing of this qdisc.
> 
> But if it gets postponed into ksoftirqd... the RCU will pass too
> early.
> 
> I'm still thinking about how to fix this without avoiding RCU
> and without adding new synchronization primitives.

Of course I've to miss something, but I still don't get it: after
synchronize_rcu() in dev_deactivate() we are sure anyone in
dev_queue_xmit() rcu block has to see the change to noop_qdisc(),
so it can only lose packets and not really enqueue(). IMHO the
only problem is this __netif_schedule(), which could be done with
dev_queues instead of Qdiscs with proper dereferencing there.
(BTW, I think we need rcu_read_lock() instead of the _bh() version in
dev_queue_xmit() to match this with rcu_call() or synchronize_rcu().)

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html