Date:	Fri, 20 Oct 2006 13:48:26 -0700
From:	Stephen Hemminger <shemminger@...l.org>
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] netpoll: rework skb transmit queue

On Fri, 20 Oct 2006 13:42:09 -0700 (PDT)
David Miller <davem@...emloft.net> wrote:

> From: Stephen Hemminger <shemminger@...l.org>
> Date: Fri, 20 Oct 2006 08:40:15 -0700
> 
> > -static void queue_process(void *p)
> > +static void netpoll_run(unsigned long arg)
> >  {
>  ...
> > -		spin_unlock_irqrestore(&queue_lock, flags);
> > +		netif_tx_lock(dev);
> > +		if (netif_queue_stopped(dev) ||
> > +		    dev->hard_start_xmit(skb, dev) != NETDEV_TX_OK) {
> > +			skb_queue_head(&npinfo->tx_q, skb);
> > +			netif_tx_unlock(dev);
> > +			tasklet_schedule(&npinfo->tx_task);
> > +			return;
> > +		}
> 
> We really can't handle TX stopped this way from the netpoll_send_skb()
> path.  All that old retry logic in netpoll_send_skb() is really
> necessary.
> 
> If we are in deep IRQ context, took an OOPS, and are trying to get a
> netpoll packet out for the kernel log message, we have to try as hard
> as possible to get the packet out then and there, even if that means
> waiting some amount of time for netif_queue_stopped() to become false.
> 

But the existing retry logic also violates the assumptions of the
network devices.  It calls the NAPI poll callback with IRQs disabled,
and potentially doesn't obey the semantics about only running on the
same CPU as the received packet.

> That is what the existing code is trying to do.
> 
> If you defer to a tasklet, the kernel state from the OOPS can be so
> corrupted that the tasklet will never run and we'll never get the
> netconsole message needed to debug the problem.

So we can try once, and if that fails we have to defer to the tasklet.
We can't call NAPI, and we can't try to clean up; the device will need
an IRQ to get unblocked.

Or add another device callback just to handle that case.
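
To make the "try once, then defer" option concrete, here is a rough
userspace model of it.  This is only a sketch, not the patch: fake_dev,
fake_npinfo, fake_xmit(), send_once_or_defer() and run_deferred() are
all hypothetical names standing in for the device, the netpoll info,
hard_start_xmit(), the send path and the tasklet body.  It also assumes
the deferred worker does not reschedule itself when the device is still
stopped, so something else (for example the device's TX IRQ) has to run
it again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TXQ_MAX 16	/* arbitrary queue bound for this sketch */

/* Hypothetical stand-in for a network device, not struct net_device. */
struct fake_dev {
	bool queue_stopped;	/* models netif_queue_stopped() */
	int sent;		/* packets the "hardware" accepted */
};

/* Hypothetical stand-in for the per-device netpoll state. */
struct fake_npinfo {
	int txq[TXQ_MAX];	/* deferred packet ids, oldest first */
	size_t txq_len;
	bool task_scheduled;	/* models a pending tasklet */
};

/* Models a single hard_start_xmit() attempt; true means accepted. */
static bool fake_xmit(struct fake_dev *dev, int pkt)
{
	(void)pkt;
	if (dev->queue_stopped)
		return false;
	dev->sent++;
	return true;
}

/*
 * Send path: one attempt, no spinning and no polling.  On failure the
 * packet is queued and the deferred worker is marked as scheduled.
 * Returns true only if the packet went out immediately.
 */
static bool send_once_or_defer(struct fake_dev *dev,
			       struct fake_npinfo *np, int pkt)
{
	assert(dev && np);

	/* Bypass the queue only when it is empty, to keep ordering. */
	if (np->txq_len == 0 && fake_xmit(dev, pkt))
		return true;

	if (np->txq_len < TXQ_MAX) {
		np->txq[np->txq_len++] = pkt;
		np->task_scheduled = true;
	}
	/* If the queue is full the packet is dropped in this sketch. */
	return false;
}

/*
 * Deferred worker: drain from the head until the device refuses a
 * packet, then stop.  It does not reschedule itself (an assumption of
 * this sketch).  Returns the number of packets sent.
 */
static size_t run_deferred(struct fake_dev *dev, struct fake_npinfo *np)
{
	size_t n = 0, i;

	assert(dev && np);
	np->task_scheduled = false;

	while (n < np->txq_len && fake_xmit(dev, np->txq[n]))
		n++;

	/* Shift whatever is left back to the front of the queue. */
	for (i = n; i < np->txq_len; i++)
		np->txq[i - n] = np->txq[i];
	np->txq_len -= n;

	return n;
}
```

With the device stopped, send_once_or_defer() queues the packet and
returns false; once queue_stopped is cleared, a later run_deferred()
call drains the queue in order.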

> Also, if we tasklet schedule from the tasklet, we'll just keep looping
> in the tasklet and never leave softirq context, which is also bad
> behavior.  Even in the tasklet, we should spin and poll when possible
> like the current netpoll_send_skb() code does.
> 
> So we really can't apply this patch.


-- 
Stephen Hemminger <shemminger@...l.org>