lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8738if8klj.fsf@xmission.com>
Date:	Tue, 18 Mar 2014 10:47:36 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Stephen Hemminger <stephen@...workplumber.org>
Cc:	David Miller <davem@...emloft.net>, eric.dumazet@...il.com,
	netdev@...r.kernel.org, xiyou.wangcong@...il.com, mpm@...enic.com,
	satyam.sharma@...il.com
Subject: Re: [PATCH 6/6] net: Free skbs from irqs when possible.

Stephen Hemminger <stephen@...workplumber.org> writes:

> On Mon, 17 Mar 2014 23:27:52 -0700
> ebiederm@...ssion.com (Eric W. Biederman) wrote:
>
>> Add a test skb_irq_freeable to report when it is safe to free a skb
>> from irq context.
>> 
>> It is not safe to free an skb from irq context when:
>> - The skb has a destructor as some skb destructors call local_bh_disable
>>   or spin_lock_bh.
>> - There is xfrm state as __xfrm_state_destroy calls spin_lock_bh.
>> - There is netfilter conntrack state as destroy_conntrack calls
>>   spin_lock_bh.
>> - If there is a refcounted dst entry on the skb, as __dst_free
>>   calls spin_lock_bh.
>> - If there is a frag_list, which could be a list of any skbs.
>> Otherwise it appears safe to free a skb from interrupt context.
>> 
>> - Update the warning in skb_releae_head_state to warn about freeing
>>   skb's in the wrong context.
>> 
>> - Update __dev_kfree_skb_irq to free all skbs that it can immediately
>> 
>> - Kill zap_completion_queue because there is no point going through
>>   a queue of packets that are not safe to free and looking for packets
>>   that are safe to free.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@...ssion.com>
>
> Why introduce the additional complexity for so little gain?
> It looks like you are only optimizing for the corner case where netpoll
> is cleaning up on Tx.
>
> -1

The netpoll cleanup is interesting.  And certainly something needs to be
done to fix/cleanup zap_completion_queue in netpoll.

However the deep reason for introducing skb_irq_freeable() is to fix the
warnings in skb_release_head_state aka __kfree_skb.  Only warning when
we have a destructor in irq context strongly suggests that it is only
when we have destructors that we have problems.

Most of the destructors today are fine (which doubly makes the warning
confusing).  Assuming we don't have a dst reference when a packet is
transmitted it is the presence of iptables in the system that makes tx
packets not freeable from hard irq context.

Received packets seem to be always freeable and freeing packets on the
error path of packet reception could actually be helped by this.
Especially in the drivers like the e1000 that have misplaced calls to
dev_kfree_skb_irq.

With respect to netpoll the only skbs that are likely to be freeable in
the netpoll transmit path are packets netpoll has transmitted itself.
So if this does not fit in the generic dev_kfree_skb_irq it definitely
makes sense to perform this test in netpoll's zap_completion_queue.

At a practical level seeing a skb->destructor typically will be what
pushes the code into the delayed freeing scenario.  So I don't see this
slowing things down noticably.

In addition freeing skbs in hard irq context (outside of netpoll) is
something only older mostly pre NAPI drivers do.  So frankly it seems
reasonable to me to optimize for the common case of freeing skbs in
hard irq context (i.e. netpoll).

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ