lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 3 Dec 2013 17:59:36 -0800
From:	Alexei Starovoitov <ast@...mgrid.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Or Gerlitz <or.gerlitz@...il.com>,
	David Miller <davem@...emloft.net>,
	Joseph Gasparakis <joseph.gasparakis@...el.com>,
	Jerry Chu <hkchu@...gle.com>,
	Or Gerlitz <ogerlitz@...lanox.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Pravin B Shelar <pshelar@...ira.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: vxlan/veth performance issues on net.git + latest kernels

On Tue, Dec 3, 2013 at 5:23 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Tue, 2013-12-03 at 16:55 -0800, Alexei Starovoitov wrote:
>> On Tue, Dec 3, 2013 at 4:36 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
>> > On Tue, 2013-12-03 at 16:26 -0800, Alexei Starovoitov wrote:
>> >
>> >>
>> >> Could you use some other way to mark skb ?
>> >
>> > I could ;)
>> >
>> >> In tracing we might want to examine skb more carefully and not being
>> >> able to see the device
>> >> will limit the usability of this tracepoint.
>> >
>> > Unfortunately, using skb->dev as a pointer to device would be buggy or
>> > expensive (you would need to take a reference on device in order not
>> > letting it disappear, as we escape RCU protection)
>>
>> well, yes, you might have an skb around when device is already freed
>> when skb_dst_noref.
>> but I'm not suggesting anything expensive. Tracing definitely should
>> not add overhead by doing rcu_lock() or dev_hold(). Instead it can go
>> through skb, skb->dev, skb->dev->xxx via probe_kernel_read(). If dev
>> is gone, it's still safe.
>
> Its certainly not safe to 'probe'.
>
> Its not about faulting inexistent memory, that is the least of the
> problem.
>
> Any kind of information fetched from skb->dev might have been
> overwritten.
>
> You could for example fetch security sensitive data and expose it.

Of course.
Even without walking pointer chains with probe() you could infer all
sorts of info from tracepoints.
That's why tracing filters are for root only.

>> > Anyway, this magic is pretty easy to change, I am open to suggestions.
>>
>> you're the expert :) use skb->mark field, since it's unused during
>> freeing path... ?
>
> cache line miss ;)
>
> skb->dev is in the first cache line, where we access skb->next anyway.
>
> I could use skb->cb[] like the following patch :
>
> +struct __dev_kfree_skb_cb {
> +       unsigned int reason;
> +};
> +
> +static inline struct __dev_kfree_skb_cb *get_kfree_skb_cb(const struct sk_buff *skb)
> +{
> +       return (struct __dev_kfree_skb_cb *)skb->cb;
> +}
> +
> +enum {
> +       SKB_REASON_CONSUMED,
> +       SKB_REASON_DROPPED,
> +};
> +
>  /* Use this variant in places where it could be invoked
>   * from either hardware interrupt or other context, with hardware interrupts
>   * either disabled or enabled.
> + * Note that TX completion should use dev_consume_skb_any()
>   */
> -void dev_kfree_skb_any(struct sk_buff *skb);
> +static inline void dev_kfree_skb_any(struct sk_buff *skb)
> +{
> +       get_kfree_skb_cb(skb)->reason = SKB_REASON_DROPPED;
> +       __dev_kfree_skb_any(skb);
> +}
> +
> +static inline void dev_consume_skb_any(struct sk_buff *skb)
> +{
> +       get_kfree_skb_cb(skb)->reason = SKB_REASON_CONSUMED;
> +       __dev_kfree_skb_any(skb);
> +}

thanks. I think that is much cleaner. Ack.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ