netdev - Re: [PATCH v2] Receive Packet Steering

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <873abli9is.fsf@basil.nowhere.org>
Date:	Mon, 04 May 2009 09:59:07 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Tom Herbert <therbert@...gle.com>
Cc:	netdev@...r.kernel.org, David Miller <davem@...emloft.net>
Subject: Re: [PATCH v2] Receive Packet Steering

Tom Herbert <therbert@...gle.com> writes:

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 052dd47..3107544 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1906,6 +1906,142 @@ int weight_p __read_mostly = 64;            /*
> old backlog weight */
>
>  DEFINE_PER_CPU(struct netif_rx_stats, netdev_rx_stat) = { 0, };
>
> +static u32 simple_hashrnd;

This should be __read_mostly

> +static int simple_hashrnd_initialized;

Also I suspect you can just use 0 as uninitialized 
and avoid one variable. If the RNG reports 0 you get
worst case another initialization.

> +static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb)
> +{
> +	u32 addr1, addr2, ports;
> +	struct ipv6hdr *ip6;
> +	struct iphdr *ip;
> +	u32 hash, ihl;
> +	u8 ip_proto;
> +	int cpu;
> +
> +	if (!dev->rps_map_len)
> +		return -1;
> +
> +	if (unlikely(!simple_hashrnd_initialized)) {
> +		get_random_bytes(&simple_hashrnd, 4);
> +		simple_hashrnd_initialized = 1;
> +	}

The usual problem of this is that if the kernel gets a packet sufficiently
early the random state will be always the default state, which is not
very random.

So either you use a timeout to reinit regularly (like other subsystems
do) or just use a fixed value for reproducibility. I suspect the
later would be nicer for benchmakers.

+	case __constant_htons(ETH_P_IPV6):
+		if (!pskb_may_pull(skb, sizeof(*ip6)))
+			return -1;
+
+		ip6 = (struct ipv6hdr *) skb->data;
+		ip_proto = ip6->nexthdr;
+		addr1 = ip6->saddr.s6_addr32[3];
+		addr2 = ip6->daddr.s6_addr32[3];

Wouldn't it be better to hash in everything in the ipv6 address in this case? 

> +
> +	hash = jhash_3words(addr1, addr2, ports, simple_hashrnd);
> +
> +	cpu = skb->dev->rps_map[((u64) hash * dev->rps_map_len) >> 32];

For 32bit systems it would be nice to avoid the u64 cast. gcc doesn't
generate very good code for that.

> +	return cpu_online(cpu) ? cpu : -1;

I suspect this is still racy with cpu hotunplug.

> +	cpus_clear(__get_cpu_var(rps_remote_softirq_cpus));
> +
> +	local_irq_enable();
> +}
> +
> +/**
> + * enqueue_to_backlog is called to queue an skb to a per CPU backlog
> + * queue (may be a remote CPU queue).
> + */
> +static int enqueue_to_backlog(struct sk_buff *skb, int cpu)
> +{
> +	struct softnet_data *queue;
> +	unsigned long flags;
> +
> +	queue = &per_cpu(softnet_data, cpu);

Are you sure preemption is disabled here? Otherwise this must be 
one line below (can be tested by enabling preempt & preempt debug)

> +
> +	if (!capable(CAP_NET_ADMIN))
> +		return -EPERM;
> +
> +	err = bitmap_parse(buf, len, cpumask_bits(&mask), nr_cpumask_bits);
> +	if (err)
> +		return err;
> +
> +	rtnl_lock();
> +	if (dev_isalive(net)) {
> +		if (!net->rps_map) {
> +			net->rps_map = kzalloc(sizeof(u16) *
> +			    num_possible_cpus(), GFP_KERNEL);
> +			if (!net->rps_map)
> +				return -ENOMEM;

You don't unlock rtnl_lock here.

-Andi

-- 
ak@...ux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html