Message-ID: <57E02F82.9060903@akamai.com>
Date: Mon, 19 Sep 2016 14:33:38 -0400
From: Jason Baron <jbaron@...mai.com>
To: "Mintz, Yuval" <Yuval.Mintz@...ium.com>,
"davem@...emloft.net" <davem@...emloft.net>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"Ariel.Elior@...gic.com" <Ariel.Elior@...gic.com>
Subject: Re: [PATCH net-next 2/2] bnx2x: allocate mac filtering pending list
in PAGE_SIZE increments
On 09/18/2016 06:25 AM, Mintz, Yuval wrote:
>> Currently, we can have high order page allocations that specify
>> GFP_ATOMIC when configuring multicast MAC address filters.
>>
>> For example, we have seen order 2 page allocation failures with
>> ~500 multicast addresses configured.
>>
>> Convert the allocation for the pending list to be done in PAGE_SIZE
>> increments.
>>
>> Signed-off-by: Jason Baron <jbaron@...mai.com>
>
> While I appreciate the effort, I wonder whether it's worth it:
>
> - The hardware [even in its newer generation] provides an approximate
> based classification [I.e., hashed] with 256 bins.
> When configuring 500 multicast addresses, one can argue the
> difference between multicast-promisc mode and actual configuration
> is insignificant.
With 256 bins, I think it takes close to 256*lg(256), or 2,048,
multicast addresses before we can expect every bin to hold at least one
hash, assuming a uniform distribution of the hashes.
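
As a rough cross-check of that estimate, here is a minimal sketch that models the hash as uniform over the bins. The function names are made up for illustration and nothing here comes from the driver. The first function computes the coupon-collector expectation, bins * H(bins), and the second the expected number of non-empty bins after a given number of addresses.

```c
/* Expected number of uniformly hashed addresses needed before all
 * `bins` bins hold at least one entry: bins * H(bins), where H is
 * the harmonic number (the coupon-collector expectation). */
double expected_addrs_to_fill(unsigned int bins)
{
	double harmonic = 0.0;
	unsigned int i;

	for (i = 1; i <= bins; i++)
		harmonic += 1.0 / i;
	return bins * harmonic;
}

/* Expected number of non-empty bins after hashing `addrs` addresses:
 * bins * (1 - (1 - 1/bins)^addrs). */
double expected_bins_used(unsigned int bins, unsigned int addrs)
{
	double p_empty = 1.0;	/* chance that one given bin is still empty */
	unsigned int i;

	for (i = 0; i < addrs; i++)
		p_empty *= 1.0 - 1.0 / bins;
	return bins * (1.0 - p_empty);
}
```

Under this model, 256 bins need roughly 1,568 addresses on average to fill completely, which is the same order of magnitude as the 256*lg(256) figure, and 500 addresses leave roughly 220 of the 256 bins in use.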
> Perhaps the easier-to-maintain alternative would simply be to
> determine the maximal number of multicast addresses that can be
> configured using a single PAGE, and if in need of more than that
> simply move into multicast-promisc.
>
sizeof(struct bnx2x_mcast_list_elem) = 24, so 170 elements fit per
4 KiB page on x86. To fit 2,048 elements we therefore need 13 pages
(12 pages hold only 2,040).
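
The page arithmetic can be sketched as below. This assumes a 4096-byte page and the 24-byte element size quoted above, and it ignores any per-page bookkeeping such as a list link, which would lower the per-page count slightly. The helper names are illustrative only.

```c
#include <stddef.h>

/* Whole elements of elem_size bytes that fit in one page. */
size_t elems_per_page(size_t page_size, size_t elem_size)
{
	return page_size / elem_size;
}

/* Pages needed to hold nr_elems elements, rounding up. */
size_t pages_needed(size_t nr_elems, size_t per_page)
{
	return (nr_elems + per_page - 1) / per_page;
}
```

With those inputs, elems_per_page(4096, 24) is 170, and because 12 * 170 = 2,040, a full 2,048 elements rounds up to a 13th page.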
> - While GFP_ATOMIC is required in this flow due to the fact it's being
> called from sleepless context, I do believe this is mostly a remnant -
> it's possible that by slightly changing the locking scheme we can have
> the configuration done from sleepless context and simply switch to
> GFP_KERNEL instead.
>
OK, even if it's GFP_KERNEL, I think it's still undesirable to do
high-order page allocations (unless of course it's converted to a single
page, but I'm not sure that makes sense, as mentioned above).
> Regarding the patch itself, only comment I have:
>> + elem_group = (struct bnx2x_mcast_elem_group *)
>> + elem_group->mcast_group_link.next;
> Let's use list_next_entry() instead.
>
>
Yes, agreed.
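
For reference, a minimal userspace sketch of why list_next_entry() is preferable to the cast. The list helpers below are stand-ins that mimic the kernel's <linux/list.h> macros, and the struct layout is hypothetical (only the mcast_group_link member name comes from the patch). The raw cast is only correct while the link happens to be the first member, whereas list_next_entry() subtracts the member offset and works for any layout.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel's <linux/list.h> helpers. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
#define list_next_entry(pos, member) \
	list_entry((pos)->member.next, __typeof__(*(pos)), member)

/* Hypothetical layout: the link is deliberately NOT the first member,
 * so a plain cast of .next would point into the middle of the struct. */
struct bnx2x_mcast_elem_group {
	int nr_elems;
	struct list_head mcast_group_link;
};

/* Advance to the next group on the list. */
struct bnx2x_mcast_elem_group *
next_group(struct bnx2x_mcast_elem_group *elem_group)
{
	return list_next_entry(elem_group, mcast_group_link);
}
```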
I think it would be easy to add a check to bnx2x_set_rx_mode_inner() to
enforce some maximum number of elements (perhaps 2,048 based on the
above math) for the !CHIP_IS_E1() case on top of what I already posted.
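
A rough sketch of what such a check might look like, written as standalone C rather than driver code. The enum, the MCAST_MAX_ELEMS name, and the choice to fall back to an all-multicast mode above the cap are assumptions for illustration; the actual rx-mode values and the exact fallback would need to match what bnx2x_set_rx_mode_inner() already does.

```c
#include <stdbool.h>

/* Hypothetical cap, taken from the 2,048 estimate in this message. */
#define MCAST_MAX_ELEMS 2048

/* Stand-in for the driver's rx-mode values; names are illustrative. */
enum rx_mode_sketch {
	RX_MODE_SKETCH_NORMAL,
	RX_MODE_SKETCH_ALLMULTI,
};

/* Pick the rx mode for mc_count configured multicast addresses.
 * E1 chips are left to their existing handling, hence the is_e1 test. */
enum rx_mode_sketch pick_rx_mode(unsigned int mc_count, bool is_e1)
{
	if (!is_e1 && mc_count > MCAST_MAX_ELEMS)
		return RX_MODE_SKETCH_ALLMULTI;
	return RX_MODE_SKETCH_NORMAL;
}
```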
Thanks,
-Jason