lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87twxsgxjt.fsf@x220.int.ebiederm.org>
Date:	Tue, 10 Mar 2015 19:58:14 -0500
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Julian Anastasov <ja@....bg>
Cc:	David Miller <davem@...emloft.net>, edumazet@...gle.com,
	netdev@...r.kernel.org, stephen@...workplumber.org,
	nicolas.dichtel@...nd.com, roopa@...ulusnetworks.com,
	hannes@...essinduktion.org, ddutt@...ulusnetworks.com,
	vipin@...ulusnetworks.com, shmulik.ladkani@...il.com,
	dsahern@...il.com
Subject: Re: [PATCH net-next 3/6] tcp_metrics: Add a field tcpm_net and verify it matches on lookup

Julian Anastasov <ja@....bg> writes:

> 	Hello,
>
> On Tue, 10 Mar 2015, Eric W. Biederman wrote:
>
>> Julian Anastasov <ja@....bg> writes:
>> 
>> >>  struct tcp_metrics_block {
>> >>  	struct tcp_metrics_block __rcu	*tcpm_next;
>> >> +	struct net			*tcpm_net;
>> >
>> > 	Is it better if we have this field under
>> > #ifef CONFIG_NET_NS ? read_pnet and write_pnet do not
>> > care when first argument is not declared for the
>> > !CONFIG_NET_NS case.
>> 
>> I don't actually believe there are many people compiling out network
>> namespaces but there is no point in making it unnecessarily bad for them.
>
> 	Small routers. Also, we can move tcpm_net after
> tcpm_daddr to help machines with 32-byte cache line.
> It would be good if fields and access is in this order:
> tcpm_next, tcpm_daddr, tcpm_saddr and finally tcpm_net.
> tcpm_daddr is the leading key, it was tcpm_addr originally.
> If such change is not suitable for this patchset I can try it
> later after your changes.

I think optimizing the order of the fields and their alignment for
machines with a 32-byte cache line is beyond the scope of my patchset.

My gut says if you are a small router you might want to just turn this
off, because you would not be using many tcp sockets.

For a small machine doing tcp and benefitting from this cache I think
figuring out rhashtables for this data structure has merit.   I don't
know if rhashtables are ready for general use yet.

For machines with 64-byte cache lines placing saddr, daddr, net and next
all in the first 4 fields should be sufficient.

For machines with 32-byte cache lines you are in trouble because even
without struct net in their you need two cache lines as saddr and daddr
are both 20 bytes long.  It is just a mess.

>> That said.  If we actually really care about struct net going away it
>> would be much better to globally replace struct net with a typedef that
>> looks something like:
>> 
>> #ifdef CONFIG_NET_NS
>> struct net_ref {
>>        struct net *;
>> };
>> #else
>> struct net_ref {
>> };
>> #endif
>> typedef struct net_ref net_t;
>> 
>> That would remove the need for write_pnet and read_pnet, make it
>> impossible to forget net_eq and make network namespace arguments to
>> functions also boil away at compile time if the network namespace code
>> was not enabled.
>> 
>> That was the original design and I forget why we didn't do that with
>> struct net.  But we did not.
>
> 	Not sure. Only macros help to avoid ifdefs here
> and there in the code...
>
>> In this specific case an extra member of tcp_metrics_block isn't going
>> to hurt so I don't think write_pnet and read_pnet are particularly
>> interesting.   net_eq definitely seems worthwhile.
>
> 	Lets use them, even neighbour.c uses them.
> Your patchset shows the direction we go, net field to be
> part of the structures and using common tables.
> Macros like write_pnet and read_pnet look the only
> way to save memory on small systems. And net_eq should
> be optimized by compilers too.

net_eq definitely will.  And I have figured out how to have
write_pnet and read_pnet be palatiable to me.

I wish I could force the use of net_eq when the network namespace
field might be compiled out but that seems one step too far for now.

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ