netdev - Re: [PATCH net-next 1/8] ipvlan: Implement learnable L2-bridge

Message-ID: <58174e6d-f473-4e95-b78e-7a4a9711174e@huawei.com>
Date: Thu, 23 Oct 2025 13:21:20 +0300
From: Dmitry Skorodumov <skorodumov.dmitry@...wei.com>
To: Simon Horman <horms@...nel.org>
CC: <netdev@...r.kernel.org>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <andrey.bokhanko@...wei.com>, "David S.
 Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub
 Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Jonathan Corbet
	<corbet@....net>, Andrew Lunn <andrew+netdev@...n.ch>
Subject: Re: [PATCH net-next 1/8] ipvlan: Implement learnable L2-bridge

On 22.10.2025 17:23, Simon Horman wrote:
> On Tue, Oct 21, 2025 at 05:44:03PM +0300, Dmitry Skorodumov wrote:
>> Now it is possible to create link in L2E mode: learnable
>> bridge. The IPs will be learned from TX-packets of child interfaces.
> Is there a standard for this approach - where does the L2E name come from?

Actually, I meant "E" here as "Extended". The more or less standard name for this is "MAC NAT" (MAC network address translation). I discussed the naming a bit with an LLM, and it suggested "macsnat", which looks like a better name. I hope that is OK, but I don't mind renaming it if anyone has a better idea.

> ...
>
> It is still preferred in networking code to linewrap lines
> so that they are not wider than 80 columns, where that can be done without
> reducing readability. Which appears to be the case here.
>
> Flagged by checkpatch.pl --max-line-length=80
...
> Please don't use the inline keyword in .c files

Thank you, this will be fixed.

>> +static void ipvlan_addr_learn(struct ipvl_dev *ipvlan, void *lyr3h,
>> +			      int addr_type)
>> +{
>> +	void *addr = NULL;
>> +	bool is_v6;
>> +
>> +	switch (addr_type) {
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +	/* No need to handle IPVL_ICMPV6, since it never has valid src-address */
>> +	case IPVL_IPV6: {
>> +		struct ipv6hdr *ip6h;
>> +
>> +		ip6h = (struct ipv6hdr *)lyr3h;
>> +		if (!is_ipv6_usable(&ip6h->saddr))
> It is preferred to avoid #if / #ifdef in order to improve compile coverage
> (and, I would argue, readability).
..
> In this case I think that can be achieved by changing the line above to:
>
> 		if (!IS_ENABLED(CONFIG_IPV6) || !is_ipv6_usable(&ip6h->saddr))
>
> I think it would be interesting to see if a similar approach can be used
> to remove other #if CONFIG_IPV6 conditions in this file, and if successful
> provide that as a clean-up as the opening patch in this series.
>
> However, without that, I can see how one could argue for the approach
> you have taken here on the basis of consistency.
>

Hmm, this raises questions about testing such a refactoring that are complicated for me:

- whether IPv6-specific functions (like csum_ipv6_magic() and register_inet6addr_notifier()) are available if the kernel is compiled without CONFIG_IPV6

- ideally, the code should be retested with a kernel built without CONFIG_IPV6

This looks like separate work that requires a certain amount of additional effort...
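
For illustration, a minimal userspace sketch of the pattern suggested in the review. DEMO_IPV6_ENABLED is a hypothetical stand-in for IS_ENABLED(CONFIG_IPV6), and demo_ipv6_usable() is a hypothetical stand-in for is_ipv6_usable(); neither is taken from the patch. The point is only that both operands of the condition are always parsed and type-checked, while the second one can be discarded as dead code when the option is off. It does not answer whether the real IPv6 helpers are declared when CONFIG_IPV6 is disabled; that still has to be checked against the kernel headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_IPV6); 0 means "disabled". */
#ifndef DEMO_IPV6_ENABLED
#define DEMO_IPV6_ENABLED 0
#endif

/* Hypothetical stand-in for struct in6_addr. */
struct demo_in6_addr {
	unsigned char s6_addr[16];
};

/* Hypothetical check: rejects only the unspecified (all-zero) address. */
static bool demo_ipv6_usable(const struct demo_in6_addr *a)
{
	size_t i;

	for (i = 0; i < sizeof(a->s6_addr); i++)
		if (a->s6_addr[i])
			return true;
	return false;
}

/* Returns true if the source address may be learned. */
static bool demo_may_learn_v6(const struct demo_in6_addr *saddr)
{
	/* No #if here: the call is compiled in every configuration,
	 * so the helper must at least be declared in all of them.
	 */
	if (!DEMO_IPV6_ENABLED || !demo_ipv6_usable(saddr))
		return false;
	return true;
}
```

Building the sketch with -DDEMO_IPV6_ENABLED=1 switches the check on without touching the function body, which is the compile-coverage benefit mentioned in the review.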

> static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
>>  {
>> -	const struct ipvl_dev *ipvlan = netdev_priv(dev);
>> -	struct ethhdr *eth = skb_eth_hdr(skb);
>> -	struct ipvl_addr *addr;
>>  	void *lyr3h;
>> +	struct ipvl_addr *addr;
>>  	int addr_type;
>> +	bool same_mac_addr;
>> +	struct ipvl_dev *ipvlan = netdev_priv(dev);
>> +	struct ethhdr *eth = skb_eth_hdr(skb);
> I realise that the convention is not followed in the existing code,
> but please prefer to arrange local variables in reverse xmas tree order -
> longest line to shortest.
I fixed all my changes to follow this style, except one, where it seems a bit unnatural to declare a dependent variable before its "parent" variable. I hope that is OK.
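
One common way to keep reverse xmas tree order when a longer declaration depends on a shorter one is to declare the dependent variable without an initializer and assign it in the function body. A minimal sketch, using hypothetical demo_* types rather than the real ipvlan structures:

```c
/* Hypothetical stand-ins; not the ipvlan or netdev structures. */
struct demo_port_stats { int tx_drops; };
struct demo_priv { struct demo_port_stats *stats; };
struct demo_dev { struct demo_priv *priv; };

static int demo_tx_drops(struct demo_dev *dev)
{
	/* Longest declaration first, even though it depends on 'p'. */
	const struct demo_port_stats *stats;
	struct demo_priv *p = dev->priv;
	int drops;

	/* The dependent pointer is assigned here, after 'p' is valid. */
	stats = p->stats;
	drops = stats->tx_drops;
	return drops;
}
```

Whether this reads better than a single out-of-order declaration is a judgment call for the specific function.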
>> +	    ether_addr_equal(eth->h_source, dev->dev_addr)) {
>> +		/* ignore tx-packets from host */
>> +		goto out_drop;
>> +	}
>> +
>> +	same_mac_addr = ether_addr_equal(eth->h_dest, eth->h_source);
>> +
>> +	lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
>>  
>> -	if (!ipvlan_is_vepa(ipvlan->port) &&
>> -	    ether_addr_equal(eth->h_dest, eth->h_source)) {
>> -		lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
>> +	if (ipvlan_is_learnable(ipvlan->port)) {
>> +		if (lyr3h)
>> +			ipvlan_addr_learn(ipvlan, lyr3h, addr_type);
>> +		/* Mark SKB in advance */
>> +		skb = skb_share_check(skb, GFP_ATOMIC);
>> +		if (!skb)
>> +			return NET_XMIT_DROP;
> I think that when you drop packets a counter should be incremented.
> Likewise elsewhere in this function.
The counter appears to be handled in the parent function, ipvlan_start_xmit().
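
A minimal sketch of that split, assuming the parent decides what to count purely from the value returned by the mode handler. All demo_* names and constants are hypothetical and only mirror the idea, not the actual ipvlan_start_xmit() code:

```c
/* Hypothetical return codes, modelled on NET_XMIT_SUCCESS / NET_XMIT_DROP. */
enum { DEMO_XMIT_SUCCESS = 0, DEMO_XMIT_DROP = 1 };

/* Hypothetical per-device counters. */
struct demo_stats {
	unsigned long tx_pkts;
	unsigned long tx_drps;
};

/* Hypothetical mode handler: drops empty packets, "sends" the rest.
 * It never touches the counters itself.
 */
static int demo_xmit_mode(int len)
{
	if (len <= 0)
		return DEMO_XMIT_DROP;
	return DEMO_XMIT_SUCCESS;
}

/* Hypothetical parent: the only place where counters are updated. */
static int demo_start_xmit(struct demo_stats *st, int len)
{
	int ret = demo_xmit_mode(len);

	if (ret == DEMO_XMIT_SUCCESS)
		st->tx_pkts++;
	else
		st->tx_drps++;
	return ret;
}
```

This only works if every drop path in the handler reports a drop status to the caller; a path that frees the packet but returns success would go uncounted.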
>> +	addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
>> +	if (addr) {
>> +		int ret, len;
>> +
>> +		ipvlan_skb_crossing_ns(skb, addr->master->dev);
>> +		skb->protocol = eth_type_trans(skb, skb->dev);
>> +		skb->pkt_type = PACKET_HOST;
>> +		ipvlan_mark_skb(skb, port->dev);
>> +		len = skb->len + ETH_HLEN;
>> +		ret = netif_rx(skb);
>> +		ipvlan_count_rx(ipvlan, len, ret == NET_RX_SUCCESS, false);
>>
> This fails to build because ipvlan is not declared in this scope.
> Perhaps something got missed due to an edit?
Oops, really. Compilation was fixed in later patches.
>> +
>> +out:
>> +	dev_kfree_skb(skb);
>> +no_mem:
>> +	return 0; // actually, ret value is ignored
> Maybe, but it seems to me that the return values
> should follow that of netif_receive_skb_core().
Agreed, this will be fixed.

Dmitry