lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <3CA45501-9FD3-4A33-97A8-AB0BD77AC61F@pjd.dev>
Date:   Thu, 15 Dec 2022 17:31:17 -0800
From:   Peter Delevoryas <peter@....dev>
To:     Alexander H Duyck <alexander.duyck@...il.com>
Cc:     Peter Delevoryas <peter@....dev>, sam@...dozajonas.com,
        davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
        pabeni@...hat.com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net/ncsi: Always use unicast source MAC address



> On Dec 15, 2022, at 5:07 PM, Peter Delevoryas <peter@....dev> wrote:
> 
> 
> 
>> On Dec 13, 2022, at 8:41 AM, Alexander H Duyck <alexander.duyck@...il.com> wrote:
>> 
>> On Mon, 2022-12-12 at 16:47 -0800, Peter Delevoryas wrote:
>>> I use QEMU for development, and I noticed that NC-SI packets get dropped by
>>> the Linux software bridge[1] because we use a broadcast source MAC address
>>> for the first few NC-SI packets.
>> 
>> Normally NC-SI packets should never be seen by a bridge.
> 
> True, and it’s good to keep this in context. I’m trying to make this change
> to support simulation environments, but any change in NC-SI could easily
> result in the out-of-band network connection to BMC’s in real data centers
> failing to come up, which can be really bad and usually impossible to
> recover remotely.
> 
>> Isn't NC-SI
>> really supposed to just be between the BMC and the NIC firmware?
> 
> Yep
> 
>> Depending on your setup it might make more sense to use something like
>> macvtap or a socket connection to just bypass the need for the bridge
>> entirely.
> 
> For unicast, yes, but I want to test multiple NIC’s sharing an RMII
> link and verifying the broadcast behavior, and the failover behavior
> when an RX or TX channel goes down.
> 
> The multicast UDP socket backend _does_ work, but I was getting some
> recirculation problems or some kind of buffering thing. I managed
> to get tap0 + tap1 + br0 working faster.
> 
>> 
>>> The spec requires that the destination MAC address is FF:FF:FF:FF:FF:FF,
>>> but it doesn't require anything about the source MAC address as far as I
>>> know. From testing on a few different NC-SI NIC's (Broadcom 57502, Nvidia
>>> CX4, CX6) I don't think it matters to the network card. I mean, Meta has
>>> been using this in mass production with millions of BMC's [2].
>>> 
>>> In general, I think it's probably just a good idea to use a unicast MAC.
>> 
>> I'm not sure I agree there. What is the initial value of the address?
> 
> Ok so, to be honest, I thought that the BMC’s FTGMAC100 peripherals
> came with addresses provisioned from the factory, and that we were just
> discarding that value and using an address provisioned through the NIC,
> because I hadn’t really dug into the FTGMAC100 datasheet fully. I see now
> that the MAC address register I thought was a read-only manufacturing
> value is actually 8 different MAC address r/w registers for filtering.
> *facepalm*
> 
> It suddenly makes a lot more sense why all these OEM Get MAC Address
> commands exist: the BMC chip doesn’t come with any MAC addresses from
> manufacturing. It’s a necessity, not some convenience artifact/etc.
> 
> So, tracing some example systems to see what shows up:
> 
> One example:
> INIT: Entering runlevel: 5
> Configuring network interfaces... [   25.893118] 8021q: adding VLAN 0 to HW filter on device eth0
> [   25.904809] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   25.917307] ftgmac100 1e660000.ethernet eth0: NCSI: Handler for packet type 0x82 returned -19
> [   25.958096] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   25.978124] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   25.990559] ftgmac100 1e660000.ethernet eth0: NCSI: 'bad' packet ignored for type 0xd0
> [   26.018180] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.030631] ftgmac100 1e660000.ethernet eth0: NCSI: 'bad' packet ignored for type 0xd0
> [   26.046594] ftgmac100 1e660000.ethernet eth0: NCSI: transmit cmd 0x50 ncsi_oem_sma_mlx success during probe
> [   26.168109] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.198101] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.238237] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.272011] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.308155] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.320504] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> done.
> [   26.408094] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.438100] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.450537] ftgmac100 1e660000.ethernet eth0: NCSI: bcm_gmac16 MAC RE:DA:CT:ED:HE:HE
> Starting random number generator[   26.472388] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> daemon[   26.518241] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.559504] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> .
> [   26.608229] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> Setup dhclient for IPv6... done.
> [   26.681879] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.730523] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.808191] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> [   26.855689] NCSI: source=ff:ff:ff:ff:ff:ff dest=ff:ff:ff:ff:ff:ff
> 
> Oddly, due to that code you mentioned, all NC-SI packets are using
> a broadcast source MAC address, even after the Get MAC Address sequence
> gets the MAC provisioned for the BMC from the Broadcom NIC.
> 
> root@...-oob:~# ifconfig
> eth0      Link encap:Ethernet  HWaddr RE:DA:CT:ED:HE:HE
>          inet addr:XXXXXXX  Bcast:XXXXXXXX  Mask:XXXXXXXX
>          inet6 addr: XXXXXXXX Scope:Global
>          inet6 addr: XXXXXXXX Scope:Link
>          inet6 addr: XXXXXXXX Scope:Global
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:2965 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:637 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:872759 (852.3 KiB)  TX bytes:59936 (58.5 KiB)
>          Interrupt:19
> 
> But, that’s a system using the 5.0 kernel with lots of old hacks
> on top. A system using a 5.15 kernel with this change included:
> 
> INIT: Entering runlevel: 5
> Configuring network interfaces... [    6.596537] 8021q: adding VLAN 0 to HW filter on device eth0
> [    6.609264] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.622913] ftgmac100 1e690000.ftgmac eth0: NCSI: Handler for packet type 0x82 returned -19
> [    6.641447] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.662543] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.680454] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.694114] ftgmac100 1e690000.ftgmac eth0: NCSI: transmit cmd 0x50 ncsi_oem_sma_mlx success during probe
> [    6.715722] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> done.
> [    6.741372] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.741451] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.768714] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> Starting random [    6.782599] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> number generator[    6.799321] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> daemon[    6.815680] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.831388] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> .
> [    6.846921] ftgmac100 1e690000.ftgmac eth0: NCSI: Network controller supports NC-SI 1.1, querying MAC address through OEM(0x8119) command
> Setup dhclient for IPv6... done.
> [    6.908921] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> reloading rsyslo[    6.933085] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> 
> So, this BMC already had the provisioned MAC address somehow,
> even before the Nvidia Get MAC Address command towards the bottom.
> 
> Adding tracing to ftgmac100:
> 
> [    2.018672] ftgmac100_initial_mac
> [    2.026090] Read MAC address from FTGMAC100 register: RE:DA:CT:ED:AD:DR
> [    2.040771] ftgmac100 1e690000.ftgmac: Read MAC address RE:DA:CT:ED:AD:DR from chip
> [    2.057774] ftgmac100 1e690000.ftgmac: Using NCSI interface
> [    2.070957] ftgmac100 1e690000.ftgmac eth0: irq 33, mapped at (ptrval)
> 
> Now, after rewriting the FTGMAC100 register to fa:ce:b0:0c:20:22 and rebooting:
> 
> root@...p6-2620-10d-c0b9-4b08-0-0-0-e4e:~# devmem 0x1e690008 32 0x0000face
> root@...p6-2620-10d-c0b9-4b08-0-0-0-e4e:~# devmem 0x1e690008
> 0x0000FACE
> root@...p6-2620-10d-c0b9-4b08-0-0-0-e4e:~# devmem 0x1e69000c 32 0xb00c2022
> root@...p6-2620-10d-c0b9-4b08-0-0-0-e4e:~# devmem 0x1e69000c
> 0xB00C2022
> 
> [    2.001304] ftgmac100_initial_mac
> [    2.008727] Read MAC address from FTGMAC100 register: fa:ce:b0:0c:20:22
> [    2.023373] ftgmac100 1e690000.ftgmac: Read MAC address fa:ce:b0:0c:20:22 from chip
> [    2.040367] ftgmac100 1e690000.ftgmac: Using NCSI interface
> 
> [    6.581239] ftgmac100_reset_mac
> [    6.589193] ftgmac100_reset_mac
> [    6.596727] 8021q: adding VLAN 0 to HW filter on device eth0
> [    6.609462] NCSI: source=fa:ce:b0:0c:20:22 dest=ff:ff:ff:ff:ff:ff
> [    6.623117] ftgmac100 1e690000.ftgmac eth0: NCSI: Handler for packet type 0x82 returned -19
> [    6.641647] NCSI: source=fa:ce:b0:0c:20:22 dest=ff:ff:ff:ff:ff:ff
> [    6.662398] NCSI: source=fa:ce:b0:0c:20:22 dest=ff:ff:ff:ff:ff:ff
> [    6.680380] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.694000] ftgmac100 1e690000.ftgmac eth0: NCSI: transmit cmd 0x50 ncsi_oem_sma_mlx success during probe
> [    6.715700] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> [    6.729528] NCSI: source=RE:DA:CT:ED:AD:DR dest=ff:ff:ff:ff:ff:ff
> 
> So, it looks like whatever is initialized in ftgmac100_initial_mac becomes
> the address we use for the NCSI queries initially.
> 
> The Aspeed datasheet says the FTGMAC100 MAC address registers are initialized to zero,
> and in that case the ftgmac100 driver initializes it to something random
> with eth_hw_addr_random().
> 
> So, I mean correct me if I’m wrong, but I think it all seems fine?
> 
> On a hard power cycle (instead of just resetting the ARM cores, which doesn’t seem to
> have reset the peripherals), maybe it would actually be zero, and get initialized
> to the random value. I’ll test that, need to do some more debug image building to do it
> remotely.

Oh, that didn’t take as long as I expected. Here’s the results from a real
power cycle:

[    5.248154] ftgmac100_initial_mac
[    5.255470] Read MAC Address from FTGMAC100 register: 00:00:00:00:00:00
[    5.255482] Generated random MAC Address: 4e:c7:78:ec:cd:4a
[    5.282434] ftgmac100 1e690000.ftgmac: Generated random MAC address 4e:c7:78:ec:cd:4a

So yeah, in full power-cycles, it’ll be some random address.

> 
>> If I am not mistaken the gma_flag is used to indicate that the MAC
>> address has been acquired isn't it?
> 
> That’s correct.
> 
>> If using the broadcast is an issue
>> the maybe an all 0's MAC address might be more appropriate.
> 
> Possibly yeah, although that would also be dropped by the Linux bridge lol,
> so it wouldn’t solve my simulation problem.
> 
>> My main
>> concern would be that the dev_addr is not initialized for those first
>> few messages so you may be leaking information.
>> 
>>> This might have the effect of causing the NIC to learn 2 MAC addresses from
>>> an NC-SI link if the BMC uses OEM Get MAC Address commands to change its
>>> initial MAC address, but it shouldn't really matter. Who knows if NIC's
>>> even have MAC learning enabled from the out-of-band BMC link, lol.
>>> 
>>> [1]: https://tinyurl.com/4933mhaj
>>> [2]: https://tinyurl.com/mr3tyadb
>> 
>> The thing is the OpenBMC approach initializes the value themselves to
>> broadcast[3]. As a result the two code bases are essentially doing the
>> same thing since mac_addr is defaulted to the broadcast address when
>> the ncsi interface is registered.
> 
> That’s a very good point, thanks for pointing that out, I hadn’t
> even noticed that!
> 
> Anyways, let me know what you think of the traces I added above.
> Sorry for the delay, I’ve just been busy with some other stuff,
> but I do really actually care about upstreaming this (and several
> other NC-SI changes I’ll submit after this one, which are unrelated
> but more useful).
> 
> Thanks,
> Peter
> 
>> 
>> [3]: tinyurl.com/mr3cxf3b
>> 
>>> 
>>> Signed-off-by: Peter Delevoryas <peter@....dev>
>>> ---
>>> net/ncsi/ncsi-cmd.c | 10 +---------
>>> 1 file changed, 1 insertion(+), 9 deletions(-)
>>> 
>>> diff --git a/net/ncsi/ncsi-cmd.c b/net/ncsi/ncsi-cmd.c
>>> index dda8b76b7798..fd090156cf0d 100644
>>> --- a/net/ncsi/ncsi-cmd.c
>>> +++ b/net/ncsi/ncsi-cmd.c
>>> @@ -377,15 +377,7 @@ int ncsi_xmit_cmd(struct ncsi_cmd_arg *nca)
>>> eh = skb_push(nr->cmd, sizeof(*eh));
>>> eh->h_proto = htons(ETH_P_NCSI);
>>> eth_broadcast_addr(eh->h_dest);
>>> -
>>> - /* If mac address received from device then use it for
>>> - * source address as unicast address else use broadcast
>>> - * address as source address
>>> - */
>>> - if (nca->ndp->gma_flag == 1)
>>> - memcpy(eh->h_source, nca->ndp->ndev.dev->dev_addr, ETH_ALEN);
>>> - else
>>> - eth_broadcast_addr(eh->h_source);
>>> + memcpy(eh->h_source, nca->ndp->ndev.dev->dev_addr, ETH_ALEN);
>>> 
>>> /* Start the timer for the request that might not have
>>> * corresponding response. Given NCSI is an internal


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ