lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0UcPExnW2jcZ9pAs0D65gXTU89jPEoCpsGVVT=FAW616Vg@mail.gmail.com>
Date: Tue, 2 Jul 2024 11:57:33 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew@...n.ch>, 
	Alexander Duyck <alexanderduyck@...com>, kuba@...nel.org, davem@...emloft.net, 
	pabeni@...hat.com, edumazet@...gle.com, kernel-team@...a.com
Subject: Re: [net-next PATCH v3 11/15] eth: fbnic: Add link detection

On Tue, Jul 2, 2024 at 10:21 AM Russell King (Oracle)
<linux@...linux.org.uk> wrote:
>
> Thanks for the patch - this is a review mainly from the phylink
> perspective.
>
> On Tue, Jul 02, 2024 at 08:00:22AM -0700, Alexander Duyck wrote:
> > +static irqreturn_t fbnic_pcs_msix_intr(int __always_unused irq, void *data)
> > +{
> > +     struct fbnic_dev *fbd = data;
> > +     struct fbnic_net *fbn;
> > +     bool link_up;
> > +
> > +     if (!fbd->mac->pcs_get_link_event(fbd)) {
> > +             fbnic_wr32(fbd, FBNIC_INTR_MASK_CLEAR(0),
> > +                        1u << FBNIC_PCS_MSIX_ENTRY);
> > +             return IRQ_HANDLED;
> > +     }
> > +
> > +     link_up = fbd->link_state == FBNIC_LINK_UP;
> > +
> > +     fbd->link_state = FBNIC_LINK_EVENT;
> > +     fbn = netdev_priv(fbd->netdev);
> > +
> > +     phylink_pcs_change(&fbn->phylink_pcs, link_up);
>
> fbd->link_state seems to be set to FBNIC_LINK_UP when the
> mac_link_up(), more specifically fbnic_mac_link_up_asic() gets called.
> No, never report back to phylink what phylink asked the MAC/PCS to do!
>
> If you don't know what happened to the link here, then report that the
> link went down - in other words, always pass "false" to
> phylink_pcs_change() which ensures that phylink will never miss a
> link-down event (because it will assume that the link went down.)
>
> I think you could even do:
>
>         struct fbnic_dev *fbd = data;
>         struct fbnic_net *fbn;
>         int status;
>
>         status = fbd->mac->pcs_get_link_event(fbd);
>         if (status == 0) {
>                 fbnic_wr32(fbd, FBNIC_INTR_MASK_CLEAR(0),
>                            1u << FBNIC_PCS_MSIX_ENTRY);
>         } else {
>                 fbn = netdiev_priv(fbd->netdev);
>
>                 phylink_pcs_change(&fbn->phylink_pcs, status > 0);
>         }
>
>         return IRQ_HANDLED;
>

I will probably just default it to false for now to keep it simple.

> > +/**
> > + * fbnic_mac_enable - Configure the MAC to enable it to advertise link
> > + * @fbd: Pointer to device to initialize
> > + *
> > + * This function provides basic bringup for the CMAC and sets the link
> > + * state to FBNIC_LINK_EVENT which tells the link state check that the
> > + * current state is unknown and that interrupts must be enabled after the
> > + * check is completed.
> > + *
> > + * Return: non-zero on failure.
> > + **/
> > +int fbnic_mac_enable(struct fbnic_dev *fbd)
> > +{
> > +     struct fbnic_net *fbn = netdev_priv(fbd->netdev);
> > +     u32 vector = fbd->pcs_msix_vector;
> > +     int err;
> > +
> > +     /* Request the IRQ for MAC link vector.
> > +      * Map MAC cause to it, and unmask it
> > +      */
> > +     err = request_irq(vector, &fbnic_pcs_msix_intr, 0,
> > +                       fbd->netdev->name, fbd);
> > +     if (err)
> > +             return err;
> > +
> > +     fbnic_wr32(fbd, FBNIC_INTR_MSIX_CTRL(FBNIC_INTR_MSIX_CTRL_PCS_IDX),
> > +                FBNIC_PCS_MSIX_ENTRY | FBNIC_INTR_MSIX_CTRL_ENABLE);
> > +
> > +     phylink_start(fbn->phylink);
> > +
> > +     fbnic_wr32(fbd, FBNIC_INTR_SET(0), 1u << FBNIC_PCS_MSIX_ENTRY);
>
> If this is enabling the interrupt, ideally that should be before
> phylink_start().

Actually the interrupt is enabled after the first time
fbnic_pcs_get_link_asic/pcs_get_state is called.

>From what I can tell this is just adding an extra manual interrupt
trigger, though it isn't setting any cause bits so it is most likely
there just to trigger the interrupt to clear stale data. I suspect
this is some holdover code from version 1 that wasn't based on
phylink.

> > +void fbnic_mac_disable(struct fbnic_dev *fbd)
> > +{
> > +     struct fbnic_net *fbn = netdev_priv(fbd->netdev);
> > +
> > +     /* Nothing to do if link is already disabled */
> > +     if (fbd->link_state == FBNIC_LINK_DISABLED)
> > +             return;
> > +
> > +     phylink_stop(fbn->phylink);
>
> Why is this conditional? If you've called phylink_start(), and the
> network interface is being taken down administratively, then
> phylink_stop() needs to be called no matter what. If the link was
> up at that point, phylink will call your mac_link_down() as part
> of that. Moreover, the networking layers guarantee that .ndo_stop
> won't be called unless .ndo_open has been successfully called.

It is a bit of vestigial code from when we were running before adding
phylink support. This shouldn't stop us from calling phylink_stop as
the current setter for FBNIC_LINK_DISABLED is pcs_disable which is
called at the end of the phylink_stop function. Odds are this probably
mirrors PCS_STATE_DOWN

> > +static int fbnic_pcs_get_link_event_asic(struct fbnic_dev *fbd)
> > +{
> > +     u32 pcs_intr_mask = rd32(fbd, FBNIC_SIG_PCS_INTR_STS);
> > +
> > +     if (pcs_intr_mask & FBNIC_SIG_PCS_INTR_LINK_DOWN)
> > +             return -1;
> > +
> > +     return (pcs_intr_mask & FBNIC_SIG_PCS_INTR_LINK_UP) ? 1 : 0;
> > +}
>
> I think an enum/#define of some symbolic names would be useful both
> here and in the interrupt handler so we have something descriptive
> instead of -1, 0, 1.

I can look at converting these to an enum.

> > +static bool fbnic_pcs_get_link_asic(struct fbnic_dev *fbd)
> > +{
> > +     int link_direction;
> > +     bool link;
> > +
> > +     /* If disabled do not update link_state nor change settings */
> > +     if (fbd->link_state == FBNIC_LINK_DISABLED)
> > +             return false;
>
> If phylink_stop() has been called (one of the places you set
> link_state to FBNIC_LINK_DISABLED) then phylink will force the link
> down and will disregard state read from the .pcs_get_state() method.
>
> The other place is when fbnic_pcs_disable_asic() has been called,
> which I think is hooked into the .pcs_disable method. Well, if
> phylink has called the .pcs_disable method, then it is disconnecting
> from this PCS, and won't be calling .pcs_get_state anyway.
>
> So all in all, I think this check is unnecessary and should be removed.

Agreed

> > +
> > +     /* In an interrupt driven setup we can just skip the check if
> > +      * the link is up as the interrupt should toggle it to the EVENT
> > +      * state if the link has changed state at any time since the last
> > +      * check.
> > +      */
> > +     if (fbd->link_state == FBNIC_LINK_UP)
> > +             return true;
>
> Again, don't feed back to phylink what phylink asked you to do!

So the general idea with this is to avoid the need to recheck the
registers if there hasn't been an interrupt event to indicate that the
link changed. If the link has been confirmed up, and there hasn't been
a link event the assumption is we are still up. The interrupt for up
would be masked so we shouldn't see any up events as long as we are
up. That is why we just store the fact that the link is up and report
true here to indicate the link is present.

> > +
> > +     link_direction = fbnic_pcs_get_link_event_asic(fbd);
> > +
> > +     /* Clear interrupt state due to recent changes. */
> > +     wr32(fbd, FBNIC_SIG_PCS_INTR_STS,
> > +          FBNIC_SIG_PCS_INTR_LINK_DOWN | FBNIC_SIG_PCS_INTR_LINK_UP);
> > +
> > +     /* If link bounced down clear the PCS_STS bit related to link */
> > +     if (link_direction < 0) {
> > +             wr32(fbd, FBNIC_SIG_PCS_OUT0, FBNIC_SIG_PCS_OUT0_LINK |
> > +                                      FBNIC_SIG_PCS_OUT0_BLOCK_LOCK |
> > +                                      FBNIC_SIG_PCS_OUT0_AMPS_LOCK);
> > +             wr32(fbd, FBNIC_SIG_PCS_OUT1, FBNIC_SIG_PCS_OUT1_FCFEC_LOCK);
> > +     }
>
> If the link "bounces" down, then phylink needs to know - but that
> would be covered if, when you receive an interrupt, you always
> call phylink_pcs_change(..., false). Still, phylink can deal with
> latched-low clear-on-read link statuses. I think you want to read
> the link status, and if it's indicating link-failed, then clear
> the latched link-failed state.
>
> > +
> > +     link = fbnic_mac_get_pcs_link_status(fbd);
> > +
> > +     if (link_direction)
> > +             wr32(fbd, FBNIC_SIG_PCS_INTR_MASK,
> > +                  link ?  ~FBNIC_SIG_PCS_INTR_LINK_DOWN :
> > +                          ~FBNIC_SIG_PCS_INTR_LINK_UP);
>
> Why do you need to change the interrupt mask? Can't you just leave
> both enabled and let the hardware tell you when something changes?

It is mostly just about screening out unwanted info. As I mentioned we
were already tracking if we were up or down previously. So we were
only concerned if the link transitioned to the other state.

> > +static int fbnic_pcs_enable_asic(struct fbnic_dev *fbd)
> > +{
> > +     /* Mask and clear the PCS interrupt, will be enabled by link handler */
> > +     wr32(fbd, FBNIC_SIG_PCS_INTR_MASK, ~0);
> > +     wr32(fbd, FBNIC_SIG_PCS_INTR_STS, ~0);
> > +
> > +     /* Pull in settings from FW */
> > +     fbnic_pcs_get_fw_settings(fbd);
> > +
> > +     /* Flush any stale link status info */
> > +     wr32(fbd, FBNIC_SIG_PCS_OUT0, FBNIC_SIG_PCS_OUT0_LINK |
> > +                              FBNIC_SIG_PCS_OUT0_BLOCK_LOCK |
> > +                              FBNIC_SIG_PCS_OUT0_AMPS_LOCK);
>
> If the link went down, it's better to allow phylink to see that.

This isn't link down, this is a state link up we are flushing.

> > +static void fbnic_mac_link_down_asic(struct fbnic_dev *fbd)
> > +{
> > +     u32 cmd_cfg, mac_ctrl;
> > +
> > +     if (fbd->link_state == FBNIC_LINK_DOWN)
> > +             return;
>
> You shouldn't need this.
>
> > +static void fbnic_mac_link_up_asic(struct fbnic_dev *fbd)
> > +{
> > +     u32 cmd_cfg, mac_ctrl;
> > +
> > +     if (fbd->link_state == FBNIC_LINK_UP)
> > +             return;
>
> You shouldn't need this.

I can strip those two checks. They are mostly carry-over from an
earlier version.

> > +/* Treat the FEC bits as a bitmask laid out as follows:
> > + * Bit 0: RS Enabled
> > + * Bit 1: BASER(Firecode) Enabled
> > + * Bit 2: Autoneg FEC
> > + */
> > +enum {
> > +     FBNIC_FEC_OFF           = 0,
> > +     FBNIC_FEC_RS            = 1,
> > +     FBNIC_FEC_BASER         = 2,
> > +     FBNIC_FEC_AUTO          = 4,
> > +};
> > +
> > +#define FBNIC_FEC_MODE_MASK  (FBNIC_FEC_AUTO - 1)
> > +
> > +/* Treat the link modes as a set of moldulation/lanes bitmask:
>
> Spelling: modulation
>

Will fix.

> > @@ -22,9 +23,19 @@ struct fbnic_net {
> >
> >       u16 num_napi;
> >
> > +     struct phylink *phylink;
> > +     struct phylink_config phylink_config;
> > +     struct phylink_pcs phylink_pcs;
> > +
> > +     u8 tx_pause;
> > +     u8 rx_pause;
>
> If you passed these flags into your .link_up method, then you don't need
> to store them.

I need to keep track of the tx_pause at a minimum as a part of the Rx
queue configuration as we use it to determine if we drop packets for
full queues or push back into the FIFO.

> > +     u8 fec;
> > +     u8 link_mode;
>
> I think "link_mode" can be entirely removed.

I was actually going to reach out to you guys about that. For this
patch set I think it may be needed as I have no way to sort out
50000baseCR2 (NRZ, 2 lanes) vs 50000baseCR (PAM4, 1 lane) in the
current phylink code for setting the mac link up. I was wondering if
you had any suggestions on how I might resolve that?

Basically I have a laundry list of things that I was planning to start
on in the follow on sets:

1. I still need to add CGMII support as technically we are using a
different interface mode to support 100Gbps. Seems like I can mostly
just do a find/insert based on the PHY_INTERFACE_MODE_XLGMII to add
that so it should be straight forward.

2. We have 2 PCS blocks we are working with to set up the CR2 modes. I
was wondering if I should just be writing my PCS code to be handling
the merged pair of IP or if I should look at changing the phylink code
to support more than one PCS device servicing a given connection?

3. The FEC config is integral to the PCS and MAC setup on my device. I
was wondering why FEC isn't included as a part of the phylink code?
Are there any plans to look at doing that? Otherwise what is the
recommended setup for handling that as it can be negotiated via
autoneg for our 25G and 50G-R2 links so I will need to work out how to
best go after that.

> > +static void
> > +fbnic_phylink_pcs_get_state(struct phylink_pcs *pcs,
> > +                         struct phylink_link_state *state)
> > +{
> > +     struct fbnic_net *fbn = fbnic_pcs_to_net(pcs);
> > +     struct fbnic_dev *fbd = fbn->fbd;
> > +
> > +     /* For now we use hard-coded defaults and FW config to determine
> > +      * the current values. In future patches we will add support for
> > +      * reconfiguring these values and changing link settings.
> > +      */
> > +     switch (fbd->fw_cap.link_speed) {
> > +     case FBNIC_FW_LINK_SPEED_25R1:
> > +             state->speed = SPEED_25000;
> > +             break;
> > +     case FBNIC_FW_LINK_SPEED_50R2:
> > +             state->speed = SPEED_50000;
> > +             break;
> > +     case FBNIC_FW_LINK_SPEED_100R2:
> > +             state->speed = SPEED_100000;
> > +             break;
> > +     default:
> > +             state->speed = SPEED_UNKNOWN;
> > +             break;
> > +     }
> > +
> > +     state->pause |= MLO_PAUSE_RX;
> > +     state->duplex = DUPLEX_FULL;
> > +     state->interface = PHY_INTERFACE_MODE_XLGMII;
>
> Please don't set state->interface, and please read the documentation
> for this method:
>
>  * Read the current inband link state from the MAC PCS, reporting the
>  * current speed in @state->speed, duplex mode in @state->duplex, pause
>  * mode in @state->pause using the %MLO_PAUSE_RX and %MLO_PAUSE_TX bits,
>  * negotiation completion state in @state->an_complete, and link up state
>  * in @state->link. If possible, @state->lp_advertising should also be
>  * populated.
>
> Note that it doesn't say that state->interface should be set (it
> shouldn't.)

Okay, I will remove that.

> > +int fbnic_phylink_init(struct net_device *netdev)
> > +{
> > +     struct fbnic_net *fbn = netdev_priv(netdev);
> > +     struct phylink *phylink;
> > +
> > +     fbn->phylink_pcs.ops = &fbnic_phylink_pcs_ops;
>
> Please also set phylink_pcs.pcs_neg_mode = true (required for modern
> drivers), especially as you call the argument "neg_mode" in your
> pcs_config function.

Will do.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ