lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAL9ddJc-U+5dgi82gqEfurGKiLX=aTvATpPeR4xOHRVvn0PRwA@mail.gmail.com>
Date:   Thu, 27 Aug 2020 12:24:05 -0700
From:   David Awogbemila <awogbemila@...gle.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Yangchun Fu <yangchun@...gle.com>, netdev@...r.kernel.org,
        Kuo Zhao <kuozhao@...gle.com>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>
Subject: Re: [PATCH net-next 05/18] gve: Add Gvnic stats AQ command and
 ethtool show/set-priv-flags.

> > I could use some help figuring out the use of rtnl_link_stats64 here.
> > These 4 stats are per-queue stats written by the NIC. It looks like
> > rtnl_link_stats64 is meant to sum stats for the entire device? Is the
> > requirement here simply to use the member names in rtnl_link_stats64
> > when reporting stats via ethtool? Thanks.
>
> If these are per queue you can report them per queue in ethtool (name
> is up to you), but you must report the sum over all queues in
> rtnl_link_stats64.
>
> FWIW the mapping I'd suggest is probably:
>
> RX_QUEUE_DROP_CNT         -> rx_dropped
> RX_NO_BUFFERS_POSTED      -> rx_missed_errors
> RX_DROPS_PACKET_OVER_MRU  -> rx_length_errors
> RX_DROPS_INVALID_CHECKSUM -> rx_crc_errors
>
> The no-buffers-posted stat is unfortunately slightly disputable between
> rx_missed_errors and rx_fifo_errors (even rx_over_errors). I'd go for missed.
This is very helpful, thanks. The aggregate stats are the stats named
in gve_gstrings_main_stats (in gve_ethtool.c). This particular patch
does not change those. Patch 2 ("gve: Add stats for gve.") is what I
think may have to change to conform to rtnl_link_stats64(?)

> > > > > > +static int gve_set_priv_flags(struct net_device *netdev, u32 flags)
> > > > > > +{
> > > > > > +     struct gve_priv *priv = netdev_priv(netdev);
> > > > > > +     u64 ori_flags, new_flags;
> > > > > > +     u32 i;
> > > > > > +
> > > > > > +     ori_flags = READ_ONCE(priv->ethtool_flags);
> > > > > > +     new_flags = ori_flags;
> > > > > > +
> > > > > > +     for (i = 0; i < GVE_PRIV_FLAGS_STR_LEN; i++) {
> > > > > > +             if (flags & BIT(i))
> > > > > > +                     new_flags |= BIT(i);
> > > > > > +             else
> > > > > > +                     new_flags &= ~(BIT(i));
> > > > > > +             priv->ethtool_flags = new_flags;
> > > > > > +             /* set report-stats */
> > > > > > +             if (strcmp(gve_gstrings_priv_flags[i], "report-stats") == 0) {
> > > > > > +                     /* update the stats when user turns report-stats on */
> > > > > > +                     if (flags & BIT(i))
> > > > > > +                             gve_handle_report_stats(priv);
> > > > > > +                     /* zero off gve stats when report-stats turned off */
> > > > > > +                     if (!(flags & BIT(i)) && (ori_flags & BIT(i))) {
> > > > > > +                             int tx_stats_num = GVE_TX_STATS_REPORT_NUM *
> > > > > > +                                     priv->tx_cfg.num_queues;
> > > > > > +                             int rx_stats_num = GVE_RX_STATS_REPORT_NUM *
> > > > > > +                                     priv->rx_cfg.num_queues;
> > > > > > +                             memset(priv->stats_report->stats, 0,
> > > > > > +                                    (tx_stats_num + rx_stats_num) *
> > > > > > +                                    sizeof(struct stats));
> > > > >
> > > > > I don't quite get why you need the knob to disable some statistics.
> > > > > Please remove or explain this in the cover letter. Looks unnecessary.
> > > > We use this to give the guest the option of disabling stats reporting
> > > > through ethtool set-priv-flags. I'll update the cover letter.
> > >
> > > I asked you why you reply a week later with "I want to give user the
> > > option. I'll update the cover letter." :/ That's quite painful for the
> > > reviewer. Please just provide the justification.
> > I apologize for the pain; it certainly wasn't intended :) .
> > Just to clarify, stats will always be available to the user via ethtool.
> > This is only giving users the option of disabling the reporting of
> > stats from the driver to the virtual NIC should the user decide they
> > do not want to share driver stats with Google as a matter of privacy.
>
> Okay, so this is for the to-hypervisor direction. Hopefully the patch
> split will make this clearer.
>
> > > > > > @@ -880,6 +953,10 @@ static void gve_handle_status(struct gve_priv *priv, u32 status)
> > > > > >               dev_info(&priv->pdev->dev, "Device requested reset.\n");
> > > > > >               gve_set_do_reset(priv);
> > > > > >       }
> > > > > > +     if (GVE_DEVICE_STATUS_REPORT_STATS_MASK & status) {
> > > > > > +             dev_info(&priv->pdev->dev, "Device report stats on.\n");
> > > > >
> > > > > How often is this printed?
> > > > Stats reporting is disabled by default. But when enabled, this would
> > > > only get printed whenever the virtual NIC detects
> > > > an issue and triggers a report-stats request.
> > >
> > > What kind of issue? Something serious? Packet drops?
> > Sorry, to correct myself, this would get printed only at the moments
> > when the device switches report-stats on, not on every stats report.
> > After that, it would not get printed until it is switched off and then
> > on again, so rarely.
> > It would get switched on if there is a networking issue such as packet
> > drops and help us investigate a stuck queue for example.
>
> Reporting of the stats is a one-shot:
>
> +       if (gve_get_do_report_stats(priv)) {
> +               gve_handle_report_stats(priv);
> +               gve_clear_do_report_stats(priv);
> +       }
>
> So the hypervisor implements some rate limiting on the requests?
Yes, not every packet drop would trigger the request, only if the
virtual NIC feels that "too many" are being dropped.
When report-stats is turned on (both user ethtool setting is enabled
and device has requested to turn it on), the stats would be updated
once every 20 seconds.

>
> In general, I don't think users will want these messages to keep
> popping up. A ethtool -S statistic for the number of times reporting
> happened should be sufficient. Or if you really want them - have a
> verbose version of the priv flag, maybe?
An ethtool stat (counting requests to turn report-stats on from the
virtual NIC) should suffice.
(I think a verbose logging option should probably apply to not just
this log, so maybe that's another patch in itself?)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ