lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Thu, 23 May 2024 17:35:41 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Krzysztof Olędzki <ole@....pl>
Cc: Ido Schimmel <idosch@...dia.com>, Michal Kubecek <mkubecek@...e.cz>,
	Moshe Shemesh <moshe@...dia.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	tariqt@...dia.com
Subject: Re: "netlink error: Invalid argument" with ethtool-5.13+ on recent
 kernels due to "ethtool: Add netlink handler for getmodule (-m)" -
 25b64c66f58d3df0ad7272dda91c3ab06fe7a303, also no SFP-DOM support via
 netlink?

> > Before you do that, please could you work on ethtool. I would say it
> > has a bug. It has been provided with 256 bytes of SPF data. It should
> > be able to decode that and print it in human readable format. So the
> > EINVAL should not be considered fatal to decoding.
> 
> Yes, I was also thinking this way. Luckily, similar to the situation with the mlx4 driver, all the logic is there - sff8636_dom_parse() checks if map->page_03h is set and if not, just returns gracefully.
> 
> So, all we need to do is modify sff8636_memory_map_init_pages():
> @@ -1038,7 +1039,7 @@
>         sff8636_request_init(&request, 0x3, SFF8636_PAGE_SIZE);
>         ret = nl_get_eeprom_page(ctx, &request);
>         if (ret < 0)
> -               return ret;
> +               return 0;
>         map->page_03h = request.data - SFF8636_PAGE_SIZE;

Nice. Please submit a patch.

I see there is a discussion about if there should be a warning here or
not. The problem we have is that some NICs offload getting data from
the SFP to firmware, and the firmware does not support more than
getting the first page. The IOCTL API returned whatever the was
available, and ethtool would decode what it was given. With the change
to the newer API, ethtool can ask for any page. Those devices using
limited firmware cannot fulfil the request, and are going to return an
error, either -EOPNOTSUPP, or maybe -EINVAL. We don't want ethtool to
regress, so being silent is probably the correct thing to do.

It does however hide bugs in drivers, but we can maybe find them via
testing. It would be nice to have CI/CD test which runs ethtool using
both the IOCTL interface and the netlink and compares the output. The
IOCTL output should be a subset of the netlink output. In this case,
it clearly is not a subset.

> Finally, as I was looking at the code in fallback_set_params() I started thinking if the length check is actually correct?
> 
> I think instead of:
>  if (offset >= modinfo->eeprom_len)
> we may want:
>  if (offset + length > modinfo->eeprom_len)
> 

> I don't know if it is safe to assume we always read a single page
> and cross page reads are not allowed and even if so, that we should
> rely on this instead of checking the len explicitly? What do you
> think?

Cross page reads are not allowed by the netlink API. SFPs do funny
things when you do a cross page read. I forget the details, but it is
something like it wraps around within the same 1/2 page. However, SFPs
vendors like to ignore the standard, don't feel they need to conform
to it, and i would not be surprised if some do continue the read in
the next page because that is simpler, or they could not be bothered
to read the standard in detail. By defining the API to not allow cross
page reads, or even cross 1/2 page reads, we avoid this mess. This
validation should happen at a higher level, where the netlink
parameters are validated for the call.

You can probably test this:

       ethtool -m|--dump-module-eeprom|--module-info devname [raw on|off]
              [hex on|off] [offset N] [length N]

You probably need hex on, and then can then set offset and length and
see if the validation works.

    Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ