[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20241107-packing-pack-fields-and-ice-implementation-v3-0-27c566ac2436@intel.com>
Date: Thu, 07 Nov 2024 11:50:31 -0800
From: Jacob Keller <jacob.e.keller@...el.com>
To: Vladimir Oltean <olteanv@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Tony Nguyen <anthony.l.nguyen@...el.com>,
Przemek Kitszel <przemyslaw.kitszel@...el.com>,
Masahiro Yamada <masahiroy@...nel.org>, netdev <netdev@...r.kernel.org>
Cc: linux-kbuild@...r.kernel.org, Jacob Keller <jacob.e.keller@...el.com>,
Vladimir Oltean <vladimir.oltean@....com>
Subject: [PATCH net-next v3 0/9] lib: packing: introduce and use
(un)pack_fields
This series improves the packing library with a new API for packing or
unpacking a large number of fields at once with minimal code footprint. The
API is then used to replace bespoke packing logic in the ice driver,
preparing it to handle unpacking in the future. Finally, the ice driver has
a few other cleanups related to the packing logic.
The pack_fields and unpack_fields functions have the following improvements
over the existing pack() and unpack() API:
1. Packing or unpacking a large number of fields takes significantly less
code. This significantly reduces the .text size for an increase in the
.data size which is much smaller.
2. The unpacked data can be stored in sizes smaller than u64 variables.
This reduces the storage requirement both for runtime data structures,
and for the rodata defining the fields. This scales with the number of
fields used.
3. Most of the error checking is done at compile time, rather than
runtime, via additional checks added to modpost. This saves wasted
computation time *and* catches errors in the field definitions early,
rather than only after the offending code is executed.
The actual packing and unpacking code still uses the u64 size
variables. However, these are converted to the appropriate field sizes when
storing or reading the data from the buffer.
The compile time checks do add some complexity to modpost. An attempt was
made originally to implement the checks with C macros. Unfortunately, the C
pre-processor cannot generate loops. Without loops, drivers are responsible
for calling the correct CHECK macro. In addition, this required
several thousand lines of macro to provide variants for the sizes expected
to be used. While this was possible to be generated, it still meant a lot
of mess spreading to the drivers, Makefile, and build process.
To enable modpost checks, the packed field arrays are placed into a special
section of the binary, using DECLARE_PACKED_FIELDS_(S|M) macros. This makes
it relatively easy for drivers to enable. We can also easily implement
checkpatch or cocinelle scripts to detect raw usage of the packed fields
structures if this becomes a problem.
The modpost logic is implemented by scanning the symbols to find entries
in the relevant ".rodata.packed_field_s" or ".rodata.packed_field_m"
sections. This must be extended if we ever add support for a
"packed_field_l" or similar, though this seems unlikely.
In order to enforce buffer size, modpost must have access to the size of
the target buffer. This is achieved by having DECLARE_PACKED_FIELDS* also
generate a const variable with the buffer size. I tried a couple of other
methods of obtaining the size, but this was the simplest one. I think it
may be possible to have the unused symbols discarded, but I did not attempt
this.
I'm in favor of this modpost implementation vs the macros attempted in
previous iterations.
Pros:
* Significantly less code to implement the checks, even if we ignore the
generated portions in <generated/packing-checks.h>
* The modpost checks are able to work in arbitrarily sized arrays, without
needing to add any modification in the future. The macro implementation
could require adding additional macros if a driver ever needed >50
fields.
* Checks are relatively self contained to modpost. We don't need to edit
the top level build Kbuild, we don't need to add 50 extra config
options, and we don't need to leave behind a 20K+ LOC generated header
file.
* The fields are required to be fully ordered, by detecting whether the
first two fields indicate ascending or descending order. This enables
driver authors to order the fields in which ever ordering is most
natural for their data. The macro checks did a full overlap check but
did not enforce order.
* Drivers simply DECLARE_PACKED_FIELDS_(S|M) and get modpost warnings,
without requiring a manual call to CHECK_PACKED_FIELDS*. This should
make it easier to review, and less likely that checks are skipped.
Cons:
* modpost doesn't seem to have a way to track the symbols back to line
numbers, so we can't report that info to the user.
* modpost errors are reported slightly later in the compile process.
* To maintain the full set of checks, we need to export the size of the
target buffer to modpost, which I implemented with an additional
variable that is otherwise entirely unused.
* We have to place the packed field arrays in special sections to enable
modpost. This is handled by DECLARE macros, so we have to ensure all
users use these macros. I did not attempt to add a check for that in
this series, but it could be done in principle with cocinelle or
checkpatch.
* A fair amount of boilerplate code is required to extract the relevant
data from the symbol definitions, including special casing to handle the
potential endian issues when using packed_field_m.
* These type of checks do feel somewhat ancillary to the original purpose
of modpost.
The fact that the mess of checking the fields is fairly self contained and
avoids the mess of the CHECK_* and config options is a big win, and I think
the modpost option is significantly better for this.
Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
---
Changes in v3:
- Replace macro-based C pre-processor checks with checks implemented in
modpost.
- Move structure definitions into <linux/packing_types.h> to enable reuse
within modpost.
- Add DECLARE_PACKED_FIELDS_S and DECLARE_PACKED_FIELDS_M to enable
automatically generating the buffer size constants and the section
attributes.
- Add additional unit tests for the pack_fields and unpack_fields APIs.
- Update documentation with an explanation of the new API as well as some
example code.
- Link to v2: https://lore.kernel.org/r/20241025-packing-pack-fields-and-ice-implementation-v2-0-734776c88e40@intel.com
Changes in v2:
- Add my missing sign-off to the first patch
- Update the descriptions for a few patches
- Only generate CHECK_PACKED_FIELDS_N when another module selects it
- Add a new patch introducing wrapper structures for the packed Tx and Rx
queue context, suggested by Vladimir.
- Drop the now unnecessary macros in ice, thanks to the new types
- Link to v1: https://lore.kernel.org/r/20241011-packing-pack-fields-and-ice-implementation-v1-0-d9b1f7500740@intel.com
---
Jacob Keller (6):
ice: remove int_q_state from ice_tlan_ctx
ice: use structures to keep track of queue context size
ice: use <linux/packing.h> for Tx and Rx queue context data
ice: reduce size of queue context fields
ice: move prefetch enable to ice_setup_rx_ctx
ice: cleanup Rx queue context programming functions
Vladimir Oltean (3):
lib: packing: create __pack() and __unpack() variants without error checking
lib: packing: demote truncation error in pack() to a warning in __pack()
lib: packing: add pack_fields() and unpack_fields()
drivers/net/ethernet/intel/ice/ice_adminq_cmd.h | 11 +-
drivers/net/ethernet/intel/ice/ice_common.h | 5 +-
drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h | 49 +---
include/linux/packing.h | 29 +++
include/linux/packing_types.h | 54 +++++
scripts/mod/modpost.h | 19 ++
drivers/net/dsa/sja1105/sja1105_static_config.c | 8 +-
drivers/net/ethernet/intel/ice/ice_base.c | 6 +-
drivers/net/ethernet/intel/ice/ice_common.c | 297 +++++-------------------
lib/packing.c | 285 +++++++++++++++++------
lib/packing_test.c | 61 +++++
scripts/mod/modpost.c | 18 +-
scripts/mod/packed_fields.c | 294 +++++++++++++++++++++++
Documentation/core-api/packing.rst | 55 +++++
MAINTAINERS | 2 +
drivers/net/ethernet/intel/Kconfig | 1 +
scripts/mod/Makefile | 4 +-
17 files changed, 828 insertions(+), 370 deletions(-)
---
base-commit: a84e8c05f58305dfa808bc5465c5175c29d7c9b6
change-id: 20241004-packing-pack-fields-and-ice-implementation-b17c7ce8e373
Best regards,
--
Jacob Keller <jacob.e.keller@...el.com>
Powered by blists - more mailing lists