Message-ID: <53c8b505-f992-4c2e-b2c0-616152b447c3@lunn.ch>
Date: Tue, 12 Nov 2024 23:26:49 +0100
From: Andrew Lunn <andrew@...n.ch>
To: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
Cc: Divya Koppera <divya.koppera@...rochip.com>,
arun.ramadoss@...rochip.com, UNGLinuxDriver@...rochip.com,
hkallweit1@...il.com, linux@...linux.org.uk, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
richardcochran@...il.com
Subject: Re: [PATCH net-next v3 1/5] net: phy: microchip_ptp : Add header
file for Microchip ptp library
> I believe, the current design of mchp_ptp_clock has some issues:
>
> struct mchp_ptp_clock {
> struct mii_timestamper mii_ts; /* 0 48 */
> struct phy_device * phydev; /* 48 8 */
> struct sk_buff_head tx_queue; /* 56 24 */
> /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
> struct sk_buff_head rx_queue; /* 80 24 */
> struct list_head rx_ts_list; /* 104 16 */
> spinlock_t rx_ts_lock; /* 120 4 */
> int hwts_tx_type; /* 124 4 */
> /* --- cacheline 2 boundary (128 bytes) --- */
> enum hwtstamp_rx_filters rx_filter; /* 128 4 */
> int layer; /* 132 4 */
> int version; /* 136 4 */
>
> /* XXX 4 bytes hole, try to pack */
>
> struct ptp_clock * ptp_clock; /* 144 8 */
> struct ptp_clock_info caps; /* 152 184 */
> /* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
> struct mutex ptp_lock; /* 336 32 */
> u16 port_base_addr; /* 368 2 */
> u16 clk_base_addr; /* 370 2 */
> u8 mmd; /* 372 1 */
>
> /* size: 376, cachelines: 6, members: 16 */
> /* sum members: 369, holes: 1, sum holes: 4 */
> /* padding: 3 */
> /* last cacheline: 56 bytes */
> };
>
> tx_queue will be split across 2 cache lines and will have its spinlock on
> the cache line next to `struct sk_buff * next`. That means 2 cache lines
> will have to be fetched to access it, which may lead to performance
> issues.
>
> Another issue is that the locks in tx_queue and rx_queue, and rx_ts_lock,
> share the same cache line, which again can cause performance issues on
> systems which can potentially have several rx/tx queues/irqs.
>
> It would be great to try to reorder the struct a bit.
Dumb question: How much of this is in the hot path? If this is only
used for a couple of PTP packets per second, do we care about a couple
of cache misses per second? Or will every single packet the PHY
processes be affected by this?
Andrew