lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4c21a919-37df-52ab-a298-9ccdb715fc11@gmail.com>
Date:   Tue, 3 May 2022 22:37:07 +0900
From:   Taehee Yoo <ap420073@...il.com>
To:     davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
        edumazet@...gle.com, netdev@...r.kernel.org,
        ecree.xilinx@...il.com, habetsm.xilinx@...il.com
Subject: Re: [PATCH net] net: sfc: fix memory leak due to ptp channel

On 5/3/22 02:02, Taehee Yoo wrote:
 > It fixes memory leak in ring buffer change logic.
 >
 > When ring buffer size is changed(ethtool -G eth0 rx 4096), sfc driver
 > works like below.
 > 1. stop all channels and remove ring buffers.
 > 2. allocates new buffer array.
 > 3. allocates rx buffers.
 > 4. start channels.
 >
 > While the above steps are working, it skips some steps if the channel
 > doesn't have a ->copy callback function.
 > Due to ptp channel doesn't have ->copy callback, these above steps are
 > skipped for ptp channel.
 > It eventually makes some problems.
 > a. ptp channel's ring buffer size is not changed, it works only
 >     1024(default).
 > b. memory leak.
 >
 > The reason for memory leak is to use the wrong ring buffer values.
 > There are some values, they are related to ring buffer size.
 > a. efx->rxq_entries
 >   - This is a global value of rx queue size.
 > b. rx_queue->ptr_mask
 >   - used for access ring buffer as circular ring.
 >   - roundup_pow_of_two(efx->rxq_entries) - 1
 > c. rx_queue->max_fill
 >   - efx->rxq_entries - EFX_RXD_HEAD_ROOM
 >
 > These all values should be based on ring buffer size consistently.
 > But ptp channel's values are not.
 > a. efx->rxq_entries
 >   - This is global(for sfc) value, always new ring buffer size.
 > b. rx_queue->ptr_mask
 >   - This is always 1023(default).
 > c. rx_queue->max_fill
 >   - This is new ring buffer size - EFX_RXD_HEAD_ROOM.
 >
 > Let's assume we set 4096 for rx ring buffer,
 >
 >                        normal channel     ptp channel
 > efx->rxq_entries      4096               4096
 > rx_queue->ptr_mask    4095               1023
 > rx_queue->max_fill    4086               4086
 >
 > sfc driver allocates rx ring buffers based on these values.
 > When it allocates ptp channel's ring buffer, 4086 ring buffers are
 > allocated then, these buffers are attached to the allocated array.
 > But ptp channel's ring buffer array size is still 1024(default)
 > and ptr_mask is still 1023 too.
 > So, about 3K ring buffers will be overwritten to the array.
 > This is the reason for memory leak.
 >
 > Test commands:
 >     ethtool -G <interface name> rx 4096
 >     while :
 >     do
 >         ip link set <interface name> up
 >         ip link set <interface name> down
 >     done
 >
 > In order to avoid this problem, it adds ->copy callback to ptp channel
 > type.
 > So that rx_queue->ptr_mask value will be updated correctly.
 >
 > Fixes: 7c236c43b838 ("sfc: Add support for IEEE-1588 PTP")
 > Signed-off-by: Taehee Yoo <ap420073@...il.com>
 > ---
 >   drivers/net/ethernet/sfc/efx_channels.c |  7 ++++++-
 >   drivers/net/ethernet/sfc/ptp.c          | 13 ++++++++++++-
 >   drivers/net/ethernet/sfc/ptp.h          |  1 +
 >   3 files changed, 19 insertions(+), 2 deletions(-)
 >
 > diff --git a/drivers/net/ethernet/sfc/efx_channels.c 
b/drivers/net/ethernet/sfc/efx_channels.c
 > index 377df8b7f015..40df910aa140 100644
 > --- a/drivers/net/ethernet/sfc/efx_channels.c
 > +++ b/drivers/net/ethernet/sfc/efx_channels.c
 > @@ -867,7 +867,9 @@ static void efx_set_xdp_channels(struct efx_nic *efx)
 >
 >   int efx_realloc_channels(struct efx_nic *efx, u32 rxq_entries, u32 
txq_entries)
 >   {
 > -	struct efx_channel *other_channel[EFX_MAX_CHANNELS], *channel;
 > +	struct efx_channel *other_channel[EFX_MAX_CHANNELS], *channel,
 > +			   *ptp_channel = efx_ptp_channel(efx);
 > +	struct efx_ptp_data *ptp_data = efx->ptp_data;
 >   	unsigned int i, next_buffer_table = 0;
 >   	u32 old_rxq_entries, old_txq_entries;
 >   	int rc, rc2;
 > @@ -938,6 +940,7 @@ int efx_realloc_channels(struct efx_nic *efx, u32 
rxq_entries, u32 txq_entries)
 >
 >   	efx_set_xdp_channels(efx);
 >   out:
 > +	efx->ptp_data = NULL;
 >   	/* Destroy unused channel structures */
 >   	for (i = 0; i < efx->n_channels; i++) {
 >   		channel = other_channel[i];
 > @@ -948,6 +951,7 @@ int efx_realloc_channels(struct efx_nic *efx, u32 
rxq_entries, u32 txq_entries)
 >   		}
 >   	}
 >
 > +	efx->ptp_data = ptp_data;
 >   	rc2 = efx_soft_enable_interrupts(efx);
 >   	if (rc2) {
 >   		rc = rc ? rc : rc2;
 > @@ -966,6 +970,7 @@ int efx_realloc_channels(struct efx_nic *efx, u32 
rxq_entries, u32 txq_entries)
 >   	efx->txq_entries = old_txq_entries;
 >   	for (i = 0; i < efx->n_channels; i++)
 >   		swap(efx->channel[i], other_channel[i]);
 > +	efx_ptp_update_channel(efx, ptp_channel);
 >   	goto out;
 >   }
 >
 > diff --git a/drivers/net/ethernet/sfc/ptp.c 
b/drivers/net/ethernet/sfc/ptp.c
 > index f0ef515e2ade..2cf714f92655 100644
 > --- a/drivers/net/ethernet/sfc/ptp.c
 > +++ b/drivers/net/ethernet/sfc/ptp.c
 > @@ -45,6 +45,7 @@
 >   #include "farch_regs.h"
 >   #include "tx.h"
 >   #include "nic.h" /* indirectly includes ptp.h */
 > +#include "efx_channels.h"
 >
 >   /* Maximum number of events expected to make up a PTP event */
 >   #define	MAX_EVENT_FRAGS			3
 > @@ -541,6 +542,11 @@ struct efx_channel *efx_ptp_channel(struct 
efx_nic *efx)
 >   	return efx->ptp_data ? efx->ptp_data->channel : NULL;
 >   }
 >
 > +void efx_ptp_update_channel(struct efx_nic *efx, struct efx_channel 
*channel)
 > +{
 > +	efx->ptp_data->channel = channel;

I missed that efx->ptp_data can be a NULL.
If so, it makes the kernel panic.
So I will send a v2 patch tomorrow after some tests.

Thanks,
Taehee Yoo

 > +}
 > +
 >   static u32 last_sync_timestamp_major(struct efx_nic *efx)
 >   {
 >   	struct efx_channel *channel = efx_ptp_channel(efx);
 > @@ -1443,6 +1449,11 @@ int efx_ptp_probe(struct efx_nic *efx, struct 
efx_channel *channel)
 >   	int rc = 0;
 >   	unsigned int pos;
 >
 > +	if (efx->ptp_data) {
 > +		efx->ptp_data->channel = channel;
 > +		return 0;
 > +	}
 > +
 >   	ptp = kzalloc(sizeof(struct efx_ptp_data), GFP_KERNEL);
 >   	efx->ptp_data = ptp;
 >   	if (!efx->ptp_data)
 > @@ -2176,7 +2187,7 @@ static const struct efx_channel_type 
efx_ptp_channel_type = {
 >   	.pre_probe		= efx_ptp_probe_channel,
 >   	.post_remove		= efx_ptp_remove_channel,
 >   	.get_name		= efx_ptp_get_channel_name,
 > -	/* no copy operation; there is no need to reallocate this channel */
 > +	.copy                   = efx_copy_channel,
 >   	.receive_skb		= efx_ptp_rx,
 >   	.want_txqs		= efx_ptp_want_txqs,
 >   	.keep_eventq		= false,
 > diff --git a/drivers/net/ethernet/sfc/ptp.h 
b/drivers/net/ethernet/sfc/ptp.h
 > index 9855e8c9e544..7b1ef7002b3f 100644
 > --- a/drivers/net/ethernet/sfc/ptp.h
 > +++ b/drivers/net/ethernet/sfc/ptp.h
 > @@ -16,6 +16,7 @@ struct ethtool_ts_info;
 >   int efx_ptp_probe(struct efx_nic *efx, struct efx_channel *channel);
 >   void efx_ptp_defer_probe_with_channel(struct efx_nic *efx);
 >   struct efx_channel *efx_ptp_channel(struct efx_nic *efx);
 > +void efx_ptp_update_channel(struct efx_nic *efx, struct efx_channel 
*channel);
 >   void efx_ptp_remove(struct efx_nic *efx);
 >   int efx_ptp_set_ts_config(struct efx_nic *efx, struct ifreq *ifr);
 >   int efx_ptp_get_ts_config(struct efx_nic *efx, struct ifreq *ifr);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ