lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fdf7a8c3-3137-4090-ba41-a7b84e3a695a@intel.com>
Date: Tue, 30 Jan 2024 08:54:15 +0200
From: Adrian Hunter <adrian.hunter@...el.com>
To: Shengyu Qu <wiagn233@...look.com>, ulf.hansson@...aro.org,
 linux-mmc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] mmc: sdhci: fix max req size based on spec

On 28/01/24 12:01, Shengyu Qu wrote:
> For almost 2 decades, the max allowed requests were limited to 512KB
> because of SDMA's max 512KiB boundary limit.

It is not limited by SDMA.  It is limited by choice.

> 
> ADMA2 and ADMA3 do not have such limit and were effectively made so any
> kind of block count would not impose interrupt and managing stress to the
> host.

The main benefit of ADMA is that it provides scatter/gather and so does
not need a bounce buffer.

> 
> By limiting that to 512KiB, it effectively downgrades these DMA modes to
> SDMA.

Not really.

> 
> Fix that by actually following the spec:
> When ADMA is selected tuning mode is advised. On lesser modes, 4MiB
> transfer is selected as max, so re-tuning if timer trigger or if requested
> by host interrupt, can be done in time. Otherwise, the only limit is the
> variable size of types used. In this implementation, 16MiB is used as
> maximum since tests showed that after that point, there are diminishing
> returns.
> 
> Also 16MiB in worst case scenarios, when card is eMMC and its max speed is
> a generous 350MiB/s, will generate interrupts every 45ms on huge data
> transfers.
> 
> A new `adma_get_req_limit` sdhci host function was also introduced, to let
> vendors override imposed limits by the generic implementation if needed.

Not in this patch?

> 
> For example, on local tests with rigorous CPU/GPU burn-in tests and abrupt
> cut-offs to generate huge temperature changes (upwards/downwards) to the
> card, tested host was fine up to 128MB/s transfers on slow cards that used
> SDR104 bus timing without re-tuning.
> In that case the 4MiB limit was overridden with a more than safe 8MiB
> value.
> 
> In all testing cases and boards, that change brought the following:

"all testing cases and boards" doesn't mean much to anyone else. You
need to be more explicit.

> 
> Depending on bus timing and eMMC/SD specs:
> * Max Read throughput increased by 2-20%
> * Max Write throughput increased by 50-200%
> Depending on CPU frequency and transfer sizes:
> * Reduced mmcqd cpu core usage by 4-50%

The main issue with increasing the request size is that it introduces much
more latency for synchronous reads e.g. a synchronous read may have to wait
for a large write operation.  Generally, that is probably a show-stopper
for unconditionally increasing the maximum request size.

> 
> Above commit message comes from original author whose id is CTCaer, with
> SoB email address ctcaer@...il.com. I tried to contact with the author 1
> month ago to ask for sending it to mainline or get the authority to submit
> by myself, but I didn't get any reply, so I decided to send this patch by
> myself. Original commit is here[1].

Ok, so it is not your patch and the original author is out of touch.

Is there a particular reason you wanted this patch?

> 
> The author also has a patch[2] applied after this patch, which overrides
> adma size on tegra device from device tree property. But I don't have a
> tegra device to actually test that, so it is not sent, and
> adma_get_req_limit part is not included in this version of patch.

Does that mean you haven't tested this patch yourself at all?

> 
> [1]: https://github.com/CTCaer/switch-l4t-kernel-4.9/commit/fa86ebbd56d30b3b6af26e1d1c3f9c411a47e98e
> [2]: https://github.com/CTCaer/switch-l4t-kernel-4.9/commit/385f9335b9a60ce471ac3291f202b1326212be3e
> 
> Signed-off-by: Shengyu Qu <wiagn233@...look.com>
> ---
>  drivers/mmc/host/sdhci.c | 17 ++++++++++++-----
>  drivers/mmc/host/sdhci.h |  4 ++--
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index c79f73459915..f546b675c7b9 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -1081,7 +1081,7 @@ static void sdhci_initialize_data(struct sdhci_host *host,
>  	WARN_ON(host->data);
>  
>  	/* Sanity checks */
> -	BUG_ON(data->blksz * data->blocks > 524288);
> +	BUG_ON(data->blksz * data->blocks > host->mmc->max_req_size);
>  	BUG_ON(data->blksz > host->mmc->max_blk_size);
>  	BUG_ON(data->blocks > 65535);
>  
> @@ -4690,11 +4690,18 @@ int sdhci_setup_host(struct sdhci_host *host)
>  
>  	/*
>  	 * Maximum number of sectors in one transfer. Limited by SDMA boundary
> -	 * size (512KiB). Note some tuning modes impose a 4MiB limit, but this
> -	 * is less anyway.
> +	 * size and by tuning modes on ADMA. On tuning mode 3 16MiB is more than
> +	 * enough to cover big data transfers.
>  	 */
> -	mmc->max_req_size = 524288;
> -
> +	if (host->flags & SDHCI_USE_ADMA) {
> +		if (host->tuning_mode != SDHCI_TUNING_MODE_3)
> +			mmc->max_req_size = SZ_4M;
> +		else
> +			mmc->max_req_size = SZ_16M;
> +	} else {
> +		/* On PIO/SDMA use SDMA boundary size (512KiB). */
> +		mmc->max_req_size = SZ_512K;
> +	}
>  	/*
>  	 * Maximum number of segments. Depends on if the hardware
>  	 * can do scatter/gather or not.
> diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
> index a20864fc0641..98252c427feb 100644
> --- a/drivers/mmc/host/sdhci.h
> +++ b/drivers/mmc/host/sdhci.h
> @@ -346,11 +346,11 @@ struct sdhci_adma2_64_desc {
>  #define ADMA2_END		0x2
>  
>  /*
> - * Maximum segments assuming a 512KiB maximum requisition size and a minimum
> + * Maximum segments assuming a 16MiB maximum requisition size and a minimum
>   * 4KiB page size. Note this also allows enough for multiple descriptors in
>   * case of PAGE_SIZE >= 64KiB.
>   */
> -#define SDHCI_MAX_SEGS		128
> +#define SDHCI_MAX_SEGS		4096
>  
>  /* Allow for a command request and a data request at the same time */
>  #define SDHCI_MAX_MRQS		2


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ