[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID:
<TY3P286MB2611D07641D842BD373FD3A8987F2@TY3P286MB2611.JPNP286.PROD.OUTLOOK.COM>
Date: Sun, 28 Jan 2024 18:01:24 +0800
From: Shengyu Qu <wiagn233@...look.com>
To: adrian.hunter@...el.com,
ulf.hansson@...aro.org,
linux-mmc@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Shengyu Qu <wiagn233@...look.com>
Subject: [PATCH v1] mmc: sdhci: fix max req size based on spec
For almost 2 decades, the max allowed requests were limited to 512KB
because of SDMA's max 512KiB boundary limit.
ADMA2 and ADMA3 do not have such limit and were effectively made so any
kind of block count would not impose interrupt and managing stress to the
host.
By limiting that to 512KiB, it effectively downgrades these DMA modes to
SDMA.
Fix that by actually following the spec:
When ADMA is selected tuning mode is advised. On lesser modes, 4MiB
transfer is selected as max, so re-tuning if timer trigger or if requested
by host interrupt, can be done in time. Otherwise, the only limit is the
variable size of types used. In this implementation, 16MiB is used as
maximum since tests showed that after that point, there are diminishing
returns.
Also 16MiB in worst case scenarios, when card is eMMC and its max speed is
a generous 350MiB/s, will generate interrupts every 45ms on huge data
transfers.
A new `adma_get_req_limit` sdhci host function was also introduced, to let
vendors override imposed limits by the generic implementation if needed.
For example, on local tests with rigorous CPU/GPU burn-in tests and abrupt
cut-offs to generate huge temperature changes (upwards/downwards) to the
card, tested host was fine up to 128MB/s transfers on slow cards that used
SDR104 bus timing without re-tuning.
In that case the 4MiB limit was overridden with a more than safe 8MiB
value.
In all testing cases and boards, that change brought the following:
Depending on bus timing and eMMC/SD specs:
* Max Read throughput increased by 2-20%
* Max Write throughput increased by 50-200%
Depending on CPU frequency and transfer sizes:
* Reduced mmcqd cpu core usage by 4-50%
Above commit message comes from original author whose id is CTCaer, with
SoB email address ctcaer@...il.com. I tried to contact with the author 1
month ago to ask for sending it to mainline or get the authority to submit
by myself, but I didn't get any reply, so I decided to send this patch by
myself. Original commit is here[1].
The author also has a patch[2] applied after this patch, which overrides
adma size on tegra device from device tree property. But I don't have a
tegra device to actually test that, so it is not sent, and
adma_get_req_limit part is not included in this version of patch.
[1]: https://github.com/CTCaer/switch-l4t-kernel-4.9/commit/fa86ebbd56d30b3b6af26e1d1c3f9c411a47e98e
[2]: https://github.com/CTCaer/switch-l4t-kernel-4.9/commit/385f9335b9a60ce471ac3291f202b1326212be3e
Signed-off-by: Shengyu Qu <wiagn233@...look.com>
---
drivers/mmc/host/sdhci.c | 17 ++++++++++++-----
drivers/mmc/host/sdhci.h | 4 ++--
2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index c79f73459915..f546b675c7b9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1081,7 +1081,7 @@ static void sdhci_initialize_data(struct sdhci_host *host,
WARN_ON(host->data);
/* Sanity checks */
- BUG_ON(data->blksz * data->blocks > 524288);
+ BUG_ON(data->blksz * data->blocks > host->mmc->max_req_size);
BUG_ON(data->blksz > host->mmc->max_blk_size);
BUG_ON(data->blocks > 65535);
@@ -4690,11 +4690,18 @@ int sdhci_setup_host(struct sdhci_host *host)
/*
* Maximum number of sectors in one transfer. Limited by SDMA boundary
- * size (512KiB). Note some tuning modes impose a 4MiB limit, but this
- * is less anyway.
+ * size and by tuning modes on ADMA. On tuning mode 3 16MiB is more than
+ * enough to cover big data transfers.
*/
- mmc->max_req_size = 524288;
-
+ if (host->flags & SDHCI_USE_ADMA) {
+ if (host->tuning_mode != SDHCI_TUNING_MODE_3)
+ mmc->max_req_size = SZ_4M;
+ else
+ mmc->max_req_size = SZ_16M;
+ } else {
+ /* On PIO/SDMA use SDMA boundary size (512KiB). */
+ mmc->max_req_size = SZ_512K;
+ }
/*
* Maximum number of segments. Depends on if the hardware
* can do scatter/gather or not.
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index a20864fc0641..98252c427feb 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -346,11 +346,11 @@ struct sdhci_adma2_64_desc {
#define ADMA2_END 0x2
/*
- * Maximum segments assuming a 512KiB maximum requisition size and a minimum
+ * Maximum segments assuming a 16MiB maximum requisition size and a minimum
* 4KiB page size. Note this also allows enough for multiple descriptors in
* case of PAGE_SIZE >= 64KiB.
*/
-#define SDHCI_MAX_SEGS 128
+#define SDHCI_MAX_SEGS 4096
/* Allow for a command request and a data request at the same time */
#define SDHCI_MAX_MRQS 2
--
2.39.2
Powered by blists - more mailing lists