Message-ID: <6984ee8c-7848-46f8-a421-1b31d8c2231a@amd.com>
Date: Thu, 18 Apr 2024 10:14:57 +0530
From: Kishon Vijay Abraham I <kvijayab@....com>
To: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
Cc: Jingoo Han <jingoohan1@...il.com>,
Gustavo Pimentel <gustavo.pimentel@...opsys.com>,
Lorenzo Pieralisi <lpieralisi@...nel.org>,
Krzysztof Wilczyński <kw@...ux.com>,
Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
Kishon Vijay Abraham I <kishon@...nel.org>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI: endpoint: Migrate to Genalloc framework for outbound
window memory allocation
Hi Mani,
On 4/14/2024 6:30 PM, Manivannan Sadhasivam wrote:
> On Wed, Mar 20, 2024 at 04:59:28PM +0530, Manivannan Sadhasivam wrote:
>> On Wed, Mar 20, 2024 at 03:26:41PM +0530, Kishon Vijay Abraham I wrote:
>>> Hi Mani,
>>>
>>> On 3/17/2024 11:39 AM, Manivannan Sadhasivam wrote:
>>>> As proposed during last year's 'PCI Endpoint Subsystem Open Items
>>>> Discussion' at the Linux Plumbers Conference [1], let's migrate to the
>>>> Genalloc framework for managing the endpoint outbound window memory
>>>> allocation.
>>>>
>>>> The PCI Endpoint subsystem has been using a custom memory allocator in
>>>> the pci-epc-mem driver from the start to manage the memory required to
>>>> map the host address space (outbound) in the endpoint. Even though it
>>>> works well, it completely defeats the purpose of the 'Genalloc
>>>> framework', a general purpose memory allocator framework created to
>>>> avoid various custom memory allocators in the kernel.
>>>>
>>>> The migration to the Genalloc framework is done in such a way that the
>>>> existing API semantics are preserved, so the callers of the EPC mem APIs
>>>> do not need any modification (apart from the pcie-designware-epc driver
>>>> that queries the page size).
>>>>
>>>> Internally, the EPC mem driver now uses the Genalloc framework's
>>>> 'gen_pool_first_fit_order_align' algorithm, which aligns the allocated
>>>> memory based on the requested size, just like the previous allocator.
>>>> The page size passed to the pci_epc_mem_init() API is used as the
>>>> minimum order for the memory allocations.
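>>>>
>>>> For instance, the allocation path with this setup reduces to roughly
>>>> the following (a sketch, not the exact patch code; error handling
>>>> omitted):
>>>>
>>>> 	/* Returns a size-aligned virtual address from the window's pool */
>>>> 	vaddr = gen_pool_alloc(window->pool, size);
>>>>
>>>> 	/* Translate back to the physical address for programming the ATU */
>>>> 	phys = gen_pool_virt_to_phys(window->pool, vaddr);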
>>>>
>>>> During the migration, 'struct pci_epc_mem' is removed as it seems
>>>> redundant, and the existing 'struct pci_epc_mem_window' in 'struct
>>>> pci_epc' is now used to hold the address windows of the endpoint
>>>> controller.
>>>>
>>>> [1] https://lpc.events/event/17/contributions/1419/
>>>
>>> Thank you for working on this!
>>>>
>>>> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
>>>> ---
>>>> drivers/pci/controller/dwc/pcie-designware-ep.c | 14 +-
>>>> drivers/pci/endpoint/pci-epc-mem.c | 182 +++++++++---------------
>>>> include/linux/pci-epc.h | 25 +---
>>>> 3 files changed, 81 insertions(+), 140 deletions(-)
>>>>
>>> [...]
>>>> diff --git a/drivers/pci/endpoint/pci-epc-mem.c b/drivers/pci/endpoint/pci-epc-mem.c
>>>> index a9c028f58da1..f9e6e1a6aeaa 100644
>>>> --- a/drivers/pci/endpoint/pci-epc-mem.c
>>>> +++ b/drivers/pci/endpoint/pci-epc-mem.c
>>>> @@ -4,37 +4,18 @@
>>>> *
>>>> * Copyright (C) 2017 Texas Instruments
>>>> * Author: Kishon Vijay Abraham I <kishon@...com>
>>>> + *
>>>> + * Copyright (C) 2024 Linaro Ltd.
>>>> + * Author: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
>>>> */
>>>> +#include <linux/genalloc.h>
>>>> #include <linux/io.h>
>>>> #include <linux/module.h>
>>>> #include <linux/slab.h>
>>>> #include <linux/pci-epc.h>
>>>> -/**
>>>> - * pci_epc_mem_get_order() - determine the allocation order of a memory size
>>>> - * @mem: address space of the endpoint controller
>>>> - * @size: the size for which to get the order
>>>> - *
>>>> - * Reimplement get_order() for mem->page_size since the generic get_order
>>>> - * always gets order with a constant PAGE_SIZE.
>>>> - */
>>>> -static int pci_epc_mem_get_order(struct pci_epc_mem *mem, size_t size)
>>>> -{
>>>> - int order;
>>>> - unsigned int page_shift = ilog2(mem->window.page_size);
>>>> -
>>>> - size--;
>>>> - size >>= page_shift;
>>>> -#if BITS_PER_LONG == 32
>>>> - order = fls(size);
>>>> -#else
>>>> - order = fls64(size);
>>>> -#endif
>>>> - return order;
>>>> -}
>>>> -
>>>> /**
>>>> * pci_epc_multi_mem_init() - initialize the pci_epc_mem structure
>>>> * @epc: the EPC device that invoked pci_epc_mem_init
>>>> @@ -48,17 +29,11 @@ int pci_epc_multi_mem_init(struct pci_epc *epc,
>>>> struct pci_epc_mem_window *windows,
>>>> unsigned int num_windows)
>>>> {
>>>> - struct pci_epc_mem *mem = NULL;
>>>> - unsigned long *bitmap = NULL;
>>>> - unsigned int page_shift;
>>>> + struct pci_epc_mem_window *window = NULL;
>>>> size_t page_size;
>>>> - int bitmap_size;
>>>> - int pages;
>>>> int ret;
>>>> int i;
>>>> - epc->num_windows = 0;
>>>> -
>>>> if (!windows || !num_windows)
>>>> return -EINVAL;
>>>> @@ -70,45 +45,51 @@ int pci_epc_multi_mem_init(struct pci_epc *epc,
>>>> page_size = windows[i].page_size;
>>>> if (page_size < PAGE_SIZE)
>>>> page_size = PAGE_SIZE;
>>>> - page_shift = ilog2(page_size);
>>>> - pages = windows[i].size >> page_shift;
>>>> - bitmap_size = BITS_TO_LONGS(pages) * sizeof(long);
>>>> - mem = kzalloc(sizeof(*mem), GFP_KERNEL);
>>>> - if (!mem) {
>>>> + windows[i].pool = gen_pool_create(ilog2(page_size), -1);
>>>> + if (!windows[i].pool) {
>>>> ret = -ENOMEM;
>>>> - i--;
>>>> - goto err_mem;
>>>> + goto err_free_mem;
>>>> + }
>>>> +
>>>> + gen_pool_set_algo(windows[i].pool, gen_pool_first_fit_order_align,
>>>> + NULL);
>>>> +
>>>> + windows[i].virt_base = ioremap(windows[i].phys_base, windows[i].size);
>>>
>>> Do you have to ioremap the entire window upfront? This could be a
>>> problem on 32-bit systems, which have limited vmalloc space. I have
>>> faced issues before when trying to map the entire memory window and had
>>> to tweak the 'vmalloc' boot parameter.
>>>
>>> I'd prefer we find a way to do the ioremap per allocation, as before.
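>>>
>>> For context, the current allocator maps only what it hands out, roughly
>>> along these lines (simplified from pci_epc_mem_alloc_addr(), error
>>> paths trimmed):
>>>
>>> 	pageno = bitmap_find_free_region(mem->bitmap, mem->pages, order);
>>> 	phys_addr = mem->window.phys_base + ((phys_addr_t)pageno << page_shift);
>>> 	virt_addr = ioremap(phys_addr, size);
>>> 	if (!virt_addr)
>>> 		bitmap_release_region(mem->bitmap, pageno, order);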
>>>
>>
>> Hmm, thanks for pointing it out. The current genalloc implementation
>> works on virtual addresses as opposed to physical addresses (perhaps
>> because the majority of its users manage virtual addresses only). That's
>> the reason I did the ioremap of the entire window upfront.
>>
>> Let me see if we can somehow avoid this.
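>>
>> One option might be to have the pool track physical addresses and map on
>> demand, along these lines (hypothetical sketch; it would break any
>> gen_pool_virt_to_phys() lookup on the pool):
>>
>> 	/* Hand the window's *physical* range to the pool */
>> 	gen_pool_add(pool, (unsigned long)window->phys_base, window->size, -1);
>>
>> 	/* Per allocation: map only the region just allocated */
>> 	phys_addr = gen_pool_alloc(pool, size);
>> 	virt_addr = ioremap(phys_addr, size);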
>>
>
> Looks like we would have to introduce a good amount of change to support
> dynamic ioremap with genalloc. But IMO it isn't worth the effort to make
> these changes for some old systems which are supposed to go away soon.
>
> So I think we can keep both the old and the genalloc-based allocators,
> using the old one only for 32-bit systems and the genalloc one for the
> rest.
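>
> If we go that route, the init path could pick the backend at runtime,
> something like this (a sketch; the helper names are illustrative, not
> from the patch):
>
> 	if (IS_ENABLED(CONFIG_64BIT))
> 		ret = pci_epc_mem_pool_init(epc, windows, num_windows);
> 	else
> 		ret = pci_epc_mem_bitmap_init(epc, windows, num_windows);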
>
> What do you think?
Sure, that should be okay. But can we check with the genalloc maintainers
to see what they think?
Thanks,
Kishon