[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1331212923.14096350.1586188737626.JavaMail.zimbra@kalray.eu>
Date: Mon, 6 Apr 2020 17:58:57 +0200 (CEST)
From: Clément Leger <cleger@...rayinc.com>
To: Arnaud Pouliquen <arnaud.pouliquen@...com>
Cc: Rishabh Bhatnagar <rishabhb@...eaurora.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
linux-remoteproc <linux-remoteproc@...r.kernel.org>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
psodagud <psodagud@...eaurora.org>, tsoni <tsoni@...eaurora.org>,
sidgup <sidgup@...eaurora.org>
Subject: Re: [PATCH v2 1/2] remoteproc: Add character device interface
Hi Arnaud,
----- On 6 Apr, 2020, at 16:17, Arnaud Pouliquen arnaud.pouliquen@...com wrote:
> Hi Clément,
>
> On 4/6/20 2:06 PM, Clément Leger wrote:
>> Hi Arnaud,
>>
>> ----- On 6 Apr, 2020, at 11:01, Arnaud Pouliquen arnaud.pouliquen@...com wrote:
>>
>>> On 4/3/20 9:13 PM, rishabhb@...eaurora.org wrote:
>>>> On 2020-04-02 10:28, Arnaud POULIQUEN wrote:
>>>>> Hi
>>>>>
>>>>> On 4/1/20 2:03 AM, Rishabh Bhatnagar wrote:
>>>>>> Add the character device interface for userspace applications.
>>>>>> This interface can be used in order to boot up and shutdown
>>>>>> remote subsystems. Currently there is only a sysfs interface
>>>>>> which the userspace clients can use. If a usersapce application
>>>>>> crashes after booting the remote processor does not get any
>>>>>> indication about the crash. It might still assume that the
>>>>>> application is running. For example modem uses remotefs service
>>>>>> to fetch data from disk/flash memory. If the remotefs service
>>>>>> crashes, modem keeps on requesting data which might lead to a
>>>>>> crash. Adding a character device interface makes the remote
>>>>>> processor tightly coupled with the user space application.
>>>>>> A crash of the application leads to a close on the file descriptors
>>>>>> therefore shutting down the remoteproc.
>>>>>
>>>>> Sorry I'm late in the discussion, I hope I've gone through the whole
>>>>> discussion so I don't reopen a closed point...
>>>>>
>>>>> Something here is not crystal clear to me so I'd rather share it...
>>>>>
>>>>> I suppose that you the automatic restart of the application is not possible to
>>>>> stop and restart the remote processor...
>>>> Yes correct, while we wait for the application to restart we might observe a
>>>> fatal crash.
>>>>>
>>>>> Why this use case can not be solved by a process monitor or a service
>>>>> in userland that detects the application crash and stop the remote
>>>>> firmware using
>>>>> the sysfs interface?
>>>>>
>>>> What happens in the case where the process monitor itself crashes? This is
>>>> actually the approach we follow in our downstream code. We have a central entity
>>>> in userspace that controls bootup/shutdown of some remote processors based on
>>>> the
>>>> votes from userspace clients. We have observed cases where this entity
>>>> itself crashes and remote processors are left hanging.
>>>
>>> Your description makes me feel like this patch is only a workaround of something
>>> that
>>> should be fixed in the userland, even if i understand that hanging is one of the
>>> most
>>> critical problem and have to be fixed.
>>> For instance, how to handle several applications that interact with the remote
>>> processor
>>> ( e.g. rpmsg service applications) how to stop and restart everything. Using the
>>> char
>>> device would probaly resolve only a part of the issue...
>>>
>>> I'm not aware about your environment and i'm not a userland expert. But what i
>>> still not
>>> understand why a parent process can not do the job...
>>> I just test a simple script on my side that treat the kill -9 of an application
>>> ("cat" in my case).
>>
>> This is not entirely true, if the parent process is killed with a SIGKILL, then
>> the process will not be able to handle anything and the remoteproc will still
>> be running.
>>
>> What I understood from Rishabh patch is a way to allow a single process handling
>> the rproc state. We have the same kind of need and currently, if the
>> user application crashes, then the rproc is still running (which happens).
>>
>>>
>>> #start the remote firmware
>>> cp $1 /lib/firmware/
>>> echo $1> /sys/class/remoteproc/remoteproc0/firmware
>>> echo start >/sys/class/remoteproc/remoteproc0/state
>>> #your binary
>>> cat /dev/kmsg
>>> # stop the remote firmware in case of crash (and potentially some other apps)
>>> echo stop >/sys/class/remoteproc/remoteproc0/state
>>>
>>
>> This is not really "production proof" and what happens if the application is
>> responsible of setting the firmware which might be jitted ?
>> And if the script receives the SIGKILL, then we are back to the same problem.
> Yes this is just a basic example, not an implementation which would depend on
> the
> environment. i'm just trying here to put forward a multi-process solution...and
> that I'm not an userland expert :).
>
>>
>> I really think, this is a step forward an easier and reliable use of the
>> remoteproc
>> on userland to guarantee a coherent rproc state even if host application
>> crashes.
>
> I can see 3 ways of handling an application crash:
> - just shutdown the firmware
> => can be done through char device
> - stop some other related processes and/or generate a remote proc crash dump for
> debug
> => /sysfs and/or debugfs
> - do nothing as you want a silence application reboot and re-attach to the
> running firmware
> => use sysfs
>
> I'm challenging the solution because splitting the API seems to me not a good
> solution.
Completely ok with that, we have to fully understand the targeted usecase to
avoid implemented a flawed interface.
> Now i wonder how it works for the other applications that are relying on some
> other
> kernel frameworks...
For some other device, there is a chardev. The watchdog for intance uses a
/dev/watchdog. Regarding the gpio, it seems they are also using a chardev
and the sysfs interface is deprecated.
> Perhaps the answer is that these frameworks don't use sysfs but char device.
> That would means that the sysfs solution is not the more adapted solution and
> perhaps we should migrate to a char device.
> But in this case, i think that it should implement the whole API and be
> exclusive with
> the syfs legacy API (so no sysfs or sysfs in read-only).
I agree with that, if another interface must be defined, then it should
implement everything that is supported right now with the sysfs.
Regards,
Clément
>
> Regards,
> Arnaud
>
>>
>> Regards,
>>
>> Clément
>>
>>> Anyway, it's just my feeling, let other people give their feedback.
>>>
>>>>> I just want to be sure that there is no alternative to this, because
>>>>> having two ways
>>>>> for application to shutdown the firmware seems to me confusing...
>>>> Does making this interface optional/configurable helps?
>>>>>
>>>>> What about the opposite service, mean inform the application that the remote
>>>>> processor is crashed?
>>>>> Do you identify such need? or the "auto" crash recovery is sufficient?
>>>> Auto recovery works perfectly for us. Although there is a mechanism in
>>>> place using QMI(Qualcomm MSM interface) that can notify clients about remote
>>>> processor crash.
>>>
>>> Thanks for the information.
>>>
>>> Regards
>>> Arnaud
>>>
>>>>>
>>>>> Thanks,
>>>>> Arnaud
>>>>>>
>>>>>> Signed-off-by: Rishabh Bhatnagar <rishabhb@...eaurora.org>
>>>>>> ---
>>>>>> drivers/remoteproc/Kconfig | 9 +++
>>>>>> drivers/remoteproc/Makefile | 1 +
>>>>>> drivers/remoteproc/remoteproc_cdev.c | 100 +++++++++++++++++++++++++++++++
>>>>>> drivers/remoteproc/remoteproc_internal.h | 22 +++++++
>>>>>> include/linux/remoteproc.h | 2 +
>>>>>> 5 files changed, 134 insertions(+)
>>>>>> create mode 100644 drivers/remoteproc/remoteproc_cdev.c
>>>>>>
>>>>>> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
>>>>>> index de3862c..6374b79 100644
>>>>>> --- a/drivers/remoteproc/Kconfig
>>>>>> +++ b/drivers/remoteproc/Kconfig
>>>>>> @@ -14,6 +14,15 @@ config REMOTEPROC
>>>>>>
>>>>>> if REMOTEPROC
>>>>>>
>>>>>> +config REMOTEPROC_CDEV
>>>>>> + bool "Remoteproc character device interface"
>>>>>> + help
>>>>>> + Say y here to have a character device interface for Remoteproc
>>>>>> + framework. Userspace can boot/shutdown remote processors through
>>>>>> + this interface.
>>>>>> +
>>>>>> + It's safe to say N if you don't want to use this interface.
>>>>>> +
>>>>>> config IMX_REMOTEPROC
>>>>>> tristate "IMX6/7 remoteproc support"
>>>>>> depends on ARCH_MXC
>>>>>> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
>>>>>> index e30a1b1..b7d4f77 100644
>>>>>> --- a/drivers/remoteproc/Makefile
>>>>>> +++ b/drivers/remoteproc/Makefile
>>>>>> @@ -9,6 +9,7 @@ remoteproc-y += remoteproc_debugfs.o
>>>>>> remoteproc-y += remoteproc_sysfs.o
>>>>>> remoteproc-y += remoteproc_virtio.o
>>>>>> remoteproc-y += remoteproc_elf_loader.o
>>>>>> +obj-$(CONFIG_REMOTEPROC_CDEV) += remoteproc_cdev.o
>>>>>> obj-$(CONFIG_IMX_REMOTEPROC) += imx_rproc.o
>>>>>> obj-$(CONFIG_MTK_SCP) += mtk_scp.o mtk_scp_ipi.o
>>>>>> obj-$(CONFIG_OMAP_REMOTEPROC) += omap_remoteproc.o
>>>>>> diff --git a/drivers/remoteproc/remoteproc_cdev.c
>>>>>> b/drivers/remoteproc/remoteproc_cdev.c
>>>>>> new file mode 100644
>>>>>> index 0000000..8182bd1
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/remoteproc/remoteproc_cdev.c
>>>>>> @@ -0,0 +1,100 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>>>> +/*
>>>>>> + * Character device interface driver for Remoteproc framework.
>>>>>> + *
>>>>>> + * Copyright (c) 2020, The Linux Foundation. All rights reserved.
>>>>>> + */
>>>>>> +
>>>>>> +#include <linux/cdev.h>
>>>>>> +#include <linux/fs.h>
>>>>>> +#include <linux/module.h>
>>>>>> +#include <linux/mutex.h>
>>>>>> +#include <linux/remoteproc.h>
>>>>>> +
>>>>>> +#include "remoteproc_internal.h"
>>>>>> +
>>>>>> +#define NUM_RPROC_DEVICES 64
>>>>>> +static dev_t rproc_cdev;
>>>>>> +static DEFINE_IDA(cdev_minor_ida);
>>>>>> +
>>>>>> +static int rproc_cdev_open(struct inode *inode, struct file *file)
>>>>>> +{
>>>>>> + struct rproc *rproc;
>>>>>> +
>>>>>> + rproc = container_of(inode->i_cdev, struct rproc, char_dev);
>>>>>> +
>>>>>> + if (!rproc)
>>>>>> + return -EINVAL;
>>>>>> +
>>>>>> + if (rproc->state == RPROC_RUNNING)
>>>>>> + return -EBUSY;
>>>>>> +
>>>>>> + return rproc_boot(rproc);
>>>>>> +}
>>>>>> +
>>>>>> +static int rproc_cdev_release(struct inode *inode, struct file *file)
>>>>>> +{
>>>>>> + struct rproc *rproc;
>>>>>> +
>>>>>> + rproc = container_of(inode->i_cdev, struct rproc, char_dev);
>>>>>> +
>>>>>> + if (!rproc || rproc->state != RPROC_RUNNING)
>>>>>> + return -EINVAL;
>>>>>> +
>>>>>> + rproc_shutdown(rproc);
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static const struct file_operations rproc_fops = {
>>>>>> + .open = rproc_cdev_open,
>>>>>> + .release = rproc_cdev_release,
>>>>>> +};
>>>>>> +
>>>>>> +int rproc_char_device_add(struct rproc *rproc)
>>>>>> +{
>>>>>> + int ret, minor;
>>>>>> + dev_t cdevt;
>>>>>> +
>>>>>> + minor = ida_simple_get(&cdev_minor_ida, 0, NUM_RPROC_DEVICES,
>>>>>> + GFP_KERNEL);
>>>>>> + if (minor < 0) {
>>>>>> + dev_err(&rproc->dev, "%s: No more minor numbers left! rc:%d\n",
>>>>>> + __func__, minor);
>>>>>> + return -ENODEV;
>>>>>> + }
>>>>>> +
>>>>>> + cdev_init(&rproc->char_dev, &rproc_fops);
>>>>>> + rproc->char_dev.owner = THIS_MODULE;
>>>>>> +
>>>>>> + cdevt = MKDEV(MAJOR(rproc_cdev), minor);
>>>>>> + ret = cdev_add(&rproc->char_dev, cdevt, 1);
>>>>>> + if (ret < 0)
>>>>>> + ida_simple_remove(&cdev_minor_ida, minor);
>>>>>> +
>>>>>> + rproc->dev.devt = cdevt;
>>>>>> + return ret;
>>>>>> +}
>>>>>> +
>>>>>> +void rproc_char_device_remove(struct rproc *rproc)
>>>>>> +{
>>>>>> + __unregister_chrdev(MAJOR(rproc->dev.devt), MINOR(rproc->dev.devt), 1,
>>>>>> + "rproc");
>>>>>> + ida_simple_remove(&cdev_minor_ida, MINOR(rproc->dev.devt));
>>>>>> +}
>>>>>> +
>>>>>> +void __init rproc_init_cdev(void)
>>>>>> +{
>>>>>> + int ret;
>>>>>> +
>>>>>> + ret = alloc_chrdev_region(&rproc_cdev, 0, NUM_RPROC_DEVICES, "rproc");
>>>>>> + if (ret < 0) {
>>>>>> + pr_err("Failed to alloc rproc_cdev region, err %d\n", ret);
>>>>>> + return;
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>> +void __exit rproc_exit_cdev(void)
>>>>>> +{
>>>>>> + __unregister_chrdev(MAJOR(rproc_cdev), 0, NUM_RPROC_DEVICES, "rproc");
>>>>>> +}
>>>>>> diff --git a/drivers/remoteproc/remoteproc_internal.h
>>>>>> b/drivers/remoteproc/remoteproc_internal.h
>>>>>> index 493ef92..28d61a1 100644
>>>>>> --- a/drivers/remoteproc/remoteproc_internal.h
>>>>>> +++ b/drivers/remoteproc/remoteproc_internal.h
>>>>>> @@ -47,6 +47,27 @@ struct dentry *rproc_create_trace_file(const char *name,
>>>>>> struct rproc *rproc,
>>>>>> int rproc_init_sysfs(void);
>>>>>> void rproc_exit_sysfs(void);
>>>>>>
>>>>>> +#ifdef CONFIG_REMOTEPROC_CDEV
>>>>>> +void rproc_init_cdev(void);
>>>>>> +void rproc_exit_cdev(void);
>>>>>> +int rproc_char_device_add(struct rproc *rproc);
>>>>>> +void rproc_char_device_remove(struct rproc *rproc);
>>>>>> +#else
>>>>>> +static inline void rproc_init_cdev(void)
>>>>>> +{
>>>>>> +}
>>>>>> +static inline void rproc_exit_cdev(void)
>>>>>> +{
>>>>>> +}
>>>>>> +static inline int rproc_char_device_add(struct rproc *rproc)
>>>>>> +{
>>>>>> + return 0;
>>>>>> +}
>>>>>> +static inline void rproc_char_device_remove(struct rproc *rproc)
>>>>>> +{
>>>>>> +}
>>>>>> +#endif
>>>>>> +
>>>>>> void rproc_free_vring(struct rproc_vring *rvring);
>>>>>> int rproc_alloc_vring(struct rproc_vdev *rvdev, int i);
>>>>>>
>>>>>> @@ -63,6 +84,7 @@ struct resource_table *rproc_elf_find_loaded_rsc_table(struct
>>>>>> rproc *rproc,
>>>>>> struct rproc_mem_entry *
>>>>>> rproc_find_carveout_by_name(struct rproc *rproc, const char *name, ...);
>>>>>>
>>>>>> +
>>>>>> static inline
>>>>>> int rproc_fw_sanity_check(struct rproc *rproc, const struct firmware *fw)
>>>>>> {
>>>>>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
>>>>>> index 16ad666..c4ca796 100644
>>>>>> --- a/include/linux/remoteproc.h
>>>>>> +++ b/include/linux/remoteproc.h
>>>>>> @@ -37,6 +37,7 @@
>>>>>>
>>>>>> #include <linux/types.h>
>>>>>> #include <linux/mutex.h>
>>>>>> +#include <linux/cdev.h>
>>>>>> #include <linux/virtio.h>
>>>>>> #include <linux/completion.h>
>>>>>> #include <linux/idr.h>
>>>>>> @@ -514,6 +515,7 @@ struct rproc {
>>>>>> bool auto_boot;
>>>>>> struct list_head dump_segments;
>>>>>> int nb_vdev;
>>>>>> + struct cdev char_dev;
>>>>>> };
>>>>>>
> >>>>> /**
Powered by blists - more mailing lists