lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 16 Feb 2022 08:29:44 -0600
From:   Alex Elder <elder@...aro.org>
To:     Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
Cc:     mhi@...ts.linux.dev, quic_hemantk@...cinc.com,
        quic_bbhatt@...cinc.com, quic_jhugo@...cinc.com,
        vinod.koul@...aro.org, bjorn.andersson@...aro.org,
        dmitry.baryshkov@...aro.org, quic_vbadigan@...cinc.com,
        quic_cang@...cinc.com, quic_skananth@...cinc.com,
        linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Paul Davey <paul.davey@...iedtelesis.co.nz>,
        stable@...r.kernel.org
Subject: Re: [PATCH v3 02/25] bus: mhi: Fix MHI DMA structure endianness

On 2/16/22 1:04 AM, Manivannan Sadhasivam wrote:
> On Tue, Feb 15, 2022 at 02:02:01PM -0600, Alex Elder wrote:
>> On 2/12/22 12:20 PM, Manivannan Sadhasivam wrote:
>>> From: Paul Davey <paul.davey@...iedtelesis.co.nz>
>>>
>>> The MHI driver does not work on big endian architectures.  The
>>> controller never transitions into mission mode.  This appears to be due
>>> to the modem device expecting the various contexts and transfer rings to
>>> have fields in little endian order in memory, but the driver constructs
>>> them in native endianness.
>>
>> Yes, this is true.
>>
>>> Fix MHI event, channel and command contexts and TRE handling macros to
>>> use explicit conversion to little endian.  Mark fields in relevant
>>> structures as little endian to document this requirement.
>>
>> Basically every field in the external interface whose size
>> is greater than one byte must have its endianness noted.
>>  From what I can tell, you did that for all of the exposed
>> structures defined in "drivers/bus/mhi/core/internal.h",
>> which is good.
>>
>> *However* some of the *constants* were defined the wrong way.
>>
>> Basically, all of the constant values should be expressed
>> in host byte order.  And any needed byte swapping should be
>> done at the time the value is read from memory--immediately.
>> That way, we isolate that activity to the one place we
>> interface with the possibly "foreign" format, and from then
>> on, everything may be assumed to be in natural (CPU) byte order.
>>
> 
> Well, I did think about it but I convinced myself that doing the
> conversion in code rather in defines make the code look messy.
> Also in some places it just makes it look complicated. More below:

I thought this might the case.

>> I will point out what I mean, below.
>>
>>> Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions")
>>> Fixes: 6cd330ae76ff ("bus: mhi: core: Add support for ringing channel/event ring doorbells")
>>> Signed-off-by: Paul Davey <paul.davey@...iedtelesis.co.nz>
>>> Cc: stable@...r.kernel.org
>>> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
>>> ---
>>>    drivers/bus/mhi/core/debugfs.c  |  26 +++----
>>>    drivers/bus/mhi/core/init.c     |  36 +++++-----
>>>    drivers/bus/mhi/core/internal.h | 119 ++++++++++++++++----------------
>>>    drivers/bus/mhi/core/main.c     |  22 +++---
>>>    drivers/bus/mhi/core/pm.c       |   4 +-
>>>    5 files changed, 104 insertions(+), 103 deletions(-)
>>>
> 
> [...]
> 
>>> @@ -277,57 +277,58 @@ enum mhi_cmd_type {
>>>    /* No operation command */
>>>    #define MHI_TRE_CMD_NOOP_PTR (0)
>>>    #define MHI_TRE_CMD_NOOP_DWORD0 (0)
>>> -#define MHI_TRE_CMD_NOOP_DWORD1 (MHI_CMD_NOP << 16)
>>> +#define MHI_TRE_CMD_NOOP_DWORD1 (cpu_to_le32(MHI_CMD_NOP << 16))
>>
>> This just looks wrong to me.  The original definition
>> should be fine, but then where it's *used* it should
>> be passed to cpu_to_le32().  I realize this might be
>> a special case, where these "DWORD" values are getting
>> written out to command ring elements, but even so, the
>> byte swapping that's happening is important and should
>> be made obvious in the code using these symbols.
>>
>> This comment applies to many more similar definitions
>> below.  I don't know; maybe it looks cumbersome if
>> it's done in the code, but I still think it's better to
>> consistenly define symbols like this in CPU byte order
>> and do the conversions explicitly only when the values
>> are read/written to "foreign" (external interface)
>> memory.
>>
> 
> Defines like MHI_TRE_GET_CMD_CHID are making the conversion look messy
> to me. In this we first extract the DWORD from TRE and then doing
> shifting + masking to get the CHID.

I didn't say so, but I don't really like those defines either.
I personally would rather see the field values get extracted
in open code rather than this, because they're actually pretty
simple operations.  But I understand, sometimes things just
"look complicated" if you do them certain ways (even if simple).

I did it in a certain way in the IPA code and I find that
preferable to the use of the "DWORD" definitions you're
using.  I also stand by my belief that it's preferable to
not hide the byte swaps in macro definitions.

You use this for reading/writing the command/transfer/event
ring elements (only) though, and you do that consistently.

> So without splitting the DWORD extraction and GET_CHID macros
> separately, we can't just do the conversion in code. And we may end up
> doing the conversion in defines just for these special cases but that
> will break the uniformity.
> 
> So IMO it looks better if we trust the defines to do the conversion itself.
> 
> Please let me know if you think the other way.

I'm OK with it.  I'm not convinced, but I won't object...

					-Alex

> 
> Thanks,
> Mani
> 
>> Outside of this issue, the remainder of the patch looks
>> OK to me.
>>
>> 					-Alex
>>
>>>    /* Channel reset command */
>>>    #define MHI_TRE_CMD_RESET_PTR (0)
>>>    #define MHI_TRE_CMD_RESET_DWORD0 (0)
>>> -#define MHI_TRE_CMD_RESET_DWORD1(chid) ((chid << 24) | \
>>> -					(MHI_CMD_RESET_CHAN << 16))
>>> +#define MHI_TRE_CMD_RESET_DWORD1(chid) (cpu_to_le32((chid << 24) | \
>>> +					(MHI_CMD_RESET_CHAN << 16)))
>>>    /* Channel stop command */
>>>    #define MHI_TRE_CMD_STOP_PTR (0)
>>>    #define MHI_TRE_CMD_STOP_DWORD0 (0)
>>> -#define MHI_TRE_CMD_STOP_DWORD1(chid) ((chid << 24) | \
>>> -				       (MHI_CMD_STOP_CHAN << 16))
>>> +#define MHI_TRE_CMD_STOP_DWORD1(chid) (cpu_to_le32((chid << 24) | \
>>> +				       (MHI_CMD_STOP_CHAN << 16)))
>>>    /* Channel start command */
>>>    #define MHI_TRE_CMD_START_PTR (0)
>>>    #define MHI_TRE_CMD_START_DWORD0 (0)
>>> -#define MHI_TRE_CMD_START_DWORD1(chid) ((chid << 24) | \
>>> -					(MHI_CMD_START_CHAN << 16))
>>> +#define MHI_TRE_CMD_START_DWORD1(chid) (cpu_to_le32((chid << 24) | \
>>> +					(MHI_CMD_START_CHAN << 16)))
>>> -#define MHI_TRE_GET_CMD_CHID(tre) (((tre)->dword[1] >> 24) & 0xFF)
>>> -#define MHI_TRE_GET_CMD_TYPE(tre) (((tre)->dword[1] >> 16) & 0xFF)
>>> +#define MHI_TRE_GET_DWORD(tre, word) (le32_to_cpu((tre)->dword[(word)]))
>>> +#define MHI_TRE_GET_CMD_CHID(tre) ((MHI_TRE_GET_DWORD(tre, 1) >> 24) & 0xFF)
>>> +#define MHI_TRE_GET_CMD_TYPE(tre) ((MHI_TRE_GET_DWORD(tre, 1) >> 16) & 0xFF)
>>>    /* Event descriptor macros */
>>> -#define MHI_TRE_EV_PTR(ptr) (ptr)
>>> -#define MHI_TRE_EV_DWORD0(code, len) ((code << 24) | len)
>>> -#define MHI_TRE_EV_DWORD1(chid, type) ((chid << 24) | (type << 16))
>>> -#define MHI_TRE_GET_EV_PTR(tre) ((tre)->ptr)
>>> -#define MHI_TRE_GET_EV_CODE(tre) (((tre)->dword[0] >> 24) & 0xFF)
>>> -#define MHI_TRE_GET_EV_LEN(tre) ((tre)->dword[0] & 0xFFFF)
>>> -#define MHI_TRE_GET_EV_CHID(tre) (((tre)->dword[1] >> 24) & 0xFF)
>>> -#define MHI_TRE_GET_EV_TYPE(tre) (((tre)->dword[1] >> 16) & 0xFF)
>>> -#define MHI_TRE_GET_EV_STATE(tre) (((tre)->dword[0] >> 24) & 0xFF)
>>> -#define MHI_TRE_GET_EV_EXECENV(tre) (((tre)->dword[0] >> 24) & 0xFF)
>>> -#define MHI_TRE_GET_EV_SEQ(tre) ((tre)->dword[0])
>>> -#define MHI_TRE_GET_EV_TIME(tre) ((tre)->ptr)
>>> -#define MHI_TRE_GET_EV_COOKIE(tre) lower_32_bits((tre)->ptr)
>>> -#define MHI_TRE_GET_EV_VEID(tre) (((tre)->dword[0] >> 16) & 0xFF)
>>> -#define MHI_TRE_GET_EV_LINKSPEED(tre) (((tre)->dword[1] >> 24) & 0xFF)
>>> -#define MHI_TRE_GET_EV_LINKWIDTH(tre) ((tre)->dword[0] & 0xFF)
>>> +#define MHI_TRE_EV_PTR(ptr) (cpu_to_le64(ptr))
>>> +#define MHI_TRE_EV_DWORD0(code, len) (cpu_to_le32((code << 24) | len))
>>> +#define MHI_TRE_EV_DWORD1(chid, type) (cpu_to_le32((chid << 24) | (type << 16)))
>>> +#define MHI_TRE_GET_EV_PTR(tre) (le64_to_cpu((tre)->ptr))
>>> +#define MHI_TRE_GET_EV_CODE(tre) ((MHI_TRE_GET_DWORD(tre, 0) >> 24) & 0xFF)
>>> +#define MHI_TRE_GET_EV_LEN(tre) (MHI_TRE_GET_DWORD(tre, 0) & 0xFFFF)
>>> +#define MHI_TRE_GET_EV_CHID(tre) ((MHI_TRE_GET_DWORD(tre, 1) >> 24) & 0xFF)
>>> +#define MHI_TRE_GET_EV_TYPE(tre) ((MHI_TRE_GET_DWORD(tre, 1) >> 16) & 0xFF)
>>> +#define MHI_TRE_GET_EV_STATE(tre) ((MHI_TRE_GET_DWORD(tre, 0) >> 24) & 0xFF)
>>> +#define MHI_TRE_GET_EV_EXECENV(tre) ((MHI_TRE_GET_DWORD(tre, 0) >> 24) & 0xFF)
>>> +#define MHI_TRE_GET_EV_SEQ(tre) MHI_TRE_GET_DWORD(tre, 0)
>>> +#define MHI_TRE_GET_EV_TIME(tre) (MHI_TRE_GET_EV_PTR(tre))
>>> +#define MHI_TRE_GET_EV_COOKIE(tre) lower_32_bits(MHI_TRE_GET_EV_PTR(tre))
>>> +#define MHI_TRE_GET_EV_VEID(tre) ((MHI_TRE_GET_DWORD(tre, 0) >> 16) & 0xFF)
>>> +#define MHI_TRE_GET_EV_LINKSPEED(tre) ((MHI_TRE_GET_DWORD(tre, 1) >> 24) & 0xFF)
>>> +#define MHI_TRE_GET_EV_LINKWIDTH(tre) (MHI_TRE_GET_DWORD(tre, 0) & 0xFF)
>>>    /* Transfer descriptor macros */
>>> -#define MHI_TRE_DATA_PTR(ptr) (ptr)
>>> -#define MHI_TRE_DATA_DWORD0(len) (len & MHI_MAX_MTU)
>>> -#define MHI_TRE_DATA_DWORD1(bei, ieot, ieob, chain) ((2 << 16) | (bei << 10) \
>>> -	| (ieot << 9) | (ieob << 8) | chain)
>>> +#define MHI_TRE_DATA_PTR(ptr) (cpu_to_le64(ptr))
>>> +#define MHI_TRE_DATA_DWORD0(len) (cpu_to_le32(len & MHI_MAX_MTU))
>>> +#define MHI_TRE_DATA_DWORD1(bei, ieot, ieob, chain) (cpu_to_le32((2 << 16) | (bei << 10) \
>>> +	| (ieot << 9) | (ieob << 8) | chain))
>>>    /* RSC transfer descriptor macros */
>>> -#define MHI_RSCTRE_DATA_PTR(ptr, len) (((u64)len << 48) | ptr)
>>> -#define MHI_RSCTRE_DATA_DWORD0(cookie) (cookie)
>>> -#define MHI_RSCTRE_DATA_DWORD1 (MHI_PKT_TYPE_COALESCING << 16)
>>> +#define MHI_RSCTRE_DATA_PTR(ptr, len) (cpu_to_le64(((u64)len << 48) | ptr))
>>> +#define MHI_RSCTRE_DATA_DWORD0(cookie) (cpu_to_le32(cookie))
>>> +#define MHI_RSCTRE_DATA_DWORD1 (cpu_to_le32(MHI_PKT_TYPE_COALESCING << 16))
>>>    enum mhi_pkt_type {
>>>    	MHI_PKT_TYPE_INVALID = 0x0,
>>> @@ -500,7 +501,7 @@ struct state_transition {
>>>    struct mhi_ring {
>>>    	dma_addr_t dma_handle;
>>>    	dma_addr_t iommu_base;
>>> -	u64 *ctxt_wp; /* point to ctxt wp */
>>> +	__le64 *ctxt_wp; /* point to ctxt wp */
>>>    	void *pre_aligned;
>>>    	void *base;
>>>    	void *rp;
>>> diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c
>>> index ffde617f93a3..85f4f7c8d7c6 100644
>>> --- a/drivers/bus/mhi/core/main.c
>>> +++ b/drivers/bus/mhi/core/main.c
>>> @@ -114,7 +114,7 @@ void mhi_ring_er_db(struct mhi_event *mhi_event)
>>>    	struct mhi_ring *ring = &mhi_event->ring;
>>>    	mhi_event->db_cfg.process_db(mhi_event->mhi_cntrl, &mhi_event->db_cfg,
>>> -				     ring->db_addr, *ring->ctxt_wp);
>>> +				     ring->db_addr, le64_to_cpu(*ring->ctxt_wp));
>>>    }
>>>    void mhi_ring_cmd_db(struct mhi_controller *mhi_cntrl, struct mhi_cmd *mhi_cmd)
>>> @@ -123,7 +123,7 @@ void mhi_ring_cmd_db(struct mhi_controller *mhi_cntrl, struct mhi_cmd *mhi_cmd)
>>>    	struct mhi_ring *ring = &mhi_cmd->ring;
>>>    	db = ring->iommu_base + (ring->wp - ring->base);
>>> -	*ring->ctxt_wp = db;
>>> +	*ring->ctxt_wp = cpu_to_le64(db);
>>>    	mhi_write_db(mhi_cntrl, ring->db_addr, db);
>>>    }
>>> @@ -140,7 +140,7 @@ void mhi_ring_chan_db(struct mhi_controller *mhi_cntrl,
>>>    	 * before letting h/w know there is new element to fetch.
>>>    	 */
>>>    	dma_wmb();
>>> -	*ring->ctxt_wp = db;
>>> +	*ring->ctxt_wp = cpu_to_le64(db);
>>>    	mhi_chan->db_cfg.process_db(mhi_cntrl, &mhi_chan->db_cfg,
>>>    				    ring->db_addr, db);
>>> @@ -432,7 +432,7 @@ irqreturn_t mhi_irq_handler(int irq_number, void *dev)
>>>    	struct mhi_event_ctxt *er_ctxt =
>>>    		&mhi_cntrl->mhi_ctxt->er_ctxt[mhi_event->er_index];
>>>    	struct mhi_ring *ev_ring = &mhi_event->ring;
>>> -	dma_addr_t ptr = er_ctxt->rp;
>>> +	dma_addr_t ptr = le64_to_cpu(er_ctxt->rp);
>>>    	void *dev_rp;
>>>    	if (!is_valid_ring_ptr(ev_ring, ptr)) {
>>> @@ -537,14 +537,14 @@ static void mhi_recycle_ev_ring_element(struct mhi_controller *mhi_cntrl,
>>>    	/* Update the WP */
>>>    	ring->wp += ring->el_size;
>>> -	ctxt_wp = *ring->ctxt_wp + ring->el_size;
>>> +	ctxt_wp = le64_to_cpu(*ring->ctxt_wp) + ring->el_size;
>>>    	if (ring->wp >= (ring->base + ring->len)) {
>>>    		ring->wp = ring->base;
>>>    		ctxt_wp = ring->iommu_base;
>>>    	}
>>> -	*ring->ctxt_wp = ctxt_wp;
>>> +	*ring->ctxt_wp = cpu_to_le64(ctxt_wp);
>>>    	/* Update the RP */
>>>    	ring->rp += ring->el_size;
>>> @@ -801,7 +801,7 @@ int mhi_process_ctrl_ev_ring(struct mhi_controller *mhi_cntrl,
>>>    	struct device *dev = &mhi_cntrl->mhi_dev->dev;
>>>    	u32 chan;
>>>    	int count = 0;
>>> -	dma_addr_t ptr = er_ctxt->rp;
>>> +	dma_addr_t ptr = le64_to_cpu(er_ctxt->rp);
>>>    	/*
>>>    	 * This is a quick check to avoid unnecessary event processing
>>> @@ -940,7 +940,7 @@ int mhi_process_ctrl_ev_ring(struct mhi_controller *mhi_cntrl,
>>>    		mhi_recycle_ev_ring_element(mhi_cntrl, ev_ring);
>>>    		local_rp = ev_ring->rp;
>>> -		ptr = er_ctxt->rp;
>>> +		ptr = le64_to_cpu(er_ctxt->rp);
>>>    		if (!is_valid_ring_ptr(ev_ring, ptr)) {
>>>    			dev_err(&mhi_cntrl->mhi_dev->dev,
>>>    				"Event ring rp points outside of the event ring\n");
>>> @@ -970,7 +970,7 @@ int mhi_process_data_event_ring(struct mhi_controller *mhi_cntrl,
>>>    	int count = 0;
>>>    	u32 chan;
>>>    	struct mhi_chan *mhi_chan;
>>> -	dma_addr_t ptr = er_ctxt->rp;
>>> +	dma_addr_t ptr = le64_to_cpu(er_ctxt->rp);
>>>    	if (unlikely(MHI_EVENT_ACCESS_INVALID(mhi_cntrl->pm_state)))
>>>    		return -EIO;
>>> @@ -1011,7 +1011,7 @@ int mhi_process_data_event_ring(struct mhi_controller *mhi_cntrl,
>>>    		mhi_recycle_ev_ring_element(mhi_cntrl, ev_ring);
>>>    		local_rp = ev_ring->rp;
>>> -		ptr = er_ctxt->rp;
>>> +		ptr = le64_to_cpu(er_ctxt->rp);
>>>    		if (!is_valid_ring_ptr(ev_ring, ptr)) {
>>>    			dev_err(&mhi_cntrl->mhi_dev->dev,
>>>    				"Event ring rp points outside of the event ring\n");
>>> @@ -1533,7 +1533,7 @@ static void mhi_mark_stale_events(struct mhi_controller *mhi_cntrl,
>>>    	/* mark all stale events related to channel as STALE event */
>>>    	spin_lock_irqsave(&mhi_event->lock, flags);
>>> -	ptr = er_ctxt->rp;
>>> +	ptr = le64_to_cpu(er_ctxt->rp);
>>>    	if (!is_valid_ring_ptr(ev_ring, ptr)) {
>>>    		dev_err(&mhi_cntrl->mhi_dev->dev,
>>>    			"Event ring rp points outside of the event ring\n");
>>> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
>>> index 4aae0baea008..c35c5ddc7220 100644
>>> --- a/drivers/bus/mhi/core/pm.c
>>> +++ b/drivers/bus/mhi/core/pm.c
>>> @@ -218,7 +218,7 @@ int mhi_ready_state_transition(struct mhi_controller *mhi_cntrl)
>>>    			continue;
>>>    		ring->wp = ring->base + ring->len - ring->el_size;
>>> -		*ring->ctxt_wp = ring->iommu_base + ring->len - ring->el_size;
>>> +		*ring->ctxt_wp = cpu_to_le64(ring->iommu_base + ring->len - ring->el_size);
>>>    		/* Update all cores */
>>>    		smp_wmb();
>>> @@ -420,7 +420,7 @@ static int mhi_pm_mission_mode_transition(struct mhi_controller *mhi_cntrl)
>>>    			continue;
>>>    		ring->wp = ring->base + ring->len - ring->el_size;
>>> -		*ring->ctxt_wp = ring->iommu_base + ring->len - ring->el_size;
>>> +		*ring->ctxt_wp = cpu_to_le64(ring->iommu_base + ring->len - ring->el_size);
>>>    		/* Update to all cores */
>>>    		smp_wmb();
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ