Date: Wed, 28 Dec 2016 11:21:08 -0700
From: Ian Lepore <ian@freebsd.org>
To: tony moseby <xmoseby@yahoo.se>
Cc: "freebsd-embedded@freebsd.org" <freebsd-embedded@freebsd.org>
Subject: Re: SV: A few questions about SD/MMC drivers
Message-ID: <1482949268.16152.7.camel@freebsd.org>
In-Reply-To: <2053136483.1957700.1482528343527@mail.yahoo.com>
References: <2053136483.1957700.1482528343527.ref@mail.yahoo.com> <2053136483.1957700.1482528343527@mail.yahoo.com>
[-- Attachment #1 --]
[trimming the CC list to just the poster and the embedded list]
On Fri, 2016-12-23 at 21:25 +0000, tony moseby via freebsd-embedded
wrote:
> Hello Ian,
> I can read that you have been involved in a similar problem to the one
> I am facing now. I am running FreeBSD 8.2 on an armv5TE (Marvell
> MV78100), and I have an SSD disk (4 GB). While running the system and
> doing disk accesses, after some time (it can be days, or at least
> several hours) the system totally freezes. The serial port freezes and
> the ethernet freezes (just internal connections, nothing public).
> If I do not use the disk, the freeze does not occur. My watchdog is
> disabled, so I suspect that the kernel crashes and that is the cause
> of the freeze. I connected an emulator to the board, but whenever the
> freeze occurs, the emulator also loses contact with the CPU. I have
> tested doing a lot of reads/writes towards the SSD; often this also
> causes a freeze, but a couple of times I could see the kernel crashing
> with alignment fault 1. Do you have any ideas?
> Thanks
> BR/T
>
>
We use 8.2 on a Marvell Kirkwood 88F6281 at $work. I've used it with an
SSD connected by USB, and connected by eSATA, and have not seen the
problem you report, even when using Poudriere to build packages on that
machine, which keeps the SSD busy for 18+ hours nonstop.
But I have seen the problem you report on every version of freebsd
later than 8.2 on armv4 and armv5 systems (which is why we still use
8.2 at work). Our 8.2 isn't stock; we've pulled dozens of fixes from
later versions of freebsd into our repo. I suppose one of those fixes
might have helped the problem, but figuring out which one is probably
impossible.
I also spent many hours trying to get freebsd 10 and/or 11 to work
reliably on armv4/v5 without alignment faults related to disk IO, and
never managed to track down what part of the system was using an
unaligned buffer. It may be that the real problem isn't a simple
unaligned buffer, but rather a wild pointer or some other programming
error.
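If you want to hunt for the culprit on your own system, one cheap
diagnostic is to instrument the busdma sync path to flag any buffer
that doesn't start and end on a cache line boundary. A minimal sketch
(my own illustration, not part of the attached patches; the function
name is made up):

/*
 * Hypothetical diagnostic for sys/arm/arm/busdma_machdep.c: call it at
 * the top of bus_dmamap_sync_buf() to report partial-cacheline DMA.
 */
static void
report_partial_cacheline_buf(void *buf, int len)
{
	/* Nonzero iff buf or buf + len falls mid-cacheline. */
	if ((((vm_offset_t)buf) | len) & arm_dcache_align_mask)
		printf("busdma: partial-cacheline buffer %p len %d\n",
		    buf, len);
}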
An interesting thing to try is building the kernel with option
ARM9_CACHE_WRITE_THROUGH. We don't routinely do that, but when that
option makes a problem stop happening, that's a strong clue that busdma
cache coherency is part of the problem.
Another kernel option that you should be using if you use USB is
option USB_HOST_ALIGN=32
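For reference, both of those go into the kernel config file, something
like this:

options 	ARM9_CACHE_WRITE_THROUGH	# diagnostic: write-through dcache
options 	USB_HOST_ALIGN=32		# cacheline-align USB IO buffers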
I've looked through all the fixes we've applied locally to 8.2 and
found some that are likely to be related to alignment faults, or should
just be helpful to anyone running armv5 on 8.2. I'm attaching a set of
patches for you; maybe some or all of them will help your situation.
The starting snapshot that all these patches are based on is stable-8
at r223364.
-- Ian
[-- Attachment #2 --]
# Note that there are multiple separate patches in this file.
#
# HG changeset patch
# User Ian Lepore
# Date 1312309613 21600
# Tue Aug 02 12:26:53 2011 -0600
# Node ID c5c5186ef4b2bf775b7783d9665543768fad5cd3
Disable interrupts while performing busdma buffer sync operations.
diff -r 18d5166659aa -r c5c5186ef4b2 sys/arm/arm/busdma_machdep.c
--- a/sys/arm/arm/busdma_machdep.c Tue Aug 02 09:55:50 2011 -0600
+++ b/sys/arm/arm/busdma_machdep.c Tue Aug 02 12:26:53 2011 -0600
@@ -1091,6 +1091,14 @@ static void
bus_dmamap_sync_buf(void *buf, int len, bus_dmasync_op_t op)
{
char _tmp_cl[arm_dcache_align], _tmp_clend[arm_dcache_align];
+ uint32_t intr;
+
+ /* Interrupts MUST be disabled when handling partial cacheline flush
+ * and most likely should be disabled for all flushes. (I know for
+ * certain interrupts can cause failures on partial flushes, and suspect
+ * problems could also happen in other scenarios.)
+ */
+ intr = intr_disable();
if ((op & BUS_DMASYNC_PREWRITE) && !(op & BUS_DMASYNC_PREREAD)) {
cpu_dcache_wb_range((vm_offset_t)buf, len);
@@ -1129,6 +1137,8 @@ bus_dmamap_sync_buf(void *buf, int len,
arm_dcache_align - (((vm_offset_t)(buf) + len) &
arm_dcache_align_mask));
}
+
+ intr_restore(intr);
}
static void
# HG changeset patch
# User Ian Lepore
# Date 1315341262 21600
# Tue Sep 06 14:34:22 2011 -0600
# Node ID 48e738be8f1a06bf52ea6c64d2c3902dd244547a
When performing a cache sync, only disable interrupts if there's a partial
cacheline flush involved.
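A quick sanity check on the partial-cacheline test used in the hunk
below: OR-ing the buffer address with the length is nonzero exactly
when the buffer starts mid-line or the length isn't a whole number of
cache lines. Worked example with 32-byte lines
(arm_dcache_align_mask == 0x1f):

  buf = 0x1000, len = 0x200  ->  (0x1000 | 0x200) & 0x1f == 0x00  (aligned)
  buf = 0x1004, len = 0x200  ->  0x1204 & 0x1f == 0x04  (partial head line)
  buf = 0x1000, len = 0x1f8  ->  0x11f8 & 0x1f == 0x18  (partial tail line)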
diff -r fcbba24de147 -r 48e738be8f1a sys/arm/arm/busdma_machdep.c
--- a/sys/arm/arm/busdma_machdep.c Sat Sep 03 09:45:01 2011 -0600
+++ b/sys/arm/arm/busdma_machdep.c Tue Sep 06 14:34:22 2011 -0600
@@ -1091,14 +1091,6 @@ static void
bus_dmamap_sync_buf(void *buf, int len, bus_dmasync_op_t op)
{
char _tmp_cl[arm_dcache_align], _tmp_clend[arm_dcache_align];
- uint32_t intr;
-
- /* Interrupts MUST be disabled when handling partial cacheline flush
- * and most likely should be disabled for all flushes. (I know for
- * certain interrupts can cause failures on partial flushes, and suspect
- * problems could also happen in other scenarios.)
- */
- intr = intr_disable();
if ((op & BUS_DMASYNC_PREWRITE) && !(op & BUS_DMASYNC_PREREAD)) {
cpu_dcache_wb_range((vm_offset_t)buf, len);
@@ -1114,31 +1106,45 @@ bus_dmamap_sync_buf(void *buf, int len,
cpu_l2cache_wbinv_range((vm_offset_t)buf, len);
}
}
+ /*
+ * Interrupts must be disabled if handling a partial cacheline flush,
+ * otherwise the interrupt handling code could modify data in the
+ * non-DMA part of a cacheline while we have it stashed away in the
+ * temporary stack buffer, then we end up restoring the stale value. As
+ * unlikely as this seems, it has been observed in the real world.
+ */
if (op & BUS_DMASYNC_POSTREAD) {
- if ((vm_offset_t)buf & arm_dcache_align_mask) {
- memcpy(_tmp_cl, (void *)((vm_offset_t)buf & ~
- arm_dcache_align_mask),
- (vm_offset_t)buf & arm_dcache_align_mask);
- }
- if (((vm_offset_t)buf + len) & arm_dcache_align_mask) {
- memcpy(_tmp_clend, (void *)((vm_offset_t)buf + len),
- arm_dcache_align - (((vm_offset_t)(buf) + len) &
- arm_dcache_align_mask));
+ uint32_t intr;
+ int partial = (((vm_offset_t)buf) | len) & arm_dcache_align_mask;
+
+ if (partial) {
+ intr = intr_disable();
+ if ((vm_offset_t)buf & arm_dcache_align_mask) {
+ memcpy(_tmp_cl, (void *)((vm_offset_t)buf & ~
+ arm_dcache_align_mask),
+ (vm_offset_t)buf & arm_dcache_align_mask);
+ }
+ if (((vm_offset_t)buf + len) & arm_dcache_align_mask) {
+ memcpy(_tmp_clend, (void *)((vm_offset_t)buf + len),
+ arm_dcache_align - (((vm_offset_t)(buf) + len) &
+ arm_dcache_align_mask));
+ }
}
cpu_dcache_inv_range((vm_offset_t)buf, len);
cpu_l2cache_inv_range((vm_offset_t)buf, len);
- if ((vm_offset_t)buf & arm_dcache_align_mask)
- memcpy((void *)((vm_offset_t)buf &
- ~arm_dcache_align_mask), _tmp_cl,
- (vm_offset_t)buf & arm_dcache_align_mask);
- if (((vm_offset_t)buf + len) & arm_dcache_align_mask)
- memcpy((void *)((vm_offset_t)buf + len), _tmp_clend,
- arm_dcache_align - (((vm_offset_t)(buf) + len) &
- arm_dcache_align_mask));
+ if (partial) {
+ if ((vm_offset_t)buf & arm_dcache_align_mask)
+ memcpy((void *)((vm_offset_t)buf &
+ ~arm_dcache_align_mask), _tmp_cl,
+ (vm_offset_t)buf & arm_dcache_align_mask);
+ if (((vm_offset_t)buf + len) & arm_dcache_align_mask)
+ memcpy((void *)((vm_offset_t)buf + len), _tmp_clend,
+ arm_dcache_align - (((vm_offset_t)(buf) + len) &
+ arm_dcache_align_mask));
+ intr_restore(intr);
+ }
}
-
- intr_restore(intr);
}
static void
# HG changeset patch
# User Ian Lepore
# Date 1335024426 21600
# Sat Apr 21 10:07:06 2012 -0600
# Node ID 0365aa062b91eb98e2b63ad425219e22795630ac
Re-roll the patch that disables interrupts during busdma partial cacheline
flushes to be more style(9) and KNF compliant. No functional changes.
FreeBSD is finally getting around to committing some of the patches we
submitted last year, and I noticed that the patch submitted for this problem
didn't match the code we've been running for months. The original patch
disables interrupts during the whole function, but we switched to disabling
them only when doing a partial cacheline flush. I want to submit our latest
and most-tested code, so doing a style(9) cleanup at the same time should
help get the patch accepted more quickly.
diff -r 33d607a43942 -r 0365aa062b91 sys/arm/arm/busdma_machdep.c
--- a/sys/arm/arm/busdma_machdep.c Fri Apr 20 17:08:21 2012 -0600
+++ b/sys/arm/arm/busdma_machdep.c Sat Apr 21 10:07:06 2012 -0600
@@ -1090,6 +1090,8 @@ void
static void
bus_dmamap_sync_buf(void *buf, int len, bus_dmasync_op_t op)
{
+ uint32_t intr;
+ int partial;
char _tmp_cl[arm_dcache_align], _tmp_clend[arm_dcache_align];
if ((op & BUS_DMASYNC_PREWRITE) && !(op & BUS_DMASYNC_PREREAD)) {
@@ -1107,40 +1109,41 @@ bus_dmamap_sync_buf(void *buf, int len,
}
}
/*
- * Interrupts must be disabled if handling a partial cacheline flush,
+ * Interrupts must be disabled while handling a partial cacheline flush,
* otherwise the interrupt handling code could modify data in the
* non-DMA part of a cacheline while we have it stashed away in the
- * temporary stack buffer, then we end up restoring the stale value. As
- * unlikely as this seems, it has been observed in the real world.
+ * temporary stack buffer, then we end up restoring the stale value.
+ * As unlikely as this seems, it has been observed in the real world.
*/
if (op & BUS_DMASYNC_POSTREAD) {
- uint32_t intr;
- int partial = (((vm_offset_t)buf) | len) & arm_dcache_align_mask;
-
+ partial = (((vm_offset_t)buf) | len) & arm_dcache_align_mask;
if (partial) {
intr = intr_disable();
if ((vm_offset_t)buf & arm_dcache_align_mask) {
- memcpy(_tmp_cl, (void *)((vm_offset_t)buf & ~
- arm_dcache_align_mask),
+ memcpy(_tmp_cl, (void *)((vm_offset_t)buf &
+ ~arm_dcache_align_mask),
(vm_offset_t)buf & arm_dcache_align_mask);
}
if (((vm_offset_t)buf + len) & arm_dcache_align_mask) {
- memcpy(_tmp_clend, (void *)((vm_offset_t)buf + len),
- arm_dcache_align - (((vm_offset_t)(buf) + len) &
+ memcpy(_tmp_clend,
+ (void *)((vm_offset_t)buf + len),
+ arm_dcache_align -
+ (((vm_offset_t)(buf) + len) &
arm_dcache_align_mask));
}
}
cpu_dcache_inv_range((vm_offset_t)buf, len);
cpu_l2cache_inv_range((vm_offset_t)buf, len);
-
if (partial) {
if ((vm_offset_t)buf & arm_dcache_align_mask)
memcpy((void *)((vm_offset_t)buf &
~arm_dcache_align_mask), _tmp_cl,
(vm_offset_t)buf & arm_dcache_align_mask);
if (((vm_offset_t)buf + len) & arm_dcache_align_mask)
- memcpy((void *)((vm_offset_t)buf + len), _tmp_clend,
- arm_dcache_align - (((vm_offset_t)(buf) + len) &
+ memcpy((void *)((vm_offset_t)buf + len),
+ _tmp_clend,
+ arm_dcache_align -
+ (((vm_offset_t)(buf) + len) &
arm_dcache_align_mask));
intr_restore(intr);
}
# HG changeset patch
# User Ian Lepore
# Date 1335114029 21600
# Sun Apr 22 11:00:29 2012 -0600
# Node ID 30010e2f6a55db2f07c0c22c7d03b117e9711183
Our busdma partial cacheline flush fix has been committed to freebsd, with a
few minor differences. This commit syncs our code with freebsd-current as
of 2012-04-22, which gets us these other minor changes in addition to the
final form of our fix:
- A couple datatypes are changed between bus_size_t and bus_addr_t. This
is a no-op for us (both are u_long types for our chip) but is more correct
for a platform with 64-bit addressing.
- Non-standard min() and max() invocations changed to MIN() and MAX().
- In bus_dmamap_destroy(), the list of bounce pages is walked and freed
before freeing the dma map that contains the list head. (Nothing we
do uses bounce pages right now.)
diff -r 0365aa062b91 -r 30010e2f6a55 sys/arm/arm/busdma_machdep.c
--- a/sys/arm/arm/busdma_machdep.c Sat Apr 21 10:07:06 2012 -0600
+++ b/sys/arm/arm/busdma_machdep.c Sun Apr 22 11:00:29 2012 -0600
@@ -29,7 +29,7 @@
*/
#include <sys/cdefs.h>
-__FBSDID("$FreeBSD: sys/arm/arm/busdma_machdep.c 205193 2010-03-15 19:59:16Z raj $");
+__FBSDID("$FreeBSD: head/sys/arm/arm/busdma_machdep.c 234561 2012-04-22 00:58:04Z marius $");
/*
* ARM bus dma support routines
@@ -68,7 +68,7 @@ struct bounce_zone;
struct bus_dma_tag {
bus_dma_tag_t parent;
bus_size_t alignment;
- bus_size_t boundary;
+ bus_addr_t boundary;
bus_addr_t lowaddr;
bus_addr_t highaddr;
bus_dma_filter_t *filter;
@@ -126,7 +126,7 @@ static int total_bpages;
static int busdma_zonecount;
static STAILQ_HEAD(, bounce_zone) bounce_zone_list;
-SYSCTL_NODE(_hw, OID_AUTO, busdma, CTLFLAG_RD, 0, "Busdma parameters");
+static SYSCTL_NODE(_hw, OID_AUTO, busdma, CTLFLAG_RD, 0, "Busdma parameters");
SYSCTL_INT(_hw_busdma, OID_AUTO, total_bpages, CTLFLAG_RD, &total_bpages, 0,
"Total bounce pages");
@@ -332,7 +332,7 @@ static __inline void
int
bus_dma_tag_create(bus_dma_tag_t parent, bus_size_t alignment,
- bus_size_t boundary, bus_addr_t lowaddr,
+ bus_addr_t boundary, bus_addr_t lowaddr,
bus_addr_t highaddr, bus_dma_filter_t *filter,
void *filterarg, bus_size_t maxsize, int nsegments,
bus_size_t maxsegsz, int flags, bus_dma_lock_t *lockfunc,
@@ -378,12 +378,12 @@ bus_dma_tag_create(bus_dma_tag_t parent,
* Take into account any restrictions imposed by our parent tag
*/
if (parent != NULL) {
- newtag->lowaddr = min(parent->lowaddr, newtag->lowaddr);
- newtag->highaddr = max(parent->highaddr, newtag->highaddr);
+ newtag->lowaddr = MIN(parent->lowaddr, newtag->lowaddr);
+ newtag->highaddr = MAX(parent->highaddr, newtag->highaddr);
if (newtag->boundary == 0)
newtag->boundary = parent->boundary;
else if (parent->boundary != 0)
- newtag->boundary = min(parent->boundary,
+ newtag->boundary = MIN(parent->boundary,
newtag->boundary);
if ((newtag->filter != NULL) ||
((parent->flags & BUS_DMA_COULD_BOUNCE) != 0))
@@ -555,12 +555,12 @@ int
bus_dmamap_destroy(bus_dma_tag_t dmat, bus_dmamap_t map)
{
- _busdma_free_dmamap(map);
if (STAILQ_FIRST(&map->bpages) != NULL) {
CTR3(KTR_BUSDMA, "%s: tag %p error %d",
__func__, dmat, EBUSY);
return (EBUSY);
}
+ _busdma_free_dmamap(map);
if (dmat->bounce_zone)
dmat->bounce_zone->map_count--;
dmat->map_count--;
@@ -1090,17 +1090,17 @@ void
static void
bus_dmamap_sync_buf(void *buf, int len, bus_dmasync_op_t op)
{
- uint32_t intr;
+ char _tmp_cl[arm_dcache_align], _tmp_clend[arm_dcache_align];
+ register_t s;
int partial;
- char _tmp_cl[arm_dcache_align], _tmp_clend[arm_dcache_align];
if ((op & BUS_DMASYNC_PREWRITE) && !(op & BUS_DMASYNC_PREREAD)) {
cpu_dcache_wb_range((vm_offset_t)buf, len);
cpu_l2cache_wb_range((vm_offset_t)buf, len);
}
+ partial = (((vm_offset_t)buf) | len) & arm_dcache_align_mask;
if (op & BUS_DMASYNC_PREREAD) {
- if (!(op & BUS_DMASYNC_PREWRITE) &&
- ((((vm_offset_t)(buf) | len) & arm_dcache_align_mask) == 0)) {
+ if (!(op & BUS_DMASYNC_PREWRITE) && !partial) {
cpu_dcache_inv_range((vm_offset_t)buf, len);
cpu_l2cache_inv_range((vm_offset_t)buf, len);
} else {
@@ -1108,29 +1108,18 @@ bus_dmamap_sync_buf(void *buf, int len,
cpu_l2cache_wbinv_range((vm_offset_t)buf, len);
}
}
- /*
- * Interrupts must be disabled while handling a partial cacheline flush,
- * otherwise the interrupt handling code could modify data in the
- * non-DMA part of a cacheline while we have it stashed away in the
- * temporary stack buffer, then we end up restoring the stale value.
- * As unlikely as this seems, it has been observed in the real world.
- */
if (op & BUS_DMASYNC_POSTREAD) {
- partial = (((vm_offset_t)buf) | len) & arm_dcache_align_mask;
if (partial) {
- intr = intr_disable();
- if ((vm_offset_t)buf & arm_dcache_align_mask) {
+ s = intr_disable();
+ if ((vm_offset_t)buf & arm_dcache_align_mask)
memcpy(_tmp_cl, (void *)((vm_offset_t)buf &
~arm_dcache_align_mask),
(vm_offset_t)buf & arm_dcache_align_mask);
- }
- if (((vm_offset_t)buf + len) & arm_dcache_align_mask) {
+ if (((vm_offset_t)buf + len) & arm_dcache_align_mask)
memcpy(_tmp_clend,
(void *)((vm_offset_t)buf + len),
- arm_dcache_align -
- (((vm_offset_t)(buf) + len) &
- arm_dcache_align_mask));
- }
+ arm_dcache_align - (((vm_offset_t)(buf) +
+ len) & arm_dcache_align_mask));
}
cpu_dcache_inv_range((vm_offset_t)buf, len);
cpu_l2cache_inv_range((vm_offset_t)buf, len);
@@ -1141,11 +1130,10 @@ bus_dmamap_sync_buf(void *buf, int len,
(vm_offset_t)buf & arm_dcache_align_mask);
if (((vm_offset_t)buf + len) & arm_dcache_align_mask)
memcpy((void *)((vm_offset_t)buf + len),
- _tmp_clend,
- arm_dcache_align -
+ _tmp_clend, arm_dcache_align -
(((vm_offset_t)(buf) + len) &
arm_dcache_align_mask));
- intr_restore(intr);
+ intr_restore(s);
}
}
}
[-- Attachment #3 --]
# HG changeset patch
# User Ian Lepore
# Date 1316633416 21600
# Wed Sep 21 13:30:16 2011 -0600
# Node ID 9669e5a3f2220fdc919eeed49925c77131ddbe07
Work around the disabled-icache performance problem with shared libraries by
using mmap()/memcpy() rather than pread() to read the first page of the shared
lib. Any use of "normal IO" on an executable leaves a kernel-writable mapping
of the pages in the vfs buffer cache, and as long as a writable mapping exists
the icache bit will be turned off on that page. Making the problem even worse,
using [p]read() triggers a readahead or cluster-read operation so it's not
just a single 4k page, it's up to 256k (which for many shared libs is the
whole library). When an executable is mmap()'d and then the pages are
brought in by the pager the first time they're referenced, the kernel writable
mapping created to do so is deleted as soon as the IO is done, so mmap-based
IO doesn't have the same problem as "normal" IO. (But it appears to be
non-trivial to make "normal" IO use temporary mappings like mmap does.)
This is a better workaround than using static linking on our apps, because
this helps all apps (system daemons, etc) that use shared libs. But this does
not fix the problem of doing "normal IO" things to our apps, like using scp
to copy a new version onto the box -- in that case normal IO has happened and
icache will be disabled on that app until the next reboot or something flushes
those buffers out of the cache (and that pretty much never happens unless you
unmount the filesystem).
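As a standalone illustration of the trick (my sketch, not part of the
patch; note it passes MAP_PRIVATE where the rtld change below passes 0
for the mmap() flags):

#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Read the first page of a file through a transient read-only mapping
 * instead of pread(), so no readahead/cluster-read is triggered and no
 * writable buffer-cache mapping is left behind on the file's pages.
 */
static ssize_t
read_first_page(const char *path, char *buf, size_t pagesz)
{
	void *m;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return (-1);
	m = mmap(NULL, pagesz, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);
	if (m == MAP_FAILED)
		return (-1);
	memcpy(buf, m, pagesz);
	munmap(m, pagesz);
	return ((ssize_t)pagesz);
}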
diff -r f3bc3736869f -r 9669e5a3f222 libexec/rtld-elf/map_object.c
--- a/libexec/rtld-elf/map_object.c Wed Sep 21 10:06:17 2011 -0600
+++ b/libexec/rtld-elf/map_object.c Wed Sep 21 13:30:16 2011 -0600
@@ -272,11 +272,14 @@ get_elf_header (int fd, const char *path
char buf[PAGE_SIZE];
} u;
ssize_t nbytes;
+ void * mapped;
- if ((nbytes = pread(fd, u.buf, PAGE_SIZE, 0)) == -1) {
- _rtld_error("%s: read error: %s", path, strerror(errno));
- return NULL;
- }
+ /* Use mmap() + memcpy() rather than [p]read() to avoid readahead. */
+ if ((mapped = mmap(NULL, PAGE_SIZE, PROT_READ, 0, fd, 0)) == MAP_FAILED)
+ return NULL;
+ memcpy(u.buf, mapped, sizeof(u.buf));
+ munmap(mapped, PAGE_SIZE);
+ nbytes = sizeof(u.buf);
/* Make sure the file is valid */
if (nbytes < (ssize_t)sizeof(Elf_Ehdr) || !IS_ELF(u.hdr)) {
[-- Attachment #4 --]
# HG changeset patch
# User Ian Lepore
# Date 1314822903 21600
# Wed Aug 31 14:35:03 2011 -0600
# Node ID eea9dd75de6bb6e6f85b0640d7c7de2bb3a55696
On arm, make the minimum kernel malloc size equal to the arm cache line size.
In other words, don't let unrelated small allocations share a cache line, to
try to minimize impact when there's trouble with cache coherency logic.
(This increases the minimum alloc size from 16 to 32 bytes.)
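Concretely, with arm_dcache_align == 32 the clamp below behaves like
this (illustrative sizes):

	malloc(8, ...)   -> treated as malloc(32, ...)
	malloc(24, ...)  -> treated as malloc(32, ...)
	malloc(40, ...)  -> unchanged (already >= the 32-byte line size;
	                    as noted above, the clamp only enforces a
	                    minimum and doesn't stop larger allocations
	                    from straddling line boundaries)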
diff -r 78911e2a8696 -r eea9dd75de6b sys/kern/kern_malloc.c
--- a/sys/kern/kern_malloc.c Wed Aug 31 12:11:10 2011 -0600
+++ b/sys/kern/kern_malloc.c Wed Aug 31 14:35:03 2011 -0600
@@ -94,6 +94,24 @@ dtrace_malloc_probe_func_t dtrace_malloc
#endif
/*
+ * Symmetricom-local hack: If building for arm, force the minimum allocation
+ * size to be the size of a cache line. This helps work around problems where
+ * doing DMA into a buffer that straddles a cache line boundary can lead to data
+ * corruption when context switches happen while DMA is in progress. It should
+ * be noted that this IS NOT A FIX for that problem, it's just a cheap way to
+ * make the problem less likely, especially given that the USB driver in 6.x
+ * allocates small temp buffers for IO using malloc(9).
+ *
+ * When not building for arm, use MINALLOCSIZE (which is tied to UMA's min).
+ */
+#if defined(__arm__)
+#include <machine/cpufunc.h>
+#define LOCAL_MINALLOCSIZE arm_dcache_align
+#else
+#define LOCAL_MINALLOCSIZE MINALLOCSIZE
+#endif
+
+/*
* When realloc() is called, if the new size is sufficiently smaller than
* the old size, realloc() will allocate a new, smaller block to avoid
* wasting memory. 'Sufficiently smaller' is defined as: newsize <=
@@ -364,6 +382,9 @@ malloc(unsigned long size, struct malloc
unsigned long osize = size;
#endif
+ if (size < LOCAL_MINALLOCSIZE)
+ size = LOCAL_MINALLOCSIZE;
+
#ifdef INVARIANTS
KASSERT(mtp->ks_magic == M_MAGIC, ("malloc: bad malloc type magic"));
/*
@@ -525,6 +546,9 @@ realloc(void *addr, unsigned long size,
if (addr == NULL)
return (malloc(size, mtp, flags));
+ if (size != 0 && size < LOCAL_MINALLOCSIZE)
+ size = LOCAL_MINALLOCSIZE;
+
/*
* XXX: Should report free of old memory and alloc of new memory to
* per-CPU stats.
[-- Attachment #5 --]
# Note: three patches in this file.
#
# HG changeset patch
# User Ian Lepore
# Date 1318341352 21600
# Tue Oct 11 07:55:52 2011 -0600
# Node ID 404447680b533879224d3d679f11ade1fa45ba8c
Properly initialize the ARM global per-thread-data page. It gets allocated
with the VM_ALLOC_ZERO flag, but that does not guarantee that the page
returned has been zeroed; it just expresses a preference. So check the page
flags, and if a zeroed page was not provided, call bzero().
Also, init the global RAS_END value to 0xffffffff, because that's a required
pre-condition of the RAS scheme when there is no atomic sequence running.
diff -r 93a1e9567ee3 -r 404447680b53 sys/arm/arm/machdep.c
--- a/sys/arm/arm/machdep.c Tue Oct 11 07:25:58 2011 -0600
+++ b/sys/arm/arm/machdep.c Tue Oct 11 07:55:52 2011 -0600
@@ -312,6 +312,9 @@ cpu_startup(void *dummy)
m = vm_page_alloc(NULL, 0, VM_ALLOC_NOOBJ | VM_ALLOC_ZERO);
pmap_kenter_user(ARM_TP_ADDRESS, VM_PAGE_TO_PHYS(m));
#endif
+ if ((m->flags & PG_ZERO) == 0)
+ bzero((void*)ARM_TP_ADDRESS, PAGE_SIZE);
+ *(uint32_t *)ARM_RAS_END = 0xffffffff;
}
SYSINIT(cpu, SI_SUB_CPU, SI_ORDER_FIRST, cpu_startup, NULL);
# HG changeset patch
# User Ian Lepore
# Date 1318342243 21600
# Tue Oct 11 08:10:43 2011 -0600
# Node ID 24b9862c3ca5ae479a5a0397d7dada667ca1da1d
Oops, move the code to zero the thread-data page inside the #ifdef/#else
block, so that if ARM_CACHE_LOCK_ENABLE is defined it doesn't end up
referencing an undefined variable 'm'. Also zero the page in the other
leg of the #ifdef/else, although it's not clear to me that the code in
the other leg would still work these days.
diff -r 404447680b53 -r 24b9862c3ca5 sys/arm/arm/machdep.c
--- a/sys/arm/arm/machdep.c Tue Oct 11 07:55:52 2011 -0600
+++ b/sys/arm/arm/machdep.c Tue Oct 11 08:10:43 2011 -0600
@@ -308,12 +308,13 @@ cpu_startup(void *dummy)
#ifdef ARM_CACHE_LOCK_ENABLE
pmap_kenter_user(ARM_TP_ADDRESS, ARM_TP_ADDRESS);
arm_lock_cache_line(ARM_TP_ADDRESS);
+ bzero((void*)ARM_TP_ADDRESS, PAGE_SIZE);
#else
m = vm_page_alloc(NULL, 0, VM_ALLOC_NOOBJ | VM_ALLOC_ZERO);
pmap_kenter_user(ARM_TP_ADDRESS, VM_PAGE_TO_PHYS(m));
+ if ((m->flags & PG_ZERO) == 0)
+ bzero((void*)ARM_TP_ADDRESS, PAGE_SIZE);
#endif
- if ((m->flags & PG_ZERO) == 0)
- bzero((void*)ARM_TP_ADDRESS, PAGE_SIZE);
*(uint32_t *)ARM_RAS_END = 0xffffffff;
}
# HG changeset patch
# User Ian Lepore
# Date 1320936167 25200
# Thu Nov 10 07:42:47 2011 -0700
# Node ID 381865a62fbb370a560da9b5ce4affeee986f362
One of our RAS patches was accepted by FreeBSD in a slightly different form
than we submitted. Bring our repo into line with what they did, to smooth
future imports.
diff -r eacb60bcc0d3 -r 381865a62fbb sys/arm/arm/machdep.c
--- a/sys/arm/arm/machdep.c Mon Oct 31 09:33:07 2011 -0600
+++ b/sys/arm/arm/machdep.c Thu Nov 10 07:42:47 2011 -0700
@@ -308,13 +308,11 @@ cpu_startup(void *dummy)
#ifdef ARM_CACHE_LOCK_ENABLE
pmap_kenter_user(ARM_TP_ADDRESS, ARM_TP_ADDRESS);
arm_lock_cache_line(ARM_TP_ADDRESS);
- bzero((void*)ARM_TP_ADDRESS, PAGE_SIZE);
#else
m = vm_page_alloc(NULL, 0, VM_ALLOC_NOOBJ | VM_ALLOC_ZERO);
pmap_kenter_user(ARM_TP_ADDRESS, VM_PAGE_TO_PHYS(m));
- if ((m->flags & PG_ZERO) == 0)
- bzero((void*)ARM_TP_ADDRESS, PAGE_SIZE);
#endif
+ *(uint32_t *)ARM_RAS_START = 0;
*(uint32_t *)ARM_RAS_END = 0xffffffff;
}
[-- Attachment #6 --]
# HG changeset patch
# User Ian Lepore
# Date 1314482130 21600
# Sat Aug 27 15:55:30 2011 -0600
# Node ID 920ecb84db20bd8e782acf5ff254bb4f2e9c47e9
Fixes for a couple related bugs in the ARM RAS logic. The bugs provide a small
(exceedingly small) window during which a restartable atomic sequence might not
get restarted as it should, causing operations from atomic.h to fail. This
might result in things such as handing the same mutex to two different threads
concurrently or failing to properly unlock a mutex. (Note: *might*. These
problems were found by analysis and no confirmed sighting has ever happened.
On the other hand, any number of problems we've seen *might* have been caused
by a RAS failure.)
These are the fixes I developed in 2008, but they never got checked in.
Testing-wise, I've run DVB with these changes for months without problems.
Here's more info on the problems and how these changes fix it...
The "normal" RAS sequence goes like this:
On entry, RAS_START = 0x00000000, RAS_END = 0xffffffff
1: Set RAS_START to address of Step 1
2: Set RAS_END to address of Step 4
3: Do the atomic operation
4: Set RAS_START to 0x00000000
5: Set RAS_END to 0xffffffff
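In code, a RAS-protected atomic op looks roughly like this sketch of an
atomic add (reconstructed from memory in the spirit of the armv4/v5
userland atomic.h, not a verbatim copy; ARM_RAS_START comes from
machine/sysarch.h and RAS_END lives at ARM_RAS_START + 4):

#include <stdint.h>
#include <machine/sysarch.h>	/* ARM_RAS_START */

static __inline void
atomic_add_32_ras(volatile uint32_t *p, uint32_t v)
{
	uint32_t tmp;

	__asm __volatile(
	"1:	adr	%0, 1b		\n" /* step 1: RAS_START = &step 1 */
	"	str	%0, [%1]	\n"
	"	adr	%0, 2f		\n" /* step 2: RAS_END = &step 4   */
	"	str	%0, [%1, #4]	\n"
	"	ldr	%0, [%2]	\n" /* step 3: the interruptible   */
	"	add	%0, %0, %3	\n" /*   load-modify-store body    */
	"	str	%0, [%2]	\n"
	"2:	mov	%0, #0		\n" /* step 4: RAS_START = 0       */
	"	str	%0, [%1]	\n"
	"	mvn	%0, #0		\n" /* step 5: RAS_END = 0xffffffff */
	"	str	%0, [%1, #4]	\n"
	    : "=&r" (tmp)
	    : "r" (ARM_RAS_START), "r" (p), "r" (v)
	    : "memory");
}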
While the normal entry condition is for RAS_START/END to be 0x0/0xffffffff
the code in PUSHFRAMEINSVC does not reliably enforce this, which is the
source of one of the two bugs in the current implementation. The other
bug is that the PUSHFRAMEINSVC code compares the PC to RAS_END using a
signed compare/branch sequence.
The first bug happens with interaction between a pair of threads doing RAS
sequences at the same time. Thread 1 runs through step 4 of the sequence
and gets interrupted before completing step 5. The PUSHFRAMEINSVC code
sees a zero in RAS_START and does nothing; notably it does not change the
value in RAS_END because START was zero. Thread 2 begins to run and gets
interrupted between steps 1 and 2. Now PUSHFRAMEINSVC sees a non-zero
RAS_START, and the RAS_END value is whatever thread 1 left there when it
got interrupted. The behavior now depends on whether the RAS sequence in
thread 2 is at a higher or lower memory address than the one in thread 1.
In one case it accidentally works right, in the other case an interrupt
between steps 1 and 2 wouldn't lead to a proper restart because the PC is
not between RAS_START and RAS_END. But because RAS_START was non-zero
when thread 2 got interrupted, RAS_START/END get reset, so when thread 2
gets resumed at step 2 it never reloads RAS_START, and another interrupt
in the sequence leads to misbehavior.
The second bug (signed compare) is conceptually simpler, here is a
sequence that triggers that bug:
On entry, RAS_START/END are 0x0/0xffffffff. A thread enters a RAS
sequence and completes step 1, then gets interrupted before completing
step 2. The PUSHFRAMEINSVC code sees a non-zero RAS_START so it compares
the PC to RAS_END. Because of the signed compare, the PC will never be
between RAS_START and -1, so the PUSHFRAMEINSVC code doesn't set up for a
restart at the RAS_START location. When the thread eventually resumes at
step 2, RAS_START will be zero, and if it gets interrupted again before
the end of the atomic operation it would not get restarted.
diff -r bb14ddf42374 -r 920ecb84db20 sys/arm/include/asmacros.h
--- a/sys/arm/include/asmacros.h Wed Aug 24 10:19:37 2011 -0600
+++ b/sys/arm/include/asmacros.h Sat Aug 27 15:55:30 2011 -0600
@@ -71,9 +71,8 @@
ldr r0, =ARM_RAS_START; \
mov r1, #0; \
str r1, [r0]; \
- ldr r0, =ARM_RAS_END; \
mov r1, #0xffffffff; \
- str r1, [r0];
+ str r1, [r0, #4];
/*
* PULLFRAME - macro to pull a trap frame from the stack in the current mode
@@ -118,23 +117,22 @@
stmia sp, {r0-r12}; /* Push the user mode registers */ \
add r0, sp, #(4*13); /* Adjust the stack pointer */ \
stmia r0, {r13-r14}^; /* Push the user mode registers */ \
- mov r0, r0; /* NOP for previous instruction */ \
- ldr r5, =ARM_RAS_START; /* Check if there's any RAS */ \
- ldr r3, [r5]; \
- cmp r3, #0; /* Is the update needed ? */ \
- ldrgt lr, [r0, #16]; \
- ldrgt r1, =ARM_RAS_END; \
- ldrgt r4, [r1]; /* Get the end of the RAS */ \
- movgt r2, #0; /* Reset the magic addresses */ \
- strgt r2, [r5]; \
- movgt r2, #0xffffffff; \
- strgt r2, [r1]; \
- cmpgt lr, r3; /* Were we in the RAS ? */ \
- cmpgt r4, lr; \
- strgt r3, [r0, #16]; /* Yes, update the pc */ \
- mrs r0, spsr_all; /* Put the SPSR on the stack */ \
- str r0, [sp, #-4]!
-
+ mov r0, r0; /* NOP for previous instruction */ \
+ ldr r5, =ARM_RAS_START; /* Retrieve global RAS_END and */ \
+ ldr r4, [r5, #4]; /* reset it to point at the */ \
+ cmp r4, #0xffffffff; /* end of memory if necessary; */ \
+ movne r1, #0xffffffff; /* leave value in r4 for later */ \
+ strne r1, [r5, #4]; /* comparision against PC. */ \
+ ldr r3, [r5]; /* Retrieve global RAS_START */ \
+ cmp r3, #0; /* and reset it if non-zero. */ \
+ movne r1, #0; /* If non-zero RAS_START and */ \
+ strne r1, [r5]; /* PC was lower than RAS_END, */ \
+ ldrne r1, [r0, #16]; /* adjust the saved PC so that */ \
+ cmpne r4, r1; /* execution later resumes at */ \
+ strhi r3, [r0, #16]; /* the RAS_START location. */ \
+ mrs r0, spsr_all; \
+ str r0, [sp, #-4]!
+
/*
* PULLFRAMEFROMSVCANDEXIT - macro to pull a trap frame from the stack
* in SVC32 mode and restore the saved processor mode and PC.
diff -r bb14ddf42374 -r 920ecb84db20 sys/arm/include/sysarch.h
--- a/sys/arm/include/sysarch.h Wed Aug 24 10:19:37 2011 -0600
+++ b/sys/arm/include/sysarch.h Sat Aug 27 15:55:30 2011 -0600
@@ -42,9 +42,13 @@
* The ARM_TP_ADDRESS points to a special purpose page, which is used as local
* store for the ARM per-thread data and Restartable Atomic Sequences support.
* Put it just above the "high" vectors' page.
- * the cpu_switch() code assumes ARM_RAS_START is ARM_TP_ADDRESS + 4, and
+ * The cpu_switch() code assumes ARM_RAS_START is ARM_TP_ADDRESS + 4, and
* ARM_RAS_END is ARM_TP_ADDRESS + 8, so if that ever changes, be sure to
* update the cpu_switch() (and cpu_throw()) code as well.
+ * In addition, code in arm/include/atomic.h and arm/include/asmacros.h
+ * assumes that ARM_RAS_END is at ARM_RAS_START+4, so be sure to update those
+ * if ARM_RAS_END moves in relation to ARM_RAS_START (look for occurrances
+ * of ldr/str rm,[rn, #4]).
*/
#define ARM_TP_ADDRESS (ARM_VECTORS_HIGH + 0x1000)
#define ARM_RAS_START (ARM_TP_ADDRESS + 4)
[-- Attachment #7 --]
# HG changeset patch
# User Ian Lepore
# Date 1323548504 25200
# Sat Dec 10 13:21:44 2011 -0700
# Node ID 02bcdbcdb0fca09a9212267b9372e2f5f93e194f
Pull in a patch freebsd recently applied to fix a compiler code gen bug for arm.
The bug would generate bad code under some conditions when the -fstack-protector
option is used (which it is when compiling freebsd code).
For more info, see...
http://www.freebsd.org/cgi/query-pr.cgi?pr=161128
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35965
diff -r 381865a62fbb -r 02bcdbcdb0fc contrib/gcc/config/arm/arm.c
--- a/contrib/gcc/config/arm/arm.c Thu Nov 10 07:42:47 2011 -0700
+++ b/contrib/gcc/config/arm/arm.c Sat Dec 10 13:21:44 2011 -0700
@@ -3217,7 +3217,8 @@ legitimize_pic_address (rtx orig, enum m
gcc_assert (!no_new_pseudos);
if (arm_pic_register != INVALID_REGNUM)
{
- cfun->machine->pic_reg = gen_rtx_REG (Pmode, arm_pic_register);
+ if (!cfun->machine->pic_reg)
+ cfun->machine->pic_reg = gen_rtx_REG (Pmode, arm_pic_register);
/* Play games to avoid marking the function as needing pic
if we are being called as part of the cost-estimation
@@ -3229,7 +3230,8 @@ legitimize_pic_address (rtx orig, enum m
{
rtx seq;
- cfun->machine->pic_reg = gen_reg_rtx (Pmode);
+ if (!cfun->machine->pic_reg)
+ cfun->machine->pic_reg = gen_reg_rtx (Pmode);
/* Play games to avoid marking the function as needing pic
if we are being called as part of the cost-estimation