From owner-freebsd-arm@FreeBSD.ORG Sat Dec 15 20:00:00 2012 Return-Path: Delivered-To: freebsd-arm@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C669060D for ; Sat, 15 Dec 2012 20:00:00 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 92A648FC19 for ; Sat, 15 Dec 2012 20:00:00 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id qBFK00cN096503 for ; Sat, 15 Dec 2012 20:00:00 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id qBFK00Ke096502; Sat, 15 Dec 2012 20:00:00 GMT (envelope-from gnats) Resent-Date: Sat, 15 Dec 2012 20:00:00 GMT Resent-Message-Id: <201212152000.qBFK00Ke096502@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-arm@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Ian Lepore Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7D66B549 for ; Sat, 15 Dec 2012 19:57:43 +0000 (UTC) (envelope-from ilepore@damnhippie.dyndns.org) Received: from duck.symmetricom.us (duck.symmetricom.us [206.168.13.214]) by mx1.freebsd.org (Postfix) with ESMTP id 49A2C8FC14 for ; Sat, 15 Dec 2012 19:57:42 +0000 (UTC) Received: from damnhippie.dyndns.org (daffy.symmetricom.us [206.168.13.218]) by duck.symmetricom.us (8.14.5/8.14.5) with ESMTP id qBFJvfpn058813 for ; Sat, 15 Dec 2012 12:57:42 -0700 (MST) (envelope-from ilepore@damnhippie.dyndns.org) Received: from revolution.hippie.lan (revolution.hippie.lan [172.22.42.240]) by damnhippie.dyndns.org (8.14.3/8.14.3) with ESMTP id qBFJvdBX060592 for ; Sat, 15 Dec 2012 12:57:39 -0700 (MST) (envelope-from ilepore@damnhippie.dyndns.org) Received: (from ilepore@localhost) by revolution.hippie.lan (8.14.5/8.14.4/Submit) id qBFJvdUo044496; Sat, 15 Dec 2012 12:57:39 -0700 (MST) (envelope-from ilepore) Message-Id: <201212151957.qBFJvdUo044496@revolution.hippie.lan> Date: Sat, 15 Dec 2012 12:57:39 -0700 (MST) From: Ian Lepore To: FreeBSD-gnats-submit@FreeBSD.org X-Send-Pr-Version: 3.113 Subject: arm/174461: [patch] Fix off-by-one in arm9/arm10 cache maintenance routines X-BeenThere: freebsd-arm@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Ian Lepore List-Id: Porting FreeBSD to the StrongARM Processor List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Dec 2012 20:00:00 -0000 >Number: 174461 >Category: arm >Synopsis: [patch] Fix off-by-one in arm9/arm10 cache maintenance routines >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-arm >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Sat Dec 15 20:00:00 UTC 2012 >Closed-Date: >Last-Modified: >Originator: Ian Lepore >Release: FreeBSD 10.0-CURRENT arm >Organization: Symmetricom, Inc. >Environment: FreeBSD dpcur 10.0-CURRENT FreeBSD 10.0-CURRENT #23 r243920M: Sat Dec 15 11:31:47 MST 2012 ilepore@revolution.hippie.lan:/local/build/staging/freebsd/dp10/obj/arm.arm/local/build/staging/freebsd/dp10/src/sys/DP arm >Description: In all the routines that loop through a range of virtual addresses, the loop is controlled by subtracting the cache line size from the total length of the request. After the subtract, a 'bpl' instruction was used, which branches if the result of the subtraction is zero or greater, but we need to exit the loop when the count hits zero. Thus, all the bpl instructions in those loops have been changed to 'bhi' (branch if greater than zero). In addition, the two routines that walk through the cache using set-and-index were correct, but confusing. The loop control for those has been simplified, just so that it's easier to see by examination that the code is correct. Routines for other arm architectures and generations still have the bpl instruction, but compensate for the off-by-one situation by decrementing the count register by one before entering the loop. Just for the sake of consistancy, these should probably all be changed to remove the decrement and use the correct branch instruction. That would then make it easier to see what appears superficially to be lots of duplication in these routines, and some consolidation could happen. >How-To-Repeat: Build a kernel for Atmel ARM with INVARIANTS and INVARIANT_SUPPORT enabled. When it boots, the uart devices fail to instantiate. The reason was that the driver allocates a dma tag, then a dma buffer, and accidentally the tag ended up laid out in memory immediately after the buffer. When the driver did the bus_dmamap_sync(PREREAD) on the buffer, the off-by-one error caused the cache line immediately following that buffer (the first 32 bytes of the dma tag) to be erroniously invalidated. Because of INVARIANTS, the physical memory under that dirty cache line was full of 0xdeadc0de, so the invalidate operation corrupted the dma tag. When the driver then attempted to allocate another buffer using that tag it failed, because the 0xdeadc0de values in the tag led to insane allocation decisions that caused malloc() to fail. Without INVARIANTS, either things end up in different places in memory, or the values in underlying memory that become exposed after the bad invalidate are harmless; either way, it accidentally works right most of the time. >Fix: --- arm9_arm10_cacheops_offbyone_fix.diff begins here --- diff -r 0f2004466772 sys/arm/arm/cpufunc_asm_arm10.S --- sys/arm/arm/cpufunc_asm_arm10.S Thu Dec 06 08:24:00 2012 -0700 +++ sys/arm/arm/cpufunc_asm_arm10.S Sat Dec 15 11:30:41 2012 -0700 @@ -87,7 +87,7 @@ ENTRY_NP(arm10_icache_sync_range) mcr p15, 0, r0, c7, c10, 1 /* Clean D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm10_sync_next + bhi .Larm10_sync_next mcr p15, 0, r0, c7, c10, 4 /* drain the write buffer */ bx lr @@ -108,12 +108,10 @@ ENTRY_NP(arm10_icache_sync_all) orr ip, s_max, i_max .Lnext_index: mcr p15, 0, ip, c7, c10, 2 /* Clean D cache SE with Set/Index */ - sub ip, ip, i_inc - tst ip, i_max /* Index 0 is last one */ - bne .Lnext_index /* Next index */ - mcr p15, 0, ip, c7, c10, 2 /* Clean D cache SE with Set/Index */ + subs ip, ip, i_inc + bhs .Lnext_index /* Next index */ subs s_max, s_max, s_inc - bpl .Lnext_set /* Next set */ + bhs .Lnext_set /* Next set */ mcr p15, 0, r0, c7, c10, 4 /* drain the write buffer */ bx lr @@ -133,7 +131,7 @@ ENTRY(arm10_dcache_wb_range) mcr p15, 0, r0, c7, c10, 1 /* Clean D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm10_wb_next + bhi .Larm10_wb_next mcr p15, 0, r0, c7, c10, 4 /* drain the write buffer */ bx lr @@ -150,7 +148,7 @@ ENTRY(arm10_dcache_wbinv_range) mcr p15, 0, r0, c7, c14, 1 /* Purge D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm10_wbinv_next + bhi .Larm10_wbinv_next mcr p15, 0, r0, c7, c10, 4 /* drain the write buffer */ bx lr @@ -171,7 +169,7 @@ ENTRY(arm10_dcache_inv_range) mcr p15, 0, r0, c7, c6, 1 /* Invalidate D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm10_inv_next + bhi .Larm10_inv_next mcr p15, 0, r0, c7, c10, 4 /* drain the write buffer */ bx lr @@ -189,7 +187,7 @@ ENTRY(arm10_idcache_wbinv_range) mcr p15, 0, r0, c7, c14, 1 /* Purge D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm10_id_wbinv_next + bhi .Larm10_id_wbinv_next mcr p15, 0, r0, c7, c10, 4 /* drain the write buffer */ bx lr @@ -211,12 +209,10 @@ ENTRY(arm10_dcache_wbinv_all) orr ip, s_max, i_max .Lnext_index_inv: mcr p15, 0, ip, c7, c14, 2 /* Purge D cache SE with Set/Index */ - sub ip, ip, i_inc - tst ip, i_max /* Index 0 is last one */ - bne .Lnext_index_inv /* Next index */ - mcr p15, 0, ip, c7, c14, 2 /* Purge D cache SE with Set/Index */ + subs ip, ip, i_inc + bhs .Lnext_index_inv /* Next index */ subs s_max, s_max, s_inc - bpl .Lnext_set_inv /* Next set */ + bhs .Lnext_set_inv /* Next set */ mcr p15, 0, r0, c7, c10, 4 /* drain the write buffer */ bx lr diff -r 0f2004466772 sys/arm/arm/cpufunc_asm_arm9.S --- sys/arm/arm/cpufunc_asm_arm9.S Thu Dec 06 08:24:00 2012 -0700 +++ sys/arm/arm/cpufunc_asm_arm9.S Sat Dec 15 11:30:41 2012 -0700 @@ -81,7 +81,7 @@ ENTRY_NP(arm9_icache_sync_range) mcr p15, 0, r0, c7, c10, 1 /* Clean D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm9_sync_next + bhi .Larm9_sync_next mov pc, lr ENTRY_NP(arm9_icache_sync_all) @@ -101,12 +101,10 @@ ENTRY_NP(arm9_icache_sync_all) orr ip, s_max, i_max .Lnext_index: mcr p15, 0, ip, c7, c10, 2 /* Clean D cache SE with Set/Index */ - sub ip, ip, i_inc - tst ip, i_max /* Index 0 is last one */ - bne .Lnext_index /* Next index */ - mcr p15, 0, ip, c7, c10, 2 /* Clean D cache SE with Set/Index */ + subs ip, ip, i_inc + bhs .Lnext_index /* Next index */ subs s_max, s_max, s_inc - bpl .Lnext_set /* Next set */ + bhs .Lnext_set /* Next set */ mov pc, lr .Larm9_line_size: @@ -125,7 +123,7 @@ ENTRY(arm9_dcache_wb_range) mcr p15, 0, r0, c7, c10, 1 /* Clean D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm9_wb_next + bhi .Larm9_wb_next mov pc, lr ENTRY(arm9_dcache_wbinv_range) @@ -141,7 +139,7 @@ ENTRY(arm9_dcache_wbinv_range) mcr p15, 0, r0, c7, c14, 1 /* Purge D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm9_wbinv_next + bhi .Larm9_wbinv_next mov pc, lr /* @@ -161,7 +159,7 @@ ENTRY(arm9_dcache_inv_range) mcr p15, 0, r0, c7, c6, 1 /* Invalidate D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm9_inv_next + bhi .Larm9_inv_next mov pc, lr ENTRY(arm9_idcache_wbinv_range) @@ -178,7 +176,7 @@ ENTRY(arm9_idcache_wbinv_range) mcr p15, 0, r0, c7, c14, 1 /* Purge D cache SE with VA */ add r0, r0, ip subs r1, r1, ip - bpl .Larm9_id_wbinv_next + bhi .Larm9_id_wbinv_next mov pc, lr ENTRY_NP(arm9_idcache_wbinv_all) @@ -199,12 +197,10 @@ ENTRY(arm9_dcache_wbinv_all) orr ip, s_max, i_max .Lnext_index_inv: mcr p15, 0, ip, c7, c14, 2 /* Purge D cache SE with Set/Index */ - sub ip, ip, i_inc - tst ip, i_max /* Index 0 is last one */ - bne .Lnext_index_inv /* Next index */ - mcr p15, 0, ip, c7, c14, 2 /* Purge D cache SE with Set/Index */ + subs ip, ip, i_inc + bhs .Lnext_index_inv /* Next index */ subs s_max, s_max, s_inc - bpl .Lnext_set_inv /* Next set */ + bhs .Lnext_set_inv /* Next set */ mov pc, lr .Larm9_cache_data: --- arm9_arm10_cacheops_offbyone_fix.diff ends here --- >Release-Note: >Audit-Trail: >Unformatted: