From owner-freebsd-amd64@FreeBSD.ORG Sun Apr 5 18:30:38 2009 Return-Path: Delivered-To: amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 87C1C10659C6; Sun, 5 Apr 2009 18:30:38 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from smarthost1.sentex.ca (smarthost1.sentex.ca [64.7.153.18]) by mx1.freebsd.org (Postfix) with ESMTP id 48C898FC14; Sun, 5 Apr 2009 18:30:38 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from smtp1.sentex.ca (smtp1c.sentex.ca [64.7.153.10]) by smarthost1.sentex.ca (8.14.3/8.14.3) with ESMTP id n35IUaE9075784; Sun, 5 Apr 2009 14:30:36 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by smtp1.sentex.ca (8.14.3/8.14.3) with ESMTP id n35IUawC040860; Sun, 5 Apr 2009 14:30:36 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: by freebsd-current.sentex.ca (Postfix, from userid 666) id 138117302F; Sun, 5 Apr 2009 14:30:36 -0400 (EDT) Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Message-Id: <20090405183036.138117302F@freebsd-current.sentex.ca> Date: Sun, 5 Apr 2009 14:30:36 -0400 (EDT) X-Virus-Scanned: ClamAV 0.94.1/8983/Thu Feb 12 07:48:01 2009 clamav-milter version 0.94.2 on clamscanner2 X-Virus-Status: Clean X-Scanned-By: MIMEDefang 2.64 on 64.7.153.18 Cc: Subject: [head tinderbox] failure on amd64/amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 05 Apr 2009 18:30:40 -0000 TB --- 2009-04-05 17:20:00 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2009-04-05 17:20:00 - starting HEAD tinderbox run for amd64/amd64 TB --- 2009-04-05 17:20:00 - cleaning the object tree TB --- 2009-04-05 17:21:10 - cvsupping the source tree TB --- 2009-04-05 17:21:10 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/HEAD/amd64/amd64/supfile TB --- 2009-04-05 17:21:17 - building world TB --- 2009-04-05 17:21:17 - MAKEOBJDIRPREFIX=/obj TB --- 2009-04-05 17:21:17 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2009-04-05 17:21:17 - TARGET=amd64 TB --- 2009-04-05 17:21:17 - TARGET_ARCH=amd64 TB --- 2009-04-05 17:21:17 - TZ=UTC TB --- 2009-04-05 17:21:17 - __MAKE_CONF=/dev/null TB --- 2009-04-05 17:21:17 - cd /src TB --- 2009-04-05 17:21:17 - /usr/bin/make -B buildworld >>> World build started on Sun Apr 5 17:21:20 UTC 2009 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything [...] rm -f .depend mkdep -f .depend -a -DRESCUE /src/sbin/routed/rtquery/rtquery.c echo rtquery: /obj/amd64/src/tmp/usr/lib/libc.a /obj/amd64/src/tmp/usr/lib/libmd.a >> .depend cc -O2 -pipe -DRESCUE -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wcast-align -Wunused-parameter -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wno-pointer-sign -c /src/sbin/routed/if.c cc -O2 -pipe -DRESCUE -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wcast-align -Wunused-parameter -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wno-pointer-sign -c /src/sbin/routed/input.c cc1: warnings being treated as errors /src/sbin/routed/input.c: In function 'ck_passwd': /src/sbin/routed/input.c:984: warning: format '%#x' expects type 'unsigned int', but argument 5 has type 'long unsigned int' *** Error code 1 Stop in /src/sbin/routed. *** Error code 1 Stop in /obj/amd64/src/rescue/rescue. *** Error code 1 Stop in /src/rescue/rescue. *** Error code 1 Stop in /src/rescue. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2009-04-05 18:30:35 - WARNING: /usr/bin/make returned exit code 1 TB --- 2009-04-05 18:30:35 - ERROR: failed to build world TB --- 2009-04-05 18:30:35 - 3245.27 user 338.72 system 4235.45 real http://tinderbox.des.no/tinderbox-head-HEAD-amd64-amd64.full From owner-freebsd-amd64@FreeBSD.ORG Mon Apr 6 11:06:50 2009 Return-Path: Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8C30D10656C6 for ; Mon, 6 Apr 2009 11:06:50 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 78A5A8FC26 for ; Mon, 6 Apr 2009 11:06:50 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n36B6oOU061792 for ; Mon, 6 Apr 2009 11:06:50 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n36B6nBk061788 for freebsd-amd64@FreeBSD.org; Mon, 6 Apr 2009 11:06:49 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 6 Apr 2009 11:06:49 GMT Message-Id: <200904061106.n36B6nBk061788@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-amd64@FreeBSD.org X-Mailman-Approved-At: Mon, 06 Apr 2009 11:30:44 +0000 Cc: Subject: Current problem reports assigned to freebsd-amd64@FreeBSD.org X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Apr 2009 11:06:51 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o amd64/132574 amd64 [boot] Freeze on bootstrap loader (CD) using ATA/133 P o amd64/132170 amd64 7.1 kernel compilation problem o amd64/132019 amd64 [install] kernel trap 12 while installation o amd64/131906 amd64 [ata] SATA data corruption with Promise PDC20378 (amd6 o amd64/131456 amd64 ACPI & ATA problems o amd64/131314 amd64 [modules] [panic] large modules fail to load on amd64 o amd64/131209 amd64 [panic] [bce] 7.1-STABLE amd64 crash - m0 NULL f amd64/130885 amd64 sockstat(1) on amd64 does not work o amd64/130864 amd64 [hang] Problem with copying files to a large partition o amd64/130817 amd64 FreeBSD does not support HP DL160G5 [regression] o amd64/130494 amd64 [boot] netbooting BTX fails on amd64 o amd64/130483 amd64 [mxge] MSI must be disabled when Myricom 10Gbps Card i o amd64/130368 amd64 [hang] Switching from xorg to console locks up compute o amd64/129889 amd64 [boot] The booting process stops at the line mounting o amd64/129721 amd64 [hang] Motherboard K9N2G Neo-FD hangs on boot of 7.0-R o amd64/129667 amd64 [ata] Elitegroup A780GM-A IDE controller not recognize o amd64/129488 amd64 [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o amd64/129426 amd64 [panic] FreeBSD 7.0 crash after subdiskXX: detached o amd64/129315 amd64 [boot] amd64 motherboard: Intel DG965WH motherboard co f amd64/128978 amd64 [install] FreeBSD 6.3 64-bit panics at boot time duri o amd64/128810 amd64 AMD 64 port installation o amd64/128765 amd64 [install] Install CD loads to Install choices but stop o amd64/128686 amd64 [ata] can't detect SATA Disk on 8.0-Current with NF550 o amd64/128263 amd64 [panic] 2 amd64 dl380 g5 with dual quadcore xeons, 8 a o amd64/128259 amd64 csh(1): "`" crashes csh o amd64/127640 amd64 gcc(1) will not build shared libraries with -fprofile- o amd64/127492 amd64 [zfs] System hang on ZFS input-output o amd64/127484 amd64 [timecounters] Drift problem with FreeBSD 7.0 and 7.1 o amd64/127451 amd64 [scheduler] incorrect load on quad core o amd64/127397 amd64 [amd64] 32bit application on FreeBSD-6.3 amd64 gets SI s amd64/127276 amd64 ldd(1) invokes linux yes o amd64/127129 amd64 mdconfig(8) is core dumping with Segmentation Fault 11 o amd64/125873 amd64 [smbd] [panic] Repeated kernel panics, trap 12 page fa o amd64/125002 amd64 [install] amd64, SATA hard disks not detected o amd64/124432 amd64 [panic] 7.0-STABLE panic: invalbuf: dirty bufs o amd64/124134 amd64 [kernel] The kernel doesn't follow the calling convent o amd64/123562 amd64 [install] FreeBSD amd64 not installs o amd64/123520 amd64 [ahd] unable to boot from net while using ahd o amd64/123456 amd64 fstat(1): /usr/bin/fstat shows error messages and hang f amd64/123275 amd64 [cbb] [pcmcia] cbb/pcmcia drivers on amd64 failure [re o kern/122782 amd64 [modules] accf_http.ko kernel module is not loadable o amd64/122695 amd64 [cpufreq] Lack of cpufreq control using amd64 eith cor o amd64/122624 amd64 unusable minimal installation of FreeBSD-7.0 o amd64/122549 amd64 7.0-RELEASE-amd64-bootonly.iso doesn't work w/ serial o amd64/122468 amd64 Compile problems after upgrading to 7.0 o amd64/122174 amd64 [panic] 7.0 no longer includes "device atpic" so fails o amd64/121590 amd64 [est] [p4tcc] [acpi_perf] setting dev.cpu.0.freq somet o amd64/121439 amd64 [boot] Installation of FreeBSD 7.0 fails: ACPI problem o amd64/120202 amd64 [amd64] [patch] [panic] kernel panic at start_all_aps, o amd64/119591 amd64 [amd64] [patch] time_t on 64-bit architecture o amd64/117418 amd64 [hang] FreeBSD 6.2 crash on amd64 4400+ with ssh o amd64/117316 amd64 [acpi] ACPI lockups on SuperMicro motherboard o amd64/117296 amd64 [ata] I don`t see second SATA IDE on VIA VT8237A a amd64/117186 amd64 [modules] kldload Unsupported file type on STABLE amd6 s amd64/116689 amd64 [request] support for MSI K9MM-V o amd64/116620 amd64 [hang] ifconfig spins when creating carp(4) device on o amd64/116322 amd64 [panic] At start fsck on current, the system panics o amd64/116159 amd64 [panic] Panic while debugging on CURRENT s amd64/115815 amd64 [ata] [request] Gigabyte GA-M61P-S3 Motherboard unsupp o amd64/115581 amd64 [Makefile] [patch] -mfancy-math-387 has no effect o amd64/115194 amd64 LCD screen remains blank after Dell XPS M1210 lid is c o amd64/114270 amd64 [cpufreq] cpufreq doesnt work when compiled in to kern o amd64/114111 amd64 [nfs] System crashes while writing on NFS-mounted shar f amd64/113021 amd64 [re] ASUS M2A-VM onboard NIC does not work o amd64/112222 amd64 [libc] 32-bit libc incorrectly converts some FP number f amd64/111992 amd64 [boot] BTX failed - HP Laptop dv2315nr o amd64/110655 amd64 [threads] 32 bit threaded applications crash on amd64 o amd64/110599 amd64 [geli] geli attach to gmirror device hangs and cannot s amd64/108861 amd64 [nve] nve(4) driver on FreeBSD 6.2 AMD64 does not work o amd64/106186 amd64 [panic] panic in swap_pager_swap_init (amd64/smp/6.2-p f amd64/105629 amd64 [re] TrendNet TEG-BUSR 10/100/1000 disables itself on f amd64/105531 amd64 [ata] gigabyte GA-M51GM-S2G / nVidia nForce 430 - does f amd64/105514 amd64 [boot] FreeBSD/amd64 - Fails to boot on HP Pavilion dv o amd64/102716 amd64 ex with no argument in an xterm gets SIGSEGV o amd64/97337 amd64 [dri] xorg reboots system if dri module is enabled o amd64/95888 amd64 [ata] kernel: ad2: TIMEOUT - WRITE_DMA retrying on HP f amd64/94989 amd64 [boot] BTX Halts on Sun Fire X2100 w/6.1-BETA4 (amd64) o amd64/94677 amd64 [panic] panic in amd64 install at non-root user creati o amd64/93961 amd64 [busdma] Problem in bounce buffer handling in sys/amd6 o amd64/92337 amd64 [em] FreeBSD 6.0 Release Intel Pro 1000 MT em1 no buff f amd64/91492 amd64 [boot] BTX halted o amd64/91405 amd64 [asr] [panic] Kernel panic caused by asr on 6.0-amd64 o amd64/89501 amd64 [install] System crashes on install using ftp on local o amd64/88790 amd64 [panic] kernel panic on first boot (after the FreeBSD o amd64/88568 amd64 [panic] 6.0-RELEASE install cd does not boot with usb o amd64/87689 amd64 [powerd] [hang] powerd hangs SMP Opteron 244 5-STABLE o amd64/87316 amd64 [vge] "vge0 attach returned 6" on FreeBSD 6.0-RC1 amd6 o amd64/87305 amd64 [smp] Dual Opteron / FreeBSD 5 & 6 / powerd results in f amd64/87258 amd64 [smp] [boot] cannot boot with SMP and Areca ARC-1160 r f amd64/86080 amd64 [radeon] [hang] radeon DRI causes system hang on amd64 s amd64/85273 amd64 [install] FreeBSD (NetBSD or OpenBSD) not install on l o amd64/78406 amd64 [panic]AMD64 w/ SCSI: issue 'rm -r /usr/ports' and sys o amd64/76136 amd64 [hang] system halts before reboot o amd64/74747 amd64 [panic] System panic on shutdown when process will not f amd64/73322 amd64 [msdosfs] [hang] unarchiving /etc to msdosfs locks up 95 problems total. From owner-freebsd-amd64@FreeBSD.ORG Wed Apr 8 20:25:05 2009 Return-Path: Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 222BD1065672 for ; Wed, 8 Apr 2009 20:25:05 +0000 (UTC) (envelope-from christoph.mallon@gmx.de) Received: from mail.gmx.net (mail.gmx.net [213.165.64.20]) by mx1.freebsd.org (Postfix) with SMTP id 26A4F8FC19 for ; Wed, 8 Apr 2009 20:25:03 +0000 (UTC) (envelope-from christoph.mallon@gmx.de) Received: (qmail invoked by alias); 08 Apr 2009 19:58:18 -0000 Received: from p54A3ED49.dip.t-dialin.net (EHLO tron.homeunix.org) [84.163.237.73] by mail.gmx.net (mp003) with SMTP; 08 Apr 2009 21:58:18 +0200 X-Authenticated: #1673122 X-Provags-ID: V01U2FsdGVkX18USWokAY8UFTCCyifQrh6VcCjKva/H0V5Jjs2LXe KHDQGN/SzkFTLo Message-ID: <49DD01CA.3040900@gmx.de> Date: Wed, 08 Apr 2009 21:58:02 +0200 From: Christoph Mallon User-Agent: Thunderbird 2.0.0.19 (X11/20090103) MIME-Version: 1.0 To: freebsd-amd64@freebsd.org, freebsd-i386@freebsd.org Content-Type: multipart/mixed; boundary="------------010606040208020605070204" X-Y-GMX-Trusted: 0 X-FuHaFi: 0.63,0.46 X-Mailman-Approved-At: Wed, 08 Apr 2009 20:36:55 +0000 Cc: Ed Schouten Subject: [PATCH] Simplify in*() and out*() functions of AMD64 and i386 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Apr 2009 20:25:05 -0000 This is a multi-part message in MIME format. --------------010606040208020605070204 Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit Hi amd64@ and i386@, attached is a patch which simplifies the in*() and out*() functions for I/O port access of AMD64 and i386. It removes an unnecessary distinction of cases for inb() and outb(), which was used to generate better code for ports < 256. This is unnecessary, because GCC supports an asm input constraint to handle this ("N"). The stale comment, which states there is no constraint for this, is removed, too. Also the {in,out}{w,l}() get treated with this constraint. They had no special case before, so now better code is generated for them. Further, the unnecessary "cld" is removed from {in,out}s{b,w,l}(), because it is guaranteed by the ABI that the direction flag is cleared. All in all the code for in/out gets a bit simpler. Comments are welcome Christoph --------------010606040208020605070204 Content-Type: text/plain; name="inout.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="inout.diff" Index: sys/i386/include/cpufunc.h =================================================================== --- sys/i386/include/cpufunc.h (Revision 190841) +++ sys/i386/include/cpufunc.h (Arbeitskopie) @@ -170,177 +170,97 @@ __asm __volatile("hlt"); } -#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3 - -#define inb(port) inbv(port) -#define outb(port, data) outbv(port, data) - -#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */ - -/* - * The following complications are to get around gcc not having a - * constraint letter for the range 0..255. We still put "d" in the - * constraint because "i" isn't a valid constraint when the port - * isn't constant. This only matters for -O0 because otherwise - * the non-working version gets optimized away. - * - * Use an expression-statement instead of a conditional expression - * because gcc-2.6.0 would promote the operands of the conditional - * and produce poor code for "if ((inb(var) & const1) == const2)". - * - * The unnecessary test `(port) < 0x10000' is to generate a warning if - * the `port' has type u_short or smaller. Such types are pessimal. - * This actually only works for signed types. The range check is - * careful to avoid generating warnings. - */ -#define inb(port) __extension__ ({ \ - u_char _data; \ - if (__builtin_constant_p(port) && ((port) & 0xffff) < 0x100 \ - && (port) < 0x10000) \ - _data = inbc(port); \ - else \ - _data = inbv(port); \ - _data; }) - -#define outb(port, data) ( \ - __builtin_constant_p(port) && ((port) & 0xffff) < 0x100 \ - && (port) < 0x10000 \ - ? outbc(port, data) : outbv(port, data)) - -static __inline u_char -inbc(u_int port) +static inline u_char +inb(u_short port) { - u_char data; - - __asm __volatile("inb %1,%0" : "=a" (data) : "id" ((u_short)(port))); - return (data); + u_char data; + __asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port)); + return data; } -static __inline void -outbc(u_int port, u_char data) +static inline u_int +inl(u_short port) { - __asm __volatile("outb %0,%1" : : "a" (data), "id" ((u_short)(port))); -} - -#endif /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3*/ - -static __inline u_char -inbv(u_int port) -{ - u_char data; - /* - * We use %%dx and not %1 here because i/o is done at %dx and not at - * %edx, while gcc generates inferior code (movw instead of movl) - * if we tell it to load (u_short) port. - */ - __asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port)); + u_int data; + __asm volatile("inl %1, %0" : "=a" (data) : "Nd" (port)); return (data); } -static __inline u_int -inl(u_int port) +static inline void +insb(u_short port, void *addr, size_t cnt) { - u_int data; - - __asm __volatile("inl %%dx,%0" : "=a" (data) : "d" (port)); - return (data); + __asm volatile("rep; insb" + : "+D" (addr), "+c" (cnt) + : "d" (port) + : "memory"); } -static __inline void -insb(u_int port, void *addr, size_t cnt) +static inline void +insw(u_short port, void *addr, size_t cnt) { - __asm __volatile("cld; rep; insb" - : "+D" (addr), "+c" (cnt) - : "d" (port) - : "memory"); + __asm volatile("rep; insw" + : "+D" (addr), "+c" (cnt) + : "d" (port) + : "memory"); } -static __inline void -insw(u_int port, void *addr, size_t cnt) +static inline void +insl(u_short port, void *addr, size_t cnt) { - __asm __volatile("cld; rep; insw" - : "+D" (addr), "+c" (cnt) - : "d" (port) - : "memory"); + __asm volatile("rep; insl" + : "+D" (addr), "+c" (cnt) + : "d" (port) + : "memory"); } static __inline void -insl(u_int port, void *addr, size_t cnt) -{ - __asm __volatile("cld; rep; insl" - : "+D" (addr), "+c" (cnt) - : "d" (port) - : "memory"); -} - -static __inline void invd(void) { __asm __volatile("invd"); } -static __inline u_short -inw(u_int port) +static inline u_short +inw(u_short port) { - u_short data; - - __asm __volatile("inw %%dx,%0" : "=a" (data) : "d" (port)); + u_short data; + __asm volatile("inw %1, %0" : "=a" (data) : "Nd" (port)); return (data); } -static __inline void -outbv(u_int port, u_char data) +static inline void +outb(u_short port, u_char data) { - u_char al; - /* - * Use an unnecessary assignment to help gcc's register allocator. - * This make a large difference for gcc-1.40 and a tiny difference - * for gcc-2.6.0. For gcc-1.40, al had to be ``asm("ax")'' for - * best results. gcc-2.6.0 can't handle this. - */ - al = data; - __asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port)); + __asm volatile("outb %0, %1" : : "a" (data), "Nd" (port)); } -static __inline void -outl(u_int port, u_int data) +static inline void +outl(u_short port, u_int data) { - /* - * outl() and outw() aren't used much so we haven't looked at - * possible micro-optimizations such as the unnecessary - * assignment for them. - */ - __asm __volatile("outl %0,%%dx" : : "a" (data), "d" (port)); + __asm volatile("outl %0, %1" : : "a" (data), "Nd" (port)); } -static __inline void -outsb(u_int port, const void *addr, size_t cnt) +static inline void +outsb(u_short port, const void *addr, size_t cnt) { - __asm __volatile("cld; rep; outsb" - : "+S" (addr), "+c" (cnt) - : "d" (port)); + __asm volatile("rep; outsb" : "+S" (addr), "+c" (cnt) : "d" (port)); } -static __inline void -outsw(u_int port, const void *addr, size_t cnt) +static inline void +outsw(u_short port, const void *addr, size_t cnt) { - __asm __volatile("cld; rep; outsw" - : "+S" (addr), "+c" (cnt) - : "d" (port)); + __asm volatile("rep; outsw" : "+S" (addr), "+c" (cnt) : "d" (port)); } -static __inline void -outsl(u_int port, const void *addr, size_t cnt) +static inline void +outsl(u_short port, const void *addr, size_t cnt) { - __asm __volatile("cld; rep; outsl" - : "+S" (addr), "+c" (cnt) - : "d" (port)); + __asm volatile("rep; outsl" : "+S" (addr), "+c" (cnt) : "d" (port)); } -static __inline void -outw(u_int port, u_short data) +static inline void +outw(u_short port, u_short data) { - __asm __volatile("outw %0,%%dx" : : "a" (data), "d" (port)); + __asm volatile("outw %0, %1" : : "a" (data), "Nd" (port)); } static __inline void Index: sys/i386/i386/machdep.c =================================================================== --- sys/i386/i386/machdep.c (Revision 190841) +++ sys/i386/i386/machdep.c (Arbeitskopie) @@ -3555,45 +3555,24 @@ #ifdef KDB /* - * Provide inb() and outb() as functions. They are normally only - * available as macros calling inlined functions, thus cannot be - * called from the debugger. - * - * The actual code is stolen from , and de-inlined. + * Provide inb() and outb() as functions. They are normally only available as + * inline functions, thus cannot be called from the debugger. */ -#undef inb -#undef outb - /* silence compiler warnings */ -u_char inb(u_int); -void outb(u_int, u_char); +u_char inb_(u_short); +void outb_(u_short, u_char); u_char -inb(u_int port) +inb_(u_short port) { - u_char data; - /* - * We use %%dx and not %1 here because i/o is done at %dx and not at - * %edx, while gcc generates inferior code (movw instead of movl) - * if we tell it to load (u_short) port. - */ - __asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port)); - return (data); + return inb(port); } void -outb(u_int port, u_char data) +outb_(u_short port, u_char data) { - u_char al; - /* - * Use an unnecessary assignment to help gcc's register allocator. - * This make a large difference for gcc-1.40 and a tiny difference - * for gcc-2.6.0. For gcc-1.40, al had to be ``asm("ax")'' for - * best results. gcc-2.6.0 can't handle this. - */ - al = data; - __asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port)); + outb(port, data); } #endif /* KDB */ Index: sys/amd64/include/cpufunc.h =================================================================== --- sys/amd64/include/cpufunc.h (Revision 190841) +++ sys/amd64/include/cpufunc.h (Arbeitskopie) @@ -164,177 +164,97 @@ __asm __volatile("hlt"); } -#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3 - -#define inb(port) inbv(port) -#define outb(port, data) outbv(port, data) - -#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */ - -/* - * The following complications are to get around gcc not having a - * constraint letter for the range 0..255. We still put "d" in the - * constraint because "i" isn't a valid constraint when the port - * isn't constant. This only matters for -O0 because otherwise - * the non-working version gets optimized away. - * - * Use an expression-statement instead of a conditional expression - * because gcc-2.6.0 would promote the operands of the conditional - * and produce poor code for "if ((inb(var) & const1) == const2)". - * - * The unnecessary test `(port) < 0x10000' is to generate a warning if - * the `port' has type u_short or smaller. Such types are pessimal. - * This actually only works for signed types. The range check is - * careful to avoid generating warnings. - */ -#define inb(port) __extension__ ({ \ - u_char _data; \ - if (__builtin_constant_p(port) && ((port) & 0xffff) < 0x100 \ - && (port) < 0x10000) \ - _data = inbc(port); \ - else \ - _data = inbv(port); \ - _data; }) - -#define outb(port, data) ( \ - __builtin_constant_p(port) && ((port) & 0xffff) < 0x100 \ - && (port) < 0x10000 \ - ? outbc(port, data) : outbv(port, data)) - -static __inline u_char -inbc(u_int port) +static inline u_char +inb(u_short port) { - u_char data; - - __asm __volatile("inb %1,%0" : "=a" (data) : "id" ((u_short)(port))); - return (data); + u_char data; + __asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port)); + return data; } -static __inline void -outbc(u_int port, u_char data) +static inline u_int +inl(u_short port) { - __asm __volatile("outb %0,%1" : : "a" (data), "id" ((u_short)(port))); -} - -#endif /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3*/ - -static __inline u_char -inbv(u_int port) -{ - u_char data; - /* - * We use %%dx and not %1 here because i/o is done at %dx and not at - * %edx, while gcc generates inferior code (movw instead of movl) - * if we tell it to load (u_short) port. - */ - __asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port)); + u_int data; + __asm volatile("inl %1, %0" : "=a" (data) : "Nd" (port)); return (data); } -static __inline u_int -inl(u_int port) +static inline void +insb(u_short port, void *addr, size_t cnt) { - u_int data; - - __asm __volatile("inl %%dx,%0" : "=a" (data) : "d" (port)); - return (data); + __asm volatile("rep; insb" + : "+D" (addr), "+c" (cnt) + : "d" (port) + : "memory"); } -static __inline void -insb(u_int port, void *addr, size_t cnt) +static inline void +insw(u_short port, void *addr, size_t cnt) { - __asm __volatile("cld; rep; insb" - : "+D" (addr), "+c" (cnt) - : "d" (port) - : "memory"); + __asm volatile("rep; insw" + : "+D" (addr), "+c" (cnt) + : "d" (port) + : "memory"); } -static __inline void -insw(u_int port, void *addr, size_t cnt) +static inline void +insl(u_short port, void *addr, size_t cnt) { - __asm __volatile("cld; rep; insw" - : "+D" (addr), "+c" (cnt) - : "d" (port) - : "memory"); + __asm volatile("rep; insl" + : "+D" (addr), "+c" (cnt) + : "d" (port) + : "memory"); } static __inline void -insl(u_int port, void *addr, size_t cnt) -{ - __asm __volatile("cld; rep; insl" - : "+D" (addr), "+c" (cnt) - : "d" (port) - : "memory"); -} - -static __inline void invd(void) { __asm __volatile("invd"); } -static __inline u_short -inw(u_int port) +static inline u_short +inw(u_short port) { - u_short data; - - __asm __volatile("inw %%dx,%0" : "=a" (data) : "d" (port)); + u_short data; + __asm volatile("inw %1, %0" : "=a" (data) : "Nd" (port)); return (data); } -static __inline void -outbv(u_int port, u_char data) +static inline void +outb(u_short port, u_char data) { - u_char al; - /* - * Use an unnecessary assignment to help gcc's register allocator. - * This make a large difference for gcc-1.40 and a tiny difference - * for gcc-2.6.0. For gcc-1.40, al had to be ``asm("ax")'' for - * best results. gcc-2.6.0 can't handle this. - */ - al = data; - __asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port)); + __asm volatile("outb %0, %1" : : "a" (data), "Nd" (port)); } -static __inline void -outl(u_int port, u_int data) +static inline void +outl(u_short port, u_int data) { - /* - * outl() and outw() aren't used much so we haven't looked at - * possible micro-optimizations such as the unnecessary - * assignment for them. - */ - __asm __volatile("outl %0,%%dx" : : "a" (data), "d" (port)); + __asm volatile("outl %0, %1" : : "a" (data), "Nd" (port)); } -static __inline void -outsb(u_int port, const void *addr, size_t cnt) +static inline void +outsb(u_short port, const void *addr, size_t cnt) { - __asm __volatile("cld; rep; outsb" - : "+S" (addr), "+c" (cnt) - : "d" (port)); + __asm volatile("rep; outsb" : "+S" (addr), "+c" (cnt) : "d" (port)); } -static __inline void -outsw(u_int port, const void *addr, size_t cnt) +static inline void +outsw(u_short port, const void *addr, size_t cnt) { - __asm __volatile("cld; rep; outsw" - : "+S" (addr), "+c" (cnt) - : "d" (port)); + __asm volatile("rep; outsw" : "+S" (addr), "+c" (cnt) : "d" (port)); } -static __inline void -outsl(u_int port, const void *addr, size_t cnt) +static inline void +outsl(u_short port, const void *addr, size_t cnt) { - __asm __volatile("cld; rep; outsl" - : "+S" (addr), "+c" (cnt) - : "d" (port)); + __asm volatile("rep; outsl" : "+S" (addr), "+c" (cnt) : "d" (port)); } -static __inline void -outw(u_int port, u_short data) +static inline void +outw(u_short port, u_short data) { - __asm __volatile("outw %0,%%dx" : : "a" (data), "d" (port)); + __asm volatile("outw %0, %1" : : "a" (data), "Nd" (port)); } static __inline void Index: sys/amd64/amd64/machdep.c =================================================================== --- sys/amd64/amd64/machdep.c (Revision 190841) +++ sys/amd64/amd64/machdep.c (Arbeitskopie) @@ -2178,45 +2178,24 @@ #ifdef KDB /* - * Provide inb() and outb() as functions. They are normally only - * available as macros calling inlined functions, thus cannot be - * called from the debugger. - * - * The actual code is stolen from , and de-inlined. + * Provide inb() and outb() as functions. They are normally only available as + * inline functions, thus cannot be called from the debugger. */ -#undef inb -#undef outb - /* silence compiler warnings */ -u_char inb(u_int); -void outb(u_int, u_char); +u_char inb_(u_short); +void outb_(u_short, u_char); u_char -inb(u_int port) +inb_(u_short port) { - u_char data; - /* - * We use %%dx and not %1 here because i/o is done at %dx and not at - * %edx, while gcc generates inferior code (movw instead of movl) - * if we tell it to load (u_short) port. - */ - __asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port)); - return (data); + return inb(port); } void -outb(u_int port, u_char data) +outb_(u_short port, u_char data) { - u_char al; - /* - * Use an unnecessary assignment to help gcc's register allocator. - * This make a large difference for gcc-1.40 and a tiny difference - * for gcc-2.6.0. For gcc-1.40, al had to be ``asm("ax")'' for - * best results. gcc-2.6.0 can't handle this. - */ - al = data; - __asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port)); + outb(port, data); } #endif /* KDB */ --------------010606040208020605070204-- From owner-freebsd-amd64@FreeBSD.ORG Wed Apr 8 20:58:37 2009 Return-Path: Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5C1F21065673; Wed, 8 Apr 2009 20:58:37 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.terabit.net.ua (mail.terabit.net.ua [195.137.202.147]) by mx1.freebsd.org (Postfix) with ESMTP id F1E8B8FC12; Wed, 8 Apr 2009 20:58:36 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from skuns.zoral.com.ua ([91.193.166.194] helo=mail.zoral.com.ua) by mail.terabit.net.ua with esmtps (TLSv1:AES256-SHA:256) (Exim 4.63 (FreeBSD)) (envelope-from ) id 1Lreqw-000ERN-J6; Wed, 08 Apr 2009 23:58:34 +0300 Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id n38KwVC9074564 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 8 Apr 2009 23:58:31 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id n38KwVIS007210; Wed, 8 Apr 2009 23:58:31 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id n38KwV7I007209; Wed, 8 Apr 2009 23:58:31 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 8 Apr 2009 23:58:31 +0300 From: Kostik Belousov To: Christoph Mallon Message-ID: <20090408205831.GK3014@deviant.kiev.zoral.com.ua> References: <49DD01CA.3040900@gmx.de> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="1y1tiN5hVw5cPBDe" Content-Disposition: inline In-Reply-To: <49DD01CA.3040900@gmx.de> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua X-Virus-Scanned: mail.terabit.net.ua 1Lreqw-000ERN-J6 cb1c7ae4464e023beb1abd38385d05dc X-Terabit: YES Cc: Ed Schouten , freebsd-i386@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Apr 2009 20:58:38 -0000 --1y1tiN5hVw5cPBDe Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Apr 08, 2009 at 09:58:02PM +0200, Christoph Mallon wrote: > Hi amd64@ and i386@, >=20 > attached is a patch which simplifies the in*() and out*() functions for= =20 > I/O port access of AMD64 and i386. It removes an unnecessary distinction= =20 > of cases for inb() and outb(), which was used to generate better code=20 > for ports < 256. This is unnecessary, because GCC supports an asm input= =20 > constraint to handle this ("N"). The stale comment, which states there=20 > is no constraint for this, is removed, too. Also the {in,out}{w,l}() get= =20 > treated with this constraint. They had no special case before, so now=20 > better code is generated for them. Further, the unnecessary "cld" is=20 > removed from {in,out}s{b,w,l}(), because it is guaranteed by the ABI=20 > that the direction flag is cleared. All in all the code for in/out gets= =20 > a bit simpler. The DF flag is guaranteed to be cleared for usermode only. Currently, kernel does not clear the flag on entry. We did fixed signal handlers to always have DF cleared, but the kernel was explicitely left out because used version of gcc does the right thing with DF when needed. --1y1tiN5hVw5cPBDe Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (FreeBSD) iEYEARECAAYFAkndD/UACgkQC3+MBN1Mb4hXiwCg2e662P8WkauJesi4PCP7YRVT /tEAoNa5yIybsppNuukP1CgbmNJMdClQ =A95l -----END PGP SIGNATURE----- --1y1tiN5hVw5cPBDe-- From owner-freebsd-amd64@FreeBSD.ORG Thu Apr 9 18:54:10 2009 Return-Path: Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C74911065797; Thu, 9 Apr 2009 18:54:10 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from fallbackmx08.syd.optusnet.com.au (fallbackmx08.syd.optusnet.com.au [211.29.132.10]) by mx1.freebsd.org (Postfix) with ESMTP id 323028FC27; Thu, 9 Apr 2009 18:54:10 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au [211.29.132.190]) by fallbackmx08.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n39GksaR011236; Fri, 10 Apr 2009 02:46:54 +1000 Received: from besplex.bde.org (c122-107-120-227.carlnfd1.nsw.optusnet.com.au [122.107.120.227]) by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n39GknQe012764 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 10 Apr 2009 02:46:51 +1000 Date: Fri, 10 Apr 2009 02:46:49 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Christoph Mallon In-Reply-To: <49DD01CA.3040900@gmx.de> Message-ID: <20090410001919.S1567@besplex.bde.org> References: <49DD01CA.3040900@gmx.de> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Ed Schouten , freebsd-i386@FreeBSD.org, freebsd-amd64@FreeBSD.org Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Apr 2009 18:54:11 -0000 On Wed, 8 Apr 2009, Christoph Mallon wrote: > attached is a patch which simplifies the in*() and out*() functions for I/O > port access of AMD64 and i386. It removes an unnecessary distinction of cases > for inb() and outb(), which was used to generate better code for ports < 256. In gcc >= 2. (gcc == 1 didn't even have __builtin_const_p(), so the optimization was too hard to do). > This is unnecessary, because GCC supports an asm input constraint to handle > this ("N"). The stale comment, which states there is no constraint for this, > is removed, too. This was necessary, but is stale like its comment. There is of course no need to support gcc 1 or 2. > Also the {in,out}{w,l}() get treated with this constraint. > They had no special case before, so now better code is generated for them. Actually, worse code is generated for them on most cases since the optimization of loading 32 bits for the usual case of a non-constant operand is lost in your version -- see below. The case of a constant port address was rare except for 8-bit isa ports "on the motherboard" and has become rarer with FreeBSD's runtime configuration becoming even more excessive (so instead of "inb $N,%al", the code is normally about 10 instructions for bus_space_read_1(sc->bst, sc->bsh, N1)). > Further, the unnecessary "cld" is removed from {in,out}s{b,w,l}(), because it > is guaranteed by the ABI that the direction flag is cleared. All in all the > code for in/out gets a bit simpler. kib@ already pointed out that the direction flag is not guaranteed to be cleared in the kernel. For amd64, this may be a bug in the kernel, but for i386 it is a bug in gcc say that the ABI specifies that the direction flag is cleared, since there is no standard i386 ABI so gcc must not assume that the direction flag is clear, like it has done until recently. There are lots more similar cld's in bus.h. % Index: sys/i386/include/cpufunc.h % =================================================================== % --- sys/i386/include/cpufunc.h (Revision 190841) % +++ sys/i386/include/cpufunc.h (Arbeitskopie) % @@ -170,177 +170,97 @@ % __asm __volatile("hlt"); % } % % -#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3 % - % -#define inb(port) inbv(port) % -#define outb(port, data) outbv(port, data) % - % -#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */ The ifdefs are ugly, and this one seems to have been slightly wrong, but they provide documentation aboout the features used: - __GNUCLIKE_BUILTIN_CONSTANT_P: no longer needed - __GNUCLIKE_ASM >= 3: to ensure that the "i" constraint is available. This used to be __GNUC__ >= 2. If that was correct then >= 3 is stricter than necessary; otherwise >= 2 was not as strict as it should have been, since some versions of gcc-2 certainly supported the "i" constraint. A precise translation of the above would be __GNU_PREREQ__(maj, min) || defined(__INTEL_COMPILER), where (maj, min) is the version of gcc that implemented the "N" constraint and you should check that the Intel compiler (all version?) support the "N" constraint. % - * The unnecessary test `(port) < 0x10000' is to generate a warning if % - * the `port' has type u_short or smaller. Such types are pessimal. % - * This actually only works for signed types. The range check is % - * careful to avoid generating warnings. This won't work any more, if it ever did. I think it only worked for the constant case, but for the constant case the width doesn't really matter. % -static __inline u_char % -inbc(u_int port) % +static inline u_char % +inb(u_short port) __inline should be globabally changed to inline someday, not starting here. Plain inline has been Standard for 10 years now, but it still isn't standard -- we're still waiting for gcc to implement C99, and haven't switched to gcc-4.3(?) or later which will implement C99 extern inline. Changing the type changes the API and breaks an optimization. Callers are supposed to provide an arg of width 32 so that it can be loaded as efficiently as possible (instruction prefixes and/or sign extension waste code space and time, mainly on old CPUs). The 0x10000 check referred to in the above comment attempts to enforce this. Casting down in the API prevents the 32-bit loads. % { % - u_char data; % - % - __asm __volatile("inb %1,%0" : "=a" (data) : "id" ((u_short)(port))); The port address was already cast down here in inbc(). This seems to have been just because I didn't know gnu asm very well when I wrote the above. The cast has no effect in the usual case where the "i" constraint is used, but when the "d" constraint is used (only when compiling with -O0?) %1 would be %edx without the cast. In inbv(), I hard-code %dx, but that does't work here "i" case. The correct method seems to be to use %w1 in both places (keep u_int port everywhere and use the same method for the "N" constraint). % - return (data); % + u_char data; % + __asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port)); % + return data; % } Style breakages: - lost blank line after declarations. This file followed the rule about this except for previous breakage. It even followed it in some cases where the declarations are null. - lost parentheses around return value. This file followed the rule about this in all cases. Style not-quite-fix: - lost the tab after the type in the declaration. In KNF, such a tab is not used for auto declarations. However, this file was fairly consistent about breaking the rule. Why change __volatile to volatile? This is similar to changing __inline to inline, except plain volatile is more standard. has ifdef tangles which are supposed to allow users or cdefs.h itself to define away volatile so as to support old compilers, and we use __volatile instead of volatile a lot to support this. This is probably not needed for kernel headers. However, cpufunc.h is sometimes abused in userland. Fixed version (change nothing except the function name, %1, and the constraint): %%% static __inline u_char inb(u_int port) { u_char data; __asm __volatile("inb %w1, %0" : "=a" (data) : "Nd" (port)); return (data); } %%% Similarly for the other functions. BTW, I sometimes missing having the inverse of %[bwl...] -- a way to generate an operand size letter from the type. gcc-3 and gcc-4 have some horribly incomplete support for this. % -#endif /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3*/ Removing this accidentally fixes >= 2 style bugs in it. % -static __inline u_char % -inbv(u_int port) % -{ % - u_char data; % - /* % - * We use %%dx and not %1 here because i/o is done at %dx and not at % - * %edx, while gcc generates inferior code (movw instead of movl) % - * if we tell it to load (u_short) port. % - */ Lost this documentation. Now the magic in my version is in %w in my version. The comment shouldn't blame gcc for generating movw -- that is the programmer's fault. gcc on modern machines now generates movswl for loading a u_short port to the u_int port specified by the old API, and movswl is fast on modern machines, so there is no penalty in using the old API/implementation on modern machines even if the caller neglected to start with a u_int port. % ... % -static __inline void % -insb(u_int port, void *addr, size_t cnt) % +static inline void % +insw(u_short port, void *addr, size_t cnt) % { % - __asm __volatile("cld; rep; insb" % - : "+D" (addr), "+c" (cnt) % - : "d" (port) % - : "memory"); % + __asm volatile("rep; insw" % + : "+D" (addr), "+c" (cnt) % + : "d" (port) % + : "memory"); % } The diffs are hard to read, apparently due to diff getting confused by your API and style changes and comparing unrelated functions. Without the API changes, I think it would have matched insb with insb, etc. % -static __inline void % -outbv(u_int port, u_char data) % +static inline void % +outb(u_short port, u_char data) % { % - u_char al; % - /* % - * Use an unnecessary assignment to help gcc's register allocator. % - * This make a large difference for gcc-1.40 and a tiny difference % - * for gcc-2.6.0. For gcc-1.40, al had to be ``asm("ax")'' for % - * best results. gcc-2.6.0 can't handle this. % - */ % - al = data; This can go of course. I saw dummy assignments bogusly helping the allocator in more complicated asms for a version of atomic_cmpset() as late as gcc-3, but that seemed to be fixed by gcc-3.3. % Index: sys/i386/i386/machdep.c % =================================================================== % --- sys/i386/i386/machdep.c (Revision 190841) % +++ sys/i386/i386/machdep.c (Arbeitskopie) % @@ -3555,45 +3555,24 @@ % #ifdef KDB % % /* % - * Provide inb() and outb() as functions. They are normally only % - * available as macros calling inlined functions, thus cannot be % - * called from the debugger. % - * % - * The actual code is stolen from , and de-inlined. % + * Provide inb() and outb() as functions. They are normally only available as % + * inline functions, thus cannot be called from the debugger. % */ This should have been implemented by using cpufunc.h instead of repeating it, and for everything in cpufunc.h not just inb and outb, e.g., as is done for everything in atomic.h. This gives a technical reason for using __inline -- we want to #undef it and shouldn't #undef a standard keyword. % % -#undef inb % -#undef outb % - % /* silence compiler warnings */ % -u_char inb(u_int); % -void outb(u_int, u_char); % +u_char inb_(u_short); % +void outb_(u_short, u_char); % % u_char % -inb(u_int port) % +inb_(u_short port) % { % - u_char data; % - /* % - * We use %%dx and not %1 here because i/o is done at %dx and not at % - * %edx, while gcc generates inferior code (movw instead of movl) % - * if we tell it to load (u_short) port. % - */ % - __asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port)); % - return (data); % + return inb(port); % } % % void % -outb(u_int port, u_char data) % +outb_(u_short port, u_char data) % { % - u_char al; % - /* % - * Use an unnecessary assignment to help gcc's register allocator. % - * This make a large difference for gcc-1.40 and a tiny difference % - * for gcc-2.6.0. For gcc-1.40, al had to be ``asm("ax")'' for % - * best results. gcc-2.6.0 can't handle this. % - */ % - al = data; % - __asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port)); % + outb(port, data); % } % % #endif /* KDB */ This changes the (ddb undocumented) API and ABI to worse ones. I already don't line having to type "call inb(foo)" (in my x86 debugger, inb is a command named "i", with certain space insensitivity that allows typing "ifoo"). Bruce From owner-freebsd-amd64@FreeBSD.ORG Thu Apr 9 16:10:06 2009 Return-Path: Delivered-To: freebsd-amd64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2416C106564A for ; Thu, 9 Apr 2009 16:10:06 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id D43A88FC1E for ; Thu, 9 Apr 2009 16:10:05 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n39GA5eZ049040 for ; Thu, 9 Apr 2009 16:10:05 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n39GA5J9049039; Thu, 9 Apr 2009 16:10:05 GMT (envelope-from gnats) Resent-Date: Thu, 9 Apr 2009 16:10:05 GMT Resent-Message-Id: <200904091610.n39GA5J9049039@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-amd64@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, hobury Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8BF21106564A for ; Thu, 9 Apr 2009 16:00:29 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21]) by mx1.freebsd.org (Postfix) with ESMTP id 5FD1F8FC08 for ; Thu, 9 Apr 2009 16:00:29 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (localhost [127.0.0.1]) by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n39G0S1N006543 for ; Thu, 9 Apr 2009 16:00:28 GMT (envelope-from nobody@www.freebsd.org) Received: (from nobody@localhost) by www.freebsd.org (8.14.3/8.14.3/Submit) id n39G0SJM006542; Thu, 9 Apr 2009 16:00:28 GMT (envelope-from nobody) Message-Id: <200904091600.n39G0SJM006542@www.freebsd.org> Date: Thu, 9 Apr 2009 16:00:28 GMT From: hobury To: freebsd-gnats-submit@FreeBSD.org X-Send-Pr-Version: www-3.1 X-Mailman-Approved-At: Thu, 09 Apr 2009 19:16:04 +0000 Cc: Subject: amd64/133540: Cannot connect to ftp mirrors for 7.2 beta boot-only X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Apr 2009 16:10:06 -0000 >Number: 133540 >Category: amd64 >Synopsis: Cannot connect to ftp mirrors for 7.2 beta boot-only >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-amd64 >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Thu Apr 09 16:10:05 UTC 2009 >Closed-Date: >Last-Modified: >Originator: hobury >Release: 7.2 beta >Organization: hobury >Environment: Not applicable, but currently using FreeBSD 7.1 >Description: A boot-only of a 7.2 beta (announced 10 apr 2009) enters sysinstall as normal, but cannot connect to the ftp mirrors: leads to timeout. Possibly same issue for non-amd64 machines. A boot-only reinstall on same machines of 7.1 version works fine. Tried several mirrors from the ftp-list. >How-To-Repeat: Boot-only 64-bits 7.2 beta: sysinstall >Fix: >Release-Note: >Audit-Trail: >Unformatted: From owner-freebsd-amd64@FreeBSD.ORG Fri Apr 10 04:44:46 2009 Return-Path: Delivered-To: amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CAEF9106566B; Fri, 10 Apr 2009 04:44:46 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [205.211.164.50]) by mx1.freebsd.org (Postfix) with ESMTP id A22A68FC17; Fri, 10 Apr 2009 04:44:46 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from smtp1.sentex.ca (smtp1.sentex.ca [199.212.134.4]) by smarthost2.sentex.ca (8.14.3/8.14.3) with ESMTP id n3A4iiIP009614; Fri, 10 Apr 2009 00:44:44 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by smtp1.sentex.ca (8.14.3/8.14.3) with ESMTP id n3A4iimB014255; Fri, 10 Apr 2009 00:44:44 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: by freebsd-current.sentex.ca (Postfix, from userid 666) id 3BB177302F; Fri, 10 Apr 2009 00:44:44 -0400 (EDT) Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Message-Id: <20090410044444.3BB177302F@freebsd-current.sentex.ca> Date: Fri, 10 Apr 2009 00:44:44 -0400 (EDT) X-Virus-Scanned: ClamAV 0.94.1/8983/Thu Feb 12 07:48:01 2009 clamav-milter version 0.94.2 on clamscanner4 X-Virus-Status: Clean X-Scanned-By: MIMEDefang 2.64 on 205.211.164.50 Cc: Subject: [head tinderbox] failure on amd64/amd64 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Apr 2009 04:44:47 -0000 TB --- 2009-04-10 02:40:00 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2009-04-10 02:40:00 - starting HEAD tinderbox run for amd64/amd64 TB --- 2009-04-10 02:40:00 - cleaning the object tree TB --- 2009-04-10 02:41:26 - cvsupping the source tree TB --- 2009-04-10 02:41:26 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/HEAD/amd64/amd64/supfile TB --- 2009-04-10 02:41:37 - building world TB --- 2009-04-10 02:41:37 - MAKEOBJDIRPREFIX=/obj TB --- 2009-04-10 02:41:37 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2009-04-10 02:41:37 - TARGET=amd64 TB --- 2009-04-10 02:41:37 - TARGET_ARCH=amd64 TB --- 2009-04-10 02:41:37 - TZ=UTC TB --- 2009-04-10 02:41:37 - __MAKE_CONF=/dev/null TB --- 2009-04-10 02:41:37 - cd /src TB --- 2009-04-10 02:41:37 - /usr/bin/make -B buildworld >>> World build started on Fri Apr 10 02:41:38 UTC 2009 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> stage 5.1: building 32 bit shim libraries >>> World build completed on Fri Apr 10 04:42:02 UTC 2009 TB --- 2009-04-10 04:42:02 - generating LINT kernel config TB --- 2009-04-10 04:42:02 - cd /src/sys/amd64/conf TB --- 2009-04-10 04:42:02 - /usr/bin/make -B LINT TB --- 2009-04-10 04:42:03 - building LINT kernel TB --- 2009-04-10 04:42:03 - MAKEOBJDIRPREFIX=/obj TB --- 2009-04-10 04:42:03 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2009-04-10 04:42:03 - TARGET=amd64 TB --- 2009-04-10 04:42:03 - TARGET_ARCH=amd64 TB --- 2009-04-10 04:42:03 - TZ=UTC TB --- 2009-04-10 04:42:03 - __MAKE_CONF=/dev/null TB --- 2009-04-10 04:42:03 - cd /src TB --- 2009-04-10 04:42:03 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri Apr 10 04:42:03 UTC 2009 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies [...] awk -f /src/sys/tools/makeobjops.awk /src/sys/kern/serdev_if.m -h awk -f /src/sys/tools/makeobjops.awk /src/sys/libkern/iconv_converter_if.m -h awk -f /src/sys/tools/makeobjops.awk /src/sys/opencrypto/cryptodev_if.m -h awk -f /src/sys/tools/makeobjops.awk /src/sys/dev/acpica/acpi_if.m -h rm -f .newdep /usr/bin/make -V CFILES -V SYSTEM_CFILES -V GEN_CFILES | MKDEP_CPP="cc -E" CC="cc" xargs mkdep -a -f .newdep -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ipfilter -I/src/sys/contrib/pf -I/src/sys/dev/ath -I/src/sys/dev/ath/ath_hal -I/src/sys/contrib/ngatm -I/src/sys/dev/twa -I/src/sys/gnu/fs/xfs/FreeBSD -I/src/sys/gnu/fs/xfs/FreeBSD/support -I/src/sys/gnu/fs/xfs -I/src/sys/contrib/opensolaris/compat -I/src/sys/dev/cxgb -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-ss e3 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector cc: /src/sys/dev/ixgbe/ixgbe_82599.c: No such file or directory mkdep: compile failed *** Error code 1 Stop in /obj/amd64/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2009-04-10 04:44:43 - WARNING: /usr/bin/make returned exit code 1 TB --- 2009-04-10 04:44:43 - ERROR: failed to build lint kernel TB --- 2009-04-10 04:44:43 - 5710.57 user 604.70 system 7483.45 real http://tinderbox.des.no/tinderbox-head-HEAD-amd64-amd64.full From owner-freebsd-amd64@FreeBSD.ORG Fri Apr 10 09:52:00 2009 Return-Path: Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C94D61065670 for ; Fri, 10 Apr 2009 09:52:00 +0000 (UTC) (envelope-from christoph.mallon@gmx.de) Received: from mail.gmx.net (mail.gmx.net [213.165.64.20]) by mx1.freebsd.org (Postfix) with SMTP id 3D0C38FC1A for ; Fri, 10 Apr 2009 09:51:59 +0000 (UTC) (envelope-from christoph.mallon@gmx.de) Received: (qmail invoked by alias); 10 Apr 2009 09:51:56 -0000 Received: from p54A3ED4A.dip.t-dialin.net (EHLO tron.homeunix.org) [84.163.237.74] by mail.gmx.net (mp071) with SMTP; 10 Apr 2009 11:51:56 +0200 X-Authenticated: #1673122 X-Provags-ID: V01U2FsdGVkX189BCxly0o90AoFmc6FoWlvRp1MeyPL/PrslHs75G Rw50icq00bJ6jl Message-ID: <49DF16BA.8000807@gmx.de> Date: Fri, 10 Apr 2009 11:51:54 +0200 From: Christoph Mallon User-Agent: Thunderbird 2.0.0.19 (X11/20090103) MIME-Version: 1.0 To: Bruce Evans References: <49DD01CA.3040900@gmx.de> <20090410001919.S1567@besplex.bde.org> In-Reply-To: <20090410001919.S1567@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Y-GMX-Trusted: 0 X-FuHaFi: 0.43 Cc: Ed Schouten , freebsd-i386@FreeBSD.org, freebsd-amd64@FreeBSD.org Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Apr 2009 09:52:01 -0000 Bruce Evans schrieb: > On Wed, 8 Apr 2009, Christoph Mallon wrote: >> Also the {in,out}{w,l}() get treated with this constraint. They had no >> special case before, so now better code is generated for them. > > Actually, worse code is generated for them on most cases since the > optimization of loading 32 bits for the usual case of a non-constant > operand is lost in your version -- see below. The case of a constant If you are referring to the u_uint to u_short change, then you are wrong, see below. > port address was rare except for 8-bit isa ports "on the motherboard" > and has become rarer with FreeBSD's runtime configuration becoming > even more excessive (so instead of "inb $N,%al", the code is normally > about 10 instructions for bus_space_read_1(sc->bst, sc->bsh, N1)). > >> Further, the unnecessary "cld" is removed from {in,out}s{b,w,l}(), >> because it is guaranteed by the ABI that the direction flag is >> cleared. All in all the code for in/out gets a bit simpler. > > kib@ already pointed out that the direction flag is not guaranteed to be > cleared in the kernel. For amd64, this may be a bug in the kernel, but > for i386 it is a bug in gcc say that the ABI specifies that the direction > flag is cleared, since there is no standard i386 ABI so gcc must not > assume that the direction flag is clear, like it has done until recently. There *is* an ABI spec for i386 and it mandates that DF is cleared on function entry and exit. Also other compilers (e.g. LLVM and ICC) do not have this misbehaviour of GCC < 4.3 to emit cld before every string operation at all. I would not trust GCC < 4.3 to do this correctly everywhere either. > There are lots more similar cld's in bus.h. > > % Index: sys/i386/include/cpufunc.h > % =================================================================== > % --- sys/i386/include/cpufunc.h (Revision 190841) > % +++ sys/i386/include/cpufunc.h (Arbeitskopie) > % @@ -170,177 +170,97 @@ > % __asm __volatile("hlt"); > % } > % % -#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3 > % - > % -#define inb(port) inbv(port) > % -#define outb(port, data) outbv(port, data) > % - > % -#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */ > > The ifdefs are ugly, and this one seems to have been slightly wrong, but > they provide documentation aboout the features used: > - __GNUCLIKE_BUILTIN_CONSTANT_P: no longer needed > - __GNUCLIKE_ASM >= 3: to ensure that the "i" constraint is available. > This used to be __GNUC__ >= 2. If that was correct then >= 3 is > stricter than necessary; otherwise >= 2 was not as strict as it > should have been, since some versions of gcc-2 certainly supported > the "i" constraint. > > A precise translation of the above would be __GNU_PREREQ__(maj, min) || > defined(__INTEL_COMPILER), where (maj, min) is the version of gcc that > implemented the "N" constraint and you should check that the Intel > compiler (all version?) support the "N" constraint. This is unnecessary, even GCC 2.95 (10 years old, the oldest release you can get from official sources) supports this constraint (gcc-2.95/gcc/config/i386/i386.h, line 932). This shows how old and stale this hack is. > % - * The unnecessary test `(port) < 0x10000' is to generate a warning if > % - * the `port' has type u_short or smaller. Such types are pessimal. > % - * This actually only works for signed types. The range check is > % - * careful to avoid generating warnings. > > This won't work any more, if it ever did. I think it only worked for > the constant case, but for the constant case the width doesn't really > matter. The use of u_short for the port enables the compiler to generate a warning for too large port numbers. > % -static __inline u_char > % -inbc(u_int port) > % +static inline u_char > % +inb(u_short port) > > __inline should be globabally changed to inline someday, not starting here. > Plain inline has been Standard for 10 years now, but it still isn't > standard -- we're still waiting for gcc to implement C99, and haven't > switched to gcc-4.3(?) or later which will implement C99 extern inline. This very file already uses "inline" in other places. > Changing the type changes the API and breaks an optimization. Callers It does not change the API, it corrects it: Passing a port number > 0xFFFF never worked and the API now reflects this. So if now accidently a too large constant port number is passed, the compiler can warn. It's a static inline function, so there is no problem with the ABI. About optimization see below. > are supposed to provide an arg of width 32 so that it can be loaded > as efficiently as possible (instruction prefixes and/or sign extension > waste code space and time, mainly on old CPUs). The 0x10000 check > referred to in the above comment attempts to enforce this. Casting > down in the API prevents the 32-bit loads. When specifying 16 bit input the compiler is perfectly allowed to leave arbitrary garbage in the upper 16 bits, e.g. do a 32bit load of the port. Lying about this fact leads to worse code, example: static inline unsigned char inb(unsigned short port) /* ALT: static inline unsigned char inb(unsigned int port) */ { unsigned char data; asm volatile("inb %1, %0" : "=a" (data) : "d" (port)); /* ALT: asm volatile("inb %w1, %0" : "=a" (data) : "d" (port)); */ return data; } unsigned short in2(unsigned short port) { unsigned short res; res = inb(port) << 8; res |= inb(++port); return res; } diff of the assember of the two versions: --- in_short.s 2009-04-10 07:54:10.000000000 +0200 +++ in_int.s 2009-04-10 07:54:48.000000000 +0200 @@ -4,17 +4,22 @@ .globl in2 .type in2, @function in2: - movzwl 4(%esp), %edx + pushl %ebx + movzwl 8(%esp), %ebx + movzwl %bx, %eax + movl %eax, %edx #APP inb %dx, %al #NO_APP movl %eax, %ecx - addl $1, %edx + leal 1(%ebx), %edx sall $8, %ecx + movzwl %dx, %edx #APP inb %dx, %al #NO_APP movzbl %al, %edx + popl %ebx orl %edx, %ecx movzwl %cx, %eax ret Compilers are way better today than back in the GCC 1.x times. Lying to them to trick them to show a certain behaviour usually is a bad idea and tends to backfire. > % { > % - u_char data; > % - > % - __asm __volatile("inb %1,%0" : "=a" (data) : "id" > ((u_short)(port))); > > The port address was already cast down here in inbc(). This seems to > have been just because I didn't know gnu asm very well when I wrote > the above. The cast has no effect in the usual case where the "i" > constraint is used, but when the "d" constraint is used (only when > compiling with -O0?) %1 would be %edx without the cast. In inbv(), I > hard-code %dx, but that does't work here "i" case. The correct method > seems to be to use %w1 in both places (keep u_int port everywhere and > use the same method for the "N" constraint). The correct way to handle this is discussed above. > % - return (data); > % + u_char data; > % + __asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port)); > % + return data; > % } > > Style breakages: > - lost blank line after declarations. This file followed the rule about > this except for previous breakage. It even followed it in some cases > where the declarations are null. > - lost parentheses around return value. This file followed the rule about > this in all cases. Anachronisms. > Style not-quite-fix: > - lost the tab after the type in the declaration. In KNF, such a tab is > not used for auto declarations. However, this file was fairly consistent > about breaking the rule. > > Why change __volatile to volatile? This is similar to changing __inline > to inline, except plain volatile is more standard. has You answered this by yourself: volatile is a standard word - there is no reason to not use this. > ifdef tangles which are supposed to allow users or cdefs.h itself to > define away volatile so as to support old compilers, and we use __volatile > instead of volatile a lot to support this. This is probably not needed > for kernel headers. However, cpufunc.h is sometimes abused in userland. "Old compilers" cannot compile the FreeBSD source anyway, so what's your point? This file can only be used with "new" (not older than a decade or so) and very GCC-compatible compilers. > Fixed version (change nothing except the function name, %1, and the > constraint): > > %%% > static __inline u_char > inb(u_int port) > { > u_char data; > > __asm __volatile("inb %w1, %0" : "=a" (data) : "Nd" (port)); > return (data); > } > %%% > > Similarly for the other functions. > > BTW, I sometimes missing having the inverse of %[bwl...] -- a way to > generate an operand size letter from the type. gcc-3 and gcc-4 have > some horribly incomplete support for this. > > % -#endif /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3*/ > > Removing this accidentally fixes >= 2 style bugs in it. > > % -static __inline u_char > % -inbv(u_int port) > % -{ > % - u_char data; > % - /* > % - * We use %%dx and not %1 here because i/o is done at %dx and not at > % - * %edx, while gcc generates inferior code (movw instead of movl) > % - * if we tell it to load (u_short) port. > % - */ > > Lost this documentation. Now the magic in my version is in %w in my > version. > The comment shouldn't blame gcc for generating movw -- that is the > programmer's fault. This is wrong, it's correct to specify a 16bit input. The compiler therefore is perfectly allowed to leave arbitrary garbage in the upper 16 bits, e.g. do a 32bit load of the port and other things, see above. Also GCC does not generate movw here - for neither u_short nor u_int. GCC avoids partial register updates. > gcc on modern machines now generates movswl for loading a u_short port > to the u_int port specified by the old API, and movswl is fast on modern > machines, so there is no penalty in using the old API/implementation > on modern machines even if the caller neglected to start with a u_int > port. There is a penalty, see example above. > % ... > % -static __inline void > % -insb(u_int port, void *addr, size_t cnt) > % +static inline void > % +insw(u_short port, void *addr, size_t cnt) > % { > % - __asm __volatile("cld; rep; insb" > % - : "+D" (addr), "+c" (cnt) > % - : "d" (port) > % - : "memory"); > % + __asm volatile("rep; insw" > % + : "+D" (addr), "+c" (cnt) > % + : "d" (port) > % + : "memory"); > % } > > The diffs are hard to read, apparently due to diff getting confused > by your API and style changes and comparing unrelated functions. > Without the API changes, I think it would have matched insb with insb, > etc. > > % -static __inline void > % -outbv(u_int port, u_char data) > % +static inline void > % +outb(u_short port, u_char data) > % { > % - u_char al; > % - /* > % - * Use an unnecessary assignment to help gcc's register allocator. > % - * This make a large difference for gcc-1.40 and a tiny difference > % - * for gcc-2.6.0. For gcc-1.40, al had to be ``asm("ax")'' for > % - * best results. gcc-2.6.0 can't handle this. > % - */ > % - al = data; > > This can go of course. I saw dummy assignments bogusly helping the > allocator in more complicated asms for a version of atomic_cmpset() > as late as gcc-3, but that seemed to be fixed by gcc-3.3. > > % Index: sys/i386/i386/machdep.c > % =================================================================== > % --- sys/i386/i386/machdep.c (Revision 190841) > % +++ sys/i386/i386/machdep.c (Arbeitskopie) > % @@ -3555,45 +3555,24 @@ > % #ifdef KDB > % % /* > % - * Provide inb() and outb() as functions. They are normally only > % - * available as macros calling inlined functions, thus cannot be > % - * called from the debugger. > % - * > % - * The actual code is stolen from , and de-inlined. > % + * Provide inb() and outb() as functions. They are normally only > available as > % + * inline functions, thus cannot be called from the debugger. > % */ > > This should have been implemented by using cpufunc.h instead of repeating > it, and for everything in cpufunc.h not just inb and outb, e.g., as is > done for everything in atomic.h. This gives a technical reason for > using __inline -- we want to #undef it and shouldn't #undef a standard > keyword. You can neither #undef __inline nor inline, they both are keywords, not macros. Actually it's the other way round: There is no way to get rid of __inline, it is always a keyword to GCC. But you can tell GCC to not consider inline as keyword (-std=c89). > % % -#undef inb > % -#undef outb > % - > % /* silence compiler warnings */ > % -u_char inb(u_int); > % -void outb(u_int, u_char); > % +u_char inb_(u_short); > % +void outb_(u_short, u_char); > % % u_char > % -inb(u_int port) > % +inb_(u_short port) > % { > % - u_char data; > % - /* > % - * We use %%dx and not %1 here because i/o is done at %dx and not at > % - * %edx, while gcc generates inferior code (movw instead of movl) > % - * if we tell it to load (u_short) port. > % - */ > % - __asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port)); > % - return (data); > % + return inb(port); > % } > % % void > % -outb(u_int port, u_char data) > % +outb_(u_short port, u_char data) > % { > % - u_char al; > % - /* > % - * Use an unnecessary assignment to help gcc's register allocator. > % - * This make a large difference for gcc-1.40 and a tiny difference > % - * for gcc-2.6.0. For gcc-1.40, al had to be ``asm("ax")'' for > % - * best results. gcc-2.6.0 can't handle this. > % - */ > % - al = data; > % - __asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port)); > % + outb(port, data); > % } > % % #endif /* KDB */ > > This changes the (ddb undocumented) API and ABI to worse ones. I > already don't line having to type "call inb(foo)" (in my x86 debugger, > inb is a command named "i", with certain space insensitivity that > allows typing "ifoo"). The "worse" aspect is discussed above. I also changed the non-existent documentation of these functions, so the name change should be fine. (: Christoph From owner-freebsd-amd64@FreeBSD.ORG Fri Apr 10 06:13:36 2009 Return-Path: Delivered-To: freebsd-amd64@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D8F46106566B for ; Fri, 10 Apr 2009 06:13:36 +0000 (UTC) (envelope-from christoph.mallon@gmx.de) Received: from mail.gmx.net (mail.gmx.net [213.165.64.20]) by mx1.freebsd.org (Postfix) with SMTP id 466C88FC15 for ; Fri, 10 Apr 2009 06:13:36 +0000 (UTC) (envelope-from christoph.mallon@gmx.de) Received: (qmail invoked by alias); 10 Apr 2009 06:13:34 -0000 Received: from p54A3ED4A.dip.t-dialin.net (EHLO tron.homeunix.org) [84.163.237.74] by mail.gmx.net (mp063) with SMTP; 10 Apr 2009 08:13:34 +0200 X-Authenticated: #1673122 X-Provags-ID: V01U2FsdGVkX18NGQQxRVMNrcP3UqW4WEZtCjugRHJfkzBVJwJaoi 6r0tiW6YerY4ad Message-ID: <49DEE38D.6050604@gmx.de> Date: Fri, 10 Apr 2009 08:13:33 +0200 From: Christoph Mallon User-Agent: Thunderbird 2.0.0.19 (X11/20090103) MIME-Version: 1.0 To: Kostik Belousov References: <49DD01CA.3040900@gmx.de> <20090408205831.GK3014@deviant.kiev.zoral.com.ua> In-Reply-To: <20090408205831.GK3014@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit X-Y-GMX-Trusted: 0 X-FuHaFi: 0.54 X-Mailman-Approved-At: Fri, 10 Apr 2009 11:39:39 +0000 Cc: Ed Schouten , freebsd-i386@freebsd.org, freebsd-amd64@freebsd.org Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Apr 2009 06:13:37 -0000 Kostik Belousov schrieb: > On Wed, Apr 08, 2009 at 09:58:02PM +0200, Christoph Mallon wrote: >> Hi amd64@ and i386@, >> >> attached is a patch which simplifies the in*() and out*() functions for >> I/O port access of AMD64 and i386. It removes an unnecessary distinction >> of cases for inb() and outb(), which was used to generate better code >> for ports < 256. This is unnecessary, because GCC supports an asm input >> constraint to handle this ("N"). The stale comment, which states there >> is no constraint for this, is removed, too. Also the {in,out}{w,l}() get >> treated with this constraint. They had no special case before, so now > >> better code is generated for them. Further, the unnecessary "cld" is >> removed from {in,out}s{b,w,l}(), because it is guaranteed by the ABI >> that the direction flag is cleared. All in all the code for in/out gets >> a bit simpler. > The DF flag is guaranteed to be cleared for usermode only. Currently, > kernel does not clear the flag on entry. > > We did fixed signal handlers to always have DF cleared, but the kernel > was explicitely left out because used version of gcc does the right > thing with DF when needed. This will break with GCC >= 4.3. Also other compilers (e.g. LLVM and ICC) do not generate cld before they insert string instructions. I would not trust GCC < 4.3 to place cld everywhere, either. As a *temporary* workaround the cld in the {in,out}s*() functions could be left in, but this issue *must* be resolved correctly soon. Especially because LLVM is not generating the (according to the ABI spec) redundant clds and newer GCCs will not either. So no matter what the prospective compiler of FreeBSD will be, this issue must be tackled. From owner-freebsd-amd64@FreeBSD.ORG Fri Apr 10 12:54:42 2009 Return-Path: Delivered-To: freebsd-amd64@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DC1771065670; Fri, 10 Apr 2009 12:54:42 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au [211.29.132.183]) by mx1.freebsd.org (Postfix) with ESMTP id 3C4BB8FC27; Fri, 10 Apr 2009 12:54:42 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from besplex.bde.org (c122-107-120-227.carlnfd1.nsw.optusnet.com.au [122.107.120.227]) by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n3ACsbK8012516 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 10 Apr 2009 22:54:38 +1000 Date: Fri, 10 Apr 2009 22:54:37 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Christoph Mallon In-Reply-To: <49DF16BA.8000807@gmx.de> Message-ID: <20090410200612.I2103@besplex.bde.org> References: <49DD01CA.3040900@gmx.de> <20090410001919.S1567@besplex.bde.org> <49DF16BA.8000807@gmx.de> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Ed Schouten , freebsd-amd64@FreeBSD.org, freebsd-i386@FreeBSD.org Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386 X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Apr 2009 12:54:43 -0000 On Fri, 10 Apr 2009, Christoph Mallon wrote: > Bruce Evans schrieb: >> On Wed, 8 Apr 2009, Christoph Mallon wrote: >>> Also the {in,out}{w,l}() get treated with this constraint. They had no >>> special case before, so now better code is generated for them. >> >> Actually, worse code is generated for them on most cases since the >> optimization of loading 32 bits for the usual case of a non-constant >> operand is lost in your version -- see below. The case of a constant > > If you are referring to the u_uint to u_short change, then you are > wrong, see below. No, you are wrong. See below in previous and this mail. >> kib@ already pointed out that the direction flag is not guaranteed to be >> cleared in the kernel. For amd64, this may be a bug in the kernel, but >> for i386 it is a bug in gcc say that the ABI specifies that the direction >> flag is cleared, since there is no standard i386 ABI so gcc must not >> assume that the direction flag is clear, like it has done until recently. > > There *is* an ABI spec for i386 and it mandates that DF is cleared on > function entry and exit. Also other compilers (e.g. LLVM and > ICC) do not have this misbehaviour of GCC < 4.3 to emit cld before every > string operation at all. I would not trust GCC < 4.3 to do this > correctly everywhere either. Anyone can make up a spec but no one has to follow it. FreeBSD has always followed the old de-facto spec that requires any code that uses string operations to set the direction flag for itself. Trying to change this after about 1990 is especially stupid since string instructions should almost never be used due to their becoming relatively even slower without even counting their large setup overhead. Their setup overhead includes the cld but should be large enough to do the cld in parallel so that it is almost free. >> % % -#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3 >> % - >> % -#define inb(port) inbv(port) >> % -#define outb(port, data) outbv(port, data) >> % - >> % -#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */ >> >> The ifdefs are ugly, and this one seems to have been slightly wrong, but >> they provide documentation aboout the features used: >> - __GNUCLIKE_BUILTIN_CONSTANT_P: no longer needed >> - __GNUCLIKE_ASM >= 3: to ensure that the "i" constraint is available. >> This used to be __GNUC__ >= 2. If that was correct then >= 3 is >> stricter than necessary; otherwise >= 2 was not as strict as it >> should have been, since some versions of gcc-2 certainly supported >> the "i" constraint. >> >> A precise translation of the above would be __GNU_PREREQ__(maj, min) || >> defined(__INTEL_COMPILER), where (maj, min) is the version of gcc that >> implemented the "N" constraint and you should check that the Intel >> compiler (all version?) support the "N" constraint. > > This is unnecessary, even GCC 2.95 (10 years old, the oldest release you can > get from official sources) supports this constraint > (gcc-2.95/gcc/config/i386/i386.h, line 932). This shows how old and stale > this hack is. I just want it pointed out (in the commit log) and in a separate commit that you are intentionally weakening the ifdefs since old compilers are not supported even to the extent of not generating syntax errors for them and most other places (e.g., atomic.h) already assume a non-old compiler and generate the syntax errors for old ones. >> % - * The unnecessary test `(port) < 0x10000' is to generate a warning if >> % - * the `port' has type u_short or smaller. Such types are pessimal. >> % - * This actually only works for signed types. The range check is >> % - * careful to avoid generating warnings. >> >> This won't work any more, if it ever did. I think it only worked for >> the constant case, but for the constant case the width doesn't really >> matter. > > The use of u_short for the port enables the compiler to generate a warning > for too large port numbers. Only if it warns for truncation on assignment, which most compilers won't do. >> % -static __inline u_char >> % -inbc(u_int port) >> % +static inline u_char >> % +inb(u_short port) >> >> __inline should be globabally changed to inline someday, not starting here. >> Plain inline has been Standard for 10 years now, but it still isn't >> standard -- we're still waiting for gcc to implement C99, and haven't >> switched to gcc-4.3(?) or later which will implement C99 extern inline. > > This very file already uses "inline" in other places. Only in recent style breakage in 2 places when I wasn't looking (18-Apr-08 for cpu_monitor() and cpu_wait()). Other style breakage for these functions: - insertion sort errors. This file is supposed to be sorted on function name - namespace errors. Functions in this file should have the same name as the instruction that they provided access too. monitor() is a bit too generic for this and wait() is much too generic for this (however, the instruction name is mwait() which is less generic than monitor()), so some renaming is required. The renaming should not be into the cpu_* namespace since the cpu_ prefix has a different, precise meaning. These bugs are missing for the older interface to the "pause" instruction: - pause() is too generic, so the cpufunc is named ia32_pause() on i386, etc. (but I prefer mdpause()). - pause() was already in use, so even the MI interface couldn't be named pause(). It is named cpu_spinwait(). - cpu_spinwait() is implemented as a macro that invokes ia32_pause(). - bogus semicolon at the end of the main asm string - missing space after ": :" (2 instances) - missing space after just 1 of the 4 constraint strings. - missing declarations in the non-__GNUCLIKE_ASM && non-__CC_SUPPORTS___INLINE case. Hmm, your change from __inline to inline also bogotifies the __CC_SUPPORTS___INLINE ifdef. This ifdef is careful to a fault, but not wrong provided __inline is used consistently. >> Changing the type changes the API and breaks an optimization. Callers > > It does not change the API, it corrects it: Passing a port number > 0xFFFF > never worked and the API now reflects this. So if now accidently a too large > constant port number is passed, the compiler can warn. It's a static inline > function, so there is no problem with the ABI. About optimization see below. No, it breaks it. I define the API so I know what it is! The caller should pass a u_int port so that optimal code is generated (at a tiny cost to data space for storing 32-bit port numbers in softcs). The hardware only uses the low 16 bits so the upper 16 bits don't matter. If the caller starts with a u_short port, then the API generates sub-optimal code to promote to a u_int. The promotion logically occurs as part of the function call protocol, but since the function is inline it normally occurs more directly (e.g., as "movswl port,%edx"). >> are supposed to provide an arg of width 32 so that it can be loaded >> as efficiently as possible (instruction prefixes and/or sign extension >> waste code space and time, mainly on old CPUs). The 0x10000 check >> referred to in the above comment attempts to enforce this. Casting >> down in the API prevents the 32-bit loads. > > When specifying 16 bit input the compiler is perfectly allowed to leave > arbitrary garbage in the upper 16 bits, e.g. do a 32bit load of the > port. Lying about this fact leads to worse code, example: It is only permitted to do that if it knows that the the upper 16 bits are in an object, or if they are in a register. While it could know that bits in an object (say for port = sc->sc_port where sc has stuff above sc_port so that the bits are in the object (*sc)), I've never seen gcc do this optimization. I didn't lie to the compiler, but told it to load all 32 bits of %edx. Then I ignore the top 16 bits in the asm. > static inline unsigned char inb(unsigned short port) > /* ALT: static inline unsigned char inb(unsigned int port) */ > { > unsigned char data; > asm volatile("inb %1, %0" : "=a" (data) : "d" (port)); > /* ALT: asm volatile("inb %w1, %0" : "=a" (data) : "d" (port)); */ > return data; > } > > unsigned short in2(unsigned short port) > { > unsigned short res; > res = inb(port) << 8; > res |= inb(++port); > return res; > } Sure, this gives suboptimal code for ALT since `port' starts in a u_short. Try starting with a matching ALT for in2: in2(unsigned port). The types must match at all levels including in2()'s caller to avoid promoting and demoting back and forth. Only some of the conversions can be free. ++port involves various conversions if port is not u_int throughout. The compiler can usually optimize these (by keeping port in a register and ignoring top bits). > diff of the assember of the two versions: > --- in_short.s 2009-04-10 07:54:10.000000000 +0200 > +++ in_int.s 2009-04-10 07:54:48.000000000 +0200 > @@ -4,17 +4,22 @@ > .globl in2 > .type in2, @function > in2: > - movzwl 4(%esp), %edx With my ALT in2 version: movl 4(%esp), %edx. This is 3-4 cycles faster than on i386 and i486. On modern CPUs, there is probably no difference, but movzwl wastes a byte or two of instruction space so it might cost in some cases. > + pushl %ebx > + movzwl 8(%esp), %ebx > + movzwl %bx, %eax > + movl %eax, %edx This is unnecessarily stupid code. Why not identical code to the u_int case? Leaving out the inb(++port) gives identical code for the first inb(). > #APP > inb %dx, %al > #NO_APP > movl %eax, %ecx > - addl $1, %edx > + leal 1(%ebx), %edx The compiler is being stupid above so that it can be stupid here too (leal is slightly worse than addl). It should just use %edx and not take a trip through %ebx and %eax, and have to save %ebx. > sall $8, %ecx > + movzwl %dx, %edx This is needed for ++port since it has to demote the intermediate u_int in ++port back to u_short. > #APP > inb %dx, %al > #NO_APP > movzbl %al, %edx > + popl %ebx > orl %edx, %ecx > movzwl %cx, %eax > ret With u_ints throughout (ALT for inb() and in2()), there are no conversions and the generated code is optimal (movl instead of movzwl and no other differences). With the only conversion being a demotion in inb() (non-ALT for inb() and ALT for in2()), the generated code is again optimal (movl instead of movzwl). gcc is optimizing away the conversion in this case. With port replaced by portarray[0] where portarray[1] exists, gcc no longer does the optimization. This is with gcc-4.2 with the default (usually wrong) arch. gcc-3.3 and gcc-4.2 -march=i386 give the original pessimization of using movw instead of movzwl to load a 16-bit object into a 16 or 32-bit register. When only 16 bits are needed gcc-old just uses movw. When a promotion is needed in the ALT in2(), it uses movw to %bx where the above uses movzwl to %ebx. Then it must promote %bx and it uses movzwl to %eax for this, then it moves to %edx. This is sub-optimal but not as silly as the above -- the above has already done the promotion into %ebx and then it does it again into %eax. > Compilers are way better today than back in the GCC 1.x times. Lying to > them to trick them to show a certain behaviour usually is a bad idea and > tends to backfire. Apparently gcc is still stupid when there are just a 4 not-quite null promotions and demotions: for u_short port; use((u_int)port); /* ALT inb(port); */ port = (u_short)((u_int)port + 1); /* ++port; */ use((u_int)port); /* ALT inb(port); */ >> % { >> % - u_char data; >> % - >> % - __asm __volatile("inb %1,%0" : "=a" (data) : "id" >> ((u_short)(port))); >> >> The port address was already cast down here in inbc(). This seems to >> have been just because I didn't know gnu asm very well when I wrote >> the above. The cast has no effect in the usual case where the "i" >> constraint is used, but when the "d" constraint is used (only when >> compiling with -O0?) %1 would be %edx without the cast. In inbv(), I >> hard-code %dx, but that does't work here "i" case. The correct method >> seems to be to use %w1 in both places (keep u_int port everywhere and >> use the same method for the "N" constraint). > > The correct way to handle this is discussed above. No, this gives movzwl instead of movl in some cases and movw instead of movl in some other cases. >> % - return (data); >> % + u_char data; >> % + __asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port)); >> % + return data; >> % } >> >> Style breakages: >> - lost blank line after declarations. This file followed the rule about >> this except for previous breakage. It even followed it in some cases >> where the declarations are null. >> - lost parentheses around return value. This file followed the rule about >> this in all cases. > > Anachronisms. No, standard in FreeBSD. >> Style not-quite-fix: >> - lost the tab after the type in the declaration. In KNF, such a tab is >> not used for auto declarations. However, this file was fairly consistent >> about breaking the rule. >> >> Why change __volatile to volatile? This is similar to changing __inline >> to inline, except plain volatile is more standard. has > > You answered this by yourself: volatile is a standard word - there is no > reason to not use this. It is for stylistic reasons. Readers will wonder why most code uses __volatile and yours doesn't. >> ifdef tangles which are supposed to allow users or cdefs.h itself to >> define away volatile so as to support old compilers, and we use __volatile >> instead of volatile a lot to support this. This is probably not needed >> for kernel headers. However, cpufunc.h is sometimes abused in userland. > > "Old compilers" cannot compile the FreeBSD source anyway, so what's your > point? This file can only be used with "new" (not older than a decade or so) > and very GCC-compatible compilers. The sources should be as independent of compiler versions as possible. cpufunc.h was independent (it is supposed contain only prototypes only for compilers that can't handle its asms), so why break it? >> % -static __inline u_char >> % -inbv(u_int port) >> % -{ >> % - u_char data; >> % - /* >> % - * We use %%dx and not %1 here because i/o is done at %dx and not at >> % - * %edx, while gcc generates inferior code (movw instead of movl) >> % - * if we tell it to load (u_short) port. >> % - */ >> >> Lost this documentation. Now the magic in my version is in %w in my >> version. >> The comment shouldn't blame gcc for generating movw -- that is the >> programmer's fault. > > This is wrong, it's correct to specify a 16bit input. The compiler therefore > is perfectly allowed to leave arbitrary garbage in the upper 16 bits, e.g. do > a 32bit load of the port and other things, see above. Also GCC does not > generate movw here - for neither u_short nor u_int. GCC avoids partial > register updates. No, this is correct. Please allow me to find bugs in my own comment :-). The comment would need to be too long, like this mail, to give all the details and I apparently broke it trying to make it shorter. It is correct but sub-optimal to specify a 16-bit input. Whether gcc generates a movw is now very machine-dependent. When the code and comment was written, gcc always generated movw for memory to register 16-bit loads. Now it prefers to generate movzwl since partial register updates are bad and movzwl is now fast, but it still generates movw for old CPUs. My optimization was not to avoid partial register updates, but just to avoid operand size prefixes for loading u_shorts and extra instructions for conversions in expressions like port+1 and use(foo) (where use() takes a u_int)). Your examples show that compilers now avoid the prefixes or conversions in some but not all of the cases. >> % Index: sys/i386/i386/machdep.c >> % =================================================================== >> % --- sys/i386/i386/machdep.c (Revision 190841) >> % +++ sys/i386/i386/machdep.c (Arbeitskopie) >> % @@ -3555,45 +3555,24 @@ >> % #ifdef KDB >> % % /* >> % - * Provide inb() and outb() as functions. They are normally only >> % - * available as macros calling inlined functions, thus cannot be >> % - * called from the debugger. >> % - * >> % - * The actual code is stolen from , and de-inlined. >> % + * Provide inb() and outb() as functions. They are normally only >> available as >> % + * inline functions, thus cannot be called from the debugger. >> % */ >> >> This should have been implemented by using cpufunc.h instead of repeating >> it, and for everything in cpufunc.h not just inb and outb, e.g., as is >> done for everything in atomic.h. This gives a technical reason for >> using __inline -- we want to #undef it and shouldn't #undef a standard >> keyword. > > You can neither #undef __inline nor inline, they both are keywords, not > macros. Actually it's the other way round: There is no way to get rid of > __inline, it is always a keyword to GCC. But you can tell GCC to not consider > inline as keyword (-std=c89). Oops, I should have written "#define away" instead of #undef. I think it is legal in C to #define any identifer. It just might things by giving inconsistencies. already defines away __inline in some cases (mostly for old compilers) and defines it to plain inline for the C++ case even for new gcc. The u_short optimizations are mostly moot for inb() etc. The hardware part of inb() now normally takes a few thousand times longer than the whole 1 cycle potentially saved by these optimizations. When inb() was written, the hardware part took less than one hundred times longer. I defend the optimization mainly because losing it is a regression and I still hate APIs that use chars and shorts (or [u]intN_t's) to give pessimizations by maximizing conversions. Bruce From owner-freebsd-amd64@FreeBSD.ORG Fri Apr 10 21:40:35 2009 Return-Path: Delivered-To: freebsd-amd64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7D581106564A; Fri, 10 Apr 2009 21:40:35 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 51D668FC27; Fri, 10 Apr 2009 21:40:35 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3ALeZDr069972; Fri, 10 Apr 2009 21:40:35 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3ALeZDg069968; Fri, 10 Apr 2009 21:40:35 GMT (envelope-from linimon) Date: Fri, 10 Apr 2009 21:40:35 GMT Message-Id: <200904102140.n3ALeZDg069968@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-amd64@FreeBSD.org, freebsd-bugs@FreeBSD.org From: linimon@FreeBSD.org X-Mailman-Approved-At: Fri, 10 Apr 2009 21:54:05 +0000 Cc: Subject: Re: misc/133540: Cannot connect to ftp mirrors for 7.2 beta boot-only X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Apr 2009 21:40:36 -0000 Synopsis: Cannot connect to ftp mirrors for 7.2 beta boot-only Responsible-Changed-From-To: freebsd-amd64->freebsd-bugs Responsible-Changed-By: linimon Responsible-Changed-When: Fri Apr 10 21:39:51 UTC 2009 Responsible-Changed-Why: probably not amd64-specific. http://www.freebsd.org/cgi/query-pr.cgi?pr=133540 From owner-freebsd-amd64@FreeBSD.ORG Sat Apr 11 02:40:02 2009 Return-Path: Delivered-To: freebsd-amd64@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 295A9106568A for ; Sat, 11 Apr 2009 02:40:02 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 554358FC1D for ; Sat, 11 Apr 2009 02:40:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3B2e1Ya071357 for ; Sat, 11 Apr 2009 02:40:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3B2e1Bi071356; Sat, 11 Apr 2009 02:40:01 GMT (envelope-from gnats) Resent-Date: Sat, 11 Apr 2009 02:40:01 GMT Resent-Message-Id: <200904110240.n3B2e1Bi071356@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-amd64@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Jason Harmening Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3B9B61065670 for ; Sat, 11 Apr 2009 02:37:18 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21]) by mx1.freebsd.org (Postfix) with ESMTP id 29ED98FC0A for ; Sat, 11 Apr 2009 02:37:18 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (localhost [127.0.0.1]) by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n3B2bHkd015742 for ; Sat, 11 Apr 2009 02:37:17 GMT (envelope-from nobody@www.freebsd.org) Received: (from nobody@localhost) by www.freebsd.org (8.14.3/8.14.3/Submit) id n3B2bHvE015726; Sat, 11 Apr 2009 02:37:17 GMT (envelope-from nobody) Message-Id: <200904110237.n3B2bHvE015726@www.freebsd.org> Date: Sat, 11 Apr 2009 02:37:17 GMT From: Jason Harmening To: freebsd-gnats-submit@FreeBSD.org X-Send-Pr-Version: www-3.1 X-Mailman-Approved-At: Sat, 11 Apr 2009 03:02:59 +0000 Cc: Subject: amd64/133592: busdma incorrectly calculates bounce buffer requirements for userspace buffers X-BeenThere: freebsd-amd64@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Porting FreeBSD to the AMD64 platform List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 11 Apr 2009 02:40:02 -0000 >Number: 133592 >Category: amd64 >Synopsis: busdma incorrectly calculates bounce buffer requirements for userspace buffers >Confidential: no >Severity: non-critical >Priority: medium >Responsible: freebsd-amd64 >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Sat Apr 11 02:40:00 UTC 2009 >Closed-Date: >Last-Modified: >Originator: Jason Harmening >Release: 7.2-PRERELEASE >Organization: >Environment: FreeBSD CORONA 7.2-PRERELEASE FreeBSD 7.2-PRERELEASE #2: Sat Mar 21 00:30:34 CDT 2009 jason@CORONA:/usr/obj/usr/src/sys/CUSTOM amd64 >Description: The _bus_dmamap_load_buffer function in sys/amd64/amd64/busdma_machdep.c takes a pmap_t param indicating the address space of the buffer (NULL => KVA space). When calculating the number of bounce buffers to reserve, it always calls pmap_kextract() to get the physical address, when it should call pmap_extract() if pmap != NULL. The problem exists in both 7-STABLE and 8-CURRENT--the attached patch is against 7-STABLE. >How-To-Repeat: >Fix: Patch attached with submission follows: --- busdma_machdep.c.bkp 2009-04-10 21:27:51.000000000 -0500 +++ busdma_machdep.c 2009-04-10 21:30:58.000000000 -0500 @@ -602,7 +602,10 @@ vendaddr = (vm_offset_t)buf + buflen; while (vaddr < vendaddr) { - paddr = pmap_kextract(vaddr); + if (pmap) + paddr = pmap_extract(pmap, vaddr); + else + paddr = pmap_kextract(vaddr); if (run_filter(dmat, paddr) != 0) map->pagesneeded++; vaddr += PAGE_SIZE; >Release-Note: >Audit-Trail: >Unformatted: