From owner-freebsd-amd64@FreeBSD.ORG  Sun Apr  5 18:30:38 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 87C1C10659C6;
	Sun,  5 Apr 2009 18:30:38 +0000 (UTC)
	(envelope-from tinderbox@freebsd.org)
Received: from smarthost1.sentex.ca (smarthost1.sentex.ca [64.7.153.18])
	by mx1.freebsd.org (Postfix) with ESMTP id 48C898FC14;
	Sun,  5 Apr 2009 18:30:38 +0000 (UTC)
	(envelope-from tinderbox@freebsd.org)
Received: from smtp1.sentex.ca (smtp1c.sentex.ca [64.7.153.10])
	by smarthost1.sentex.ca (8.14.3/8.14.3) with ESMTP id n35IUaE9075784;
	Sun, 5 Apr 2009 14:30:36 -0400 (EDT)
	(envelope-from tinderbox@freebsd.org)
Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca
	[64.7.128.98])
	by smtp1.sentex.ca (8.14.3/8.14.3) with ESMTP id n35IUawC040860;
	Sun, 5 Apr 2009 14:30:36 -0400 (EDT)
	(envelope-from tinderbox@freebsd.org)
Received: by freebsd-current.sentex.ca (Postfix, from userid 666)
	id 138117302F; Sun,  5 Apr 2009 14:30:36 -0400 (EDT)
Sender: FreeBSD Tinderbox <tinderbox@freebsd.org>
From: FreeBSD Tinderbox <tinderbox@freebsd.org>
To: FreeBSD Tinderbox <tinderbox@freebsd.org>, <current@freebsd.org>,
	<amd64@freebsd.org>
Precedence: bulk
Message-Id: <20090405183036.138117302F@freebsd-current.sentex.ca>
Date: Sun,  5 Apr 2009 14:30:36 -0400 (EDT)
X-Virus-Scanned: ClamAV 0.94.1/8983/Thu Feb 12 07:48:01 2009
	clamav-milter version 0.94.2 on clamscanner2
X-Virus-Status: Clean
X-Scanned-By: MIMEDefang 2.64 on 64.7.153.18
Cc: 
Subject: [head tinderbox] failure on amd64/amd64
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 18:30:40 -0000

TB --- 2009-04-05 17:20:00 - tinderbox 2.6 running on freebsd-current.sentex.ca
TB --- 2009-04-05 17:20:00 - starting HEAD tinderbox run for amd64/amd64
TB --- 2009-04-05 17:20:00 - cleaning the object tree
TB --- 2009-04-05 17:21:10 - cvsupping the source tree
TB --- 2009-04-05 17:21:10 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/HEAD/amd64/amd64/supfile
TB --- 2009-04-05 17:21:17 - building world
TB --- 2009-04-05 17:21:17 - MAKEOBJDIRPREFIX=/obj
TB --- 2009-04-05 17:21:17 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2009-04-05 17:21:17 - TARGET=amd64
TB --- 2009-04-05 17:21:17 - TARGET_ARCH=amd64
TB --- 2009-04-05 17:21:17 - TZ=UTC
TB --- 2009-04-05 17:21:17 - __MAKE_CONF=/dev/null
TB --- 2009-04-05 17:21:17 - cd /src
TB --- 2009-04-05 17:21:17 - /usr/bin/make -B buildworld
>>> World build started on Sun Apr  5 17:21:20 UTC 2009
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
[...]
rm -f .depend
mkdep -f .depend -a    -DRESCUE /src/sbin/routed/rtquery/rtquery.c
echo rtquery: /obj/amd64/src/tmp/usr/lib/libc.a /obj/amd64/src/tmp/usr/lib/libmd.a >> .depend
cc -O2 -pipe  -DRESCUE -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wcast-align -Wunused-parameter -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wno-pointer-sign -c /src/sbin/routed/if.c
cc -O2 -pipe  -DRESCUE -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wcast-align -Wunused-parameter -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wno-pointer-sign -c /src/sbin/routed/input.c
cc1: warnings being treated as errors
/src/sbin/routed/input.c: In function 'ck_passwd':
/src/sbin/routed/input.c:984: warning: format '%#x' expects type 'unsigned int', but argument 5 has type 'long unsigned int'
*** Error code 1

Stop in /src/sbin/routed.
*** Error code 1

Stop in /obj/amd64/src/rescue/rescue.
*** Error code 1

Stop in /src/rescue/rescue.
*** Error code 1

Stop in /src/rescue.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2009-04-05 18:30:35 - WARNING: /usr/bin/make returned exit code  1 
TB --- 2009-04-05 18:30:35 - ERROR: failed to build world
TB --- 2009-04-05 18:30:35 - 3245.27 user 338.72 system 4235.45 real


http://tinderbox.des.no/tinderbox-head-HEAD-amd64-amd64.full

From owner-freebsd-amd64@FreeBSD.ORG  Mon Apr  6 11:06:50 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8C30D10656C6
	for <freebsd-amd64@FreeBSD.org>; Mon,  6 Apr 2009 11:06:50 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 78A5A8FC26
	for <freebsd-amd64@FreeBSD.org>; Mon,  6 Apr 2009 11:06:50 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n36B6oOU061792
	for <freebsd-amd64@FreeBSD.org>; Mon, 6 Apr 2009 11:06:50 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n36B6nBk061788
	for freebsd-amd64@FreeBSD.org; Mon, 6 Apr 2009 11:06:49 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 6 Apr 2009 11:06:49 GMT
Message-Id: <200904061106.n36B6nBk061788@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-amd64@FreeBSD.org
X-Mailman-Approved-At: Mon, 06 Apr 2009 11:30:44 +0000
Cc: 
Subject: Current problem reports assigned to freebsd-amd64@FreeBSD.org
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 11:06:51 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o amd64/132574 amd64      [boot] Freeze on bootstrap loader (CD) using ATA/133 P
o amd64/132170 amd64      7.1 kernel compilation problem
o amd64/132019 amd64      [install] kernel trap 12 while installation
o amd64/131906 amd64      [ata] SATA data corruption with Promise PDC20378 (amd6
o amd64/131456 amd64      ACPI & ATA problems
o amd64/131314 amd64      [modules] [panic] large modules fail to load on amd64
o amd64/131209 amd64      [panic] [bce] 7.1-STABLE amd64 crash - m0 NULL
f amd64/130885 amd64      sockstat(1) on amd64 does not work
o amd64/130864 amd64      [hang] Problem with copying files to a large partition
o amd64/130817 amd64      FreeBSD does not support HP DL160G5 [regression]
o amd64/130494 amd64      [boot] netbooting BTX fails on amd64
o amd64/130483 amd64      [mxge] MSI must be disabled when Myricom 10Gbps Card i
o amd64/130368 amd64      [hang] Switching from xorg to console locks up compute
o amd64/129889 amd64      [boot] The booting process stops at the line mounting 
o amd64/129721 amd64      [hang] Motherboard K9N2G Neo-FD hangs on boot of 7.0-R
o amd64/129667 amd64      [ata] Elitegroup A780GM-A IDE controller not recognize
o amd64/129488 amd64      [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: 
o amd64/129426 amd64      [panic] FreeBSD 7.0 crash after subdiskXX: detached
o amd64/129315 amd64      [boot] amd64 motherboard: Intel DG965WH motherboard co
f amd64/128978 amd64      [install] FreeBSD 6.3 64-bit  panics at boot time duri
o amd64/128810 amd64      AMD 64 port installation
o amd64/128765 amd64      [install] Install CD loads to Install choices but stop
o amd64/128686 amd64      [ata] can't detect SATA Disk on 8.0-Current with NF550
o amd64/128263 amd64      [panic] 2 amd64 dl380 g5 with dual quadcore xeons, 8 a
o amd64/128259 amd64      csh(1): "`" crashes csh
o amd64/127640 amd64      gcc(1) will not build shared libraries with -fprofile-
o amd64/127492 amd64      [zfs] System hang on ZFS input-output
o amd64/127484 amd64      [timecounters] Drift problem with FreeBSD 7.0 and 7.1 
o amd64/127451 amd64      [scheduler] incorrect load on quad core
o amd64/127397 amd64      [amd64] 32bit application on FreeBSD-6.3 amd64 gets SI
s amd64/127276 amd64      ldd(1) invokes linux yes
o amd64/127129 amd64      mdconfig(8) is core dumping with Segmentation Fault 11
o amd64/125873 amd64      [smbd] [panic] Repeated kernel panics, trap 12 page fa
o amd64/125002 amd64      [install] amd64, SATA hard disks not detected
o amd64/124432 amd64      [panic] 7.0-STABLE panic: invalbuf: dirty bufs
o amd64/124134 amd64      [kernel] The kernel doesn't follow the calling convent
o amd64/123562 amd64      [install] FreeBSD amd64 not installs
o amd64/123520 amd64      [ahd] unable to boot from net while using ahd
o amd64/123456 amd64      fstat(1): /usr/bin/fstat shows error messages and hang
f amd64/123275 amd64      [cbb] [pcmcia] cbb/pcmcia drivers on amd64 failure [re
o kern/122782  amd64      [modules] accf_http.ko kernel module is not loadable
o amd64/122695 amd64      [cpufreq] Lack of cpufreq control using amd64 eith cor
o amd64/122624 amd64      unusable minimal installation of FreeBSD-7.0
o amd64/122549 amd64      7.0-RELEASE-amd64-bootonly.iso doesn't work w/ serial 
o amd64/122468 amd64      Compile problems after upgrading to 7.0
o amd64/122174 amd64      [panic] 7.0 no longer includes "device atpic" so fails
o amd64/121590 amd64      [est] [p4tcc] [acpi_perf] setting dev.cpu.0.freq somet
o amd64/121439 amd64      [boot] Installation of FreeBSD 7.0 fails: ACPI problem
o amd64/120202 amd64      [amd64] [patch] [panic] kernel panic at start_all_aps,
o amd64/119591 amd64      [amd64] [patch] time_t on 64-bit architecture
o amd64/117418 amd64      [hang] FreeBSD 6.2 crash on amd64 4400+ with ssh
o amd64/117316 amd64      [acpi] ACPI lockups on SuperMicro motherboard
o amd64/117296 amd64      [ata] I don`t see second SATA IDE on VIA VT8237A
a amd64/117186 amd64      [modules] kldload Unsupported file type on STABLE amd6
s amd64/116689 amd64      [request] support for MSI K9MM-V
o amd64/116620 amd64      [hang] ifconfig spins when creating carp(4) device on 
o amd64/116322 amd64      [panic] At start fsck on current, the system panics
o amd64/116159 amd64      [panic] Panic while debugging on CURRENT
s amd64/115815 amd64      [ata] [request] Gigabyte GA-M61P-S3 Motherboard unsupp
o amd64/115581 amd64      [Makefile] [patch] -mfancy-math-387 has no effect
o amd64/115194 amd64      LCD screen remains blank after Dell XPS M1210 lid is c
o amd64/114270 amd64      [cpufreq] cpufreq doesnt work when compiled in to kern
o amd64/114111 amd64      [nfs] System crashes while writing on NFS-mounted shar
f amd64/113021 amd64      [re] ASUS M2A-VM onboard NIC does not work
o amd64/112222 amd64      [libc] 32-bit libc incorrectly converts some FP number
f amd64/111992 amd64      [boot] BTX failed - HP Laptop dv2315nr
o amd64/110655 amd64      [threads] 32 bit threaded applications crash on amd64 
o amd64/110599 amd64      [geli] geli attach to gmirror device hangs and cannot 
s amd64/108861 amd64      [nve] nve(4) driver on FreeBSD 6.2 AMD64 does not work
o amd64/106186 amd64      [panic] panic in swap_pager_swap_init (amd64/smp/6.2-p
f amd64/105629 amd64      [re] TrendNet TEG-BUSR 10/100/1000 disables itself on 
f amd64/105531 amd64      [ata] gigabyte GA-M51GM-S2G / nVidia nForce 430 - does
f amd64/105514 amd64      [boot] FreeBSD/amd64 - Fails to boot on HP Pavilion dv
o amd64/102716 amd64      ex with no argument in an xterm gets SIGSEGV
o amd64/97337  amd64      [dri] xorg reboots system if dri module is enabled
o amd64/95888  amd64      [ata] kernel: ad2: TIMEOUT - WRITE_DMA retrying on HP 
f amd64/94989  amd64      [boot] BTX Halts on Sun Fire X2100 w/6.1-BETA4 (amd64)
o amd64/94677  amd64      [panic] panic in amd64 install at non-root user creati
o amd64/93961  amd64      [busdma] Problem in bounce buffer handling in sys/amd6
o amd64/92337  amd64      [em] FreeBSD 6.0 Release Intel Pro 1000 MT em1 no buff
f amd64/91492  amd64      [boot] BTX halted
o amd64/91405  amd64      [asr] [panic] Kernel panic caused by asr on 6.0-amd64 
o amd64/89501  amd64      [install] System crashes on install using ftp on local
o amd64/88790  amd64      [panic] kernel panic on first boot (after the FreeBSD 
o amd64/88568  amd64      [panic] 6.0-RELEASE install cd does not boot with usb 
o amd64/87689  amd64      [powerd] [hang] powerd hangs SMP Opteron 244 5-STABLE
o amd64/87316  amd64      [vge] "vge0 attach returned 6" on FreeBSD 6.0-RC1 amd6
o amd64/87305  amd64      [smp] Dual Opteron / FreeBSD 5 & 6 / powerd results in
f amd64/87258  amd64      [smp] [boot] cannot boot with SMP and Areca ARC-1160 r
f amd64/86080  amd64      [radeon] [hang] radeon DRI causes system hang on amd64
s amd64/85273  amd64      [install] FreeBSD (NetBSD or OpenBSD) not install on l
o amd64/78406  amd64      [panic]AMD64 w/ SCSI: issue 'rm -r /usr/ports' and sys
o amd64/76136  amd64      [hang] system halts before reboot
o amd64/74747  amd64      [panic] System panic on shutdown when process will not
f amd64/73322  amd64      [msdosfs] [hang] unarchiving /etc to msdosfs locks up 

95 problems total.


From owner-freebsd-amd64@FreeBSD.ORG  Wed Apr  8 20:25:05 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 222BD1065672
	for <freebsd-amd64@freebsd.org>; Wed,  8 Apr 2009 20:25:05 +0000 (UTC)
	(envelope-from christoph.mallon@gmx.de)
Received: from mail.gmx.net (mail.gmx.net [213.165.64.20])
	by mx1.freebsd.org (Postfix) with SMTP id 26A4F8FC19
	for <freebsd-amd64@freebsd.org>; Wed,  8 Apr 2009 20:25:03 +0000 (UTC)
	(envelope-from christoph.mallon@gmx.de)
Received: (qmail invoked by alias); 08 Apr 2009 19:58:18 -0000
Received: from p54A3ED49.dip.t-dialin.net (EHLO tron.homeunix.org)
	[84.163.237.73]
	by mail.gmx.net (mp003) with SMTP; 08 Apr 2009 21:58:18 +0200
X-Authenticated: #1673122
X-Provags-ID: V01U2FsdGVkX18USWokAY8UFTCCyifQrh6VcCjKva/H0V5Jjs2LXe
	KHDQGN/SzkFTLo
Message-ID: <49DD01CA.3040900@gmx.de>
Date: Wed, 08 Apr 2009 21:58:02 +0200
From: Christoph Mallon <christoph.mallon@gmx.de>
User-Agent: Thunderbird 2.0.0.19 (X11/20090103)
MIME-Version: 1.0
To: freebsd-amd64@freebsd.org, freebsd-i386@freebsd.org
Content-Type: multipart/mixed; boundary="------------010606040208020605070204"
X-Y-GMX-Trusted: 0
X-FuHaFi: 0.63,0.46
X-Mailman-Approved-At: Wed, 08 Apr 2009 20:36:55 +0000
Cc: Ed Schouten <ed@FreeBSD.org>
Subject: [PATCH] Simplify in*() and out*() functions of AMD64 and i386
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 20:25:05 -0000

This is a multi-part message in MIME format.
--------------010606040208020605070204
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit

Hi amd64@ and i386@,

attached is a patch which simplifies the in*() and out*() functions for 
I/O port access of AMD64 and i386. It removes an unnecessary distinction 
of cases for inb() and outb(), which was used to generate better code 
for ports < 256. This is unnecessary, because GCC supports an asm input 
constraint to handle this ("N"). The stale comment, which states there 
is no constraint for this, is removed, too. Also the {in,out}{w,l}() get 
treated with this constraint. They had no special case before, so now 
better code is generated for them. Further, the unnecessary "cld" is 
removed from {in,out}s{b,w,l}(), because it is guaranteed by the ABI 
that the direction flag is cleared. All in all the code for in/out gets 
a bit simpler.


Comments are welcome

	Christoph

--------------010606040208020605070204
Content-Type: text/plain;
 name="inout.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="inout.diff"

Index: sys/i386/include/cpufunc.h
===================================================================
--- sys/i386/include/cpufunc.h	(Revision 190841)
+++ sys/i386/include/cpufunc.h	(Arbeitskopie)
@@ -170,177 +170,97 @@
 	__asm __volatile("hlt");
 }
 
-#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3
-
-#define	inb(port)		inbv(port)
-#define	outb(port, data)	outbv(port, data)
-
-#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */
-
-/*
- * The following complications are to get around gcc not having a
- * constraint letter for the range 0..255.  We still put "d" in the
- * constraint because "i" isn't a valid constraint when the port
- * isn't constant.  This only matters for -O0 because otherwise
- * the non-working version gets optimized away.
- * 
- * Use an expression-statement instead of a conditional expression
- * because gcc-2.6.0 would promote the operands of the conditional
- * and produce poor code for "if ((inb(var) & const1) == const2)".
- *
- * The unnecessary test `(port) < 0x10000' is to generate a warning if
- * the `port' has type u_short or smaller.  Such types are pessimal.
- * This actually only works for signed types.  The range check is
- * careful to avoid generating warnings.
- */
-#define	inb(port) __extension__ ({					\
-	u_char	_data;							\
-	if (__builtin_constant_p(port) && ((port) & 0xffff) < 0x100	\
-	    && (port) < 0x10000)					\
-		_data = inbc(port);					\
-	else								\
-		_data = inbv(port);					\
-	_data; })
-
-#define	outb(port, data) (						\
-	__builtin_constant_p(port) && ((port) & 0xffff) < 0x100		\
-	&& (port) < 0x10000						\
-	? outbc(port, data) : outbv(port, data))
-
-static __inline u_char
-inbc(u_int port)
+static inline u_char
+inb(u_short port)
 {
-	u_char	data;
-
-	__asm __volatile("inb %1,%0" : "=a" (data) : "id" ((u_short)(port)));
-	return (data);
+	u_char data;
+	__asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port));
+	return data;
 }
 
-static __inline void
-outbc(u_int port, u_char data)
+static inline u_int
+inl(u_short port)
 {
-	__asm __volatile("outb %0,%1" : : "a" (data), "id" ((u_short)(port)));
-}
-
-#endif /* __GNUCLIKE_BUILTIN_CONSTANT_P  && __GNUCLIKE_ASM >= 3*/
-
-static __inline u_char
-inbv(u_int port)
-{
-	u_char	data;
-	/*
-	 * We use %%dx and not %1 here because i/o is done at %dx and not at
-	 * %edx, while gcc generates inferior code (movw instead of movl)
-	 * if we tell it to load (u_short) port.
-	 */
-	__asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port));
+	u_int data;
+	__asm volatile("inl %1, %0" : "=a" (data) : "Nd" (port));
 	return (data);
 }
 
-static __inline u_int
-inl(u_int port)
+static inline void
+insb(u_short port, void *addr, size_t cnt)
 {
-	u_int	data;
-
-	__asm __volatile("inl %%dx,%0" : "=a" (data) : "d" (port));
-	return (data);
+	__asm volatile("rep; insb"
+	    : "+D" (addr), "+c" (cnt)
+	    : "d" (port)
+	    : "memory");
 }
 
-static __inline void
-insb(u_int port, void *addr, size_t cnt)
+static inline void
+insw(u_short port, void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; insb"
-			 : "+D" (addr), "+c" (cnt)
-			 : "d" (port)
-			 : "memory");
+	__asm volatile("rep; insw"
+	    : "+D" (addr), "+c" (cnt)
+	    : "d" (port)
+	    : "memory");
 }
 
-static __inline void
-insw(u_int port, void *addr, size_t cnt)
+static inline void
+insl(u_short port, void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; insw"
-			 : "+D" (addr), "+c" (cnt)
-			 : "d" (port)
-			 : "memory");
+	__asm volatile("rep; insl"
+	    : "+D" (addr), "+c" (cnt)
+	    : "d" (port)
+	    : "memory");
 }
 
 static __inline void
-insl(u_int port, void *addr, size_t cnt)
-{
-	__asm __volatile("cld; rep; insl"
-			 : "+D" (addr), "+c" (cnt)
-			 : "d" (port)
-			 : "memory");
-}
-
-static __inline void
 invd(void)
 {
 	__asm __volatile("invd");
 }
 
-static __inline u_short
-inw(u_int port)
+static inline u_short
+inw(u_short port)
 {
-	u_short	data;
-
-	__asm __volatile("inw %%dx,%0" : "=a" (data) : "d" (port));
+	u_short data;
+	__asm volatile("inw %1, %0" : "=a" (data) : "Nd" (port));
 	return (data);
 }
 
-static __inline void
-outbv(u_int port, u_char data)
+static inline void
+outb(u_short port, u_char data)
 {
-	u_char	al;
-	/*
-	 * Use an unnecessary assignment to help gcc's register allocator.
-	 * This make a large difference for gcc-1.40 and a tiny difference
-	 * for gcc-2.6.0.  For gcc-1.40, al had to be ``asm("ax")'' for
-	 * best results.  gcc-2.6.0 can't handle this.
-	 */
-	al = data;
-	__asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port));
+	__asm volatile("outb %0, %1" : : "a" (data), "Nd" (port));
 }
 
-static __inline void
-outl(u_int port, u_int data)
+static inline void
+outl(u_short port, u_int data)
 {
-	/*
-	 * outl() and outw() aren't used much so we haven't looked at
-	 * possible micro-optimizations such as the unnecessary
-	 * assignment for them.
-	 */
-	__asm __volatile("outl %0,%%dx" : : "a" (data), "d" (port));
+	__asm volatile("outl %0, %1" : : "a" (data), "Nd" (port));
 }
 
-static __inline void
-outsb(u_int port, const void *addr, size_t cnt)
+static inline void
+outsb(u_short port, const void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; outsb"
-			 : "+S" (addr), "+c" (cnt)
-			 : "d" (port));
+	__asm volatile("rep; outsb" : "+S" (addr), "+c" (cnt) : "d" (port));
 }
 
-static __inline void
-outsw(u_int port, const void *addr, size_t cnt)
+static inline void
+outsw(u_short port, const void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; outsw"
-			 : "+S" (addr), "+c" (cnt)
-			 : "d" (port));
+	__asm volatile("rep; outsw" : "+S" (addr), "+c" (cnt) : "d" (port));
 }
 
-static __inline void
-outsl(u_int port, const void *addr, size_t cnt)
+static inline void
+outsl(u_short port, const void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; outsl"
-			 : "+S" (addr), "+c" (cnt)
-			 : "d" (port));
+	__asm volatile("rep; outsl" : "+S" (addr), "+c" (cnt) : "d" (port));
 }
 
-static __inline void
-outw(u_int port, u_short data)
+static inline void
+outw(u_short port, u_short data)
 {
-	__asm __volatile("outw %0,%%dx" : : "a" (data), "d" (port));
+	__asm volatile("outw %0, %1" : : "a" (data), "Nd" (port));
 }
 
 static __inline void
Index: sys/i386/i386/machdep.c
===================================================================
--- sys/i386/i386/machdep.c	(Revision 190841)
+++ sys/i386/i386/machdep.c	(Arbeitskopie)
@@ -3555,45 +3555,24 @@
 #ifdef KDB
 
 /*
- * Provide inb() and outb() as functions.  They are normally only
- * available as macros calling inlined functions, thus cannot be
- * called from the debugger.
- *
- * The actual code is stolen from <machine/cpufunc.h>, and de-inlined.
+ * Provide inb() and outb() as functions.  They are normally only available as
+ * inline functions, thus cannot be called from the debugger.
  */
 
-#undef inb
-#undef outb
-
 /* silence compiler warnings */
-u_char inb(u_int);
-void outb(u_int, u_char);
+u_char inb_(u_short);
+void outb_(u_short, u_char);
 
 u_char
-inb(u_int port)
+inb_(u_short port)
 {
-	u_char	data;
-	/*
-	 * We use %%dx and not %1 here because i/o is done at %dx and not at
-	 * %edx, while gcc generates inferior code (movw instead of movl)
-	 * if we tell it to load (u_short) port.
-	 */
-	__asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port));
-	return (data);
+	return inb(port);
 }
 
 void
-outb(u_int port, u_char data)
+outb_(u_short port, u_char data)
 {
-	u_char	al;
-	/*
-	 * Use an unnecessary assignment to help gcc's register allocator.
-	 * This make a large difference for gcc-1.40 and a tiny difference
-	 * for gcc-2.6.0.  For gcc-1.40, al had to be ``asm("ax")'' for
-	 * best results.  gcc-2.6.0 can't handle this.
-	 */
-	al = data;
-	__asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port));
+	outb(port, data);
 }
 
 #endif /* KDB */
Index: sys/amd64/include/cpufunc.h
===================================================================
--- sys/amd64/include/cpufunc.h	(Revision 190841)
+++ sys/amd64/include/cpufunc.h	(Arbeitskopie)
@@ -164,177 +164,97 @@
 	__asm __volatile("hlt");
 }
 
-#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3
-
-#define	inb(port)		inbv(port)
-#define	outb(port, data)	outbv(port, data)
-
-#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */
-
-/*
- * The following complications are to get around gcc not having a
- * constraint letter for the range 0..255.  We still put "d" in the
- * constraint because "i" isn't a valid constraint when the port
- * isn't constant.  This only matters for -O0 because otherwise
- * the non-working version gets optimized away.
- * 
- * Use an expression-statement instead of a conditional expression
- * because gcc-2.6.0 would promote the operands of the conditional
- * and produce poor code for "if ((inb(var) & const1) == const2)".
- *
- * The unnecessary test `(port) < 0x10000' is to generate a warning if
- * the `port' has type u_short or smaller.  Such types are pessimal.
- * This actually only works for signed types.  The range check is
- * careful to avoid generating warnings.
- */
-#define	inb(port) __extension__ ({					\
-	u_char	_data;							\
-	if (__builtin_constant_p(port) && ((port) & 0xffff) < 0x100	\
-	    && (port) < 0x10000)					\
-		_data = inbc(port);					\
-	else								\
-		_data = inbv(port);					\
-	_data; })
-
-#define	outb(port, data) (						\
-	__builtin_constant_p(port) && ((port) & 0xffff) < 0x100		\
-	&& (port) < 0x10000						\
-	? outbc(port, data) : outbv(port, data))
-
-static __inline u_char
-inbc(u_int port)
+static inline u_char
+inb(u_short port)
 {
-	u_char	data;
-
-	__asm __volatile("inb %1,%0" : "=a" (data) : "id" ((u_short)(port)));
-	return (data);
+	u_char data;
+	__asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port));
+	return data;
 }
 
-static __inline void
-outbc(u_int port, u_char data)
+static inline u_int
+inl(u_short port)
 {
-	__asm __volatile("outb %0,%1" : : "a" (data), "id" ((u_short)(port)));
-}
-
-#endif /* __GNUCLIKE_BUILTIN_CONSTANT_P  && __GNUCLIKE_ASM >= 3*/
-
-static __inline u_char
-inbv(u_int port)
-{
-	u_char	data;
-	/*
-	 * We use %%dx and not %1 here because i/o is done at %dx and not at
-	 * %edx, while gcc generates inferior code (movw instead of movl)
-	 * if we tell it to load (u_short) port.
-	 */
-	__asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port));
+	u_int data;
+	__asm volatile("inl %1, %0" : "=a" (data) : "Nd" (port));
 	return (data);
 }
 
-static __inline u_int
-inl(u_int port)
+static inline void
+insb(u_short port, void *addr, size_t cnt)
 {
-	u_int	data;
-
-	__asm __volatile("inl %%dx,%0" : "=a" (data) : "d" (port));
-	return (data);
+	__asm volatile("rep; insb"
+	    : "+D" (addr), "+c" (cnt)
+	    : "d" (port)
+	    : "memory");
 }
 
-static __inline void
-insb(u_int port, void *addr, size_t cnt)
+static inline void
+insw(u_short port, void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; insb"
-			 : "+D" (addr), "+c" (cnt)
-			 : "d" (port)
-			 : "memory");
+	__asm volatile("rep; insw"
+	    : "+D" (addr), "+c" (cnt)
+	    : "d" (port)
+	    : "memory");
 }
 
-static __inline void
-insw(u_int port, void *addr, size_t cnt)
+static inline void
+insl(u_short port, void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; insw"
-			 : "+D" (addr), "+c" (cnt)
-			 : "d" (port)
-			 : "memory");
+	__asm volatile("rep; insl"
+	    : "+D" (addr), "+c" (cnt)
+	    : "d" (port)
+	    : "memory");
 }
 
 static __inline void
-insl(u_int port, void *addr, size_t cnt)
-{
-	__asm __volatile("cld; rep; insl"
-			 : "+D" (addr), "+c" (cnt)
-			 : "d" (port)
-			 : "memory");
-}
-
-static __inline void
 invd(void)
 {
 	__asm __volatile("invd");
 }
 
-static __inline u_short
-inw(u_int port)
+static inline u_short
+inw(u_short port)
 {
-	u_short	data;
-
-	__asm __volatile("inw %%dx,%0" : "=a" (data) : "d" (port));
+	u_short data;
+	__asm volatile("inw %1, %0" : "=a" (data) : "Nd" (port));
 	return (data);
 }
 
-static __inline void
-outbv(u_int port, u_char data)
+static inline void
+outb(u_short port, u_char data)
 {
-	u_char	al;
-	/*
-	 * Use an unnecessary assignment to help gcc's register allocator.
-	 * This make a large difference for gcc-1.40 and a tiny difference
-	 * for gcc-2.6.0.  For gcc-1.40, al had to be ``asm("ax")'' for
-	 * best results.  gcc-2.6.0 can't handle this.
-	 */
-	al = data;
-	__asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port));
+	__asm volatile("outb %0, %1" : : "a" (data), "Nd" (port));
 }
 
-static __inline void
-outl(u_int port, u_int data)
+static inline void
+outl(u_short port, u_int data)
 {
-	/*
-	 * outl() and outw() aren't used much so we haven't looked at
-	 * possible micro-optimizations such as the unnecessary
-	 * assignment for them.
-	 */
-	__asm __volatile("outl %0,%%dx" : : "a" (data), "d" (port));
+	__asm volatile("outl %0, %1" : : "a" (data), "Nd" (port));
 }
 
-static __inline void
-outsb(u_int port, const void *addr, size_t cnt)
+static inline void
+outsb(u_short port, const void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; outsb"
-			 : "+S" (addr), "+c" (cnt)
-			 : "d" (port));
+	__asm volatile("rep; outsb" : "+S" (addr), "+c" (cnt) : "d" (port));
 }
 
-static __inline void
-outsw(u_int port, const void *addr, size_t cnt)
+static inline void
+outsw(u_short port, const void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; outsw"
-			 : "+S" (addr), "+c" (cnt)
-			 : "d" (port));
+	__asm volatile("rep; outsw" : "+S" (addr), "+c" (cnt) : "d" (port));
 }
 
-static __inline void
-outsl(u_int port, const void *addr, size_t cnt)
+static inline void
+outsl(u_short port, const void *addr, size_t cnt)
 {
-	__asm __volatile("cld; rep; outsl"
-			 : "+S" (addr), "+c" (cnt)
-			 : "d" (port));
+	__asm volatile("rep; outsl" : "+S" (addr), "+c" (cnt) : "d" (port));
 }
 
-static __inline void
-outw(u_int port, u_short data)
+static inline void
+outw(u_short port, u_short data)
 {
-	__asm __volatile("outw %0,%%dx" : : "a" (data), "d" (port));
+	__asm volatile("outw %0, %1" : : "a" (data), "Nd" (port));
 }
 
 static __inline void
Index: sys/amd64/amd64/machdep.c
===================================================================
--- sys/amd64/amd64/machdep.c	(Revision 190841)
+++ sys/amd64/amd64/machdep.c	(Arbeitskopie)
@@ -2178,45 +2178,24 @@
 #ifdef KDB
 
 /*
- * Provide inb() and outb() as functions.  They are normally only
- * available as macros calling inlined functions, thus cannot be
- * called from the debugger.
- *
- * The actual code is stolen from <machine/cpufunc.h>, and de-inlined.
+ * Provide inb() and outb() as functions.  They are normally only available as
+ * inline functions, thus cannot be called from the debugger.
  */
 
-#undef inb
-#undef outb
-
 /* silence compiler warnings */
-u_char inb(u_int);
-void outb(u_int, u_char);
+u_char inb_(u_short);
+void outb_(u_short, u_char);
 
 u_char
-inb(u_int port)
+inb_(u_short port)
 {
-	u_char	data;
-	/*
-	 * We use %%dx and not %1 here because i/o is done at %dx and not at
-	 * %edx, while gcc generates inferior code (movw instead of movl)
-	 * if we tell it to load (u_short) port.
-	 */
-	__asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port));
-	return (data);
+	return inb(port);
 }
 
 void
-outb(u_int port, u_char data)
+outb_(u_short port, u_char data)
 {
-	u_char	al;
-	/*
-	 * Use an unnecessary assignment to help gcc's register allocator.
-	 * This make a large difference for gcc-1.40 and a tiny difference
-	 * for gcc-2.6.0.  For gcc-1.40, al had to be ``asm("ax")'' for
-	 * best results.  gcc-2.6.0 can't handle this.
-	 */
-	al = data;
-	__asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port));
+	outb(port, data);
 }
 
 #endif /* KDB */

--------------010606040208020605070204--

From owner-freebsd-amd64@FreeBSD.ORG  Wed Apr  8 20:58:37 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5C1F21065673;
	Wed,  8 Apr 2009 20:58:37 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.terabit.net.ua (mail.terabit.net.ua [195.137.202.147])
	by mx1.freebsd.org (Postfix) with ESMTP id F1E8B8FC12;
	Wed,  8 Apr 2009 20:58:36 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from skuns.zoral.com.ua ([91.193.166.194] helo=mail.zoral.com.ua)
	by mail.terabit.net.ua with esmtps (TLSv1:AES256-SHA:256)
	(Exim 4.63 (FreeBSD)) (envelope-from <kostikbel@gmail.com>)
	id 1Lreqw-000ERN-J6; Wed, 08 Apr 2009 23:58:34 +0300
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua
	[10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id n38KwVC9074564
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 8 Apr 2009 23:58:31 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id
	n38KwVIS007210; Wed, 8 Apr 2009 23:58:31 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id n38KwV7I007209; 
	Wed, 8 Apr 2009 23:58:31 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Wed, 8 Apr 2009 23:58:31 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Christoph Mallon <christoph.mallon@gmx.de>
Message-ID: <20090408205831.GK3014@deviant.kiev.zoral.com.ua>
References: <49DD01CA.3040900@gmx.de>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="1y1tiN5hVw5cPBDe"
Content-Disposition: inline
In-Reply-To: <49DD01CA.3040900@gmx.de>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: ClamAV version 0.94.2,
	clamav-milter version 0.94.2 on skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00
	autolearn=ham version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
X-Virus-Scanned: mail.terabit.net.ua 1Lreqw-000ERN-J6
	cb1c7ae4464e023beb1abd38385d05dc
X-Terabit: YES
Cc: Ed Schouten <ed@freebsd.org>, freebsd-i386@freebsd.org,
	freebsd-amd64@freebsd.org
Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 20:58:38 -0000


--1y1tiN5hVw5cPBDe
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Apr 08, 2009 at 09:58:02PM +0200, Christoph Mallon wrote:
> Hi amd64@ and i386@,
>=20
> attached is a patch which simplifies the in*() and out*() functions for=
=20
> I/O port access of AMD64 and i386. It removes an unnecessary distinction=
=20
> of cases for inb() and outb(), which was used to generate better code=20
> for ports < 256. This is unnecessary, because GCC supports an asm input=
=20
> constraint to handle this ("N"). The stale comment, which states there=20
> is no constraint for this, is removed, too. Also the {in,out}{w,l}() get=
=20
> treated with this constraint. They had no special case before, so now=20

> better code is generated for them. Further, the unnecessary "cld" is=20
> removed from {in,out}s{b,w,l}(), because it is guaranteed by the ABI=20
> that the direction flag is cleared. All in all the code for in/out gets=
=20
> a bit simpler.
The DF flag is guaranteed to be cleared for usermode only. Currently,
kernel does not clear the flag on entry.

We did fixed signal handlers to always have DF cleared, but the kernel
was explicitely left out because used version of gcc does the right
thing with DF when needed.

--1y1tiN5hVw5cPBDe
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (FreeBSD)

iEYEARECAAYFAkndD/UACgkQC3+MBN1Mb4hXiwCg2e662P8WkauJesi4PCP7YRVT
/tEAoNa5yIybsppNuukP1CgbmNJMdClQ
=A95l
-----END PGP SIGNATURE-----

--1y1tiN5hVw5cPBDe--

From owner-freebsd-amd64@FreeBSD.ORG  Thu Apr  9 18:54:10 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C74911065797;
	Thu,  9 Apr 2009 18:54:10 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from fallbackmx08.syd.optusnet.com.au
	(fallbackmx08.syd.optusnet.com.au [211.29.132.10])
	by mx1.freebsd.org (Postfix) with ESMTP id 323028FC27;
	Thu,  9 Apr 2009 18:54:10 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au
	[211.29.132.190])
	by fallbackmx08.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	n39GksaR011236; Fri, 10 Apr 2009 02:46:54 +1000
Received: from besplex.bde.org (c122-107-120-227.carlnfd1.nsw.optusnet.com.au
	[122.107.120.227])
	by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	n39GknQe012764
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 10 Apr 2009 02:46:51 +1000
Date: Fri, 10 Apr 2009 02:46:49 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Christoph Mallon <christoph.mallon@gmx.de>
In-Reply-To: <49DD01CA.3040900@gmx.de>
Message-ID: <20090410001919.S1567@besplex.bde.org>
References: <49DD01CA.3040900@gmx.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Ed Schouten <ed@FreeBSD.org>, freebsd-i386@FreeBSD.org,
	freebsd-amd64@FreeBSD.org
Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Apr 2009 18:54:11 -0000

On Wed, 8 Apr 2009, Christoph Mallon wrote:

> attached is a patch which simplifies the in*() and out*() functions for I/O 
> port access of AMD64 and i386. It removes an unnecessary distinction of cases 
> for inb() and outb(), which was used to generate better code for ports < 256.

In gcc >= 2.  (gcc == 1 didn't even have __builtin_const_p(), so the
optimization was too hard to do).

> This is unnecessary, because GCC supports an asm input constraint to handle 
> this ("N"). The stale comment, which states there is no constraint for this, 
> is removed, too.

This was necessary, but is stale like its comment.  There is of course no
need to support gcc 1 or 2.

> Also the {in,out}{w,l}() get treated with this constraint. 
> They had no special case before, so now better code is generated for them.

Actually, worse code is generated for them on most cases since the
optimization of loading 32 bits for the usual case of a non-constant
operand is lost in your version -- see below.  The case of a constant
port address was rare except for 8-bit isa ports "on the motherboard"
and has become rarer with FreeBSD's runtime configuration becoming
even more excessive (so instead of "inb $N,%al", the code is normally
about 10 instructions for bus_space_read_1(sc->bst, sc->bsh, N1)).

> Further, the unnecessary "cld" is removed from {in,out}s{b,w,l}(), because it 
> is guaranteed by the ABI that the direction flag is cleared. All in all the 
> code for in/out gets a bit simpler.

kib@ already pointed out that the direction flag is not guaranteed to be
cleared in the kernel.  For amd64, this may be a bug in the kernel, but
for i386 it is a bug in gcc say that the ABI specifies that the direction
flag is cleared, since there is no standard i386 ABI so gcc must not
assume that the direction flag is clear, like it has done until recently.

There are lots more similar cld's in bus.h.

% Index: sys/i386/include/cpufunc.h
% ===================================================================
% --- sys/i386/include/cpufunc.h	(Revision 190841)
% +++ sys/i386/include/cpufunc.h	(Arbeitskopie)
% @@ -170,177 +170,97 @@
%  	__asm __volatile("hlt");
%  }
% 
% -#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3
% -
% -#define	inb(port)		inbv(port)
% -#define	outb(port, data)	outbv(port, data)
% -
% -#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */

The ifdefs are ugly, and this one seems to have been slightly wrong, but
they provide documentation aboout the features used:
- __GNUCLIKE_BUILTIN_CONSTANT_P: no longer needed
- __GNUCLIKE_ASM >= 3: to ensure that the "i" constraint is available.
   This used to be __GNUC__ >= 2.  If that was correct then >= 3 is
   stricter than necessary; otherwise >= 2 was not as strict as it
   should have been, since some versions of gcc-2 certainly supported
   the "i" constraint.

A precise translation of the above would be __GNU_PREREQ__(maj, min) ||
defined(__INTEL_COMPILER), where (maj, min) is the version of gcc that
implemented the "N" constraint and you should check that the Intel
compiler (all version?) support the "N" constraint.

% - * The unnecessary test `(port) < 0x10000' is to generate a warning if
% - * the `port' has type u_short or smaller.  Such types are pessimal.
% - * This actually only works for signed types.  The range check is
% - * careful to avoid generating warnings.

This won't work any more, if it ever did.  I think it only worked for
the constant case, but for the constant case the width doesn't really
matter.

% -static __inline u_char
% -inbc(u_int port)
% +static inline u_char
% +inb(u_short port)

__inline should be globabally changed to inline someday, not starting here.
Plain inline has been Standard for 10 years now, but it still isn't
standard -- we're still waiting for gcc to implement C99, and haven't
switched to gcc-4.3(?) or later which will implement C99 extern inline.

Changing the type changes the API and breaks an optimization.  Callers
are supposed to provide an arg of width 32 so that it can be loaded
as efficiently as possible (instruction prefixes and/or sign extension
waste code space and time, mainly on old CPUs).  The 0x10000 check
referred to in the above comment attempts to enforce this.  Casting
down in the API prevents the 32-bit loads.

%  {
% -	u_char	data;
% -
% -	__asm __volatile("inb %1,%0" : "=a" (data) : "id" ((u_short)(port)));

The port address was already cast down here in inbc().  This seems to
have been just because I didn't know gnu asm very well when I wrote
the above.  The cast has no effect in the usual case where the "i"
constraint is used, but when the "d" constraint is used (only when
compiling with -O0?) %1 would be %edx without the cast.  In inbv(), I
hard-code %dx, but that does't work here "i" case.  The correct method
seems to be to use %w1 in both places (keep u_int port everywhere and
use the same method for the "N" constraint).

% -	return (data);
% +	u_char data;
% +	__asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port));
% +	return data;
%  }

Style breakages:
- lost blank line after declarations.  This file followed the rule about
   this except for previous breakage.  It even followed it in some cases
   where the declarations are null.
- lost parentheses around return value.  This file followed the rule about
   this in all cases.
Style not-quite-fix:
- lost the tab after the type in the declaration.  In KNF, such a tab is
   not used for auto declarations.  However, this file was fairly consistent
   about breaking the rule.

Why change __volatile to volatile?  This is similar to changing __inline
to inline, except plain volatile is more standard.  <sys/cdefs.h> has
ifdef tangles which are supposed to allow users or cdefs.h itself to
define away volatile so as to support old compilers, and we use __volatile
instead of volatile a lot to support this.  This is probably not needed
for kernel headers.  However, cpufunc.h is sometimes abused in userland.

Fixed version (change nothing except the function name, %1, and the
constraint):

%%%
static __inline u_char
inb(u_int port)
{
 	u_char	data;

 	__asm __volatile("inb %w1, %0" : "=a" (data) : "Nd" (port));
 	return (data);
}
%%%

Similarly for the other functions.

BTW, I sometimes missing having the inverse of %[bwl...] -- a way to
generate an operand size letter from the type.  gcc-3 and gcc-4 have
some horribly incomplete support for this.

% -#endif /* __GNUCLIKE_BUILTIN_CONSTANT_P  && __GNUCLIKE_ASM >= 3*/

Removing this accidentally fixes >= 2 style bugs in it.

% -static __inline u_char
% -inbv(u_int port)
% -{
% -	u_char	data;
% -	/*
% -	 * We use %%dx and not %1 here because i/o is done at %dx and not at
% -	 * %edx, while gcc generates inferior code (movw instead of movl)
% -	 * if we tell it to load (u_short) port.
% -	 */

Lost this documentation.  Now the magic in my version is in %w in my version.
The comment shouldn't blame gcc for generating movw -- that is the
programmer's fault.

gcc on modern machines now generates movswl for loading a u_short port
to the u_int port specified by the old API, and movswl is fast on modern
machines, so there is no penalty in using the old API/implementation
on modern machines even if the caller neglected to start with a u_int
port.

% ...
% -static __inline void
% -insb(u_int port, void *addr, size_t cnt)
% +static inline void
% +insw(u_short port, void *addr, size_t cnt)
%  {
% -	__asm __volatile("cld; rep; insb"
% -			 : "+D" (addr), "+c" (cnt)
% -			 : "d" (port)
% -			 : "memory");
% +	__asm volatile("rep; insw"
% +	    : "+D" (addr), "+c" (cnt)
% +	    : "d" (port)
% +	    : "memory");
%  }

The diffs are hard to read, apparently due to diff getting confused
by your API and style changes and comparing unrelated functions.
Without the API changes, I think it would have matched insb with insb,
etc.

% -static __inline void
% -outbv(u_int port, u_char data)
% +static inline void
% +outb(u_short port, u_char data)
%  {
% -	u_char	al;
% -	/*
% -	 * Use an unnecessary assignment to help gcc's register allocator.
% -	 * This make a large difference for gcc-1.40 and a tiny difference
% -	 * for gcc-2.6.0.  For gcc-1.40, al had to be ``asm("ax")'' for
% -	 * best results.  gcc-2.6.0 can't handle this.
% -	 */
% -	al = data;

This can go of course.  I saw dummy assignments bogusly helping the
allocator in more complicated asms for a version of atomic_cmpset()
as late as gcc-3, but that seemed to be fixed by gcc-3.3.

% Index: sys/i386/i386/machdep.c
% ===================================================================
% --- sys/i386/i386/machdep.c	(Revision 190841)
% +++ sys/i386/i386/machdep.c	(Arbeitskopie)
% @@ -3555,45 +3555,24 @@
%  #ifdef KDB
% 
%  /*
% - * Provide inb() and outb() as functions.  They are normally only
% - * available as macros calling inlined functions, thus cannot be
% - * called from the debugger.
% - *
% - * The actual code is stolen from <machine/cpufunc.h>, and de-inlined.
% + * Provide inb() and outb() as functions.  They are normally only available as
% + * inline functions, thus cannot be called from the debugger.
%   */

This should have been implemented by using cpufunc.h instead of repeating
it, and for everything in cpufunc.h not just inb and outb, e.g., as is
done for everything in atomic.h.  This gives a technical reason for
using __inline -- we want to #undef it and shouldn't #undef a standard
keyword.

% 
% -#undef inb
% -#undef outb
% -
%  /* silence compiler warnings */
% -u_char inb(u_int);
% -void outb(u_int, u_char);
% +u_char inb_(u_short);
% +void outb_(u_short, u_char);
% 
%  u_char
% -inb(u_int port)
% +inb_(u_short port)
%  {
% -	u_char	data;
% -	/*
% -	 * We use %%dx and not %1 here because i/o is done at %dx and not at
% -	 * %edx, while gcc generates inferior code (movw instead of movl)
% -	 * if we tell it to load (u_short) port.
% -	 */
% -	__asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port));
% -	return (data);
% +	return inb(port);
%  }
% 
%  void
% -outb(u_int port, u_char data)
% +outb_(u_short port, u_char data)
%  {
% -	u_char	al;
% -	/*
% -	 * Use an unnecessary assignment to help gcc's register allocator.
% -	 * This make a large difference for gcc-1.40 and a tiny difference
% -	 * for gcc-2.6.0.  For gcc-1.40, al had to be ``asm("ax")'' for
% -	 * best results.  gcc-2.6.0 can't handle this.
% -	 */
% -	al = data;
% -	__asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port));
% +	outb(port, data);
%  }
% 
%  #endif /* KDB */

This changes the (ddb undocumented) API and ABI to worse ones.  I
already don't line having to type "call inb(foo)" (in my x86 debugger,
inb is a command named "i", with certain space insensitivity that
allows typing "ifoo").

Bruce

From owner-freebsd-amd64@FreeBSD.ORG  Thu Apr  9 16:10:06 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2416C106564A
	for <freebsd-amd64@hub.freebsd.org>;
	Thu,  9 Apr 2009 16:10:06 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id D43A88FC1E
	for <freebsd-amd64@hub.freebsd.org>;
	Thu,  9 Apr 2009 16:10:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n39GA5eZ049040
	for <freebsd-amd64@freefall.freebsd.org>; Thu, 9 Apr 2009 16:10:05 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n39GA5J9049039;
	Thu, 9 Apr 2009 16:10:05 GMT (envelope-from gnats)
Resent-Date: Thu, 9 Apr 2009 16:10:05 GMT
Resent-Message-Id: <200904091610.n39GA5J9049039@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-amd64@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, hobury <info@hobury.com>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8BF21106564A
	for <freebsd-gnats-submit@FreeBSD.org>;
	Thu,  9 Apr 2009 16:00:29 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 5FD1F8FC08
	for <freebsd-gnats-submit@FreeBSD.org>;
	Thu,  9 Apr 2009 16:00:29 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n39G0S1N006543
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 9 Apr 2009 16:00:28 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n39G0SJM006542;
	Thu, 9 Apr 2009 16:00:28 GMT (envelope-from nobody)
Message-Id: <200904091600.n39G0SJM006542@www.freebsd.org>
Date: Thu, 9 Apr 2009 16:00:28 GMT
From: hobury <info@hobury.com>
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
X-Mailman-Approved-At: Thu, 09 Apr 2009 19:16:04 +0000
Cc: 
Subject: amd64/133540: Cannot connect to ftp mirrors for 7.2 beta boot-only
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Apr 2009 16:10:06 -0000


>Number:         133540
>Category:       amd64
>Synopsis:       Cannot connect to ftp mirrors for 7.2 beta boot-only
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-amd64
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 09 16:10:05 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     hobury
>Release:        7.2 beta
>Organization:
hobury
>Environment:
Not applicable, but currently using FreeBSD 7.1
>Description:
A boot-only of a 7.2 beta (announced 10 apr 2009) enters sysinstall as normal, but cannot connect to the ftp mirrors: leads to timeout. Possibly same issue for non-amd64 machines. A boot-only reinstall on same machines of 7.1 version works fine. Tried several mirrors from the ftp-list.
>How-To-Repeat:
Boot-only 64-bits 7.2 beta: sysinstall
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

From owner-freebsd-amd64@FreeBSD.ORG  Fri Apr 10 04:44:46 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CAEF9106566B;
	Fri, 10 Apr 2009 04:44:46 +0000 (UTC)
	(envelope-from tinderbox@freebsd.org)
Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [205.211.164.50])
	by mx1.freebsd.org (Postfix) with ESMTP id A22A68FC17;
	Fri, 10 Apr 2009 04:44:46 +0000 (UTC)
	(envelope-from tinderbox@freebsd.org)
Received: from smtp1.sentex.ca (smtp1.sentex.ca [199.212.134.4])
	by smarthost2.sentex.ca (8.14.3/8.14.3) with ESMTP id n3A4iiIP009614;
	Fri, 10 Apr 2009 00:44:44 -0400 (EDT)
	(envelope-from tinderbox@freebsd.org)
Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca
	[64.7.128.98])
	by smtp1.sentex.ca (8.14.3/8.14.3) with ESMTP id n3A4iimB014255;
	Fri, 10 Apr 2009 00:44:44 -0400 (EDT)
	(envelope-from tinderbox@freebsd.org)
Received: by freebsd-current.sentex.ca (Postfix, from userid 666)
	id 3BB177302F; Fri, 10 Apr 2009 00:44:44 -0400 (EDT)
Sender: FreeBSD Tinderbox <tinderbox@freebsd.org>
From: FreeBSD Tinderbox <tinderbox@freebsd.org>
To: FreeBSD Tinderbox <tinderbox@freebsd.org>, <current@freebsd.org>,
	<amd64@freebsd.org>
Precedence: bulk
Message-Id: <20090410044444.3BB177302F@freebsd-current.sentex.ca>
Date: Fri, 10 Apr 2009 00:44:44 -0400 (EDT)
X-Virus-Scanned: ClamAV 0.94.1/8983/Thu Feb 12 07:48:01 2009
	clamav-milter version 0.94.2 on clamscanner4
X-Virus-Status: Clean
X-Scanned-By: MIMEDefang 2.64 on 205.211.164.50
Cc: 
Subject: [head tinderbox] failure on amd64/amd64
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 04:44:47 -0000

TB --- 2009-04-10 02:40:00 - tinderbox 2.6 running on freebsd-current.sentex.ca
TB --- 2009-04-10 02:40:00 - starting HEAD tinderbox run for amd64/amd64
TB --- 2009-04-10 02:40:00 - cleaning the object tree
TB --- 2009-04-10 02:41:26 - cvsupping the source tree
TB --- 2009-04-10 02:41:26 - /usr/bin/csup -z -r 3 -g -L 1 -h localhost -s /tinderbox/HEAD/amd64/amd64/supfile
TB --- 2009-04-10 02:41:37 - building world
TB --- 2009-04-10 02:41:37 - MAKEOBJDIRPREFIX=/obj
TB --- 2009-04-10 02:41:37 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2009-04-10 02:41:37 - TARGET=amd64
TB --- 2009-04-10 02:41:37 - TARGET_ARCH=amd64
TB --- 2009-04-10 02:41:37 - TZ=UTC
TB --- 2009-04-10 02:41:37 - __MAKE_CONF=/dev/null
TB --- 2009-04-10 02:41:37 - cd /src
TB --- 2009-04-10 02:41:37 - /usr/bin/make -B buildworld
>>> World build started on Fri Apr 10 02:41:38 UTC 2009
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
>>> stage 5.1: building 32 bit shim libraries
>>> World build completed on Fri Apr 10 04:42:02 UTC 2009
TB --- 2009-04-10 04:42:02 - generating LINT kernel config
TB --- 2009-04-10 04:42:02 - cd /src/sys/amd64/conf
TB --- 2009-04-10 04:42:02 - /usr/bin/make -B LINT
TB --- 2009-04-10 04:42:03 - building LINT kernel
TB --- 2009-04-10 04:42:03 - MAKEOBJDIRPREFIX=/obj
TB --- 2009-04-10 04:42:03 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2009-04-10 04:42:03 - TARGET=amd64
TB --- 2009-04-10 04:42:03 - TARGET_ARCH=amd64
TB --- 2009-04-10 04:42:03 - TZ=UTC
TB --- 2009-04-10 04:42:03 - __MAKE_CONF=/dev/null
TB --- 2009-04-10 04:42:03 - cd /src
TB --- 2009-04-10 04:42:03 - /usr/bin/make -B buildkernel KERNCONF=LINT
>>> Kernel build for LINT started on Fri Apr 10 04:42:03 UTC 2009
>>> stage 1: configuring the kernel
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3.1: making dependencies
[...]
awk -f /src/sys/tools/makeobjops.awk /src/sys/kern/serdev_if.m -h
awk -f /src/sys/tools/makeobjops.awk /src/sys/libkern/iconv_converter_if.m -h
awk -f /src/sys/tools/makeobjops.awk /src/sys/opencrypto/cryptodev_if.m -h
awk -f /src/sys/tools/makeobjops.awk /src/sys/dev/acpica/acpi_if.m -h
rm -f .newdep
/usr/bin/make -V CFILES -V SYSTEM_CFILES -V GEN_CFILES |  MKDEP_CPP="cc -E" CC="cc" xargs mkdep -a -f .newdep -O2 -frename-registers -pipe -fno-strict-aliasing  -std=c99  -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I.  -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ipfilter -I/src/sys/contrib/pf -I/src/sys/dev/ath -I/src/sys/dev/ath/ath_hal -I/src/sys/contrib/ngatm -I/src/sys/dev/twa -I/src/sys/gnu/fs/xfs/FreeBSD -I/src/sys/gnu/fs/xfs/FreeBSD/support -I/src/sys/gnu/fs/xfs -I/src/sys/contrib/opensolaris/compat -I/src/sys/dev/cxgb -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone  -mfpmath=387 -mno-sse -mno-sse2 -mno-ss
 e3 -mno-mmx -mno-3dnow  -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector
cc: /src/sys/dev/ixgbe/ixgbe_82599.c: No such file or directory
mkdep: compile failed
*** Error code 1

Stop in /obj/amd64/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2009-04-10 04:44:43 - WARNING: /usr/bin/make returned exit code  1 
TB --- 2009-04-10 04:44:43 - ERROR: failed to build lint kernel
TB --- 2009-04-10 04:44:43 - 5710.57 user 604.70 system 7483.45 real


http://tinderbox.des.no/tinderbox-head-HEAD-amd64-amd64.full

From owner-freebsd-amd64@FreeBSD.ORG  Fri Apr 10 09:52:00 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C94D61065670
	for <freebsd-amd64@FreeBSD.org>; Fri, 10 Apr 2009 09:52:00 +0000 (UTC)
	(envelope-from christoph.mallon@gmx.de)
Received: from mail.gmx.net (mail.gmx.net [213.165.64.20])
	by mx1.freebsd.org (Postfix) with SMTP id 3D0C38FC1A
	for <freebsd-amd64@FreeBSD.org>; Fri, 10 Apr 2009 09:51:59 +0000 (UTC)
	(envelope-from christoph.mallon@gmx.de)
Received: (qmail invoked by alias); 10 Apr 2009 09:51:56 -0000
Received: from p54A3ED4A.dip.t-dialin.net (EHLO tron.homeunix.org)
	[84.163.237.74]
	by mail.gmx.net (mp071) with SMTP; 10 Apr 2009 11:51:56 +0200
X-Authenticated: #1673122
X-Provags-ID: V01U2FsdGVkX189BCxly0o90AoFmc6FoWlvRp1MeyPL/PrslHs75G
	Rw50icq00bJ6jl
Message-ID: <49DF16BA.8000807@gmx.de>
Date: Fri, 10 Apr 2009 11:51:54 +0200
From: Christoph Mallon <christoph.mallon@gmx.de>
User-Agent: Thunderbird 2.0.0.19 (X11/20090103)
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <49DD01CA.3040900@gmx.de> <20090410001919.S1567@besplex.bde.org>
In-Reply-To: <20090410001919.S1567@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Y-GMX-Trusted: 0
X-FuHaFi: 0.43
Cc: Ed Schouten <ed@FreeBSD.org>, freebsd-i386@FreeBSD.org,
	freebsd-amd64@FreeBSD.org
Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 09:52:01 -0000

Bruce Evans schrieb:
> On Wed, 8 Apr 2009, Christoph Mallon wrote:
>> Also the {in,out}{w,l}() get treated with this constraint. They had no 
>> special case before, so now better code is generated for them.
> 
> Actually, worse code is generated for them on most cases since the
> optimization of loading 32 bits for the usual case of a non-constant
> operand is lost in your version -- see below.  The case of a constant

If you are referring to the u_uint to u_short change, then you are
wrong, see below.

> port address was rare except for 8-bit isa ports "on the motherboard"
> and has become rarer with FreeBSD's runtime configuration becoming
> even more excessive (so instead of "inb $N,%al", the code is normally
> about 10 instructions for bus_space_read_1(sc->bst, sc->bsh, N1)).
>
>> Further, the unnecessary "cld" is removed from {in,out}s{b,w,l}(), 
>> because it is guaranteed by the ABI that the direction flag is 
>> cleared. All in all the code for in/out gets a bit simpler.
> 
> kib@ already pointed out that the direction flag is not guaranteed to be
> cleared in the kernel.  For amd64, this may be a bug in the kernel, but
> for i386 it is a bug in gcc say that the ABI specifies that the direction
> flag is cleared, since there is no standard i386 ABI so gcc must not
> assume that the direction flag is clear, like it has done until recently.

There *is* an ABI spec for i386 and it mandates that DF is cleared on 
function entry and exit. Also other compilers (e.g. LLVM and
ICC) do not have this misbehaviour of GCC < 4.3 to emit cld before every
string operation at all. I would not trust GCC < 4.3 to do this
correctly everywhere either.

> There are lots more similar cld's in bus.h.
> 
> % Index: sys/i386/include/cpufunc.h
> % ===================================================================
> % --- sys/i386/include/cpufunc.h    (Revision 190841)
> % +++ sys/i386/include/cpufunc.h    (Arbeitskopie)
> % @@ -170,177 +170,97 @@
> %      __asm __volatile("hlt");
> %  }
> % % -#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3
> % -
> % -#define    inb(port)        inbv(port)
> % -#define    outb(port, data)    outbv(port, data)
> % -
> % -#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */
> 
> The ifdefs are ugly, and this one seems to have been slightly wrong, but
> they provide documentation aboout the features used:
> - __GNUCLIKE_BUILTIN_CONSTANT_P: no longer needed
> - __GNUCLIKE_ASM >= 3: to ensure that the "i" constraint is available.
>   This used to be __GNUC__ >= 2.  If that was correct then >= 3 is
>   stricter than necessary; otherwise >= 2 was not as strict as it
>   should have been, since some versions of gcc-2 certainly supported
>   the "i" constraint.
> 
> A precise translation of the above would be __GNU_PREREQ__(maj, min) ||
> defined(__INTEL_COMPILER), where (maj, min) is the version of gcc that
> implemented the "N" constraint and you should check that the Intel
> compiler (all version?) support the "N" constraint.

This is unnecessary, even GCC 2.95 (10 years old, the oldest release you 
can get from official sources) supports this constraint 
(gcc-2.95/gcc/config/i386/i386.h, line 932). This shows how old and 
stale this hack is.

> % - * The unnecessary test `(port) < 0x10000' is to generate a warning if
> % - * the `port' has type u_short or smaller.  Such types are pessimal.
> % - * This actually only works for signed types.  The range check is
> % - * careful to avoid generating warnings.
> 
> This won't work any more, if it ever did.  I think it only worked for
> the constant case, but for the constant case the width doesn't really
> matter.

The use of u_short for the port enables the compiler to generate a 
warning for too large port numbers.

> % -static __inline u_char
> % -inbc(u_int port)
> % +static inline u_char
> % +inb(u_short port)
> 
> __inline should be globabally changed to inline someday, not starting here.
> Plain inline has been Standard for 10 years now, but it still isn't
> standard -- we're still waiting for gcc to implement C99, and haven't
> switched to gcc-4.3(?) or later which will implement C99 extern inline.

This very file already uses "inline" in other places.

> Changing the type changes the API and breaks an optimization.  Callers

It does not change the API, it corrects it: Passing a port number > 
0xFFFF never worked and the API now reflects this. So if now accidently 
a too large constant port number is passed, the compiler can warn. It's 
a static inline function, so there is no problem with the ABI. About 
optimization see below.

> are supposed to provide an arg of width 32 so that it can be loaded
> as efficiently as possible (instruction prefixes and/or sign extension
> waste code space and time, mainly on old CPUs).  The 0x10000 check
> referred to in the above comment attempts to enforce this.  Casting
> down in the API prevents the 32-bit loads.

When specifying 16 bit input the compiler is perfectly allowed to leave
arbitrary garbage in the upper 16 bits, e.g. do a 32bit load of the
port. Lying about this fact leads to worse code, example:

static inline unsigned char inb(unsigned short port)
/* ALT: static inline unsigned char inb(unsigned int port) */
{
   unsigned char data;
   asm volatile("inb %1, %0" : "=a" (data) : "d" (port));
/* ALT: asm volatile("inb %w1, %0" : "=a" (data) : "d" (port)); */
   return data;
}

unsigned short in2(unsigned short port)
{
   unsigned short res;
   res  = inb(port) << 8;
   res |= inb(++port);
   return res;
}

diff of the assember of the two versions:
--- in_short.s	2009-04-10 07:54:10.000000000 +0200
+++ in_int.s	2009-04-10 07:54:48.000000000 +0200
@@ -4,17 +4,22 @@
  .globl in2
  	.type	in2, @function
  in2:
-	movzwl	4(%esp), %edx
+	pushl	%ebx
+	movzwl	8(%esp), %ebx
+	movzwl	%bx, %eax
+	movl	%eax, %edx
  #APP
  	inb %dx, %al
  #NO_APP
  	movl	%eax, %ecx
-	addl	$1, %edx
+	leal	1(%ebx), %edx
  	sall	$8, %ecx
+	movzwl	%dx, %edx
  #APP
  	inb %dx, %al
  #NO_APP
  	movzbl	%al, %edx
+	popl	%ebx
  	orl	%edx, %ecx
  	movzwl	%cx, %eax
  	ret

Compilers are way better today than back in the GCC 1.x times. Lying to
them to trick them to show a certain behaviour usually is a bad idea and 
tends to backfire.

> %  {
> % -    u_char    data;
> % -
> % -    __asm __volatile("inb %1,%0" : "=a" (data) : "id" 
> ((u_short)(port)));
> 
> The port address was already cast down here in inbc().  This seems to
> have been just because I didn't know gnu asm very well when I wrote
> the above.  The cast has no effect in the usual case where the "i"
> constraint is used, but when the "d" constraint is used (only when
> compiling with -O0?) %1 would be %edx without the cast.  In inbv(), I
> hard-code %dx, but that does't work here "i" case.  The correct method
> seems to be to use %w1 in both places (keep u_int port everywhere and
> use the same method for the "N" constraint).

The correct way to handle this is discussed above.

> % -    return (data);
> % +    u_char data;
> % +    __asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port));
> % +    return data;
> %  }
> 
> Style breakages:
> - lost blank line after declarations.  This file followed the rule about
>   this except for previous breakage.  It even followed it in some cases
>   where the declarations are null.
> - lost parentheses around return value.  This file followed the rule about
>   this in all cases.

Anachronisms.

> Style not-quite-fix:
> - lost the tab after the type in the declaration.  In KNF, such a tab is
>   not used for auto declarations.  However, this file was fairly consistent
>   about breaking the rule.
> 
> Why change __volatile to volatile?  This is similar to changing __inline
> to inline, except plain volatile is more standard.  <sys/cdefs.h> has

You answered this by yourself: volatile is a standard word - there is no
reason to not use this.

> ifdef tangles which are supposed to allow users or cdefs.h itself to
> define away volatile so as to support old compilers, and we use __volatile
> instead of volatile a lot to support this.  This is probably not needed
> for kernel headers.  However, cpufunc.h is sometimes abused in userland.

"Old compilers" cannot compile the FreeBSD source anyway, so what's your 
point? This file can only be used with "new" (not older than a decade or 
so) and very GCC-compatible compilers.

> Fixed version (change nothing except the function name, %1, and the
> constraint):
> 
> %%%
> static __inline u_char
> inb(u_int port)
> {
>     u_char    data;
> 
>     __asm __volatile("inb %w1, %0" : "=a" (data) : "Nd" (port));
>     return (data);
> }
> %%%
> 
> Similarly for the other functions.
> 
> BTW, I sometimes missing having the inverse of %[bwl...] -- a way to
> generate an operand size letter from the type.  gcc-3 and gcc-4 have
> some horribly incomplete support for this.
> 
> % -#endif /* __GNUCLIKE_BUILTIN_CONSTANT_P  && __GNUCLIKE_ASM >= 3*/
> 
> Removing this accidentally fixes >= 2 style bugs in it.
> 
> % -static __inline u_char
> % -inbv(u_int port)
> % -{
> % -    u_char    data;
> % -    /*
> % -     * We use %%dx and not %1 here because i/o is done at %dx and not at
> % -     * %edx, while gcc generates inferior code (movw instead of movl)
> % -     * if we tell it to load (u_short) port.
> % -     */
> 
> Lost this documentation.  Now the magic in my version is in %w in my 
> version.
> The comment shouldn't blame gcc for generating movw -- that is the
> programmer's fault.

This is wrong, it's correct to specify a 16bit input. The compiler 
therefore is perfectly allowed to leave arbitrary garbage in the upper 
16 bits, e.g. do a 32bit load of the port and other things, see above. 
Also GCC does not generate movw here - for neither u_short nor u_int. 
GCC avoids partial register updates.

> gcc on modern machines now generates movswl for loading a u_short port
> to the u_int port specified by the old API, and movswl is fast on modern
> machines, so there is no penalty in using the old API/implementation
> on modern machines even if the caller neglected to start with a u_int
> port.

There is a penalty, see example above.

> % ...
> % -static __inline void
> % -insb(u_int port, void *addr, size_t cnt)
> % +static inline void
> % +insw(u_short port, void *addr, size_t cnt)
> %  {
> % -    __asm __volatile("cld; rep; insb"
> % -             : "+D" (addr), "+c" (cnt)
> % -             : "d" (port)
> % -             : "memory");
> % +    __asm volatile("rep; insw"
> % +        : "+D" (addr), "+c" (cnt)
> % +        : "d" (port)
> % +        : "memory");
> %  }
> 
> The diffs are hard to read, apparently due to diff getting confused
> by your API and style changes and comparing unrelated functions.
> Without the API changes, I think it would have matched insb with insb,
> etc.
> 
> % -static __inline void
> % -outbv(u_int port, u_char data)
> % +static inline void
> % +outb(u_short port, u_char data)
> %  {
> % -    u_char    al;
> % -    /*
> % -     * Use an unnecessary assignment to help gcc's register allocator.
> % -     * This make a large difference for gcc-1.40 and a tiny difference
> % -     * for gcc-2.6.0.  For gcc-1.40, al had to be ``asm("ax")'' for
> % -     * best results.  gcc-2.6.0 can't handle this.
> % -     */
> % -    al = data;
> 
> This can go of course.  I saw dummy assignments bogusly helping the
> allocator in more complicated asms for a version of atomic_cmpset()
> as late as gcc-3, but that seemed to be fixed by gcc-3.3.
> 
> % Index: sys/i386/i386/machdep.c
> % ===================================================================
> % --- sys/i386/i386/machdep.c    (Revision 190841)
> % +++ sys/i386/i386/machdep.c    (Arbeitskopie)
> % @@ -3555,45 +3555,24 @@
> %  #ifdef KDB
> % %  /*
> % - * Provide inb() and outb() as functions.  They are normally only
> % - * available as macros calling inlined functions, thus cannot be
> % - * called from the debugger.
> % - *
> % - * The actual code is stolen from <machine/cpufunc.h>, and de-inlined.
> % + * Provide inb() and outb() as functions.  They are normally only 
> available as
> % + * inline functions, thus cannot be called from the debugger.
> %   */
> 
> This should have been implemented by using cpufunc.h instead of repeating
> it, and for everything in cpufunc.h not just inb and outb, e.g., as is
> done for everything in atomic.h.  This gives a technical reason for
> using __inline -- we want to #undef it and shouldn't #undef a standard
> keyword.

You can neither #undef __inline nor inline, they both are keywords, not
macros. Actually it's the other way round: There is no way to get rid of 
__inline, it is always a keyword to GCC. But you can tell GCC to not 
consider inline as keyword (-std=c89).

> % % -#undef inb
> % -#undef outb
> % -
> %  /* silence compiler warnings */
> % -u_char inb(u_int);
> % -void outb(u_int, u_char);
> % +u_char inb_(u_short);
> % +void outb_(u_short, u_char);
> % %  u_char
> % -inb(u_int port)
> % +inb_(u_short port)
> %  {
> % -    u_char    data;
> % -    /*
> % -     * We use %%dx and not %1 here because i/o is done at %dx and not at
> % -     * %edx, while gcc generates inferior code (movw instead of movl)
> % -     * if we tell it to load (u_short) port.
> % -     */
> % -    __asm __volatile("inb %%dx,%0" : "=a" (data) : "d" (port));
> % -    return (data);
> % +    return inb(port);
> %  }
> % %  void
> % -outb(u_int port, u_char data)
> % +outb_(u_short port, u_char data)
> %  {
> % -    u_char    al;
> % -    /*
> % -     * Use an unnecessary assignment to help gcc's register allocator.
> % -     * This make a large difference for gcc-1.40 and a tiny difference
> % -     * for gcc-2.6.0.  For gcc-1.40, al had to be ``asm("ax")'' for
> % -     * best results.  gcc-2.6.0 can't handle this.
> % -     */
> % -    al = data;
> % -    __asm __volatile("outb %0,%%dx" : : "a" (al), "d" (port));
> % +    outb(port, data);
> %  }
> % %  #endif /* KDB */
> 
> This changes the (ddb undocumented) API and ABI to worse ones.  I
> already don't line having to type "call inb(foo)" (in my x86 debugger,
> inb is a command named "i", with certain space insensitivity that
> allows typing "ifoo").

The "worse" aspect is discussed above. I also changed the non-existent 
documentation of these functions, so the name change should be fine. (:

	Christoph

From owner-freebsd-amd64@FreeBSD.ORG  Fri Apr 10 06:13:36 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D8F46106566B
	for <freebsd-amd64@freebsd.org>; Fri, 10 Apr 2009 06:13:36 +0000 (UTC)
	(envelope-from christoph.mallon@gmx.de)
Received: from mail.gmx.net (mail.gmx.net [213.165.64.20])
	by mx1.freebsd.org (Postfix) with SMTP id 466C88FC15
	for <freebsd-amd64@freebsd.org>; Fri, 10 Apr 2009 06:13:36 +0000 (UTC)
	(envelope-from christoph.mallon@gmx.de)
Received: (qmail invoked by alias); 10 Apr 2009 06:13:34 -0000
Received: from p54A3ED4A.dip.t-dialin.net (EHLO tron.homeunix.org)
	[84.163.237.74]
	by mail.gmx.net (mp063) with SMTP; 10 Apr 2009 08:13:34 +0200
X-Authenticated: #1673122
X-Provags-ID: V01U2FsdGVkX18NGQQxRVMNrcP3UqW4WEZtCjugRHJfkzBVJwJaoi
	6r0tiW6YerY4ad
Message-ID: <49DEE38D.6050604@gmx.de>
Date: Fri, 10 Apr 2009 08:13:33 +0200
From: Christoph Mallon <christoph.mallon@gmx.de>
User-Agent: Thunderbird 2.0.0.19 (X11/20090103)
MIME-Version: 1.0
To: Kostik Belousov <kostikbel@gmail.com>
References: <49DD01CA.3040900@gmx.de>
	<20090408205831.GK3014@deviant.kiev.zoral.com.ua>
In-Reply-To: <20090408205831.GK3014@deviant.kiev.zoral.com.ua>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
X-Y-GMX-Trusted: 0
X-FuHaFi: 0.54
X-Mailman-Approved-At: Fri, 10 Apr 2009 11:39:39 +0000
Cc: Ed Schouten <ed@freebsd.org>, freebsd-i386@freebsd.org,
	freebsd-amd64@freebsd.org
Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 06:13:37 -0000

Kostik Belousov schrieb:
> On Wed, Apr 08, 2009 at 09:58:02PM +0200, Christoph Mallon wrote:
>> Hi amd64@ and i386@,
>>
>> attached is a patch which simplifies the in*() and out*() functions for 
>> I/O port access of AMD64 and i386. It removes an unnecessary distinction 
>> of cases for inb() and outb(), which was used to generate better code 
>> for ports < 256. This is unnecessary, because GCC supports an asm input 
>> constraint to handle this ("N"). The stale comment, which states there 
>> is no constraint for this, is removed, too. Also the {in,out}{w,l}() get 
>> treated with this constraint. They had no special case before, so now 
> 
>> better code is generated for them. Further, the unnecessary "cld" is 
>> removed from {in,out}s{b,w,l}(), because it is guaranteed by the ABI 
>> that the direction flag is cleared. All in all the code for in/out gets 
>> a bit simpler.
> The DF flag is guaranteed to be cleared for usermode only. Currently,
> kernel does not clear the flag on entry.
> 
> We did fixed signal handlers to always have DF cleared, but the kernel
> was explicitely left out because used version of gcc does the right
> thing with DF when needed.

This will break with GCC >= 4.3. Also other compilers (e.g. LLVM and 
ICC) do not generate cld before they insert string instructions. I would 
not trust GCC < 4.3 to place cld everywhere, either.
As a *temporary* workaround the cld in the {in,out}s*() functions could 
be left in, but this issue *must* be resolved correctly soon. Especially 
because LLVM is not generating the (according to the ABI spec) redundant 
clds and newer GCCs will not either. So no matter what the prospective 
compiler of FreeBSD will be, this issue must be tackled.

From owner-freebsd-amd64@FreeBSD.ORG  Fri Apr 10 12:54:42 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DC1771065670;
	Fri, 10 Apr 2009 12:54:42 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au
	[211.29.132.183])
	by mx1.freebsd.org (Postfix) with ESMTP id 3C4BB8FC27;
	Fri, 10 Apr 2009 12:54:42 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from besplex.bde.org (c122-107-120-227.carlnfd1.nsw.optusnet.com.au
	[122.107.120.227])
	by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	n3ACsbK8012516
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 10 Apr 2009 22:54:38 +1000
Date: Fri, 10 Apr 2009 22:54:37 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Christoph Mallon <christoph.mallon@gmx.de>
In-Reply-To: <49DF16BA.8000807@gmx.de>
Message-ID: <20090410200612.I2103@besplex.bde.org>
References: <49DD01CA.3040900@gmx.de> <20090410001919.S1567@besplex.bde.org>
	<49DF16BA.8000807@gmx.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Ed Schouten <ed@FreeBSD.org>, freebsd-amd64@FreeBSD.org,
	freebsd-i386@FreeBSD.org
Subject: Re: [PATCH] Simplify in*() and out*() functions of AMD64 and i386
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 12:54:43 -0000

On Fri, 10 Apr 2009, Christoph Mallon wrote:

> Bruce Evans schrieb:
>> On Wed, 8 Apr 2009, Christoph Mallon wrote:
>>> Also the {in,out}{w,l}() get treated with this constraint. They had no 
>>> special case before, so now better code is generated for them.
>> 
>> Actually, worse code is generated for them on most cases since the
>> optimization of loading 32 bits for the usual case of a non-constant
>> operand is lost in your version -- see below.  The case of a constant
>
> If you are referring to the u_uint to u_short change, then you are
> wrong, see below.

No, you are wrong.  See below in previous and this mail.

>> kib@ already pointed out that the direction flag is not guaranteed to be
>> cleared in the kernel.  For amd64, this may be a bug in the kernel, but
>> for i386 it is a bug in gcc say that the ABI specifies that the direction
>> flag is cleared, since there is no standard i386 ABI so gcc must not
>> assume that the direction flag is clear, like it has done until recently.
>
> There *is* an ABI spec for i386 and it mandates that DF is cleared on 
> function entry and exit. Also other compilers (e.g. LLVM and
> ICC) do not have this misbehaviour of GCC < 4.3 to emit cld before every
> string operation at all. I would not trust GCC < 4.3 to do this
> correctly everywhere either.

Anyone can make up a spec but no one has to follow it.  FreeBSD has always
followed the old de-facto spec that requires any code that uses string
operations to set the direction flag for itself.  Trying to change this
after about 1990 is especially stupid since string instructions should
almost never be used due to their becoming relatively even slower without
even counting their large setup overhead.  Their setup overhead includes
the cld but should be large enough to do the cld in parallel so that it
is almost free.

>> % % -#if !defined(__GNUCLIKE_BUILTIN_CONSTANT_P) || __GNUCLIKE_ASM < 3
>> % -
>> % -#define    inb(port)        inbv(port)
>> % -#define    outb(port, data)    outbv(port, data)
>> % -
>> % -#else /* __GNUCLIKE_BUILTIN_CONSTANT_P && __GNUCLIKE_ASM >= 3 */
>> 
>> The ifdefs are ugly, and this one seems to have been slightly wrong, but
>> they provide documentation aboout the features used:
>> - __GNUCLIKE_BUILTIN_CONSTANT_P: no longer needed
>> - __GNUCLIKE_ASM >= 3: to ensure that the "i" constraint is available.
>>   This used to be __GNUC__ >= 2.  If that was correct then >= 3 is
>>   stricter than necessary; otherwise >= 2 was not as strict as it
>>   should have been, since some versions of gcc-2 certainly supported
>>   the "i" constraint.
>> 
>> A precise translation of the above would be __GNU_PREREQ__(maj, min) ||
>> defined(__INTEL_COMPILER), where (maj, min) is the version of gcc that
>> implemented the "N" constraint and you should check that the Intel
>> compiler (all version?) support the "N" constraint.
>
> This is unnecessary, even GCC 2.95 (10 years old, the oldest release you can 
> get from official sources) supports this constraint 
> (gcc-2.95/gcc/config/i386/i386.h, line 932). This shows how old and stale 
> this hack is.

I just want it pointed out (in the commit log) and in a separate commit
that you are intentionally weakening the ifdefs since old compilers are
not supported even to the extent of not generating syntax errors for
them and most other places (e.g., atomic.h) already assume a non-old
compiler and generate the syntax errors for old ones.

>> % - * The unnecessary test `(port) < 0x10000' is to generate a warning if
>> % - * the `port' has type u_short or smaller.  Such types are pessimal.
>> % - * This actually only works for signed types.  The range check is
>> % - * careful to avoid generating warnings.
>> 
>> This won't work any more, if it ever did.  I think it only worked for
>> the constant case, but for the constant case the width doesn't really
>> matter.
>
> The use of u_short for the port enables the compiler to generate a warning 
> for too large port numbers.

Only if it warns for truncation on assignment, which most compilers won't
do.

>> % -static __inline u_char
>> % -inbc(u_int port)
>> % +static inline u_char
>> % +inb(u_short port)
>> 
>> __inline should be globabally changed to inline someday, not starting here.
>> Plain inline has been Standard for 10 years now, but it still isn't
>> standard -- we're still waiting for gcc to implement C99, and haven't
>> switched to gcc-4.3(?) or later which will implement C99 extern inline.
>
> This very file already uses "inline" in other places.

Only in recent style breakage in 2 places when I wasn't looking
(18-Apr-08 for cpu_monitor() and cpu_wait()).  Other style breakage
for these functions:
- insertion sort errors.  This file is supposed to be sorted on function
   name
- namespace errors.  Functions in this file should have the same
   name as the instruction that they provided access too.  monitor()
   is a bit too generic for this and wait() is much too generic for this
   (however, the instruction name is mwait() which is less generic than
   monitor()), so some renaming is required.  The renaming should not
   be into the cpu_* namespace since the cpu_ prefix has a different,
   precise meaning.  These bugs are missing for the older interface to
   the "pause" instruction:
   - pause() is too generic, so the cpufunc is named ia32_pause() on i386,
     etc. (but I prefer mdpause()).
   - pause() was already in use, so even the MI interface couldn't be named
     pause().  It is named cpu_spinwait().
   - cpu_spinwait() is implemented as a macro that invokes ia32_pause().
- bogus semicolon at the end of the main asm string
- missing space after ": :" (2 instances)
- missing space after just 1 of the 4 constraint strings.
- missing declarations in the non-__GNUCLIKE_ASM &&
   non-__CC_SUPPORTS___INLINE case.

Hmm, your change from __inline to inline also bogotifies the
__CC_SUPPORTS___INLINE ifdef.  This ifdef is careful to a fault, but
not wrong provided __inline is used consistently.

>> Changing the type changes the API and breaks an optimization.  Callers
>
> It does not change the API, it corrects it: Passing a port number > 0xFFFF 
> never worked and the API now reflects this. So if now accidently a too large 
> constant port number is passed, the compiler can warn. It's a static inline 
> function, so there is no problem with the ABI. About optimization see below.

No, it breaks it.  I define the API so I know what it is!  The caller should
pass a u_int port so that optimal code is generated (at a tiny cost to
data space for storing 32-bit port numbers in softcs).  The hardware only
uses the low 16 bits so the upper 16 bits don't matter.  If the caller
starts with a u_short port, then the API generates sub-optimal code to
promote to a u_int.  The promotion logically occurs as part of the function
call protocol, but since the function is inline it normally occurs more
directly (e.g., as "movswl port,%edx").

>> are supposed to provide an arg of width 32 so that it can be loaded
>> as efficiently as possible (instruction prefixes and/or sign extension
>> waste code space and time, mainly on old CPUs).  The 0x10000 check
>> referred to in the above comment attempts to enforce this.  Casting
>> down in the API prevents the 32-bit loads.
>
> When specifying 16 bit input the compiler is perfectly allowed to leave
> arbitrary garbage in the upper 16 bits, e.g. do a 32bit load of the
> port. Lying about this fact leads to worse code, example:

It is only permitted to do that if it knows that the the upper 16 bits
are in an object, or if they are in a register.  While it could know that
bits in an object (say for port = sc->sc_port where sc has stuff above
sc_port so that the bits are in the object (*sc)), I've never seen gcc
do this optimization.

I didn't lie to the compiler, but told it to load all 32 bits of %edx.
Then I ignore the top 16 bits in the asm.

> static inline unsigned char inb(unsigned short port)
> /* ALT: static inline unsigned char inb(unsigned int port) */
> {
>  unsigned char data;
>  asm volatile("inb %1, %0" : "=a" (data) : "d" (port));
> /* ALT: asm volatile("inb %w1, %0" : "=a" (data) : "d" (port)); */
>  return data;
> }
>
> unsigned short in2(unsigned short port)
> {
>  unsigned short res;
>  res  = inb(port) << 8;
>  res |= inb(++port);
>  return res;
> }

Sure, this gives suboptimal code for ALT since `port' starts in a u_short.
Try starting with a matching ALT for in2: in2(unsigned port).  The types
must match at all levels including in2()'s caller to avoid promoting and
demoting back and forth.  Only some of the conversions can be free.

++port involves various conversions if port is not u_int throughout.
The compiler can usually optimize these (by keeping port in a register
and ignoring top bits).

> diff of the assember of the two versions:
> --- in_short.s	2009-04-10 07:54:10.000000000 +0200
> +++ in_int.s	2009-04-10 07:54:48.000000000 +0200
> @@ -4,17 +4,22 @@
> .globl in2
> 	.type	in2, @function
> in2:
> -	movzwl	4(%esp), %edx

With my ALT in2 version:

 	movl	4(%esp), %edx.

This is 3-4 cycles faster than on i386 and i486.  On modern CPUs, there
is probably no difference, but movzwl wastes a byte or two of instruction
space so it might cost in some cases.

> +	pushl	%ebx
> +	movzwl	8(%esp), %ebx
> +	movzwl	%bx, %eax
> +	movl	%eax, %edx

This is unnecessarily stupid code.  Why not identical code to the
u_int case?  Leaving out the inb(++port) gives identical code for the
first inb().

> #APP
> 	inb %dx, %al
> #NO_APP
> 	movl	%eax, %ecx
> -	addl	$1, %edx
> +	leal	1(%ebx), %edx

The compiler is being stupid above so that it can be stupid here too
(leal is slightly worse than addl).  It should just use %edx and not take
a trip through %ebx and %eax, and have to save %ebx.

> 	sall	$8, %ecx
> +	movzwl	%dx, %edx

This is needed for ++port since it has to demote the intermediate u_int
in ++port back to u_short.

> #APP
> 	inb %dx, %al
> #NO_APP
> 	movzbl	%al, %edx
> +	popl	%ebx
> 	orl	%edx, %ecx
> 	movzwl	%cx, %eax
> 	ret

With u_ints throughout (ALT for inb() and in2()), there are no conversions
and the generated code is optimal (movl instead of movzwl and no other
differences).  With the only conversion being a demotion in inb() (non-ALT
for inb() and ALT for in2()), the generated code is again optimal
(movl instead of movzwl).  gcc is optimizing away the conversion in this
case.  With port replaced by portarray[0] where portarray[1] exists,
gcc no longer does the optimization.

This is with gcc-4.2 with the default (usually wrong) arch.  gcc-3.3
and gcc-4.2 -march=i386 give the original pessimization of using movw
instead of movzwl to load a 16-bit object into a 16 or 32-bit register.
When only 16 bits are needed gcc-old just uses movw.  When a promotion
is needed in the ALT in2(), it uses movw to %bx where the above uses
movzwl to %ebx.  Then it must promote %bx and it uses movzwl to %eax
for this, then it moves to %edx.  This is sub-optimal but not as silly
as the above -- the above has already done the promotion into %ebx and
then it does it again into %eax.

> Compilers are way better today than back in the GCC 1.x times. Lying to
> them to trick them to show a certain behaviour usually is a bad idea and 
> tends to backfire.

Apparently gcc is still stupid when there are just a 4 not-quite null
promotions and demotions: for

 	u_short port;

 	use((u_int)port);			/* ALT inb(port); */
 	port = (u_short)((u_int)port + 1);	/* ++port; */
 	use((u_int)port);			/* ALT inb(port); */

>> %  {
>> % -    u_char    data;
>> % -
>> % -    __asm __volatile("inb %1,%0" : "=a" (data) : "id" 
>> ((u_short)(port)));
>> 
>> The port address was already cast down here in inbc().  This seems to
>> have been just because I didn't know gnu asm very well when I wrote
>> the above.  The cast has no effect in the usual case where the "i"
>> constraint is used, but when the "d" constraint is used (only when
>> compiling with -O0?) %1 would be %edx without the cast.  In inbv(), I
>> hard-code %dx, but that does't work here "i" case.  The correct method
>> seems to be to use %w1 in both places (keep u_int port everywhere and
>> use the same method for the "N" constraint).
>
> The correct way to handle this is discussed above.

No, this gives movzwl instead of movl in some cases and movw instead of
movl in some other cases.

>> % -    return (data);
>> % +    u_char data;
>> % +    __asm volatile("inb %1, %0" : "=a" (data) : "Nd" (port));
>> % +    return data;
>> %  }
>> 
>> Style breakages:
>> - lost blank line after declarations.  This file followed the rule about
>>   this except for previous breakage.  It even followed it in some cases
>>   where the declarations are null.
>> - lost parentheses around return value.  This file followed the rule about
>>   this in all cases.
>
> Anachronisms.

No, standard in FreeBSD.

>> Style not-quite-fix:
>> - lost the tab after the type in the declaration.  In KNF, such a tab is
>>   not used for auto declarations.  However, this file was fairly consistent
>>   about breaking the rule.
>> 
>> Why change __volatile to volatile?  This is similar to changing __inline
>> to inline, except plain volatile is more standard.  <sys/cdefs.h> has
>
> You answered this by yourself: volatile is a standard word - there is no
> reason to not use this.

It is for stylistic reasons.  Readers will wonder why most code uses
__volatile and yours doesn't.

>> ifdef tangles which are supposed to allow users or cdefs.h itself to
>> define away volatile so as to support old compilers, and we use __volatile
>> instead of volatile a lot to support this.  This is probably not needed
>> for kernel headers.  However, cpufunc.h is sometimes abused in userland.
>
> "Old compilers" cannot compile the FreeBSD source anyway, so what's your 
> point? This file can only be used with "new" (not older than a decade or so) 
> and very GCC-compatible compilers.

The sources should be as independent of compiler versions as possible.
cpufunc.h was independent (it is supposed contain only prototypes only
for compilers that can't handle its asms), so why break it?

>> % -static __inline u_char
>> % -inbv(u_int port)
>> % -{
>> % -    u_char    data;
>> % -    /*
>> % -     * We use %%dx and not %1 here because i/o is done at %dx and not at
>> % -     * %edx, while gcc generates inferior code (movw instead of movl)
>> % -     * if we tell it to load (u_short) port.
>> % -     */
>> 
>> Lost this documentation.  Now the magic in my version is in %w in my 
>> version.
>> The comment shouldn't blame gcc for generating movw -- that is the
>> programmer's fault.
>
> This is wrong, it's correct to specify a 16bit input. The compiler therefore 
> is perfectly allowed to leave arbitrary garbage in the upper 16 bits, e.g. do 
> a 32bit load of the port and other things, see above. Also GCC does not 
> generate movw here - for neither u_short nor u_int. GCC avoids partial 
> register updates.

No, this is correct.  Please allow me to find bugs in my own comment :-).
The comment would need to be too long, like this mail, to give all the
details and I apparently broke it trying to make it shorter.

It is correct but sub-optimal to specify a 16-bit input.

Whether gcc generates a movw is now very machine-dependent.  When the
code and comment was written, gcc always generated movw for memory to
register 16-bit loads.  Now it prefers to generate movzwl since partial
register updates are bad and movzwl is now fast, but it still generates
movw for old CPUs.  My optimization was not to avoid partial register
updates, but just to avoid operand size prefixes for loading u_shorts
and extra instructions for conversions in expressions like port+1 and
use(foo) (where use() takes a u_int)).  Your examples show that compilers
now avoid the prefixes or conversions in some but not all of the cases.

>> % Index: sys/i386/i386/machdep.c
>> % ===================================================================
>> % --- sys/i386/i386/machdep.c    (Revision 190841)
>> % +++ sys/i386/i386/machdep.c    (Arbeitskopie)
>> % @@ -3555,45 +3555,24 @@
>> %  #ifdef KDB
>> % %  /*
>> % - * Provide inb() and outb() as functions.  They are normally only
>> % - * available as macros calling inlined functions, thus cannot be
>> % - * called from the debugger.
>> % - *
>> % - * The actual code is stolen from <machine/cpufunc.h>, and de-inlined.
>> % + * Provide inb() and outb() as functions.  They are normally only 
>> available as
>> % + * inline functions, thus cannot be called from the debugger.
>> %   */
>> 
>> This should have been implemented by using cpufunc.h instead of repeating
>> it, and for everything in cpufunc.h not just inb and outb, e.g., as is
>> done for everything in atomic.h.  This gives a technical reason for
>> using __inline -- we want to #undef it and shouldn't #undef a standard
>> keyword.
>
> You can neither #undef __inline nor inline, they both are keywords, not
> macros. Actually it's the other way round: There is no way to get rid of 
> __inline, it is always a keyword to GCC. But you can tell GCC to not consider 
> inline as keyword (-std=c89).

Oops, I should have written "#define away" instead of #undef.  I think it
is legal in C to #define any identifer.  It just might things by giving
inconsistencies.  <sys/cdefs.h> already defines away __inline in some
cases (mostly for old compilers) and defines it to plain inline for the
C++ case even for new gcc.

The u_short optimizations are mostly moot for inb() etc.  The hardware
part of inb() now normally takes a few thousand times longer than the
whole 1 cycle potentially saved by these optimizations.  When inb() was
written, the hardware part took less than one hundred times longer.  I
defend the optimization mainly because losing it is a regression and
I still hate APIs that use chars and shorts (or [u]intN_t's) to give
pessimizations by maximizing conversions.

Bruce

From owner-freebsd-amd64@FreeBSD.ORG  Fri Apr 10 21:40:35 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7D581106564A;
	Fri, 10 Apr 2009 21:40:35 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 51D668FC27;
	Fri, 10 Apr 2009 21:40:35 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3ALeZDr069972;
	Fri, 10 Apr 2009 21:40:35 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3ALeZDg069968;
	Fri, 10 Apr 2009 21:40:35 GMT (envelope-from linimon)
Date: Fri, 10 Apr 2009 21:40:35 GMT
Message-Id: <200904102140.n3ALeZDg069968@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-amd64@FreeBSD.org, freebsd-bugs@FreeBSD.org
From: linimon@FreeBSD.org
X-Mailman-Approved-At: Fri, 10 Apr 2009 21:54:05 +0000
Cc: 
Subject: Re: misc/133540: Cannot connect to ftp mirrors for 7.2 beta
	boot-only
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 21:40:36 -0000

Synopsis: Cannot connect to ftp mirrors for 7.2 beta boot-only

Responsible-Changed-From-To: freebsd-amd64->freebsd-bugs
Responsible-Changed-By: linimon
Responsible-Changed-When: Fri Apr 10 21:39:51 UTC 2009
Responsible-Changed-Why: 
probably not amd64-specific.

http://www.freebsd.org/cgi/query-pr.cgi?pr=133540

From owner-freebsd-amd64@FreeBSD.ORG  Sat Apr 11 02:40:02 2009
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 295A9106568A
	for <freebsd-amd64@hub.freebsd.org>;
	Sat, 11 Apr 2009 02:40:02 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 554358FC1D
	for <freebsd-amd64@hub.freebsd.org>;
	Sat, 11 Apr 2009 02:40:01 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3B2e1Ya071357
	for <freebsd-amd64@freefall.freebsd.org>; Sat, 11 Apr 2009 02:40:01 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3B2e1Bi071356;
	Sat, 11 Apr 2009 02:40:01 GMT (envelope-from gnats)
Resent-Date: Sat, 11 Apr 2009 02:40:01 GMT
Resent-Message-Id: <200904110240.n3B2e1Bi071356@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-amd64@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org,
	Jason Harmening <jason.harmening@gmail.com>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3B9B61065670
	for <freebsd-gnats-submit@FreeBSD.org>;
	Sat, 11 Apr 2009 02:37:18 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 29ED98FC0A
	for <freebsd-gnats-submit@FreeBSD.org>;
	Sat, 11 Apr 2009 02:37:18 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n3B2bHkd015742
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 11 Apr 2009 02:37:17 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n3B2bHvE015726;
	Sat, 11 Apr 2009 02:37:17 GMT (envelope-from nobody)
Message-Id: <200904110237.n3B2bHvE015726@www.freebsd.org>
Date: Sat, 11 Apr 2009 02:37:17 GMT
From: Jason Harmening <jason.harmening@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
X-Mailman-Approved-At: Sat, 11 Apr 2009 03:02:59 +0000
Cc: 
Subject: amd64/133592: busdma incorrectly calculates bounce buffer
	requirements for userspace buffers
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 11 Apr 2009 02:40:02 -0000


>Number:         133592
>Category:       amd64
>Synopsis:       busdma incorrectly calculates bounce buffer requirements for userspace buffers
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-amd64
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Apr 11 02:40:00 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Jason Harmening
>Release:        7.2-PRERELEASE
>Organization:
>Environment:
FreeBSD CORONA 7.2-PRERELEASE FreeBSD 7.2-PRERELEASE #2: Sat Mar 21 00:30:34 CDT 2009     jason@CORONA:/usr/obj/usr/src/sys/CUSTOM  amd64

>Description:
The _bus_dmamap_load_buffer function in sys/amd64/amd64/busdma_machdep.c takes a pmap_t param indicating the address space of the buffer (NULL => KVA space).  When calculating the number of bounce buffers to reserve, it always calls pmap_kextract() to get the physical address, when it should call pmap_extract() if pmap != NULL.

The problem exists in both 7-STABLE and 8-CURRENT--the attached patch is against 7-STABLE.
>How-To-Repeat:

>Fix:


Patch attached with submission follows:

--- busdma_machdep.c.bkp	2009-04-10 21:27:51.000000000 -0500
+++ busdma_machdep.c	2009-04-10 21:30:58.000000000 -0500
@@ -602,7 +602,10 @@
 		vendaddr = (vm_offset_t)buf + buflen;
 
 		while (vaddr < vendaddr) {
-			paddr = pmap_kextract(vaddr);
+			if (pmap)
+				paddr = pmap_extract(pmap, vaddr);
+			else
+				paddr = pmap_kextract(vaddr);
 			if (run_filter(dmat, paddr) != 0)
 				map->pagesneeded++;
 			vaddr += PAGE_SIZE;


>Release-Note:
>Audit-Trail:
>Unformatted: