Date:      Sun, 2 Mar 2003 01:13:07 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc:        "M. Warner Losh" <imp@bsdimp.com>, <current@FreeBSD.ORG>
Subject:   Re: Any ideas why we can't even boot a i386 ? 
Message-ID:  <20030302001614.H26391-100000@gamplex.bde.org>
In-Reply-To: <56631.1046466626@critter.freebsd.dk>

On Fri, 28 Feb 2003, Poul-Henning Kamp wrote:

> My main concern would be if the chips have the necessary "umphf"
> to actually do a real-world job once they're done running all the
> overhead of 5.0-R.  The lack of cmpxchg8 makes the locking horribly
> expensive.

Actually, the lack of cmpxchg8 only makes locking more expensive.  It's
hard to say how long cmpxchg8 would take on i386's if i386's had it,
but it involves memory accesses which i386's are especially poor at,
so I guess it would take about 2/3 as long as the main extra instructions
that we use in the CPU_I386 case (pushfl: 4 cycles; cli: 3 cycles;
popfl: 5 cycles).
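
For readers who haven't looked at the CPU_I386 path, here is a rough C sketch of what a compare-and-set has to do when the CPU has no compare-and-exchange instruction: save the flags, disable interrupts, do a plain compare and store, then restore the flags. The names sketch_intr_save_disable(), sketch_intr_restore() and sketch_cmpset_i386() are hypothetical, and the interrupt flag is only modelled by a variable so that this compiles and runs in userland. It is not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the CPU interrupt-enable flag (userland only). */
static int intr_enabled = 1;

/* Stand-in for "pushfl; cli": remember the old state, then disable. */
static int sketch_intr_save_disable(void)
{
    int saved = intr_enabled;   /* pushfl */
    intr_enabled = 0;           /* cli */
    return saved;
}

/* Stand-in for "popfl": restore whatever state was saved. */
static void sketch_intr_restore(int saved)
{
    intr_enabled = saved;
}

/*
 * Compare-and-set without a compare-and-exchange instruction.
 * This is only atomic on a uniprocessor, where nothing else can run
 * while interrupts are off.  Returns 1 if the store happened.
 */
static int sketch_cmpset_i386(volatile uintptr_t *dst, uintptr_t expect,
    uintptr_t src)
{
    assert(dst != NULL);
    int saved = sketch_intr_save_disable();
    int ok = (*dst == expect);
    if (ok)
        *dst = src;
    sketch_intr_restore(saved);
    return ok;
}
```

The three bracketing steps are the "extra instructions" counted above; the compare and store in the middle are ordinary integer instructions.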

Actual testing on an Athlon1600XP in userland for the core of
mtx_lock_*() + mtx_unlock_*(), namely
atomic_cmpset_acq_ptr() + atomic_cmpset_rel_ptr(),
run in a loop (cycle counts include loop overhead):
    10 cycles in the !CPU_I386 case
    42 cycles in the CPU_I386 case
    36 cycles in the CPU_I386 case with cli removed
    12 cycles in the CPU_I386 case with cli removed and popfl changed to
       "addl $4,%%esp"
     9 cycles in the CPU_I386 case with pushfl, cli and popfl removed
So the i386 code is almost the same speed on AthlonXP's in user mode
except for the expensive cli and popfl instructions.  However, these
instructions aren't so relatively expensive on plain i386's; i386's
are just generally slow and their privileged instructions aren't much
slower than their integer instructions.
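
A userland loop of this kind can be approximated with C11 atomics. The sketch below is an assumption about the shape of such a test, not the program used for the numbers above: sketch_lock_unlock() and sketch_bench() are made-up names, the compare-exchange calls only loosely stand in for atomic_cmpset_acq_ptr() and atomic_cmpset_rel_ptr(), and it reports CPU time in nanoseconds via clock() rather than cycle counts.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/*
 * One uncontended acquire/release pair: set the lock word from 0 to
 * owner with acquire ordering, then from owner back to 0 with release
 * ordering.  Returns 1 if both compare-exchanges succeeded.
 */
static int sketch_lock_unlock(atomic_uintptr_t *lock, uintptr_t owner)
{
    uintptr_t expect = 0;
    if (!atomic_compare_exchange_strong_explicit(lock, &expect, owner,
        memory_order_acquire, memory_order_relaxed))
        return 0;
    expect = owner;
    return atomic_compare_exchange_strong_explicit(lock, &expect,
        (uintptr_t)0, memory_order_release, memory_order_relaxed);
}

/*
 * Run the pair iters times and return the average CPU time per
 * iteration in nanoseconds (loop overhead included), or -1.0 if any
 * pair failed.
 */
static double sketch_bench(long iters)
{
    assert(iters > 0);
    atomic_uintptr_t lock = 0;
    clock_t t0 = clock();
    for (long i = 0; i < iters; i++)
        if (!sketch_lock_unlock(&lock, 1))
            return -1.0;
    clock_t t1 = clock();
    double ns = (double)(t1 - t0) * 1e9 / CLOCKS_PER_SEC;
    return ns / (double)iters;
}
```

Calling sketch_bench() with a large iteration count (say, a hundred million) and multiplying by the clock rate gives a rough per-pair cycle figure. The CPU_I386 variants can't be timed this way without inline asm, and cli needs I/O privilege to run at all in user mode.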

The relative slowdown for the full mtx_*lock*() functions would be
smaller since these functions do more.  mtx_unlock_spin() uses
atomic_store_rel_ptr(), so it doesn't go near cmpxchg8 or cli, and
the acquire/release times for that case are almost half of those
above.
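
The difference between the two unlock styles can be sketched with C11 atomics. This is an illustration under assumptions, not the kernel's code: sketch_spin_trylock() and sketch_spin_unlock() are hypothetical names, and the release store only loosely corresponds to atomic_store_rel_ptr().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Acquire still needs a compare-and-set: 0 -> owner, acquire ordering. */
static int sketch_spin_trylock(atomic_uintptr_t *lock, uintptr_t owner)
{
    uintptr_t expect = 0;
    assert(owner != 0);
    return atomic_compare_exchange_strong_explicit(lock, &expect, owner,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Release is a plain store with release ordering.  The holder already
 * owns the lock, so there is nothing to compare against and no
 * read-modify-write is needed.
 */
static void sketch_spin_unlock(atomic_uintptr_t *lock)
{
    atomic_store_explicit(lock, (uintptr_t)0, memory_order_release);
}
```

Only the acquire side pays for the compare-and-set here, which is why the pair comes out cheaper.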

Bruce

