From: Gérard Roudier
Date: Sat, 22 Jul 2000 08:04:24 +0200 (CEST)
To: Mike Smith
Cc: freebsd-alpha@FreeBSD.ORG
Subject: Re: fxp0 hangs on a PC164 using STABLE
In-Reply-To: <200007220002.RAA01857@mass.osd.bsdi.com>

On Fri, 21 Jul 2000, Mike Smith wrote:

> > On Thu, 20 Jul 2000, Mike Smith wrote:
> >
> > > > It is my opinion. You may disagree but it will be hard for anybody to
> > > > convince me that I am wrong. ;-)
> > >
> > > On x86, it's very hard for you to be right; the CPU specification and bus
> > > bridge behaviour both guarantee retirement of writes in order of issuance.
> > > This combined with strong cache coherency makes barriers irrelevant on
> > > this platform.
> >
> > Let a PCI device perform:
> >         STORE A
> >         STORE B
> >
> > Let the CPU perform and expect:
> >         LOAD B
> >         LOAD A
> >
> > Let some CPU speculative execution carry out to the system BUS:
> >         LOAD A
> >         LOAD B
> >
> > My reading of the Intel docs didn't convince me that such reordering
> > is not possible.
> >
> > Typically A is some indicator of an IO completion pushed to a completion
> > queue and B is the associated status data.
>
> You've got those the wrong way around; A will be the status, and B the
> indicator (since we have to assume the peripheral is correctly designed).

Yes. It was a typo on my part, obviously.

Just FYI, the SYM and the AIC7XXX should not get the LOADs reordered, in
my opinion, since B (the item extracted from the completion queue) is used
as an index (or tag) to retrieve the IO data structure and then the status
data (A). An operand of the second LOAD depending on the first one ensures
that no reordering will occur. Anyway, I have put the MB in the SYM for
safety against any other weirdness I may have missed. :-)

This is different in the NCR, which uses the host_status field of the nccb
as completion flag, but tests against the xerr_status flag of the nccb for
extended errors. Since these two pieces of information are not written
together atomically, the NCR falls into the above model, on paper. It is
just that the two LOADs are probably not close enough in the instruction
stream for bad reordering to ever happen.

> I don't believe that the x86 re-orders read operations (but I don't have
> the P6 architecture manuals here to be certain). If it does, then to
> remain compatible with older x86 processors, it would have to invalidate
> the pipeline and re-fetch when the bus snoop code detected the STORE B.
>
> > > As far as other platforms are concerned, however, you're quite correct.
> >
> > Are you still so sure? ;-)
>
> Yes. x86 has so much serialised legacy baggage that it's a special case.

You shouldn't, IMO, for Pi, i >= II, and some clones. :)

> Everyone else needs help. 8)

I probably fall into this category too. :)

> > > There does need to be an extension to the busspace API to define a range
> > > of host memory with a tag/handle pair for barrier activity.
> >
> > Hmmm...
> > Barrier semantics vary so much between architectures that a
> > unified semantic that also addresses device drivers' concerns (not only
> > CPU<->CPU) is either close to impossible or will just be extremely poor,
> > in my opinion.
>
> I'm open to edification here, but I think that the ability to define the
> sort of barrier operation required for a memory region and to then
> invoke said barrier is about the best we can hope for.

This looks to me like the best possible complex and poor semantic. :)

> > The drivers I maintain will always contain any stuff needed for them to be
> > as correct as I want them to be, modulo my knowledge and competence on
> > addressed platforms obviously.
>
> I don't think anyone could ask for any more than that.

Thanks for the reply.

Regards,
  Gérard.
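
For illustration, the indicator/status pattern discussed in this thread
(the device stores the status A, then the indicator B; the host loads B,
then A) can be sketched in portable C11. This is only a CPU-side analogue
under stated assumptions: the names device_complete, host_poll, ccb_table
and completion_slot are hypothetical and are not taken from the SYM, NCR
or AIC7XXX drivers, and the C11 fence stands in for whatever platform
barrier (such as the Alpha MB instruction) a real driver reading
DMA-written memory would need.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define QUEUE_EMPTY 0u  /* hypothetical "no completion posted" marker */
#define MAX_TAGS    8u  /* hypothetical number of outstanding I/Os */

/* Hypothetical per-I/O structure holding the status data (A). */
struct io_ccb {
    uint32_t status;
};

static struct io_ccb ccb_table[MAX_TAGS];

/* Hypothetical one-entry completion queue holding the indicator (B),
 * encoded as tag + 1 so that 0 can mean "empty". */
static _Atomic uint32_t completion_slot = QUEUE_EMPTY;

/* Stand-in for the device: STORE A (status), then STORE B (indicator).
 * The release store keeps the status write ordered before the indicator. */
void device_complete(uint32_t tag, uint32_t status)
{
    assert(tag < MAX_TAGS);
    ccb_table[tag].status = status;
    atomic_store_explicit(&completion_slot, tag + 1, memory_order_release);
}

/* Host side: LOAD B, barrier, LOAD A.
 * Returns 1 and fills the outputs if a completion was consumed, else 0. */
int host_poll(uint32_t *tag_out, uint32_t *status_out)
{
    uint32_t slot = atomic_load_explicit(&completion_slot,
                                         memory_order_relaxed);
    if (slot == QUEUE_EMPTY)
        return 0;

    /* Plays the role of the MB: the status load below must not be
     * satisfied before the indicator load above. */
    atomic_thread_fence(memory_order_acquire);

    *tag_out = slot - 1;
    *status_out = ccb_table[slot - 1].status;

    /* Mark the slot free again for the next completion. */
    atomic_store_explicit(&completion_slot, QUEUE_EMPTY,
                          memory_order_relaxed);
    return 1;
}
```

Without the fence, nothing in the C11 model forbids the status load from
being performed ahead of the indicator load, which is the LOAD A / LOAD B
reordering described above.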