Date: Wed, 17 Jan 2007 01:55:09 +0100
From: "Attilio Rao"
Sender: asmrookie@gmail.com
To: "Maxim Sobolev"
Cc: freebsd-current@freebsd.org, Ivan Voras, freebsd-arch@freebsd.org
Subject: Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
Message-ID: <3bbf2fe10701161655p5e686b52n7340b3100ecfab93@mail.gmail.com>
In-Reply-To: <45AD6DFA.6030808@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

2007/1/17, Maxim Sobolev:
> Attilio Rao wrote:
> > 2007/1/17, Ivan Voras:
> >> Kip Macy wrote:
> >> > On 1/16/07, Ivan Voras wrote:
> >> >> But it does seem to hurt the performance a bit - maybe it's time to
> >> >> add another CPU option like I586_CPU and I686_CPU?
> >> >
> >> > Unless there is a compelling reason not to do so, I think that that
> >> > would be a good idea.
> >>
> >> Maybe even someone finds a way to get optimized versions of memcpy in
> >> the kernel :)
> >>
> >> I was thinking: AFAIK the only major stopper is context saving of the
> >> various "auxiliary" registers - FPU, MMX, SSE, right? But is it an
> >> all-or-nothing situation? I.e. does it make sense (can it be done?) to
> >> just elect to save the MMX context? (AFAIK they are different registers
> >> than SSE, but overlay FPU registers?) The idea is to save something
> >> smaller than the full set.
> >
> > When I implemented fpu copy into the kernel I had a lot of thinking
> > about this and I think it is possible at least with some restrictions.
> > For example, for an xmm copy you would just save 8 registers content
> > but you have to ensure no pending FPU exceptions will break your
> > kernel and so you should preserve a clean copy of FPU state or treat
> > the corner cases you can get.
> > For xmm, after some very productive discussions with bde@, we arrived
> > at the conclusion that it should be pretty safe to just have a 16-byte
> > aligned buffer for saving the registers (this way you can use 8 movdqa
> > to save them), but I never got around to playing with it.
> > (My implementation should deal with the problem of pinning the
> > scheduler too, in order to avoid a wrong reading of per-cpu data).
>
> I might be wrong, but I think the DragonFly has solved this issue (i.e.
> optimized memcpy in the kernel) somehow quite some time ago.

DragonFly saves the whole context (xmm + mmx + fpu state). That is too
heavy a mechanism for us at the moment (and for them too, I suspect).
They don't need to deal with pinning either, BTW.

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
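
For readers following the thread, the "16-byte aligned buffer plus 8
movdqa" idea quoted above can be sketched roughly as below. This is only a
user-space illustration under stated assumptions, not the kernel patch
discussed here: the names xmm_save_area and xmm_copy_128 are hypothetical,
the asm path assumes GCC or Clang on x86 with SSE2 (other targets fall back
to memcpy), and it deliberately ignores the hard parts raised in the
thread, namely pending FPU exceptions and pinning to the current CPU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical save area: 8 XMM registers x 16 bytes each. It must be
 * 16-byte aligned because movdqa faults on unaligned addresses. */
struct xmm_save_area {
    _Alignas(16) unsigned char regs[8 * 16];
};

#if defined(__GNUC__) && defined(__SSE2__) && \
    (defined(__x86_64__) || defined(__i386__))
#define XMM_SKETCH_USE_ASM 1
#else
#define XMM_SKETCH_USE_ASM 0   /* assumption: plain memcpy elsewhere */
#endif

/* Copy exactly 128 bytes from src to dst through xmm0-xmm7, preserving
 * the previous contents of those registers in *save. */
void xmm_copy_128(void *dst, const void *src, struct xmm_save_area *save)
{
    assert(((uintptr_t)save->regs & 15u) == 0);
#if XMM_SKETCH_USE_ASM
    __asm__ volatile(
        /* 1. save the current xmm0-xmm7 (aligned stores) */
        "movdqa %%xmm0,   0(%0)\n\t"
        "movdqa %%xmm1,  16(%0)\n\t"
        "movdqa %%xmm2,  32(%0)\n\t"
        "movdqa %%xmm3,  48(%0)\n\t"
        "movdqa %%xmm4,  64(%0)\n\t"
        "movdqa %%xmm5,  80(%0)\n\t"
        "movdqa %%xmm6,  96(%0)\n\t"
        "movdqa %%xmm7, 112(%0)\n\t"
        /* 2. load 128 bytes from src (unaligned loads are allowed) */
        "movdqu   0(%1), %%xmm0\n\t"
        "movdqu  16(%1), %%xmm1\n\t"
        "movdqu  32(%1), %%xmm2\n\t"
        "movdqu  48(%1), %%xmm3\n\t"
        "movdqu  64(%1), %%xmm4\n\t"
        "movdqu  80(%1), %%xmm5\n\t"
        "movdqu  96(%1), %%xmm6\n\t"
        "movdqu 112(%1), %%xmm7\n\t"
        /* 3. store them to dst */
        "movdqu %%xmm0,   0(%2)\n\t"
        "movdqu %%xmm1,  16(%2)\n\t"
        "movdqu %%xmm2,  32(%2)\n\t"
        "movdqu %%xmm3,  48(%2)\n\t"
        "movdqu %%xmm4,  64(%2)\n\t"
        "movdqu %%xmm5,  80(%2)\n\t"
        "movdqu %%xmm6,  96(%2)\n\t"
        "movdqu %%xmm7, 112(%2)\n\t"
        /* 4. restore the saved register contents (aligned loads) */
        "movdqa   0(%0), %%xmm0\n\t"
        "movdqa  16(%0), %%xmm1\n\t"
        "movdqa  32(%0), %%xmm2\n\t"
        "movdqa  48(%0), %%xmm3\n\t"
        "movdqa  64(%0), %%xmm4\n\t"
        "movdqa  80(%0), %%xmm5\n\t"
        "movdqa  96(%0), %%xmm6\n\t"
        "movdqa 112(%0), %%xmm7\n\t"
        : /* no outputs; registers are restored before the block ends */
        : "r"(save->regs), "r"(src), "r"(dst)
        : "memory");
#else
    memcpy(dst, src, 128);
#endif
}
```

Saving only these 128 bytes is much cheaper than saving the whole
xmm + mmx + fpu context, which is the trade-off discussed in this message.
A caller would declare a struct xmm_save_area (on the stack or per-thread)
and pass it along with the source and destination of each 128-byte chunk.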