From: Attilio Rao
To: Steven Atreju
Cc: net@freebsd.org, Luigi Rizzo, "K. Macy", current@freebsd.org
Date: Thu, 3 May 2012 11:49:38 +0100
In-Reply-To: <20120503102844.GU633@sherwood.local>
References: <20120502182557.GA93838@onelab2.iet.unipi.it> <20120502215249.GT633@sherwood.local> <20120503102844.GU633@sherwood.local>
Subject: Re: fast bcopy...
List-Id: Discussions about the use of FreeBSD-current

2012/5/3, Steven Atreju:
> K. Macy wrote [2012-05-03 02:58+0200]:
>> It's highly chipset and processor dependent what works best.
>
> Yes, of course.
> Though i was kinda, even shocked, once i've seen this first:
>
>   http://marc.info/?l=dragonfly-commits&m=132241713812022&w=2
>
> So we don't use our assembler version for new gccs and HAMMER or
> SSE3+ (the decision for these was rather arbitrarily, except they
> were yet existent for an instant implementation).
>
>> Intel now has non-temporal loads and stores which work much
>> better in some cases but provide little benefit in others.
>
> Yes, our 2002 tests have shown that these were *extremely*
> dependent upon alignment.  (Note: 2002. o-)
> Hmm, it doesn't really matter, but i guess this is a good time to
> thank the FreeBSD hackers for that FPU stack FILD/FISTP idea!
> I'll append the copy related notes of our doc/memperf.txt.
>

Thanks. Years ago I implemented FPU unwinding and MMX copy
(reimplementing bcopy, memcpy, etc.) to see whether they really made
a difference.
What really mattered with the hardware available at that time
(Pentium 4) was alignment and the use of non-temporal operations on
heavily contended cache lines.
In a few words, it is more important that we engineer the "buffer"
layout than the functions themselves.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
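
To illustrate the point about engineering the "buffer" layout, here is
a minimal sketch in C11. It is not code from this thread or from any
FreeBSD source: the names `struct slot`, `slots_alloc`, `slot_copy`
and `is_line_aligned` are hypothetical, and the 64-byte cache-line
size is an assumption that real code should query from the CPU. The
idea is only that each buffer starts on a cache-line boundary and
occupies whole lines, so a plain memcpy() into it never straddles a
line shared with a neighbouring buffer.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE 64   /* ASSUMPTION: 64-byte lines; not from the thread */

/* Hypothetical buffer slot: aligned and sized to whole cache lines, so
 * two neighbouring slots never share a line. */
struct slot {
    alignas(CACHE_LINE) unsigned char data[2 * CACHE_LINE];
};

/* Returns nonzero if p sits on a cache-line boundary. */
static int is_line_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE) == 0;
}

/* Allocate n slots starting on a cache-line boundary.  C11 aligned_alloc()
 * wants a size that is a multiple of the alignment; sizeof(struct slot)
 * already is one.  Release the result with free(). */
static struct slot *slots_alloc(size_t n)
{
    return aligned_alloc(CACHE_LINE, n * sizeof(struct slot));
}

/* Copy len bytes into a slot with plain memcpy(); the destination is
 * always line-aligned.  Returns 0 on success, -1 if len does not fit. */
static int slot_copy(struct slot *dst, const void *src, size_t len)
{
    if (len > sizeof(dst->data))
        return -1;
    memcpy(dst->data, src, len);
    return 0;
}
```

With such a layout the copy routine itself can stay generic. Whether
non-temporal stores help on top of it would still have to be measured
on the target hardware.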