From owner-svn-src-head@FreeBSD.ORG Tue Nov 4 16:20:09 2014 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CEF1D58E; Tue, 4 Nov 2014 16:20:09 +0000 (UTC) Received: from mail110.syd.optusnet.com.au (mail110.syd.optusnet.com.au [211.29.132.97]) by mx1.freebsd.org (Postfix) with ESMTP id 59D4D9AD; Tue, 4 Nov 2014 16:20:08 +0000 (UTC) Received: from c122-106-147-133.carlnfd1.nsw.optusnet.com.au (c122-106-147-133.carlnfd1.nsw.optusnet.com.au [122.106.147.133]) by mail110.syd.optusnet.com.au (Postfix) with ESMTPS id 1856A7811EC; Wed, 5 Nov 2014 03:20:00 +1100 (AEDT) Date: Wed, 5 Nov 2014 03:19:58 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Mateusz Guzik Subject: Re: svn commit: r274088 - head/sys/kern In-Reply-To: <20141104114041.GA21297@dft-labs.eu> Message-ID: <20141105023323.O1105@besplex.bde.org> References: <201411041129.sA4BTnwX030600@svn.freebsd.org> <20141104114041.GA21297@dft-labs.eu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=fvDlOjIf c=1 sm=1 tr=0 a=7NqvjVvQucbO2RlWB8PEog==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=6I5d2MoRAAAA:8 a=OG-MzttRyq2ILO3qSbwA:9 a=CjuIK1q_8ugA:10 Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Hans Petter Selasky X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Nov 2014 16:20:10 -0000 On Tue, 4 Nov 2014, Mateusz Guzik wrote: > On Tue, Nov 04, 2014 at 11:29:49AM +0000, Hans Petter Selasky wrote: >> Author: hselasky >> Date: Tue Nov 4 11:29:49 2014 >> New Revision: 274088 >> URL: https://svnweb.freebsd.org/changeset/base/274088 >> >> Log: >> Simplify logic a bit. Ensure data buffer is properly aligned, >> especially for platforms where unaligned access is not allowed. Make >> it possible to override the small buffer size. >> >> A simple continuous read string test using libusb showed a reduction >> in CPU usage from roughly 10% to less than 1% using a dual-core GHz >> CPU, when the malloc() operation was skipped for small buffers. >> >> MFC after: 2 weeks >> >> Modified: >> head/sys/kern/sys_generic.c >> >> Modified: head/sys/kern/sys_generic.c >> ============================================================================== >> --- head/sys/kern/sys_generic.c Tue Nov 4 10:25:52 2014 (r274087) >> +++ head/sys/kern/sys_generic.c Tue Nov 4 11:29:49 2014 (r274088) >> @@ -646,10 +646,13 @@ struct ioctl_args { >> int >> sys_ioctl(struct thread *td, struct ioctl_args *uap) >> { >> +#ifndef SYS_IOCTL_SMALL_SIZE >> +#define SYS_IOCTL_SMALL_SIZE 128 >> +#endif This shouldn't be ifdefed. It doesn't even have the full complexity for an option (it isn't in conf/options). >> + u_char smalldata[SYS_IOCTL_SMALL_SIZE] __aligned(8); > > Should not you align to word size instead? Neither. The alignment should be the same as given by malloc(9). The precise value is undocumented. It seems to be determined by malloc() using uma_zcreate() with align = UMA_ALIGN_PTR. This is bogus since UMA_ALIGN_PTR only guarantees enough alignment for pointers but malloc() is required to support all object types. The definition UMA_ALIGN_PTR is also bogus, since it assumes that all pointers have the same alignment requirements and that these are given by the size of the special pointer type 'void *'. But this works, because all machines are vaxes. It gives the register size on all machines. Another unsuitable alignment is by the MD ALIGNBYTES macro. This is (sizeof(register_t) - 1) on all arches except mips 32-bit where it is 7 sparc64 where it is 15. On arm, it is 3, but arm also has STACKALIGNBYTES = 7. The register size is really too small to use for malloc() on 32-bit arches. Its technical correctness depends on no C objects having more than 32-bit alignment. On i386, int64_t and double should have 64-bit alignment, but this is not required unless CFLAGS includes -malign-double which breaks the ABI in userland and is irrelevant for the kernel. > Also maybe it would be beneficial to derive the limit from KSTACK_PAGES > (something silly like KSTACK_PAGES * 32 or something), which I believe would > address earlier complaints. No, that would reward bloat in KSTACK_PAGES. KSTACK_PAGES depends on this, not the other way round. >> @@ -682,10 +685,10 @@ sys_ioctl(struct thread *td, struct ioct >> data = (void *)&arg; >> size = 0; >> } else { >> - if (size <= sizeof(smalldata)) >> - data = smalldata; >> - else >> + if (size > SYS_IOCTL_SMALL_SIZE) >> data = malloc((u_long)size, M_IOCTLOPS, M_WAITOK); >> + else >> + data = smalldata; >> } >> } else >> data = (void *)&uap->data; The previous order was better. The malloc() line still has 2 style bugs. Bruce