From owner-freebsd-hackers@freebsd.org Sat Nov 12 15:58:52 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 48FE4C3D8EA for ; Sat, 12 Nov 2016 15:58:52 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C2D981C96 for ; Sat, 12 Nov 2016 15:58:51 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id uACFwV1j059146 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 12 Nov 2016 17:58:32 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua uACFwV1j059146 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id uACFwU66059145; Sat, 12 Nov 2016 17:58:30 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 12 Nov 2016 17:58:30 +0200 From: Konstantin Belousov To: Sebastian Huber Cc: FreeBSD Subject: Re: Should page allocator zero the pages for UMA? Message-ID: <20161112155830.GC54029@kib.kiev.ua> References: <5821DC2E.9020302@embedded-brains.de> <20161108145528.GN54029@kib.kiev.ua> <5822C615.3060100@embedded-brains.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5822C615.3060100@embedded-brains.de> User-Agent: Mutt/1.7.1 (2016-10-04) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Nov 2016 15:58:52 -0000 On Wed, Nov 09, 2016 at 07:45:41AM +0100, Sebastian Huber wrote: > On 08/11/16 15:55, Konstantin Belousov wrote: > > On Tue, Nov 08, 2016 at 03:07:42PM +0100, Sebastian Huber wrote: > >> Hello, > >> > >> we use the FreeBSD network, USB and SD/MMC card stacks for the real-time > >> operating system RTEMS: > >> > >> https://git.rtems.org/rtems-libbsd > >> > >> I update currently from FreeBSD 9.3 to head. We use the UMA from FreeBSD > >> with a custom page allocator: > >> > >> https://git.rtems.org/rtems-libbsd/tree/rtemsbsd/rtems/rtems-kernel-page.c > >> > >> The FreeBSD 9.3 based port worked well with uninitialized pages, e.g. > >> random or previous content. However, after the update to head I had to > >> zero initialize the pages. One issue was an incomplete > >> > >> struct inpcb { > >> [...] > >> struct inpcbport *inp_phd; /* (i/h) head of this list */ > >> #define inp_zero_size offsetof(struct inpcb, inp_gencnt) > >> inp_gen_t inp_gencnt; /* (c) generation count */ > >> struct llentry *inp_lle; /* cached L2 information */ > >> struct rwlock inp_lock; > >> rt_gen_t inp_rt_cookie; /* generation for route entry */ > >> union { /* cached L3 information */ > >> struct route inpu_route; > >> struct route_in6 inpu_route6; > >> } inp_rtu; > >> #define inp_route inp_rtu.inpu_route > >> #define inp_route6 inp_rtu.inpu_route6 > >> }; > >> > >> initialization. The initialization consists of two parts: > >> > >> static int > >> udp_inpcb_init(void *mem, int size, int flags) > >> { > >> struct inpcb *inp; > >> > >> inp = mem; > >> INP_LOCK_INIT(inp, "inp", "udpinp"); > >> return (0); > >> } > >> > >> /* > >> * Allocate a PCB and associate it with the socket. > >> * On success return with the PCB locked. > >> */ > >> int > >> in_pcballoc(struct socket *so, struct inpcbinfo *pcbinfo) > >> { > >> struct inpcb *inp; > >> int error; > >> > >> #ifdef INVARIANTS > >> if (pcbinfo == &V_tcbinfo) { > >> INP_INFO_RLOCK_ASSERT(pcbinfo); > >> } else { > >> INP_INFO_WLOCK_ASSERT(pcbinfo); > >> } > >> #endif > >> > >> error = 0; > >> inp = uma_zalloc(pcbinfo->ipi_zone, M_NOWAIT); > >> if (inp == NULL) > >> return (ENOBUFS); > >> bzero(inp, inp_zero_size); > >> inp->inp_pcbinfo = pcbinfo; > >> inp->inp_socket = so; > >> inp->inp_cred = crhold(so->so_cred); > >> inp->inp_inc.inc_fibnum = so->so_fibnum; > >> [...] > >> > >> This lets at least inp_route uninitialized leading to a crash during > >> destruction, e.g. > >> > >> if (inp->inp_route.ro_rt) { > >> RTFREE(inp->inp_route.ro_rt); > >> inp->inp_route.ro_rt = (struct rtentry *)NULL; > >> } > >> > >> uses uninitialized data. > >> > >> Did something in the page allocator change between FreeBSD 9.3 and > >> trunk, so that page are now zero initialized or is this a bug in > >> udp_inpcb_init()? > > Nothing guarantees that returned items are zeroed out automatically. > > Pages are not returned zeroed by VM page allocator, it might indicate > > that the allocated page happens to be zeroed by PG_ZERO flag. Unless > > M_ZERO is passed to uma_zalloc(), item is not zeroed. > > > > I am not sure, you did not demonstrated zone initialization, but it seems > > that your zone is having zone init function. If this is true, than I do not > > see why do you consider it possible to have items automatically zeroed at > > all. Uma calls the init function at the bulk allocation time, and constructor > > for each item returned by uma_zalloc(). No implicit zeroing occurs there. > > Now, I am a bit confused. We have > > /* > * Allocate a new slab for a keg. This does not insert the slab onto a > list. > * > * Arguments: > * wait Shall we wait? > * > * Returns: > * The slab that was allocated or NULL if there is no memory and the > * caller specified M_NOWAIT. > */ > static uma_slab_t > keg_alloc_slab(uma_keg_t keg, uma_zone_t zone, int wait) > { > [...] > /* > * This reproduces the old vm_zone behavior of zero filling pages the > * first time they are added to a zone. > * > * Malloced items are zeroed in uma_zalloc. > */ > > if ((keg->uk_flags & UMA_ZONE_MALLOC) == 0) > wait |= M_ZERO; > else > wait &= ~M_ZERO; > [...] > /* zone is passed for legacy reasons. */ > mem = allocf(zone, keg->uk_ppera * PAGE_SIZE, &flags, wait); > > Firstly, the name "wait" and the comment are confusing, since "wait" is > really "flags" which may have M_ZERO set. So, for non UMA_ZONE_MALLOC > zones (which should be most zones) the page is supposed to be zeroed? > > We ignored the flags in our FreeBSD 9.3 based port and this worked quite > well. However, on FreeBSD head I changed the page allocator now to do: > > #ifdef INVARIANTS > wait |= M_ZERO; > #endif > > if (addr != NULL && (wait & M_ZERO) != 0) { > memset(addr, 0, size_in_bytes); > } > > With this change, everything works like before. Is this in your page allocator ? If yes, then this is how things are suppossed to work. If uma requested M_ZERO page, the page must be zeroed. This is very different from 'always zeroing', which you asked about before. > > > > > If your zone does not have init/fini functions, nor ctors/dtors, then > > HEAD and stable/11 actually actively trash items to catch reliance on > > the uninitialized memory being zero. Look for trash_ctor in uma_core.c. > > The zone has an init/fini function, see udp_init() > > /* > * For now default to 2-tuple UDP hashing - until the fragment > * reassembly code can also update the flowid. > * > * Once we can calculate the flowid that way and re-establish > * a 4-tuple, flip this to 4-tuple. > */ > in_pcbinfo_init(&V_udbinfo, "udp", &V_udb, UDBHASHSIZE, UDBHASHSIZE, > "udp_inpcb", udp_inpcb_init, NULL, 0, > IPI_HASHFIELDS_2TUPLE); > If you use init/ctr which do not set the allocated object to the pre-defined values, and do not specify M_ZERO flag, the content of the bytes which were not touched by init/ctr is not defined. It is not guaranteed to be zeroed.