From owner-freebsd-hackers@freebsd.org Wed Nov 9 06:45:48 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 986DCC3708F for ; Wed, 9 Nov 2016 06:45:48 +0000 (UTC) (envelope-from sebastian.huber@embedded-brains.de) Received: from dedi548.your-server.de (dedi548.your-server.de [85.10.215.148]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 56B62CBD for ; Wed, 9 Nov 2016 06:45:48 +0000 (UTC) (envelope-from sebastian.huber@embedded-brains.de) Received: from [88.198.220.132] (helo=sslproxy03.your-server.de) by dedi548.your-server.de with esmtpsa (TLSv1.2:DHE-RSA-AES256-GCM-SHA384:256) (Exim 4.85_2) (envelope-from ) id 1c4Mdl-00009B-9n; Wed, 09 Nov 2016 07:45:45 +0100 Received: from [82.135.62.35] (helo=mail.embedded-brains.de) by sslproxy03.your-server.de with esmtpsa (TLSv1.2:DHE-RSA-AES256-GCM-SHA384:256) (Exim 4.84_2) (envelope-from ) id 1c4Mdk-00049w-Jx; Wed, 09 Nov 2016 07:45:44 +0100 Received: from localhost (localhost.localhost [127.0.0.1]) by mail.embedded-brains.de (Postfix) with ESMTP id 913082A0003; Wed, 9 Nov 2016 07:45:54 +0100 (CET) Received: from mail.embedded-brains.de ([127.0.0.1]) by localhost (zimbra.eb.localhost [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id m06-M-Ry5Hax; Wed, 9 Nov 2016 07:45:52 +0100 (CET) Received: from localhost (localhost.localhost [127.0.0.1]) by mail.embedded-brains.de (Postfix) with ESMTP id 307D12A1808; Wed, 9 Nov 2016 07:45:52 +0100 (CET) X-Virus-Scanned: amavisd-new at zimbra.eb.localhost Received: from mail.embedded-brains.de ([127.0.0.1]) by localhost (zimbra.eb.localhost [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id kDVT8wttQDvO; Wed, 9 Nov 2016 07:45:52 +0100 (CET) Received: from [192.168.96.129] (unknown [192.168.96.129]) by mail.embedded-brains.de (Postfix) with ESMTPSA id 1154C2A0003; Wed, 9 Nov 2016 07:45:52 +0100 (CET) Subject: Re: Should page allocator zero the pages for UMA? To: Konstantin Belousov References: <5821DC2E.9020302@embedded-brains.de> <20161108145528.GN54029@kib.kiev.ua> Cc: FreeBSD From: Sebastian Huber Message-ID: <5822C615.3060100@embedded-brains.de> Date: Wed, 9 Nov 2016 07:45:41 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0 MIME-Version: 1.0 In-Reply-To: <20161108145528.GN54029@kib.kiev.ua> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: quoted-printable X-Authenticated-Sender: smtp-embedded@poldinet.de X-Virus-Scanned: Clear (ClamAV 0.99.2/22502/Wed Nov 9 04:11:10 2016) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Nov 2016 06:45:48 -0000 On 08/11/16 15:55, Konstantin Belousov wrote: > On Tue, Nov 08, 2016 at 03:07:42PM +0100, Sebastian Huber wrote: >> Hello, >> >> we use the FreeBSD network, USB and SD/MMC card stacks for the real-ti= me >> operating system RTEMS: >> >> https://git.rtems.org/rtems-libbsd >> >> I update currently from FreeBSD 9.3 to head. We use the UMA from FreeB= SD >> with a custom page allocator: >> >> https://git.rtems.org/rtems-libbsd/tree/rtemsbsd/rtems/rtems-kernel-pa= ge.c >> >> The FreeBSD 9.3 based port worked well with uninitialized pages, e.g. >> random or previous content. However, after the update to head I had to >> zero initialize the pages. One issue was an incomplete >> >> struct inpcb { >> [...] >> struct inpcbport *inp_phd; /* (i/h) head of this list */ >> #define inp_zero_size offsetof(struct inpcb, inp_gencnt) >> inp_gen_t inp_gencnt; /* (c) generation count */ >> struct llentry *inp_lle; /* cached L2 information */ >> struct rwlock inp_lock; >> rt_gen_t inp_rt_cookie; /* generation for route entry */ >> union { /* cached L3 information */ >> struct route inpu_route; >> struct route_in6 inpu_route6; >> } inp_rtu; >> #define inp_route inp_rtu.inpu_route >> #define inp_route6 inp_rtu.inpu_route6 >> }; >> >> initialization. The initialization consists of two parts: >> >> static int >> udp_inpcb_init(void *mem, int size, int flags) >> { >> struct inpcb *inp; >> >> inp =3D mem; >> INP_LOCK_INIT(inp, "inp", "udpinp"); >> return (0); >> } >> >> /* >> * Allocate a PCB and associate it with the socket. >> * On success return with the PCB locked. >> */ >> int >> in_pcballoc(struct socket *so, struct inpcbinfo *pcbinfo) >> { >> struct inpcb *inp; >> int error; >> >> #ifdef INVARIANTS >> if (pcbinfo =3D=3D &V_tcbinfo) { >> INP_INFO_RLOCK_ASSERT(pcbinfo); >> } else { >> INP_INFO_WLOCK_ASSERT(pcbinfo); >> } >> #endif >> >> error =3D 0; >> inp =3D uma_zalloc(pcbinfo->ipi_zone, M_NOWAIT); >> if (inp =3D=3D NULL) >> return (ENOBUFS); >> bzero(inp, inp_zero_size); >> inp->inp_pcbinfo =3D pcbinfo; >> inp->inp_socket =3D so; >> inp->inp_cred =3D crhold(so->so_cred); >> inp->inp_inc.inc_fibnum =3D so->so_fibnum; >> [...] >> >> This lets at least inp_route uninitialized leading to a crash during >> destruction, e.g. >> >> if (inp->inp_route.ro_rt) { >> RTFREE(inp->inp_route.ro_rt); >> inp->inp_route.ro_rt =3D (struct rtentry *)NULL; >> } >> >> uses uninitialized data. >> >> Did something in the page allocator change between FreeBSD 9.3 and >> trunk, so that page are now zero initialized or is this a bug in >> udp_inpcb_init()? > Nothing guarantees that returned items are zeroed out automatically. > Pages are not returned zeroed by VM page allocator, it might indicate > that the allocated page happens to be zeroed by PG_ZERO flag. Unless > M_ZERO is passed to uma_zalloc(), item is not zeroed. > > I am not sure, you did not demonstrated zone initialization, but it see= ms > that your zone is having zone init function. If this is true, than I d= o not > see why do you consider it possible to have items automatically zeroed = at > all. Uma calls the init function at the bulk allocation time, and cons= tructor > for each item returned by uma_zalloc(). No implicit zeroing occurs the= re. Now, I am a bit confused. We have /* * Allocate a new slab for a keg. This does not insert the slab onto a=20 list. * * Arguments: * wait Shall we wait? * * Returns: * The slab that was allocated or NULL if there is no memory and the * caller specified M_NOWAIT. */ static uma_slab_t keg_alloc_slab(uma_keg_t keg, uma_zone_t zone, int wait) { [...] /* * This reproduces the old vm_zone behavior of zero filling pages th= e * first time they are added to a zone. * * Malloced items are zeroed in uma_zalloc. */ if ((keg->uk_flags & UMA_ZONE_MALLOC) =3D=3D 0) wait |=3D M_ZERO; else wait &=3D ~M_ZERO; [...] /* zone is passed for legacy reasons. */ mem =3D allocf(zone, keg->uk_ppera * PAGE_SIZE, &flags, wait); Firstly, the name "wait" and the comment are confusing, since "wait" is=20 really "flags" which may have M_ZERO set. So, for non UMA_ZONE_MALLOC=20 zones (which should be most zones) the page is supposed to be zeroed? We ignored the flags in our FreeBSD 9.3 based port and this worked quite=20 well. However, on FreeBSD head I changed the page allocator now to do: #ifdef INVARIANTS wait |=3D M_ZERO; #endif if (addr !=3D NULL && (wait & M_ZERO) !=3D 0) { memset(addr, 0, size_in_bytes); } With this change, everything works like before. > > If your zone does not have init/fini functions, nor ctors/dtors, then > HEAD and stable/11 actually actively trash items to catch reliance on > the uninitialized memory being zero. Look for trash_ctor in uma_core.c= . The zone has an init/fini function, see udp_init() /* * For now default to 2-tuple UDP hashing - until the fragment * reassembly code can also update the flowid. * * Once we can calculate the flowid that way and re-establish * a 4-tuple, flip this to 4-tuple. */ in_pcbinfo_init(&V_udbinfo, "udp", &V_udb, UDBHASHSIZE, UDBHASHSIZE, "udp_inpcb", udp_inpcb_init, NULL, 0, IPI_HASHFIELDS_2TUPLE); --=20 Sebastian Huber, embedded brains GmbH Address : Dornierstr. 4, D-82178 Puchheim, Germany Phone : +49 89 189 47 41-16 Fax : +49 89 189 47 41-09 E-Mail : sebastian.huber@embedded-brains.de PGP : Public key available on request. Diese Nachricht ist keine gesch=E4ftliche Mitteilung im Sinne des EHUG.