From: Maxim Sobolev
Date: Sat, 25 Jun 2016 09:16:01 -0700
Subject: Re: PostgreSQL performance on FreeBSD
To: Sean Chittenden
Cc: Konstantin Belousov, Adrian Chadd, performance@freebsd.org,
    John Baldwin, Alan Somers, Alan Cox, Alan Cox, freebsd-current,
    "current@freebsd.org"
References: <20140627125613.GT93733@kib.kiev.ua>
    <1603235.2ShtoCfSqO@ralph.baldwin.cx>
    <20160622100241.GM38613@kib.kiev.ua>
List-Id: Performance/tuning

Sean, for the issue that you are describing, it might also be possible to do
it some other way around. One approach, perhaps more portable, is to share a
connected socketpair between the two communicating processes, so that you can
do a non-blocking read on one of its ends from time to time and check whether
it returns EOF. That would be the case if whatever process holds the other
end of it is no longer there. So instead of a shared memory segment, you can
have a pool of descriptors, one for each worker that you care about. Polling
on those would be trivial with just regular poll(2).

The only issue might be that postgres forks a lot, so we would probably need
to implement FD_CLOFORK to avoid copying those extra fds into every child.
This is something akin to a solution that I recently posted to work around
the problem that you cannot really waitpid() on a grandchild; see PG BUG
#14199 for details and the patch.

But yes, it would be really nice to get rid of SYSV shared memory use in PG
completely at some point, one way or another.

-Max

On Thu, Jun 23, 2016 at 3:42 PM, Sean Chittenden wrote:

> Small nit:
>
> PostgreSQL used SYSV because it allowed for the detection of dead
> processes. If you `kill -9`’ed a process, PostgreSQL can detect that and
> then shut down and perform an automatic recovery. In this regard, sysv is
> pretty clever. The move to POSIX shared mem was done for a host of
> reasons, but it means that you don’t have to adjust your SYSV limits. My
> understanding from a few years ago is that there is still a ~64KB SYSV
> memory segment that is still used to act as the latch to signal if a
> process was killed, but all of the shared buffers are stored in posix
> mmap’ed regions.
>
> At this point in time this could be replaced with kqueue(2) EVFILT_PROC,
> but no one has done that yet.
>
> -sc
>
> --
> Sean Chittenden
> sean@chittenden.org
>
> > On Jun 22, 2016, at 07:26, Maxim Sobolev wrote:
> >
> > Konstantin,
> >
> > Not if you do sem_unlink() immediately, AFAIK. And that's what PG does.
> > So the window of opportunity for the leakage is quite small, much
> > smaller than for SYSV primitives. Sorry for missing your status update
> > message, I've missed it somehow.
> >
> > ----
> >         mySem = sem_open(semname, O_CREAT | O_EXCL,
> >                          (mode_t) IPCProtection,
> >                          (unsigned) 1);
> >
> > #ifdef SEM_FAILED
> >         if (mySem != (sem_t *) SEM_FAILED)
> >             break;
> > #else
> >         if (mySem != (sem_t *) (-1))
> >             break;
> > #endif
> >
> >         /* Loop if error indicates a collision */
> >         if (errno == EEXIST || errno == EACCES || errno == EINTR)
> >             continue;
> >
> >         /*
> >          * Else complain and abort
> >          */
> >         elog(FATAL, "sem_open(\"%s\") failed: %m", semname);
> >     }
> >
> >     /*
> >      * Unlink the semaphore immediately, so it can't be accessed
> >      * externally.
> >      * This also ensures that it will go away if we crash.
> >      */
> >     sem_unlink(semname);
> >
> >     return mySem;
> > ----
> >
> > -Max
> >
> > On Wed, Jun 22, 2016 at 3:02 AM, Konstantin Belousov
> > <kostikbel@gmail.com> wrote:
> >
> >> On Tue, Jun 21, 2016 at 12:48:00PM -0700, Maxim Sobolev wrote:
> >>> Thanks, Konstantin for the great work, we are definitely looking
> >>> forward to get all those improvements to be part of the default
> >>> FreeBSD kernel/port. Would be nice if you can post an update some day
> >>> later as to what's integrated and what's not.
> >> I did posted the update several days earlier. Since you replying to
> >> this thread, it would be not unreasonable to read recent messages that
> >> were sent.
> >>
> >>>
> >>> Just in case, I've opened #14206 with PG to switch us to using POSIX
> >>> semaphores by default. Apart from the mentioned performance benefits,
> >>> SYSV semaphores are PITA to deal with as they come in very limited
> >>> quantities by default. Also they might stay around if PG dies/gets
> >>> nuked and prevent it from starting again due to overflow.
> >>> We've got some quite ugly code to clean up those using ipcrm(1) in
> >>> our build scripts to deal with just that. I am happy that code could
> >>> be retired now.
> >>
> >> Named semaphores also stuck around if processes are killed without
> >> cleanup.
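
For reference, here is a minimal sketch of the socketpair idea from the top
of this message. It is not what PostgreSQL does today and not a proposed
patch; it only illustrates the technique under simple assumptions. The names
spawn_worker(), peer_is_gone() and kill_worker() are hypothetical helpers
made up for this example. The worker just sleeps, and the parent keeps one
end of the pair and closes the other, so the only remaining reference to the
far end lives in the worker. Once the worker dies, a non-blocking peek on the
parent's end returns EOF.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a worker that holds one end of a socketpair.
 * Returns the worker's pid and stores the parent's end in *watch_fd,
 * or returns -1 on failure.
 */
static pid_t
spawn_worker(int *watch_fd)
{
    int sv[2];
    pid_t pid;

    assert(watch_fd != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Worker: keep sv[1] open for as long as we live. */
        close(sv[0]);
        for (;;)
            pause();
    }
    /* Parent: must drop its copy of the worker's end, or EOF never comes. */
    close(sv[1]);
    *watch_fd = sv[0];
    return pid;
}

/*
 * Hypothetical helper: non-blocking liveness check.
 * Returns 1 if the peer is gone (EOF), 0 if it still holds its end,
 * -1 on an unexpected error.
 */
static int
peer_is_gone(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char c;
    ssize_t n;

    if (poll(&pfd, 1, 0) < 0)
        return (errno == EINTR) ? 0 : -1;
    if ((pfd.revents & (POLLIN | POLLHUP | POLLERR)) == 0)
        return 0;               /* nothing pending, peer still there */
    n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n == 0)
        return 1;               /* EOF: the other end was closed */
    if (n < 0)
        return (errno == EAGAIN || errno == EWOULDBLOCK ||
            errno == EINTR) ? 0 : -1;
    return 0;                   /* data pending, so the peer is alive */
}

/* Hypothetical helper for trying this out: SIGKILL the worker and reap it. */
static void
kill_worker(pid_t pid)
{
    kill(pid, SIGKILL);
    (void)waitpid(pid, NULL, 0);
}
```

To try it, call spawn_worker(&fd), check that peer_is_gone(fd) returns 0,
then kill_worker(pid) and check again. With several workers the parent would
put all the watch descriptors into a single poll(2) call. Note that any
process forked after the worker inherits the parent's watch end unless it
closes it explicitly, which is the extra descriptor copying mentioned above.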