Date: Thu, 23 Jun 2016 15:42:32 -0700
From: Sean Chittenden <sean@chittenden.org>
To: Maxim Sobolev <sobomax@freebsd.org>
Cc: Konstantin Belousov <kostikbel@gmail.com>, Adrian Chadd <adrian@freebsd.org>, performance@freebsd.org, John Baldwin <jhb@freebsd.org>, Alan Somers <asomers@freebsd.org>, Alan Cox <alc@rice.edu>, Alan Cox <alc@freebsd.org>, freebsd-current <freebsd-current@freebsd.org>, "current@freebsd.org" <current@freebsd.org>
Subject: Re: PostgreSQL performance on FreeBSD
Message-ID: <C06B11C3-5D40-43AF-8975-880F272933C5@chittenden.org>
In-Reply-To: <CAH7qZfvy46wWcrjz-ihA%2B%2BEYktm7PqGoJhj1a7hdYWssiEXFuA@mail.gmail.com>
References: <20140627125613.GT93733@kib.kiev.ua> <CAJ-Vmom-M=R=FaBfHE5c2%2BYxW0SLmJTdFJD8tW4_aOD7MDNwzA@mail.gmail.com> <CAJ-Vmomt=WYjct%2BzsTbHuryxqYp7ELyS52LOb4NEsfENQ1yj1w@mail.gmail.com> <1603235.2ShtoCfSqO@ralph.baldwin.cx> <CAH7qZfuAtHtUG92wEjPhOZ=BGgyFS728uigjJoD0pG%2B-mtUSww@mail.gmail.com> <20160622100241.GM38613@kib.kiev.ua> <CAH7qZfvy46wWcrjz-ihA%2B%2BEYktm7PqGoJhj1a7hdYWssiEXFuA@mail.gmail.com>

Small nit: PostgreSQL used SYSV because it allowed for the detection of dead processes. If you `kill -9`’ed a process, PostgreSQL can detect that and then shut down and perform an automatic recovery. In this regard, sysv is pretty clever. The move to POSIX shared mem was done for a host of reasons, but it means that you don’t have to adjust your SYSV limits. My understanding from a few years ago is that there is still a ~64KB SYSV memory segment that is still used to act as the latch to signal if a process was killed, but all of the shared buffers are stored in posix mmap’ed regions.

At this point in time this could be replaced with kqueue(2) EVFILT_PROC, but no one has done that yet.

-sc

--
Sean Chittenden
sean@chittenden.org

> On Jun 22, 2016, at 07:26 , Maxim Sobolev <sobomax@freebsd.org> wrote:
>
> Konstantin,
>
> Not if you do sem_unlink() immediately, AFAIK. And that's what PG does. So
> the window of opportunity for the leakage is quite small, much smaller than
> for SYSV primitives. Sorry for missing your status update message, I've
> missed it somehow.
>
> ----
>         mySem = sem_open(semname, O_CREAT | O_EXCL,
>                          (mode_t) IPCProtection,
>                          (unsigned) 1);
>
> #ifdef SEM_FAILED
>         if (mySem != (sem_t *) SEM_FAILED)
>             break;
> #else
>         if (mySem != (sem_t *) (-1))
>             break;
> #endif
>
>         /* Loop if error indicates a collision */
>         if (errno == EEXIST || errno == EACCES || errno == EINTR)
>             continue;
>
>         /*
>          * Else complain and abort
>          */
>         elog(FATAL, "sem_open(\"%s\") failed: %m", semname);
>     }
>
>     /*
>      * Unlink the semaphore immediately, so it can't be accessed externally.
>      * This also ensures that it will go away if we crash.
>      */
>     sem_unlink(semname);
>
>     return mySem;
> ----
>
> -Max
>
> On Wed, Jun 22, 2016 at 3:02 AM, Konstantin Belousov <kostikbel@gmail.com>
> wrote:
>
>> On Tue, Jun 21, 2016 at 12:48:00PM -0700, Maxim Sobolev wrote:
>>> Thanks, Konstantin for the great work, we are definitely looking forward to
>>> get all those improvements to be part of the default FreeBSD kernel/port.
>>> Would be nice if you can post an update some day later as to what's
>>> integrated and what's not.
>> I did posted the update several days earlier. Since you replying to this
>> thread, it would be not unreasonable to read recent messages that were
>> sent.
>>
>>>
>>> Just in case, I've opened #14206 with PG to switch us to using POSIX
>>> semaphores by default. Apart from the mentioned performance benefits, SYSV
>>> semaphores are PITA to deal with as they come in very limited quantities by
>>> default. Also they might stay around if PG dies/gets nuked and prevent it
>>> from starting again due to overflow. We've got some quite ugly code to
>>> clean up those using ipcrm(1) in our build scripts to deal with just that.
>>> I am happy that code could be retired now.
>>
>> Named semaphores also stuck around if processes are killed without cleanup.
>>
>>
> _______________________________________________
> freebsd-performance@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-performance
> To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org"
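
For readers wondering what "detection of dead processes" with SYSV can look like in practice, here is a minimal sketch, assuming the mechanism in question is the kernel-maintained attach count (`shm_nattch`) of a SysV shared memory segment. It is not PostgreSQL's code; the helper names `attach_count()` and `nattch_demo()` are made up for illustration. The point it tries to show is that the count drops even when the attached process is killed with SIGKILL and never gets a chance to clean up.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: kernel attach count of a segment, or -1 on error. */
static long
attach_count(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return (long) ds.shm_nattch;
}

/*
 * Hypothetical demo: a child attaches a small private segment, the parent
 * records the attach count, kills the child with SIGKILL, reaps it, and
 * records the count again.  Returns 0 on success, -1 on any failure.
 */
static int
nattch_demo(long *before, long *after)
{
    int     shmid, pfd[2], ok;
    pid_t   pid;
    char    c = 0;

    assert(before != NULL && after != NULL);

    /* 64KB is an arbitrary size chosen for the sketch. */
    shmid = shmget(IPC_PRIVATE, 65536, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;
    if (pipe(pfd) == -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        close(pfd[0]);
        close(pfd[1]);
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    if (pid == 0) {
        /* Child: attach, tell the parent, then wait to be killed. */
        close(pfd[0]);
        c = (shmat(shmid, NULL, 0) == (void *) -1) ? 'E' : 'A';
        if (write(pfd[1], &c, 1) != 1)
            _exit(1);
        for (;;)
            pause();
    }

    /* Parent: wait until the child reports a successful attach. */
    close(pfd[1]);
    ok = (read(pfd[0], &c, 1) == 1 && c == 'A');
    close(pfd[0]);

    if (ok)
        *before = attach_count(shmid);

    /* No cleanup code runs in the child after SIGKILL. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);

    if (ok)
        *after = attach_count(shmid);

    /* Remove the segment so the sketch does not leak SysV resources. */
    shmctl(shmid, IPC_RMID, NULL);
    return ok ? 0 : -1;
}
```

Calling `nattch_demo(&before, &after)` from a small `main()` should report one attachment while the child is alive and none after it has been killed and reaped, since the parent in this sketch never attaches the segment itself. A supervisor could poll that count in a similar way; the kqueue(2) EVFILT_PROC approach mentioned above would instead deliver an event when a watched pid exits.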