Date: Wed, 17 Nov 2004 08:01:43 +0800 From: David Xu <davidxu@freebsd.org> To: John Baldwin <jhb@freebsd.org> Cc: Perforce Change Reviews <perforce@freebsd.org> Subject: Re: PERFORCE change 65074 for review Message-ID: <419A94E7.1080908@freebsd.org> In-Reply-To: <200411161600.33252.jhb@FreeBSD.org> References: <200411140513.iAE5DOTv056478@repoman.freebsd.org> <200411151318.49415.jhb@FreeBSD.org> <41993EAB.3030108@freebsd.org> <200411161600.33252.jhb@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
John Baldwin wrote: >On Monday 15 November 2004 06:41 pm, David Xu wrote: > > >>John Baldwin wrote: >> >> >>>On Sunday 14 November 2004 12:13 am, David Xu wrote: >>> >>> >>>>http://perforce.freebsd.org/chv.cgi?CH=65074 >>>> >>>>Change 65074 by davidxu@davidxu_alona on 2004/11/14 05:12:40 >>>> >>>> 1. Fix a race between signal and umtx_unlock. a waiter >>>> may be resumed by signal and left or exited, heavily >>>> loaded test causes kernel to crash. >>>> 2. Use distributed queue locks instead of single giant >>>> lock. >>>> >>>>Affected files ... >>>> >>>>.. //depot/projects/davidxu_ksedbg/src/sys/kern/kern_umtx.c#4 edit >>>> >>>>Differences ... >>>> >>>>==== //depot/projects/davidxu_ksedbg/src/sys/kern/kern_umtx.c#4 (text+ko) >>>>==== >>>> >>>>@@ -49,25 +49,48 @@ >>>> pid_t uq_pid; /* Pid key component. */ >>>>}; >>>> >>>>#define UMTX_QUEUES 128 >>>>#define UMTX_HASH(pid, umtx) \ >>>>- (((uintptr_t)pid + ((uintptr_t)umtx & ~65535)) % UMTX_QUEUES) >>>>+ ((((uintptr_t)pid << 16) + ((uintptr_t)umtx & 65535)) % UMTX_QUEUES) >>>> >>>> >>>I'm curious why you changed the hash macro here? Low order bits of >>>pointers tend to be zero due to alignment, so I think this will result in >>>fewer "useful" bits and more collisions and longer chains. >>> >>> >>FYI, I changed back to original hash code: >>I simulate it under userland: >> >>printf("thread lock: %p hash=%d\n", &thread->lock, >> ((getpid() + ((unsigned)(&thread->lock) & ~0xFFFF))) % 128); >> >>get result: >> >>thread lock: 0x804e014 hash=39 >>thread lock: 0x8053014 hash=39 >>thread lock: 0x8056014 hash=39 >>thread lock: 0x805a014 hash=39 >>thread lock: 0x805d014 hash=39 >>thread lock: 0x8060014 hash=39 >>thread lock: 0x8063014 hash=39 >>thread lock: 0x8067014 hash=39 >>thread lock: 0x806a014 hash=39 >>thread lock: 0x806d014 hash=39 >>thread lock: 0x8070014 hash=39 >> >>So my version put all locks on chain 0, the orignal version put all >>locks on chain 39. :( Both seem bad algorithm. >>Note it is my private version of libpthread using thr and umtx, >>1:1 only. >> >> > >Humm, try changing the hash to not mask off bits from the umtx pointer. The >pid is useful because if locks are allocated at the same address in different >processes (e.g. if you run the same program N times) it will keep them from >all colliding, but if you mask off all the bits with only 128 queues it will >currently stick all locks from a process on the same queue. Maybe use >something similar to the 4BSD sleep queue hash (still used in >subr_sleepqueue.c) that looks like: > >/* > * Constants for the hash table of sleep queue chains. These constants are > * the same ones that 4BSD (and possibly earlier versions of BSD) used. > * Basically, we ignore the lower 8 bits of the address since most wait > * channel pointers are aligned and only look at the next 7 bits for the > * hash. SC_TABLESIZE must be a power of two for SC_MASK to work properly. > */ >#define SC_TABLESIZE 128 /* Must be power of 2. */ >#define SC_MASK (SC_TABLESIZE - 1) >#define SC_SHIFT 8 >#define SC_HASH(wc) (((uintptr_t)(wc) >> SC_SHIFT) & SC_MASK) >#define SC_LOOKUP(wc) &sleepq_chains[SC_HASH(wc)] > >So maybe use something like: > >#define UMTX_HASH(pid, umtx) \ > (((uintptr_t)(umtx) >> 8 + (uintptr_t)pid) % UMTX_QUEUES) > > > I t looks better, but I found with few locks, it still has two or more locks on same chain. I am thinking that we can use some code like this: char *p = &(umtx_p); unsigned k = 0; k = ((k << 2) + k) + p[0]; k = ((k << 2) + k) + p[1]; k = ((k << 2) + k) + p[2]; k = ((k << 2) + k) + p[3]; The above code multiplicates k by 5 (prime number), and add one byte from umtx pointer. I found all umtxes are now well distributed in hash table. I only calculate the hash number once in umtx_lock or umtx_unlock, not like current it repeatly calculates it in loop. David
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?419A94E7.1080908>