From owner-freebsd-ports Sun Aug 4 10:50:48 2002
Message-Id: <20020804172234.85779.qmail@exxodus.fedaykin.here>
Date: 4 Aug 2002 17:22:34 -0000
From: Mario Sergio Fujikawa Ferreira
Reply-To: Mario Sergio Fujikawa Ferreira
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Subject: ports/41323: net/dctc freezes in semwait state if anyone tries an upload
Sender: owner-freebsd-ports@FreeBSD.ORG

>Number: 41323
>Category: ports
>Synopsis: net/dctc freezes in semwait state if anyone tries an upload
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: freebsd-ports
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Aug 04 10:50:02 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator: Mario Sergio Fujikawa Ferreira
>Release: FreeBSD 4.6-STABLE i386
>Organization:
>Environment:
System: FreeBSD exxodus.fedaykin.here 4.6-STABLE FreeBSD 4.6-STABLE #1: Sat Aug 3 09:26:28 BRT 2002 lioux@exxodus.fedaykin.here:/usr/obj/usr/src/sys/LIOUX i386

All FreeBSD platforms. Possibly all BSD based ones. Potentially other platforms as well, since part of the problem is not platform specific. Any system using a user space threads implementation should be affected; see the description below.

All dctc versions up to the latest, 0.83.2. Patches are being sent to the developer. The latest FreeBSD port of net/dctc, version 0.83.2, has been fixed with port patches.

>Description:
This Problem Report was written to help others understand the port fix. No patches are included here; check CVS (or the appropriate version control system) for the net/dctc port, looking for this PR number. The fix is in version 0.83.2 of the port.

Explanation begins below.

dctc is a Direct Connect(TM) client. Among its advanced features are bandwidth throttling and multiple part file download from multiple hubs. It combines multi-thread and multi-process programming models to achieve its goals.

It uses one process for each Direct Connect(TM) hub it connects to. These processes use semaphores to stay synchronized, so that the summed bandwidth usage of all processes does not surpass the user-chosen bandwidth limits.

Within each process there are threads. One thread communicates with the hub while another thread manages bandwidth throttling.
The throttling thread periodically checks the combined bandwidth usage of the concurrent processes and throttles its own process accordingly.

PROBLEM:

1. Semaphore versus multi-threading

dctc uses semaphores within each process to coordinate throttling among concurrent processes. Furthermore, each process has concurrent threads which manage throttling on a per-process basis. Whenever a thread within the process accesses a semaphore to check the current bandwidth usage of all other processes, it may block. This can lead to a deadlock.

For instance, suppose there is only one process running. It still has to both update and check bandwidth usage. Moreover, it has to check whether other processes exist so that it can cooperate with them. However, if one thread takes the semaphore and another thread from the same program then tries to obtain it, the second thread may block. This is especially true with upload bandwidth limiting.

This matters because we are working with multiple threads. There are 2 threading models to consider: threads scheduled individually by the kernel (model 1), and threads implemented in user space inside a single process (model 2, the case noted under Environment). Recall that with user space threads a blocking call blocks the whole process. Consequently, the semaphore will block only a single thread in model 1. In model 2, however, it will effectively block ALL threads of the program, since it blocks the process containing all of them.

Therefore, blocking calls should be avoided in multi-threaded programs: there is no guarantee that a blocking call will block only the calling thread rather than ALL threads. This affects all BSD implementations. Of course, blocking calls can still be used if the programmer plans carefully for this.

Whenever an upload began, dctc blocked in the semwait state and had to be terminated with kill(1).

2. Hide absolute option not working

Hide absolute is a dctc option that hides the leading / of an absolute directory reference when returning search results. Also, it prefixes all search results with character .
If enabled, it also triggers removal of the leading / when building the available file database. However, when checking whether a file requested for upload is available with

    int file_in_db(char *filename, int *virtual);

inside src/dc_manage.c, files should be processed to remove the leading if hide absolute is enabled. Since this does not happen, every upload request fails except for the available file list (dcflist).

>How-To-Repeat:
1. Install net/dctc version 0. on one of the affected platforms, or build it against a user space thread implementation on one of the unaffected ones.
2. Connect to a Direct Connect(TM) hub.
3. Ask someone to try fetching either your available file list or any file.
4. The client freezes in the semwait state.

>Fix:
The fix is twofold. First, we need to prevent the client from blocking. Then, we need the client to implement the hide absolute option correctly so that it can properly process upload requests.

1. Prevent blocking

Since the semaphores are blocking ALL threads, we can tell them not to block and rewrite each operation as a busy wait construct with a small time interval between retries. Investigating the src/sema.c source code, the only blocking calls are semop(2) calls with -1 as the operation parameter. Consequently, we will both add IPC_NOWAIT to the flag parameter in all of those and rewrite them as busy wait constructs.

Nevertheless, this does not solve the contention problem. Semaphores are built to protect a shared resource; thus, we should add a thread-appropriate mutual exclusion mechanism. The most appropriate seems to be mutexes.

Also, whenever we busy wait, we will call

    void pthread_yield(void);

from pthread(3), increasing the chance that a concurrent thread releases the semaphore before our next try. See Example 1-1.

Example 1-1. Replacing a blocking semaphore operation with a non-blocking busy wait one

Replace

    void get_slice(int semid, SPD_SEMA semnum)
    {
        while(1)
        {
            struct sembuf local={0,-1,0}; /* slave sema */

            local.sem_num=semnum;
            if(semop(semid,&local,1)==0)
            {
                /* we have what we want */
                return;
            }
        }
    }

with the hopefully portable

    /* headers this excerpt relies on */
    #include <sys/param.h> /* defines BSD on BSD based systems */
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <errno.h>
    #include <pthread.h>
    #include <unistd.h>

    /* interval between busy wait tries measured in microseconds */
    #define MUTEX_BUSY_WAIT_TIME 5000

    void get_slice(int semid, SPD_SEMA semnum)
    {
    #if !(defined(BSD) && (BSD >= 199103))
        struct sembuf local={0,-1,0}; /* slave sema, blocking */
    #else
        struct sembuf local={0,-1,0|IPC_NOWAIT}; /* slave sema */ /* (1) */
    #endif
        local.sem_num=semnum;

        (void) lp_mutex_lock_(semaphore_mutex); /* (2) */
        while(1) {
            switch (semop(semid,&local,1)) {
            case 0:
                (void) lp_mutex_unlock_(semaphore_mutex); /* (3) */
                /* we have what we want */
                return;
                break;
            case -1:
                switch(errno) { /* (4) */
                case EAGAIN: /* would have blocked, triggers busy wait */
                case EINTR: /* interrupted by a signal, try again */
                    pthread_yield(); /* (5) */
                    usleep(MUTEX_BUSY_WAIT_TIME); /* busy wait with a small time out */ /* (6) */
                    continue;
                    break;
                }
                /* any other errno: the loop retries immediately */
            }
        }
    }

(1) Make the semaphore operation non-blocking.
(2) Add mutex protection around the shared semaphore: acquire the mutex lock before trying the semaphore.
(3) Release the mutex lock when we are done with the semaphore.
(4) If the semaphore operation fails AND errno signals that it would have blocked (or that it was interrupted), we have to try the semaphore again.
(5) Before trying again, yield the processor so that another thread can run. Another thread might release the required semaphore.
(6) To avoid hogging the processor with back-to-back retries, wait a time interval before retrying.

1.1 Fix

i. Copy my BSD licensed mutex handling routines, lp_mutex.c, to the src subdirectory.
ii. Apply patch patch-configure.in against configure.in to enable detection of the header file sys/param.h, which is used to detect whether the current system is BSD based.
iii. Apply patch patch-src::Makefile.in against src/Makefile.in to connect lp_mutex.c to the build.
iv. Apply patch patch-src::sema.c against src/sema.c, adding hopefully more portable semaphore code.

2. Correct hide absolute option

First, we need to pass the correct upload requests to the

    int file_in_db(char *filename, int *virtual);

routine, which checks whether the files exist. Then, we have to make sure that this checking routine understands the requests.

We will move the hide absolute option handling routines before the check routine. Then, we will add specific handling to the check routine.

2.1 Fix

i. Apply patch patch-src::dc_manage.c against src/dc_manage.c so that the checking routine receives a correct file request.
ii. Apply patch patch-src::mydb.c against src/mydb.c so that the checking routine understands the file requested.

3. PENDING problems:

* The upload bandwidth limitation option has to be enabled for upload to work at all. If it is not enabled, clients are dropped after transferring a few KB.
* When connected to multiple hubs, the client uses a lot of processing power. Checking the system load, it is way over 100%. This happened before the non-blocking semaphore busy wait fix as well.

>Release-Note:
>Audit-Trail:
>Unformatted:
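
For readers who want to try the non-blocking semop(2) behaviour described under >Fix in isolation, here is a minimal sketch. It is not part of the port patches: make_sema, try_get_slice and put_slice are hypothetical names invented for this illustration, the retry limit is an assumption (the patch above retries forever), and the mutex from lp_mutex.c is left out on purpose.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <errno.h>
#include <unistd.h>

/* Linux leaves union semun for the caller to define; BSD headers define it. */
#if defined(__linux__)
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};
#endif

/* Hypothetical helper: create a private set of one semaphore holding `value`. */
int make_sema(int value)
{
    union semun arg;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (semid == -1)
        return -1;
    arg.val = value;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        (void) semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/*
 * Hypothetical helper: try to take one unit without ever blocking.
 * Returns 0 on success. Returns -1 with errno set otherwise; errno is
 * EAGAIN when every one of the max_tries attempts would have blocked.
 */
int try_get_slice(int semid, unsigned short semnum, int max_tries,
                  unsigned int wait_usec)
{
    struct sembuf op;
    int i;

    op.sem_num = semnum;
    op.sem_op = -1;
    op.sem_flg = IPC_NOWAIT;        /* fail with EAGAIN instead of sleeping */

    for (i = 0; i < max_tries; i++) {
        if (semop(semid, &op, 1) == 0)
            return 0;
        if (errno != EAGAIN && errno != EINTR)
            return -1;              /* real error: do not retry on it */
        usleep(wait_usec);          /* pause between tries */
    }
    errno = EAGAIN;
    return -1;
}

/* Hypothetical helper: give one unit back. */
int put_slice(int semid, unsigned short semnum)
{
    struct sembuf op;

    op.sem_num = semnum;
    op.sem_op = 1;
    op.sem_flg = 0;
    return semop(semid, &op, 1);
}
```

With a semaphore created by make_sema(1), the first try_get_slice() call succeeds, a second one gives up with EAGAIN after its retries instead of hanging in semwait, and it succeeds again once put_slice() has been called. Remove the set with semctl(semid, 0, IPC_RMID) when done.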