Date: Tue, 28 Jan 2014 14:07:08 +0100
From: Jens Krieg <jkrieg@mailbox.tu-berlin.de>
To: freebsd-hackers@freebsd.org
Subject: ULE locking mechanism
Message-ID: <FD4193F4-FA47-4D77-BC1F-23749D9B7E5F@mailbox.tu-berlin.de>

Hello,

we are currently working on a project for our university. Our goal is to implement a simple round-robin (RR) scheduler for FreeBSD 9.2 on a single-core machine. So far we have removed most of the functionality of the ULE scheduler, except for the functions that are called from outside the scheduler. The system boots successfully to userland with our RR scheduler managing threads in a list-based run queue, and it is possible to interact with the system using the shell.

The next step is to replace the locking mechanism of the ULE scheduler. To do so, we replaced the scheduler-dependent thread_lock/thread_unlock functions with code that simply disables/enables interrupts. With this modification the kernel works fine until we reach userland, and then the system crashes. The error occurs in the init user process (init_main.c:start_init:685). We found that the page fault is triggered when the subyte function is executed for the first time. See the error description below (unfortunately, the faulting call is not shown in the backtrace). We compared the ULE scheduler with our RR implementation, and the parameters passed to subyte as well as the register values appear to be identical.

We assume that whatever caused the error is related to the replacement of the thread locking. Every time the kernel wants to modify thread data, the corresponding thread is locked to prevent interference from other threads. Since we are using a single-core machine, why isn't it sufficient to simply disable interrupts while modifying thread data?

Could you please provide us with detailed information about the locking mechanism in FreeBSD and answer the following questions?

What is the purpose of thread_lock/thread_unlock besides protecting thread data?

How does the TDQ LOCK work, and how is it related to a thread LOCK?
- The thread LOCKs of all threads located in the run queue point to the TDQ LOCK.
- The TDQ LOCK points to the currently running thread.
- On a context switch, the current thread passes the TDQ LOCK to the newly chosen thread.

Could you explain the idea behind that locking concept, please? Do you have any suggestions on what we should take care of in our own lock implementation?

Kind regards,
Jens Krieg

start_init: trying /sbin/init

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x7fffffffefff
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff808ab119
stack pointer           = 0x28:0xffffff800020db30
frame pointer           = 0x28:0xffffff800020dbe0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (kernel)
trap number             = 12
panic: page fault
KDB: stack backtrace:
#0 0xffffffff806e19cf at kdb_backtrace+0x5f
#1 0xffffffff806b2ddb at panic+0x15b
#2 0xffffffff808ac797 at trap_fatal+0x267
#3 0xffffffff808accfc at trap_pfault+0x40c
#4 0xffffffff808ad0ca at trap+0x37a
#5 0xffffffff8089839f at calltrap+0x8
#6 0xffffffff80687c4d at fork_exit+0x9d
#7 0xffffffff808988ce at fork_trampoline+0xe
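
To make the lock-pointer scheme from the three list items above concrete, here is a minimal user-space model in C. It is only a sketch of that description, not FreeBSD code. All names (model_lock, model_thread, model_thread_lock, model_switch, model_demo) are hypothetical, there is a single simulated CPU, and the lock is reduced to an owner pointer without any spinning or interrupt handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the FreeBSD definitions. */
struct model_thread;

struct model_lock {
    struct model_thread *owner;     /* NULL while the lock is free */
};

struct model_thread {
    struct model_lock *td_lock;     /* lock that currently protects this thread */
};

static void model_acquire(struct model_lock *l, struct model_thread *self)
{
    assert(l->owner == NULL);       /* single-CPU model: never contended */
    l->owner = self;
}

static void model_release(struct model_lock *l, struct model_thread *self)
{
    assert(l->owner == self);       /* only the owner may release */
    l->owner = NULL;
}

/*
 * Lock a thread through its lock pointer. The pointer is re-checked after
 * the lock is taken, because the thread may have been moved to a different
 * lock in between. In this single-threaded model the check always succeeds.
 */
static struct model_lock *model_thread_lock(struct model_thread *td,
                                            struct model_thread *self)
{
    for (;;) {
        struct model_lock *l = td->td_lock;
        model_acquire(l, self);
        if (l == td->td_lock)
            return l;
        model_release(l, self);
    }
}

/*
 * Context switch: the outgoing thread holds the queue lock and passes
 * ownership to the incoming thread instead of releasing it first.
 */
static void model_switch(struct model_lock *tdq_lock,
                         struct model_thread *oldtd,
                         struct model_thread *newtd)
{
    assert(tdq_lock->owner == oldtd);
    assert(newtd->td_lock == tdq_lock);
    tdq_lock->owner = newtd;
}

/* Walk through the three list items once; returns 0 if every step holds. */
int model_demo(void)
{
    struct model_lock tdq_lock = { NULL };
    struct model_thread a = { &tdq_lock };  /* currently running thread */
    struct model_thread b = { &tdq_lock };  /* thread waiting in the run queue */

    if (model_thread_lock(&a, &a) != &tdq_lock)
        return 1;                   /* thread lock resolves to the queue lock */
    model_switch(&tdq_lock, &a, &b);
    if (tdq_lock.owner != &b)
        return 2;                   /* ownership was handed to the new thread */
    model_release(&tdq_lock, &b);   /* the new thread releases after the switch */
    return tdq_lock.owner == NULL ? 0 : 3;
}
```

In this model, simply disabling interrupts would correspond to dropping the owner field entirely. The sketch keeps it so that the handoff in model_switch stays visible.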
