From owner-freebsd-current@FreeBSD.ORG Fri Oct 21 20:46:38 2005
Message-ID: <435953A6.6090203@mcsi.pp.ru>
Date: Sat, 22 Oct 2005 00:46:30 +0400
From: Maxim Maximov
To: current@freebsd.org
Cc: Bill Paul
Subject: Re: boot panic (NDIS, SCHED_ULE?)
References: <43590814.5090201@mcsi.pp.ru> <4359483C.6000808@mcsi.pp.ru>
In-Reply-To: <4359483C.6000808@mcsi.pp.ru>
List-Id: Discussions about the use of FreeBSD-current

Maxim Maximov wrote:
> Maxim Maximov wrote:
>
>> Hi.
>>
>> Got boot time panic on fresh CURRENT.
>> NDIS hardware:
>>
>> ndis0: mem 0xfeaf8000-0xfeaf9fff irq 17
>> at device 2.0 on pci2
>> ndis0: NDIS API version: 5.0
>> ndis0: Ethernet address: 00:0e:a6:c2:00:e4
>>
>> Panic:
>> ...
>> Timecounters tick every 1.000 msec
>> kernel trap 12 with interrupts disabled
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address   = 0x109
>> fault code              = supervisor read, page not present
>> instruction pointer     = 0x20:0xc06a7570
>> stack pointer           = 0x28:0xd5985cbc
>> frame pointer           = 0x28:0xd5985cc4
>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                         = DPL 0, pres 1, def32 1, gran 1
>> processor eflags        = resume, IOPL = 0
>> current process         = 41 (Windows DPC 1)
>>
>>
>> Hand transcribed trace, abbreviated:
>>
>>
>> Stopped at kseq_notify+0x94:      cmpb 0x109(%edx), %al
>
>
> Debugging shows that this is:
>
>         pcpu = pcpu_find(cpu);
>         td = pcpu->pc_curthread;
>         if (ke->ke_thread->td_priority < td->td_priority ||
>                                          ^^^^^ here.
>             td == pcpu->pc_idlethread) {
>                 td->td_flags |= TDF_NEEDRESCHED;
>                 ipi_selected(1 << cpu, IPI_AST);
>         }
>
> And %edx holds 'td' pointer, not ke->ke_thread.
> So I wonder could this be sched_ule problem, just being triggered by new
> NDIS code? I'll try to build sched_4bsd kernel now to see if it disappears.
>

The panic remains with sched_4bsd. The trace has changed, of course:

kick_other_cpu
sched_add
setrunqueue
sched_switch
mi_switch
sched_bind
ntoskrnl_dpc_thread
...

But the cause is still the same: the scheduler cannot dereference
pcpu->pc_curthread. cpu1 is not started yet. 'show allpcpu' shows
curthread on cpu1 as none, so I guess it is just illegal to call
sched_bind() this early in boot. sched_bind() was used in r1.75 of
subr_ntoskrnl.c. Should I try to just remove this line?

>>
>> >trace
>> kseq_notify(c1edbb24,1,c09c4520,c1edbb24,c08236bc) ...+0x94
>> sched_bind(c1edb9c0,1) ...+0x62
>> ntoskrnl_dpc_thread(c1f4fc3c,d5985d38,c1f4fc3c,c08236bc,0) ...+0x73
>> fork_exit(c08236bc,c1f4fc3c,d5985d38) ...+0xa4
>> fork_trampoline()
>>
>
>

-- 
Maxim Maximov