From owner-freebsd-current@freebsd.org Wed Dec 11 10:48:42 2019 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 830941D3689 for ; Wed, 11 Dec 2019 10:48:42 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lj1-f178.google.com (mail-lj1-f178.google.com [209.85.208.178]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47XtvY2glrz3xjB for ; Wed, 11 Dec 2019 10:48:41 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lj1-f178.google.com with SMTP id 21so23552705ljr.0 for ; Wed, 11 Dec 2019 02:48:41 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:from:subject:openpgp:autocrypt:message-id :date:user-agent:mime-version:content-language :content-transfer-encoding; bh=5ccsfBHoDP2F61Eg1nluy0gJshVvLSUQCSFZQYhs1mM=; b=mron7trhUb1ZvTbL17380qtl6+xI2aKIYKq3spHzi8hYlLxeO9O4nGXn0abFsDYtuL kXC79gWnwmS84s2FkArrVqXk2KZpeLTAcPGZc7C7i57sN3yoDiWhoRIDXo+FrfuRCFOl zeAl/KsayB94a1ZF1yx2/XzoNgn6ieHl+RtvWfTelaLd9qzzqcvdvK11UYSOkqklduTN UaLjdZtnv5QyUmBfzEbd+EEYHk6548YBxoCUOx6nV/K79pwAEnotTAyb44zHveIbNqe4 SXVvDJ4m7ZmmlsOecw7HzyY9N+E1cVVsG5d+fTCITUPXIKUMQp2NqFmH543VkCXgcxQY M8Fw== X-Gm-Message-State: APjAAAWw4cEVJnhsSVRb0T1H3pZo6VDKnweBPHGnArwQ9TSuZAw/ojcL JdtNnLA5VXk0rDxxmcPU7iGyYEqm8Tg= X-Google-Smtp-Source: APXvYqzk0GEgP3IW8IHEX0N/xZNxd6HnJZbBRjpH8ARmM4Pu79LFgc1KHM8NqYY4AEI7vliMeA8LAg== X-Received: by 2002:a2e:b017:: with SMTP id y23mr1633925ljk.229.1576061318606; Wed, 11 Dec 2019 02:48:38 -0800 (PST) Received: from [192.168.0.88] (east.meadow.volia.net. [93.72.151.96]) by smtp.googlemail.com with ESMTPSA id u16sm890299ljo.22.2019.12.11.02.48.37 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 11 Dec 2019 02:48:37 -0800 (PST) To: FreeBSD Current From: Andriy Gapon Subject: any scheduler/ipi/wakeup bug fixed in the last year? Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= mQINBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABtB5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz6JAlQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryLkCDQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAYkCPAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: <7196727c-77bd-8c96-5194-3dcc7277d4bd@FreeBSD.org> Date: Wed, 11 Dec 2019 12:48:36 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 47XtvY2glrz3xjB X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of agapon@gmail.com designates 209.85.208.178 as permitted sender) smtp.mailfrom=agapon@gmail.com X-Spamd-Result: default: False [-2.11 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; FORGED_SENDER(0.30)[avg@FreeBSD.org,agapon@gmail.com]; RECEIVED_SPAMHAUS_PBL(0.00)[96.151.72.93.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10]; SUBJECT_ENDS_QUESTION(1.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; R_DKIM_NA(0.00)[]; FROM_NEQ_ENVFROM(0.00)[avg@FreeBSD.org,agapon@gmail.com]; TO_DOM_EQ_FROM_DOM(0.00)[]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-current@freebsd.org]; DMARC_NA(0.00)[FreeBSD.org]; RCPT_COUNT_ONE(0.00)[1]; MIME_TRACE(0.00)[0:+]; RCVD_IN_DNSWL_NONE(0.00)[178.208.85.209.list.dnswl.org : 127.0.5.0]; IP_SCORE(-1.11)[ip: (-0.45), ipnet: 209.85.128.0/17(-3.14), asn: 15169(-1.92), country: US(-0.05)]; RWL_MAILSPIKE_POSSIBLE(0.00)[178.208.85.209.rep.mailspike.net : 127.0.0.17]; RCVD_TLS_ALL(0.00)[] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 11 Dec 2019 10:48:42 -0000 I am investigating a problem that originally looked like a ZFS I/O hang. But it quickly became obvious that the GEOM "up" queue was not being processed. (kgdb) p g_bio_run_up $54 = {bio_queue = {tqh_first = 0xfffff801d8627178, tqh_last = 0xfffff80134751658}, bio_queue_lock = {lock_object = {lo_name = 0xffffffff80ad11ab "bio queue", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, bio_queue_length = 19} The queue is unlocked and there are 19 bio-s on it. At the same time: (kgdb) tid 100125 (kgdb) bt #0 sched_switch (td=0xfffff80111b23000, newtd=0xfffff801119d2000, flags=) at /usr/src/sys/kern/sched_ule.c:1997 #1 0xffffffff80705405 in mi_switch (flags=, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:436 #2 0xffffffff8074844a in sleepq_wait (wchan=, pri=) at /usr/src/sys/kern/subr_sleepqueue.c:694 #3 0xffffffff80704ed6 in _sleep (ident=0xffffffff81233d68 , lock=0xffffffff810d72e0 , priority=, wmesg=0xffffffff80b417e4 "-", sbt=0, pr=0, flags=256) at /usr/src/sys/kern/kern_synch.c:216 #4 0xffffffff8067713c in g_io_schedule_up (tp=) at /usr/src/sys/geom/geom_io.c:908 #5 0xffffffff8067772d in g_up_procbody (arg=) at /usr/src/sys/geom/geom_kern.c:99 #6 0xffffffff806c64c1 in fork_exit (callout=0xffffffff806776c0 , arg=0x0, frame=0xfffffe014cc87ac0) at /usr/src/sys/kern/kern_fork.c:1042 The "g_up" thread is sleeping as if the queue was empty. The code in g_io_schedule_up() and g_io_deliver() is obviously correct with respect to synchronizing the queue access and wait/wakeup. So, there must be something deeper. I examined the struct thread and the related scheduling objects: (kgdb) p *td $57 = {td_lock = 0xffffffff810f3a00 , td_proc = 0xfffff801119cd590, td_plist = {tqe_next = 0xfffff80111b1f5e0, tqe_prev = 0xfffff80111b235f0}, td_runq = {tqe_next = 0x0, tqe_prev = 0xffffffff810f3bd8 }, td_slpq = {tqe_next = 0x0, tqe_prev = 0xfffff80100050280}, td_lockq = {tqe_next = 0x0, tqe_prev = 0xfffffe018e443998}, td_hash = {le_next = 0x0, le_prev = 0xfffffe014bab68e8}, td_cpuset = 0xfffff80111b3a618, td_domain = {dr_policy = 0xffffffff810d78d8 , dr_iterator = 0}, td_sel = 0x0, td_sleepqueue = 0xfffff80100050280, td_turnstile = 0xfffff801a7ed8a80, td_rlqe = 0x0, td_umtxq = 0xfffff80111b13e80, td_tid = 100125, td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_ptrace = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xfffff80111b230d8}, sq_proc = 0xfffff801119cd590, sq_flags = 1}, td_lend_user_pri = 255 '\377', td_flags = 4, td_inhibitors = 0, td_pflags = 2097152, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0, td_owepreempt = 0 '\000', td_tsqueue = 0 '\000', td_locks = 0, td_rw_rlocks = 0, td_sx_slocks = 0, td_lk_slocks = 0, td_stopsched = 0, td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 0, td_ucred = 0xfffff80100082b00, td_limit = 0xfffff80100082a00, td_slptick = 0, td_blktick = 0, td_swvoltick = -2139537593, td_swinvoltick = -2139537706, td_cow = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 14113408, ru_nivcsw = 240828}, td_rux = {rux_runtime = 202213463115, rux_uticks = 0, rux_sticks = 10554, rux_iticks = 0, rux_uu = 0, rux_su = 36818497, rux_tu = 36818497}, td_incruntime = 46828278, td_runtime = 202260266673, td_pticks = 10557, td_sticks = 3, td_iticks = 0, td_uticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_generation = 14354236, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 0}, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = "g_up", '\000' , td_fpop = 0x0, td_dbgflags = 0, td_si = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, td_ng_outbound = 0, td_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_map_def_user = 0x0, td_dbg_forked = 0, td_vp_reserv = 0, td_no_sleeping = 0, td_su = 0x0, td_sleeptimo = 0, td_rtcgen = 0, td_sigmask = {__bits = {0, 0, 0, 0}}, td_rqindex = 23 '\027', td_base_pri = 92 '\\', td_priority = 92 '\\', td_pri_class = 3 '\003', td_user_pri = 120 'x', td_base_user_pri = 120 'x', td_rb_list = 0, td_rbp_list = 0, td_rb_inact = 0, td_sa = {code = 0, callp = 0x0, args = {0 }, narg = 0}, td_pcb = 0xfffffe014cc87b80, td_state = TDS_RUNQ, td_uretoff = {tdu_retval = {0, 0}, tdu_off = 0}, td_cowgen = 0, td_slpcallout = {c_links = {le = {le_next = 0x0, le_prev = 0x0}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_precision = 0, c_arg = 0x0, c_func = 0x0, c_lock = 0x0, c_flags = 0, c_iflags = 16, c_cpu = 0, c_exec_time = 0, c_lines = {u128 = 1481, u16 = {1481, 0, 0, 0, 0, 0, 0, 0}}}, td_frame = 0xfffffe014cc87ac0, td_kstack_obj = 0xfffff80111b31e00, td_kstack = 18446741880269406208, td_kstack_pages = 16, td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 582, md_spurflt_addr = 0, md_invl_gen = {gen = 0, link = {le_next = 0x0, le_prev = 0x0}}}, td_ar = 0x0, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0xfffff80111aef700, td_errno = 0, td_vnet = 0x0, td_vnet_lpush = 0x0, td_intr_frame = 0x0, td_rfppwait_p = 0x0, td_ma = 0x0, td_ma_cnt = 0, td_emuldata = 0x0, td_lastcpu = 3, td_oncpu = -1, td_lkpi_task = 0x0} (kgdb) p td->td_oncpu $58 = -1 (kgdb) p *td->td_sleepqueue $61 = {sq_blocked = {{tqh_first = 0x0, tqh_last = 0xfffff80100050280}, {tqh_first = 0x0, tqh_last = 0xfffff80100050290}}, sq_blockedcnt = {0, 0}, sq_hash = {le_next = 0xfffff807d069c380, le_prev = 0xffffffff811c7c40 }, sq_free = {lh_first = 0x0}, sq_wchan = 0xffffffff81233d68 , sq_type = 0} (kgdb) p *(struct td_sched *)&td[1] $68 = {ts_runq = 0xffffffff810f3a60 , ts_flags = 2, ts_cpu = 3, ts_rltick = -2139537593, ts_slice = 0, ts_slptime = 4601740, ts_runtime = 30146, ts_ltick = -2139537587, ts_ftick = -2139547918, ts_ticks = 20239, ts_name = "g_up tid 100125", '\000' } (kgdb) p tdq_cpu[3] $70 = {tdq_lock = {lock_object = {lo_name = 0xffffffff810f4678 "sched lock 3", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, tdq_cg = 0xffffffff811ce8b0 , tdq_load = 1, tdq_cpu_idle = 1, tdq_sysload = 1, tdq_transferable = 1, tdq_switchcnt = 1, tdq_oldswitchcnt = 26, tdq_lowpri = 92 '\\', tdq_ipipending = 0 '\000', tdq_idx = 9 '\t', tdq_ridx = 9 '\t', tdq_realtime = { ... What I see here: - td_state = TDS_RUNQ -- so the thread is "kind of" woken up - td_oncpu = -1 -- but not quite - ts_cpu = 3 -- looks like it was placed on CPU3's run queue - tdq_cpu_idle = 1 -- the CPU still "runs" its idle thread So, if I am not confused, it appears like possibly a notification from a waking CPU to the woken CPU (CPU3) was never delivered. Potentially, a problem with cpu_idle_wakeup() ? I wonder if there have been any bug fixes in that area over the past year or so. Any help and pointers are welcome. Thank you! -- Andriy Gapon