From nobody Fri Oct 28 16:37:10 2022
Date: Fri, 28 Oct 2022 16:37:10 GMT
Message-Id: <202210281637.29SGbA81039837@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org,
        dev-commits-src-main@FreeBSD.org
From: Mitchell Horne <mhorne@FreeBSD.org>
Subject: git: 701923e2a410 - main - riscv: improve parsing of riscv,isa property strings
List-Id: Commit messages for all branches of the src repository <dev-commits-src-all.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
List-Help: <mailto:dev-commits-src-all+help@freebsd.org>
List-Post: <mailto:dev-commits-src-all@freebsd.org>
List-Subscribe: <mailto:dev-commits-src-all+subscribe@freebsd.org>
List-Unsubscribe: <mailto:dev-commits-src-all+unsubscribe@freebsd.org>
Sender: owner-dev-commits-src-all@freebsd.org
X-BeenThere: dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: mhorne
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 701923e2a4105be606c5263181b6eb6f546f1a84
Auto-Submitted: auto-generated

The branch main has been updated by mhorne:

URL: https://cgit.FreeBSD.org/src/commit/?id=701923e2a4105be606c5263181b6eb6f546f1a84

commit 701923e2a4105be606c5263181b6eb6f546f1a84
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2022-10-28 16:28:08 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2022-10-28 16:28:08 +0000

    riscv: improve parsing of riscv,isa property strings
    
    This code was originally written under the assumption that the ISA
    string would only contain single-letter extensions. The RISC-V
    specification has extended its description of the format quite a bit,
    allowing for much longer ISA strings containing multi-letter extension
    names.
    
    Newer versions of QEMU (7.1.0) append multi-letter standard extensions,
    such as Zifencei, to the riscv,isa property. This makes the string
    longer than the old 32-character limit and triggers the KASSERT on its
    length, preventing boot.
    
    Increase the size of the isa array from 32 to 1024 bytes, and teach the
    code to parse (skip over) multi-letter extensions and optional extension
    version numbers. We currently ignore them completely, but this will
    change in the future as we start supporting supervisor-level extensions.
    
    MFC after:      2 weeks
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D36601
---
 sys/riscv/include/elf.h    |  14 ++--
 sys/riscv/riscv/identcpu.c | 182 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 169 insertions(+), 27 deletions(-)
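
Reader's note: the parsing loop added to identcpu.c in the diff below can be
tried outside the kernel. What follows is a minimal userspace sketch, not the
committed code. All sketch_* names are hypothetical, the prefix length is
passed in by the caller rather than derived from __riscv_xlen, and the legacy
"su" check and the FPE guard are left out. Unlike the patch, the sketch only
treats 'p' as a version separator after at least one major-version digit, and
it checks the index against the length before reading each character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Same bit layout as the patched HWCAP_ISA_BIT(): bit 0 is 'a'. */
#define SKETCH_ISA_BIT(c)	(1UL << ((c) - 'a'))
#define SKETCH_ISA_G							\
	(SKETCH_ISA_BIT('i') | SKETCH_ISA_BIT('m') | SKETCH_ISA_BIT('a') | \
	 SKETCH_ISA_BIT('f') | SKETCH_ISA_BIT('d'))

/* Skip a multi-letter extension: stop at '_' or at the end of the string. */
static size_t
sketch_skip_multi(const char *isa, size_t idx, size_t len)
{
	while (idx < len && isa[idx] != '_')
		idx++;
	return (idx);
}

/* Skip an optional version suffix such as "2" or "2p1". */
static size_t
sketch_skip_version(const char *isa, size_t idx, size_t len)
{
	size_t start = idx;

	while (idx < len && isdigit((unsigned char)isa[idx]))
		idx++;
	/* Assumption of this sketch: 'p' only counts after a major digit. */
	if (idx > start && idx < len && isa[idx] == 'p') {
		idx++;
		while (idx < len && isdigit((unsigned char)isa[idx]))
			idx++;
	}
	return (idx);
}

/* Walk a lowercase ISA string and collect single-letter extension bits. */
unsigned long
sketch_parse_isa(const char *isa, size_t prefix_len)
{
	unsigned long hwcap = 0;
	size_t i, len;

	assert(isa != NULL);
	len = strlen(isa);
	i = prefix_len;
	while (i < len) {
		switch (isa[i]) {
		case 'a': case 'c': case 'd':
		case 'f': case 'i': case 'm':
			hwcap |= SKETCH_ISA_BIT(isa[i]);
			i++;
			break;
		case 'g':
			hwcap |= SKETCH_ISA_G;
			i++;
			break;
		case 's': case 'x': case 'z':
			/* Multi-letter namespaces are skipped, not decoded. */
			i = sketch_skip_multi(isa, i, len);
			break;
		case '_':
			i++;
			continue;
		default:
			/* Unrecognized single-letter extension. */
			i++;
			break;
		}
		i = sketch_skip_version(isa, i, len);
	}
	return (hwcap);
}
```

For example, sketch_parse_isa("rv64imafdc_zicsr_zifencei", 4) sets only the
bits for i, m, a, f, d and c; both Z extensions are skipped.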

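Reader's note on the elf.h hunk: HWCAP_ISA_BIT() is rebased from 'A' to 'a'
and every caller switches to the lowercase letter, so the numeric value of
each AT_HWCAP flag should stay the same. A small sketch of that check follows;
the OLD_/NEW_ macro names and the helper function are hypothetical and exist
only for this illustration.

```c
#include <assert.h>

#define OLD_HWCAP_ISA_BIT(c)	(1 << ((c) - 'A'))	/* before the patch */
#define NEW_HWCAP_ISA_BIT(c)	(1 << ((c) - 'a'))	/* after the patch */

/*
 * Return 1 if the old macro applied to an uppercase letter and the new
 * macro applied to the matching lowercase letter yield the same flag.
 */
int
hwcap_bit_unchanged(char upper, char lower)
{
	assert(upper >= 'A' && upper <= 'Z');
	assert(lower == upper - 'A' + 'a');
	return (OLD_HWCAP_ISA_BIT(upper) == NEW_HWCAP_ISA_BIT(lower));
}
```

Calling hwcap_bit_unchanged('I', 'i') returns 1, and the same holds for the
M, A, F, D and C flags defined in the header.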
diff --git a/sys/riscv/include/elf.h b/sys/riscv/include/elf.h
index 671e2d2617c0..c0817d8fb47b 100644
--- a/sys/riscv/include/elf.h
+++ b/sys/riscv/include/elf.h
@@ -75,13 +75,13 @@ __ElfType(Auxinfo);
 #define	ET_DYN_LOAD_ADDR 0x100000
 
 /* Flags passed in AT_HWCAP */
-#define	HWCAP_ISA_BIT(c)	(1 << ((c) - 'A'))
-#define	HWCAP_ISA_I		HWCAP_ISA_BIT('I')
-#define	HWCAP_ISA_M		HWCAP_ISA_BIT('M')
-#define	HWCAP_ISA_A		HWCAP_ISA_BIT('A')
-#define	HWCAP_ISA_F		HWCAP_ISA_BIT('F')
-#define	HWCAP_ISA_D		HWCAP_ISA_BIT('D')
-#define	HWCAP_ISA_C		HWCAP_ISA_BIT('C')
+#define	HWCAP_ISA_BIT(c)	(1 << ((c) - 'a'))
+#define	HWCAP_ISA_I		HWCAP_ISA_BIT('i')
+#define	HWCAP_ISA_M		HWCAP_ISA_BIT('m')
+#define	HWCAP_ISA_A		HWCAP_ISA_BIT('a')
+#define	HWCAP_ISA_F		HWCAP_ISA_BIT('f')
+#define	HWCAP_ISA_D		HWCAP_ISA_BIT('d')
+#define	HWCAP_ISA_C		HWCAP_ISA_BIT('c')
 #define	HWCAP_ISA_G		\
     (HWCAP_ISA_I | HWCAP_ISA_M | HWCAP_ISA_A | HWCAP_ISA_F | HWCAP_ISA_D)
 
diff --git a/sys/riscv/riscv/identcpu.c b/sys/riscv/riscv/identcpu.c
index f3afa9b8c7ea..4c151eb47939 100644
--- a/sys/riscv/riscv/identcpu.c
+++ b/sys/riscv/riscv/identcpu.c
@@ -39,6 +39,7 @@ __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/systm.h>
+#include <sys/ctype.h>
 #include <sys/kernel.h>
 #include <sys/pcpu.h>
 #include <sys/sysctl.h>
@@ -104,34 +105,171 @@ const struct cpu_implementers cpu_implementers[] = {
 	CPU_IMPLEMENTER_NONE,
 };
 
-#ifdef FDT
 /*
- * The ISA string is made up of a small prefix (e.g. rv64) and up to 26 letters
- * indicating the presence of the 26 possible standard extensions. Therefore 32
- * characters will be sufficient.
+ * The ISA string describes the complete set of instructions supported by a
+ * RISC-V CPU. The string begins with a small prefix (e.g. rv64) indicating the
+ * base ISA. It is followed first by single-letter ISA extensions, and then
+ * multi-letter ISA extensions.
+ *
+ * Underscores are used mainly to separate consecutive multi-letter extensions,
+ * but may optionally appear between any two extensions. An extension may be
+ * followed by a version number, in the form of 'Mpm', where 'M' is the
+ * extension's major version number, and 'm' is the minor version number.
+ *
+ * The format is described in detail by the "ISA Extension Naming Conventions"
+ * chapter of the unprivileged spec.
  */
-#define	ISA_NAME_MAXLEN		32
 #define	ISA_PREFIX		("rv" __XSTRING(__riscv_xlen))
 #define	ISA_PREFIX_LEN		(sizeof(ISA_PREFIX) - 1)
 
+static __inline int
+parse_ext_s(char *isa, int idx, int len)
+{
+	/*
+	 * Proceed to the next multi-letter extension or the end of the
+	 * string.
+	 *
+	 * TODO: parse these once we gain support
+	 */
+	while (idx < len && isa[idx] != '_') {
+		idx++;
+	}
+
+	return (idx);
+}
+
+static __inline int
+parse_ext_x(char *isa, int idx, int len)
+{
+	/*
+	 * Proceed to the next multi-letter extension or the end of the
+	 * string.
+	 */
+	while (idx < len && isa[idx] != '_') {
+		idx++;
+	}
+
+	return (idx);
+}
+
+static __inline int
+parse_ext_z(char *isa, int idx, int len)
+{
+	/*
+	 * Proceed to the next multi-letter extension or the end of the
+	 * string.
+	 *
+	 * TODO: parse some of these.
+	 */
+	while (idx < len && isa[idx] != '_') {
+		idx++;
+	}
+
+	return (idx);
+}
+
+static __inline int
+parse_ext_version(char *isa, int idx, u_int *majorp __unused,
+    u_int *minorp __unused)
+{
+	/* Major version. */
+	while (isdigit(isa[idx]))
+		idx++;
+
+	if (isa[idx] != 'p')
+		return (idx);
+	else
+		idx++;
+
+	/* Minor version. */
+	while (isdigit(isa[idx]))
+		idx++;
+
+	return (idx);
+}
+
+/*
+ * Parse the ISA string, building up the set of HWCAP bits as they are found.
+ */
 static void
-fill_elf_hwcap(void *dummy __unused)
+parse_riscv_isa(char *isa, int len, u_long *hwcapp)
 {
-	u_long caps[256] = {0};
-	char isa[ISA_NAME_MAXLEN];
 	u_long hwcap;
-	phandle_t node;
-	ssize_t len;
 	int i;
 
-	caps['i'] = caps['I'] = HWCAP_ISA_I;
-	caps['m'] = caps['M'] = HWCAP_ISA_M;
-	caps['a'] = caps['A'] = HWCAP_ISA_A;
+	hwcap = 0;
+	i = ISA_PREFIX_LEN;
+	while (i < len) {
+		switch (isa[i]) {
+		case 'a':
+		case 'c':
 #ifdef FPE
-	caps['f'] = caps['F'] = HWCAP_ISA_F;
-	caps['d'] = caps['D'] = HWCAP_ISA_D;
+		case 'd':
+		case 'f':
 #endif
-	caps['c'] = caps['C'] = HWCAP_ISA_C;
+		case 'i':
+		case 'm':
+			hwcap |= HWCAP_ISA_BIT(isa[i]);
+			i++;
+			break;
+		case 'g':
+			hwcap |= HWCAP_ISA_G;
+			i++;
+			break;
+		case 's':
+			/*
+			 * XXX: older versions of this string erroneously
+			 * indicated supervisor and user mode support as
+			 * single-letter extensions. Detect and skip both 's'
+			 * and 'u'.
+			 */
+			if (isa[i - 1] != '_' && isa[i + 1] == 'u') {
+				i += 2;
+				continue;
+			}
+
+			/*
+			 * Supervisor-level extension namespace.
+			 */
+			i = parse_ext_s(isa, i, len);
+			break;
+		case 'x':
+			/*
+			 * Custom extension namespace. For now, we ignore
+			 * these.
+			 */
+			i = parse_ext_x(isa, i, len);
+			break;
+		case 'z':
+			/*
+			 * Multi-letter standard extension namespace.
+			 */
+			i = parse_ext_z(isa, i, len);
+			break;
+		case '_':
+			i++;
+			continue;
+		default:
+			/* Unrecognized/unsupported. */
+			i++;
+			break;
+		}
+
+		i = parse_ext_version(isa, i, NULL, NULL);
+	}
+
+	if (hwcapp != NULL)
+		*hwcapp = hwcap;
+}
+
+#ifdef FDT
+static void
+fill_elf_hwcap(void *dummy __unused)
+{
+	char isa[1024];
+	u_long hwcap;
+	phandle_t node;
+	ssize_t len;
 
 	node = OF_finddevice("/cpus");
 	if (node == -1) {
@@ -152,7 +290,7 @@ fill_elf_hwcap(void *dummy __unused)
 			continue;
 
 		len = OF_getprop(node, "riscv,isa", isa, sizeof(isa));
-		KASSERT(len <= ISA_NAME_MAXLEN, ("ISA string truncated"));
+		KASSERT(len <= (ssize_t)sizeof(isa), ("ISA string truncated"));
 		if (len == -1) {
 			if (bootverbose)
 				printf("fill_elf_hwcap: "
@@ -165,9 +303,13 @@ fill_elf_hwcap(void *dummy __unused)
 			return;
 		}
 
-		hwcap = 0;
-		for (i = ISA_PREFIX_LEN; i < len; i++)
-			hwcap |= caps[(unsigned char)isa[i]];
+		/*
+		 * The string is specified to be lowercase, but let's be
+		 * certain.
+		 */
+		for (int i = 0; i < len; i++)
+			isa[i] = tolower(isa[i]);
+		parse_riscv_isa(isa, len, &hwcap);
 
 		if (elf_hwcap != 0)
 			elf_hwcap &= hwcap;