To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Baptiste Daroussin
Subject: git: 7c2c2c2a2253 - main - ed: add unicode test cases to ATF test suite
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: bapt
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 7c2c2c2a2253370c88fe428cf1c0ecebd68fe864
Auto-Submitted: auto-generated
Date: Sun, 01 Mar 2026 11:28:59 +0000
Message-Id: <69a422fb.398e4.1e4447d5@gitrepo.freebsd.org>

The branch main has been updated by bapt:

URL:
https://cgit.FreeBSD.org/src/commit/?id=7c2c2c2a2253370c88fe428cf1c0ecebd68fe864

commit 7c2c2c2a2253370c88fe428cf1c0ecebd68fe864
Author:     Baptiste Daroussin
AuthorDate: 2026-02-17 16:38:29 +0000
Commit:     Baptiste Daroussin
CommitDate: 2026-03-01 11:25:16 +0000

    ed: add unicode test cases to ATF test suite

    Including examples in Cyrillic suggested by kib@

    Differential Revision:  https://reviews.freebsd.org/D55364
---
 bin/ed/tests/ed_test.sh | 333 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 333 insertions(+)

diff --git a/bin/ed/tests/ed_test.sh b/bin/ed/tests/ed_test.sh
index c67df8ae9f65..d4b18fa92ca0 100755
--- a/bin/ed/tests/ed_test.sh
+++ b/bin/ed/tests/ed_test.sh
@@ -1687,6 +1687,322 @@ z
 CMDS
 }
 
+# ---------------------------------------------------------------------------
+# Unicode support
+# ---------------------------------------------------------------------------
+atf_test_case unicode_list_multibyte
+unicode_list_multibyte_head()
+{
+	atf_set "descr" "l command displays multibyte UTF-8 as-is";
+}
+unicode_list_multibyte_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'café\n' > input.txt
+	atf_check -o inline:'café$\n' ed -s - <<'CMDS'
+H
+r input.txt
+l
+Q
+CMDS
+}
+
+atf_test_case unicode_list_cjk
+unicode_list_cjk_head()
+{
+	atf_set "descr" "l command displays CJK characters as-is";
+}
+unicode_list_cjk_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf '日本語テスト\n' > input.txt
+	atf_check -o inline:'日本語テスト$\n' ed -s - <<'CMDS'
+H
+r input.txt
+l
+Q
+CMDS
+}
+
+atf_test_case unicode_list_mixed
+unicode_list_mixed_head()
+{
+	atf_set "descr" "l command displays mixed ASCII/UTF-8 correctly";
+}
+unicode_list_mixed_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'hello café 世界\n' > input.txt
+	atf_check -o inline:'hello café 世界$\n' ed -s - <<'CMDS'
+H
+r input.txt
+l
+Q
+CMDS
+}
+
+atf_test_case unicode_list_invalid
+unicode_list_invalid_head()
+{
+	atf_set "descr" "l command escapes invalid UTF-8 as octal";
+}
+unicode_list_invalid_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf '\200\201\376\377\n' > input.txt
+	atf_check -o inline:'\\200\\201\\376\\377$\n' ed -s - <<'CMDS'
+H
+r input.txt
+l
+Q
+CMDS
+}
+
+atf_test_case unicode_list_wrap_cjk
+unicode_list_wrap_cjk_head()
+{
+	atf_set "descr" "l command wraps correctly around double-width CJK";
+}
+unicode_list_wrap_cjk_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	# 69 A's + 日本 (2 CJK chars): 69 + 2 = 71 cols for 日 (fits),
+	# 71 + 2 = 73 for 本 (exceeds 72), so 本 wraps to next line.
+	printf 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA日本\n' > input.txt
+	ed -s - <<'CMDS' > output.txt
+H
+r input.txt
+l
+Q
+CMDS
+	printf 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA日\\\n本$\n' > expected.txt
+	atf_check cmp output.txt expected.txt
+}
+
+atf_test_case unicode_print
+unicode_print_head()
+{
+	atf_set "descr" "p command passes through UTF-8 correctly";
+}
+unicode_print_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'café 日本語\n' > input.txt
+	atf_check -o inline:'café 日本語\n' ed -s - <<'CMDS'
+H
+r input.txt
+p
+Q
+CMDS
+}
+
+atf_test_case unicode_number
+unicode_number_head()
+{
+	atf_set "descr" "n command displays line number with UTF-8";
+}
+unicode_number_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'café 日本語\n' > input.txt
+	atf_check -o inline:'1\tcafé 日本語\n' ed -s - <<'CMDS'
+H
+r input.txt
+n
+Q
+CMDS
+}
+
+atf_test_case unicode_regex
+unicode_regex_head()
+{
+	atf_set "descr" "Regex search matches UTF-8 characters";
+}
+unicode_regex_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'café\ntest\nüber\n' > input.txt
+	atf_check -o inline:'café\n' ed -s - <<'CMDS'
+H
+r input.txt
+g/é/p
+Q
+CMDS
+}
+
+atf_test_case unicode_regex_charclass
+unicode_regex_charclass_head()
+{
+	atf_set "descr" "Regex character classes work with UTF-8";
+}
+unicode_regex_charclass_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'café123\ntest456\n' > input.txt
+	atf_check -o inline:'café123\n' ed -s - <<'CMDS'
+H
+r input.txt
+g/[[:alpha:]]*é/p
+Q
+CMDS
+}
+
+atf_test_case unicode_substitute
+unicode_substitute_head()
+{
+	atf_set "descr" "Substitute replaces UTF-8 characters";
+}
+unicode_substitute_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'café\n' > input.txt
+	ed -s - <<'CMDS'
+H
+r input.txt
+s/é/e/
+w output.txt
+Q
+CMDS
+	printf 'cafe\n' > expected.txt
+	atf_check cmp output.txt expected.txt
+}
+
+atf_test_case unicode_substitute_cjk
+unicode_substitute_cjk_head()
+{
+	atf_set "descr" "Substitute replaces CJK characters";
+}
+unicode_substitute_cjk_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'hello 世界\n' > input.txt
+	ed -s - <<'CMDS'
+H
+r input.txt
+s/世界/world/
+w output.txt
+Q
+CMDS
+	printf 'hello world\n' > expected.txt
+	atf_check cmp output.txt expected.txt
+}
+
+atf_test_case unicode_global_substitute
+unicode_global_substitute_head()
+{
+	atf_set "descr" "Global substitute works with UTF-8";
+}
+unicode_global_substitute_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'à la carte\nà bientôt\nhello\n' > input.txt
+	ed -s - <<'CMDS'
+H
+r input.txt
+g/à/s/à/a/
+w output.txt
+Q
+CMDS
+	cat > expected.txt <<'EOF'
+a la carte
+a bientôt
+hello
+EOF
+	atf_check cmp output.txt expected.txt
+}
+
+atf_test_case unicode_join
+unicode_join_head()
+{
+	atf_set "descr" "Join preserves UTF-8 content";
+}
+unicode_join_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	printf 'café\n世界\n' > input.txt
+	ed -s - <<'CMDS'
+H
+r input.txt
+1,2j
+w output.txt
+Q
+CMDS
+	printf 'café世界\n' > expected.txt
+	atf_check cmp output.txt expected.txt
+}
+
+atf_test_case unicode_append
+unicode_append_head()
+{
+	atf_set "descr" "Append preserves UTF-8 text";
+}
+unicode_append_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	ed -s - <<'CMDS'
+H
+a
+première
+deuxième
+.
+w output.txt
+Q
+CMDS
+	cat > expected.txt <<'EOF'
+première
+deuxième
+EOF
+	atf_check cmp output.txt expected.txt
+}
+
+atf_test_case unicode_cyrillic
+unicode_cyrillic_head()
+{
+	atf_set "descr" "Cyrillic: append, substitute, print, regex search";
+}
+unicode_cyrillic_body()
+{
+
+	export LC_CTYPE=C.UTF-8
+	ed -s - <<'CMDS' > output.txt
+H
+a
+Привет
+.
+s/ривет/ока/
+1p
+a
+Строка
+.
+1
+/а/p
+1,$p
+Q
+CMDS
+	cat > expected.txt <<'EOF'
+Пока
+Пока
+Строка
+Пока
+Строка
+EOF
+	atf_check cmp output.txt expected.txt
+}
+
 # ---------------------------------------------------------------------------
 # Registration
 # ---------------------------------------------------------------------------
@@ -1735,6 +2051,23 @@ atf_init_test_cases()
 	atf_add_test_case newline_insert
 	atf_add_test_case newline_search
 
+	# Unicode support
+	atf_add_test_case unicode_list_multibyte
+	atf_add_test_case unicode_list_cjk
+	atf_add_test_case unicode_list_mixed
+	atf_add_test_case unicode_list_invalid
+	atf_add_test_case unicode_list_wrap_cjk
+	atf_add_test_case unicode_print
+	atf_add_test_case unicode_number
+	atf_add_test_case unicode_regex
+	atf_add_test_case unicode_regex_charclass
+	atf_add_test_case unicode_substitute
+	atf_add_test_case unicode_substitute_cjk
+	atf_add_test_case unicode_global_substitute
+	atf_add_test_case unicode_join
+	atf_add_test_case unicode_append
+	atf_add_test_case unicode_cyrillic
+
 	# Error tests
 	atf_add_test_case err_append_suffix
 	atf_add_test_case err_addr_out_of_range