Diffstat (limited to 'man/man1/tcs.1')
-rw-r--r-- man/man1/tcs.1 167
1 files changed, 167 insertions, 0 deletions
diff --git a/man/man1/tcs.1 b/man/man1/tcs.1
new file mode 100644
index 00000000..e1a410c3
--- /dev/null
+++ b/man/man1/tcs.1
@@ -0,0 +1,167 @@
+.TH TCS 1
+.SH NAME
+tcs \- translate character sets
+.SH SYNOPSIS
+.B tcs
+[
+.B -slcv
+]
+[
+.B -f
+.I ics
+]
+[
+.B -t
+.I ocs
+]
+[
+.I file ...
+]
+.SH DESCRIPTION
+.I Tcs
+interprets the named
+.I file(s)
+(standard input default) as a stream of characters from the
+.I ics
+character set or format, converts them to runes,
+and then converts them into a stream of characters from the
+.I ocs
+character set or format on the standard output.
+The default value for
+.I ics
+and
+.I ocs
+is
+.BR utf ,
+the
+.SM UTF
+encoding described in
+.IR utf (7).
+The
+.B -l
+option lists the character sets known to
+.IR tcs .
+Processing continues in the face of conversion errors (the
+.B -s
+option prevents reporting of these errors).
+The
+.B -c
+option forces the output to contain only correctly converted characters;
+otherwise,
+.B 0x80
+characters will be substituted for
+.SM UTF
+encoding errors and
+.B 0xFFFD
+characters will be substituted for unknown characters.
+.PP
+The
+.B -v
+option generates various diagnostic and summary information on standard error,
+or makes the
+.B -l
+output more verbose.
+.PP
+.I Tcs
+recognizes an ever-changing list of character sets.
+In particular, it supports a variety of Russian and Japanese encodings.
+Some of the supported encodings are
+.TF jis-kanji
+.TP
+.B utf
+The Plan 9
+.SM UTF
+encoding, known by ISO as UTF-8
+.TP
+.B utf1
+The deprecated original
+.SM UTF
+encoding from ISO 10646
+.TP
+.B ascii
+7-bit ASCII
+.TP
+.B 8859-1
+Latin-1 (Western European)
+.TP
+.B 8859-2
+Latin-2 (Czech .. Slovak)
+.TP
+.B 8859-3
+Latin-3 (Dutch .. Turkish)
+.TP
+.B 8859-4
+Latin-4 (Scandinavian)
+.TP
+.B 8859-5
+Part 5 (Cyrillic)
+.TP
+.B 8859-6
+Part 6 (Arabic)
+.TP
+.B 8859-7
+Part 7 (Greek)
+.TP
+.B 8859-8
+Part 8 (Hebrew)
+.TP
+.B 8859-9
+Latin-5 (Finnish .. Portuguese)
+.TP
+.B koi8
+KOI-8 (GOST 19769-74)
+.TP
+.B jis-kanji
+ISO 2022-JP
+.TP
+.B ujis
+EUC-JX: JIS 0208
+.TP
+.B ms-kanji
+Microsoft, or Shift-JIS
+.TP
+.B jis
+(from only) guesses among ISO 2022-JP, EUC, or Shift-JIS
+.TP
+.B gb
+Chinese national standard (GB2312-80)
+.TP
+.B big5
+Big 5 (HKU version)
+.TP
+.B unicode
+Unicode Standard 1.0
+.TP
+.B tis
+Thai character set plus
+.SM ASCII
+(TIS 620-1986)
+.TP
+.B msdos
+IBM PC: CP 437
+.TP
+.B atari
+Atari-ST character set
+.SH EXAMPLES
+.TP
+.B tcs -f 8859-1
+Convert 8859-1 (Latin-1) characters into
+.SM UTF
+format.
+.TP
+.B tcs -s -f jis
+Convert characters encoded in one of several JIS encodings into
+.SM UTF
+format.
+Unknown Kanji will be converted into
+.B 0xFFFD
+characters.
+.TP
+.B tcs -lv
+Print an up-to-date list of the supported character sets.
+.SH SOURCE
+.B /usr/local/plan9/src/cmd/tcs
+.SH SEE ALSO
+.IR ascii (1),
+.IR rune (3),
+.IR utf (7).
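
The DESCRIPTION in this page describes a two-stage conversion: bytes in the ics encoding are turned into runes (Unicode code points), which are then re-encoded as ocs. The Python sketch below illustrates that pipeline only; it is not the tcs implementation. The translate function, the CODECS table, and its mapping of tcs names to Python codec names are assumptions made for this example. The substitution behavior is also only an approximation: Python substitutes U+FFFD on decode errors and "?" when a rune cannot be encoded in a non-Unicode target, which differs from the 0x80 and 0xFFFD values the page documents.

```python
# Hypothetical mapping from a few tcs character-set names to Python codecs.
# The pairing is an assumption for illustration, not taken from tcs itself.
CODECS = {
    "utf": "utf-8",
    "ascii": "ascii",
    "8859-1": "iso8859-1",
    "8859-2": "iso8859-2",
    "msdos": "cp437",
}


def translate(data: bytes, ics: str = "utf", ocs: str = "utf",
              strict: bool = False) -> bytes:
    """Decode data from ics into code points, then encode them as ocs.

    strict=True loosely mirrors -c: anything that fails to convert is
    dropped. Otherwise Python's replacement character is substituted.
    """
    errors = "ignore" if strict else "replace"
    runes = data.decode(CODECS[ics], errors=errors)    # stage 1: bytes -> runes
    return runes.encode(CODECS[ocs], errors=errors)    # stage 2: runes -> bytes


if __name__ == "__main__":
    latin1 = b"caf\xe9"                       # "café" in ISO 8859-1
    print(translate(latin1, ics="8859-1"))    # the same text as UTF-8 bytes
```

Both defaults are "utf", matching the documented default for ics and ocs, so calling translate with no character-set arguments re-encodes UTF-8 input as UTF-8 and only rewrites malformed sequences.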