Batch conversion of Simplified Chinese articles to Traditional Chinese
Converting between Simplified and Traditional Chinese is a common requirement, for example for multilingual websites, multilingual operation manuals, and input methods. I recently noticed that some Traditional Chinese readers visit this site, so I decided to add Traditional Chinese support.
On Linux, the conversion works as follows:
1. Install opencc
sudo apt-get install opencc
OpenCC is an open-source project; thanks to its contributors for their selfless work.
2. Example of single file conversion:
➜ ~ cat simplified.txt
静夜思
床前明月光,疑是地上霜。
举头望明月,低头思故乡。
➜ ~ opencc -i simplified.txt -o traditional.txt -c s2t.json
➜ ~ cat traditional.txt
靜夜思
牀前明月光,疑是地上霜。
舉頭望明月,低頭思故鄉。
That's all it takes: the Simplified-to-Traditional conversion of 《静夜思》 is complete.
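For quick checks it can be handy to convert a string without creating files. The sketch below assumes that opencc reads standard input when -i is omitted and writes to standard output when -o is omitted, which the optional [-i <file>] and [-o <file>] in its usage text suggest. The function name s2t is made up for this example.

```shell
# Hypothetical helper (not part of OpenCC): convert Simplified text on stdin
# to Traditional text on stdout.
# Assumption: opencc falls back to stdin/stdout when -i and -o are omitted.
s2t() {
  if ! command -v opencc >/dev/null 2>&1; then
    echo "opencc not found; install it first (see step 1)" >&2
    return 127
  fi
  opencc -c s2t.json
}

# Usage:
#   printf '%s\n' '举头望明月,低头思故乡。' | s2t
```

The missing-command check only makes the failure easier to read on a machine where step 1 has not been done yet.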
Parameter description
-i input file
-o output file
-c configuration file
s2t.json: Simplified Chinese to Traditional Chinese
t2s.json: Traditional Chinese to Simplified Chinese
s2tw.json: Simplified Chinese to Traditional Chinese (Taiwan Standard)
tw2s.json: Traditional Chinese (Taiwan Standard) to Simplified Chinese
s2hk.json: Simplified Chinese to Traditional Chinese (Hong Kong variant)
hk2s.json: Traditional Chinese (Hong Kong variant) to Simplified Chinese
s2twp.json: Simplified Chinese to Traditional Chinese (Taiwan Standard) with Taiwanese idiom
tw2sp.json: Traditional Chinese (Taiwan Standard) to Simplified Chinese with Mainland Chinese idiom
t2tw.json: Traditional Chinese (OpenCC Standard) to Taiwan Standard
hk2t.json: Traditional Chinese (Hong Kong variant) to Traditional Chinese (OpenCC Standard)
t2hk.json: Traditional Chinese (OpenCC Standard) to Hong Kong variant
t2jp.json: Traditional Chinese Characters (Kyūjitai) to New Japanese Kanji (Shinjitai)
jp2t.json: New Japanese Kanji (Shinjitai) to Traditional Chinese Characters (Kyūjitai)
tw2t.json: Traditional Chinese (Taiwan Standard) to Traditional Chinese (OpenCC Standard)
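When a script has to produce several variants, the configuration name can be picked from a target locale. The following is a minimal sketch: the function name opencc_config_for and the locale tags (zh-hant, zh-tw, zh-hk) are assumptions made for illustration, while the configuration file names come from the list above.

```shell
# Hypothetical helper: map a target locale tag to an OpenCC configuration
# for converting FROM Simplified Chinese. The tags are illustrative only.
opencc_config_for() {
  case "$1" in
    zh-hant) echo "s2t.json" ;;   # Traditional Chinese (OpenCC Standard)
    zh-tw)   echo "s2tw.json" ;;  # Taiwan Standard
    zh-hk)   echo "s2hk.json" ;;  # Hong Kong variant
    *)
      echo "unknown target locale: $1" >&2
      return 1
      ;;
  esac
}

# Usage:
#   opencc -i simplified.txt -o traditional.txt -c "$(opencc_config_for zh-tw)"
```

To also replace vocabulary with Taiwanese idiom, s2twp.json could be returned for zh-tw instead of s2tw.json.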
➜ ~ opencc --help

Open Chinese Convert (OpenCC) Command Line Tool
Author: Carbo Kuo <byvoid@byvoid.com>
Bug Report: http://github.com/BYVoid/OpenCC/issues

Usage:

   opencc [--noflush <bool>] [-i <file>] [-o <file>] [-c <file>] [--]
          [--version] [-h]

Options:

   --noflush <bool>
     Disable flush for every line

   -i <file>, --input <file>
     Read original text from <file>.

   -o <file>, --output <file>
     Write converted text to <file>.

   -c <file>, --config <file>
     Configuration file

   --, --ignore_rest
     Ignores the rest of the labeled arguments following this flag.

   --version
     Displays version information and exits.

   -h, --help
     Displays usage information and exits.


   Open Chinese Convert (OpenCC) Command Line Tool
3. Multi-file batch conversion
Use the fd command. For example, to batch-convert Markdown files:
fd -e .md -x opencc -i {} -o {/.}.zh-tw.md -c s2t.json
- fd -e .md finds files ending with .md.
- -x/--exec runs a command on each result of the search.
- opencc -i {} -o {/.}.zh-tw.md -c s2t.json performs the Simplified-to-Traditional conversion for each file. {} and {/.} are fd placeholder syntax. In this example, an input of aaa.md produces the output aaa.zh-tw.md. The output name is written this way because of the multilingual naming rule of my blog system; readers can adapt it using the placeholders described below:
  - {}: A placeholder token that will be replaced with the path of the search result (documents/images/party.jpg).
  - {.}: Like {}, but without the file extension (documents/images/party).
  - {/}: A placeholder that will be replaced by the basename of the search result (party.jpg).
  - {//}: The parent of the discovered path (documents/images).
  - {/.}: The basename, with the extension removed (party).
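The placeholders correspond roughly to ordinary shell parameter expansion, which may help when predicting an output name before running the batch. This is only an approximation for illustration; expand_placeholder is a made-up name, and fd's own handling of unusual paths (for example, files without an extension) may differ.

```shell
# Hypothetical helper: approximate what an fd placeholder expands to
# for a given path, using plain POSIX shell expansion.
expand_placeholder() {
  token=$1
  path=$2
  base=${path##*/}                          # strip everything up to the last /
  case "$token" in
    '{}')   printf '%s\n' "$path" ;;        # full path
    '{.}')  printf '%s\n' "${path%.*}" ;;   # path without extension
    '{/}')  printf '%s\n' "$base" ;;        # basename
    '{//}') dirname "$path" ;;              # parent directory
    '{/.}') printf '%s\n' "${base%.*}" ;;   # basename without extension
    *)      return 1 ;;
  esac
}

# Usage: the output name used in the batch command for aaa.md
#   echo "$(expand_placeholder '{/.}' aaa.md).zh-tw.md"
```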
To convert other files, just adjust the search pattern and the output file name. Using the opencc and fd commands, I converted thousands of articles from Simplified to Traditional in a little over ten minutes. Switching between the various Chinese character variants is convenient and easy.
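Where fd is not available, a similar batch can be sketched with find. This variant differs from the fd command in two ways that are my own choices, not the author's setup: it writes each output next to its source file, and it skips files already named *.zh-tw.md so that a second run does not convert earlier results again. The function name batch_s2t and the dry-run default are hypothetical.

```shell
# Hypothetical helper: convert every .md file under a directory with opencc.
# By default it only prints the commands (dry run); pass "run" to execute them.
batch_s2t() {
  find "${1:-.}" -type f -name '*.md' ! -name '*.zh-tw.md' -exec sh -c '
    mode=$1
    shift
    for src do
      dst="${src%.md}.zh-tw.md"   # aaa.md -> aaa.zh-tw.md, same directory
      if [ "$mode" = "run" ]; then
        opencc -i "$src" -o "$dst" -c s2t.json
      else
        printf "opencc -i %s -o %s -c s2t.json\n" "$src" "$dst"
      fi
    done
  ' sh "${2:-dry-run}" {} +
}

# Usage:
#   batch_s2t ./content        # print the commands that would run
#   batch_s2t ./content run    # perform the conversion
```

Checking the dry-run output first is a cheap way to confirm the output names before thousands of files are written.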
Copyright statement:
- All content without a stated source is original. Please do not reprint it without authorization (reprints often have broken typesetting, their content cannot be controlled, and they cannot be kept up to date).
- For non-profit purposes, if you adapt any content of this blog, please give the relevant page address on this site as the 'original source' or a 'reference link' (for the convenience of readers).