Character code of python | Laptrinhcanban.com

HOME › >>

Character code of python

Posted at: https://laptrinhcanban.com/en

In the current python version, python processes the python program archive file with default that the file is already written in UTF-8 charset, so if the python file is written with a non-UTF-8 character code, It is necessary to specify which Character code of python is used to avoid a decoding error when running the program. This article is especially useful for you to use computers with operating systems installed in non-English languages ​​such as Japanese, Hebrew, Korean, Chinese …

Put the character code used in the python program archive file

Character encoding in case the python file is written with UTF-8 character encoding

Python processes the python program archive file with default that the file has been written in UTF-8, so for python files written and saved in UTF-8, we do not need put character code in python file.
For example we write and store the python program below in UTF-8 character code.

print ( "Hello" )

When we try to run the program written in this python file, the following result shows that python has successfully processed the program:

Run the program written in the python file

As you can see, since we have written the program in the file with the UTF-8 charset, python with the default that the file to be processed is written in UTF-8 has no problems decoding. characters. And the program has been handled successfully.

Character encoding in case the python file is written with character code other than UTF-8

If the characters in the python program file are written in a character code other than UTF-8, which common for computers with non-English languages such as Japanese, Korean, Chinese , so that python can decode these character codes when running the program, we need to put the character code in the python file.

The syntax for placing the character code used in the python program archive file is as follows:

# coding : encoding
Or
# coding = encoding

For example:

# coding: shift_jis

Again, in a Linux environment, if on the first line of the python program archive files it says #! / Usr / bin / env python3 , use the second character encoding above.

Let’s take a look at an example where the python file using Japanese is written with the following Shift_JIS character code:

print ("こんにちは")

python file containing Japanese is written in Shift_JIS character code
※こ ん に ち は means hello in Japanese

When we try to run this python file, since we do not put the Shift_JIS character code used in writing the file, python cannot decode it, and the error occurs:

SyntaxError: Non-UTF-8 code starting with '\x82' in file sample5-2.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Lỗi giải mã ký tự

To solve this error, we proceed to set the character code used in this python program archive file Shift_JIS by adding the first line in this python file with the following line of code:

# coding: shift_jis

The above python file will have the following content.

# coding: shift_jis
print ("こんにちは")

python file containing Japanese is written with Shift_JIS character code that has been coded

After saving the file above, let us try to run the program. As a result, the program runs smoothly as follows:

The result after placing the character code used in the python program archive file
As above, when writing and saving a python program into a file, if for no special reason we will use the UTF-8 character code.
If for any special reason you want to use character code other than UTF-8 to write and save the python file, you need to put the character code used in that file.

Common types of character codes are used when placing character codes for use in python files

Common types of character codes to use when placing character codes for use in python files are as follows:

Codec (Character code name)Another way to call
ascii646, us-ascii
cp932932, ms932, mskanji, ms-kanji
euc_jpeucjp, ujis, u-jis
iso2022_jpcsiso2022jp, iso2022jp, iso-2022-jp
shift_jiscsshiftjis, shiftjis, sjis, s_jis
utf_8U8, UTF, utf8

Summary

Above, Kiyoshi showed you how to place the character code used in the python program archive file. Kiyoshi thinks you already have the basics.

Let’s learn more about Python in the next lessons.

URL Link

https://laptrinhcanban.com/en/python/nhap-mon-lap-trinh-python/kien-thuc-can-ban-ve-chuong-trinh-python/dat-ma-ky-trong-file-python/

Thank you for reading and please Like & Share for other friends to learn.
">

HOME  › >>

  • Recent Posts
Profile
きよしです!笑

Author: Kiyoshi (Chis Thanh)

Kiyoshi was a former international student in Japan. After graduating from Toyama University in 2017, Kiyoshi is currently working at BrSE in Tokyo, Japan. Kiyoshi là một cựu du học sinh tại Nhật Bản. Sau khi tốt nghiệp đại học Toyama năm 2017, Kiyoshi hiện đang làm BrSE tại Tokyo, Nhật Bản.