Object Oriented Python: A Concise Tutorial

Object Oriented Python - Files and Strings

Strings

Strings are among the most commonly used data types in every programming language. Why? Because we understand text better than numbers: we write and talk in text and words, and in programming we use strings in the same way. With strings we parse text, analyse its semantics, and mine data, and all of this data is text meant for human consumption. Strings in Python are immutable.

String Manipulation

In Python, a string can be delimited in several ways: with single quotes ('), double quotes ("), or, for multiline strings, triple quotes (''' or """).

>>> # String Examples
>>> a = "hello"
>>> b = ''' A Multi line string,
Simple!'''
>>> e = ('Multiple' 'strings' 'togethers')

String manipulation is useful and widely used in every language. Programmers often need to break strings down and examine them closely.

Strings can be iterated over (character by character), sliced, or concatenated. The syntax is the same as for lists.
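As a minimal sketch of the three operations just described, the snippet below iterates over, slices, and concatenates a string. The variable names are illustrative only. It also shows that item assignment fails, because strings are immutable.

```python
# Iterating, slicing, and concatenating a string with list-like syntax.
word = "Python"

# Iterate character by character.
chars = [ch for ch in word]

# Slice: first three characters, last three characters, and a reversed copy.
head = word[:3]
tail = word[-3:]
reversed_word = word[::-1]

# Concatenate with +; this builds a new string and leaves `word` unchanged.
greeting = "Hello, " + word

# Unlike a list, a string cannot be modified in place.
try:
    word[0] = "J"
except TypeError as exc:
    error_message = str(exc)

print(chars)
print(head, tail, reversed_word)
print(greeting)
print(error_message)
```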

The str class has numerous methods that make manipulating strings easier. In the Python interpreter, the dir and help commands show which methods are available and how to use them.
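One possible way to explore the str class from a script is sketched below. In an interactive session you would normally call help(str.partition) directly; here the docstring that help() displays is printed instead, so the snippet also runs non-interactively. The name public_methods is just an illustrative variable.

```python
# dir(str) lists every attribute of the str class, including special
# methods such as __add__; filter those out to see the everyday methods.
public_methods = [name for name in dir(str) if not name.startswith("_")]

print(len(public_methods), "public methods, for example:")
print(public_methods[:8])

# The docstring of a single method; help(str.partition) shows this text
# together with the method signature.
print(str.partition.__doc__)
```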

Below are some commonly used string methods and related built-in functions.

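Sr.No.  Method         Description
1       isalpha()      Checks whether all characters are alphabetic
2       isdigit()      Checks whether all characters are digits
3       isdecimal()    Checks whether all characters are decimal characters
4       isnumeric()    Checks whether all characters are numeric characters
5       find()         Returns the lowest index at which a substring is found, or -1 if it is absent
6       istitle()      Checks whether the string is titlecased
7       join()         Returns the items of an iterable concatenated into one string
8       lower()        Returns a lowercased copy of the string
9       upper()        Returns an uppercased copy of the string
10      partition()    Splits the string at the first occurrence of a separator and returns a 3-tuple
11      bytearray()    Built-in function (not a str method); returns a mutable array of bytes, for example of a given size
12      enumerate()    Built-in function (not a str method); returns an enumerate object
13      isprintable()  Checks whether all characters are printable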
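The three numeric checks listed above are easy to confuse. The sketch below compares them on a few sample characters; the samples are chosen here for illustration and are not from the original tutorial. Roughly, isdecimal() is the strictest test and isnumeric() the most permissive.

```python
# Compare isdecimal(), isdigit(), and isnumeric() on sample strings.
# "\u00b2" is SUPERSCRIPT TWO and "\u00bd" is VULGAR FRACTION ONE HALF.
samples = ["5", "\u00b2", "\u00bd", "abc"]

results = {
    s: (s.isdecimal(), s.isdigit(), s.isnumeric())
    for s in samples
}

for s, (is_decimal, is_digit, is_numeric) in results.items():
    print(repr(s), "decimal:", is_decimal, "digit:", is_digit, "numeric:", is_numeric)
```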

Let's try running a few string methods:

>>> str1 = 'Hello World!'
>>> str1.startswith('h')
False
>>> str1.startswith('H')
True
>>> str1.endswith('d')
False
>>> str1.endswith('d!')
True
>>> str1.find('o')
4
>>> # find() above returns the index of the first occurrence of the character/substring.
>>> str1.find('lo')
3
>>> str1.upper()
'HELLO WORLD!'
>>> str1.lower()
'hello world!'
>>> str1.index('b')
Traceback (most recent call last):
   File "<pyshell#19>", line 1, in <module>
      str1.index('b')
ValueError: substring not found
>>> s = ('hello How Are You')
>>> s.split(' ')
['hello', 'How', 'Are', 'You']
>>> s1 = s.split(' ')
>>> '*'.join(s1)
'hello*How*Are*You'
>>> s.partition(' ')
('hello', ' ', 'How Are You')
>>>
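The session above uses both find() and index(). As a small sketch of how they differ, the snippet below searches the same string from both ends and looks up a missing substring: find() and rfind() report a miss with -1, whereas index() raises ValueError. The variable names are illustrative.

```python
str1 = "Hello World!"

# find() returns the lowest matching index, rfind() the highest.
first_o = str1.find("o")
last_o = str1.rfind("o")

# A missing substring gives -1 from find() ...
missing = str1.find("b")

# ... but raises ValueError from index().
index_failed = False
try:
    str1.index("b")
except ValueError:
    index_failed = True

print(first_o, last_o, missing, index_failed)
```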

String Formatting

In Python 3.x, string formatting has changed: it is now more logical and more flexible. Formatting can be done with the format() method or with the % operator (old style) applied to a format string.
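To make the two styles concrete, here is a minimal side-by-side sketch. The values are made up for illustration; both lines build the same string.

```python
item = "cart"
count = 3

# Old style: the % operator with printf-like conversion specifiers.
old_style = "%s has %d items" % (item, count)

# New style: the format() method with {} replacement fields.
new_style = "{} has {} items".format(item, count)

print(old_style)
print(new_style)
```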

The format string can contain literal text and replacement fields delimited by braces {}. Each replacement field may contain either the numeric index of a positional argument or the name of a keyword argument.

Syntax

str.format(*args, **kwargs)

Basic Formatting

>>> '{} {}'.format('Example', 'One')
'Example One'
>>> '{} {}'.format('pie', '3.1415926')
'pie 3.1415926'

The example below rearranges the display order without changing the arguments.

>>> '{1} {0}'.format('pie', '3.1415926')
'3.1415926 pie'
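Numeric indexes and keyword names can also be mixed in one format string, and an index may be used more than once. The following is a small sketch with made-up values:

```python
# {0} and {1} refer to positional arguments; {unit} refers to a keyword argument.
template = "{0} and {1}, then {0} again; unit={unit}"
result = template.format("a", "b", unit="cm")

print(result)
```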

Padding and aligning strings

A value can be padded to a specific length.

>>> #Padding Character, can be space or special character
>>> '{:12}'.format('PYTHON')
'PYTHON      '
>>> '{:>12}'.format('PYTHON')
'      PYTHON'
>>> '{:<{}s}'.format('PYTHON',12)
'PYTHON      '
>>> '{:*<12}'.format('PYTHON')
'PYTHON******'
>>> '{:*^12}'.format('PYTHON')
'***PYTHON***'
>>> '{:.15}'.format('PYTHON OBJECT ORIENTED PROGRAMMING')
'PYTHON OBJECT O'
>>> # Above, the string is truncated to its first 15 characters
>>> '{:.{}}'.format('PYTHON OBJECT ORIENTED',15)
'PYTHON OBJECT O'
>>> #Named Placeholders
>>> data = {'Name':'Raghu', 'Place':'Bangalore'}
>>> '{Name} {Place}'.format(**data)
'Raghu Bangalore'
>>> #Datetime
>>> from datetime import datetime
>>> '{:%Y/%m/%d.%H:%M}'.format(datetime(2018,3,26,9,57))
'2018/03/26.09:57'

Strings are Unicode

Strings are immutable sequences of Unicode characters. Unicode strings make it possible to write software that works everywhere, because they can represent any character, not just ASCII characters.

Many IO operations only know how to deal with bytes, even when the bytes object holds textual data. It is therefore important to know how to convert between bytes and Unicode.

Converting text to bytes

Converting a string to a bytes object is called encoding. More broadly, an encoding is a format for representing audio, images, text, and so on as bytes; common examples include PNG, JPEG, MP3, WAV, ASCII, and UTF-8. For strings, only character encodings such as ASCII and UTF-8 apply.

This conversion is done with encode(), which takes the name of the encoding as its argument. The default is 'utf-8'.

>>> # Python Code to demonstrate string encoding
>>>
>>> # Initialising a String
>>> x = 'TutorialsPoint'
>>>
>>> #Initialising a byte object
>>> y = b'TutorialsPoint'
>>>
>>> # Using encode() to encode the String
>>> # encoded version of x is stored in z using ASCII mapping
>>> z = x.encode('ASCII')
>>>
>>> # Check if x is converted to bytes or not
>>>
>>> if(z==y):
   print('Encoding Successful!')
else:
   print('Encoding Unsuccessful!')
Encoding Successful!
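The example above uses only ASCII characters, for which ASCII and UTF-8 produce identical bytes. As a sketch of what happens with a non-ASCII character (the sample word is chosen here for illustration), the default UTF-8 encoding uses two bytes for "é", while ASCII cannot represent it at all unless an error handler is supplied.

```python
text = "caf\u00e9"  # "café"; the last character is outside ASCII

# With no argument, encode() uses UTF-8.
utf8_bytes = text.encode()

# ASCII has no code for "é", so strict encoding raises an error.
ascii_failed = False
try:
    text.encode("ascii")
except UnicodeEncodeError:
    ascii_failed = True

# errors="replace" substitutes "?" for anything that cannot be encoded.
replaced = text.encode("ascii", errors="replace")

print(utf8_bytes, len(text), len(utf8_bytes))
print(ascii_failed, replaced)
```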

Converting bytes to text

Converting bytes to text is called decoding. This is done with decode(). We can convert a byte string to a character string if we know which encoding was used to encode it.

So encoding and decoding are inverse processes.

>>>
>>> # Python code to demonstrate Byte Decoding
>>>
>>> #Initialise a String
>>> x = 'TutorialsPoint'
>>>
>>> #Initialising a byte object
>>> y = b'TutorialsPoint'
>>>
>>> #using decode() to decode the Byte object
>>> # decoded version of y is stored in z using ASCII mapping
>>> z = y.decode('ASCII')
>>> #Check if y is converted to String or not
>>> if (z == x):
   print('Decoding Successful!')
else:
   print('Decoding Unsuccessful!')
Decoding Successful!
>>>
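Decoding only recovers the original text when the same encoding is used in both directions. The sketch below, using a made-up sample string, shows a successful round trip, a silent mismatch when the bytes are decoded as Latin-1, and an error when they are decoded as ASCII.

```python
original = "na\u00efve caf\u00e9"  # "naïve café"
data = original.encode("utf-8")

# Same encoding both ways: the round trip returns the original string.
round_trip = data.decode("utf-8")

# Wrong encoding that accepts every byte: no error, but garbled text.
garbled = data.decode("latin-1")

# Wrong encoding that rejects bytes above 127: decoding fails.
ascii_failed = False
try:
    data.decode("ascii")
except UnicodeDecodeError:
    ascii_failed = True

print(round_trip == original)
print(garbled)
print(ascii_failed)
```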

File I/O

Operating systems represent files as sequences of bytes, not text.

A file is a named location on disk that stores related information. It is used to store data permanently on disk.

In Python, a file operation takes place in the following order.

  1. Open a file

  2. Read from or write to the file (perform the operation)

  3. Close the file.
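The three steps above can be sketched as follows. The file name example.txt and the temporary directory are assumptions made for illustration, so that the snippet does not touch any real files. The second half uses a with statement, which is a common way to have the close step performed automatically.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file inside a throwaway directory.
    path = os.path.join(tmp_dir, "example.txt")

    # 1. Open the file, 2. write to it, 3. close it explicitly.
    f = open(path, "w", encoding="utf-8")
    try:
        f.write("first line\n")
    finally:
        f.close()

    # The with statement closes the file when the block ends.
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()

    closed_after_with = f.closed

print(repr(content))
print(closed_after_with)
```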

Python wraps the incoming (or outgoing) stream of bytes with the appropriate decode (or encode) calls, so we can work directly with str objects.

Opening a file

Python has a built-in function, open(), for opening a file. It returns a file object, also called a handle, which is used to read or modify the file.

>>> f = open(r'c:\users\rajesh\Desktop\index.webm','rb')
>>> f
<_io.BufferedReader name='c:\\users\\rajesh\\Desktop\\index.webm'>
>>> f.mode
'rb'
>>> f.name
'c:\\users\\rajesh\\Desktop\\index.webm'

To read text from a file, we only need to pass the filename to the function. The file is opened for reading, and the bytes are converted to text using the platform default encoding.
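As a rough sketch of the difference between binary and text mode, the snippet below writes a few UTF-8 bytes to a hypothetical file (greeting.txt, in a temporary directory) and reads them back both ways. Because the platform default encoding varies between systems, the text-mode read passes encoding="utf-8" explicitly; locale.getpreferredencoding(False) is printed only to show what the default would typically be on the current machine.

```python
import locale
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file used only for this illustration.
    path = os.path.join(tmp_dir, "greeting.txt")

    # Write raw bytes: the UTF-8 encoding of "héllo".
    with open(path, "wb") as f:
        f.write("h\u00e9llo".encode("utf-8"))

    # Binary mode ('rb') returns the bytes exactly as stored.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode decodes the bytes to str using the given encoding.
    with open(path, encoding="utf-8") as f:
        text = f.read()

print(raw)
print(text)
print("Platform default encoding:", locale.getpreferredencoding(False))
```

Naming the encoding explicitly is a design choice that keeps the result the same across platforms; relying on the default is shorter but depends on the machine's settings.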