Python Text Processing: A Concise Tutorial

Python - Backward File Reading


Install the file-read-backwards package with pip:

pip install file-read-backwards


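# Read the file forward first, so the result can be compared with the backward read
with open(r"Path\GodFather.txt", "r") as BigFile:
    data = BigFile.readlines()

# Print each line with its index
for i in range(len(data)):
    print("Line No- ", i)
    print(data[i])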


Line No-  0
Vito Corleone is the aging don (head) of the Corleone Mafia Family.

Line No-  1
His youngest son Michael has returned from WWII just in time to see the wedding of Connie Corleone (Michael's sister) to Carlo Rizzi.

Line No-  2
All of Michael's family is involved with the Mafia, but Michael just wants to live a normal life. Drug dealer Virgil Sollozzo is looking for Mafia families to offer him protection in exchange for a profit of the drug money.

Line No-  3
He approaches Don Corleone about it, but, much against the advice of the Don's lawyer Tom Hagen, the Don is morally against the use of drugs, and turns down the offer.

Line No-  4
This does not please Sollozzo, who has the Don shot down by some of his hit men.

Line No-  5
The Don barely survives, which leads his son Michael to begin a violent mob war against Sollozzo and tears the Corleone family apart.

Reading Lines Backward


from file_read_backwards import FileReadBackwards

with FileReadBackwards(r"Path\GodFather.txt", encoding="utf-8") as BigFile:
    # Iterate over the lines, starting from the last line and moving up
    for line in BigFile:
        print(line)


The Don barely survives, which leads his son Michael to begin a violent mob war against Sollozzo and tears the Corleone family apart.
This does not please Sollozzo, who has the Don shot down by some of his hit men.
He approaches Don Corleone about it, but, much against the advice of the Don's lawyer Tom Hagen, the Don is morally against the use of drugs, and turns down the offer.
All of Michael's family is involved with the Mafia, but Michael just wants to live a normal life. Drug dealer Virgil Sollozzo is looking for Mafia families to offer him protection in exchange for a profit of the drug money.
His youngest son Michael has returned from WWII just in time to see the wedding of Connie Corleone (Michael's sister) to Carlo Rizzi.
Vito Corleone is the aging don (head) of the Corleone Mafia Family.
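To illustrate the general idea of reading lines from the end of a file without loading it all into memory, here is a minimal standard-library sketch. It is not the implementation used by file-read-backwards. The helper name read_lines_backward and its block_size parameter are hypothetical, and the sketch assumes UTF-8 (or another ASCII-compatible encoding) with "\n" or "\r\n" line endings.

```python
import os


def read_lines_backward(path, encoding="utf-8", block_size=4096):
    """Yield the lines of a text file from last to first, without newlines.

    Hypothetical helper for illustration only. It reads fixed-size blocks
    starting at the end of the file, so the whole file is never held in memory.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        if position == 0:
            return  # empty file: nothing to yield

        tail = b""          # start of a line whose beginning is not read yet
        first_block = True
        while position > 0:
            read_size = min(block_size, position)
            position -= read_size
            f.seek(position)
            block = f.read(read_size) + tail
            pieces = block.split(b"\n")

            # A newline at the very end of the file does not start a new line.
            if first_block and pieces[-1] == b"":
                pieces.pop()
            first_block = False

            # The first piece may be cut off; keep it for the next block.
            tail = pieces[0]
            for raw in reversed(pieces[1:]):
                yield raw.rstrip(b"\r").decode(encoding)

        # Whatever remains is the first line of the file.
        yield tail.rstrip(b"\r").decode(encoding)
```

Calling list(read_lines_backward(r"Path\GodFather.txt")) would give the same lines as the listing above, in the same order. For a small file, iterating over reversed(data) after readlines() is simpler; the block-based approach matters mainly for files too large to load at once.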


Reading Words Backward

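We can also read the words in the file backward. To do this, we first read the lines backward and then tokenize each line into words, reversing the resulting tokens with the reverse() method. In the example below, we use the file-read-backwards package together with the nltk module to print the word tokens from the same file in reverse order.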

import nltk
from file_read_backwards import FileReadBackwards

with FileReadBackwards(r"Path\GodFather.txt", encoding="utf-8") as BigFile:
    # Read the lines from the last one up, tokenize each line,
    # and reverse the token order in place with reverse()
    for line in BigFile:
        word_data = line
        nltk_tokens = nltk.word_tokenize(word_data)
        nltk_tokens.reverse()
        print(nltk_tokens)

When we run the above program, we get the following output:

['.', 'apart', 'family', 'Corleone', 'the', 'tears', 'and', 'Sollozzo', 'against', 'war', 'mob', 'violent', 'a', 'begin', 'to', 'Michael', 'son', 'his', 'leads', 'which', ',', 'survives', 'barely', 'Don', 'The']
['.', 'men', 'hit', 'his', 'of', 'some', 'by', 'down', 'shot', 'Don', 'the', 'has', 'who', ',', 'Sollozzo', 'please', 'not', 'does', 'This']
['.', 'offer', 'the', 'down', 'turns', 'and', ',', 'drugs', 'of', 'use', 'the', 'against', 'morally', 'is', 'Don', 'the', ',', 'Hagen', 'Tom', 'lawyer', "'s", 'Don', 'the', 'of', 'advice', 'the', 'against', 'much', ',', 'but', ',', 'it', 'about', 'Corleone', 'Don', 'approaches', 'He']
['.', 'money', 'drug', 'the', 'of', 'profit', 'a', 'for', 'exchange', 'in', 'protection', 'him', 'offer', 'to', 'families', 'Mafia', 'for', 'looking', 'is', 'Sollozzo', 'Virgil', 'dealer', 'Drug', '.', 'life', 'normal', 'a', 'live', 'to', 'wants', 'just', 'Michael', 'but', ',', 'Mafia', 'the', 'with', 'involved', 'is', 'family', "'s", 'Michael', 'of', 'All']
['.', 'Rizzi', 'Carlo', 'to', ')', 'sister', "'s", 'Michael', '(', 'Corleone', 'Connie', 'of', 'wedding', 'the', 'see', 'to', 'time', 'in', 'just', 'WWII', 'from', 'returned', 'has', 'Michael', 'son', 'youngest', 'His']
['.', 'Family', 'Mafia', 'Corleone', 'the', 'of', ')', 'head', '(', 'don', 'aging', 'the', 'is', 'Corleone', 'Vito']
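If nltk is not available, the same reverse-the-tokens step can be sketched with a simple regular expression. The helper reversed_tokens below is hypothetical and only a rough stand-in for nltk.word_tokenize: it separates words from punctuation but does not reproduce NLTK's handling of contractions and possessives such as "Don's". The two sample lines are taken from the file used above.

```python
import re


def reversed_tokens(line):
    """Return the word and punctuation tokens of a line, last token first.

    Hypothetical helper: a regex approximation, not NLTK's tokenizer.
    """
    # \w+ matches a run of word characters; [^\w\s] matches one punctuation mark.
    tokens = re.findall(r"\w+|[^\w\s]", line)
    tokens.reverse()
    return tokens


lines = [
    "Vito Corleone is the aging don (head) of the Corleone Mafia Family.",
    "This does not please Sollozzo, who has the Don shot down by some of his hit men.",
]

# Walk the lines from last to first, then reverse the tokens inside each line.
for line in reversed(lines):
    print(reversed_tokens(line))
```

For these two lines the printed lists match the corresponding nltk output above, because neither line contains an apostrophe; lines with possessives would be split differently.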