Python Text Processing: A Concise Tutorial

Python - Text wrapping

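Text obtained from some source is often not formatted to fit the available screen width, so it needs to be wrapped. This is done with the parawrap package, which can be installed in our environment with the following command.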

pip install parawrap

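The paragraph below is a single continuous string. When we apply the wrap function, the text is split into multiple lines, returned as a list of strings and therefore printed with commas between them.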

import parawrap

text = "In late summer 1945, guests are gathered for the wedding reception of Don Vito Corleone's daughter Connie (Talia Shire) and Carlo Rizzi (Gianni Russo). Vito (Marlon Brando), the head of the Corleone Mafia family, is known to friends and associates as Godfather. He and Tom Hagen (Robert Duvall), the Corleone family lawyer, are hearing requests for favors because, according to Italian tradition, no Sicilian can refuse a request on his daughter's wedding day. One of the men who asks the Don for a favor is Amerigo Bonasera, a successful mortician and acquaintance of the Don, whose daughter was brutally beaten by two young men because she refused their advances; the men received minimal punishment from the presiding judge. The Don is disappointed in Bonasera, who'd avoided most contact with the Don due to Corleone's nefarious business dealings. The Don's wife is godmother to Bonasera's shamed daughter, a relationship the Don uses to extract new loyalty from the undertaker. The Don agrees to have his men punish the young men responsible (in a non-lethal manner) in return for future service if necessary."

print(parawrap.wrap(text))

When we run the above program, we get the following output:

['In late summer 1945, guests are gathered for the wedding reception of', "Don Vito Corleone's daughter Connie (Talia Shire) and Carlo Rizzi", '(Gianni Russo). Vito (Marlon Brando), the head of the Corleone Mafia', 'family, is known to friends and associates as Godfather. He and Tom', 'Hagen (Robert Duvall), the Corleone family lawyer, are hearing', 'requests for favors because, according to Italian tradition, no', "Sicilian can refuse a request on his daughter's wedding day. One of", 'the men who asks the Don for a favor is Amerigo Bonasera, a successful', 'mortician and acquaintance of the Don, whose daughter was brutally', 'beaten by two young men because she refused their advances; the men', 'received minimal punishment from the presiding judge. The Don is', "disappointed in Bonasera, who'd avoided most contact with the Don due", "to Corleone's nefarious business dealings. The Don's wife is godmother", "to Bonasera's shamed daughter, a relationship the Don uses to extract", 'new loyalty from the undertaker. The Don agrees to have his men punish', 'the young men responsible (in a non-lethal manner) in return for', 'future service if necessary.']
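Since wrap returns a list rather than display-ready text, the lines usually need to be joined before printing. The following is a minimal sketch of that step. It uses the standard library textwrap module instead of parawrap (a deliberate swap, so that every call shown is certain to exist), and the short sample string and the width of 40 are illustrative assumptions, not part of the original example.

```python
import textwrap

# Hypothetical short sample; it stands in for the long paragraph above.
sample = (
    "In late summer 1945, guests are gathered for the wedding "
    "reception of Don Vito Corleone's daughter Connie."
)

# wrap() returns a list of lines, each at most `width` characters long.
lines = textwrap.wrap(sample, width=40)

# Join the list with newlines to get text that can be printed directly.
wrapped = "\n".join(lines)
print(wrapped)
```

textwrap.fill(sample, width=40) performs the wrap and the join in a single call and returns the same string.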

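We can also pass a specific width to the wrap function as an argument. When a word does not fit, the function cuts it so that each line stays within the requested width.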

import parawrap

text = "In late summer 1945, guests are gathered for the wedding reception of Don Vito Corleone's daughter Connie (Talia Shire) and Carlo Rizzi (Gianni Russo). Vito (Marlon Brando), the head of the Corleone Mafia family, is known to friends and associates as Godfather. He and Tom Hagen (Robert Duvall), the Corleone family lawyer, are hearing requests for favors because, according to Italian tradition, no Sicilian can refuse a request on his daughter's wedding day. One of the men who asks the Don for a favor is Amerigo Bonasera, a successful mortician and acquaintance of the Don, whose daughter was brutally beaten by two young men because she refused their advances; the men received minimal punishment from the presiding judge. The Don is disappointed in Bonasera, who'd avoided most contact with the Don due to Corleone's nefarious business dealings. The Don's wife is godmother to Bonasera's shamed daughter, a relationship the Don uses to extract new loyalty from the undertaker. The Don agrees to have his men punish the young men responsible (in a non-lethal manner) in return for future service if necessary."

print(parawrap.wrap(text, 5))

When we run the above program, we get the following output:

['In', 'late ', 'summe', 'r', '1945,', 'guest', 's are', 'gathe', 'red', 'for', 'the w', 'eddin', 'g rec', 'eptio', 'n of', 'Don', 'Vito ', 'Corle', "one's", 'daugh', 'ter C', 'onnie', '(Tali', 'a Shi', 're)', 'and', 'Carlo', 'Rizzi', '(Gian', 'ni Ru', 'sso).', 'Vito ', '(Marl', 'on Br', 'ando)', ', the', 'head', 'of', 'the C', 'orleo', 'ne', 'Mafia', 'famil', 'y, is', 'known', 'to fr', 'iends', 'and a', 'ssoci', 'ates', 'as Go', 'dfath', 'er.', 'He', 'and', 'Tom', 'Hagen', '(Robe', 'rt Du', 'vall)', ', the', 'Corle', 'one f', 'amily', 'lawye', 'r,', 'are h', 'earin', 'g req', 'uests', 'for f', 'avors', 'becau', 'se, a', 'ccord', 'ing', 'to It', 'alian', 'tradi', 'tion,', 'no Si', 'cilia', 'n can', 'refus', 'e a r', 'eques', 't on', 'his d', 'aught', "er's ", 'weddi', 'ng', 'day.', 'One', 'of', 'the', 'men', 'who', 'asks', 'the', 'Don', 'for a', 'favor', 'is Am', 'erigo', 'Bonas', 'era,', 'a suc', 'cessf', 'ul mo', 'rtici', 'an', 'and a', 'cquai', 'ntanc', 'e of', 'the', 'Don,', 'whose', 'daugh', 'ter', 'was b', 'rutal', 'ly be', 'aten', 'by', 'two', 'young', 'men b', 'ecaus', 'e she', 'refus', 'ed', 'their', 'advan', 'ces;', 'the', 'men r', 'eceiv', 'ed mi', 'nimal', 'punis', 'hment', 'from', 'the p', 'resid', 'ing j', 'udge.', 'The', 'Don', 'is di', 'sappo', 'inted', 'in Bo', 'naser', 'a,', "who'd", 'avoid', 'ed', 'most ', 'conta', 'ct', 'with', 'the', 'Don', 'due', 'to Co', 'rleon', "e's n", 'efari', 'ous b', 'usine', 'ss de', 'aling', 's.', 'The', "Don's", 'wife', 'is go', 'dmoth', 'er to', 'Bonas', "era's", 'shame', 'd dau', 'ghter', ', a r', 'elati', 'onshi', 'p the', 'Don', 'uses', 'to ex', 'tract', 'new l', 'oyalt', 'y', 'from', 'the u', 'ndert', 'aker.', 'The', 'Don a', 'grees', 'to', 'have', 'his', 'men p', 'unish', 'the', 'young', 'men r', 'espon', 'sible', '(in a', 'non-l', 'ethal', 'manne', 'r) in', 'retur', 'n for', 'futur', 'e ser', 'vice', 'if ne', 'cessa', 'ry.']
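Cutting words to honor a narrow width is not always what is wanted. As a hedged sketch, again using the standard library textwrap module rather than parawrap, the break_long_words option controls this behavior. Whether parawrap accepts the same option is not confirmed by this tutorial, and the sample string is an illustrative assumption.

```python
import textwrap

# Hypothetical short sample taken from the opening of the paragraph above.
sample = "guests are gathered for the wedding reception"

# Default behaviour: words longer than the width are cut to fit.
cut = textwrap.wrap(sample, width=5)

# break_long_words=False keeps every word whole, so lines holding a
# long word are allowed to exceed the requested width.
whole = textwrap.wrap(sample, width=5, break_long_words=False)

print(cut)
print(whole)
```

The first list stays strictly within five characters per line, while the second trades the width guarantee for readable, unbroken words.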