Biopython: A Concise Tutorial

Biopython - Genome Analysis

A genome is the complete set of an organism's DNA, including all of its genes. Genome analysis refers to the study of these genes and their roles in inheritance.

Genome Diagram

A genome diagram presents genetic information graphically. Biopython provides the Bio.Graphics.GenomeDiagram module for drawing such diagrams. The GenomeDiagram module requires ReportLab to be installed.

Steps for creating a diagram

Creating a diagram generally follows this simple pattern:

  1. Create a FeatureSet for each separate set of features you want to display, and add Bio.SeqFeature objects to them.

  2. Create a GraphSet for each graph you want to display, and add graph data to them.

  3. Create a Track for each track you want on the diagram, and add GraphSets and FeatureSets to the tracks you require.

  4. Create a Diagram, and add the Tracks to it.

  5. Tell the Diagram to draw the image.

  6. Write the image to a file.

As an example, take a GenBank file such as https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.gbk as input, read it into a SeqRecord object, and finally draw a genome diagram. The steps are explained below.

First, import the required modules:

>>> from reportlab.lib import colors
>>> from reportlab.lib.units import cm
>>> from Bio.Graphics import GenomeDiagram

Now, import the SeqIO module and read the data:

>>> from Bio import SeqIO
>>> # SeqIO.read expects the file to contain exactly one record
>>> record = SeqIO.read("example.gb", "genbank")

Here, record holds the SeqRecord object read from the GenBank file, including its sequence and annotated features.
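To make the reading step concrete without depending on a local file, the sketch below writes a small, made-up record to an in-memory GenBank file and reads it back. The record ID, sequence, and gene name are hypothetical. It also shows that SeqIO.read raises ValueError when the input holds more than one record, in which case SeqIO.parse is the call to use.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Hypothetical record; the GenBank writer needs the molecule_type annotation.
record = SeqRecord(
    Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"),
    id="DEMO0001",
    name="DEMO0001",
    description="hypothetical demo record",
    annotations={"molecule_type": "DNA"},
)
record.features.append(
    SeqFeature(
        FeatureLocation(0, 12, strand=+1),
        type="gene",
        qualifiers={"gene": ["demoA"]},
    )
)

# Single-record file: SeqIO.read returns one SeqRecord.
single_handle = StringIO()
SeqIO.write(record, single_handle, "genbank")
single_handle.seek(0)
loaded = SeqIO.read(single_handle, "genbank")
gene_features = [f for f in loaded.features if f.type == "gene"]
print(loaded.id, len(loaded.seq), len(gene_features))

# Multi-record file: SeqIO.read refuses it, SeqIO.parse iterates over it.
multi_handle = StringIO()
SeqIO.write([record, record], multi_handle, "genbank")
multi_handle.seek(0)
try:
    SeqIO.read(multi_handle, "genbank")
    read_failed = False
except ValueError:
    read_failed = True
multi_handle.seek(0)
all_records = list(SeqIO.parse(multi_handle, "genbank"))
print(read_failed, len(all_records))
```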

Now, create an empty diagram, then add a track and a feature set to it:

>>> diagram = GenomeDiagram.Diagram(
...     "Yersinia pestis biovar Microtus plasmid pPCP1")
>>> track = diagram.new_track(1, name="Annotated Features")
>>> feature_set = track.new_set()

Now, add the gene features to the feature set, alternating their color between blue and red:

>>> for feature in record.features:
...     if feature.type != "gene":
...         continue  # skip everything that is not a gene
...     # Alternate colors based on how many features were added so far
...     if len(feature_set) % 2 == 0:
...         color = colors.blue
...     else:
...         color = colors.red
...     feature_set.add_feature(feature, color=color, label=True)
...

The interactive session prints a response like the following, one line per added feature (the memory addresses will differ):

<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d3dc90>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d3dfd0>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x1007627d0>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d57290>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d57050>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d57390>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d57590>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d57410>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d57490>
<Bio.Graphics.GenomeDiagram._Feature.Feature object at 0x105d574d0>
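One common way to alternate colors is to base the choice on how many features have been added so far, so that skipped non-gene features do not disturb the pattern. The plain-Python sketch below illustrates only that logic: a list stands in for the GenomeDiagram feature set, and the feature types are made up.

```python
# Plain-Python stand-in: a list plays the role of the feature set.
feature_types = ["gene", "CDS", "gene", "CDS", "gene", "misc_feature", "gene"]

added = []  # grows by one each time a gene is "added"
for feature_type in feature_types:
    if feature_type != "gene":
        continue  # skipped features do not affect the alternation
    color = "blue" if len(added) % 2 == 0 else "red"
    added.append((feature_type, color))

assigned_colors = [color for _, color in added]
print(assigned_colors)  # ['blue', 'red', 'blue', 'red']
```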

Let us draw a diagram for the above input record and write it out in several formats:

>>> diagram.draw(
...     format="linear", orientation="landscape", pagesize="A4",
...     fragments=4, start=0, end=len(record))
>>> diagram.write("orchid.pdf", "PDF")
>>> diagram.write("orchid.eps", "EPS")
>>> diagram.write("orchid.svg", "SVG")
>>> diagram.write("orchid.png", "PNG")

After running the above commands, the image files are saved in your current working directory. The result looks like the following image.

[Image: genome.png, the resulting linear genome diagram]

You can also draw the diagram in circular format by making the following changes:

>>> diagram.draw(
...     format="circular", circular=True, pagesize=(20*cm, 20*cm),
...     start=0, end=len(record), circle_core=0.7)
>>> diagram.write("circular.pdf", "PDF")

Chromosomes Overview

In the cell, each DNA molecule is packaged into a thread-like structure called a chromosome. Each chromosome is made up of DNA tightly coiled many times around proteins called histones, which support its structure.

Chromosomes are not visible in the cell's nucleus, not even under a microscope, when the cell is not dividing. However, the DNA that makes up chromosomes becomes more tightly packed during cell division and is then visible under a microscope.

In humans, each cell normally contains 23 pairs of chromosomes, for a total of 46. Twenty-two of these pairs, called autosomes, look the same in both males and females. The 23rd pair, the sex chromosomes, differs between males and females. Females have two copies of the X chromosome, while males have one X and one Y chromosome.