Ms Sql Server 简明教程
MS SQL Server - Architecture
为了便于理解,我们将 SQL Server 的架构分为以下部分 −
We have classified the architecture of SQL Server into the following parts for easy understanding −
-
General architecture
-
Memory architecture
-
Data file architecture
-
Log file architecture
General Architecture
Client − 请求发起的地方。
Client − Where the request initiated.
Query − 作为高级语言的 SQL 查询。
Query − SQL query which is high level language.
Logical Units − 关键字、表达式和运算符等。
Logical Units − Keywords, expressions and operators, etc.
N/W Packets − 与网络相关的代码。
N/W Packets − Network related code.
Protocols − 在 SQL Server 中我们有 4 个协议。
Protocols − In SQL Server we have 4 protocols.
-
Shared memory (for local connections and troubleshooting purpose).
-
Named pipes (for connections which are in LAN connectivity).
-
TCP/IP (for connections which are in WAN connectivity).
-
VIA-Virtual Interface Adapter (requires special hardware to set up by vendor and also deprecated from SQL 2012 version).
Server − SQL 服务已安装的位置以及数据库所在的目录。
Server − Where SQL Services got installed and databases reside.
Relational Engine − 此处将执行实际执行。它包含查询解析器、查询优化器和查询执行器。
Relational Engine − This is where real execution will be done. It contains Query parser, Query optimizer and Query executor.
Query Parser (Command Parser) and Compiler (Translator) − 这将检查查询的语法,并将查询转换为机器语言。
Query Parser (Command Parser) and Compiler (Translator) − This will check syntax of the query and it will convert the query to machine language.
Query Optimizer − 采用查询、统计和 Algebrizer 树作为输入,准备执行计划作为输出。
Query Optimizer − It will prepare the execution plan as output by taking query, statistics and Algebrizer tree as input.
Execution Plan − 就像一个路线图,其中包含作为查询执行的一部分要执行的所有步骤的顺序。
Execution Plan − It is like a roadmap, which contains the order of all the steps to be performed as part of the query execution.
Query Executor − 在此,将借助执行计划逐步执行查询,还将联系存储引擎。
Query Executor − This is where the query will be executed step by step with the help of execution plan and also the storage engine will be contacted.
Storage Engine − 负责存储系统(磁盘、SAN 等)上的数据存储和检索、数据操作、锁定和管理事务。
Storage Engine − It is responsible for storage and retrieval of data on the storage system (disk, SAN, etc.,), data manipulation, locking and managing transactions.
SQL OS - SQL Server 和 SQL OS 的区别在于:SQL OS 位于主机(Windows 操作系统)和 SQL Server 之间。数据库引擎上执行的所有活动均由 SQL OS 处理。SQL OS 提供各种操作系统服务,例如内存管理,它通过阻塞和锁定结构来处理缓冲池、日志缓冲以及死锁检测。
SQL OS − This lies between the host machine (Windows OS) and SQL Server. All the activities performed on database engine are taken care of by SQL OS. SQL OS provides various operating system services, such as memory management deals with buffer pool, log buffer and deadlock detection using the blocking and locking structure.
Checkpoint Process - 检查点是一个内部进程,它会将缓冲缓存中的所有脏页(已修改页)写入物理磁盘。除此之外,它还会将日志缓冲中的日志记录写入物理文件。将脏页从缓冲缓存写入数据文件也称为脏页硬化。
Checkpoint Process − Checkpoint is an internal process that writes all dirty pages (modified pages) from Buffer Cache to Physical disk. Apart from this, it also writes the log records from log buffer to physical file. Writing of Dirty pages from buffer cache to data file is also known as Hardening of dirty pages.
这是一个专用进程,并且由 SQL Server 在特定时间间隔内自动运行。SQL Server 为每个数据库单独运行检查点进程。检查有助于缩短 SQL Server 在发生意外关机或系统崩溃/故障时的恢复时间。
It is a dedicated process and runs automatically by SQL Server at specific intervals. SQL Server runs checkpoint process for each database individually. Checkpoint helps to reduce the recovery time for SQL Server in the event of unexpected shutdown or system crash\Failure.
Checkpoints in SQL Server
在 SQL Server 2012 中有四种类型的 checkpoints -
In SQL Server 2012 there are four types of checkpoints −
-
Automatic − This is the most common checkpoint which runs as a process in the background to make sure SQL Server Database can be recovered in the time limit defined by the Recovery Interval − Server Configuration Option.
-
Indirect − This is new in SQL Server 2012. This also runs in the background but to meet a user-specified target recovery time for the specific database where the option has been configured. Once the Target_Recovery_Time for a given database has been selected, this will override the Recovery Interval specified for the server and avoid automatic checkpoint on such DB.
-
Manual − This one runs just like any other T-SQL statement, once you issue checkpoint command it will run to its completion. Manual checkpoint runs for your current database only. You can also specify the Checkpoint_Duration which is optional - this duration specifies the time in which you want your checkpoint to complete.
-
Internal − As a user you can’t control internal checkpoint. Issued on specific operations such as Shutdown initiates a checkpoint operation on all databases except when shutdown is not clean (shutdown with nowait). If the recovery model gets changed from Full\Bulk-logged to Simple. While taking backup of the database. If your DB is in simple recovery model, checkpoint process executes automatically either when the log becomes 70% full, or based on Server option-Recovery Interval. Alter database command to add or remove a data\log file also initiates a checkpoint. Checkpoint also takes place when the recovery model of the DB is bulk-logged and a minimally logged operation is performed. DB Snapshot creation.
-
Lazy Writer Process − Lazy writer will push dirty pages to disk for an entirely different reason, because it needs to free up memory in the buffer pool. This happens when SQL server comes under memory pressure. As far as I am aware, this is controlled by an internal process and there is no setting for it.
SQL Server 会不断监视内存使用情况以评估资源争用(或可用性),其工作是确保始终有一定量的可用空间。在此过程中,当注意到任何此类资源争用时,会触发 Lazy Writer 通过将脏页写入磁盘来释放内存中的某些页。它采用最近最少使用 (LRU) 算法来决定将哪些页刷新到磁盘。
SQL server constantly monitors memory usage to assess resource contention (or availability); its job is to make sure that there is a certain amount of free space available at all times. As part of this process, when it notices any such resource contention, it triggers Lazy Writer to free up some pages in memory by writing out dirty pages to disk. It employs Least Recently Used (LRU) algorithm to decide which pages are to be flushed to the disk.
如果 Lazy Writer 始终处于活动状态,则可能表示存在内存瓶颈。
If Lazy Writer is always active, it could indicate memory bottleneck.
Memory Architecture
以下是内存体系结构的一些显着特征:
Following are some of the salient features of memory architecture.
-
One of the primary design goals of all database software is to minimize disk I/O because disk reads and writes are among the most resource-intensive operations.
-
Memory in windows can be called with Virtual Address Space, shared by Kernel mode (OS mode) and User mode (Application like SQL Server).
-
SQL Server "User address space" is broken into two regions: MemToLeave and Buffer Pool.
-
Size of MemToLeave (MTL) and Buffer Pool (BPool) is determined by SQL Server during startup.
-
Buffer management is a key component in achieving I/O highly efficiency. The buffer management component consists of two mechanisms: the buffer manager to access and update database pages, and the buffer pool to reduce database file I/O.
-
The buffer pool is further divided into multiple sections. The most important ones being the buffer cache (also referred to as data cache) and procedure cache. Buffer cache holds the data pages in memory so that frequently accessed data can be retrieved from cache. The alternative would be reading data pages from the disk. Reading data pages from cache optimizes performance by minimizing the number of required I/O operations which are inherently slower than retrieving data from the memory.
-
Procedure cache keeps the stored procedure and query execution plans to minimize the number of times that query plans have to be generated. You can find out information about the size and activity within the procedure cache using DBCC PROCCACHE statement.
缓冲池的其他部分包括 −
Other portions of buffer pool include −
-
System level data structures − Holds SQL Server instance level data about databases and locks.
-
Log cache − Reserved for reading and writing transaction log pages.
-
Connection context − Each connection to the instance has a small area of memory to record the current state of the connection. This information includes stored procedure and user-defined function parameters, cursor positions and more.
-
Stack space − Windows allocates stack space for each thread started by SQL Server.
Data File Architecture
数据文件架构有以下组件 −
Data File architecture has the following components −
File Groups
数据库文件可以组合在文件组中以用于分配和管理。没有文件可以成为多个文件组的成员。日志文件从不作为文件组的一部分。日志空间与数据空间分开管理。
Database files can be grouped together in file groups for allocation and administration purposes. No file can be a member of more than one file group. Log files are never part of a file group. Log space is managed separately from data space.
SQL Server 中有两种类型文件组,主文件组和用户自定义文件组。主文件组包含主数据文件和未具体指定到其他文件组的任何其他文件。系统表的所有页都在主文件组中分配。用户自定义文件组是使用创建数据库或更改数据库语句中的文件组关键字指定的任何文件组。
There are two types of file groups in SQL Server, Primary and User-defined. Primary file group contains the primary data file and any other files not specifically assigned to another file group. All pages for the system tables are allocated in the primary file group. User-defined file groups are any file groups specified using the file group keyword in create database or alter database statement.
每个数据库中的一个文件组作为默认文件组。当 SQL Server 将页面分配给在创建时未指定文件组的表或索引时,这些页面会从默认文件组中分配。要将默认文件组从一个文件组切换到另一个文件组,它应具有 db_owner 固定数据库角色。
One file group in each database operates as the default file group. When SQL Server allocates a page to a table or index for which no file group was specified when they were created, the pages are allocated from default file group. To switch the default file group from one file group to another file group, it should have db_owner fixed db role.
默认情况下,主文件组是默认文件组。用户应具有 db_owner 固定数据库角色,以便分别备份文件和文件组。
By default, primary file group is the default file group. User should have db_owner fixed database role in order to take backup of files and file groups individually.
Files
数据库有三种类型的文件 - 主数据文件、辅助数据文件和日志文件。主数据文件是数据库的起点,并指向数据库中的其他文件。
Databases have three types of files - Primary data file, Secondary data file, and Log file. Primary data file is the starting point of the database and points to the other files in the database.
每个数据库有一个主数据文件。我们可以为主要数据文件提供任何扩展名,但建议的扩展名是 .mdf 。辅助数据文件是该数据库中除主要数据文件之外的文件。一些数据库可能有多个辅助数据文件。一些数据库可能没有一个辅助数据文件。辅助数据文件的推荐扩展名是 .ndf 。
Every database has one primary data file. We can give any extension for the primary data file but the recommended extension is .mdf. Secondary data file is a file other than the primary data file in that database. Some databases may have multiple secondary data files. Some databases may not have a single secondary data file. Recommended extension for secondary data file is .ndf.
日志文件保存用于恢复数据库的所有日志信息。数据库必须至少有一个日志文件。我们可以为一个数据库有多个日志文件。日志文件的建议扩展名是 .ldf 。
Log files hold all of the log information used to recover the database. Database must have at least one log file. We can have multiple log files for one database. The recommended extension for log file is .ldf.
数据库中所有文件的位置都记录在 master 数据库和数据库的主文件中。大多数情况下,数据库引擎使用 master 数据库中的文件位置。
The location of all the files in a database are recorded in both master database and the primary file for the database. Most of the time, the database engine uses the file location from the master database.
文件有两个名称 − 逻辑名称和物理名称。逻辑名称用于在所有 T-SQL 语句中引用该文件。物理名称是 OS_file_name,它必须遵循操作系统的规则。数据和日志文件可以放在 FAT 或 NTFS 文件系统上,但不能放在压缩文件系统上。一个数据库中最多可以有 32,767 个文件。
Files have two names − Logical and Physical. Logical name is used to refer to the file in all T-SQL statements. Physical name is the OS_file_name, it must follow the rules of OS. Data and Log files can be placed on either FAT or NTFS file systems, but cannot be placed on compressed file systems. There can be up to 32,767 files in one database.
Extents
范围是将空间分配给表和索引的基本单位。一个范围是 8 个连续页或 64KB。SQL Server 有两种类型的范围 - 统一范围和混合范围。统一范围仅由单个对象构成。混合范围最多由八个对象共享。
Extents are basic unit in which space is allocated to tables and indexes. An extent is 8 contiguous pages or 64KB. SQL Server has two types of extents - Uniform and Mixed. Uniform extents are made up of only single object. Mixed extents are shared by up to eight objects.
Pages
它是 MS SQL Server 中数据存储的基本单位。页的大小为 8KB。每个页面的开头是 96 字节的头文件,用于存储系统信息,例如页面的类型、页面上的可用空间量以及拥有该页面的对象的 object id。SQL Server 中有 9 种类型的数据页。
It is the fundamental unit of data storage in MS SQL Server. The size of the page is 8KB. The start of each page is 96 byte header used to store system information such as type of page, amount of free space on the page and object id of the object owning the page. There are 9 types of data pages in SQL Server.
-
Data − Data rows with all data except text, ntext and image data.
-
Index − Index entries.
-
Tex\Image − Text, image and ntext data.
-
GAM − Information about allocated extents.
-
SGAM − Information about allocated extents at system level.
-
Page Free Space (PFS) − Information about free space available on pages.
-
Index Allocation Map (IAM) − Information about extents used by a table or index.
-
Bulk Changed Map (BCM) − Information about extents modified by bulk operations since the last backup log statement.
-
Differential Changed Map (DCM) − Information about extents that have changed since the last backup database statement.
Log File Architecture
SQL Server 的事务日志在逻辑上将事务日志当作一系列的日志记录。每个日志记录由日志序列号 (LSN) 识别。每个日志记录都包含它所属的事务的 ID。
The SQL Server transaction log operates logically as if the transaction log is a string of log records. Each log record is identified by Log Sequence Number (LSN). Each log record contains the ID of the transaction that it belongs to.
数据修改的日志记录要么记录执行的逻辑操作,要么记录修改后和修改前的数据映像。修改前映像是执行操作前的数据副本;修改后映像是执行操作后的数据副本。
Log records for data modifications record either the logical operation performed or they record the before and after images of the modified data. The before image is a copy of the data before the operation is performed; the after image is a copy of the data after the operation has been performed.
恢复操作的步骤取决于日志记录的类型 −
The steps to recover an operation depend on the type of log record −
-
Logical operation logged. To roll the logical operation forward, the operation is performed again.To roll the logical operation back, the reverse logical operation is performed.
-
Before and after image logged. To roll the operation forward, the after image is applied.To roll the operation back, the before image is applied.
事务日志中记录了不同类型的操作。这些操作包括 −
Different types of operations are recorded in the transaction log. These operations include −
-
The start and end of each transaction.
-
Every data modification (insert, update, or delete). This includes changes by system stored procedures or data definition language (DDL) statements to any table, including system tables.
-
Every extent and page allocation or de allocation.
-
Creating or dropping a table or index.
取消操作也会被记录。每个事务保留事务日志上的空间,以确保存在足够的日志空间来支持因明确取消语句或遇到错误而导致的取消。此保留空间在事务完成后被释放。
Rollback operations are also logged. Each transaction reserves space on the transaction log to make sure that enough log space exists to support a rollback that is caused by either an explicit rollback statement or if an error is encountered. This reserved space is freed when the transaction is completed.
从必须用于成功对整个数据库执行取消以恢复的第一个日志记录到最后写入的日志记录的对数文件部分称为日志的活动部分或活动日志。这是数据库完全恢复所需的日志部分。活跃日志的任何部分都不能被截断。此第一个日志记录的 LSN 被称为最小恢复 LSN(最小 LSN)。
The section of the log file from the first log record that must be present for a successful database-wide rollback to the last-written log record is called the active part of the log, or the active log. This is the section of the log required to a full recovery of the database. No part of the active log can ever be truncated. LSN of this first log record is known as the minimum recovery LSN (Min LSN).
SQL Server 数据库引擎在内部将每个物理日志文件划分为多个虚拟日志文件。虚拟日志文件没有固定大小,也没有固定数量的虚拟日志文件对应一个物理日志文件。
The SQL Server Database Engine divides each physical log file internally into a number of virtual log files. Virtual log files have no fixed size, and there is no fixed number of virtual log files for a physical log file.
数据库引擎会在创建或扩展日志文件时动态选择虚拟日志文件的大小。数据库引擎会尝试维护少量虚拟文件。管理员无法配置或设置虚拟日志文件的大小或数量。只有当物理日志文件由小文件大小和增量增长_值定义时,虚拟日志文件才会影响系统性能。
The Database Engine chooses the size of the virtual log files dynamically while it is creating or extending log files. The Database Engine tries to maintain a small number of virtual files. The size or number of virtual log files cannot be configured or set by administrators. The only time virtual log files affect system performance is if the physical log files are defined by small size and growth_increment values.
大小值是日志文件的初始大小,而 growth_increment 值每次需要新的空间时添加到文件中的空间量。如果日志文件由于很多小增量而增长到很大的大小,它们就会有许多虚拟日志文件。这会降低数据库启动速度,还会降低日志备份和还原操作速度。
The size value is the initial size for the log file and the growth_increment value is the amount of space added to the file every time new space is required. If the log files grow to a large size because of many small increments, they will have many virtual log files. This can slow down database startup and also log backup and restore operations.
我们建议您为日志文件分配接近最终所需大小的大小值,还要拥有一个相对较大的 growth_increment 值。SQL Server 使用预写日志 (WAL),该日志保证在将关联的日志记录写入磁盘之前,不会将任何数据修改写入磁盘。这维护事务的 ACID 属性。
We recommend that you assign log files a size value close to the final size required, and also have a relatively large growth_increment value. SQL Server uses a write-ahead log (WAL), which guarantees that no data modifications are written to disk before the associated log record is written to disk. This maintains the ACID properties for a transaction.