DocumentDB Concise Tutorial
DocumentDB - Introduction
In this chapter, we will briefly discuss the major concepts around NoSQL and document databases. We will also have a quick overview of DocumentDB.
NoSQL Document Database
DocumentDB is Microsoft’s newest NoSQL document database. So what precisely do we mean by NoSQL, and by document database?
- SQL stands for Structured Query Language, the traditional query language of relational databases. SQL is often equated with relational databases.
- It’s more helpful to think of a NoSQL database as a non-relational database, so NoSQL really means non-relational.
There are different types of NoSQL databases:
- Key-value stores such as Azure Table Storage.
- Column-based stores like Cassandra.
- Graph databases like Neo4j.
- Document databases like MongoDB and Azure DocumentDB.
Azure DocumentDB
Microsoft officially launched Azure DocumentDB on April 8th, 2015, and it certainly can be characterized as a typical NoSQL document database. It’s massively scalable, and it works with schema-free JSON documents.
- DocumentDB is a true schema-free NoSQL document database service designed for modern mobile and web applications.
- It delivers consistently fast reads and writes, schema flexibility, and the ability to easily scale a database up and down on demand.
- It does not assume or require any schema for the JSON documents it indexes.
- DocumentDB automatically indexes every property in a document as soon as the document is added to the database.
- DocumentDB enables complex ad-hoc queries using a SQL language. Every document is queryable the moment it’s created, and you can search on any property anywhere within the document hierarchy.
DocumentDB – Pricing
DocumentDB is billed based on the number of collections contained in a database account. Each account can have one or more databases and each database can have a virtually unlimited number of collections, although there is an initial default quota of 100. This quota can be lifted by contacting Azure support.
- A collection is not only a unit of scale but also a unit of cost: in DocumentDB you pay per collection, and each collection has a storage capacity of up to 10 GB.
- At a minimum, you’ll need one S1 collection to store documents in a database. It costs roughly $25 per month, billed against your Azure subscription.
- As your database grows beyond 10 GB, you’ll need to purchase another collection to contain the additional data.
- Each S1 collection gives you 250 request units per second. If that’s not enough, you can scale the collection up to an S2 and get 1000 request units per second for about $50 a month.
- You can also turn it all the way up to an S3 and pay around $100 a month.
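The pricing rules above can be turned into a rough estimate. The following is a minimal sketch, not an official calculator: the class and method names are hypothetical, the prices are the approximate figures quoted above, and it assumes every collection in the account uses the same tier.

```csharp
using System;
using System.Collections.Generic;

public static class PricingSketch
{
    // Approximate monthly price in USD per collection, taken from the figures quoted above.
    private static readonly Dictionary<string, int> MonthlyPriceUsd = new Dictionary<string, int>
    {
        { "S1", 25 }, { "S2", 50 }, { "S3", 100 }
    };

    // Each collection stores up to 10 GB.
    private const double CollectionCapacityGb = 10.0;

    // Number of collections needed to hold the data; at least one is always required.
    public static int CollectionsNeeded(double dataGb)
    {
        if (dataGb < 0) throw new ArgumentOutOfRangeException(nameof(dataGb));
        return Math.Max(1, (int)Math.Ceiling(dataGb / CollectionCapacityGb));
    }

    // Assumption: all collections are billed at the same tier.
    public static int EstimateMonthlyCostUsd(double dataGb, string tier = "S1")
    {
        return CollectionsNeeded(dataGb) * MonthlyPriceUsd[tier];
    }

    public static void Main()
    {
        Console.WriteLine(EstimateMonthlyCostUsd(4));        // prints 25
        Console.WriteLine(EstimateMonthlyCostUsd(25, "S2")); // prints 150
    }
}
```

For example, 25 GB of data needs three collections, so at the S2 tier the estimate is 3 × $50 = $150 per month.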
DocumentDB - Advantages
DocumentDB stands out with some unique capabilities. Azure DocumentDB offers the following key capabilities and benefits.
Schema Free
In a relational database, every table has a schema that defines the columns and data types that each row in the table must conform to.
In contrast, a document database has no defined schema, and every document can be structured differently.
SQL Syntax
DocumentDB enables complex ad-hoc queries using SQL language, and every document is instantly queryable the moment it’s created. You can search on any property anywhere within the document hierarchy.
Tunable Consistency
DocumentDB provides granular, well-defined consistency levels, which allow you to make sound trade-offs between consistency, availability, and latency.
For queries and read operations, you can select from four well-defined consistency levels to achieve the best trade-off between consistency and performance −
- Strong
- Bounded-staleness
- Session
- Eventual
Elastic Scale
Scalability is the name of the game with NoSQL, and DocumentDB delivers. DocumentDB has already proven its scalability.
- Major services like Office OneNote and Xbox are already backed by DocumentDB, with databases containing tens of terabytes of JSON documents and over a million active users, operating consistently with 99.95% availability.
- You can elastically scale DocumentDB with predictable performance by creating more units as your application grows.
Fully Managed
DocumentDB is available as a fully managed cloud-based platform as a service running on Azure.
- There is simply nothing for you to install or manage.
- There are no servers, cables, operating systems, or updates to deal with, and no replicas to set up.
- Microsoft does all that work and keeps the service running.
- Within minutes, you can get started working with DocumentDB using just a browser and an Azure subscription.
DocumentDB - Environment Setup
Microsoft provides a free version of Visual Studio, which also contains SQL Server. It can be downloaded from https://www.visualstudio.com
Installation
Step 1 − Once the download is complete, run the installer. The following dialog will be displayed.

Step 2 − Click on the Install button and it will start the installation process.

Step 3 − Once the installation process is completed successfully, you will see the following dialog.

Step 4 − Close this dialog and restart your computer if required.
Step 5 − Now open Visual Studio from the Start menu, which will open the dialog below. The first launch takes some time while Visual Studio prepares itself.

Once all is done, you will see the main window of Visual Studio.

Step 6 − Let’s create a new project from File → New → Project.

Step 7 − Select Console Application, enter DocumentDBDemo in the Name field, and click the OK button.
Step 8 − In Solution Explorer, right-click on your project.

Step 9 − Select Manage NuGet Packages, which will open the following window in Visual Studio. In the Search Online input box, search for DocumentDB Client Library.

Step 10 − Install the latest version by clicking the install button.

Step 11 − Click “I Accept”. Once the installation is done, you will see a confirmation message in your Output window.

You are now ready to start your application.
DocumentDB - Create Account
To use Microsoft Azure DocumentDB, you must create a DocumentDB account. In this chapter, we will create a DocumentDB account using the Azure portal.
Step 1 − If you already have an Azure subscription, log in to https://portal.azure.com; otherwise, you need to sign up first.
You will see the main Dashboard. It is fully customizable, so you can arrange these tiles any way you like, resize them, and add or remove tiles for things you use frequently or no longer need.

Step 2 − Select the ‘New’ option on the top left side of the page.

Step 3 − Now select Data + Storage > Azure DocumentDB, and you will see the following New DocumentDB account section.

We need to come up with a globally unique name (ID), which combined with .documents.azure.com is the publicly addressable endpoint to our DocumentDB account. All the databases we create beneath that account can be accessed over the internet using this endpoint.
Step 4 − Let’s name it azuredocdbdemo and click on Resource Group → new_resource.

Step 5 − Choose the location, i.e., the Microsoft data center in which you want this account to be hosted. Select Location and choose your region.

Step 6 − Check the Pin to dashboard checkbox and click the Create button.

You can see that the tile has already been added to the Dashboard, and it’s letting us know that the account is being created. It can actually take a few minutes to set things up for a new account while DocumentDB allocates the endpoint, provisions replicas, and performs other work in the background.
Once it is done, you will see the dashboard.

Step 7 − Now click on the created DocumentDB account and you will see a detailed screen, as shown in the following image.

DocumentDB - Connect Account
When you start programming against DocumentDB, the very first step is to connect. To connect to your DocumentDB account, you will need two things:
- Endpoint
- Authorization Key
Endpoint
The endpoint is the URL of your DocumentDB account, and it is constructed by combining your DocumentDB account name with .documents.azure.com. Let’s go to the Dashboard.

Now, click on the created DocumentDB account. You will see the details as shown in the following image.

When you select the ‘Keys’ option, it will display additional information as shown in the following image. You will also see the URL to your DocumentDB account, which you can use as your endpoint.
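The endpoint pattern described above can be sketched in a few lines. This is only an illustration: the helper name BuildEndpoint is hypothetical, and the 443 port simply mirrors the endpoint string used later in this chapter. In practice you would copy the URL shown on the Keys blade.

```csharp
using System;

public static class EndpointSketch
{
    // Builds the account endpoint from the account name, following the pattern
    // https://<account name>.documents.azure.com:443/
    public static Uri BuildEndpoint(string accountName)
    {
        if (string.IsNullOrWhiteSpace(accountName))
            throw new ArgumentException("Account name is required.", nameof(accountName));
        return new Uri("https://" + accountName.Trim() + ".documents.azure.com:443/");
    }

    public static void Main()
    {
        Uri endpoint = BuildEndpoint("azuredocdbdemo");
        Console.WriteLine(endpoint.Host); // prints azuredocdbdemo.documents.azure.com
        Console.WriteLine(endpoint.Port); // prints 443
    }
}
```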

Authorization Key
The authorization key contains your credentials, and there are two types of keys. The master key allows full access to all resources within the account, while resource tokens permit restricted access to specific resources.
Master Keys
- There’s nothing you can’t do with a master key. You can blow away your entire database if you want, using the master key.
- For this reason, you definitely don’t want to be sharing the master key or distributing it to client environments. As an added security measure, it’s a good idea to change it frequently.
- There are actually two master keys for each database account, the primary and the secondary, as highlighted in the above screenshot.
Resource Tokens
- You can also use resource tokens instead of a master key.
- Connections based on resource tokens can only access the resources specified by the tokens and no other resources.
- Resource tokens are based on user permissions, so first you create one or more users, which are defined at the database level.
- You create one or more permissions for each user, based on the resources that you want to allow each user to access.
- Each permission generates a resource token that allows either read-only or full access to a given resource, which can be any user resource within the database.
Let’s return to the console application created in Chapter 3 (Environment Setup).
Step 1 − Add the following using directives to the Program.cs file.
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using Newtonsoft.Json;
Step 2 − Now add the endpoint URL and the authorization key. In this example, we will use the primary master key as the authorization key.
Note that your own endpoint URL and authorization key will be different from the ones shown here.
private const string EndpointUrl = "https://azuredocdbdemo.documents.azure.com:443/";
private const string AuthorizationKey =
   "BBhjI0gxdVPdDbS4diTjdloJq7Fp4L5RO/StTt6UtEufDM78qM2CtBZWbyVwFPSJIm8AcfDu2O+AfVT+TYUnBQ==";
Step 3 − Create a new instance of DocumentClient in an asynchronous task called CreateDocumentClient.
Step 4 − Call your asynchronous task from your Main method.
The following is the complete Program.cs file so far.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using Newtonsoft.Json;
namespace DocumentDBDemo {
   class Program {
      private const string EndpointUrl = "https://azuredocdbdemo.documents.azure.com:443/";
      private const string AuthorizationKey =
         "BBhjI0gxdVPdDbS4diTjdloJq7Fp4L5RO/StTt6UtEufDM78qM2CtBZWbyVwFPSJIm8AcfDu2O+AfVT+TYUnBQ==";
      static void Main(string[] args) {
         try {
            CreateDocumentClient().Wait();
         } catch (Exception e) {
            Exception baseException = e.GetBaseException();
            Console.WriteLine("Error: {0}, Message: {1}", e.Message, baseException.Message);
         }
         Console.ReadKey();
      }
      private static async Task CreateDocumentClient() {
         // Create a new instance of the DocumentClient
         var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey);
      }
   }
}
In this chapter, we have learnt how to connect to a DocumentDB account and create an instance of the DocumentClient class.
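The Main method above blocks on the asynchronous task with Wait() and prints both e.Message and the message of GetBaseException(). The following minimal sketch, which uses only the standard library, illustrates why both are printed: Wait() wraps a failure in an AggregateException, and GetBaseException() digs out the original error. FailingTaskAsync is a hypothetical stand-in for CreateDocumentClient that always fails.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitSketch
{
    // Hypothetical stand-in for CreateDocumentClient: an async task that always fails.
    public static async Task FailingTaskAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("connection failed");
    }

    // Blocks on the task the way Main does and reports the type names of
    // the caught exception and of its base exception.
    public static string[] Describe()
    {
        try
        {
            FailingTaskAsync().Wait();
            return new[] { "none", "none" };
        }
        catch (Exception e)
        {
            return new[] { e.GetType().Name, e.GetBaseException().GetType().Name };
        }
    }

    public static void Main()
    {
        string[] names = Describe();
        // prints "Caught: AggregateException, base: InvalidOperationException"
        Console.WriteLine("Caught: {0}, base: {1}", names[0], names[1]);
    }
}
```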
DocumentDB - Create Database
In this chapter, we will learn how to create a database. To use Microsoft Azure DocumentDB, you must have a DocumentDB account, a database, a collection, and documents. We already have a DocumentDB account, so we now have two options for creating a database −
- Microsoft Azure Portal
- .Net SDK
Create a Database for DocumentDB using the Microsoft Azure Portal
To create a database using the portal, follow these steps.
Step 1 − Log in to the Azure portal and you will see the dashboard.

Step 2 − Now click on the created DocumentDB account and you will see the details as shown in the following screenshot.

Step 3 − Select the Add Database option and provide the ID for your database.

Step 4 − Click OK.

You can see that the database has been added. At the moment it has no collections, but we can add collections later; these are the containers that will store our JSON documents. Notice that the database has both an ID and a Resource ID.
Create a Database for DocumentDB Using .Net SDK
To create a database using the .Net SDK, follow these steps.
Step 1 − Open the console application from the last chapter in Visual Studio.
Step 2 − Create the new database by creating a new Database object. We only need to assign the Id property, which we set to “mynewdb” in a CreateDatabase task.
private async static Task CreateDatabase(DocumentClient client) {
   Console.WriteLine();
   Console.WriteLine("******** Create Database *******");
   var databaseDefinition = new Database { Id = "mynewdb" };
   var result = await client.CreateDatabaseAsync(databaseDefinition);
   var database = result.Resource;
   Console.WriteLine(" Database Id: {0}; Rid: {1}", database.Id, database.ResourceId);
   Console.WriteLine("******** Database Created *******");
}
Step 3 − Pass this databaseDefinition to CreateDatabaseAsync and get back a result with a Resource property. All the create methods return a result whose Resource property describes the item that was created, which is a database in this case.
We get the new database object from the Resource property and display it on the console along with the Resource ID that DocumentDB assigned to it.
Step 4 − Now call the CreateDatabase task from the CreateDocumentClient task after DocumentClient is instantiated.
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
   await CreateDatabase(client);
}
The following is the complete Program.cs file so far.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using Newtonsoft.Json;
namespace DocumentDBDemo {
   class Program {
      private const string EndpointUrl = "https://azuredocdbdemo.documents.azure.com:443/";
      private const string AuthorizationKey =
         "BBhjI0gxdVPdDbS4diTjdloJq7Fp4L5RO/StTt6UtEufDM78qM2CtBZWbyVwFPSJIm8AcfDu2O+AfVT+TYUnBQ==";
      static void Main(string[] args) {
         try {
            CreateDocumentClient().Wait();
         } catch (Exception e) {
            Exception baseException = e.GetBaseException();
            Console.WriteLine("Error: {0}, Message: {1}", e.Message, baseException.Message);
         }
         Console.ReadKey();
      }
      private static async Task CreateDocumentClient() {
         // Create a new instance of the DocumentClient
         using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
            await CreateDatabase(client);
         }
      }
      private async static Task CreateDatabase(DocumentClient client) {
         Console.WriteLine();
         Console.WriteLine("******** Create Database *******");
         var databaseDefinition = new Database { Id = "mynewdb" };
         var result = await client.CreateDatabaseAsync(databaseDefinition);
         var database = result.Resource;
         Console.WriteLine(" Database Id: {0}; Rid: {1}", database.Id, database.ResourceId);
         Console.WriteLine("******** Database Created *******");
      }
   }
}
When the above code is compiled and executed, you will receive the following output, which contains the database ID and resource ID.
******** Create Database *******
Database Id: mynewdb; Rid: ltpJAA==
******** Database Created *******
DocumentDB - List Databases
So far, we have created two databases in our DocumentDB account: the first using the Azure portal and the second using the .Net SDK. You can view these databases in the Azure portal.
Go to your DocumentDB account on the Azure portal and you will now see two databases.

You can also view or list the databases from your code using the .Net SDK. The steps are as follows.
Step 1 − Issue a database query with no parameters, which returns the complete list. You can also pass in a query to look for one or more specific databases.
private static void GetDatabases(DocumentClient client) {
   Console.WriteLine();
   Console.WriteLine();
   Console.WriteLine("******** Get Databases List ********");
   var databases = client.CreateDatabaseQuery().ToList();
   foreach (var database in databases) {
      Console.WriteLine(" Database Id: {0}; Rid: {1}", database.Id, database.ResourceId);
   }
   Console.WriteLine();
   Console.WriteLine("Total databases: {0}", databases.Count);
}
There are a number of these Create...Query methods for locating collections, documents, users, and other resources. These methods don’t actually execute the query; they just define the query and return an iterable object.
It’s the call to ToList() that actually executes the query, iterates the results, and returns them in a list.
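The same define-now, execute-later behavior can be observed with plain LINQ. The sketch below is only an analogy using in-memory data rather than the DocumentDB SDK; the class, the Evaluations counter, and the BuildQuery helper are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredQuerySketch
{
    // Counts how many items the source has actually produced.
    public static int Evaluations;

    // In-memory stand-in for a query source.
    private static IEnumerable<string> DatabaseIds()
    {
        foreach (var id in new[] { "myfirstdb", "mynewdb" })
        {
            Evaluations++;
            yield return id;
        }
    }

    // Defines a query; nothing is enumerated yet.
    public static IEnumerable<string> BuildQuery()
    {
        return DatabaseIds().Where(id => id.StartsWith("my"));
    }

    public static void Main()
    {
        var query = BuildQuery();
        Console.WriteLine("After defining the query: {0}", Evaluations); // prints 0
        List<string> results = query.ToList(); // enumeration happens here
        Console.WriteLine("After ToList(): {0}", Evaluations);           // prints 2
        Console.WriteLine("Results: {0}", results.Count);                // prints 2
    }
}
```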
Step 2 − Call the GetDatabases method from the CreateDocumentClient task after DocumentClient is instantiated.
Step 3 − You also need to comment out the call to the CreateDatabase task or change the database ID; otherwise, you will get an error message saying that the database already exists.
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
   //await CreateDatabase(client);
   GetDatabases(client);
}
The following is the complete Program.cs file so far.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using Newtonsoft.Json;
namespace DocumentDBDemo {
   class Program {
      private const string EndpointUrl = "https://azuredocdbdemo.documents.azure.com:443/";
      private const string AuthorizationKey =
         "BBhjI0gxdVPdDbS4diTjdloJq7Fp4L5RO/StTt6UtEufDM78qM2CtBZWbyVwFPSJIm8AcfDu2O+AfVT+TYUnBQ==";
      static void Main(string[] args) {
         try {
            CreateDocumentClient().Wait();
         } catch (Exception e) {
            Exception baseException = e.GetBaseException();
            Console.WriteLine("Error: {0}, Message: {1}", e.Message, baseException.Message);
         }
         Console.ReadKey();
      }
      private static async Task CreateDocumentClient() {
         // Create a new instance of the DocumentClient
         using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
            // Commented out because mynewdb already exists (see Step 3)
            //await CreateDatabase(client);
            GetDatabases(client);
         }
      }
      private async static Task CreateDatabase(DocumentClient client) {
         Console.WriteLine();
         Console.WriteLine("******** Create Database *******");
         var databaseDefinition = new Database { Id = "mynewdb" };
         var result = await client.CreateDatabaseAsync(databaseDefinition);
         var database = result.Resource;
         Console.WriteLine(" Database Id: {0}; Rid: {1}", database.Id, database.ResourceId);
         Console.WriteLine("******** Database Created *******");
      }
      private static void GetDatabases(DocumentClient client) {
         Console.WriteLine();
         Console.WriteLine();
         Console.WriteLine("******** Get Databases List ********");
         var databases = client.CreateDatabaseQuery().ToList();
         foreach (var database in databases) {
            Console.WriteLine(" Database Id: {0}; Rid: {1}",
               database.Id, database.ResourceId);
         }
         Console.WriteLine();
         Console.WriteLine("Total databases: {0}", databases.Count);
      }
   }
}
When the above code is compiled and executed, you will receive the following output, which contains the database ID and resource ID of both databases, followed by the total number of databases.
******** Get Databases List ********
Database Id: myfirstdb; Rid: Ic8LAA==
Database Id: mynewdb; Rid: ltpJAA==
Total databases: 2
DocumentDB - Drop Databases
You can drop one or more databases from the portal as well as from code by using the .Net SDK. Here, we will discuss step by step how to drop a database in DocumentDB.
Step 1 − Go to your DocumentDB account on the Azure portal. For the purposes of this demo, I have added two more databases, as seen in the following screenshot.

Step 2 − To drop a database, click that database. Let’s select tempdb; you will see the following page. Select the ‘Delete Database’ option.

Step 3 − A confirmation message will be displayed. Click the ‘Yes’ button.

You will see that tempdb is no longer available in your dashboard.

You can also delete databases from your code using the .Net SDK. The steps are as follows.
Step 1 − Let’s delete a database by specifying its ID. To delete it, we also need its SelfLink.
Step 2 − We call CreateDatabaseQuery as before, but this time we supply a query that returns just the one database with the ID tempdb1.
private async static Task DeleteDatabase(DocumentClient client) {
   Console.WriteLine("******** Delete Database ********");
   Database database = client
      .CreateDatabaseQuery("SELECT * FROM c WHERE c.id = 'tempdb1'")
      .AsEnumerable()
      .First();
   await client.DeleteDatabaseAsync(database.SelfLink);
}
Step 3 − This time, we call AsEnumerable instead of ToList() because we don’t actually need a list object. Since we expect only one result, calling AsEnumerable is sufficient, and we can get the first database object returned by the query with First(). This is the database object for tempdb1, and it has a SelfLink that we pass to DeleteDatabaseAsync, which deletes the database.
Step 4 − You also need to call the DeleteDatabase task from the CreateDocumentClient task after DocumentClient is instantiated.
Step 5 − To view the list of databases after deleting the specified database, let’s call the GetDatabases method again.
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
   //await CreateDatabase(client);
   GetDatabases(client);
   await DeleteDatabase(client);
   GetDatabases(client);
}
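One caveat about Step 3: First() throws an InvalidOperationException when the sequence is empty, which is what would happen if this program were run a second time after tempdb1 has already been deleted. The sketch below uses plain in-memory LINQ rather than the DocumentDB SDK to show the difference between First() and FirstOrDefault(); the FindId helper is a hypothetical name used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstSketch
{
    // Returns the first matching id, or null when nothing matches.
    public static string FindId(IEnumerable<string> ids, string wanted)
    {
        return ids.Where(id => id == wanted).FirstOrDefault();
    }

    public static void Main()
    {
        var ids = new List<string> { "myfirstdb", "mynewdb" };
        Console.WriteLine(FindId(ids, "mynewdb") ?? "(not found)"); // prints mynewdb
        Console.WriteLine(FindId(ids, "tempdb1") ?? "(not found)"); // prints (not found)
        try
        {
            // First() on an empty sequence throws instead of returning null.
            ids.Where(id => id == "tempdb1").First();
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("First() threw"); // prints First() threw
        }
    }
}
```

Whether to prefer First() or FirstOrDefault() depends on whether a missing database should be treated as an error or as a normal case to check for.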
The following is the complete Program.cs file so far.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using Newtonsoft.Json;
namespace DocumentDBDemo {
   class Program {
      private const string EndpointUrl = "https://azuredocdbdemo.documents.azure.com:443/";
      private const string AuthorizationKey =
         "BBhjI0gxdVPdDbS4diTjdloJq7Fp4L5RO/StTt6UtEufDM78qM2CtBZWbyVwFPSJIm8AcfDu2O+AfVT+TYUnBQ==";
      static void Main(string[] args) {
         try {
            CreateDocumentClient().Wait();
         } catch (Exception e) {
            Exception baseException = e.GetBaseException();
            Console.WriteLine("Error: {0}, Message: {1}", e.Message, baseException.Message);
         }
         Console.ReadKey();
      }
      private static async Task CreateDocumentClient() {
         // Create a new instance of the DocumentClient
         using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
            //await CreateDatabase(client);
            GetDatabases(client);
            await DeleteDatabase(client);
            GetDatabases(client);
         }
      }
      private async static Task CreateDatabase(DocumentClient client) {
         Console.WriteLine();
         Console.WriteLine("******** Create Database *******");
         var databaseDefinition = new Database { Id = "mynewdb" };
         var result = await client.CreateDatabaseAsync(databaseDefinition);
         var database = result.Resource;
         Console.WriteLine(" Database Id: {0}; Rid: {1}",
            database.Id, database.ResourceId);
         Console.WriteLine("******** Database Created *******");
      }
      private static void GetDatabases(DocumentClient client) {
         Console.WriteLine();
         Console.WriteLine();
         Console.WriteLine("******** Get Databases List ********");
         var databases = client.CreateDatabaseQuery().ToList();
         foreach (var database in databases) {
            Console.WriteLine(" Database Id: {0}; Rid: {1}", database.Id,
               database.ResourceId);
         }
         Console.WriteLine();
         Console.WriteLine("Total databases: {0}", databases.Count);
      }
      private async static Task DeleteDatabase(DocumentClient client) {
         Console.WriteLine();
         Console.WriteLine("******** Delete Database ********");
         Database database = client
            .CreateDatabaseQuery("SELECT * FROM c WHERE c.id = 'tempdb1'")
            .AsEnumerable()
            .First();
         await client.DeleteDatabaseAsync(database.SelfLink);
      }
   }
}
When the above code is compiled and executed, you will receive the following output, which lists the database IDs and resource IDs of the three databases along with the total number of databases, first before and then after the deletion.
******** Get Databases List ********
Database Id: myfirstdb; Rid: Ic8LAA==
Database Id: mynewdb; Rid: ltpJAA==
Database Id: tempdb1; Rid: 06JjAA==
Total databases: 3
******** Delete Database ********
******** Get Databases List ********
Database Id: myfirstdb; Rid: Ic8LAA==
Database Id: mynewdb; Rid: ltpJAA==
Total databases: 2
After the database is deleted, the final list shows that only two databases are left in the DocumentDB account.
DocumentDB - Create Collection
In this chapter, we will learn how to create a collection. It is similar to creating a database. You can create a collection either from the portal or from code using the .Net SDK.
Step 1 − Go to the main dashboard on the Azure portal.

Step 2 − Select myfirstdb from the databases list.

Step 3 − Click on the ‘Add Collection’ option and specify the ID for the collection. Open Pricing Tier to choose among the available tiers.

Step 4 − Let’s select S1 Standard, click Select, and then click the OK button.

As you can see, MyCollection has been added to myfirstdb.
You can also create a collection from code by using the .Net SDK. Let’s have a look at the following steps to add collections from code.
Step 1 − Open the console application in Visual Studio.
Step 2 − To create a collection, first retrieve the myfirstdb database by its ID in the CreateDocumentClient task.
private static async Task CreateDocumentClient() {
   // Create a new instance of the DocumentClient
   using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
      // database is assumed to be a static field of type Database declared on the Program class
      database = client.CreateDatabaseQuery("SELECT * FROM c WHERE c.id = 'myfirstdb'")
         .AsEnumerable().First();
      await CreateCollection(client, "MyCollection1");
      await CreateCollection(client, "MyCollection2", "S2");
   }
}
The following is the implementation of the CreateCollection task.
private async static Task CreateCollection(DocumentClient client, string collectionId,
   string offerType = "S1") {
   Console.WriteLine();
   Console.WriteLine("**** Create Collection {0} in {1} ****", collectionId, database.Id);
   var collectionDefinition = new DocumentCollection { Id = collectionId };
   var options = new RequestOptions { OfferType = offerType };
   var result = await client.CreateDocumentCollectionAsync(database.SelfLink,
      collectionDefinition, options);
   var collection = result.Resource;
   Console.WriteLine("Created new collection");
   ViewCollection(collection);
}
We create a new DocumentCollection object that defines the new collection with the desired Id and pass it to the CreateDocumentCollectionAsync method. That method also accepts an options parameter, which we use here to set the performance tier of the new collection through offerType.
The offerType parameter defaults to S1. Since we didn’t pass an offerType for MyCollection1, it will be an S1 collection, whereas for MyCollection2 we passed S2, which makes that one an S2 collection, as shown above.
Following is the implementation of the ViewCollection method.
private static void ViewCollection(DocumentCollection collection) {
Console.WriteLine("Collection ID: {0} ", collection.Id);
Console.WriteLine("Resource ID: {0} ", collection.ResourceId);
Console.WriteLine("Self Link: {0} ", collection.SelfLink);
Console.WriteLine("Documents Link: {0} ", collection.DocumentsLink);
Console.WriteLine("UDFs Link: {0} ", collection.UserDefinedFunctionsLink);
Console.WriteLine("StoredProcs Link: {0} ", collection.StoredProceduresLink);
Console.WriteLine("Triggers Link: {0} ", collection.TriggersLink);
Console.WriteLine("Timestamp: {0} ", collection.Timestamp);
}
Following is the complete implementation of the Program.cs file for collections.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using Newtonsoft.Json;
namespace DocumentDBDemo {
class Program {
private const string EndpointUrl = "https://azuredocdbdemo.documents.azure.com:443/";
private const string AuthorizationKey =
"BBhjI0gxdVPdDbS4diTjdloJq7Fp4L5RO/StTt6UtEufDM78qM2CtBZWbyVwFPSJIm8AcfDu2O+AfVT+TYUnBQ==";
private static Database database;
static void Main(string[] args) {
try {
CreateDocumentClient().Wait();
} catch (Exception e) {
Exception baseException = e.GetBaseException();
Console.WriteLine("Error: {0}, Message: {1}", e.Message, baseException.Message);
}
Console.ReadKey();
}
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
database = client.CreateDatabaseQuery(
"SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
await CreateCollection(client, "MyCollection1");
await CreateCollection(client, "MyCollection2", "S2");
//await CreateDatabase(client);
//GetDatabases(client);
//await DeleteDatabase(client);
//GetDatabases(client);
}
}
private async static Task CreateCollection(DocumentClient client,
string collectionId, string offerType = "S1") {
Console.WriteLine();
Console.WriteLine("**** Create Collection {0} in {1} ****", collectionId,
database.Id);
var collectionDefinition = new DocumentCollection { Id = collectionId };
var options = new RequestOptions { OfferType = offerType };
var result = await
client.CreateDocumentCollectionAsync(database.SelfLink,
collectionDefinition, options);
var collection = result.Resource;
Console.WriteLine("Created new collection");
ViewCollection(collection);
}
private static void ViewCollection(DocumentCollection collection) {
Console.WriteLine("Collection ID: {0} ", collection.Id);
Console.WriteLine("Resource ID: {0} ", collection.ResourceId);
Console.WriteLine("Self Link: {0} ", collection.SelfLink);
Console.WriteLine("Documents Link: {0} ", collection.DocumentsLink);
Console.WriteLine("UDFs Link: {0} ", collection.UserDefinedFunctionsLink);
Console.WriteLine("StoredProcs Link: {0} ", collection.StoredProceduresLink);
Console.WriteLine("Triggers Link: {0} ", collection.TriggersLink);
Console.WriteLine("Timestamp: {0} ", collection.Timestamp);
}
}
}
When the above code is compiled and executed, you will receive the following output, which contains all the information related to the new collections.
**** Create Collection MyCollection1 in myfirstdb ****
Created new collection
Collection ID: MyCollection1
Resource ID: Ic8LAPPvnAA=
Self Link: dbs/Ic8LAA==/colls/Ic8LAPPvnAA=/
Documents Link: dbs/Ic8LAA==/colls/Ic8LAPPvnAA=/docs/
UDFs Link: dbs/Ic8LAA==/colls/Ic8LAPPvnAA=/udfs/
StoredProcs Link: dbs/Ic8LAA==/colls/Ic8LAPPvnAA=/sprocs/
Triggers Link: dbs/Ic8LAA==/colls/Ic8LAPPvnAA=/triggers/
Timestamp: 12/10/2015 4:55:36 PM
**** Create Collection MyCollection2 in myfirstdb ****
Created new collection
Collection ID: MyCollection2
Resource ID: Ic8LAKGHDwE=
Self Link: dbs/Ic8LAA==/colls/Ic8LAKGHDwE=/
Documents Link: dbs/Ic8LAA==/colls/Ic8LAKGHDwE=/docs/
UDFs Link: dbs/Ic8LAA==/colls/Ic8LAKGHDwE=/udfs/
StoredProcs Link: dbs/Ic8LAA==/colls/Ic8LAKGHDwE=/sprocs/
Triggers Link: dbs/Ic8LAA==/colls/Ic8LAKGHDwE=/triggers/
Timestamp: 12/10/2015 4:55:38 PM
DocumentDB - Delete Collection
You can drop one or more collections from the portal as well as from code by using the .NET SDK.
Step 1 − Go to your DocumentDB account on the Azure portal. For the purposes of this demo, I have added two more collections, as seen in the following screenshot.

Step 2 − To drop a collection, click on that collection. Let’s select TempCollection1. On the page that opens, select the ‘Delete Collection’ option.

Step 3 − A confirmation message is displayed. Click the ‘Yes’ button.

You will see that TempCollection1 is no longer available on your dashboard.

You can also delete collections from your code using the .NET SDK. To do that, follow these steps.
Step 1 − Let’s delete the collection by specifying the ID of the collection we want to delete.
It’s the usual pattern of querying by Id to obtain the SelfLink needed to delete a resource.
private async static Task DeleteCollection(DocumentClient client, string collectionId) {
Console.WriteLine();
Console.WriteLine("**** Delete Collection {0} in {1} ****", collectionId, database.Id);
var query = new SqlQuerySpec {
QueryText = "SELECT * FROM c WHERE c.id = @id",
Parameters = new SqlParameterCollection {
new SqlParameter {
Name = "@id", Value = collectionId
}
}
};
DocumentCollection collection = client.CreateDocumentCollectionQuery(database.SelfLink,
query).AsEnumerable().First();
await client.DeleteDocumentCollectionAsync(collection.SelfLink);
Console.WriteLine("Deleted collection {0} from database {1}", collectionId,
database.Id);
}
Here we see the preferred way of constructing a parameterized query. We’re not hardcoding the collectionId, so this method can be used to delete any collection. We query for a specific collection by Id, where the @id parameter is defined in the SqlParameterCollection assigned to the Parameters property of the SqlQuerySpec.
The SDK then sends the query text together with its parameters to DocumentDB, so the collectionId value is bound to @id rather than spliced into the query text by hand.
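To make the benefit of parameterization concrete, here is a minimal sketch that compares a concatenated query with a parameterized one. QuerySpecSketch and CollectionQueries are hypothetical names introduced only for this illustration; QuerySpecSketch is a simplified stand-in for the SDK's SqlQuerySpec and is not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SDK's SqlQuerySpec, used only to show the idea:
// the query text stays constant and the values travel separately.
public sealed class QuerySpecSketch {
    public string QueryText { get; set; }
    public Dictionary<string, object> Parameters { get; } = new Dictionary<string, object>();
}

public static class CollectionQueries {
    // Fragile: the id becomes part of the SQL text, so a quote in the id changes the query.
    public static string ByIdConcatenated(string collectionId) {
        return "SELECT * FROM c WHERE c.id = '" + collectionId + "'";
    }

    // Parameterized: the text never changes; only the @id value does.
    public static QuerySpecSketch ByIdParameterized(string collectionId) {
        var spec = new QuerySpecSketch { QueryText = "SELECT * FROM c WHERE c.id = @id" };
        spec.Parameters["@id"] = collectionId;
        return spec;
    }
}
```

With an id such as O'Brien, the concatenated text ends up with an unbalanced quote, while the parameterized text is identical for every id and the value is carried verbatim in Parameters.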
Step 2 − Call DeleteCollection from the CreateDocumentClient task. It runs the query and then uses the collection’s SelfLink to delete the collection.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
database = client.CreateDatabaseQuery(
"SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
await DeleteCollection(client, "TempCollection");
}
}
Following is the complete implementation of the Program.cs file.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using Newtonsoft.Json;
namespace DocumentDBDemo {
class Program {
private const string EndpointUrl = "https://azuredocdbdemo.documents.azure.com:443/";
private const string AuthorizationKey =
"BBhjI0gxdVPdDbS4diTjdloJq7Fp4L5RO/StTt6UtEufDM78qM2CtBZWbyVwFPSJIm8AcfDu2O+AfVT+TYUnBQ==";
private static Database database;
static void Main(string[] args) {
try {
CreateDocumentClient().Wait();
} catch (Exception e) {
Exception baseException = e.GetBaseException();
Console.WriteLine("Error: {0}, Message: {1}", e.Message, baseException.Message);
}
Console.ReadKey();
}
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
database = client.CreateDatabaseQuery(
"SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
await DeleteCollection(client, "TempCollection");
//await CreateCollection(client, "MyCollection1");
//await CreateCollection(client, "MyCollection2", "S2");
////await CreateDatabase(client);
//GetDatabases(client);
//await DeleteDatabase(client);
//GetDatabases(client);
}
}
private async static Task CreateCollection(DocumentClient client,
string collectionId, string offerType = "S1") {
Console.WriteLine();
Console.WriteLine("**** Create Collection {0} in {1} ****", collectionId,
database.Id);
var collectionDefinition = new DocumentCollection { Id = collectionId };
var options = new RequestOptions { OfferType = offerType };
var result = await client.CreateDocumentCollectionAsync(database.SelfLink,
collectionDefinition, options);
var collection = result.Resource;
Console.WriteLine("Created new collection");
ViewCollection(collection);
}
private static void ViewCollection(DocumentCollection collection) {
Console.WriteLine("Collection ID: {0} ", collection.Id);
Console.WriteLine("Resource ID: {0} ", collection.ResourceId);
Console.WriteLine("Self Link: {0} ", collection.SelfLink);
Console.WriteLine("Documents Link: {0} ", collection.DocumentsLink);
Console.WriteLine("UDFs Link: {0} ", collection.UserDefinedFunctionsLink);
Console.WriteLine("StoredProcs Link: {0} ", collection.StoredProceduresLink);
Console.WriteLine("Triggers Link: {0} ", collection.TriggersLink);
Console.WriteLine("Timestamp: {0} ", collection.Timestamp);
}
private async static Task DeleteCollection(DocumentClient client,
string collectionId) {
Console.WriteLine();
Console.WriteLine("**** Delete Collection {0} in {1} ****", collectionId,
database.Id);
var query = new SqlQuerySpec {
QueryText = "SELECT * FROM c WHERE c.id = @id",
Parameters = new SqlParameterCollection {
new SqlParameter {
Name = "@id", Value = collectionId
}
}
};
DocumentCollection collection = client.CreateDocumentCollectionQuery
(database.SelfLink, query).AsEnumerable().First();
await client.DeleteDocumentCollectionAsync(collection.SelfLink);
Console.WriteLine("Deleted collection {0} from database {1}", collectionId,
database.Id);
}
}
}
When the above code is compiled and executed, you will receive the following output.
**** Delete Collection TempCollection in myfirstdb ****
Deleted collection TempCollection from database myfirstdb
DocumentDB - Insert Document
In this chapter, we will get to work with actual documents in a collection. You can create documents using either the Azure portal or the .NET SDK.
Creating Documents with the Azure Portal
Let’s take a look at the following steps to add a document to your collection.
Step 1 − Add a new collection named Families, with the S1 pricing tier, to myfirstdb.

Step 2 − Select the Families collection and click the Create Document option to open the New Document blade.

This is just a simple text editor that lets you type any JSON for a new document.

Step 3 − As this is raw data entry, let’s enter our first document.
{
"id": "AndersenFamily",
"lastName": "Andersen",
"parents": [
{ "firstName": "Thomas", "relationship": "father" },
{ "firstName": "Mary Kay", "relationship": "mother" }
],
"children": [
{
"firstName": "Henriette Thaulow",
"gender": "female",
"grade": 5,
"pets": [ { "givenName": "Fluffy", "type": "Rabbit" } ]
}
],
"location": { "state": "WA", "county": "King", "city": "Seattle"},
"isRegistered": true
}
When you enter the above document, you will see the following screen.

Notice that we’ve supplied an id for the document. Every document must have an id, and it must be unique among all documents in the same collection. If you leave it out, DocumentDB automatically generates one for you as a GUID (Globally Unique Identifier).
The id is always a string; it can’t be a number, date, Boolean, or another object, and it can’t be longer than 255 characters.
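The id rules above can be checked on the client before a document is sent. The following is a minimal sketch, not part of the SDK: DocumentIdRules and its members are hypothetical names, and treating an empty string as invalid is an assumption that goes beyond what the text states.

```csharp
using System;

public static class DocumentIdRules {
    public const int MaxIdLength = 255;

    // True when the value is a string of at most 255 characters.
    // Rejecting null and empty strings is an assumption made for this sketch.
    public static bool IsValidId(object id) {
        var text = id as string;
        return !string.IsNullOrEmpty(text) && text.Length <= MaxIdLength;
    }

    // Returns the given id, or a new GUID string when none was supplied,
    // mimicking on the client what DocumentDB does when the id is omitted.
    public static string EnsureId(string id) {
        if (id == null) return Guid.NewGuid().ToString();
        if (!IsValidId(id)) {
            throw new ArgumentException(
                "id must be a non-empty string of at most 255 characters", nameof(id));
        }
        return id;
    }
}
```

For example, DocumentIdRules.IsValidId("AndersenFamily") is true, while a number or a 256-character string is rejected.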
Also notice the document’s hierarchical structure, which has a few top-level properties, such as the required id as well as lastName and isRegistered, but also has nested properties.
For instance, the parents property is supplied as a JSON array, as denoted by the square brackets. There is another array for children, even though it contains only one child in this example.
Step 4 − Click the ‘Save’ button to save the document. We have now created our first document.
As you can see, pretty formatting has been applied to our JSON: every property is placed on its own line and indented with whitespace to convey its nesting level.

The portal includes a Document Explorer, so let’s use it now to retrieve the document we just created.

Step 5 − Choose a database and any collection within that database to view the documents in the collection. We currently have just one database named myfirstdb with one collection called Families, both of which have been preselected in the dropdowns.

By default, the Document Explorer displays an unfiltered list of the documents in the collection, but you can also search for a specific document by ID, or for multiple documents with a wildcard search on a partial ID.
We have only one document in our collection so far, and we see its ID, AndersenFamily, on the following screen.
Step 6 − Click on the ID to view the document.

Creating Documents with the .NET SDK
As you know, documents are just another type of resource, and you are already familiar with how to work with resources using the SDK.
-
The one big difference between documents and other resources is that, of course, they’re schema free.
-
Thus there are a lot of options. Naturally, you can work with JSON object graphs or even raw strings of JSON text, but you can also use dynamic objects that let you bind to properties at runtime without defining a class at compile time.
-
You can also work with real C# objects, or entities as they are called, which might be your business domain classes.
Let’s start creating documents using the .NET SDK. Following are the steps.
Step 1 − Instantiate DocumentClient, then query for the myfirstdb database and for the MyCollection collection. We store the collection in the private variable collection so that it’s accessible throughout the class.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
database = client.CreateDatabaseQuery(
"SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
collection = client.CreateDocumentCollectionQuery(database.CollectionsLink,
"SELECT * FROM c WHERE c.id = 'MyCollection'").AsEnumerable().First();
await CreateDocuments(client);
}
}
Step 2 − Create some documents in the CreateDocuments task.
private async static Task CreateDocuments(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Create Documents ****");
Console.WriteLine();
dynamic document1Definition = new {
name = "New Customer 1", address = new {
addressType = "Main Office",
addressLine1 = "123 Main Street",
location = new {
city = "Brooklyn", stateProvinceName = "New York"
}, postalCode = "11229", countryRegionName = "United States"
},
};
Document document1 = await CreateDocument(client, document1Definition);
Console.WriteLine("Created document {0} from dynamic object", document1.Id);
Console.WriteLine();
}
The first document will be generated from this dynamic object. This might look like JSON, but of course it isn’t. This is C# code that creates a real .NET object without a class definition; the properties are inferred from the way the object is initialized.
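To see this inference at work without touching DocumentDB, here is a small self-contained sketch. DynamicDefinitionSketch and its methods are hypothetical names used only for illustration; the object shape is a shortened version of the customer definition above.

```csharp
using System;

public static class DynamicDefinitionSketch {
    // Builds an object with no class definition; the compiler infers an anonymous
    // type from the initializer, and `dynamic` defers member lookup to run time.
    public static dynamic BuildCustomer() {
        return new {
            name = "New Customer 1",
            address = new {
                addressType = "Main Office",
                location = new { city = "Brooklyn", stateProvinceName = "New York" }
            }
        };
    }

    public static string DescribeCity(dynamic customer) {
        // Resolved at run time; a misspelled property would throw a RuntimeBinderException.
        string city = customer.address.location.city;
        return customer.name + " is in " + city;
    }
}
```

Calling DescribeCity(BuildCustomer()) walks the nested properties at run time. This works here because the anonymous type and the code that reads it live in the same assembly.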
Notice that we haven’t supplied an Id property for this document.
Now let’s have a look at CreateDocument. It follows the same pattern we saw for creating databases and collections.
private async static Task<Document> CreateDocument(DocumentClient client,
object documentObject) {
var result = await client.CreateDocumentAsync(collection.SelfLink, documentObject);
var document = result.Resource;
Console.WriteLine("Created new document: {0}\r\n{1}", document.Id, document);
return document;
}
Step 3 − This time we call CreateDocumentAsync, specifying the SelfLink of the collection we want to add the document to. We get back a response with a Resource property that, in this case, represents the new document with its system-generated properties.
The Document object is a class defined in the SDK that inherits from Resource, so it has all the common resource properties, but it also includes the dynamic properties that define the schema-free document itself.
When the above code is compiled and executed, you will receive the following output.
**** Create Documents ****
Created new document: 34e9873a-94c8-4720-9146-d63fb7840fad {
"name": "New Customer 1",
"address": {
"addressType": "Main Office",
"addressLine1": "123 Main Street",
"location": {
"city": "Brooklyn", "stateProvinceName": "New York"
},
"postalCode": "11229", "countryRegionName": "United States"
},
"id": "34e9873a-94c8-4720-9146-d63fb7840fad",
"_rid": "Ic8LAMEUVgACAAAAAAAAAA==",
"_ts": 1449812756,
"_self": "dbs/Ic8LAA==/colls/Ic8LAMEUVgA=/docs/Ic8LAMEUVgACAAAAAAAAAA==/",
"_etag": "\"00001000-0000-0000-0000-566a63140000\"",
"_attachments": "attachments/"
}
Created document 34e9873a-94c8-4720-9146-d63fb7840fad from dynamic object
As you can see, we didn’t supply an Id, so DocumentDB generated one for the new document.
DocumentDB - Query Document
In DocumentDB, we use SQL to query for documents, so this chapter is all about querying with DocumentDB’s special SQL syntax. If you are doing .NET development, there is also a LINQ provider that generates the appropriate SQL from a LINQ query.
Querying Document using Portal
The Azure portal has a Query Explorer that lets you run any SQL query against your DocumentDB database.
We will use the Query Explorer to demonstrate the many different capabilities and features of the query language, starting with the simplest possible query.
Step 1 − In the database blade, click to open the Query Explorer blade.

Remember that queries run within the scope of a collection, so the Query Explorer lets you choose the collection in this dropdown.

Step 2 − Select the Families collection, which was created earlier using the portal.
The Query Explorer opens with the simple query SELECT * FROM c, which retrieves all documents from the collection.
Step 3 − Execute this query by clicking the ‘Run query’ button. You will then see the complete document in the Results blade.

Querying Document using .Net SDK
Following are the steps to run some document queries using the .NET SDK.
In this example, we want to query for the documents that we just created.
Step 1 − Call CreateDocumentQuery, passing in the SelfLink of the collection to run the query against, along with the query text.
private async static Task QueryDocumentsWithPaging(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Query Documents (paged results) ****");
Console.WriteLine();
Console.WriteLine("Querying for all documents");
var sql = "SELECT * FROM c";
var query = client.CreateDocumentQuery(collection.SelfLink, sql).AsDocumentQuery();
while (query.HasMoreResults) {
var documents = await query.ExecuteNextAsync();
foreach (var document in documents) {
Console.WriteLine(" Id: {0}; Name: {1};", document.id, document.name);
}
}
Console.WriteLine();
}
This query also returns all the documents in the collection, but we’re not calling .ToList on CreateDocumentQuery as before, which would issue as many requests as necessary to pull down all the results in one line of code.
Step 2 − Instead, call AsDocumentQuery, which returns a query object with a HasMoreResults property.
Step 3 − While HasMoreResults is true, call ExecuteNextAsync to get the next chunk and then dump all the contents of that chunk.
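The shape of this paging loop can be tried out without a DocumentDB account. The sketch below uses a hypothetical in-memory class, PagedQuerySketch, that only imitates the HasMoreResults and execute-next pattern; it is not an SDK type, it is synchronous, and its page size is an illustrative assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a paged query object; it is not an SDK type.
public sealed class PagedQuerySketch<T> {
    private readonly IReadOnlyList<T> _items;
    private readonly int _pageSize;
    private int _position;

    public PagedQuerySketch(IEnumerable<T> items, int pageSize) {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        _items = items.ToList();
        _pageSize = pageSize;
    }

    // True until every item has been handed out.
    public bool HasMoreResults { get { return _position < _items.Count; } }

    // Returns the next chunk of at most pageSize items and advances the cursor.
    public IReadOnlyList<T> ExecuteNext() {
        var page = _items.Skip(_position).Take(_pageSize).ToList();
        _position += page.Count;
        return page;
    }
}

public static class PagingDemo {
    // Drains the query chunk by chunk and reports how many chunks were fetched.
    public static int CountPages<T>(PagedQuerySketch<T> query, Action<T> handle) {
        var pages = 0;
        while (query.HasMoreResults) {
            foreach (var item in query.ExecuteNext()) handle(item);
            pages++;
        }
        return pages;
    }
}
```

With five items and a page size of two, the loop fetches three chunks. The point of the pattern is that each chunk can be processed as it arrives instead of holding the whole result set in memory.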
Step 4 − You can also query using LINQ instead of SQL if you prefer. Here we define a LINQ query in q, but it won’t execute until we call .ToList on it. The query is written against a Customer class that mirrors the document’s properties.
private static void QueryDocumentsWithLinq(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Query Documents (LINQ) ****");
Console.WriteLine();
Console.WriteLine("Querying for US customers (LINQ)");
var q =
from d in client.CreateDocumentQuery<Customer>(collection.DocumentsLink)
where d.Address.CountryRegionName == "United States"
select new {
Id = d.Id,
Name = d.Name,
City = d.Address.Location.City
};
var documents = q.ToList();
Console.WriteLine("Found {0} US customers", documents.Count);
foreach (var document in documents) {
var d = document as dynamic;
Console.WriteLine(" Id: {0}; Name: {1}; City: {2}", d.Id, d.Name, d.City);
}
Console.WriteLine();
}
The SDK converts our LINQ query into DocumentDB’s SQL syntax, generating SELECT and WHERE clauses based on the LINQ expression.
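The LINQ query refers to a Customer type that this chapter does not define. The following is one plausible shape for it, written as an assumption: the class and property names are inferred from the query and from the customer documents created earlier, and the JsonProperty attributes (from Newtonsoft.Json, which Program.cs already imports) are used on the expectation that the SDK honors them when mapping the PascalCase C# properties to the camelCase JSON names.

```csharp
using Newtonsoft.Json;

// Assumed shape of the Customer entity used by the LINQ query; the tutorial does not show it.
public class Customer {
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }

    [JsonProperty(PropertyName = "name")]
    public string Name { get; set; }

    [JsonProperty(PropertyName = "address")]
    public Address Address { get; set; }
}

public class Address {
    [JsonProperty(PropertyName = "addressType")]
    public string AddressType { get; set; }

    [JsonProperty(PropertyName = "addressLine1")]
    public string AddressLine1 { get; set; }

    [JsonProperty(PropertyName = "location")]
    public Location Location { get; set; }

    [JsonProperty(PropertyName = "postalCode")]
    public string PostalCode { get; set; }

    [JsonProperty(PropertyName = "countryRegionName")]
    public string CountryRegionName { get; set; }
}

public class Location {
    [JsonProperty(PropertyName = "city")]
    public string City { get; set; }

    [JsonProperty(PropertyName = "stateProvinceName")]
    public string StateProvinceName { get; set; }
}
```

Without such a mapping, a property like CountryRegionName would not line up with the countryRegionName property stored in the documents, and the WHERE clause would be likely to match nothing.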
Step 5 − Now call the above queries from the CreateDocumentClient task.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
database = client.CreateDatabaseQuery(
"SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
collection = client.CreateDocumentCollectionQuery(database.CollectionsLink,
"SELECT * FROM c WHERE c.id = 'MyCollection'").AsEnumerable().First();
//await CreateDocuments(client);
await QueryDocumentsWithPaging(client);
QueryDocumentsWithLinq(client);
}
}
When the above code is executed, you will receive the following output.
**** Query Documents (paged results) ****
Querying for all documents
Id: 7e9ad4fa-c432-4d1a-b120-58fd7113609f; Name: New Customer 1;
Id: 34e9873a-94c8-4720-9146-d63fb7840fad; Name: New Customer 1;
**** Query Documents (LINQ) ****
Querying for US customers (LINQ)
Found 2 US customers
Id: 7e9ad4fa-c432-4d1a-b120-58fd7113609f; Name: New Customer 1; City: Brooklyn
Id: 34e9873a-94c8-4720-9146-d63fb7840fad; Name: New Customer 1; City: Brooklyn
DocumentDB - Update Document
In this chapter, we will learn how to update documents. Using the Azure portal, you can easily update a document by opening it in Document Explorer and editing it in the editor like a text file.

Click the ‘Save’ button. When you need to change a document using the .NET SDK, you can simply replace it. You don’t need to delete and recreate it, which, besides being tedious, would also change the resource id, something you don’t want when you’re just modifying a document. Following are the steps to update a document using the .NET SDK.
Let’s take a look at the following ReplaceDocuments task. It first queries for documents whose isNew property is true and gets none, because there aren’t any yet. So, let’s modify the documents we added earlier, those whose names start with New Customer.
Step 1 − Add the isNew property to these documents and set its value to true.
private async static Task ReplaceDocuments(DocumentClient client) {
Console.WriteLine();
Console.WriteLine(">>> Replace Documents <<<");
Console.WriteLine();
Console.WriteLine("Querying for documents with 'isNew' flag");
var sql = "SELECT * FROM c WHERE c.isNew = true";
var documents = client.CreateDocumentQuery(collection.SelfLink, sql).ToList();
Console.WriteLine("Documents with 'isNew' flag: {0} ", documents.Count);
Console.WriteLine();
Console.WriteLine("Querying for documents to be updated");
sql = "SELECT * FROM c WHERE STARTSWITH(c.name, 'New Customer') = true";
documents = client.CreateDocumentQuery(collection.SelfLink, sql).ToList();
Console.WriteLine("Found {0} documents to be updated", documents.Count);
foreach (var document in documents) {
document.isNew = true;
var result = await client.ReplaceDocumentAsync(document._self, document);
var updatedDocument = result.Resource;
Console.WriteLine("Updated document 'isNew' flag: {0}", updatedDocument.isNew);
}
Console.WriteLine();
Console.WriteLine("Querying for documents with 'isNew' flag");
sql = "SELECT * FROM c WHERE c.isNew = true";
documents = client.CreateDocumentQuery(collection.SelfLink, sql).ToList();
Console.WriteLine("Documents with 'isNew' flag: {0} ", documents.Count);
Console.WriteLine();
}
Step 2 − Get the documents to be updated using the STARTSWITH query, which returns them as dynamic objects.
Step 3 − Attach the isNew property to each document and set it to true.
Step 4 − Call ReplaceDocumentAsync, passing in the document’s SelfLink along with the updated document.
To prove that this worked, query again for documents where isNew equals true. Let’s call ReplaceDocuments from the CreateDocumentClient task.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
database = client.CreateDatabaseQuery(
"SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
collection = client.CreateDocumentCollectionQuery(database.CollectionsLink,
"SELECT * FROM c WHERE c.id = 'MyCollection'").AsEnumerable().First();
//await CreateDocuments(client);
//QueryDocumentsWithSql(client);
//await QueryDocumentsWithPaging(client);
//QueryDocumentsWithLinq(client);
await ReplaceDocuments(client);
}
}
When the above code is compiled and executed, you will receive the following output.
>>> Replace Documents <<<
Querying for documents with 'isNew' flag
Documents with 'isNew' flag: 0
Querying for documents to be updated
Found 2 documents to be updated
Updated document 'isNew' flag: True
Updated document 'isNew' flag: True
Querying for documents with 'isNew' flag
Documents with 'isNew' flag: 2
DocumentDB - Delete Document
In this chapter, we will learn how to delete a document from your DocumentDB account. Using the Azure portal, you can easily delete any document by opening it in Document Explorer and clicking the ‘Delete’ option.


A confirmation message is displayed. Press the Yes button, and you will see that the document is no longer available in your DocumentDB account.
Following are the steps to delete documents using the .NET SDK.
Step 1 − It’s the same pattern we’ve seen before: query first to get the SelfLink of each document to delete. We don’t use SELECT * here, because it would return the documents in their entirety, which we don’t need.
Step 2 − Instead, we select just the SelfLinks into a list and then call DeleteDocumentAsync for each SelfLink, one at a time, to delete the documents from the collection.
private async static Task DeleteDocuments(DocumentClient client) {
    Console.WriteLine();
    Console.WriteLine(">>> Delete Documents <<<");
    Console.WriteLine();
    Console.WriteLine("Quering for documents to be deleted");
    // Select only the self-links; the full documents are not needed for deletion
    var sql =
        "SELECT VALUE c._self FROM c WHERE STARTSWITH(c.name, 'New Customer') = true";
    var documentLinks =
        client.CreateDocumentQuery<string>(collection.SelfLink, sql).ToList();
    Console.WriteLine("Found {0} documents to be deleted", documentLinks.Count);
    // Delete the documents one at a time
    foreach (var documentLink in documentLinks) {
        await client.DeleteDocumentAsync(documentLink);
    }
    Console.WriteLine("Deleted {0} new customer documents", documentLinks.Count);
    Console.WriteLine();
}
Step 3 − Now let's call the above DeleteDocuments method from the CreateDocumentClient task.
private static async Task CreateDocumentClient() {
    // Create a new instance of the DocumentClient
    using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
        database = client.CreateDatabaseQuery(
            "SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
        collection = client.CreateDocumentCollectionQuery(database.CollectionsLink,
            "SELECT * FROM c WHERE c.id = 'MyCollection'").AsEnumerable().First();
        await DeleteDocuments(client);
    }
}
When the above code is executed, you will receive the following output.
>>> Delete Documents <<<
Quering for documents to be deleted
Found 2 documents to be deleted
Deleted 2 new customer documents
DocumentDB - Data Modeling
While schema-free databases like DocumentDB make it very easy to embrace changes to your data model, you should still spend some time thinking about your data.
-
You have a lot of options. Naturally, you can work with JSON object graphs or even raw strings of JSON text, but you can also use dynamic objects that let you bind to properties at runtime without defining a class at compile time.
-
You can also work with real C# objects, or Entities as they are called, which might be your business domain classes.
Relationships
Let's take a look at the document's hierarchical structure. It has a few top-level properties, such as the required id as well as lastName and isRegistered, but it also has nested properties.
{
    "id": "AndersenFamily",
    "lastName": "Andersen",
    "parents": [
        { "firstName": "Thomas", "relationship": "father" },
        { "firstName": "Mary Kay", "relationship": "mother" }
    ],
    "children": [
        {
            "firstName": "Henriette Thaulow",
            "gender": "female",
            "grade": 5,
            "pets": [ { "givenName": "Fluffy", "type": "Rabbit" } ]
        }
    ],
    "location": { "state": "WA", "county": "King", "city": "Seattle" },
    "isRegistered": true
}
-
For instance, the parents property is supplied as a JSON array as denoted by the square brackets.
-
We also have another array for children, even though there’s only one child in the array in this example. So this is how you model the equivalent of one-to-many relationships within a document.
-
You simply use arrays where each element in the array could be a simple value or another complex object, even another array.
-
So one family can have multiple parents and multiple children. If you look at the child objects, each has a pets property that is itself a nested array, modeling a one-to-many relationship between children and pets.
-
For the location property, we’re combining three related properties, the state, county, and city into an object.
-
Embedding an object this way rather than embedding an array of objects is similar to having a one-to-one relationship between two rows in separate tables in a relational database.
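As a rough illustration of how these embedded relationships might look from C#, the sketch below mirrors the document above with plain classes. The class names (Family, Parent, Child, Pet, Location) and the RelationshipSketch helper are hypothetical and are not part of the DocumentDB SDK. Arrays become lists for the one-to-many relationships, and a single nested object stands in for the one-to-one case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical classes (not from the SDK) that mirror the AndersenFamily document.
public class Parent { public string FirstName { get; set; } public string Relationship { get; set; } }
public class Pet { public string GivenName { get; set; } public string Type { get; set; } }
public class Location { public string State { get; set; } public string County { get; set; } public string City { get; set; } }

public class Child {
    public string FirstName { get; set; }
    public string Gender { get; set; }
    public int Grade { get; set; }
    // One-to-many: each child embeds its own list of pets.
    public List<Pet> Pets { get; set; } = new List<Pet>();
}

public class Family {
    public string Id { get; set; }
    public string LastName { get; set; }
    // One-to-many: arrays in the JSON document become lists here.
    public List<Parent> Parents { get; set; } = new List<Parent>();
    public List<Child> Children { get; set; } = new List<Child>();
    // One-to-one: a single embedded object rather than an array.
    public Location Location { get; set; }
    public bool IsRegistered { get; set; }
}

public static class RelationshipSketch {
    public static Family BuildAndersenFamily() {
        return new Family {
            Id = "AndersenFamily",
            LastName = "Andersen",
            Parents = {
                new Parent { FirstName = "Thomas", Relationship = "father" },
                new Parent { FirstName = "Mary Kay", Relationship = "mother" }
            },
            Children = {
                new Child {
                    FirstName = "Henriette Thaulow", Gender = "female", Grade = 5,
                    Pets = { new Pet { GivenName = "Fluffy", Type = "Rabbit" } }
                }
            },
            Location = new Location { State = "WA", County = "King", City = "Seattle" },
            IsRegistered = true
        };
    }

    public static void Main() {
        Family family = BuildAndersenFamily();
        int petCount = family.Children.Sum(child => child.Pets.Count);
        Console.WriteLine("Parents: {0}, children: {1}, pets: {2}",
            family.Parents.Count, family.Children.Count, petCount);
    }
}
```

Typed classes like these are only one option; the same document could equally be handled as a dynamic object or a raw JSON string.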
Embedding Data
When you start modeling data in a document store such as DocumentDB, try to treat your entities as self-contained documents represented in JSON. When working with relational databases, by contrast, we typically normalize data.
-
Normalizing your data typically involves taking an entity, such as a customer, and breaking it down into discrete pieces of data, like contact details and addresses.
-
To read a customer, with all their contact details and addresses, you need to use JOINS to effectively aggregate your data at run time.
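To make the read-time cost of normalization concrete, here is a minimal in-memory sketch that reassembles one customer from three separate lists using LINQ joins. The row classes, the NormalizedReadSketch helper, and the sample values are all hypothetical stand-ins for relational tables; no database is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows standing in for three normalized relational tables.
public class CustomerRow { public int Id; public string FirstName; public string LastName; }
public class AddressRow { public int CustomerId; public string Line1; public string City; }
public class ContactRow { public int CustomerId; public string Kind; public string Value; }

public static class NormalizedReadSketch {
    // Reassembles one customer by joining the three "tables" at read time.
    public static string DescribeCustomer(int customerId, List<CustomerRow> customers,
            List<AddressRow> addresses, List<ContactRow> contacts) {
        var joined =
            from c in customers
            where c.Id == customerId
            join a in addresses on c.Id equals a.CustomerId into customerAddresses
            join d in contacts on c.Id equals d.CustomerId into customerContacts
            select new {
                c.FirstName,
                c.LastName,
                AddressCount = customerAddresses.Count(),
                ContactCount = customerContacts.Count()
            };
        var result = joined.Single();
        return string.Format("{0} {1}: {2} address(es), {3} contact detail(s)",
            result.FirstName, result.LastName, result.AddressCount, result.ContactCount);
    }

    public static void Main() {
        var customers = new List<CustomerRow> {
            new CustomerRow { Id = 1, FirstName = "Mark", LastName = "Upston" }
        };
        var addresses = new List<AddressRow> {
            new AddressRow { CustomerId = 1, Line1 = "232 Main Street", City = "Brooklyn" }
        };
        var contacts = new List<ContactRow> {
            new ContactRow { CustomerId = 1, Kind = "email", Value = "mark.upston@xyz.com" },
            new ContactRow { CustomerId = 1, Kind = "phone", Value = "+1 356 545-86455" }
        };
        Console.WriteLine(DescribeCustomer(1, customers, addresses, contacts));
    }
}
```

Every read has to perform these joins again, which is the work a self-contained document avoids.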
Now let's take a look at how we would model the same data as a self-contained entity in a document database.
{
    "id": "1",
    "firstName": "Mark",
    "lastName": "Upston",
    "addresses": [
        {
            "line1": "232 Main Street",
            "line2": "Unit 1",
            "city": "Brooklyn",
            "state": "NY",
            "zip": 11229
        }
    ],
    "contactDetails": [
        {"email": "mark.upston@xyz.com"},
        {"phone": "+1 356 545-86455", "extension": 5555}
    ]
}
As you can see, we have denormalized the customer record: all of the customer's information is embedded in a single JSON document.
Because the schema is free-form, contact details and addresses can also be stored in different formats. You can retrieve a customer record from the database in a single read operation, and updating a record is likewise a single write operation.
Following are the steps to create documents using the .NET SDK.
Step 1 − Instantiate DocumentClient. Then we query for the myfirstdb database and for the MyCollection collection, which we store in the private variable collection so that it's accessible throughout the class.
private static async Task CreateDocumentClient() {
    // Create a new instance of the DocumentClient
    using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
        database = client.CreateDatabaseQuery(
            "SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
        collection = client.CreateDocumentCollectionQuery(database.CollectionsLink,
            "SELECT * FROM c WHERE c.id = 'MyCollection'").AsEnumerable().First();
        await CreateDocuments(client);
    }
}
Step 2 − Create some documents in the CreateDocuments task.
private async static Task CreateDocuments(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Create Documents ****");
Console.WriteLine();
dynamic document1Definition = new {
name = "New Customer 1", address = new {
addressType = "Main Office",
addressLine1 = "123 Main Street",
location = new {
city = "Brooklyn", stateProvinceName = "New York"
},
postalCode = "11229", countryRegionName = "United States"
},
};
Document document1 = await CreateDocument(client, document1Definition);
Console.WriteLine("Created document {0} from dynamic object", document1.Id);
Console.WriteLine();
}
The first document is generated from this dynamic object. It might look like JSON, but it isn't. This is C# code, and we're creating a real .NET object without a class definition; the properties are inferred from the way the object is initialized. Notice also that we haven't supplied an Id property for this document.
Step 3 − Now let's take a look at CreateDocument, which follows the same pattern we saw for creating databases and collections.
private async static Task<Document> CreateDocument(DocumentClient client,
    object documentObject) {
    var result = await client.CreateDocumentAsync(collection.SelfLink, documentObject);
    // The response's Resource property holds the newly created document
    var document = result.Resource;
    Console.WriteLine("Created new document: {0}\r\n{1}", document.Id, document);
    return document;
}
Step 4 − This time we call CreateDocumentAsync, specifying the SelfLink of the collection we want to add the document to. We get back a response with a Resource property that, in this case, represents the new document with its system-generated properties.
In the following CreateDocuments task, we create three documents.
-
For the first document, note that Document is a class defined in the SDK that inherits from Resource, so it has all the common resource properties, but it also includes the dynamic properties that define the schema-free document itself.
private async static Task CreateDocuments(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Create Documents ****");
Console.WriteLine();
dynamic document1Definition = new {
name = "New Customer 1", address = new {
addressType = "Main Office",
addressLine1 = "123 Main Street",
location = new {
city = "Brooklyn", stateProvinceName = "New York"
},
postalCode = "11229",
countryRegionName = "United States"
},
};
Document document1 = await CreateDocument(client, document1Definition);
Console.WriteLine("Created document {0} from dynamic object", document1.Id);
Console.WriteLine();
var document2Definition = @" {
""name"": ""New Customer 2"",
""address"": {
""addressType"": ""Main Office"",
""addressLine1"": ""123 Main Street"",
""location"": {
""city"": ""Brooklyn"", ""stateProvinceName"": ""New York""
},
""postalCode"": ""11229"",
""countryRegionName"": ""United States""
}
}";
Document document2 = await CreateDocument(client, document2Definition);
Console.WriteLine("Created document {0} from JSON string", document2.Id);
Console.WriteLine();
var document3Definition = new Customer {
Name = "New Customer 3",
Address = new Address {
AddressType = "Main Office",
AddressLine1 = "123 Main Street",
Location = new Location {
City = "Brooklyn", StateProvinceName = "New York"
},
PostalCode = "11229",
CountryRegionName = "United States"
},
};
Document document3 = await CreateDocument(client, document3Definition);
Console.WriteLine("Created document {0} from typed object", document3.Id);
Console.WriteLine();
}
-
This second document just works with a raw JSON string. Now we step into an overload for CreateDocument that uses the JavaScriptSerializer to de-serialize the string into an object, which it then passes on to the same CreateDocument method that we used to create the first document.
-
In the third document, we have used the C# object Customer which is defined in our application.
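The string-based version of CreateDocument mentioned for the second document is not listed in this chapter. The following is only a sketch of what such a helper might look like, assuming JavaScriptSerializer from the .NET Framework (System.Web.Extensions assembly) as the text describes. The class name CreateDocumentSketch, the method name CreateDocumentFromJson, and the explicit collectionLink parameter are choices made for this illustration and are not taken from the original sample.

```csharp
using System.Threading.Tasks;
using System.Web.Script.Serialization; // JavaScriptSerializer (.NET Framework only)
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class CreateDocumentSketch {
    // Object-based version, equivalent to the CreateDocument method shown earlier,
    // except that the collection link is passed in explicitly (an assumption of this sketch).
    public static async Task<Document> CreateDocument(DocumentClient client,
            string collectionLink, object documentObject) {
        ResourceResponse<Document> result =
            await client.CreateDocumentAsync(collectionLink, documentObject);
        return result.Resource;
    }

    // String-based version: deserialize the raw JSON text into an object graph,
    // then hand it to the object-based method above.
    public static Task<Document> CreateDocumentFromJson(DocumentClient client,
            string collectionLink, string documentJson) {
        object documentObject = new JavaScriptSerializer().Deserialize<object>(documentJson);
        return CreateDocument(client, collectionLink, documentObject);
    }
}
```

Running this requires the DocumentDB SDK package and a reachable account, so it is shown for shape rather than as a tested implementation.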
Let's take a look at this Customer class. It has Id, Name, and Address properties, where Address is a nested object with its own properties, including Location, which is yet another nested object.
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace DocumentDBDemo {
public class Customer {
[JsonProperty(PropertyName = "id")]
public string Id { get; set; }
// Must be nullable, unless generating unique values for new customers on client
[JsonProperty(PropertyName = "name")]
public string Name { get; set; }
[JsonProperty(PropertyName = "address")]
public Address Address { get; set; }
}
public class Address {
[JsonProperty(PropertyName = "addressType")]
public string AddressType { get; set; }
[JsonProperty(PropertyName = "addressLine1")]
public string AddressLine1 { get; set; }
[JsonProperty(PropertyName = "location")]
public Location Location { get; set; }
[JsonProperty(PropertyName = "postalCode")]
public string PostalCode { get; set; }
[JsonProperty(PropertyName = "countryRegionName")]
public string CountryRegionName { get; set; }
}
public class Location {
[JsonProperty(PropertyName = "city")]
public string City { get; set; }
[JsonProperty(PropertyName = "stateProvinceName")]
public string StateProvinceName { get; set; }
}
}
We also have JsonProperty attributes in place because we want to maintain proper naming conventions on both sides of the fence: PascalCase property names in C# and camelCase names in JSON.
So we just create the new Customer object along with its nested child objects and call CreateDocument once more. Although the Customer object does have an Id property, we didn't supply a value for it, so DocumentDB generated one as a GUID, just as it did for the previous two documents.
When the above code is compiled and executed, you will receive the following output.
**** Create Documents ****
Created new document: 575882f0-236c-4c3d-81b9-d27780206b2c
{
"name": "New Customer 1",
"address": {
"addressType": "Main Office",
"addressLine1": "123 Main Street",
"location": {
"city": "Brooklyn",
"stateProvinceName": "New York"
},
"postalCode": "11229",
"countryRegionName": "United States"
},
"id": "575882f0-236c-4c3d-81b9-d27780206b2c",
"_rid": "kV5oANVXnwDGPgAAAAAAAA==",
"_ts": 1450037545,
"_self": "dbs/kV5oAA==/colls/kV5oANVXnwA=/docs/kV5oANVXnwDGPgAAAAAAAA==/",
"_etag": "\"00006fce-0000-0000-0000-566dd1290000\"",
"_attachments": "attachments/"
}
Created document 575882f0-236c-4c3d-81b9-d27780206b2c from dynamic object
Created new document: 8d7ad239-2148-4fab-901b-17a85d331056
{
"name": "New Customer 2",
"address": {
"addressType": "Main Office",
"addressLine1": "123 Main Street",
"location": {
"city": "Brooklyn",
"stateProvinceName": "New York"
},
"postalCode": "11229",
"countryRegionName": "United States"
},
"id": "8d7ad239-2148-4fab-901b-17a85d331056",
"_rid": "kV5oANVXnwDHPgAAAAAAAA==",
"_ts": 1450037545,
"_self": "dbs/kV5oAA==/colls/kV5oANVXnwA=/docs/kV5oANVXnwDHPgAAAAAAAA==/",
"_etag": "\"000070ce-0000-0000-0000-566dd1290000\"",
"_attachments": "attachments/"
}
Created document 8d7ad239-2148-4fab-901b-17a85d331056 from JSON string
Created new document: 49f399a8-80c9-4844-ac28-cd1dee689968
{
"id": "49f399a8-80c9-4844-ac28-cd1dee689968",
"name": "New Customer 3",
"address": {
"addressType": "Main Office",
"addressLine1": "123 Main Street",
"location": {
"city": "Brooklyn",
"stateProvinceName": "New York"
},
"postalCode": "11229",
"countryRegionName": "United States"
},
"_rid": "kV5oANVXnwDIPgAAAAAAAA==",
"_ts": 1450037546,
"_self": "dbs/kV5oAA==/colls/kV5oANVXnwA=/docs/kV5oANVXnwDIPgAAAAAAAA==/",
"_etag": "\"000071ce-0000-0000-0000-566dd12a0000\"",
"_attachments": "attachments/"
}
Created document 49f399a8-80c9-4844-ac28-cd1dee689968 from typed object
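The ids in the output above were generated automatically because none were supplied. If you would rather control ids on the client, one possible approach is to fill in a GUID yourself before creating the document. The helper below is a hypothetical illustration (ClientSideIdSketch and EnsureId are not SDK members) and uses only the base class library.

```csharp
using System;

public static class ClientSideIdSketch {
    // Returns the supplied id, or a new GUID string when none was given.
    public static string EnsureId(string id) {
        return string.IsNullOrEmpty(id) ? Guid.NewGuid().ToString() : id;
    }

    public static void Main() {
        Console.WriteLine(EnsureId(null));    // a freshly generated GUID, different on every run
        Console.WriteLine(EnsureId("1001"));  // an id supplied by the caller is kept as-is
    }
}
```

With the Customer class above, this could be applied as customer.Id = ClientSideIdSketch.EnsureId(customer.Id) before calling CreateDocument.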
DocumentDB - Data Types
JSON, or JavaScript Object Notation, is a lightweight, text-based open standard designed for human-readable data interchange that is also easy for machines to parse and generate. JSON is at the heart of DocumentDB. We transmit JSON over the wire, we store JSON as JSON, and we index the JSON tree, which allows queries on the full JSON document.
The JSON format supports the following data types −
S.No. | Type | Description
1 | Number | Double-precision floating-point format in JavaScript
2 | String | Double-quoted Unicode with backslash escaping
3 | Boolean | true or false
4 | Array | An ordered sequence of values
5 | Value | A string, a number, true or false, null, etc.
6 | Object | An unordered collection of key:value pairs
7 | Whitespace | Can be used between any pair of tokens
8 | Null | Empty
Let's take a look at a simple example that uses the DateTime type. Add a birth date to the Customer class.
public class Customer {
[JsonProperty(PropertyName = "id")]
public string Id { get; set; }
// Must be nullable, unless generating unique values for new customers on client
[JsonProperty(PropertyName = "name")]
public string Name { get; set; }
[JsonProperty(PropertyName = "address")]
public Address Address { get; set; }
[JsonProperty(PropertyName = "birthDate")]
public DateTime BirthDate { get; set; }
}
We can store, retrieve, and query using DateTime as shown in the following code.
private async static Task CreateDocuments(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Create Documents ****");
Console.WriteLine();
var document3Definition = new Customer {
Id = "1001",
Name = "Luke Andrew",
Address = new Address {
AddressType = "Main Office",
AddressLine1 = "123 Main Street",
Location = new Location {
City = "Brooklyn",
StateProvinceName = "New York"
},
PostalCode = "11229",
CountryRegionName = "United States"
},
BirthDate = DateTime.Today,
};
Document document3 = await CreateDocument(client, document3Definition);
Console.WriteLine("Created document {0} from typed object", document3.Id);
Console.WriteLine();
}
When the above code is compiled and executed and the document is created, you will see that the birth date is now included.
**** Create Documents ****
Created new document: 1001
{
"id": "1001",
"name": "Luke Andrew",
"address": {
"addressType": "Main Office",
"addressLine1": "123 Main Street",
"location": {
"city": "Brooklyn",
"stateProvinceName": "New York"
},
"postalCode": "11229",
"countryRegionName": "United States"
},
"birthDate": "2015-12-14T00:00:00",
"_rid": "Ic8LAMEUVgAKAAAAAAAAAA==",
"_ts": 1450113676,
"_self": "dbs/Ic8LAA==/colls/Ic8LAMEUVgA=/docs/Ic8LAMEUVgAKAAAAAAAAAA==/",
"_etag": "\"00002d00-0000-0000-0000-566efa8c0000\"",
"_attachments": "attachments/"
}
Created document 1001 from typed object
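JSON has no date type, so the birthDate value above is stored as a string. The sketch below reproduces that string format with the base class library only, parses it back, and shows that strings in this format sort in chronological order under ordinal comparison. DateTimeSketch and ToIso are hypothetical names, and the format string is inferred from the output above rather than from SDK documentation.

```csharp
using System;
using System.Globalization;

public static class DateTimeSketch {
    // Same shape as the stored value above: yyyy-MM-ddTHH:mm:ss
    private const string IsoFormat = "yyyy-MM-ddTHH:mm:ss";

    public static string ToIso(DateTime value) {
        return value.ToString(IsoFormat, CultureInfo.InvariantCulture);
    }

    public static DateTime FromIso(string text) {
        return DateTime.ParseExact(text, IsoFormat, CultureInfo.InvariantCulture);
    }

    public static void Main() {
        var birthDate = new DateTime(2015, 12, 14);
        string stored = ToIso(birthDate);
        Console.WriteLine(stored);                       // 2015-12-14T00:00:00
        Console.WriteLine(FromIso(stored) == birthDate); // True

        // Ordinal string order matches chronological order for this format.
        string earlier = ToIso(new DateTime(2015, 1, 31));
        Console.WriteLine(string.CompareOrdinal(earlier, stored) < 0); // True
    }
}
```

This ordering property is what makes comparisons on date strings behave sensibly, provided every document uses the same format.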
DocumentDB - Limiting Records
Microsoft has added a number of improvements to how you can query Azure DocumentDB: the TOP keyword in the SQL grammar, which lets queries run faster and consume fewer resources; increased limits for query operators; and support for additional LINQ operators in the .NET SDK.
Let's take a look at a simple example in which we retrieve only the first two records. If you have a number of records and want to retrieve only some of them, you can use the TOP keyword. In this example, we have a lot of earthquake records.

Now we want to show only the first two records.
Step 1 − Go to Query Explorer and run this query.
SELECT * FROM c
WHERE c.magnitude > 2.5
You will see that it has retrieved four records, because we have not specified the TOP keyword yet.

Step 2 − Now use the TOP keyword with the same query. The value 2 after TOP means that we want only two records.
SELECT TOP 2 * FROM c
WHERE c.magnitude > 2.5
Step 3 − Now run this query, and you will see that only two records are retrieved.

Similarly, you can use the TOP keyword in code with the .NET SDK. Following is the implementation.
private async static Task QueryDocumentsWithPaging(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Query Documents (paged results) ****");
Console.WriteLine();
Console.WriteLine("Quering for all documents");
var sql = "SELECT TOP 3 * FROM c";
var query = client
.CreateDocumentQuery(collection.SelfLink, sql)
.AsDocumentQuery();
while (query.HasMoreResults) {
var documents = await query.ExecuteNextAsync();
foreach (var document in documents) {
Console.WriteLine(" PublicId: {0}; Magnitude: {1};", document.publicid,
document.magnitude);
}
}
Console.WriteLine();
}
Following is the CreateDocumentClient task, which instantiates the DocumentClient and looks up the earthquake database and the earthquakedata collection.
private static async Task CreateDocumentClient() {
    // Create a new instance of the DocumentClient
    using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
        database = client.CreateDatabaseQuery(
            "SELECT * FROM c WHERE c.id = 'earthquake'").AsEnumerable().First();
        collection = client.CreateDocumentCollectionQuery(database.CollectionsLink,
            "SELECT * FROM c WHERE c.id = 'earthquakedata'").AsEnumerable().First();
        await QueryDocumentsWithPaging(client);
    }
}
When the above code is compiled and executed, you will see that only three records are retrieved.
**** Query Documents (paged results) ****
Quering for all documents
PublicId: 2015p947400; Magnitude: 2.515176918;
PublicId: 2015p947373; Magnitude: 1.506774108;
PublicId: 2015p947329; Magnitude: 1.593394461;
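For readers who think in LINQ, the effect of combining a WHERE filter with TOP can be pictured as Where followed by Take. The sketch below runs purely in memory over a hand-built list; it does not exercise the SDK's LINQ provider, and the Quake class, the TopSketch helper, and the two "sample" records are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape for an earthquake record.
public class Quake {
    public string PublicId { get; set; }
    public double Magnitude { get; set; }
}

public static class TopSketch {
    // In-memory analogue of: SELECT TOP <top> * FROM c WHERE c.magnitude > <minMagnitude>
    public static List<Quake> TopAbove(IEnumerable<Quake> quakes, double minMagnitude, int top) {
        return quakes.Where(q => q.Magnitude > minMagnitude).Take(top).ToList();
    }

    public static void Main() {
        var quakes = new List<Quake> {
            new Quake { PublicId = "2015p947400", Magnitude = 2.515176918 },
            new Quake { PublicId = "2015p947373", Magnitude = 1.506774108 },
            new Quake { PublicId = "2015p947329", Magnitude = 1.593394461 },
            new Quake { PublicId = "sample-a", Magnitude = 3.1 },  // made-up record
            new Quake { PublicId = "sample-b", Magnitude = 4.2 }   // made-up record
        };
        // Three records exceed 2.5, but only the first two are returned.
        foreach (var quake in TopAbove(quakes, 2.5, 2)) {
            Console.WriteLine("PublicId: {0}; Magnitude: {1}", quake.PublicId, quake.Magnitude);
        }
    }
}
```

As with TOP, the records returned are simply the first matches encountered; no ordering is implied unless the query also sorts.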
DocumentDB - Sorting Records
Microsoft Azure DocumentDB supports querying JSON documents using SQL. You can sort the documents in a collection on numbers and strings by using an ORDER BY clause in your query. The clause can include an optional ASC or DESC argument to specify the order in which results are returned.
Let's take a look at the following example, in which we have a JSON document.
{
"id": "Food Menu",
"description": "Grapes, red or green (European type, such as Thompson seedless), raw",
"tags": [
{
"name": "grapes"
},
{
"name": "red or green (european type"
},
{
"name": "such as thompson seedless)"
},
{
"name": "raw"
}
],
"foodGroup": "Fruits and Fruit Juices",
"servings": [
{
"amount": 1,
"description": "cup",
"weightInGrams": 151
},
{
"amount": 10,
"description": "grapes",
"weightInGrams": 49
},
{
"amount": 1,
"description": "NLEA serving",
"weightInGrams": 126
}
]
}
Following is the SQL query to sort the results in descending order.
SELECT f.description, f.foodGroup,
f.servings[2].description AS servingDescription,
f.servings[2].weightInGrams AS servingWeight
FROM f
ORDER BY f.servings[2].weightInGrams DESC
When the above query is executed, you will receive the following output.
[
{
"description": "Grapes, red or green (European type, such as Thompson seedless), raw",
"foodGroup": "Fruits and Fruit Juices",
"servingDescription": "NLEA serving",
"servingWeight": 126
}
]
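With a single document in the collection, the effect of DESC is not visible in the output above. The in-memory sketch below shows the same sort key (the weight of the third serving) applied to several items with LINQ. It is an analogy only and does not call DocumentDB; the Food and Serving classes, the OrderBySketch helper, and the two extra food items are hypothetical. Skipping items with fewer than three servings is a choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for a food document and its servings.
public class Serving { public string Description; public int WeightInGrams; }
public class Food { public string Id; public List<Serving> Servings; }

public static class OrderBySketch {
    // In-memory analogue of: ORDER BY f.servings[2].weightInGrams ASC|DESC
    public static List<Food> SortByThirdServing(IEnumerable<Food> foods, bool descending) {
        // Items without a third serving are skipped (a choice made for this sketch).
        var candidates = foods.Where(f => f.Servings.Count > 2);
        var ordered = descending
            ? candidates.OrderByDescending(f => f.Servings[2].WeightInGrams)
            : candidates.OrderBy(f => f.Servings[2].WeightInGrams);
        return ordered.ToList();
    }

    private static Food MakeFood(string id, int thirdServingWeight) {
        return new Food {
            Id = id,
            Servings = new List<Serving> {
                new Serving { Description = "cup", WeightInGrams = 151 },
                new Serving { Description = "piece", WeightInGrams = 49 },
                new Serving { Description = "NLEA serving", WeightInGrams = thirdServingWeight }
            }
        };
    }

    public static List<Food> SampleFoods() {
        return new List<Food> {
            MakeFood("Grapes", 126),   // third serving weight from the document above
            MakeFood("Melon", 200),    // made-up item
            MakeFood("Berries", 50)    // made-up item
        };
    }

    public static void Main() {
        foreach (var food in SortByThirdServing(SampleFoods(), true)) {
            Console.WriteLine("{0}: {1}", food.Id, food.Servings[2].WeightInGrams);
        }
    }
}
```

Passing false for the descending argument gives the ascending order, which corresponds to ASC (the default when neither keyword is specified).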
DocumentDB - Indexing Records
By default, DocumentDB automatically indexes every property in a document as soon as the document is added to the database. However, you can take control and fine-tune your own indexing policy, which reduces storage and processing overhead when there are specific documents or properties that never need to be indexed.
The default indexing policy, which tells DocumentDB to index every property automatically, is suitable for many common scenarios. But you can also implement a custom policy that exercises fine control over exactly what gets indexed and what doesn't, along with other indexing behavior.
DocumentDB supports the following types of indexing −
-
Hash
-
Range
Hash
A hash index enables efficient querying for equality, that is, searching for documents where a given property equals an exact value rather than matching a range of values such as less than, greater than, or between.
You can still perform range queries on a property that has only a hash index, but DocumentDB cannot use the hash index to find matching documents and instead needs to scan each document sequentially to determine whether the range query should select it.
You won't be able to sort your documents with an ORDER BY clause on a property that has only a hash index.
Range
With a range index defined for a property, DocumentDB can efficiently query for documents against a range of values. A range index also allows you to sort the query results on that property using ORDER BY.
DocumentDB allows you to define both a hash and a range index on any or all properties, which enables efficient equality and range queries, as well as ORDER BY.
Indexing Policy
Every collection has an indexing policy that dictates which types of indexes are used for numbers and strings in every property of every document.
-
You can also control whether or not documents get indexed automatically as they are added to the collection.
-
Automatic indexing is enabled by default, but you can override that behavior when adding a document, telling DocumentDB not to index that particular document.
-
You can disable automatic indexing so that by default, documents are not indexed when added to the collection. Similarly, you can override this at the document level and instruct DocumentDB to index a particular document when adding it to the collection. This is known as manual indexing.
Include / Exclude Indexing
An indexing policy can also define which path or paths are included in or excluded from the index. This is useful if you know that there are certain parts of a document that you never query against and certain parts that you do.
In these cases, you can reduce indexing overhead by telling DocumentDB to index just those particular portions of each document added to the collection.
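As a hedged sketch of what such a custom policy might look like with the .NET SDK, the code below builds a collection definition that applies range indexes to strings and numbers on every path and then excludes one subtree. The collection id "customindexing" and the "/notes" property are hypothetical. The exact path syntax and the available index options have varied between SDK and service versions, so treat the values here as assumptions to check against the version you use.

```csharp
using System.Collections.ObjectModel;
using Microsoft.Azure.Documents;

public static class IndexingPolicySketch {
    public static DocumentCollection BuildCollectionDefinition() {
        // "customindexing" is a made-up collection id for this sketch.
        var collectionDefinition = new DocumentCollection { Id = "customindexing" };

        // Index every path with range indexes so that equality, range, and
        // ORDER BY queries on strings and numbers can all use the index.
        collectionDefinition.IndexingPolicy.IncludedPaths.Add(new IncludedPath {
            Path = "/*",
            Indexes = new Collection<Index> {
                new RangeIndex(DataType.String) { Precision = -1 },
                new RangeIndex(DataType.Number) { Precision = -1 }
            }
        });

        // Exclude a subtree that is never queried ("/notes" is a made-up property).
        collectionDefinition.IndexingPolicy.ExcludedPaths.Add(
            new ExcludedPath { Path = "/notes/*" });

        return collectionDefinition;
    }
}
```

The resulting definition would be passed to CreateDocumentCollectionAsync in the same way as the collection definitions in the examples that follow.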
Automatic Indexing
Let's take a look at a simple example of automatic indexing.
Step 1 − First we create a collection called autoindexing. Because we don't explicitly supply a policy, this collection uses the default indexing policy, which means that automatic indexing is enabled on it.
Here we use ID-based routing for the database self-link, so we don't need to know the database's resource ID or query for it before creating the collection. We can just use the database ID, which is mydb.
Step 2 − Now let's create two documents, both with the last name Upston.
private async static Task AutomaticIndexing(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Override Automatic Indexing ****");
// Create collection with automatic indexing
var collectionDefinition = new DocumentCollection {
Id = "autoindexing"
};
var collection = await client.CreateDocumentCollectionAsync("dbs/mydb",
collectionDefinition);
// Add a document (indexed)
dynamic indexedDocumentDefinition = new {
id = "MARK",
firstName = "Mark",
lastName = "Upston",
addressLine = "123 Main Street",
city = "Brooklyn",
state = "New York",
zip = "11229",
};
Document indexedDocument = await client
.CreateDocumentAsync("dbs/mydb/colls/autoindexing", indexedDocumentDefinition);
// Add another document (request no indexing)
dynamic unindexedDocumentDefinition = new {
id = "JANE",
firstName = "Jane",
lastName = "Upston",
addressLine = "123 Main Street",
city = "Brooklyn",
state = "New York",
zip = "11229",
};
Document unindexedDocument = await client
.CreateDocumentAsync("dbs/mydb/colls/autoindexing", unindexedDocumentDefinition,
new RequestOptions { IndexingDirective = IndexingDirective.Exclude });
//Unindexed document won't get returned when querying on non-ID (or selflink) property
var upstonDocs = client.CreateDocumentQuery("dbs/mydb/colls/autoindexing",
"SELECT * FROM c WHERE c.lastName = 'Upston'").ToList();
Console.WriteLine("Documents WHERE lastName = 'Upston': {0}", upstonDocs.Count);
// Unindexed document will get returned when using no WHERE clause
var allDocs = client.CreateDocumentQuery("dbs/mydb/colls/autoindexing",
"SELECT * FROM c").ToList();
Console.WriteLine("All documents: {0}", allDocs.Count);
// Unindexed document will get returned when querying by ID (or self-link) property
Document janeDoc = client.CreateDocumentQuery("dbs/mydb/colls/autoindexing",
"SELECT * FROM c WHERE c.id = 'JANE'").AsEnumerable().FirstOrDefault();
Console.WriteLine("Unindexed document self-link: {0}", janeDoc.SelfLink);
// Delete the collection
await client.DeleteDocumentCollectionAsync("dbs/mydb/colls/autoindexing");
}
The first document, for Mark Upston, is added to the collection and is immediately indexed automatically under the default indexing policy.
When the second document, for Jane Upston, is added, we pass request options with IndexingDirective.Exclude, which explicitly instructs DocumentDB not to index this document despite the collection's indexing policy.
At the end, we run several different queries against the two documents.
Step 3 − Let's call the AutomaticIndexing task from CreateDocumentClient.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
await AutomaticIndexing(client);
}
}
When the above code is compiled and executed, you will receive the following output.
**** Override Automatic Indexing ****
Documents WHERE lastName = 'Upston': 1
All documents: 2
Unindexed document self-link: dbs/kV5oAA==/colls/kV5oAOEkfQA=/docs/kV5oAOEkfQACAAAAAAAAAA==/
As you can see, we have two such documents, but the query returns only the one for Mark because the one for Jane isn't indexed. If we query again without a WHERE clause to retrieve all the documents in the collection, we get a result set with both documents, because unindexed documents are always returned by queries that have no WHERE clause.
We can also retrieve unindexed documents by their ID or self-link. So when we query for Jane's document by her ID, JANE, DocumentDB returns the document even though it isn't indexed in the collection.
Manual Indexing
Let's take a look at a simple example of manual indexing, in which we override automatic indexing.
Step 1 − First we create a collection called manualindexing and override the default policy by explicitly disabling automatic indexing. This means that, unless we request otherwise, new documents added to this collection will not be indexed.
private async static Task ManualIndexing(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Manual Indexing ****");
// Create collection with manual indexing
var collectionDefinition = new DocumentCollection {
Id = "manualindexing",
IndexingPolicy = new IndexingPolicy {
Automatic = false,
},
};
var collection = await client.CreateDocumentCollectionAsync("dbs/mydb",
collectionDefinition);
// Add a document (unindexed)
dynamic unindexedDocumentDefinition = new {
id = "MARK",
firstName = "Mark",
lastName = "Doe",
addressLine = "123 Main Street",
city = "Brooklyn",
state = "New York",
zip = "11229",
};
Document unindexedDocument = await client
.CreateDocumentAsync("dbs/mydb/colls/manualindexing", unindexedDocumentDefinition);
// Add another document (request indexing)
dynamic indexedDocumentDefinition = new {
id = "JANE",
firstName = "Jane",
lastName = "Doe",
addressLine = "123 Main Street",
city = "Brooklyn",
state = "New York",
zip = "11229",
};
Document indexedDocument = await client.CreateDocumentAsync
("dbs/mydb/colls/manualindexing", indexedDocumentDefinition, new RequestOptions {
IndexingDirective = IndexingDirective.Include });
//Unindexed document won't get returned when querying on non-ID (or selflink) property
var doeDocs = client.CreateDocumentQuery("dbs/mydb/colls/manualindexing",
"SELECT * FROM c WHERE c.lastName = 'Doe'").ToList();
Console.WriteLine("Documents WHERE lastName = 'Doe': {0}", doeDocs.Count);
// Unindexed document will get returned when using no WHERE clause
var allDocs = client.CreateDocumentQuery("dbs/mydb/colls/manualindexing",
"SELECT * FROM c").ToList();
Console.WriteLine("All documents: {0}", allDocs.Count);
// Unindexed document will get returned when querying by ID (or self-link) property
Document markDoc = client
.CreateDocumentQuery("dbs/mydb/colls/manualindexing",
"SELECT * FROM c WHERE c.id = 'MARK'")
.AsEnumerable().FirstOrDefault();
Console.WriteLine("Unindexed document self-link: {0}", markDoc.SelfLink);
await client.DeleteDocumentCollectionAsync("dbs/mydb/colls/manualindexing");
}
Step 2 − Now we create two documents again. This time we do not supply any special request options for Mark’s document, so because of the collection’s indexing policy, this document will not get indexed.
Step 3 − When we add the second document, for Jane, we use RequestOptions with IndexingDirective.Include to tell DocumentDB that it should index this document, which overrides the collection’s indexing policy that says that it shouldn’t.
At the end, we run three different queries against the collection: one that filters on lastName, one with no WHERE clause, and one that filters on the ID.
Step 4 − Let’s call the ManualIndexing task from CreateDocumentClient.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
await ManualIndexing(client);
}
}
When the above code is compiled and executed, you will receive the following output.
**** Manual Indexing ****
Documents WHERE lastName = 'Doe': 1
All documents: 2
Unindexed document self-link: dbs/kV5oAA==/colls/kV5oANHJPgE=/docs/kV5oANHJPgEBAAAAAAAAAA==/
Again, the query returns only one of the two documents, but this time it returns Jane Doe, which we explicitly requested to be indexed. As before, querying without a WHERE clause retrieves all the documents in the collection, including the unindexed document for Mark. We can also query for the unindexed document by its ID, which DocumentDB returns even though it’s not indexed.
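The visibility rules described above can be summarized with a small in-memory model. This is only a sketch of the behavior this chapter describes, not the DocumentDB SDK: the Doc and IndexingModel names are made up for illustration, and the three query helpers are assumed stand-ins for the three queries in the ManualIndexing task.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: Doc and IndexingModel are hypothetical names, not SDK types.
public sealed class Doc
{
    public string Id { get; set; }
    public string LastName { get; set; }
    public bool Indexed { get; set; }
}

public static class IndexingModel
{
    // A per-document directive (true = include, false = exclude) wins over
    // the collection's Automatic setting; null means "use the collection default".
    public static bool ResolveIndexed(bool collectionAutomatic, bool? directive)
    {
        return directive ?? collectionAutomatic;
    }

    // Filtering on an ordinary property only sees indexed documents.
    public static List<Doc> WhereLastName(IEnumerable<Doc> docs, string lastName)
    {
        return docs.Where(d => d.Indexed && d.LastName == lastName).ToList();
    }

    // A query with no WHERE clause returns every document.
    public static List<Doc> All(IEnumerable<Doc> docs)
    {
        return docs.ToList();
    }

    // Lookup by ID works whether or not the document is indexed.
    public static Doc ById(IEnumerable<Doc> docs, string id)
    {
        return docs.FirstOrDefault(d => d.Id == id);
    }

    // Mirrors the example: Automatic = false, MARK added without a directive,
    // JANE added with an explicit "include" directive.
    public static List<Doc> SampleCollection()
    {
        const bool automatic = false;
        return new List<Doc>
        {
            new Doc { Id = "MARK", LastName = "Doe", Indexed = ResolveIndexed(automatic, null) },
            new Doc { Id = "JANE", LastName = "Doe", Indexed = ResolveIndexed(automatic, true) }
        };
    }
}
```

Calling WhereLastName on the sample collection yields only Jane, All yields both documents, and ById still finds MARK, which matches the three lines of output shown above.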
DocumentDB - Geospatial Data
Microsoft added geospatial support, which lets you store location data in your documents and perform spatial calculations for distance and intersections between points and polygons.
-
Spatial data describes the position and shape of objects in space.
-
Typically, it can be used to represent the location of a person, a place of interest, or the boundary of a city, or a lake.
-
Common use cases often involve proximity queries, for example, "find all universities near my current location".
A Point denotes a single position in space that represents an exact location, e.g. the street address of a particular university. A point is represented in DocumentDB using its coordinate pair, with longitude first and latitude second. Following is an example of a JSON point.
{
"type":"Point",
"coordinates":[ 28.3, -10.7 ]
}
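Because the coordinate order is easy to get wrong, here is a minimal sketch of a range check and a distance calculation for such points. It is plain C#, not part of the DocumentDB SDK: the GeoPoint helper and its method names are assumptions made for illustration, and the haversine formula on a spherical Earth is a swapped-in approximation, so the distances the service itself reports may differ slightly.

```csharp
using System;

// Hypothetical helper for illustration; not an SDK type.
public static class GeoPoint
{
    // GeoJSON order: coordinates[0] is longitude, coordinates[1] is latitude.
    // Longitude must lie in [-180, 180] and latitude in [-90, 90].
    public static bool IsValid(double longitude, double latitude)
    {
        return longitude >= -180 && longitude <= 180
            && latitude >= -90 && latitude <= 90;
    }

    // Great-circle distance in meters between two points (haversine formula),
    // assuming a spherical Earth with a mean radius of 6,371 km.
    public static double DistanceMeters(double lon1, double lat1, double lon2, double lat2)
    {
        const double earthRadiusMeters = 6371000.0;
        double dLat = ToRadians(lat2 - lat1);
        double dLon = ToRadians(lon2 - lon1);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
        return 2 * earthRadiusMeters * Math.Asin(Math.Sqrt(a));
    }

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }
}
```

Note that a range check catches only some swapped pairs: a longitude of 120 placed in the latitude slot fails, but a pair whose values both fit within [-90, 90] passes even when swapped, so the order still has to be checked by hand.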
Let’s take a look at a simple example of a document that contains the location of a university.
{
"id":"case-university",
"name":"CASE: Center For Advanced Studies In Engineering",
"city":"Islamabad",
"location": {
"type":"Point",
"coordinates":[ 73.0964862, 33.7194136 ]
}
}
The following query retrieves the university’s name by its id and uses ST_ISVALID to check that the given point is a valid GeoJSON point.
SELECT c.name FROM c
WHERE c.id = "case-university" AND ST_ISVALID({
"type":"Point",
"coordinates":[ 73.0964862, 33.7194136 ]})
When the above query is executed, you will receive the following output.
[
{
"name": "CASE: Center For Advanced Studies In Engineering"
}
]
Create Document with Geospatial Data in .NET
You can also create a document with geospatial data from .NET. Let’s take a look at a simple example in which a university document is created.
private async static Task CreateDocuments(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Create Documents ****");
Console.WriteLine();
var uniDocument = new UniversityProfile {
Id = "nust",
Name = "National University of Sciences and Technology",
City = "Islamabad",
Loc = new Point(72.9903447, 33.6455715) // Point(longitude, latitude)
};
Document document = await CreateDocument(client, uniDocument);
Console.WriteLine("Created document {0} from typed object", document.Id);
Console.WriteLine();
}
Following is the implementation for the UniversityProfile class.
public class UniversityProfile {
[JsonProperty(PropertyName = "id")]
public string Id { get; set; }
[JsonProperty("name")]
public string Name { get; set; }
[JsonProperty("city")]
public string City { get; set; }
[JsonProperty("location")]
public Point Loc { get; set; }
}
When the above code is compiled and executed, you will receive the following output.
**** Create Documents ****
Created new document: nust
{
"id": "nust",
"name": "National University of Sciences and Technology",
"city": "Islamabad",
"location": {
"type": "Point",
"coordinates": [
72.9903447,
33.6455715
]
},
"_rid": "Ic8LAMEUVgANAAAAAAAAAA==",
"_ts": 1450200910,
"_self": "dbs/Ic8LAA==/colls/Ic8LAMEUVgA=/docs/Ic8LAMEUVgANAAAAAAAAAA==/",
"_etag": "\"00004100-0000-0000-0000-56704f4e0000\"",
"_attachments": "attachments/"
}
Created document nust from typed object
DocumentDB - Partitioning
When your database starts to grow beyond 10GB, you can scale out simply by creating new collections and then spreading, or partitioning, your data across more and more collections.
Sooner or later a single collection, which has a 10GB capacity, will not be enough to contain your database. 10GB may not sound like a very large number, but remember that we’re storing JSON documents, which are just plain text, and you can fit a lot of plain-text documents in 10GB, even when you consider the storage overhead for the indexes.
Storage isn’t the only concern when it comes to scalability. The maximum throughput available on a collection is 2,500 request units per second, which you get with an S3 collection. Hence, if you need higher throughput, you will also need to scale out by partitioning across multiple collections. Scale-out partitioning is also called horizontal partitioning.
There are many approaches to partitioning data with Azure DocumentDB. Following are the most common strategies −
-
Spillover Partitioning
-
Range Partitioning
-
Lookup Partitioning
-
Hash Partitioning
Spillover Partitioning
Spillover partitioning is the simplest strategy because there is no partition key. It’s often a good choice to start with when you’re unsure about a lot of things: you might not know whether you’ll ever need to scale out beyond a single collection, how many collections you may need to add, or how fast you may need to add them.
-
Spillover partitioning starts with a single collection and there is no partition key.
-
The collection starts to grow and then grows some more, and then some more, until you start getting close to the 10GB limit.
-
When you reach 90 percent capacity, you spill over to a new collection and start using it for new documents.
-
Once your database scales out to a larger number of collections, you’ll probably want to shift to a strategy that’s based on a partition key.
-
When you do that you’ll need to rebalance your data by moving documents to different collections based on whatever strategy you’re migrating to.
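The spillover steps above can be sketched as a small router that tracks how full the current collection is. This is a simplified in-memory model under stated assumptions: SpilloverRouter is a hypothetical class, collections are represented only by their index and byte count, and a real application would create the new collection through the SDK at the point where the model adds a list entry.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of spillover partitioning; not an SDK type.
public sealed class SpilloverRouter
{
    private const long CapacityBytes = 10L * 1024 * 1024 * 1024; // 10GB per collection
    private const long SpillAtBytes = CapacityBytes / 10 * 9;    // spill at 90 percent

    // usedBytes[i] is the amount of data stored in collection i.
    private readonly List<long> usedBytes = new List<long> { 0 };

    public int CollectionCount
    {
        get { return usedBytes.Count; }
    }

    // Returns the index of the collection that receives the new document.
    // There is no partition key: new documents always go to the newest collection.
    public int Route(long documentBytes)
    {
        int current = usedBytes.Count - 1;
        if (usedBytes[current] >= SpillAtBytes)
        {
            usedBytes.Add(0); // a real app would create a new collection here
            current++;
        }
        usedBytes[current] += documentBytes;
        return current;
    }
}
```

Because documents land wherever there happened to be room, a query in this scheme has no way to know which collection holds a given document and has to fan out to all of them.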
Range Partitioning
One of the most common strategies is range partitioning. With this approach, you determine the range of values that a document’s partition key might fall in and direct the document to the collection corresponding to that range.
-
Dates are very typically used with this strategy: you create a collection to hold documents that fall within a defined range of dates, and you define ranges that are small enough that you’re confident no collection will ever exceed its 10GB limit. For example, there may be a scenario where a single collection can reasonably handle documents for an entire month.
-
It may also be the case that most users are querying for current data, which would be data for this month or perhaps last month, but users are rarely searching for much older data. So you start off in June with an S3 collection, which is the most expensive collection you can buy and delivers the best throughput you can get.
-
In July you buy another S3 collection to store the July data, and you also scale the June data down to a less expensive S2 collection. Then in August, you get another S3 collection and scale July down to an S2 and June all the way down to an S1. This continues month after month: the current data is always available at high throughput, and older data is kept available at lower throughput.
-
As long as the query provides a partition key, only the collection that needs to be queried gets queried, and not all the collections in the database, as happens with spillover partitioning.
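The monthly scheme described above can be sketched as two small functions: one that maps a document’s date to its collection, and one that picks the performance tier by age. The MonthlyRangePartitioning class and the "docs-yyyy-MM" naming pattern are assumptions made for this illustration, not SDK features.

```csharp
using System;

// Hypothetical helper illustrating date-based range partitioning.
public static class MonthlyRangePartitioning
{
    // One collection per calendar month, e.g. "docs-2015-06" for June 2015.
    public static string CollectionFor(DateTime documentDate)
    {
        return string.Format("docs-{0:D4}-{1:D2}", documentDate.Year, documentDate.Month);
    }

    // Current month gets S3, last month S2, anything older S1.
    public static string TierFor(DateTime collectionMonth, DateTime today)
    {
        int ageInMonths = (today.Year - collectionMonth.Year) * 12
                        + (today.Month - collectionMonth.Month);
        if (ageInMonths <= 0) return "S3";
        if (ageInMonths == 1) return "S2";
        return "S1";
    }
}
```

In August, for example, TierFor returns S3 for the August collection, S2 for July, and S1 for June, which is the schedule the bullets above walk through.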
Lookup Partitioning
With lookup partitioning, you define a partition map that routes documents to specific collections based on their partition key. For example, you could partition by region.
-
Store all US documents in one collection, all European documents in another collection, and all documents from any other region in a third collection.
-
Using this partition map, a lookup partition resolver can figure out which collection to create a document in and which collections to query, based on the partition key, which is the region property contained in each document.
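The region example above boils down to a dictionary lookup with a fallback. The sketch below is plain C# and only models the routing decision; LookupResolver is a hypothetical class, and the collection links passed to it are placeholders.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a lookup partition map; not an SDK type.
public sealed class LookupResolver
{
    private readonly Dictionary<string, string> map;
    private readonly string fallbackCollection;

    // map: partition key value (region) -> collection link.
    // fallbackCollection: used for any region that has no entry in the map.
    public LookupResolver(IDictionary<string, string> map, string fallbackCollection)
    {
        this.map = new Dictionary<string, string>(map, StringComparer.OrdinalIgnoreCase);
        this.fallbackCollection = fallbackCollection;
    }

    // Returns the collection that documents for the given region belong in.
    public string Resolve(string region)
    {
        string link;
        if (region != null && map.TryGetValue(region, out link))
        {
            return link;
        }
        return fallbackCollection;
    }
}
```

The same Resolve call serves both writes and queries: as long as the query supplies the region, only one collection has to be touched.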
Hash Partitioning
In hash partitioning, partitions are assigned based on the value of a hash function, allowing you to evenly distribute requests and data across a number of partitions.
This is commonly used to partition data produced or consumed by a large number of distinct clients, and is useful for storing user profiles, catalog items, etc.
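To make the idea concrete, here is a minimal sketch of hash-based routing. It is an assumption-laden illustration rather than the SDK’s own resolver: HashPartitioning is a hypothetical class, FNV-1a is chosen simply because it is short and stable across runs, and a plain modulo is used instead of consistent hashing, so adding a collection would remap most keys.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating hash partitioning; not an SDK type.
public static class HashPartitioning
{
    // 32-bit FNV-1a over the UTF-8 bytes of the key. Unlike string.GetHashCode,
    // the result is the same on every run and every machine.
    public static uint Fnv1a(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // Picks a collection by taking the hash modulo the number of collections.
    public static string Resolve(string partitionKey, string[] collectionLinks)
    {
        int index = (int)(Fnv1a(partitionKey) % (uint)collectionLinks.Length);
        return collectionLinks[index];
    }
}
```

Since the hash depends only on the key, every document for the same user ID lands in the same collection, while different user IDs spread roughly evenly across all of them.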
Let’s take a look at a simple example of range partitioning using the RangePartitionResolver supplied by the .NET SDK.
Step 1 − Create a new DocumentClient, and then create two collections in the CreateCollections task. One will contain documents for users whose user IDs begin with A through M, and the other will contain documents for user IDs beginning with N through Z.
private static async Task CreateCollections(DocumentClient client) {
await client.CreateDocumentCollectionAsync("dbs/myfirstdb", new DocumentCollection {
Id = "CollectionAM" });
await client.CreateDocumentCollectionAsync("dbs/myfirstdb", new DocumentCollection {
Id = "CollectionNZ" });
}
Step 2 − Register the range resolver for the database.
Step 3 − Create a new RangePartitionResolver<string>, where string is the data type of our partition key. The constructor takes two parameters: the property name of the partition key, and a dictionary that serves as the shard map or partition map, which is just a list of the ranges and corresponding collections that we are predefining for the resolver.
private static void RegisterRangeResolver(DocumentClient client) {
//Note: \uffff is the highest UTF-16 character value, so the upper bound M\uffff includes all strings that start with M.
var resolver = new RangePartitionResolver<string>(
"userId", new Dictionary<Range<string>, string>() {
{ new Range<string>("A", "M\uffff"), "dbs/myfirstdb/colls/CollectionAM" },
{ new Range<string>("N", "Z\uffff"), "dbs/myfirstdb/colls/CollectionNZ" },
});
client.PartitionResolvers["dbs/myfirstdb"] = resolver;
}
It’s necessary to append the largest possible character value (\uffff) to each upper bound here. Otherwise, the first range wouldn’t match any user ID starting with M except the single letter M itself, and likewise for Z in the second range. So you can think of this encoded value as a wildcard for matching on the partition key.
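The effect of the \uffff suffix is easiest to see in a stripped-down model of range resolution. The sketch below is not the SDK’s RangePartitionResolver: SimpleRangeResolver is a hypothetical class, and it assumes ordinal (character-by-character) string comparison with inclusive bounds, which is a simplification.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of range resolution; not the SDK's RangePartitionResolver.
public sealed class SimpleRangeResolver
{
    private sealed class Entry
    {
        public string Low;
        public string High;
        public string CollectionLink;
    }

    private readonly List<Entry> entries = new List<Entry>();

    // Registers an inclusive range [low, high] and the collection it maps to.
    public void Add(string low, string high, string collectionLink)
    {
        entries.Add(new Entry { Low = low, High = high, CollectionLink = collectionLink });
    }

    // Returns the collection whose range contains the key, or null if none does.
    public string Resolve(string key)
    {
        foreach (Entry entry in entries)
        {
            if (string.CompareOrdinal(key, entry.Low) >= 0 &&
                string.CompareOrdinal(key, entry.High) <= 0)
            {
                return entry.CollectionLink;
            }
        }
        return null;
    }
}
```

With an upper bound of "M\uffff", a key such as "Mark" falls inside the first range because 'a' sorts before '\uffff'. With a bare "M" as the upper bound, "Mark" sorts after the bound and matches nothing. Under ordinal comparison, lowercase keys such as "kirk" also fall outside both ranges, so the casing of user IDs matters in this model.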
Step 4 − After creating the resolver, register it for the database with the current DocumentClient. To do that, just assign it to the client’s PartitionResolvers dictionary property, using the database link as the key.
We’ll create and query for documents against the database, not against a collection as you normally do; the resolver uses this map to route requests to the appropriate collections.
Now let’s create some documents. First we will create one for userId Kirk, and then one for Spock.
private static async Task CreateDocumentsAcrossPartitions(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** Create Documents Across Partitions ****");
var kirkDocument = await client.CreateDocumentAsync("dbs/myfirstdb", new { userId =
"Kirk", title = "Captain" });
Console.WriteLine("Document 1: {0}", kirkDocument.Resource.SelfLink);
var spockDocument = await client.CreateDocumentAsync("dbs/myfirstdb", new { userId =
"Spock", title = "Science Officer" });
Console.WriteLine("Document 2: {0}", spockDocument.Resource.SelfLink);
}
The first parameter here is a link to the database, not to a specific collection. This is not possible without a partition resolver, but with one it just works seamlessly.
Both documents are saved to the database myfirstdb, but if our RangePartitionResolver is working properly, Kirk is stored in the collection for A through M and Spock is stored in the collection for N through Z.
Let’s call these tasks from the CreateDocumentClient task as shown in the following code.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
await CreateCollections(client);
RegisterRangeResolver(client);
await CreateDocumentsAcrossPartitions(client);
}
}
When the above code is executed, you will receive the following output.
**** Create Documents Across Partitions ****
Document 1: dbs/Ic8LAA==/colls/Ic8LAO2DxAA=/docs/Ic8LAO2DxAABAAAAAAAAAA==/
Document 2: dbs/Ic8LAA==/colls/Ic8LAP12QAE=/docs/Ic8LAP12QAEBAAAAAAAAAA==/
As you can see, the self-links of the two documents contain different collection resource IDs because the documents exist in two separate collections.
DocumentDB - Data Migration
With the DocumentDB Data Migration Tool, you can easily migrate data to DocumentDB. The DocumentDB Data Migration Tool is a free and open source utility that you can download from the Microsoft Download Center https://www.microsoft.com/
The Migration Tool supports many data sources. Some of them are listed below −
-
MongoDB
-
Azure Table Storage
-
Amazon DynamoDB
-
HBase, and even other DocumentDB databases
After downloading the DocumentDB Data Migration Tool, extract the zip file.
You can see two executables in this folder, as shown in the following screenshot.

First, there is dt.exe, which is the console version with a command line interface, and then there is dtui.exe, which is the desktop version with a graphical user interface.
Let’s launch the GUI version.

You can see the Welcome page. Click ‘Next’ to go to the Source Information page.

Here’s where you configure your data source, and you can see the many supported choices in the dropdown menu.

When you make a selection, the rest of the Source Information page changes accordingly.
It is very easy to import data into DocumentDB using the DocumentDB Data Migration Tool. We recommend that you work through the above examples and try other data files as well.
DocumentDB - Access Control
DocumentDB provides concepts for controlling access to DocumentDB resources. Access to DocumentDB resources is governed by either a master key token or a resource token. Connections based on resource tokens can access only the resources specified by the tokens and no other resources. Resource tokens are based on user permissions.
-
First you create one or more users, and these are defined at the database level.
-
Then you create one or more permissions for each user, based on the resources that you want to allow each user to access.
-
Each permission generates a resource token that allows either read-only or full access to a given resource and that can be any user resource within the database.
-
Users are defined at the database level and permissions are defined for each user.
-
Users and permissions apply to all collections in the database.
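The relationship between permissions, resource tokens, and access modes listed above can be modelled in a few lines. This is a toy in-memory sketch, not how the service issues or validates tokens: TokenAuthorizer and AccessMode are hypothetical names, and the tokens here are random GUIDs, whereas real resource tokens are generated and signed by DocumentDB.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names for illustration; the SDK's own enum is PermissionMode.
public enum AccessMode { Read, All }

public sealed class TokenAuthorizer
{
    private sealed class Grant
    {
        public string ResourceLink;
        public AccessMode Mode;
    }

    private readonly Dictionary<string, Grant> grants = new Dictionary<string, Grant>();

    // Creating a permission yields a token that is tied to exactly one resource.
    public string Issue(string resourceLink, AccessMode mode)
    {
        string token = Guid.NewGuid().ToString("N");
        grants[token] = new Grant { ResourceLink = resourceLink, Mode = mode };
        return token;
    }

    // Any valid token for the resource allows reads.
    public bool CanRead(string token, string resourceLink)
    {
        Grant grant;
        return grants.TryGetValue(token, out grant) && grant.ResourceLink == resourceLink;
    }

    // Only a token issued with AccessMode.All allows writes.
    public bool CanWrite(string token, string resourceLink)
    {
        Grant grant;
        return grants.TryGetValue(token, out grant)
            && grant.ResourceLink == resourceLink
            && grant.Mode == AccessMode.All;
    }
}
```

A token issued with AccessMode.Read passes CanRead but fails CanWrite, and any token presented for a different resource link fails both checks, which is the "specified resources and no others" rule from the introduction.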
Let’s take a look at a simple example in which we learn how to define users and permissions to achieve granular security in DocumentDB.
We will start with a new DocumentClient and query for the myfirstdb database and the MyCollection collection.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
database = client.CreateDatabaseQuery(
"SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
collection = client.CreateDocumentCollectionQuery(database.CollectionsLink,
"SELECT * FROM c WHERE c.id = 'MyCollection'").AsEnumerable().First();
var alice = await CreateUser(client, "Alice");
var tom = await CreateUser(client, "Tom");
}
}
Following is the implementation for CreateUser.
private async static Task<User> CreateUser(DocumentClient client, string userId) {
Console.WriteLine();
Console.WriteLine("**** Create User {0} in {1} ****", userId, database.Id);
var userDefinition = new User { Id = userId };
var result = await client.CreateUserAsync(database.SelfLink, userDefinition);
var user = result.Resource;
Console.WriteLine("Created new user");
ViewUser(user);
return user;
}
Step 1 − Create two users, Alice and Tom. As with any resource we create, we construct a definition object with the desired Id and call the create method; in this case, we call CreateUserAsync with the database’s SelfLink and the userDefinition. We get back a result, from whose Resource property we obtain the newly created user object.
Now let’s view these two new users in the database.
private static void ViewUsers(DocumentClient client) {
Console.WriteLine();
Console.WriteLine("**** View Users in {0} ****", database.Id);
var users = client.CreateUserQuery(database.UsersLink).ToList();
var i = 0;
foreach (var user in users) {
i++;
Console.WriteLine();
Console.WriteLine("User #{0}", i);
ViewUser(user);
}
Console.WriteLine();
Console.WriteLine("Total users in database {0}: {1}", database.Id, users.Count);
}
private static void ViewUser(User user) {
Console.WriteLine("User ID: {0} ", user.Id);
Console.WriteLine("Resource ID: {0} ", user.ResourceId);
Console.WriteLine("Self Link: {0} ", user.SelfLink);
Console.WriteLine("Permissions Link: {0} ", user.PermissionsLink);
Console.WriteLine("Timestamp: {0} ", user.Timestamp);
}
Step 2 − Call CreateUserQuery against the database’s UsersLink to retrieve a list of all users. Then loop through them and view their properties.
The users have no permissions yet, so we have to create them first. Let’s say that we want to give Alice read/write permissions on the MyCollection collection, while Tom can only read documents in the collection.
await CreatePermission(client, alice, "Alice Collection Access", PermissionMode.All,
collection.SelfLink);
await CreatePermission(client, tom, "Tom Collection Access", PermissionMode.Read,
collection.SelfLink);
Step 3 − Create a permission on a resource, which here is the MyCollection collection, so we need that resource’s SelfLink.
Step 4 − Then create a permission with PermissionMode.All on this collection for Alice and one with PermissionMode.Read on this collection for Tom.
Following is the implementation for CreatePermission.
private async static Task CreatePermission(DocumentClient client, User user,
string permId, PermissionMode permissionMode, string resourceLink) {
Console.WriteLine();
Console.WriteLine("**** Create Permission {0} for {1} ****", permId, user.Id);
var permDefinition = new Permission {
Id = permId,
PermissionMode = permissionMode,
ResourceLink = resourceLink
};
var result = await client.CreatePermissionAsync(user.SelfLink, permDefinition);
var perm = result.Resource;
Console.WriteLine("Created new permission");
ViewPermission(perm);
}
As you should expect by now, we do this by creating a definition object for the new permission, which includes an Id, a PermissionMode (either PermissionMode.All or PermissionMode.Read), and the SelfLink of the resource that’s being secured by the permission.
Step 5 − Call CreatePermissionAsync and get the created permission from the Resource property of the result.
To view the created permissions, following is the implementation of ViewPermissions.
private static void ViewPermissions(DocumentClient client, User user) {
Console.WriteLine();
Console.WriteLine("**** View Permissions for {0} ****", user.Id);
var perms = client.CreatePermissionQuery(user.PermissionsLink).ToList();
var i = 0;
foreach (var perm in perms) {
i++;
Console.WriteLine();
Console.WriteLine("Permission #{0}", i);
ViewPermission(perm);
}
Console.WriteLine();
Console.WriteLine("Total permissions for {0}: {1}", user.Id, perms.Count);
}
private static void ViewPermission(Permission perm) {
Console.WriteLine("Permission ID: {0} ", perm.Id);
Console.WriteLine("Resource ID: {0} ", perm.ResourceId);
Console.WriteLine("Permission Mode: {0} ", perm.PermissionMode);
Console.WriteLine("Token: {0} ", perm.Token);
Console.WriteLine("Timestamp: {0} ", perm.Timestamp);
}
This time, it’s a permission query against the user’s permissions link, and we simply list each permission returned for the user.
Let’s delete Alice’s and Tom’s permissions.
await DeletePermission(client, alice, "Alice Collection Access");
await DeletePermission(client, tom, "Tom Collection Access");
Following is the implementation for DeletePermission.
private async static Task DeletePermission(DocumentClient client, User user,
string permId) {
Console.WriteLine();
Console.WriteLine("**** Delete Permission {0} from {1} ****", permId, user.Id);
var query = new SqlQuerySpec {
QueryText = "SELECT * FROM c WHERE c.id = @id",
Parameters = new SqlParameterCollection {
new SqlParameter { Name = "@id", Value = permId }
}
};
Permission perm = client.CreatePermissionQuery(user.PermissionsLink, query)
.AsEnumerable().First();
await client.DeletePermissionAsync(perm.SelfLink);
Console.WriteLine("Deleted permission {0} from user {1}", permId, user.Id);
}
Step 6 − To delete a permission, query by permission Id to get the SelfLink, and then use the SelfLink to delete the permission.
Next, let’s delete both users themselves.
await DeleteUser(client, "Alice");
await DeleteUser(client, "Tom");
Following is the implementation for DeleteUser.
private async static Task DeleteUser(DocumentClient client, string userId) {
Console.WriteLine();
Console.WriteLine("**** Delete User {0} in {1} ****", userId, database.Id);
var query = new SqlQuerySpec {
QueryText = "SELECT * FROM c WHERE c.id = @id",
Parameters = new SqlParameterCollection {
new SqlParameter { Name = "@id", Value = userId }
}
};
User user = client.CreateUserQuery(database.SelfLink, query).AsEnumerable().First();
await client.DeleteUserAsync(user.SelfLink);
Console.WriteLine("Deleted user {0} from database {1}", userId, database.Id);
}
Step 7 − To delete a user, first query by user Id to get the SelfLink, and then call DeleteUserAsync to delete the user object.
Following is the implementation of the CreateDocumentClient task, in which we call all the above tasks.
private static async Task CreateDocumentClient() {
// Create a new instance of the DocumentClient
using (var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey)) {
database = client.CreateDatabaseQuery(
"SELECT * FROM c WHERE c.id = 'myfirstdb'").AsEnumerable().First();
collection = client.CreateDocumentCollectionQuery(database.CollectionsLink,
"SELECT * FROM c WHERE c.id = 'MyCollection'").AsEnumerable().First();
ViewUsers(client);
var alice = await CreateUser(client, "Alice");
var tom = await CreateUser(client, "Tom");
ViewUsers(client);
ViewPermissions(client, alice);
ViewPermissions(client, tom);
string collectionLink = client.CreateDocumentCollectionQuery(database.SelfLink,
"SELECT VALUE c._self FROM c WHERE c.id = 'MyCollection'")
.AsEnumerable().First().Value;
await CreatePermission(client, alice, "Alice Collection Access", PermissionMode.All,
collectionLink);
await CreatePermission(client, tom, "Tom Collection Access", PermissionMode.Read,
collectionLink);
ViewPermissions(client, alice);
ViewPermissions(client, tom);
await DeletePermission(client, alice, "Alice Collection Access");
await DeletePermission(client, tom, "Tom Collection Access");
await DeleteUser(client, "Alice");
await DeleteUser(client, "Tom");
}
}
When the above code is compiled and executed, you will receive the following output.
**** View Users in myfirstdb ****
Total users in database myfirstdb: 0
**** Create User Alice in myfirstdb ****
Created new user
User ID: Alice
Resource ID: kV5oAC56NwA=
Self Link: dbs/kV5oAA==/users/kV5oAC56NwA=/
Permissions Link: dbs/kV5oAA==/users/kV5oAC56NwA=/permissions/
Timestamp: 12/17/2015 5:44:19 PM
**** Create User Tom in myfirstdb ****
Created new user
User ID: Tom
Resource ID: kV5oAALxKgA=
Self Link: dbs/kV5oAA==/users/kV5oAALxKgA=/
Permissions Link: dbs/kV5oAA==/users/kV5oAALxKgA=/permissions/
Timestamp: 12/17/2015 5:44:21 PM
**** View Users in myfirstdb ****
User #1
User ID: Tom
Resource ID: kV5oAALxKgA=
Self Link: dbs/kV5oAA==/users/kV5oAALxKgA=/
Permissions Link: dbs/kV5oAA==/users/kV5oAALxKgA=/permissions/
Timestamp: 12/17/2015 5:44:21 PM
User #2
User ID: Alice
Resource ID: kV5oAC56NwA=
Self Link: dbs/kV5oAA==/users/kV5oAC56NwA=/
Permissions Link: dbs/kV5oAA==/users/kV5oAC56NwA=/permissions/
Timestamp: 12/17/2015 5:44:19 PM
Total users in database myfirstdb: 2
**** View Permissions for Alice ****
Total permissions for Alice: 0
**** View Permissions for Tom ****
Total permissions for Tom: 0
**** Create Permission Alice Collection Access for Alice ****
Created new permission
Permission ID: Alice Collection Access
Resource ID: kV5oAC56NwDON1RduEoCAA==
Permission Mode: All
Token: type=resource&ver=1&sig=zB6hfvvleC0oGGbq5cc67w==;Zt3Lx
Ol14h8pd6/tyF1h62zbZKk9VwEIATIldw4ZyipQGW951kirueAKdeb3MxzQ7eCvDfvp7Y/ZxFpnip/D G
JYcPyim5cf+dgLvos6fUuiKSFSul7uEKqp5JmJqUCyAvD7w+qt1Qr1PmrJDyAIgbZDBFWGe2VT9FaBH o
PYwrLjRlnH0AxfbrR+T/UpWMSSHtLB8JvNFZNSH8hRjmQupuTSxCTYEC89bZ/pS6fNmNg8=;
Timestamp: 12/17/2015 5:44:28 PM
**** Create Permission Tom Collection Access for Tom ****
Created new permission
Permission ID: Tom Collection Access
Resource ID: kV5oAALxKgCMai3JKWdfAA==
Permission Mode: Read
Token: type=resource&ver=1&sig=ieBHKeyi6EY9ZOovDpe76w==;92gwq
V4AxKaCJ2dLS02VnJiig/5AEbPcfo1xvOjR10uK3a3FUMFULgsaK8nzxdz6hLVCIKUj6hvMOTOSN8Lt 7
i30mVqzpzCfe7JO3TYSJEI9D0/5HbMIEgaNJiCu0JPPwsjVecTytiLN56FHPguoQZ7WmUAhVTA0IMP6 p
jQpLDgJ43ZaG4Zv3qWJiO689balD+egwiU2b7RICH4j6R66UVye+GPxq/gjzqbHwx79t54=;
Timestamp: 12/17/2015 5:44:30 PM
**** View Permissions for Alice ****
Permission #1
Permission ID: Alice Collection Access
Resource ID: kV5oAC56NwDON1RduEoCAA==
Permission Mode: All
Token: type=resource&ver=1&sig=BSzz/VNe9j4IPJ9M31Mf4Q==;Tcq/B
X50njB1vmANZ/4aHj/3xNkghaqh1OfV95JMi6j4v7fkU+gyWe3mJasO3MJcoop9ixmVnB+RKOhFaSxE l
P37SaGuIIik7GAWS+dcEBWglMefc95L2YkeNuZsjmmW5b+a8ELCUg7N45MKbpzkp5BrmmGVJ7h4Z4pf D
rdmehYLuxSPLkr9ndbOOrD8E3bux6TgXCsgYQscpIlJHSKCKHUHfXWBP2Y1LV2zpJmRjis=;
Timestamp: 12/17/2015 5:44:28 PM
Total permissions for Alice: 1
**** View Permissions for Tom ****
Permission #1
Permission ID: Tom Collection Access
Resource ID: kV5oAALxKgCMai3JKWdfAA==
Permission Mode: Read
Token: type=resource&ver=1&sig=NPkWNJp1mAkCASE8KdR6PA==;ur/G2
V+fDamBmzECux000VnF5i28f8WRbPwEPxD1DMpFPqYcu45wlDyzT5A5gBr3/R3qqYkEVn8bU+een6Gl j
L6vXzIwsZfL12u/1hW4mJT2as2PWH3eadry6Q/zRXHAxV8m+YuxSzlZPjBFyJ4Oi30mrTXbBAEafZhA 5
yvbHkpLmQkLCERy40FbIFOzG87ypljREpwWTKC/z8RSrsjITjAlfD/hVDoOyNJwX3HRaz4=;
Timestamp: 12/17/2015 5:44:30 PM
Total permissions for Tom: 1
**** Delete Permission Alice Collection Access from Alice ****
Deleted permission Alice Collection Access from user Alice
**** Delete Permission Tom Collection Access from Tom ****
Deleted permission Tom Collection Access from user Tom
**** Delete User Alice in myfirstdb ****
Deleted user Alice from database myfirstdb
**** Delete User Tom in myfirstdb ****
Deleted user Tom from database myfirstdb
DocumentDB - Visualize Data
In this chapter, we will learn how to visualize data stored in DocumentDB. Microsoft provides the Power BI Desktop tool, which transforms your data into rich visuals. It also lets you retrieve data from various data sources, merge and transform the data, create powerful reports and visualizations, and publish the reports to Power BI.
In the latest version of Power BI Desktop, Microsoft has also added support for DocumentDB, so you can now connect to your DocumentDB account. You can download this tool from https://powerbi.microsoft.com
Let’s take a look at an example in which we visualize the earthquake data imported in the last chapter.
Step 1 − Once the tool is downloaded, launch Power BI Desktop.

Step 2 − Click the ‘Get Data’ option on the Home tab, under the External Data group. The Get Data page is displayed.

Step 3 − Select the Microsoft Azure DocumentDB (Beta) option and click the ‘Connect’ button.

Step 4 − Enter the URL of your Azure DocumentDB account, along with the database and collection whose data you want to visualize, and click OK.
If you are connecting to this endpoint for the first time, you will be prompted for the account key.

Step 5 − Enter the account key (primary key), which is unique to each DocumentDB account and is available in the Azure portal, and then click Connect.

When the account is connected successfully, Power BI retrieves the data from the specified database. The Preview pane shows a list of Record items; each document is represented as a Record type in Power BI.
Step 6 − Click the ‘Edit’ button to launch the Query Editor.

Step 7 − In the Power BI Query Editor, you should see a Document column in the center pane. Click the expander on the right side of the Document column header and select the columns you want to display.
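To make the idea of expanding the Document column concrete, here is a minimal Python sketch of the same transformation outside Power BI: each document (a record) becomes a flat row that holds only the selected columns. The sample documents, their values, and the helper name expand_documents are assumptions made for illustration; only the field names latitude, longitude, magnitude and depth come from this chapter. This is not how Power BI implements the expander, just a way to picture the result.

```python
# Hypothetical sample documents; the values are made up for illustration.
documents = [
    {"id": "eq-1", "latitude": 38.5, "longitude": -122.25, "magnitude": 2.5, "depth": 7.0},
    {"id": "eq-2", "latitude": 61.25, "longitude": -150.0, "magnitude": 4.0},
]


def expand_documents(records, columns):
    """Turn each record into a flat row holding only the selected columns.

    A field that is missing from a record becomes None, much like an
    empty cell in a table.
    """
    return [{name: record.get(name) for name in columns} for record in records]


# Select the same columns that the rest of this chapter works with.
rows = expand_documents(documents, ["latitude", "longitude", "magnitude", "depth"])
for row in rows:
    print(row)
```

Because documents are schema-free, a selected column may be absent from some documents; the sketch fills those cells with None rather than failing.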

As you can see, latitude and longitude are separate columns, but we want to visualize the data as combined latitude, longitude coordinates.
Step 8 − To do that, click the ‘Add Column’ tab.

Step 9 − Select Add Custom Column, which displays the following page.

Step 10 − Specify the new column name, say LatLong, and the formula that combines the latitude and longitude into one column, separated by a comma. The formula is as follows.
Text.From([latitude])&", "&Text.From([longitude])
Step 11 − Click OK to continue, and you will see that the new column has been added.

Step 12 − Go to the Home tab and click the ‘Close & Apply’ option.
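The custom column formula from Step 10 converts both numbers to text and joins them with a comma and a space. The Python sketch below mirrors that logic for a single row so you can see what a LatLong value looks like. The sample coordinates and the function name lat_long are assumptions for illustration, and Python's str() may format some numbers differently from Text.From in Power BI.

```python
def lat_long(row):
    """Mirror Text.From([latitude]) & ", " & Text.From([longitude]) for one row."""
    return str(row["latitude"]) + ", " + str(row["longitude"])


# Hypothetical row; the values are made up for illustration.
row = {"latitude": 38.5, "longitude": -122.25}
row["LatLong"] = lat_long(row)
print(row["LatLong"])  # prints "38.5, -122.25"
```

Keeping the coordinates in a single text column is what later allows the LatLong field to be dropped onto the Location property of the map visual.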

Step 13 − You can create reports by dragging and dropping fields onto the Report canvas. On the right there are two panes: the Visualizations pane and the Fields pane.

Let’s create a map view showing the location of each earthquake.
Step 14 − Drag the map visual type from the Visualizations pane.
Step 15 − Now, drag and drop the LatLong field from the Fields pane to the Location property in the Visualizations pane. Then, drag and drop the magnitude field to the Values property.
Step 16 − Drag and drop the depth field to the Color saturation property.

You will now see the Map visual showing a set of bubbles indicating the location of each earthquake.