DynamoDB Concise Tutorial

DynamoDB - Global Secondary Indexes

Applications that need several query types over different attributes can use one or more global secondary indexes to serve those queries.

For example, consider a system that keeps track of users, their login status, and their time logged in. As that data grows, queries on it slow down.

Global secondary indexes accelerate queries by organizing a selection of a table's attributes under a different key. Each index sorts its data by its own primary key; that key need not use the table's key attributes or match the table's key schema.

Every global secondary index must include a partition key and can optionally include a sort key. The index key schema can differ from the table's, and index key attributes can be any top-level string, number, or binary table attributes.

A projection can include other table attributes; however, queries on the index read only the projected attributes and do not retrieve anything from the parent table.

Attribute Projections

A projection is the set of attributes copied from a table into a secondary index. The table partition key and sort key are always projected. In queries, DynamoDB can access any attribute in the projection; the index essentially exists as its own table.

When creating a secondary index, you must specify the attributes to project. DynamoDB offers three options −

  1. KEYS_ONLY − All index items consist of table partition and sort key values, and index key values. This creates the smallest index.

  2. INCLUDE − It includes KEYS_ONLY attributes and specified non-key attributes.

  3. ALL − It includes all source table attributes, creating the largest possible index.
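The three projection types can be pictured with a small in-memory model. This is only an illustrative sketch: DynamoDB performs projection on the server, and the class name `ProjectionSketch`, the `project` helper, and the sample attribute names are hypothetical, chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative model only; it does not call DynamoDB. */
final class ProjectionSketch {
    enum ProjectionType { KEYS_ONLY, INCLUDE, ALL }

    /**
     * Returns the attributes an index entry would hold for one table item.
     * keyAttrs holds the table key and index key attribute names;
     * nonKeyAttrs is consulted only for INCLUDE.
     */
    static Map<String, Object> project(Map<String, Object> item, ProjectionType type,
                                       Set<String> keyAttrs, Set<String> nonKeyAttrs) {
        Map<String, Object> entry = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : item.entrySet()) {
            String name = e.getKey();
            boolean keep = type == ProjectionType.ALL
                    || keyAttrs.contains(name)
                    || (type == ProjectionType.INCLUDE && nonKeyAttrs.contains(name));
            if (keep) {
                entry.put(name, e.getValue());
            }
        }
        return entry;
    }

    public static void main(String[] args) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("BugID", "A-101");
        item.put("Name", "Compile error");
        item.put("CreationDate", "2016-05-22");
        item.put("Status", "Open");
        item.put("Description", "Fails on build");

        Set<String> keys = Set.of("BugID", "Name", "CreationDate");
        System.out.println(project(item, ProjectionType.KEYS_ONLY, keys, Set.of()).keySet());
        System.out.println(project(item, ProjectionType.INCLUDE, keys, Set.of("Status")).keySet());
        System.out.println(project(item, ProjectionType.ALL, keys, Set.of()).keySet());
    }
}
```

Each step from KEYS_ONLY to INCLUDE to ALL copies more attributes into the index entry, which is why index size grows in that order.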

Note the tradeoffs in projecting attributes into a global secondary index, which relate to throughput and storage cost.

Consider the following points −

  1. If you only need access to a few attributes, with low latency, project only those you need. This reduces storage and write costs.

  2. If an application frequently accesses certain non-key attributes, project them because the storage costs pale in comparison to scan consumption.

  3. You can project large sets of frequently accessed attributes; however, this carries a high storage cost.

  4. Use KEYS_ONLY for infrequent table queries and frequent writes/updates. This controls size, but still offers good performance on queries.

Global Secondary Index Queries and Scans

You can use queries to access one or more items in an index. You must specify the index name and table name, the desired attributes, and the conditions; you can optionally return results in ascending or descending order.

You can also use scans to get all index data. A scan requires the table name and index name. Use a filter expression to retrieve specific data.

Table and Index Data Synchronization

DynamoDB automatically synchronizes indexes with their parent table. Each modifying operation on an item triggers an asynchronous update of the affected indexes; applications never write to indexes directly.

You need to understand how DynamoDB maintains indexes. When creating an index, you specify key attributes and their data types, which means that on a write, the data types of those attributes must match the key schema data types.

On item creation or deletion, indexes update in an eventually consistent manner; updates typically propagate within a fraction of a second (unless a system failure of some type occurs). You must account for this delay in applications.
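One possible way to account for the propagation delay is to retry an index read a bounded number of times before concluding that an item is missing. The sketch below is generic and makes no DynamoDB calls; `EventualReadSketch`, `readWithRetry`, and the `Supplier` standing in for an index query are hypothetical names introduced for illustration.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Generic bounded-retry helper; the Supplier stands in for an index query. */
final class EventualReadSketch {

    /** Calls read up to maxAttempts times, pausing delayMillis between empty results. */
    static <T> Optional<T> readWithRetry(Supplier<Optional<T>> read,
                                         int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = read.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated index read: the item becomes visible on the third attempt.
        Optional<String> found = readWithRetry(
                () -> ++calls[0] < 3 ? Optional.<String>empty() : Optional.of("A-101"),
                5, 50L);
        System.out.println(found.orElse("not visible yet") + " after " + calls[0] + " attempts");
    }
}
```

Whether retrying is appropriate depends on the application; some designs instead read from the table itself when they need data they have just written.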

Throughput Considerations in Global Secondary Indexes − Multiple global secondary indexes impact throughput. Index creation requires capacity unit specifications, which exist separate from the table, resulting in operations consuming index capacity units rather than table units.

This can result in throttling if a query or write exceeds provisioned throughput. View throughput settings by using DescribeTable.

Read Capacity − Global secondary indexes deliver eventual consistency. For queries, DynamoDB calculates provisioned read activity the same way it does for tables, with the lone difference that it uses index entry size rather than item size. The size limit on a query result remains 1MB, which includes attribute name sizes and values across every returned item.
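The read calculation can be sketched as simple arithmetic. The sketch assumes the general DynamoDB capacity rules, which the text above does not state: the total size of the returned index entries is rounded up to the next 4 KB boundary, and an eventually consistent read costs half a read capacity unit per 4 KB. The class and method names are hypothetical.

```java
/** Arithmetic sketch of read capacity consumed by an index query. */
final class IndexReadCapacitySketch {
    static final int READ_UNIT_BYTES = 4 * 1024; // assumption: one read unit covers 4 KB

    /** Eventually consistent read units for a query returning entries of these sizes. */
    static double readUnits(long[] entrySizesInBytes) {
        long total = 0;
        for (long size : entrySizesInBytes) {
            total += size;
        }
        // Round the summed entry sizes up to the next 4 KB boundary.
        long blocks = (total + READ_UNIT_BYTES - 1) / READ_UNIT_BYTES;
        // Assumption: eventually consistent reads cost half a unit per 4 KB block.
        return blocks * 0.5;
    }

    public static void main(String[] args) {
        long[] sizes = new long[8];
        java.util.Arrays.fill(sizes, 2000L); // eight index entries of 2,000 bytes each
        System.out.println(readUnits(sizes)); // 16,000 bytes -> 4 blocks -> 2.0 units
    }
}
```

Because the sum uses index entry sizes, a KEYS_ONLY index can serve the same query with fewer read units than an index that projects every attribute.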

Write Capacity

When write operations occur, the affected index consumes write units. Write throughput costs are the sum of write capacity units consumed in table writes and units consumed in index updates. A successful write operation requires sufficient capacity, or it results in throttling.

Write costs also remain dependent on certain factors, some of which are as follows −

  1. New items defining indexed attributes or item updates defining undefined indexed attributes use a single write operation to add the item to the index.

  2. Updates changing indexed key attribute value use two writes to delete an item and write a new one.

  3. A table write triggering deletion of an indexed attribute uses a single write to erase the old item projection in the index.

  4. Items absent in the index prior to and after an update operation use no writes.

  5. Updates changing only projected attribute values, and not indexed key attribute values, use one write to update the projected attribute values in the index.

All these factors assume an item size of less than or equal to 1KB.
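The five cases above can be restated as a lookup, with the total cost of a table write being the table's own write units plus the writes in each affected index. This is a sketch under the same assumption of entries no larger than 1KB; the enum constants and method names are hypothetical labels for the listed cases.

```java
/** Restates the listed index write costs for entries of at most 1 KB. */
final class IndexWriteCostSketch {
    enum Change {
        ADDED_TO_INDEX,              // case 1: indexed attribute newly defined
        KEY_VALUE_CHANGED,           // case 2: delete the old entry, write a new one
        REMOVED_FROM_INDEX,          // case 3: indexed attribute deleted
        NOT_IN_INDEX,                // case 4: absent before and after the update
        PROJECTED_ATTRIBUTE_CHANGED  // case 5: only projected values changed
    }

    /** Number of index writes caused by one change to one index. */
    static int indexWrites(Change change) {
        switch (change) {
            case ADDED_TO_INDEX:              return 1;
            case KEY_VALUE_CHANGED:           return 2;
            case REMOVED_FROM_INDEX:          return 1;
            case NOT_IN_INDEX:                return 0;
            case PROJECTED_ATTRIBUTE_CHANGED: return 1;
            default: throw new IllegalArgumentException("Unknown change: " + change);
        }
    }

    /** Table write units plus the writes consumed in every affected index. */
    static int totalWriteUnits(int tableWriteUnits, Change... perIndexChanges) {
        int total = tableWriteUnits;
        for (Change change : perIndexChanges) {
            total += indexWrites(change);
        }
        return total;
    }

    public static void main(String[] args) {
        // One table write that touches three indexes in different ways.
        System.out.println(totalWriteUnits(1,
                Change.KEY_VALUE_CHANGED, Change.NOT_IN_INDEX, Change.ADDED_TO_INDEX));
    }
}
```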

Global Secondary Index Storage

On an item write, DynamoDB automatically copies the right set of attributes to any indexes where the attributes must exist. Your account is charged for both table item storage and index attribute storage. The space used is the sum of these quantities −

  1. Byte size of table primary key

  2. Byte size of index key attribute

  3. Byte size of projected attributes

  4. 100 bytes of overhead per index item

You can estimate storage needs through estimating average item size and multiplying by the quantity of the table items with the global secondary index key attributes.
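The estimate described above is a short calculation: add the four quantities to get the size of one index entry, then multiply by the number of table items that define the index key attributes. The byte counts in the example are made-up inputs, and the class and method names are hypothetical.

```java
/** Back-of-the-envelope storage estimate for one global secondary index. */
final class IndexStorageEstimate {
    static final long OVERHEAD_BYTES_PER_ENTRY = 100;

    /** Size of one index entry: table key + index key + projected attributes + overhead. */
    static long entryBytes(long tableKeyBytes, long indexKeyBytes, long projectedBytes) {
        return tableKeyBytes + indexKeyBytes + projectedBytes + OVERHEAD_BYTES_PER_ENTRY;
    }

    /** Average entry size multiplied by the number of items that carry the index key. */
    static long estimateBytes(long averageEntryBytes, long itemsWithIndexKey) {
        return averageEntryBytes * itemsWithIndexKey;
    }

    public static void main(String[] args) {
        // Hypothetical averages: 30-byte table key, 10-byte index key, 60 bytes projected.
        long perEntry = entryBytes(30, 10, 60);
        long total = estimateBytes(perEntry, 1_000_000L);
        System.out.println(perEntry + " bytes per entry, " + total + " bytes in total");
    }
}
```

Items that lack the index key attributes are left out of the count because they produce no index entry.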

DynamoDB does not write an index entry for a table item in which an attribute defined as the index partition key or sort key is undefined.

Global Secondary Index CRUD

Create a table with global secondary indexes by using the CreateTable operation paired with the GlobalSecondaryIndexes parameter. You must specify an attribute to serve as the index partition key, and you can optionally specify another as the index sort key. All index key attributes must be string, number, or binary scalars. You must also provide throughput settings, consisting of ReadCapacityUnits and WriteCapacityUnits.

Use UpdateTable to add global secondary indexes to existing tables using the GlobalSecondaryIndexes parameter once again.

In this operation, you must provide the following inputs −

  1. Index name

  2. Key schema

  3. Projected attributes

  4. Throughput settings

Adding a global secondary index to a large table may take substantial time, depending on item volume, projected attribute volume, write capacity, and write activity. Use CloudWatch metrics to monitor the process.

Use DescribeTable to fetch status information for a global secondary index. It returns one of four IndexStatus values for each entry in GlobalSecondaryIndexes −

  1. CREATING − It indicates the build stage of the index, and its unavailability.

  2. ACTIVE − It indicates the readiness of the index for use.

  3. UPDATING − It indicates the update status of throughput settings.

  4. DELETING − It indicates the delete status of the index, and its permanent unavailability for use.

You can update the provisioned throughput settings of a global secondary index during the loading/backfilling stage (while DynamoDB writes attributes to the index and tracks added/deleted/updated items). Use UpdateTable to perform this operation.

You should remember that you cannot add/delete other indices during the backfilling stage.

Use UpdateTable to delete global secondary indexes. It permits deletion of only one index per operation; however, you can run multiple operations concurrently, up to five. The deletion process does not affect the read/write activities of the parent table, but you cannot add/delete other indexes until the operation completes.

Using Java to Work with Global Secondary Indexes

Create a table with an index through CreateTable. Simply create a DynamoDB class instance, a CreateTableRequest class instance for request information, and pass the request object to the CreateTable method.

The following program is a short example −

DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient (
   new ProfileCredentialsProvider()));

// Attributes
ArrayList<AttributeDefinition> attributeDefinitions = new
   ArrayList<AttributeDefinition>();
attributeDefinitions.add(new AttributeDefinition()
   .withAttributeName("City")
   .withAttributeType("S"));

attributeDefinitions.add(new AttributeDefinition()
   .withAttributeName("Date")
   .withAttributeType("S"));

attributeDefinitions.add(new AttributeDefinition()
   .withAttributeName("Wind")
   .withAttributeType("N"));

// Key schema of the table
ArrayList<KeySchemaElement> tableKeySchema = new ArrayList<KeySchemaElement>();
tableKeySchema.add(new KeySchemaElement()
   .withAttributeName("City")
   .withKeyType(KeyType.HASH));              //Partition key

tableKeySchema.add(new KeySchemaElement()
   .withAttributeName("Date")
   .withKeyType(KeyType.RANGE));             //Sort key

// Wind index
GlobalSecondaryIndex windIndex = new GlobalSecondaryIndex()
   .withIndexName("WindIndex")
   .withProvisionedThroughput(new ProvisionedThroughput()
   .withReadCapacityUnits((long) 10)
   .withWriteCapacityUnits((long) 1))
   .withProjection(new Projection().withProjectionType(ProjectionType.ALL));

ArrayList<KeySchemaElement> indexKeySchema = new ArrayList<KeySchemaElement>();
indexKeySchema.add(new KeySchemaElement()
   .withAttributeName("Date")
   .withKeyType(KeyType.HASH));              //Partition key

indexKeySchema.add(new KeySchemaElement()
   .withAttributeName("Wind")
   .withKeyType(KeyType.RANGE));             //Sort key

windIndex.setKeySchema(indexKeySchema);
CreateTableRequest createTableRequest = new CreateTableRequest()
   .withTableName("ClimateInfo")
   .withProvisionedThroughput(new ProvisionedThroughput()
   .withReadCapacityUnits((long) 5)
   .withWriteCapacityUnits((long) 1))
   .withAttributeDefinitions(attributeDefinitions)
   .withKeySchema(tableKeySchema)
   .withGlobalSecondaryIndexes(windIndex);
Table table = dynamoDB.createTable(createTableRequest);
System.out.println(table.getDescription());

Retrieve the index information with DescribeTable. First, create a DynamoDB class instance. Then create a Table class instance to target an index. Finally, pass the table to the describe method.

Here is a short example −

DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient (
   new ProfileCredentialsProvider()));

Table table = dynamoDB.getTable("ClimateInfo");
TableDescription tableDesc = table.describe();
Iterator<GlobalSecondaryIndexDescription> gsiIter =
   tableDesc.getGlobalSecondaryIndexes().iterator();

while (gsiIter.hasNext()) {
   GlobalSecondaryIndexDescription gsiDesc = gsiIter.next();
   System.out.println("Index data " + gsiDesc.getIndexName() + ":");
   Iterator<KeySchemaElement> kseIter = gsiDesc.getKeySchema().iterator();

   while (kseIter.hasNext()) {
      KeySchemaElement kse = kseIter.next();
      System.out.printf("\t%s: %s\n", kse.getAttributeName(), kse.getKeyType());
   }
   Projection projection = gsiDesc.getProjection();
   System.out.println("\tProjection type: " + projection.getProjectionType());

   if (projection.getProjectionType().toString().equals("INCLUDE")) {
      System.out.println("\t\tNon-key projected attributes: "
         + projection.getNonKeyAttributes());
   }
}

Use Query to perform an index query as with a table query. Simply create a DynamoDB class instance, a Table class instance for the target index, an Index class instance for the specific index, and pass the index and query object to the query method.

Take a look at the following code to understand better −

DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient (
   new ProfileCredentialsProvider()));

Table table = dynamoDB.getTable("ClimateInfo");
Index index = table.getIndex("WindIndex");
QuerySpec spec = new QuerySpec()
   .withKeyConditionExpression("#d = :v_date and Wind = :v_wind")
   .withNameMap(new NameMap()
   .with("#d", "Date"))
   .withValueMap(new ValueMap()
   .withString(":v_date","2016-05-15")
   .withNumber(":v_wind",0));

ItemCollection<QueryOutcome> items = index.query(spec);
Iterator<Item> iter = items.iterator();

while (iter.hasNext()) {
   System.out.println(iter.next().toJSONPretty());
}

The following program is a bigger example for better understanding −

Note − The following program may assume a previously created data source. Before attempting to execute, acquire supporting libraries and create necessary data sources (tables with required characteristics, or other referenced sources).

This example also uses Eclipse IDE, an AWS credentials file, and the AWS Toolkit within an Eclipse AWS Java Project.

import java.util.ArrayList;
import java.util.Iterator;

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Index;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.ItemCollection;
import com.amazonaws.services.dynamodbv2.document.QueryOutcome;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec;
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap;

import com.amazonaws.services.dynamodbv2.model.AttributeDefinition;
import com.amazonaws.services.dynamodbv2.model.CreateTableRequest;
import com.amazonaws.services.dynamodbv2.model.GlobalSecondaryIndex;
import com.amazonaws.services.dynamodbv2.model.KeySchemaElement;
import com.amazonaws.services.dynamodbv2.model.KeyType;
import com.amazonaws.services.dynamodbv2.model.Projection;
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput;

public class GlobalSecondaryIndexSample {
   static DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient (
      new ProfileCredentialsProvider()));
   public static String tableName = "Bugs";
   public static void main(String[] args) throws Exception {
      createTable();
      queryIndex("CreationDateIndex");
      queryIndex("NameIndex");
      queryIndex("DueDateIndex");
   }
   public static void createTable() {
      // Attributes
      ArrayList<AttributeDefinition> attributeDefinitions = new
         ArrayList<AttributeDefinition>();
      attributeDefinitions.add(new AttributeDefinition()
         .withAttributeName("BugID")
         .withAttributeType("S"));

      attributeDefinitions.add(new AttributeDefinition()
         .withAttributeName("Name")
         .withAttributeType("S"));

      attributeDefinitions.add(new AttributeDefinition()
         .withAttributeName("CreationDate")
         .withAttributeType("S"));

      attributeDefinitions.add(new AttributeDefinition()
         .withAttributeName("DueDate")
         .withAttributeType("S"));

      // Table Key schema
      ArrayList<KeySchemaElement> tableKeySchema = new ArrayList<KeySchemaElement>();
      tableKeySchema.add (new KeySchemaElement()
         .withAttributeName("BugID")
         .withKeyType(KeyType.HASH));              //Partition key

      tableKeySchema.add (new KeySchemaElement()
         .withAttributeName("Name")
         .withKeyType(KeyType.RANGE));             //Sort key

      // Indexes' initial provisioned throughput
      ProvisionedThroughput ptIndex = new ProvisionedThroughput()
         .withReadCapacityUnits(1L)
         .withWriteCapacityUnits(1L);

      // CreationDateIndex
      GlobalSecondaryIndex creationDateIndex = new GlobalSecondaryIndex()
         .withIndexName("CreationDateIndex")
         .withProvisionedThroughput(ptIndex)
         .withKeySchema(new KeySchemaElement()
         .withAttributeName("CreationDate")
         .withKeyType(KeyType.HASH),               //Partition key
         new KeySchemaElement()
         .withAttributeName("BugID")
         .withKeyType(KeyType.RANGE))              //Sort key
         .withProjection(new Projection()
         .withProjectionType("INCLUDE")
         .withNonKeyAttributes("Description", "Status"));

      // NameIndex
      GlobalSecondaryIndex nameIndex = new GlobalSecondaryIndex()
         .withIndexName("NameIndex")
         .withProvisionedThroughput(ptIndex)
         .withKeySchema(new KeySchemaElement()
         .withAttributeName("Name")
         .withKeyType(KeyType.HASH),                  //Partition key
         new KeySchemaElement()
         .withAttributeName("BugID")
         .withKeyType(KeyType.RANGE))                 //Sort key
         .withProjection(new Projection()
         .withProjectionType("KEYS_ONLY"));

      // DueDateIndex
      GlobalSecondaryIndex dueDateIndex = new GlobalSecondaryIndex()
         .withIndexName("DueDateIndex")
         .withProvisionedThroughput(ptIndex)
         .withKeySchema(new KeySchemaElement()
         .withAttributeName("DueDate")
         .withKeyType(KeyType.HASH))               //Partition key
         .withProjection(new Projection()
         .withProjectionType("ALL"));

      CreateTableRequest createTableRequest = new CreateTableRequest()
         .withTableName(tableName)
         .withProvisionedThroughput( new ProvisionedThroughput()
         .withReadCapacityUnits( (long) 1)
         .withWriteCapacityUnits( (long) 1))
         .withAttributeDefinitions(attributeDefinitions)
         .withKeySchema(tableKeySchema)
         .withGlobalSecondaryIndexes(creationDateIndex, nameIndex, dueDateIndex);
      System.out.println("Creating " + tableName + "...");
      dynamoDB.createTable(createTableRequest);

      // Pause for active table state
      System.out.println("Waiting for ACTIVE state of " + tableName);
      try {
         Table table = dynamoDB.getTable(tableName);
         table.waitForActive();
      } catch (InterruptedException e) {
         e.printStackTrace();
      }
   }
   public static void queryIndex(String indexName) {
      Table table = dynamoDB.getTable(tableName);
      System.out.println
      ("\n*****************************************************\n");
      System.out.print("Querying index " + indexName + "...");
      Index index = table.getIndex(indexName);
      ItemCollection<QueryOutcome> items = null;
      QuerySpec querySpec = new QuerySpec();

      if ("CreationDateIndex".equals(indexName)) {   // compare string contents, not references
         System.out.println("Issues filed on 2016-05-22");
         querySpec.withKeyConditionExpression(
            "CreationDate = :v_date and begins_with(BugID, :v_bug)")
            .withValueMap(new ValueMap()
            .withString(":v_date","2016-05-22")
            .withString(":v_bug","A-"));
         items = index.query(querySpec);
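      } else if ("NameIndex".equals(indexName)) {
         System.out.println("Compile error");
         querySpec.withKeyConditionExpression(
            "#n = :v_name and begins_with(BugID, :v_bug)")
            // Name is a DynamoDB reserved word, so the expression refers to it via #n
            .withNameMap(new com.amazonaws.services.dynamodbv2.document.utils.NameMap()
            .with("#n", "Name"))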
            .withValueMap(new ValueMap()
            .withString(":v_name","Compile error")
            .withString(":v_bug","A-"));
         items = index.query(querySpec);
      } else if ("DueDateIndex".equals(indexName)) {
         System.out.println("Items due on 2016-10-15");
         querySpec.withKeyConditionExpression("DueDate = :v_date")
         .withValueMap(new ValueMap()
         .withString(":v_date","2016-10-15"));
         items = index.query(querySpec);
      } else {
         System.out.println("\nInvalid index name");
         return;
      }
      Iterator<Item> iterator = items.iterator();
      System.out.println("Query: getting result...");

      while (iterator.hasNext()) {
         System.out.println(iterator.next().toJSONPretty());
      }
   }
}