bahr.dev serverless on AWS

How To Paginate DynamoDB Tables With The AWS SDK For JavaScript

The Python library boto3 has built-in paginators, but this doesn’t seem to be the case for the AWS SDK for JavaScript. This article has code examples that you can copy and paste into your code.

The table setup and examples are available on GitHub.

Prerequisites

This guide is for TypeScript and the aws-sdk package (the article was written with version 2.958.0).

The examples are intended for the DynamoDB.DocumentClient. If you don’t have to use this client, consider Jeremy Daly’s DynamoDB Toolbox.

I assume that you are familiar with DynamoDB table structures, especially partition and sort keys.

Async Generators

You can use async generators as a wrapper for any AWS SDK request where continuation tokens are used. The example below is based on Tamás Sallai’s article.

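export const getPaginatedResults = async (fn) => {
  // sentinel for "no request made yet", so that the first call
  // can be told apart from a missing marker on the last page
  const EMPTY = Symbol("empty");
  const res = [];
  // the inline async generator yields the items of each page in turn
  for await (const lf of (async function* () {
    let NextMarker = EMPTY;
    let count = 0;
    while (NextMarker || NextMarker === EMPTY) {
      // the first call receives undefined instead of a marker
      const {marker, results, count: ct} =
        await fn(NextMarker !== EMPTY ? NextMarker : undefined, count);

      yield* results;

      // if there's no marker, then we reached the end
      if (!marker) {
        break;
      }

      NextMarker = marker;
      count = ct;
    }
  })()) {
    res.push(lf);
  }

  return res;
};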
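
To see how the marker travels from one request to the next without calling AWS, here is a rough dry run against an in-memory list. It uses a typed restatement of the wrapper so that the snippet runs on its own. The names `getAllPages`, `DemoPage`, `fakeQuery` and `demoData`, as well as the page size of two, are assumptions made for illustration and are not part of the AWS SDK.

```typescript
// One page as returned by the callback. `marker` is absent on the last page.
type DemoPage<T, M> = { results: T[]; marker?: M | undefined; count?: number };

// Typed restatement of the wrapper above, so that this snippet is self-contained.
async function getAllPages<T, M>(
  fn: (marker: M | undefined, count: number) => Promise<DemoPage<T, M>>,
): Promise<T[]> {
  const res: T[] = [];
  const source = (async function* (): AsyncGenerator<T> {
    let nextMarker: M | undefined = undefined;
    let count = 0;
    while (true) {
      const page: DemoPage<T, M> = await fn(nextMarker, count);
      yield* page.results;
      // no marker means that we reached the last page
      if (!page.marker) {
        break;
      }
      nextMarker = page.marker;
      count = page.count ?? count;
    }
  })();
  for await (const item of source) {
    res.push(item);
  }
  return res;
}

// Hypothetical stand-in for a paginated API: serves demoData in pages of
// two items and uses the index of the next item as the continuation marker.
const demoData = ["a", "b", "c", "d", "e"];
const requestedMarkers: Array<number | undefined> = [];

async function fakeQuery(marker: number | undefined): Promise<DemoPage<string, number>> {
  requestedMarkers.push(marker);
  const start = marker ?? 0;
  const end = start + 2;
  return {
    results: demoData.slice(start, end),
    // only hand out a marker while there is more data to read
    marker: end < demoData.length ? end : undefined,
  };
}

// Collects all pages of the fake API into one array.
async function demo(): Promise<string[]> {
  return getAllPages<string, number>((marker) => fakeQuery(marker));
}
```

In this dry run the callback is invoked three times: first without a marker, then with the markers returned by the first and second page.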

Usage Example

Copy the function above to your source code, and use it as shown below.

import { DynamoDB } from 'aws-sdk';
const ddb = new DynamoDB.DocumentClient();

/*
The table in this example has a partition key 'pk'
and a sort key 'sk'. While not shown in the query
below, the sort key is required to store multiple
items in a partition.
 */
const ddbQueryParams = {
  TableName: TABLE_NAME,
  KeyConditionExpression: 'pk = :pk',
  FilterExpression: 'randomValue > :r',
  ExpressionAttributeValues: {
    ':pk': 'my-partition',
    ':r': 0.5,
  },
};

const records = await getPaginatedResults(async (ExclusiveStartKey) => {
  const queryResponse = await ddb
    .query({ExclusiveStartKey, ...ddbQueryParams })
    .promise();

  return {
    marker: queryResponse.LastEvaluatedKey,
    results: queryResponse.Items,
  };
});

console.log(records);
// [record1, record2, record3, ...]

The result is one array with all the records from the paginated requests.

This is the cleanest solution I’ve seen so far. If all that you care about is the full result of the query, you don’t need to write any logic around the result or the continuation token.
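
If you don’t need the full result, the same idea also works without collecting everything into an array. The following is a minimal sketch and not part of the original helper: `paginate`, `LazyPage`, `fakeScan` and `takeFirst` are hypothetical names, and the in-memory numbers stand in for a DynamoDB query. The generator requests the next page only when the consumer asks for more items, so leaving the loop early also stops further requests.

```typescript
// One page of results; `marker` is undefined on the last page.
type LazyPage<T, M> = { results: T[]; marker?: M | undefined };

// Yields items one by one and requests the next page only when needed.
async function* paginate<T, M>(
  fn: (marker: M | undefined) => Promise<LazyPage<T, M>>,
): AsyncGenerator<T> {
  let marker: M | undefined = undefined;
  while (true) {
    const page: LazyPage<T, M> = await fn(marker);
    yield* page.results;
    if (page.marker === undefined) {
      return;
    }
    marker = page.marker;
  }
}

// Hypothetical paged source: the numbers 1..9 in pages of three.
const lazyData = [1, 2, 3, 4, 5, 6, 7, 8, 9];
let lazyRequests = 0;

async function fakeScan(marker: number | undefined): Promise<LazyPage<number, number>> {
  lazyRequests += 1;
  const start = marker ?? 0;
  const end = start + 3;
  return {
    results: lazyData.slice(start, end),
    marker: end < lazyData.length ? end : undefined,
  };
}

// Takes items until `wanted` are collected, then stops iterating.
async function takeFirst(wanted: number): Promise<number[]> {
  const taken: number[] = [];
  for await (const item of paginate(fakeScan)) {
    taken.push(item);
    if (taken.length >= wanted) {
      break;
    }
  }
  return taken;
}
```

The trade-off is that the caller now owns the loop, which is more code than a single call that returns one array.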

Adding Offset And Pagination

If your query targets a partition that has a sort key, you can add offset and pagination to the example. Note that this approach may load more items from DynamoDB than pageSize specifies, but the function will return at most pageSize records. This is because DynamoDB operations can read a maximum of 1 MB per request, and adding a Limit parameter would lead to many more requests than necessary.

If the sort key is numeric, or sorts in ascending lexicographic order, we can achieve an offset by specifying the sort key after which the query should start.

To limit the page size, we add an early break. The last response can contain more records than are needed to fill the page, so the example below trims it to pageSize. The calling code is responsible for picking the appropriate start key for the next request, for example the sort key of the last record it received.

import { DynamoDB } from 'aws-sdk';
const ddb = new DynamoDB.DocumentClient();

const offset = 10;
const pageSize = 50;

const ddbQueryParams = {
  TableName: TABLE_NAME,
  KeyConditionExpression: 'pk = :pk and sk > :sk',
  ExpressionAttributeValues: {
    ':pk': 'my-partition',
    ':sk': offset,
  },
};

const records = await getPaginatedResults(async (ExclusiveStartKey, count: number) => {
  const queryResponse = await ddb
    .query({ExclusiveStartKey, ...ddbQueryParams })
    .promise();

  // stop the pagination when we reach the pageSize
  if (count + queryResponse.Count >= pageSize) {
    return {
      // trim the results so that we return at most pageSize records
      results: queryResponse.Items.slice(0, pageSize - count),
      // return an empty marker to indicate that we have
      // no further results to process
      marker: null,
    };
  }

  return {
    marker: queryResponse.LastEvaluatedKey,
    results: queryResponse.Items,
    count: count + queryResponse.Count,
  };
});

console.log(records);
// [record1, record2, record3, ...]
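
To fetch the following page, the caller can use the sort key of the last record it received as the next offset. The sketch below illustrates that loop against an in-memory partition. `SortedItem`, `partition`, `queryAfter` and `readAllPages` are hypothetical names, and the sample data only mimics a query with the key condition `sk > :sk`.

```typescript
type SortedItem = { pk: string; sk: number };

// Hypothetical partition with the sort keys 1..7 in ascending order.
const partition: SortedItem[] = [1, 2, 3, 4, 5, 6, 7].map((sk) => ({
  pk: "my-partition",
  sk,
}));

// Mimics a query with the key condition `sk > :sk` that returns
// at most pageSize items.
async function queryAfter(afterSk: number, pageSize: number): Promise<SortedItem[]> {
  return partition.filter((item) => item.sk > afterSk).slice(0, pageSize);
}

// Reads the partition page by page. Each request starts after the
// sort key of the last record of the previous page.
async function readAllPages(pageSize: number): Promise<SortedItem[][]> {
  const allPages: SortedItem[][] = [];
  let offset = 0; // smaller than every sort key in the sample data
  while (true) {
    const page = await queryAfter(offset, pageSize);
    const last = page[page.length - 1];
    if (last === undefined) {
      break; // empty page: nothing left to read
    }
    allPages.push(page);
    offset = last.sk;
  }
  return allPages;
}
```

With DynamoDB, `queryAfter` would be replaced by the paginated query from the example above, with `offset` passed in as `:sk`.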

I hope that this helps you to add pagination quicker when you need it.

Found a better way to use this code? Please share it or create a pull request so that I can improve this article :)
