connection.js
// SNIPPET
class Connection {
  DEFAULT_EDGE_LIMIT = 25
  MAX_EDGE_LIMIT = 100

  /**
   * Get *all* items
   *
   * This method is for internal application use only; it is not intended for
   * use in the graph itself, but is useful internally when processing larger
   * connections. Depending on the size of the collection of items, this may
   * or may not be a good idea. This method grabs each page iteratively and is
   * implemented as a generator for obvious reasons.
   *
   * @param {number} [limit] - The page size to use while iterating.
   * @return {AsyncGenerator<any>} - An async generator to iterate over all items.
   */
  async * getAll (limit = this.DEFAULT_EDGE_LIMIT) {
    // to be implemented by subclass
  }

  /**
   * Get a page (a list of edges) of entities
   *
   * The pagination style of this method matches what we use for API output.
   *
   * @param {number|null} [first=null] - The number of items to return from the beginning of the list, or null for no limit.
   * @param {string|null} [after=null] - The cursor to start retrieving items after, or null to start from the beginning.
   * @param {number|null} [last=null] - The number of items to return from the end of the list, or null for no limit.
   * @param {string|null} [before=null] - The cursor to start retrieving items before, or null to start from the end.
   * @return {Promise<Page>} - The page of results.
   */
  async getPage (first = null, after = null, last = null, before = null) {
    // to be implemented by subclass
  }
}
class CursorConnection extends Connection {
  INTERNAL_MAX_EDGE_LIMIT = 1000
  INTERNAL_DEFAULT_EDGE_LIMIT = 100

  /**
   * Constructor
   *
   * @param {Function} fetchCallback - A callback `(forward: boolean, limit: number, cursor: string|null) => Promise<T[]>`; for backward traversal it must return rows nearest the cursor first.
   * @param {Function} cursorCallback - A callback `(node: T) => string` that returns a non-empty cursor for a node.
   * @param {boolean} overRead - Set to true if you want hasNextPage/hasPreviousPage to be guaranteed accurate in the direction of traversal (one extra row is fetched and discarded).
   */
  constructor (fetchCallback, cursorCallback, overRead = false) {
    super()
    this.fetchCallback = fetchCallback
    this.cursorCallback = cursorCallback
    this.overRead = overRead
  }
  /**
   * Retrieve all items using pagination with a set limit.
   *
   * Note: stopping relies on pageInfo.hasNextPage, which is only populated in
   * the direction of traversal when the connection over-reads, so this is
   * reliable only with overRead = true.
   *
   * @param {number} limit - The page size to use for each fetch.
   * @returns {AsyncGenerator<any>} - An async generator to iterate over all items.
   */
  async * getAll (limit = this.INTERNAL_DEFAULT_EDGE_LIMIT) {
    let cursor = null
    // A do/while is required: cursor starts as null for the first page, so a
    // plain `while (cursor !== null)` would never execute.
    do {
      const page = await this.getPage(limit, cursor)
      for (const edge of page.edges) {
        yield edge.node
      }
      const info = page.pageInfo
      cursor = (info.hasNextPage && info.endCursor) ? info.endCursor : null
    } while (cursor !== null)
  }
  /**
   * Retrieve a specific page of results based on pagination parameters.
   *
   * @param {number|null} [first=null] - The number of items to return from the beginning of the list, or null for no limit.
   * @param {string|null} [after=null] - The cursor to start retrieving items after, or null to start from the beginning.
   * @param {number|null} [last=null] - The number of items to return from the end of the list, or null for no limit.
   * @param {string|null} [before=null] - The cursor to start retrieving items before, or null to start from the end.
   * @returns {Promise<Page>} - The page of results.
   */
  async getPage (first = null, after = null, last = null, before = null) {
    let limit = first ?? last ?? 0
    if (limit > 0) {
      limit = Math.min(limit, this.INTERNAL_MAX_EDGE_LIMIT)
    } else {
      limit = this.DEFAULT_EDGE_LIMIT
    }
    // Traverse forward unless a positive `last` was supplied.
    const forward = (last ?? 0) <= 0
    const cursor = forward ? after : before
    // When over-reading, fetch one extra row so we can tell whether a further
    // page exists in the direction of traversal.
    const count = limit + (this.overRead ? 1 : 0)
    const nodes = await this.fetchCallback(forward, count, cursor)
    if (!forward) {
      // Backward fetches return rows nearest the cursor first; flip them so
      // edges always read in ascending order.
      nodes.reverse()
    }
    let hasPreviousPage = false
    let hasNextPage = false
    if (this.overRead && nodes.length === count) {
      // The extra row proves another page exists; discard it.
      if (forward) {
        nodes.pop()
        hasNextPage = true
      } else {
        nodes.shift()
        hasPreviousPage = true
      }
    }
    const edges = nodes.map(node => new Edge(node, this.cursorCallback(node)))
    let startCursor = null
    let endCursor = null
    if (edges.length > 0) {
      startCursor = edges[0].cursor
      endCursor = edges[edges.length - 1].cursor
      if (forward) {
        if (this.overRead && !hasNextPage) {
          endCursor = null
        }
        // Starting from a cursor implies there are items before this page.
        hasPreviousPage = cursor !== null
      } else {
        if (this.overRead && !hasPreviousPage) {
          startCursor = null
        }
        hasNextPage = cursor !== null
      }
    }
    const pageInfo = new PageInfo(hasPreviousPage, hasNextPage, startCursor, endCursor)
    return new Page(edges, pageInfo)
  }
}
class Edge {
  /**
   * @param {object} node - The data item.
   * @param {string} cursor - The opaque cursor identifying the item.
   */
  constructor (node, cursor) {
    this.node = node
    this.cursor = cursor
  }
}

class PageInfo {
  /**
   * @param {boolean} hasPreviousPage
   * @param {boolean} hasNextPage
   * @param {string|null} startCursor
   * @param {string|null} endCursor
   */
  constructor (hasPreviousPage, hasNextPage, startCursor, endCursor) {
    this.hasPreviousPage = hasPreviousPage
    this.hasNextPage = hasNextPage
    this.startCursor = startCursor
    this.endCursor = endCursor
  }
}
class Page {
  /**
   * @param {Iterable<Edge>} edges - The edges of the page.
   * @param {PageInfo} pageInfo - Information about the page.
   */
  constructor (edges, pageInfo) {
    this.edges = edges
    this.pageInfo = pageInfo
  }

  /**
   * Get the edges
   *
   * @returns {Iterable<Edge>} - The edges of the page.
   */
  getEdges () {
    return this.edges
  }

  /**
   * Get the page info
   *
   * @returns {PageInfo} - Information about the page.
   */
  getPageInfo () {
    return this.pageInfo
  }
}
// EXAMPLE USAGE
// Assumes a configured knex instance and a `parse(row)` row mapper in scope.
/**
 * @param {string|Uint8Array} ownerUuid
 * @returns {CursorConnection}
 */
const findAll = (ownerUuid) => {
  return new CursorConnection(
    async (forward, limit, cursor) => {
      let query = knex('projects').where({ owner_uuid: ownerUuid })
      if (cursor !== null) {
        query = query.where('id', forward ? '>' : '<', decodeOpaqueStringToId(cursor))
      }
      // Order in the direction of traversal; getPage reverses backward
      // results into ascending order afterwards.
      query = query.orderBy('id', forward ? 'asc' : 'desc')
      const rows = await query.limit(limit)
      return rows.map(row => parse(row))
    },
    (project) => encodeIdToOpaqueString(project.id),
    true, // over-read so hasNextPage/hasPreviousPage are accurate
  )
}
/**
 * Convert an integer ID into an opaque string using Base64 encoding
 *
 * @param {number} id - The integer ID to convert
 * @returns {string} - The opaque string representation
 */
function encodeIdToOpaqueString (id) {
  return Buffer.from(id.toString(), 'utf-8').toString('base64')
}

/**
 * Decode an opaque string back into its original integer ID using Base64 decoding
 *
 * @param {string} opaqueString - The opaque string representation to decode
 * @returns {number} - The original integer ID
 * @throws {Error} - If the input does not decode to a valid integer ID
 */
function decodeOpaqueStringToId (opaqueString) {
  const decodedString = Buffer.from(opaqueString, 'base64').toString('utf-8')
  const id = parseInt(decodedString, 10)
  if (isNaN(id)) {
    throw new Error('The input does not decode to a valid integer ID.')
  }
  return id
}
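
// Example round-trip (values are what Node's Buffer produces):
//   encodeIdToOpaqueString(42)     // => 'NDI='
//   decodeOpaqueStringToId('NDI=') // => 42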


Overview

This code implements a cursor-based pagination system based on the Relay Connection spec, primarily used for efficiently fetching and navigating large datasets in chunks. It defines a base Connection class with abstract methods for retrieving all items (getAll) or a specific page (getPage). The CursorConnection class extends this functionality by implementing the pagination logic, utilizing fetchCallback to retrieve data from a source and cursorCallback to generate unique identifiers (cursors) for items. Pagination metadata, such as hasNextPage and hasPreviousPage, is handled through the PageInfo and Page classes, which organize fetched edges (data items and their cursors) and page boundaries. This system is commonly used in applications like APIs with GraphQL Relay-style pagination or infinite-scrolling interfaces.

Info

The Relay Connection Specification is useful because it provides a standardized and efficient way to handle pagination for large datasets in APIs, particularly in GraphQL. Cursor-based pagination, which the spec implements, is more robust than offset pagination since it avoids issues with data shifting when records are added or removed. It also enables precise control with forward and backward navigation, offering consistent results in dynamic datasets. Additionally, the spec includes metadata like hasNextPage and hasPreviousPage, allowing APIs to deliver useful information for building features like infinite scrolling or page navigation, and it integrates well with tools and libraries that adhere to the Relay standard.

Technical Considerations

Generators

Using generators in this code, particularly in the getAll method, provides an efficient and scalable way to handle the iterative retrieval of large datasets. Generators allow the code to fetch and process data on demand, rather than loading the entire dataset into memory at once, which is crucial for performance when dealing with potentially massive collections. This approach enables lazy evaluation, meaning items are only fetched and yielded one at a time as they are needed, reducing memory consumption and enabling the application to start processing results immediately without waiting for the full dataset to load. Additionally, since getAll is implemented as an AsyncGenerator, it gracefully integrates with asynchronous operations (e.g., fetching paginated data), making it ideal for modern, non-blocking workflows.
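
As a minimal sketch, this is what consuming the generator looks like (assuming the findAll connection from the snippet above; archiveProject is a hypothetical per-item handler, and the loop must sit inside an async function or a module with top-level await):

    for await (const project of findAll(ownerUuid).getAll(250)) {
      // Each iteration may transparently trigger a new page fetch.
      await archiveProject(project)
    }

Because the generator yields nodes one at a time, memory usage stays bounded by the page size rather than the total collection size.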

Callbacks

Using callbacks to allow the caller to retrieve the data provides flexibility and decouples the pagination logic from the actual data source. This enables the pagination system to work with any data-fetching mechanism—whether it's a database query, a REST API call, or an in-memory dataset—making the code more reusable and adaptable. The caller can define custom logic in the callback to retrieve data, apply transformations, or handle unique business rules, without modifying the core pagination implementation. Additionally, this approach promotes separation of concerns, as the pagination logic focuses solely on how to paginate, while the data retrieval logic remains the responsibility of the callback provided by the caller. This makes the system more extensible and easier to maintain.
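
For instance, the same class can paginate a plain in-memory array with no database involved — a minimal sketch, under the assumption that items are kept sorted by ascending id:

    const items = [{ id: 1 }, { id: 2 }, { id: 3 }]
    const connection = new CursorConnection(
      async (forward, limit, cursor) => {
        const boundary = cursor === null ? null : decodeOpaqueStringToId(cursor)
        const filtered = items.filter(item =>
          boundary === null || (forward ? item.id > boundary : item.id < boundary))
        // Backward fetches must return rows nearest the cursor first.
        const ordered = forward ? filtered : [...filtered].reverse()
        return ordered.slice(0, limit)
      },
      (item) => encodeIdToOpaqueString(item.id),
    )

The pagination core never learns where the rows came from; only the two callbacks change.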

Challenges

Custom concatenated cursors

Implementing custom concatenated cursors can be particularly challenging due to the added complexity of encoding and decoding multiple pieces of data into a single cursor string. These cursors often combine various fields (e.g., IDs, timestamps, or other sorting parameters) to maintain state across queries and ensure correct pagination order. However, managing consistency and accuracy in such concatenated cursors can be error-prone, especially when data structures or sorting rules evolve. They also introduce complexity in validating and parsing cursor values, as incorrect implementations might lead to broken pagination behavior or bugs in edge cases. Additionally, concatenated cursors often make it harder to debug or inspect issues, as cursors become opaque to developers. The given implementation avoids these issues by focusing on simpler cursor mechanisms and leaving cursor generation entirely to the caller, making it more straightforward and reducing the potential for errors in handling complex cursor structures.
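
If composite cursors were ever needed (say, ordering by timestamp with id as a tiebreaker), one hypothetical approach is to JSON-encode both fields before the Base64 step — a sketch, not part of the implementation above:

    // Encodes a (createdAt, id) pair so pagination stays stable when
    // timestamps collide. Encoder and decoder must agree on field order
    // and types, which is exactly the fragility described above.
    function encodeCompositeCursor (createdAt, id) {
      return Buffer.from(JSON.stringify([createdAt, id]), 'utf-8').toString('base64')
    }

    function decodeCompositeCursor (opaqueString) {
      const [createdAt, id] = JSON.parse(Buffer.from(opaqueString, 'base64').toString('utf-8'))
      if (typeof createdAt !== 'string' || !Number.isInteger(id)) {
        throw new Error('Malformed composite cursor.')
      }
      return { createdAt, id }
    }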

Reverse pagination (paginating backward)

Paginating backward with reversed order presents several challenges, particularly when working with cursor-based pagination. One of the main difficulties is determining the correct subset of data to return while preserving consistent and predictable results. For instance, when navigating backward, the system may need to retrieve data in its original forward order, reverse it in memory, and then apply the pagination logic. This can become inefficient for large datasets, as the entire slice of data surrounding the "current page" might need to be fetched and sorted before it can be reversed and reduced.

Additionally, reversing the order might complicate how cursors are generated and interpreted. Cursors, which typically encode the position and sorting information for navigation, may need to be recalculated or adapted to work consistently both forward and backward. This introduces potential bugs and increases the complexity of cursor management.

Moreover, some data sources (e.g., certain APIs or database queries) may not natively support reverse traversal, making it necessary to perform additional computations or re-fetch large portions of data. This could lead to degraded performance or higher resource usage. Maintaining consistent pagination behavior, especially when new items are inserted or deleted in a dynamic dataset, further adds to the challenge, as backward pagination must preserve the integrity and usability of the data order.
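
Concretely, in the implementation above a backward request passes last and before; the fetch callback receives forward = false and must return rows nearest the cursor first, which getPage then flips back into ascending order. A sketch, where someCursor is a placeholder for a previously returned startCursor:

    const page = await findAll(ownerUuid).getPage(null, null, 25, someCursor)
    // Edges read oldest-to-newest even though traversal was backward.
    console.log(page.getPageInfo().hasPreviousPage, page.getPageInfo().startCursor)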

What you would have done differently

If I were designing the pagination implementation differently, I would enforce stricter functionality to prevent users from accidentally implementing offset-based pagination. This could be achieved by explicitly requiring cursor-based queries, providing helper utilities for generating and handling cursors, and avoiding exposed parameters like offset or pageNumber in the API. Encouraging cursor-based solutions ensures users are guided toward the optimal path, reducing the risks of inefficiencies or inconsistent behavior. Additionally, clear documentation and examples would highlight best practices, making it easier for developers to follow performant patterns and avoid falling back to offset-based logic.

Offset-based pagination is problematic because it scales poorly and introduces significant inefficiencies for large datasets. For example, database queries with high offsets (e.g., LIMIT x OFFSET y) force the engine to scan rows up to the offset, slowing down navigation to later pages. Furthermore, it’s prone to data inconsistencies, such as skipping or duplicating rows when the underlying dataset changes while paging through records. In contrast, cursor-based pagination is built around unique identifiers (IDs or timestamps), offering stable and efficient page boundaries that scale better with large datasets and dynamic environments. By restricting offset usage, the implementation ensures better consistency, performance, and scalability for all users.
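
To make the contrast concrete, here is an illustrative knex sketch of the two query shapes (table and column names are placeholders):

    // Offset pagination: the database must walk and discard rows up to the
    // offset before returning the page, so cost grows with page depth.
    const offsetPage = await knex('projects')
      .orderBy('id', 'asc')
      .limit(25)
      .offset(25000) // page 1001: scans 25,000 rows just to skip them

    // Cursor pagination: the database seeks directly to the boundary row
    // via the index on `id`, so every page costs roughly the same.
    const cursorPage = await knex('projects')
      .where('id', '>', lastSeenId) // decoded from the previous page's endCursor
      .orderBy('id', 'asc')
      .limit(25)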
