Serverless Database Design Patterns NoSQL

Ram Vadranam
6 min readJul 5, 2020

When switching from a Relational database management system to the NoSQL database management system, it’s important to understand the design approaches.

Key Differences Between Relational Data Design and NoSQL

  1. RDMS supports query flexibility with scaling limitations
  2. NoSQL limits query flexibility but support scaling capabilities
  3. RDMS suitable for OLAP applications which don’t need scaling
  4. NoSQL suitable for OLTP applications with high velocity

Amazon DynamoDB supports scalable serverless database design solution for NoSQL.General design principles of DynamoDB is to keep the number of tables to a minimum. For most applications, a single table is enough.

NoSQL is fit for known access patterns which will be repeatable

Steps For NoSQL Desing Approach Using DynamoDB:

  1. Create an Entity Relationship Diagram(ERD)
  2. Identifying the access patterns
  3. Specify primary key(PK) and sort key(SK) for grouping and sorting
  4. Data size, shape, and velocity are the key consideration of developing an efficient data model

Primary Key:

Partition keys are import to distribute the workloads and enhance the performance. Sharding primary key using random or calculated suffixes will enhance the performance by creating more distributed physical partitions of the data

Sort Key:

Well designed sort keys support a range of queries begins_with, between, >, <, and so on. Composite sort keys support one to many relationships with query level hierarchy

Example:

[orginzation]#[teams]#[projects]#[employees]#[roles]

sort key supports a range of queries from organization to roles

Secondary Indexes:

DynamoDB supports two types of secondary indexes. Global secondary index and Local secondary index.

Global secondary index operates independently from the primary table which needs separate read and writes capacity set up.

Local secondary index operates on the base table with different sort keys. The total indexed items size of the local secondary index limited to 10GB.

Each table in DynamoDB can have up to 20 global secondary indexes (default quota) and 5 local secondary indexes.

Sparse Index:

Sparse indexes are useful for queries over a small subsection of a table.

Aggregations:

Lambda with dynamo DB streams is used to perform aggregation operations without affecting the performance of the table.

Example: Count number of downloads of a selected song in a month

GSI Overloading:

GSI overloading can be done by creating a common attribute with name Data on the base table and store a variety of items. Create GSI with sort key(SK) of the base table as Primary key(PK) and Data attribute as a sort key(SK). This will allow performing a range of queries with a keyword search in sort key for a selected partition

GSI Sharding:

GSI sharding is used to enhance the efficiency of queries performed over a small set of data distributed randomly in the table. To enable selective queries across the entire keyspace, write sharding by adding an attribute containing a (0-N) value to every item that will use for the global secondary index partition key.

Global secondary indexes will create eventually consistent replica of a table. Creating separate read replicas will help to provision different read capacity with a priority basis.

Boto3 example for creating DynamoDB table and mapping relations using primary key and secondary key

  1. Create an Organization
  2. Add teams
  3. Add employees
  4. Assign employees to teams
  5. Create GSI for reverse lookup

Access patters:

  1. Get Organizational details #PK(org_id)
  2. Get all teams related to the organization #PK(org_id)#SK(begins_with: #TEAM)
  3. Get all employees details related to the organization #PK(org_id)#SK(begins_with:#EMP)
  4. Get Team details GSI reverse lookup using #SK(team_id)
  5. Get employee details GSI reverse lookup #SK(emp_id)

Setup DynamoDB table

import hashlib
import uuid

import boto3

client = boto3.client('dynamodb')
table_name = 'org-data'
def create_table():
response = client.create_table(
AttributeDefinitions=[
{
'AttributeName': 'PK',
'AttributeType': 'S',
},
{
'AttributeName': 'SK',
'AttributeType': 'S',
},
],
KeySchema=[
{
'AttributeName': 'PK',
'KeyType': 'HASH',
},
{
'AttributeName': 'SK',
'KeyType': 'RANGE',
},
],
ProvisionedThroughput={
'ReadCapacityUnits': 5,
'WriteCapacityUnits': 5,
},
TableName=table_name,
)
print(response)

Add organization details with name as primary key

def add_organization_details():
org_name = 'ABC'
orgId = hashlib.sha1(org_name.encode('utf-8')).hexdigest()
response = client.put_item(
Item={
'PK': {
'S': f'ORG#{orgId}',
},
'SK': {
'S': F'#METADATA#{orgId}',
},
'name': {
'S': org_name
},
'type': {
'S': 'Retail'
}
},
ReturnConsumedCapacity='TOTAL',
TableName=table_name,
)
print(response)

Add team details with org name as primary key and team name as the secondary key

org_name = 'ABC'
orgId = hashlib.sha1(org_name.encode('utf-8')).hexdigest()
teams = [{'name': 'mobile_team', 'details': 'mobile app team'},
{'name': 'web_team', 'details': 'web development team'},
{'name': 'dev_ops', 'details': 'devops team'}
def add_team_details():
for team in teams:
teamId = hashlib.sha1(team['name'].encode('utf-8')).hexdigest()
response = client.put_item(
Item={
'PK': {
'S': f'ORG#{orgId}',
},
'SK': {
'S': f'#TEAM#{teamId}',
},
'name': {
'S': team['name']
},
'details': {
'S': team['details']
}
},
ReturnConsumedCapacity='TOTAL',
TableName=table_name,
)
print(response)

update organizational details

def update_organization_detail():
response = client.update_item(
ExpressionAttributeNames={
'#org_id': 'org_id',

},
ExpressionAttributeValues={
':org_id': {
'S': orgId
}
},
Key={
'PK': {
'S': f'ORG#{orgId}',
},
'SK': {
'S': f'#METADATA#{orgId}',
}
},
ReturnValues='ALL_NEW',
TableName=table_name,
UpdateExpression='SET #org_id = :org_id',
)

print(response)

Define access patterns

Get organizational details

def get_organization_details():
response = client.get_item(
Key={
'PK': {
'S': f'ORG#{orgId}',
},
'SK': {
'S': f'#METADATA#{orgId}',
},
},
TableName=table_name,
)

print(response)

Get all team details

def get_all_team_details():
response = client.query(
ExpressionAttributeValues={
':PK': {
'S': f'ORG#{orgId}',
},
':SK': {
'S': '#TEAM',
},
},
KeyConditionExpression='PK = :PK and begins_with(SK, :SK)',
TableName=table_name,
)

print(response

Add employee details

emp_details = [
{'name': 'Maureen', 'email': 'maureen@test.com', 'details': 'Excepteur sint occaecat cupidatat non proident'},
{'name': 'Allie', 'email': 'allie@test.com', 'details': 'Lorem ipsum dolor sit amet, consectetur adipiscing elit'},
{'name': 'Teri', 'email': 'teri@test.com', 'details': 'Nemo enim ipsam voluptatem quia voluptas sit '}]


def add_employee_details():
for emp in emp_details:
emp_name = emp['name']
empId = hashlib.sha1(emp_name.encode('utf-8')).hexdigest()
response = client.put_item(
Item={
'PK': {
'S': f'ORG#{orgId}',
},
'SK': {
'S': f'EMP#{empId}',
},
'name': {
'S': emp['name']
},
'email': {
'S': emp['email']
}
},
ReturnConsumedCapacity='TOTAL',
TableName=table_name,
)
print(response)

Assign an employee to a team

def assign_employee_to_a_team():
orgId = '3c01bdbb26f358bab27f267924aa2c9a03fcfdb8'
web_team_id = '49b6eed059e9ce2afe97151e875cddc3214c2d69'
emp_name = emp_details[0]['name']
emp_id = hashlib.sha1(emp_name.encode('utf-8')).hexdigest()
response = client.put_item(
Item={
'PK': {
'S': f'ORG#{orgId}#TEAM#{web_team_id}',
},
'SK': {
'S': f'ORG#{orgId}#EMP#{emp_id}',
},
'name': {
'S': emp_details[0]['name']
},
'email': {
'S': emp_details[0]['email']
}
},
ReturnConsumedCapacity='TOTAL',
TableName=table_name,
)
print(response)

Query employees assigned to a team

def query_employees_assigned_to_a_team():
orgId = '3c01bdbb26f358bab27f267924aa2c9a03fcfdb8'
team_id = '49b6eed059e9ce2afe97151e875cddc3214c2d69'
response = client.query(
ExpressionAttributeValues={
':PK': {
'S': f'ORG#{orgId}#TEAM#{team_id}',
}
},
KeyConditionExpression='PK = :PK',
TableName=table_name,
)

print(response)

Create GSI index with reverse lookup

Get employee details by id

def get_employee_details_by_id():
emp_id = '66913d89866bbd575213fd397bf342b63f1c7c58'
response = client.query(
ExpressionAttributeValues={
':SK': {
'S': f'EMP#{emp_id}',
}
},
IndexName='reversed_lookup',
KeyConditionExpression='SK = :SK',
TableName=table_name,
)

print(response)

--

--

Ram Vadranam
Ram Vadranam

Written by Ram Vadranam

Blending Technology and Innovation: Navigating the Cloud, Unraveling AI Mysteries, and Empowering Entrepreneurial Journeys

No responses yet