Exporting Data from RDS Postgres to S3 with NodeJs & Lambda

Streaming Data from RDS/Postgres to S3 CSV using Lambda and NodeJs

Alternatives Considered

The following alternatives to pg-copy-streams were considered but disqualified for the reasons outlined below.

“Streamless” Approach using Lambda Filesystem/Memory

In this scenario, interim data would be pre-processed/parsed in memory, with the final CSV file saved to the Lambda's local file system (/tmp) before being uploaded to S3. Because Lambda memory and the /tmp file system are limited in size, this approach does not scale to large result sets.

AWS RDS PostgreSQL Extension (aws_s3)

The aws_s3 RDS Postgres extension can be used to export to S3 via the aws_s3.query_export_to_s3 function; however, the extension is only supported for PG database engine versions 10.14, 11.9, 12.4 and later. Engines running versions below 10.14 are not supported.
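For engines where the extension is supported, the export call looks roughly like the following. This is a minimal sketch issued through the Node.js pg client, reusing the sample table, bucket and region from this article, and it assumes the aws_s3 extension has been installed and the RDS instance has an IAM role permitting S3 export.

// Minimal sketch: export directly from Postgres to S3 via aws_s3.query_export_to_s3.
// Requires a supported engine version, CREATE EXTENSION aws_s3, and an IAM role
// attached to the RDS instance that allows writing to the target bucket.
async function exportViaAwsS3Extension(client) {
  return client.query(
    `SELECT * FROM aws_s3.query_export_to_s3(
       'select * from public.pg_sample_tbl',
       aws_commons.create_s3_uri('pg-rds-csv-exports', 'pg-sample-tbl.csv', 'us-east-1'),
       options := 'format csv, header true'
     )`
  );
}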

Proposed Solution — Limitations

Before deciding to adopt the outlined approach, assess whether the components/services used would fulfil the processing and capacity needs of your requirements. For example, if the RDS query submitted by the Lambda function has an expected execution time of 30 minutes, then Lambda may not be an appropriate service, given the 15-minute timeout quota that applies to functions. Refer to the list of Lambda quotas before making a final decision.

Sample Lambda Function

The function below streams query results from RDS Postgres to a CSV file in S3.

  • Postgres Export Query: select * from public.pg_sample_tbl
  • RDS Endpoint: sample.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com
  • Database Name: postgres
  • Database Credentials: these should be stored in Secrets Manager and retrieved at runtime (a sketch follows this list)
    Username: postgres
    Password: sampleonly
  • S3 Bucket for CSV Export: s3://pg-rds-csv-exports
  • CSV Filename: pg-sample-tbl.csv
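As noted in the credentials item above, the username and password should be retrieved from Secrets Manager at runtime rather than hard-coded. A minimal sketch of that lookup is shown below, assuming the aws-sdk v2 client; the secret name and the JSON key names are illustrative assumptions, not values from this article.

const AWS = require('aws-sdk');

const secretsManager = new AWS.SecretsManager();

// Illustrative only: fetch database credentials from Secrets Manager at runtime.
// The secret name and JSON shape below are assumptions for this sketch.
async function getDbCredentials(secretId = 'pg-rds-export-credentials') {
  const { SecretString } = await secretsManager
    .getSecretValue({ SecretId: secretId })
    .promise();
  return JSON.parse(SecretString); // e.g. { username: 'postgres', password: '...' }
}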

Lambda Source Code

main.js

Lambda Function for Streaming Data from RDS/Postgres to S3 CSV file
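A minimal sketch of such a handler is shown below, assuming the pg, pg-copy-streams and aws-sdk (v2) packages and the sample values listed earlier; credentials are hard-coded here only to keep the sketch short, and the original main.js may differ in detail.

const AWS = require('aws-sdk');
const { Client } = require('pg');
const { to: copyTo } = require('pg-copy-streams');

const s3 = new AWS.S3();

exports.handler = async () => {
  // Sample connection details from above; in practice, retrieve credentials
  // from Secrets Manager at runtime rather than hard-coding them.
  const client = new Client({
    host: 'sample.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com',
    port: 5432,
    database: 'postgres',
    user: 'postgres',
    password: 'sampleonly',
  });

  await client.connect();

  try {
    // COPY ... TO STDOUT streams rows as CSV without buffering the full
    // result set in Lambda memory.
    const copyQuery =
      'COPY (select * from public.pg_sample_tbl) TO STDOUT WITH (FORMAT csv, HEADER true)';
    const pgStream = client.query(copyTo(copyQuery));

    // Pipe the Postgres stream straight into a managed S3 multipart upload.
    await s3
      .upload({
        Bucket: 'pg-rds-csv-exports',
        Key: 'pg-sample-tbl.csv',
        Body: pgStream,
        ContentType: 'text/csv',
      })
      .promise();

    return {
      statusCode: 200,
      body: 'Exported to s3://pg-rds-csv-exports/pg-sample-tbl.csv',
    };
  } finally {
    await client.end();
  }
};

Because s3.upload accepts a readable stream as its Body, the query results flow from Postgres to S3 as a managed multipart upload without the full CSV ever being buffered in Lambda memory or /tmp.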

Deployment and Testing — AWS Cloud

Separate knowledge articles will be posted in due course to cover environment creation and function deployment/testing using the AWS Free tier.

Typical Deployment Overview

A high-level overview of the deployment steps and components required is as follows:

  • Create S3 bucket (location for CSV exports)
  • Create Lambda zip deployment package
  • Create new IAM Lambda Execution Role, with trust relationship allowing Lambda service to assume this role
  • Create new IAM policy with permissions to resources accessed by the function
  • Attach this policy to the Execution Role
  • During function creation, specify the Execution Role and path to the zip package

Sandpit/Performance Testing

To evaluate the performance of the function, populate the source Postgres table with test data that reflects expected volumes. Include iterations to cover longer-term data volume forecasts.
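One way to generate volume quickly is Postgres's generate_series, driven from the same Node.js client. The helper below is a hypothetical sketch; the column list is an assumption, since the structure of public.pg_sample_tbl is not defined in this article.

// Hypothetical seeding helper: the (id, payload) columns are assumptions made
// for illustration; adjust the INSERT to match the real table definition.
async function seedTestRows(client, rowCount) {
  await client.query(
    `INSERT INTO public.pg_sample_tbl (id, payload)
     SELECT gs, md5(random()::text)
     FROM generate_series(1, $1) AS gs`,
    [rowCount]
  );
}

// Example: await seedTestRows(client, 5000000);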

Closing Points

Several factors influence the selection of services and languages for a particular solution.
