Building Scalable Serverless Payment Processing with AWS Lambda, SQS, and Step Functions

Handling millions of payment transactions requires a scalable, resilient, and efficient system. One way to achieve this is to move from a traditional server-based approach to serverless payment processing.

Serverless architecture offers clear advantages: automatic scaling, cost efficiency, and reduced operational overhead. However, it also has drawbacks: managing state, tracking down bugs, and working within execution time limits.

That’s why a well-architected approach is essential. By combining AWS services such as Lambda, SQS, Step Functions, and DynamoDB, we can build an efficient and flexible payment processing system that keeps running smoothly even under heavy load.

Challenges of Serverless Payment Processing and Their Solutions

Serverless architecture is not a silver bullet; it comes with its own challenges. For instance, because serverless functions have execution time limits, they can struggle with long-running tasks.

Or consider a function under heavy load that depends on another function that scales more slowly. Such interdependence between components can make it harder to troubleshoot problems and grow your application.

Yet these limitations can be mitigated with the right configuration and design patterns. The solution in this article uses the following AWS services:

  • AWS Lambda for event-driven execution without provisioning or managing servers, reducing overhead.
  • Amazon SQS for asynchronous, decoupled messaging, so that different parts of the system can work independently.
  • AWS Step Functions for orchestrating workflows, coordinating tasks in a reliable and scalable sequence and giving better control over multi-step payment processes.
  • Amazon DynamoDB for persistently storing transaction records in a highly scalable, low-latency NoSQL database.

You May Need To Consider: As with any architecture, security remains a crucial consideration in serverless systems.

Prerequisites

To build a scalable serverless payment processing system, you will need the following:

  • AWS account to access and configure AWS services.
  • Basic knowledge of Python to write and manage Lambda functions.
  • Basic understanding of cloud computing and serverless application models.

Step-by-Step Implementation

Step 1: API Gateway → Lambda Function (Receiving Payment Requests)

In this step, API Gateway triggers a Lambda function when a payment request is received.

The function processes the payment and forwards transaction details to an SQS queue for further asynchronous handling.

1.1 API Gateway Configuration

The API Gateway is set up to accept POST requests at the dynamic endpoint /integrate/pay/{method}. It supports CORS for cross-origin requests.

ApiGatewayApi:
    Type: AWS::Serverless::Api
    Properties:
      Cors:
        AllowMethods: "'POST,GET,OPTIONS'"
        AllowHeaders: "'Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token'"
        AllowOrigin: "'*'"

1.2 Lambda Function Configuration

API Gateway triggers the Lambda function through a POST request. This function processes the payment request and sends the details to the SQS queue.

InitialLambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: 
      CodeUri: 
      Runtime: python3.10
      MemorySize: 128
      Timeout: 30
      Environment:
        Variables:
          SQS_QUEUE_URL: !Ref SQSMessageQueue
      Tracing: Active
      Events:
        ApiPOST:
          Type: Api
          Properties:
            Path: /integrate/pay/{method}
            Method: POST
            RestApiId:
              Ref: ApiGatewayApi
      Role: !Sub ${SQSMessageIntegrationRole.Arn}

Step 2: Lambda Function → SQS (Message Queue for Decoupling)

In this step, the Lambda function forwards payment requests to an SQS queue for decoupling and asynchronous processing. This setup enhances fault tolerance and prevents API Gateway timeouts.

2.1 SQS Queue Configuration

The SQS queue acts as a buffer between the API-triggered Lambda function and the downstream processing components. We can define the SQS resource as:

SQSMessageQueue:
    Type: AWS::SQS::Queue

Since the Lambda function does not have permission to send messages to SQS by default, we must define an IAM role with the necessary permissions.

2.2 IAM Role for SQS Access

The IAM role grants the Lambda function permissions to send messages to the SQS queue.

SQSMessageIntegrationRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: logs
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: arn:aws:logs:*:*:*
        - PolicyName: sqs
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - sqs:SendMessage
                Resource: !Sub ${SQSMessageQueue.Arn}

Note:

  • The SendMessage action enables the Lambda function to send messages to the queue.
  • Additional permissions like ReceiveMessage and DeleteMessage can be added if needed for further processing.

2.3 Lambda Handler Implementation

The following Python (boto3) Lambda function reads the method parameter from the request and sends it to the SQS queue.

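import json
import os

import boto3

SQS_QUEUE_URL = os.environ["SQS_QUEUE_URL"]

# Create the client once per execution environment so warm invocations reuse it.
sqs_client = boto3.client("sqs")


def lambda_handler(event, _):
    # API Gateway passes the {method} path segment in pathParameters.
    method = (event.get("pathParameters") or {}).get("method")
    sqs_client.send_message(
        QueueUrl=SQS_QUEUE_URL, MessageBody=json.dumps({"method": method})
    )
    # The proxy integration expects a response object; returning None yields a 502.
    return {"statusCode": 200, "body": json.dumps({"method": method})}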

Boto3 is the AWS SDK for Python, used to work with AWS resources. For more information, refer to the Boto3 Official Documentation.

Example Of Lambda Trigger and SQS Data Flow

Below is an example of a POST request sent to the /integrate/pay/khalti endpoint. The API Gateway triggers the Lambda function, which processes the request by extracting the method parameter (e.g., ‘Khalti’) and sends it to the SQS queue.

The API responds with a 200 OK status, confirming the successful processing of the request.

After the Lambda function processes the request, the method parameter (e.g., ‘Khalti’) is passed as a message to the SQS queue for asynchronous processing.
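The flow described above can be exercised locally without any AWS resources. The following is a minimal sketch, not the deployed handler: FakeSqsClient and handle_payment_request are hypothetical names introduced for illustration, the fake client only mimics the send_message keyword arguments used earlier, and the 400 response for a missing method is an assumption that the original handler does not implement.

```python
import json


class FakeSqsClient:
    """Hypothetical stand-in for the boto3 SQS client; keeps messages in memory."""

    def __init__(self):
        self.sent = []

    def send_message(self, QueueUrl, MessageBody):
        self.sent.append({"QueueUrl": QueueUrl, "MessageBody": MessageBody})
        return {"MessageId": str(len(self.sent))}


def handle_payment_request(event, sqs_client, queue_url):
    """Extract the {method} path parameter and enqueue it as a JSON message."""
    method = (event.get("pathParameters") or {}).get("method")
    if not method:
        # Assumption: reject requests that carry no payment method.
        return {"statusCode": 400, "body": json.dumps({"error": "missing method"})}
    sqs_client.send_message(
        QueueUrl=queue_url, MessageBody=json.dumps({"method": method})
    )
    return {"statusCode": 200, "body": json.dumps({"queued": method})}


# Simulated API Gateway event for POST /integrate/pay/khalti
fake_sqs = FakeSqsClient()
sample_event = {"pathParameters": {"method": "khalti"}}
response = handle_payment_request(sample_event, fake_sqs, "https://example.invalid/queue")
print(response["statusCode"], fake_sqs.sent[0]["MessageBody"])
```

Passing the client and queue URL as arguments keeps the logic testable; in Lambda they would come from boto3 and the SQS_QUEUE_URL environment variable.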

Step 3: SQS → Step Function (Workflow Orchestration)

AWS Step Functions lets developers implement logic in separate functions and run them in a defined order. It also supports parallel execution and makes workflows easy to debug and change.

In this workflow, Step Functions processes messages from SQS and invokes Lambda functions to handle payment processing. It coordinates multiple AWS services, including Lambda, SQS, and EventBridge.

3.1 Defining and Implementing the Step Function State Machine

To create a workflow, we define a State Machine using Amazon States Language (ASL) in JSON format. The state machine orchestrates a workflow consisting of two tasks, each invoking an AWS Lambda function: ValidatePayment and DeductAmount.

Below is the CloudFormation definition for the Step Function:

StateMachineProcess:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      StateMachineName: StateMachineProcess
      DefinitionString: !Sub | 
      TracingConfiguration:
        Enabled: true
      RoleArn: !GetAtt StateMachineRole.Arn

The JSON definition passed to DefinitionString is as follows:

{
          "Comment": "Arrange multiple Lambda functions",
          "StartAt": "ValidatePayment",
          "States": {
            "ValidatePayment": {
              "Type": "Task",
              "Resource": "${ValidatePayment.Arn}",
              "Parameters": {
                "Payload.$": "$"
              },
              "Retry": [
                {
                  "ErrorEquals": ["Lambda.TooManyRequestsException"],
                  "IntervalSeconds": 2,
                  "MaxAttempts": 5,
                  "BackoffRate": 2.0
                }
              ],
              "Next": "DeductAmount"
            },
            "DeductAmount": {
              "Type": "Task",
              "Resource": "${DeductAmount.Arn}",
              "Parameters": {
                "Payload.$": "$"
              },
              "Retry": [
                {
                  "ErrorEquals": ["Lambda.TooManyRequestsException"],
                  "IntervalSeconds": 2,
                  "MaxAttempts": 5,
                  "BackoffRate": 2.0
                }
              ],
              "End": true
            }
          }
        }
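To see what the Retry settings in this definition mean in practice, the wait before each retry can be worked out by hand: the first retry waits IntervalSeconds, and each later wait is multiplied by BackoffRate. The sketch below only does that arithmetic; retry_delays is a hypothetical helper, and it assumes no MaxDelaySeconds or jitter is configured, as in the definition above.

```python
def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Return the wait in seconds before each retry attempt."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


# Values taken from the Retry blocks of ValidatePayment and DeductAmount.
delays = retry_delays(interval_seconds=2, max_attempts=5, backoff_rate=2.0)
print(delays)       # [2.0, 4.0, 8.0, 16.0, 32.0]
print(sum(delays))  # 62.0
```

With these settings a throttled task is retried up to five times over roughly a minute before the state fails.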

3.2 IAM Role for Step Function

Like the Lambda function that sends messages to SQS, the state machine needs explicit permission to invoke the Lambda functions. An IAM role must therefore be defined to grant these permissions.

StateMachineRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: states.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: CloudWatchLogs
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - "logs:CreateLogDelivery"
                  - "logs:GetLogDelivery"
                  - "logs:UpdateLogDelivery"
                  - "logs:DeleteLogDelivery"
                  - "logs:ListLogDeliveries"
                  - "logs:PutResourcePolicy"
                  - "logs:DescribeResourcePolicies"
                  - "logs:DescribeLogGroups"
                Resource: "*"
        - PolicyName: StepFunctionInvokePolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - lambda:InvokeFunction
                Resource:
                  - !GetAtt ValidatePayment.Arn
                  - !GetAtt DeductAmount.Arn

3.3 Lambda Functions for Step Functions

With the above configuration, the state machine is allowed to invoke the Lambda functions it references.

We also need to define those Lambda functions:

ValidatePayment:
    Type: AWS::Serverless::Function
    Properties:
      Handler: 
      CodeUri: 
      Runtime: python3.10
      MemorySize: 128
      Timeout: 30
      Tracing: Active
      Description: This is lambda function one
DeductAmount:
    Type: AWS::Serverless::Function
    Properties:
      Handler: 
      CodeUri: 
      Runtime: python3.10
      MemorySize: 128
      Timeout: 30
      Tracing: Active
      Description: This is lambda function two

3.4 Triggering the State Machine from SQS

We use EventBridge Pipes to trigger Step Functions automatically when a message arrives in SQS. The pipe listens to SQS and starts executing the state machine.

SqsToStateMachine:
    Type: AWS::Pipes::Pipe
    Properties:
      Name: SqsToStateMachinePipe
      RoleArn: !GetAtt EventBridgePipesRole.Arn
      Source: !GetAtt SQSMessageQueue.Arn
      SourceParameters:
        SqsQueueParameters:
          BatchSize: 1
      Target: !Ref StateMachineProcess
      TargetParameters:
        StepFunctionStateMachineParameters:
          InvocationType: FIRE_AND_FORGET

The EventBridge pipe also requires permission to read from SQS and start the state machine. Here’s the IAM role that grants these permissions:

EventBridgePipesRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - pipes.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: CloudWatchLogs
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - "logs:CreateLogGroup"
                  - "logs:CreateLogStream"
                  - "logs:PutLogEvents"
                Resource: "*"
        - PolicyName: ReadSQS
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - "sqs:ReceiveMessage"
                  - "sqs:DeleteMessage"
                  - "sqs:GetQueueAttributes"
                Resource: !GetAtt SQSMessageQueue.Arn
        - PolicyName: ExecuteSFN
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - "states:StartExecution"
                Resource: !Ref StateMachineProcess

3.5 Lambda Handler Implementations

We implement two Lambda functions, ValidatePayment and DeductAmount, which work together in a Step Functions workflow.

The ValidatePayment function receives the SQS message batch as the execution input, parses the body of the first message, and returns the payment method to the next state.

import json

# ValidatePayment
def lambda_handler(event, _):
    body = json.loads(event["Payload"][0]["body"])  # Parse the body JSON string
    method = body["method"]  # Access the method key
    return {"statusCode": 200, "body": json.dumps({"method": method})}

The DeductAmount function processes the output of ValidatePayment:

import json

#DeductAmount
def lambda_handler(event, _):
    body = json.loads(event["Payload"]["body"])
    print(body)

The json.loads function is used to parse the body of the payload, allowing DeductAmount to handle the response data. As specified in the state machine, DeductAmount marks the end of the Step Function, completing the workflow.
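The data shapes passed between the two states can be traced locally. The sketch below is a simulation under stated assumptions, not the Step Functions runtime: run_workflow is a hypothetical helper that mimics the "Payload.$": "$" parameter by wrapping each state's input as {"Payload": input}, and the sample record contains only the fields this walkthrough needs.

```python
import json


def validate_payment(event, _context=None):
    # Same logic as the ValidatePayment handler above.
    body = json.loads(event["Payload"][0]["body"])
    return {"statusCode": 200, "body": json.dumps({"method": body["method"]})}


def deduct_amount(event, _context=None):
    # Same parsing as the DeductAmount handler, but returns the body for inspection.
    return json.loads(event["Payload"]["body"])


def run_workflow(state_input):
    """Mimic the two Task states: each wraps its input as {"Payload": input}."""
    validated = validate_payment({"Payload": state_input})
    return deduct_amount({"Payload": validated})


# The pipe delivers SQS messages as a list of records (one record with BatchSize: 1).
pipe_input = [{"messageId": "example-id", "body": json.dumps({"method": "khalti"})}]
result = run_workflow(pipe_input)
print(result)  # {'method': 'khalti'}
```

This makes the indexing difference visible: ValidatePayment reads event["Payload"][0] because its input is a list of messages, while DeductAmount reads event["Payload"] because its input is the single object returned by ValidatePayment.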

Step 4: Step Function → DynamoDB (Storing Transaction Data)

After executing the payment workflow in Step Functions, the final step is storing transaction details in Amazon DynamoDB. This enables secure, high-performance data storage and makes transaction records easily accessible.

4.1 Defining the DynamoDB Table

We define a DynamoDB table called MethodTable with the following schema:

MethodTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: MethodTable
    AttributeDefinitions:
      - AttributeName: payment_id
        AttributeType: S
      - AttributeName: payment_method
        AttributeType: S
    KeySchema:
      - AttributeName: payment_id
        KeyType: HASH
      - AttributeName: payment_method
        KeyType: RANGE
    BillingMode: PAY_PER_REQUEST

Key Descriptions:

  • Partition Key (payment_id): Uniquely identifies each transaction.
  • Sort Key (payment_method): Helps in filtering data based on payment methods.
  • Billing Mode (PAY_PER_REQUEST): On-demand capacity, so the table scales automatically with traffic.
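One consequence of this schema is that an item is identified by the pair of payment_id and payment_method, so a key lookup has to supply both attributes. The sketch below models only that behaviour: InMemoryTable is a hypothetical stand-in whose put_item and get_item mirror the keyword arguments of the boto3 Table methods, and the sample values are made up.

```python
class InMemoryTable:
    """Hypothetical stand-in that models only the composite-key behaviour."""

    def __init__(self):
        self._items = {}

    def put_item(self, Item):
        # Items are identified by the (partition key, sort key) pair.
        key = (Item["payment_id"], Item["payment_method"])
        self._items[key] = dict(Item)

    def get_item(self, Key):
        item = self._items.get((Key["payment_id"], Key["payment_method"]))
        # Like DynamoDB, omit the "Item" entry when nothing matches.
        return {"Item": item} if item else {}


table = InMemoryTable()
table.put_item(Item={"payment_id": "p-1", "payment_method": "khalti"})

# A key lookup must supply both attributes of the composite key.
found = table.get_item(Key={"payment_id": "p-1", "payment_method": "khalti"})
missing = table.get_item(Key={"payment_id": "p-1", "payment_method": "card"})
print(found, missing)
```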

4.2 Lambda Function to Store Transactions in DynamoDB

The DeductAmount Lambda function processes transactions and writes them to DynamoDB.

DeductAmount:
  Type: AWS::Serverless::Function
  Properties:
    Handler: 
    CodeUri: 
    Runtime: python3.10
    MemorySize: 128
    Timeout: 30
    Tracing: Active
    Description: Processes payment and stores transaction details
    Policies:
      - DynamoDBCrudPolicy:
          TableName: !Ref MethodTable
    Environment:
      Variables:
        METHOD_TABLE: !Ref MethodTable

4.3 Writing Data to DynamoDB from Lambda

The Lambda function generates a unique transaction ID and stores the data in DynamoDB.

import json
import os
import uuid

import boto3

METHOD_TABLE = os.environ["METHOD_TABLE"]

# Create the table handle once so warm invocations reuse it.
dynamo_table = boto3.resource("dynamodb").Table(METHOD_TABLE)


def lambda_handler(event, _):
    body = json.loads(event["Payload"]["body"])  # Parse the body JSON string
    method = body["method"]  # Access the method key

    # Named payment_id to avoid shadowing the built-in id().
    payment_id = str(uuid.uuid4())
    dynamo_table.put_item(Item={"payment_method": method, "payment_id": payment_id})

    return {"statusCode": 200, "body": json.dumps({"message": "Stored!!"})}

The boto3 library is used to access DynamoDB and store the data. Each item gets a unique ID generated with uuid, which makes every transaction uniquely identifiable and lets us look it up or update it later.
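The item construction can be separated from the DynamoDB call so that it is easy to check in isolation. This is a sketch under assumptions rather than part of the original handler: build_transaction_item and its id_factory parameter are hypothetical names, and the function only builds the dictionary that would be passed to put_item.

```python
import uuid


def build_transaction_item(method, id_factory=uuid.uuid4):
    """Build the item to store; id_factory can be swapped out in tests."""
    return {"payment_id": str(id_factory()), "payment_method": method}


first = build_transaction_item("khalti")
second = build_transaction_item("khalti")

# Two transactions with the same method still get distinct random IDs.
print(first["payment_id"] != second["payment_id"])
```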

Here is a snapshot of the DynamoDB record of a stored transaction with a unique ID and payment method:

Final Architecture Overview:

  • API Gateway triggers Lambda, which sends requests to SQS.
  • EventBridge Pipes forwards messages from SQS to Step Functions, initiating workflow execution.
  • Step Functions coordinate multiple Lambda functions for validation and processing.
  • DynamoDB stores transaction details for record-keeping.

Conclusion

The proposed serverless architecture combines AWS Lambda, SQS, Step Functions, and DynamoDB to provide a scalable, efficient, and resilient payment processing system.

The integration of these cloud services enables seamless handling of large transaction volumes. Additionally, configuring a Dead Letter Queue (DLQ) in SQS can further enhance the system’s resilience by capturing and isolating failed messages, which improves error handling and reliability.
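As a rough illustration of the DLQ idea, a source queue is linked to its dead-letter queue through the RedrivePolicy queue attribute. The sketch below only builds that attribute: build_redrive_attributes is a hypothetical helper, the ARN is a placeholder, and the default of three receives is an assumption rather than a recommendation from this article.

```python
import json


def build_redrive_attributes(dlq_arn, max_receive_count=3):
    """Return queue attributes that send repeatedly failing messages to a DLQ."""
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    }
    # SQS expects the RedrivePolicy attribute as a JSON-encoded string.
    return {"RedrivePolicy": json.dumps(redrive_policy)}


# Placeholder ARN for illustration only.
attributes = build_redrive_attributes("arn:aws:sqs:us-east-1:123456789012:payments-dlq")
print(attributes["RedrivePolicy"])
# With boto3, these could be applied via:
# sqs_client.set_queue_attributes(QueueUrl=queue_url, Attributes=attributes)
```

A message that is received more than maxReceiveCount times without being deleted is moved to the dead-letter queue, where it can be inspected without blocking the main queue.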