
Large-scale Data Processing with Step Functions : AWS Project

AWS Step Functions Amazon SQS Amazon S3 AWS Lambda AWS Fargate Amazon ECS DynamoDB
  1. AWS Step Functions makes it easy to coordinate the components of distributed applications and microservices using visual workflows.
    The solution uses AWS Step Functions to provide end-to-end orchestration for processing billions of records with your simulation or transformation logic, using the Step Functions Distributed Map and Activity features. At the start of the workflow, Step Functions scales the number of workers up to a configurable predefined count. It then reads the dataset and distributes metadata about it in batches to the Activity. The workers poll the Activity for data to process. Upon receiving a batch, a worker processes the data and reports back to Step Functions that the batch is complete. This cycle continues until all records in the dataset have been processed, at which point Step Functions scales the workers back to zero.

    #Workflow Automation #Workflows #Automation 67 social mentions
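The worker loop described above (poll the Activity, process a batch, report the result) can be sketched with the AWS SDK for Python. This is a minimal illustration, not the project's actual code: the Activity ARN, worker name, and the doubling logic in `process_batch` are placeholders.

```python
import json

def process_batch(records):
    # Placeholder transformation logic: double each record's value.
    # A real worker would run the simulation/transformation here.
    return [{"id": r["id"], "value": r["value"] * 2} for r in records]

def run_worker(activity_arn, worker_name="worker-1"):
    import boto3  # imported lazily so process_batch stays testable without AWS
    sfn = boto3.client("stepfunctions")
    while True:
        # get_activity_task long-polls for up to 60 seconds and returns
        # an empty taskToken when no work is available.
        task = sfn.get_activity_task(activityArn=activity_arn,
                                     workerName=worker_name)
        token = task.get("taskToken")
        if not token:
            continue  # nothing to do yet; poll again
        try:
            batch = json.loads(task["input"])
            result = process_batch(batch["records"])
            # Report SUCCESS back to Step Functions for this batch.
            sfn.send_task_success(taskToken=token,
                                  output=json.dumps({"processed": len(result)}))
        except Exception as exc:
            # Report FAILURE so the workflow can retry or record the error.
            sfn.send_task_failure(taskToken=token,
                                  error="ProcessingError", cause=str(exc))
```

A container entrypoint would simply call `run_worker` with the Activity ARN; scaling workers up and down then amounts to changing the number of running containers.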

  2. Amazon Simple Queue Service is a fully managed message queuing service.
    Amazon SQS — Fully managed message queuing for microservices, distributed systems, and serverless applications.

    #Data Integration #Stream Processing #Web Service Automation 72 social mentions

  3. Amazon S3 is an object storage where users can store data from their business on a safe, cloud-based platform. Amazon S3 operates in 54 availability zones within 18 geographic regions and 1 local region.
    Amazon S3 — Object storage built to retrieve any amount of data from anywhere.

    #Cloud Hosting #Object Storage #Cloud Storage 196 social mentions

  4. AWS Lambda is a serverless, event-driven compute service.
    You will build a Step Functions workflow that processes healthcare claims data in a highly parallel fashion. The workflow uses the Distributed Map state to run multiple child workflows, each processing a batch of the overall claims data. Each child workflow picks a set of individual claims files and processes them with AWS Lambda functions that load the data into an Amazon DynamoDB table and then apply rules to determine the validity of the claims. After processing the claims, the functions return the output to the workflow.

    #Cloud Computing #Cloud Hosting #Business & Commerce 273 social mentions
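One of the claim-processing Lambda functions described above might look roughly like the following sketch. The table name, the required fields, and the validity rules are illustrative assumptions, not the project's actual implementation.

```python
REQUIRED_FIELDS = ("claim_id", "member_id", "amount")

def validate_claim(claim):
    # Placeholder rules: every required field present, amount positive.
    if any(f not in claim for f in REQUIRED_FIELDS):
        return {"claim_id": claim.get("claim_id"), "valid": False,
                "reason": "missing field"}
    if claim["amount"] <= 0:
        return {"claim_id": claim["claim_id"], "valid": False,
                "reason": "non-positive amount"}
    return {"claim_id": claim["claim_id"], "valid": True}

def handler(event, context):
    import boto3  # imported lazily so the rules above stay unit-testable
    table = boto3.resource("dynamodb").Table("Claims")  # assumed table name
    results = []
    for claim in event["claims"]:
        table.put_item(Item=claim)             # load the raw claim
        results.append(validate_claim(claim))  # then apply the rules
    # Returned to the parent workflow as the child workflow's output.
    return {"results": results}
```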

  5. AWS Fargate is a compute engine for Amazon ECS and EKS that allows you to run containers without having to manage servers or clusters.
    The workers in this example are containers running in Amazon Elastic Container Service (ECS) with an AWS Fargate capacity provider, though they could run almost anywhere, so long as they can poll the Step Functions Activity and report SUCCESS or FAILURE back to Step Functions.

    #Developer Tools #DevOps Tools #Containers As A Service 50 social mentions

  6. Amazon EC2 Container Service is a highly scalable, high-performance container management service that supports Docker containers.
    The workers in this example are containers running in Amazon Elastic Container Service (ECS) with an AWS Fargate capacity provider, though they could run almost anywhere, so long as they can poll the Step Functions Activity and report SUCCESS or FAILURE back to Step Functions.

    #Developer Tools #Containers As A Service #Cloud Computing 52 social mentions

  7. Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed cloud database and supports both document and key-value store models.
    You will build a Step Functions workflow that processes healthcare claims data in a highly parallel fashion. The workflow uses the Distributed Map state to run multiple child workflows, each processing a batch of the overall claims data. Each child workflow picks a set of individual claims files and processes them with AWS Lambda functions that load the data into an Amazon DynamoDB table and then apply rules to determine the validity of the claims. After processing the claims, the functions return the output to the workflow.

    #Databases #NoSQL Databases #Relational Databases 119 social mentions
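A Distributed Map state of the kind described above can be sketched in Amazon States Language roughly as follows. The state names, S3 bucket, Lambda ARN, and batching limits are illustrative placeholders, not the project's actual definition:

```json
{
  "ProcessClaims": {
    "Type": "Map",
    "ItemReader": {
      "Resource": "arn:aws:states:::s3:listObjectsV2",
      "Parameters": { "Bucket": "claims-input-bucket" }
    },
    "ItemProcessor": {
      "ProcessorConfig": { "Mode": "DISTRIBUTED", "ExecutionType": "STANDARD" },
      "StartAt": "ValidateBatch",
      "States": {
        "ValidateBatch": {
          "Type": "Task",
          "Resource": "arn:aws:lambda:us-east-1:123456789012:function:validate-claims",
          "End": true
        }
      }
    },
    "ItemBatcher": { "MaxItemsPerBatch": 100 },
    "MaxConcurrency": 1000,
    "End": true
  }
}
```

Each child workflow execution receives one batch of items (here, up to 100 S3 object listings), which is how the "billions of records" scale is reached without a single execution handling everything.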
