Fully Managed
AWS Glue is a fully managed ETL (Extract, Transform, Load) service, which means you don't need to manage any underlying infrastructure. This reduces the operational overhead and allows you to focus on the data processing tasks.
Scalability
AWS Glue can automatically scale resources up or down based on the demand and workload, ensuring optimal performance without manual intervention.
Serverless
Being serverless, there are no servers to manage or maintain. You only pay for the resources that you consume, which can result in significant cost savings.
Integrated Data Catalog
AWS Glue comes with a built-in data catalog that helps you organize and discover your data. It automatically indexes and maintains metadata about your data, making it easier to manage.
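As a rough illustration of what "discovering" data in the catalog can look like from code, the sketch below lists the tables in one catalog database and summarizes their columns. It assumes a Boto3 Glue client (or any object exposing the same `get_tables` call) is passed in; the database name `sales_db` and the helper names are hypothetical and not part of AWS Glue itself.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_catalog_tables(glue_client: Any, database_name: str) -> Iterator[Dict[str, Any]]:
    """Yield every table definition in one catalog database, following NextToken."""
    kwargs: Dict[str, Any] = {"DatabaseName": database_name}
    while True:
        page = glue_client.get_tables(**kwargs)
        yield from page.get("TableList", [])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def summarize_table(table: Dict[str, Any]) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (table name, [(column name, column type), ...]) for one table definition."""
    columns = table.get("StorageDescriptor", {}).get("Columns", [])
    return table["Name"], [(col["Name"], col.get("Type", "")) for col in columns]


def demo() -> None:
    """Example wiring; needs boto3, AWS credentials, and a configured region."""
    import boto3  # imported lazily so the helpers above work without it

    glue = boto3.client("glue")
    for table in iter_catalog_tables(glue, "sales_db"):  # hypothetical database name
        print(summarize_table(table))
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stub object before pointing them at a real account.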
Support for Multiple Data Sources
AWS Glue supports a variety of data sources including Amazon S3, RDS, Redshift, and many external databases, providing flexibility in your ETL processes.
Developer Tools
AWS Glue provides developer endpoints for custom ETL logic and integrates with the AWS SDKs (including Boto3 for Python) and the AWS CLI, allowing for a flexible development experience.
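To make the Boto3 integration concrete, here is a minimal sketch of starting an existing Glue job and polling until it finishes. It assumes the job already exists and that a Boto3 Glue client is passed in; the function name `run_glue_job`, the job name `nightly-etl`, the `--target_path` argument, and the polling defaults are all illustrative assumptions.

```python
import time
from typing import Any, Dict, Optional

# Job run states after which the run will not change any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_glue_job(
    glue_client: Any,
    job_name: str,
    arguments: Optional[Dict[str, str]] = None,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
) -> str:
    """Start a job run and return its final state, polling get_job_run until done."""
    params: Dict[str, Any] = {"JobName": job_name}
    if arguments:
        params["Arguments"] = arguments
    run_id = glue_client.start_job_run(**params)["JobRunId"]

    for _ in range(max_polls):
        run = glue_client.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_name} run {run_id} did not finish after {max_polls} polls")


def demo() -> None:
    """Example wiring; needs boto3, AWS credentials, and an existing Glue job."""
    import boto3  # imported lazily so run_glue_job can be tested without it

    glue = boto3.client("glue")
    # Both the job name and the argument below are hypothetical.
    print(run_glue_job(glue, "nightly-etl", {"--target_path": "s3://example-bucket/out/"}))
```

Polling is the simplest option for a script; in a larger setup, reacting to job state change events would likely be a better fit than a sleep loop.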
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analysis. It helps bridge the gap between our MongoDB Atlas data and the services we'll use for recommendation. - Source: dev.to / about 1 year ago
AWS Glue is a fully managed extract, transform, and load (ETL) service provided by Amazon Web Services (AWS). It is designed to make it easy for users to prepare and load their data for analysis. AWS Glue simplifies the process of building and managing ETL workflows by providing a serverless environment for running ETL jobs. - Source: dev.to / about 1 year ago
It is a serverless data integration service that lets you easily scale your workloads when preparing data and moving transformed data into a target location. - Source: dev.to / over 1 year ago
So in the next post, we'll do that: We'll take what we've done here, add a few more components with Pulumi and AWS Glue, and wire it all up with a few magical lines of Python scripting. - Source: dev.to / over 2 years ago
Once it's in a Data Lake then you have different options depending on the analytics you need. For more advanced constant analytics then you could look into Amazon Kinesis Data Analytics instead of Firehose to S3, but for Ad-Hoc queries then this is where Glue and Athena come in. - Source: dev.to / over 2 years ago
You will want to use metrics based on operations outcomes to gain useful insights. Now you want to do analytics on your logs and use CloudWatch Logs Insights or store the logs in Amazon S3, which then triggers an AWS Glue crawler to create an AWS Glue Data Catalog that can then be queried with Amazon Athena using standard SQL. The results can be visualized in Amazon QuickSight. - Source: dev.to / over 2 years ago
Storing data in S3 has an additional benefit, given how well it integrates with other AWS services. For instance, you can use Amazon Athena to query your S3 data, or Amazon Rekognition to analyze it. Additionally, you can use AWS Glue to perform extract, transform, and load (ETL) operations. To create ad hoc visualizations and business analysis reports, Amazon QuickSight can connect to your S3 buckets and produce... - Source: dev.to / over 2 years ago
Not 100% sure if this is what you need, but look into integrating AWS Glue. It should be able to keep the data source that Athena uses up to date in real time or close to it, from what I understand. Source: over 2 years ago
AWS Glue streaming ETL (Extract, Transform, and Load) can now detect compressed data streaming from Amazon Kinesis, Amazon Managed Streaming for Apache Kafka (Amazon MSK), and self-managed Apache Kafka. It can then automatically decompress this data without customers having to write code, saving them development hours. AWS Glue Streaming ETL jobs continuously consume data from streaming sources, cleans and... - Source: dev.to / over 2 years ago
Use some ETL service from AWS and push what has been processed to a Log Group: AWS Glue is the service that can be used for this purpose. So it's another option to make everything inside the cloud itself. - Source: dev.to / almost 3 years ago
Also, if you're doing this for an employer, and they have some deeper pockets, there is also AWS Data Pipeline. Source: almost 3 years ago
Why aren't you looking at AWS Glue to load the data from Postgres to Redshift? It's relatively inexpensive and purpose built for such tasks. Source: about 3 years ago
AWS Glue is a fully managed ETL service that makes it simple and cost-effective to categorize, clean, enrich, and migrate data from a source system to a data store for ML. - Source: dev.to / over 3 years ago
Unfortunately there's just so many options for data ingest. Any programming language could be used, and there's plenty of off-the-shelf software and SaaS solutions to do it too. For example, it could be done with AWS Data Pipeline (https://aws.amazon.com/datapipeline) or maybe there's just an EC2 virtual machine running some custom Python code that is doing it. Source: almost 4 years ago
Looks like that is an ETL system, so https://aws.amazon.com/glue/. Source: about 4 years ago
This is an informative page about AWS Glue. You can review and discuss the product here. The primary details have not been verified within the last quarter, and they might be outdated. If you think we are missing something, please use the means on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated, as they help everyone in the community make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.