OctoSQL[0] or DuckDB[1] will most likely be much simpler, while going through 10 GB of JSON in a couple seconds at most. Disclaimer: author of OctoSQL [0]: https://github.com/cube2222/octosql. - Source: Hacker News / about 1 year ago
This is really cool! With their Postgres scanner[0] you can now easily query multiple datasources using SQL and join between them (i.e. Postgres table with JSON file). Something I strived to build with OctoSQL[1] before. It's amazing to see how quickly DuckDB is adding new features. Not a huge fan of C++, which is right now used for authoring extensions, it'd be really cool if somebody implemented a Rust extension... - Source: Hacker News / about 1 year ago
Congrats on the Show HN! It's great to see more tools in this area (querying data from various sources in-place) and the Lambda use case is a really cool idea! I've recently done a bunch of benchmarking, including ClickHouse Local and the usage was straightforward, with everything working as it's supposed to. Just to comment on the performance area though, one area I think ClickHouse could still possibly improve... - Source: Hacker News / over 1 year ago
SPyQL is really cool and its design is very smart, with it being able to leverage normal Python functions! As far as similar tools go, I recommend taking a look at DataFusion[0], dsq[1], and OctoSQL[2]. DataFusion is a very (very very) fast command-line SQL engine but with limited support for data formats. Dsq is based on SQLite which means it has to load data into SQLite first, but then gives you the whole breadth... - Source: Hacker News / over 1 year ago
To add somewhat of a counterpoint to the other response, I've tried the Steampipe CSV plugin and got 50x slower performance vs OctoSQL[0], which is itself 5x slower than something like DataFusion[1]. The CSV plugin doesn't contact any external APIs so it should be a good benchmark of the plugin architecture, though it might just not be optimized yet. That said, I don't imagine this ever being a bottleneck for the... - Source: Hacker News / over 1 year ago
Actually, folks just use gRPC or Yaegi in Go. See Terraform[0], Traefik[1], or OctoSQL[2]. Although I agree plugins would be welcome, especially for performance reasons, though also to be able to compile and load go code into a running go process (JIT-ish). [0]: https://github.com/hashicorp/terraform [1]: https://github.com/traefik/traefik [2]: https://github.com/cube2222/octosql Disclaimer: author of OctoSQL. - Source: Hacker News / over 1 year ago
This looks really cool! Especially using datafusion underneath means that it probably is blazingly fast. If you like this, I recommend taking a look at OctoSQL[0], which I'm the author of. It's plenty fast, easier to add new data sources for as external plugins, can handle endless streams of data natively and is able to push down predicates to the database below (so if you're selecting 10 rows from a 1 billion row... - Source: Hacker News / over 1 year ago
Through more magic, you COULD of course use stuff like Spark, or easier with programs like TextQL, sq, OctoSQL. Source: over 1 year ago
The logo was created for OctoSQL and in the article you can find a lot of sample phrase-image combinations, as it describes the whole path (generation, variation, editing) I went down. Let me know what you think! Source: over 1 year ago
Hey, author here, happy to answer any questions! The logo was created for OctoSQL[0] and in the article you can find a lot of sample phrase-image combinations, as it describes the whole path (generation, variation, editing) I went down. Let me know what you think! [0]:https://github.com/cube2222/octosql. - Source: Hacker News / over 1 year ago
And as you can see, the results go in a very different direction. [0]: https://github.com/cube2222/octosql [1]: https://github.com/dcmoura/spyql/blob/master/notebooks/json_benchmark.ipynb. - Source: Hacker News / over 1 year ago
OctoSQL allows you to join data from different sources using SQL (24 comments). Source: almost 2 years ago
Since many people are sharing one-liners with various tools... OctoSQL[0]: octosql 'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi.csv GROUP BY passenger_count' [0]: https://github.com/cube2222/octosql. - Source: Hacker News / almost 2 years ago
Hey, congrats on the journey! Especially on the community building aspect, it's really impressive that you've been able to spark so many communities on various platforms (Reddit, GitHub, Discord, etc.)! On a more technical note, since dsq is based on the "load it into SQLite and query it from there" architecture, have you considered integrating with the plugin ecosystems of other existing projects, like... - Source: Hacker News / almost 2 years ago
I recommend adding telemetry to your project (and I know a lot of people feel strongly about this, so I'll add: with a very easy way of disabling it). In OctoSQL[0] I'm literally just sending JSON files with coarse information about 1. Invocations of the CLI, 2. Features used in these invocations, to a VM on DigitalOcean (with a 10-line server receiving them and writing to a JSON file - which I can then process... - Source: Hacker News / almost 2 years ago
Hey, OctoSQL[0] author here. This is an updated benchmark originally posted a week ago[1] and is a bit of an update to the assertion made at that time. It now includes clickhouse-local, which is very very fast, as well as a new release of OctoSQL where I tackled some low hanging fruits to get a huge boost in performance (happy to expand in case somebody's interested), which resulted in substantial changes to the... - Source: Hacker News / about 2 years ago
Great article, just skimmed it, but will definitely dive deeper into it. I thought Go is doing full monomorphization. As another datapoint I can add that I tried to replace the interface{}-based btree that I use as the main workhorse for grouping in OctoSQL[0] with a generic one, and got around 5% of a speedup out of it in terms of records per second. That said, compiling with Go 1.18 vs Go 1.17 got me a 10-15%... - Source: Hacker News / about 2 years ago
OctoSQL is primarily a command-line application that allows you to query a variety of databases and filetypes using SQL in a single interface, as well as perform JOINS between them. OctoSQL is a fully expandable, fully-featured dataflow engine that can be used to provide a SQL interface for your applications. It validates and optimizes queries based on database types. It may process massive volumes of data and... - Source: dev.to / about 2 years ago
> [0]:https://github.com/cube2222/octosql nice tool, thanks for making this available. - Source: Hacker News / over 2 years ago
That's harsh, could you explain why, please? As far as I understand (and I'm the author) `COUNT(*)` means "count the number of rows, regardless of the values". `COUNT(id)` means "count all non-NULL id's" which ends up being the same for non-nullable columns, but requires additional processing for nullable ones. Thus the first option should be able to optimize out more processing time in some situations and at... - Source: Hacker News / over 2 years ago
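The distinction drawn in the comment above (counting all rows versus counting only non-NULL values of a column) is standard SQL behaviour and can be checked with a minimal sketch. This one uses Python's built-in `sqlite3` module rather than OctoSQL itself; the table name `t` and its contents are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; table "t" and its rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")  # "id" is nullable here
conn.executemany("INSERT INTO t (id) VALUES (?)", [(1,), (None,), (3,)])

# COUNT(*) counts rows regardless of their values;
# COUNT(id) counts only the rows where id IS NOT NULL.
count_all, count_id = conn.execute(
    "SELECT COUNT(*), COUNT(id) FROM t"
).fetchone()

print(count_all, count_id)
conn.close()
```

With one NULL among three rows, the two counts differ by one. On a column declared `NOT NULL` they would always agree, which is the case the comment describes as "ends up being the same".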
I don't really want a wholly different language - I'm happy with SQL most of the time. (I've even written a tool for ubiquitous data access with SQL - OctoSQL - so there's that.) Source: over 2 years ago