Schema Migration from Solr to Elasticsearch / OpenSearch

Instead of manually converting Solr schema files to Elasticsearch or OpenSearch index mappings, our team created a script that helps automate the process. Migrating from Solr to Elasticsearch (or OpenSearch) has never been easier. | Continue reading


@blog.bigdataboutique.com | 1 year ago

Architectures of a Modern Data Platform

In this post, the first in a series, we look at how typical Data Warehouse and Data Lake architectures are designed and built, and at the technologies involved. | Continue reading


@blog.bigdataboutique.com | 1 year ago

Tuning Elasticsearch: Garbage Collection Algorithms

Our experts have set out to find which JVM GC algorithm works best with Elasticsearch. Should you use G1 GC or the Parallel GC? Is the recommendation going to be the same for all workloads? | Continue reading


@blog.bigdataboutique.com | 2 years ago

Tuning Elasticsearch: The Ideal Java Heap Size

In the pursuit of peak performance at the lowest possible cost, the Elasticsearch Java heap size plays a significant role. What is the right value for ES_HEAP_SIZE? Is there one number that always works? In this article we ignore all the disinformation and get to the optimal answer … | Continue reading


@blog.bigdataboutique.com | 2 years ago

Apache Kafka vs. Apache Pulsar

Apache Kafka has been the go-to publish-subscribe (pub-sub) messaging system for a while. It offers functionality for a wide range of enterprise use cases, along with a large ecosystem of tools and a dedicated community. But lately, upstart Apache Pulsar has been gaining ground. | Continue reading


@blog.bigdataboutique.com | 3 years ago

Exploratory Analysis and ETL with Presto and AWS Glue

In this post, we dive into the most common use case: exploring and managing data stored in S3 object storage, using Presto with schema data stored in AWS Glue. | Continue reading


@blog.bigdataboutique.com | 3 years ago

Presto Meets Elasticsearch – our Elasticsearch connector for Presto [video]

More often than not, we find ourselves implementing Big Data architectures that include these two technologies. Presto is usually deployed for what we call the “cold layer”, and Elasticsearch for the “hot layer”. Our connector finally makes it possible to connect them seamlessly. Here … | Continue reading


@blog.bigdataboutique.com | 3 years ago

Our High-Performance Elasticsearch Connector for Presto

Our Presto Elasticsearch Connector is built with performance in mind. We leveraged our deep knowledge of both Elasticsearch and Presto to build this production-ready, enterprise-grade connector that is up to any challenge. Here are some of the use cases it serves. | Continue reading


@blog.bigdataboutique.com | 4 years ago

Using Elastic Maps to visualize Covid-19 spread – Part 3

In this post in the blog series on using Kibana to visualize data on COVID-19, we'll visualize the data using maps, while also learning scripting and data ingestion basics. | Continue reading


@blog.bigdataboutique.com | 4 years ago

Visualizing Covid-19 spread with Elasticsearch and Kibana – Part 2

We're back with another post in this blog series on using Kibana to visualize data on COVID-19. In the previous post, we loaded the data and used Kibana's Discover app to explore it. This time we'll create some visualizations and add them to a dashboard. | Continue reading


@blog.bigdataboutique.com | 4 years ago

Pulumi Drives Our Elasticsearch Capacity Planning and Cost Optimization Service

Continue reading


@blog.bigdataboutique.com | 4 years ago

The Apache Iceberg Table Format Is the Bright Future of Data Warehousing

Continue reading


@blog.bigdataboutique.com | 4 years ago