Aleph Technologies Blog

Applications come and go, data stays.

Alexis

Alexis is the founder of Aleph Technologies, a data infrastructure consulting and professional services provider based in Brussels, Belgium.

Postgresql Performance on Ubuntu linux 18.04 and Amazon AWS EC2 iXL instance

Introduction: This is a follow-up to our Postgresql Performance on Ubuntu 18.04 and Nutanix post. We document here the same pgbench tests on an Amazon EC2 i-XL instance with NVMe SSD storage. Disclaimer: The following test results are not intended to give definitive numbers on the tested platform. They only illustrate the methodology used to test the platform for postgresql workloads. Sustained database benchmark with different linux kernel parameters: We tested.. Read More

Postgresql Performance on Ubuntu linux 18.04 and Nutanix

Background: On November 4, 2019, we were benchmarking a number of virtual machines on a new 12-node DELL 740XD-24 appliance under Nutanix AHV (with RDMA enabled). I was getting weird results and I tweeted the following: At the time, we were configuring both the Nutanix appliance and the Linux kernel parameters most suitable for a specific postgresql 9.5 workload. That was not a good idea and produced confusing test results since the actual.. Read More

GDPR article mapping to RDBMs feature availability

To close the article series on GDPR and the features available in each of the four database systems reviewed, a few acquaintances pointed out that a clearer mapping between the actual regulation text and the identified tasks, along with the features available on each database system, would be very helpful. This is the purpose of this article and I hope it will be of help to everyone trying to map.. Read More

GDPR demystified – Part 4 : Mysql / MariaDB

In the previous posts, we discussed GDPR and how we could enforce it in the scope of Oracle, SQL Server and Postgresql databases. We now cover the same topic with Mysql and Mariadb databases in mind. As always, please refer to the GDPR Data Security requirements laid down in the initial post. We will continue to walk through the same task and activity structure for all RDBMs we cover. Let’s go to the identified.. Read More

GDPR demystified – Part 3 : Postgresql

In the previous posts, we discussed GDPR and how we could enforce it in the scope of Oracle and SQL Server databases. We now cover the same topic with Postgresql databases in mind. As always, please refer to the GDPR Data Security requirements laid down in the initial post. We will continue to walk through the same task and activity structure for all RDBMs we cover. Let’s go to the identified GDPR tasks and requirements.. Read More

GDPR demystified – Part 2 : Oracle

In the previous post we talked about GDPR and how it could be enforced in the scope of SQL Server databases. We now cover the same topic but with Oracle databases in mind. Please refer to the GDPR Data Security requirements as stated in that post. We will religiously walk through the same task and activity structure for all RDBMs we cover. Let’s move fast to the identified GDPR tasks and requirements and then the.. Read More

GDPR demystified – Part 1 : SQL Server

After months (years?) of seeing warnings and calls for attention aimed at companies that are in one way or another impacted by the GDPR regulation, and after being asked the same questions over and over again, I decided to write a few posts covering the technologies that can be used to implement the technical mechanisms required to enforce EU GDPR. I will cover major RDBMS and today we start with.. Read More

Improving fuzzy searches in postgresql with pg_trgm

I have been researching postgresql search capabilities for quite a while. Besides the very powerful Full Text Search capabilities the engine offers today, there are a few lesser-known index types that support string searches of the form: where foo like ‘%bar%’. For these cases, a regular b-tree index on ‘foo’ is guaranteed to never be used. The following example illustrates a case in which a relatively simple query is transformed to.. Read More
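To give a rough idea of what pg_trgm works with, here is a minimal Python sketch of trigram extraction and similarity scoring. It is an illustration only, not the extension's actual code: the function names `trigrams` and `similarity` are hypothetical helpers, and the sketch assumes ASCII letters and digits only. It follows the rules described in the pg_trgm documentation: text is lowercased, split into words, each word is padded with two leading spaces and one trailing space, and similarity is the number of shared trigrams divided by the number of distinct trigrams in both strings.

```python
import re


def trigrams(text):
    """Return the set of trigrams of `text`, roughly as pg_trgm describes them.

    Simplification (assumption): only ASCII letters and digits count as
    word characters; everything else is treated as a word separator.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Two leading spaces and one trailing space around each word.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(sorted(trigrams("cat")))
    print(similarity("word", "two words"))
```

Because a trigram index stores these three-character pieces rather than whole values in sorted order, it can narrow down candidate rows for a pattern with a leading wildcard, which a b-tree cannot do.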

Note on Kafka consumer group offsets

I was recently testing Kafka 0.10 with the excellent Confluent Kafka Python client API and in the process ran into an initially confusing situation: even after dropping a topic and re-creating it, I was still seeing an old offset for a consumer group and didn’t understand how to clear it. I had been testing with both the old and the new Consumer API, and using different ways to obtain consumer group.. Read More

Maths and Machine Learning – Linear Algebra

I recently read the post by Wale Akinfaderin on the Mathematics of Machine Learning and was compelled to write on the subject as a way to keep track of some useful resources and pointers. Even with a decent mathematical background, I had to go through the process of refreshing certain mathematical concepts in order to understand machine learning algorithms and also make sense of the strengths and weaknesses.. Read More