Aleph Technologies Blog

Applications come and go, data stays.

GDPR article mapping to RDBMS feature availability

This article closes the series on GDPR and the features available in each of the four database systems reviewed. A few acquaintances pointed out that a clearer mapping between the actual regulation text and the identified tasks, along with the features available on each database system, would be very helpful. That is the purpose of this article, and I hope it will be of help to everyone trying to map.. Read More

GDPR demystified – Part 4 : MySQL / MariaDB

In the previous posts, we discussed GDPR and how we could enforce it in the scope of Oracle, SQL Server and PostgreSQL databases. We now cover the same topic with MySQL and MariaDB databases in mind. As always, please refer to the GDPR Data Security requirements laid down in the initial post. We will continue to walk the same task and activity structure for all RDBMS we cover. Let’s go to the identified.. Read More

GDPR demystified – Part 3 : PostgreSQL

In the previous posts, we discussed GDPR and how we could enforce it in the scope of Oracle and SQL Server databases. We now cover the same topic with PostgreSQL databases in mind. As always, please refer to the GDPR Data Security requirements laid down in the initial post. We will continue to walk the same task and activity structure for all RDBMS we cover. Let’s go to the identified GDPR tasks and requirements.. Read More

GDPR demystified – Part 2 : Oracle

In the previous post, we talked about GDPR and how it could be enforced in the scope of SQL Server databases. We now cover the same topic, but with Oracle databases in mind. Please refer to the GDPR Data Security requirements as stated in the post. We will religiously walk the same task and activity structure for all RDBMS we cover. Let’s move fast to the identified GDPR tasks and requirements and then the.. Read More

GDPR demystified – Part 1 : SQL Server

After months (years?) of seeing warnings and calls for attention aimed at companies that are in one way or another impacted by the GDPR, and after being asked the same questions over and over again, I decided to write a few posts covering the technologies that can be used to implement the technical mechanisms required to enforce the EU GDPR. I will cover the major RDBMS, and today we start with.. Read More

Improving fuzzy searches in PostgreSQL with pg_trgm

I have been researching PostgreSQL search capabilities for quite a while. Besides the very powerful Full Text Search capabilities the engine offers today, there are a few lesser-known index types that support string searches of the form: where foo like '%bar%'. For these cases, a regular b-tree index on 'foo' cannot be used to speed up the match. The following example illustrates a case in which a relatively simple query is transformed to.. Read More

Note on Kafka consumer group offsets

I was recently testing Kafka 0.10 with the excellent Confluent Kafka Python client API and ran into an initially confusing situation: even after dropping a topic and re-creating it, I was still seeing an old offset for a consumer group and didn’t understand how to clear it. I had been testing with both the old and the new Consumer API, and using different ways to obtain consumer group.. Read More

Maths and Machine Learning – Linear Algebra

I recently read the post by Wale Akinfaderin on the Mathematics of Machine Learning and was compelled to write on the subject as a way to keep track of some useful resources and pointers. Despite having a decent mathematical background, I went through the process of refreshing certain mathematical concepts in order to understand machine learning algorithms but also to make sense of the strengths and weaknesses.. Read More

What are the differences between parametric models and non-parametric models?

In the context of statistical modeling, a model is a set of distributions. It is said to be parametric when it is completely determined by a finite set of parameters. For example, in the case of a linear model, the regression line that fits the data is completely determined by its parameters and a noise term. We say that the vector space of the model parameters is finite-dimensional. On the.. Read More
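
As a rough illustration of the parametric idea (my own sketch, not taken from the post): if we assume a simple linear model y = slope * x + intercept + noise, the fitted line is described entirely by two numbers, whatever the size of the dataset. The function name fit_line and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the assumed model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# The whole fitted model is these two parameters.
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Adding more data points changes the estimates but never the number of parameters, which is what makes the model parametric.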

What is the difference between correlation and regression?

We assume here that the “regression” in the question is linear. Correlation measures how strong the linear relationship between two variables is. There are several correlation coefficients (Pearson, Spearman, etc.). The correlation coefficient ranges between -1.0 and 1.0. Zero correlation means there is no linear relationship between the variables. The greater the magnitude of the correlation coefficient, the stronger the relationship, meaning that when one variable goes up,.. Read More
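
To make the range and the "no linear relationship" case concrete, here is a minimal sketch of the Pearson coefficient in plain Python. It is my own illustration rather than code from the post; the function name pearson and the sample lists are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
up = [2, 4, 6, 8, 10]      # rises with x: coefficient at the top of the range
down = [10, 8, 6, 4, 2]    # falls as x rises: coefficient at the bottom of the range
curved = [4, 1, 0, 1, 4]   # y = (x - 3) ** 2: clearly related, but not linearly

r_up = pearson(xs, up)
r_down = pearson(xs, down)
r_curved = pearson(xs, curved)
print(r_up, r_down, r_curved)
```

The curved case is the one worth remembering: the coefficient comes out at zero even though y is fully determined by x, because correlation only captures the linear part of a relationship.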