# Towards Automatic Machine Learning Pipeline Design

I recently finished my PhD thesis and it is now available online. Most of the code related to the thesis is available in this repository.

# In node.js, always query in JSON from PostgreSQL

Recently I was exploring the use of PostgreSQL as a replacement for MongoDB. Recent versions of PostgreSQL have great support for JSON. You can store JSON values and you can even create indices on JSON fields. When combined with node.js and its driver, things look almost magical. You read from PostgreSQL and you automatically get a JavaScript object, with JSON fields already embedded. But can we also use JSON for transporting the results of queries themselves, especially joins? In MongoDB the idea is to embed such related documents. In PostgreSQL we could also embed them instead of joining them, but would that be faster? I made a benchmark to get answers.
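To make the difference between joining and embedding concrete, here is a minimal sketch. The `posts` and `comments` tables, their columns, and the `embed_comments` helper are all assumptions made up for illustration, and this is not the benchmark code. The SQL string shows roughly what an embedding query could look like using PostgreSQL's `json_agg`; the Python function does a similar reshaping on the client, starting from flat join rows.

```python
# Hypothetical schema: posts(id, title) and comments(id, post_id, body).
# An embedding query could look roughly like this (shown only, not executed here):
EMBEDDED_QUERY = """
SELECT p.id, p.title, json_agg(c) AS comments
FROM posts p JOIN comments c ON c.post_id = p.id
GROUP BY p.id, p.title
"""


def embed_comments(flat_rows):
    """Group flat join rows into one document per post, with comments nested."""
    posts = {}
    for row in flat_rows:
        post = posts.setdefault(
            row["post_id"],
            {"id": row["post_id"], "title": row["title"], "comments": []},
        )
        post["comments"].append({"id": row["comment_id"], "body": row["body"]})
    return list(posts.values())


# What a plain join returns: post columns repeated for every comment.
flat_rows = [
    {"post_id": 1, "title": "Hello", "comment_id": 10, "body": "First"},
    {"post_id": 1, "title": "Hello", "comment_id": 11, "body": "Second"},
    {"post_id": 2, "title": "World", "comment_id": 12, "body": "Third"},
]

documents = embed_comments(flat_rows)
for document in documents:
    print(document["title"], len(document["comments"]))
```

With the embedding query the database does this grouping itself and every post travels over the wire once, instead of once per comment.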

# Reactive queries in PostgreSQL

I am a big fan of the application architecture promoted by Meteor. I like declarative programming. You describe what you want and not how, and the system does the rest. Reactive programming is very similar. You define how outputs should be computed from inputs, but when this is computed and how it is composed with other computations is left to the system. So you can define what is read from the database and sent to the client, and how it is read on the client, transformed, and sent to the UI library. The UI library can then render this data. And every time something changes, the rest gets automatically recomputed, refreshed, and re-rendered.

Meteor is tightly linked with MongoDB. They developed a complex piece of technology to provide reactive queries. Reactive queries are queries which, after providing initial results, continue to provide any changes to those results as the input data used in the queries changes. While I like MongoDB, I still prefer the consistency tools provided by traditional SQL databases: transactions, foreign keys, joins, and triggers. They are close to declarative programming as well. You define relations between data once and then the system makes sure the data is consistent. I had to implement many of those features on top of MongoDB, for example in my package PeerDB.

This is why I made the reactive-postgres node.js package. It provides exactly such reactive queries, but for the open source PostgreSQL database. Its API is simple, on purpose, because it should be. You provide a query, you get initial data, and then you get all changes. Try it out.
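The idea of "initial data, then all changes" can be sketched without any database. The following is only a naive illustration of the concept, assuming every row has a unique `id`: it compares two snapshots of a query result and reports what changed. It is not the reactive-postgres API and not how that package or Meteor detects changes; the `diff_results` name is made up for this example.

```python
def diff_results(old_rows, new_rows, key="id"):
    """Compare two snapshots of a query result and list the changes between them."""
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}
    changes = []
    # Rows present now: either new or possibly modified.
    for k, row in new_by_key.items():
        if k not in old_by_key:
            changes.append(("insert", row))
        elif old_by_key[k] != row:
            changes.append(("update", row))
    # Rows that disappeared from the result.
    for k, row in old_by_key.items():
        if k not in new_by_key:
            changes.append(("delete", row))
    return changes


initial = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
later = [{"id": 1, "name": "a2"}, {"id": 3, "name": "c"}]

for change in diff_results(initial, later):
    print(change)
```

A reactive query delivers `initial` once and afterwards only a stream of such changes, so the client never has to re-run the query itself.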

# Proof of luck consensus protocol and Luckychain blockchain

The proof of work consensus protocol used in modern cryptocurrencies like Bitcoin and Ethereum consumes a lot of energy and requires participants to use their CPUs for mining instead of other useful work. But exactly this cost is why it works to prevent Sybil attacks. One cannot participate in the selection of the next block without paying this cost, which makes the issue of puppet participants trying to influence block selection irrelevant, because they also have to do the work and pay the cost.
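The cost mentioned above can be illustrated with a toy version of proof of work. This is a simplified sketch, not Bitcoin's or Ethereum's actual scheme: it uses a single SHA-256 over made-up block data plus a nonce, and the `mine` and `verify` names and the difficulty of 12 bits are assumptions chosen so that the example finishes quickly.

```python
import hashlib


def block_hash(block_data: bytes, nonce: int) -> int:
    """Hash the block data together with a nonce and return it as an integer."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Try nonces until the hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while block_hash(block_data, nonce) >= target:
        nonce += 1
    return nonce


def verify(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution takes a single hash, unlike finding one."""
    return block_hash(block_data, nonce) < (1 << (256 - difficulty_bits))


nonce = mine(b"example block", 12)
print(nonce, verify(b"example block", nonce, 12))
```

Each extra difficulty bit doubles the expected number of hashes needed to find a nonce, while verification stays a single hash. Every puppet identity would have to pay that mining cost separately.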

In recent Intel CPUs a new set of instructions is available, SGX, which allows one to run code inside a special environment where even the operating system cannot change its execution. In the paper we published (arXiv, Cryptology ePrint Archive) we explore consensus protocol designs using the Intel SGX technology, with the goal of making blockchain participation energy efficient, with low CPU usage, and of democratizing mining so that participants can again take part with their general purpose computers (with Intel CPUs) instead of only with specialized ASICs.

In May 2011 an EU directive was adopted with the goal of empowering web users with control over their exposure to cookies. The main issue is that 3rd party cookies allow users to be tracked across websites. Websites are often a mash-up of content coming from various services, each providing its own set of cookies. A service (a 3rd party) can thus track users across all websites using it.

In Slovenia I participated in the process of adopting this directive into a local law, which came into effect in 2013. During this process I believed that the goal was good and that the law was reasonable. I thought that it handled technology well and with understanding, defining cookies broadly enough to be applicable to various tracking techniques and not just literally to cookies.

Maybe because of my participation and hearing all the arguments and perspectives I had a biased view, because once the law came into effect a public outcry followed. At approximately the same time it came into effect in other EU countries as well, which only reinforced the public reaction. Developers did not like that they had to do extra work, and the web frameworks they were using were not really helping them. It was unclear who would pay for that, especially because those changes were not planned and budgeted, and especially for sites which were already made. To my surprise, even developers who are otherwise outspoken about users’ privacy disliked the requirement of asking users for consent about cookies.

# Pay-it-forward cryptocurrency

Bitcoin, Ethereum, and blockchain in particular are often claimed to be as revolutionary as the Internet itself. They will decentralize the Internet again, change how we make apps, empower end-users, and remove intermediaries. But are they really so revolutionary? Even ignoring the technical limitations of scaling and power consumption, we can hardly imagine them having as wide an influence on our society as we observed for the Internet. The Internet connected people globally and provided means of immediate communication and access to knowledge and information. It changed many aspects of our lives and how we as a species operate. But blockchain, does it really have this potential?

# Decentralized governance and four fallacies

Together with the popularization of blockchain we can notice revived calls for the decentralization of national and international governments, their reboots, or even their dissolution. But such calls lack a fundamental understanding of how our governments operate, of their role in our global society, and of everything else, beyond just governments, that in fact regulates and controls our existence.

# Sybil attacks and shell corporations

In computer security an important type of attack in a decentralized system like the Internet is a Sybil attack. The core of the attack is that many protocols we have developed depend on the assumption that each entity participating in a protocol participates only once and that it cannot create an arbitrary number of additional “puppet” entities which it can control. Because on the Internet it is easy to present yourself with multiple identities, decentralized systems with open membership are often susceptible to this type of attack.
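A minimal sketch of this assumption breaking down, using a made-up majority vote in which every identity counts once. The participant names and the `majority_vote` function are hypothetical; real protocols are more involved, but the failure mode is the same.

```python
from collections import Counter


def majority_vote(votes):
    """`votes` maps identity -> choice; the protocol assumes one identity per entity."""
    tally = Counter(votes.values())
    return tally.most_common(1)[0][0]


# Three honest participants, each with exactly one identity.
honest = {"alice": "A", "bob": "A", "carol": "A"}

# A single attacker who registers four puppet identities.
puppets = {f"mallory-{i}": "B" for i in range(4)}

print(majority_vote(honest))
print(majority_vote({**honest, **puppets}))
```

The protocol itself is followed to the letter in both cases. Only the assumption that identities map one-to-one to entities is violated, and a single attacker outvotes three honest participants.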

Why are we designing such protocols? Maybe it is because so many protocols are based on existing processes we find between humans? Or maybe there is some fundamental issue of open, decentralized systems and identities.

But even more interesting is to observe that we have similar protocols in our existing society, with the same assumption. Moreover, we also allow people to create additional identities as needed. We call them corporations. Many protocols in our society were designed when only humans were persons. Governments make sure that humans have unique identities, and we have passports to allow governments to trust other governments about this validation. Creating corporations is, on the other hand, much less controlled. Many countries have less strict laws which allow one to create shell (“puppet”) corporations. Traversing multiple jurisdictions through interactions between such corporations can hide many traces of linked identities. In a way, we allow an arbitrary number of corporations to be created, without really requiring passports for them to be able to work with other corporations across borders, passports which would link corporations to their unique identities. Furthermore, the issue is even more complicated because there can be multiple people behind corporations, and also other corporations.

So the real question is not why we are designing such protocols on the Internet, but why we have ways to compromise such protocols outside the Internet, when we know that they can be misused and used to launch attacks. We already see such attacks in practice through pervasive tax evasion and other financial maneuvers.

# Wanted: precise terminology about democracy

In the previous blog post I presented one example of confusion when talking about democracy: we use democracy for both “one person, one vote” and “one dollar, one vote” approaches to voting. But the issue is much broader. Saying that something is democratic does not really tell us much, because it can mean anything from majority voting, consensus (unanimity), or voting based on shares, to a system with representatives or one where we vote directly. Democracy is used to wage wars and topple dictators, but also to topple democratically elected people. We use democracy to say “you cannot argue with it”. And we use it to position ourselves as morally superior. As such, the term democracy has become almost useless.

We need to start finding more precise terminology for all aspects of democracy. What does it mean that a cooperative is democratically run? That workers can elect board members? That they do not have votes based on shares? Or that they can directly influence business decisions through a democratic process? Which process exactly? Does it matter? Are they all the same? I do not think so.

Let’s start building terminology. Collectively.

# One person, one vote or one dollar, one vote and blockchain

We live in times of a hidden war between “one person, one vote” and “one dollar, one vote” ideologies. It is hidden because we use the same terms for both: democracy, voting, consensus, etc. We govern our governments with each of us having one vote, but in our companies shareholders commonly hold votes proportional to their share. Some people claim that the latter is a better approach and that everything should be decided through markets and power. I believe that using power (physical or monetary) to make decisions is barbaric and that the progress of our civilization was to introduce a truer democracy: one person, one vote. But I do not believe even that is the end of our development in this respect, and we should continue developing our collective governance. Moreover, I do not believe that these two positions are necessarily the only possibilities, and some combinations might also exist. In some way we might even already have that: using “one person, one vote” to decide the rules under which we operate, but using “one dollar, one vote” to decide how to split the profits.

Anyway, all this could be a topic for some other, longer blog post. Here I wanted to explain the existing tension between these two ideologies to show how it has existed in decentralized technologies as well, and why Bitcoin’s blockchain is so innovative.
