Yesterday, the strangest thing happened: I couldn’t send email any more!

The real problem was that I got no error feedback. In fact, I hadn’t been able to send email since August 1, but I had not realised! I thought everything was entirely fine, until I was online discussing some issue with Torsten and he kept waiting for my emailed answer. After two hours, there was still no reply from me.

So, something was wrong. But what?

First, I tried several different SMTP servers (my own server and FSFE’s server, which both use Kolab, and then OIN’s server as well), and the result was always the same: I couldn’t send email. Clearly, the problem was local.

Was it mutt? I use an experimental fork of mutt called mutt-kz… I looked at the debug file, but there was nothing really interesting in it. Then I tried mutt’s built-in email sending feature and I got:

SMTP session failed: 554 5.7.1 Service unavailable; Client host [46.115.137.139] blocked using zen.spamhaus.org; http://www.spamhaus.org/query/bl?ip=46.115.137.139

Right now, I rely on a UMTS connection at home. Maybe that was the problem? But no, using FSFE’s VPN led to the same awkward result!

At this stage, I was completely clueless.


Maybe the problem was the program I use to connect to my SMTP servers: msmtp. More precisely, I use the scripts that allow me to queue email before sending it (very useful on a laptop). Unfortunately, whether I relied on msmtp or msmtpq in the mutt config, the result was the same: mutt told me “Email sent” even though no email was sent. (I even tried while disconnected from the internet, and got the same result!)

Today, by running msmtpq manually, I finally got this:

hrd@xps ~/.msmtpqueue (git)-[annex/direct/master] % cat email.mail| msmtpq -a ampoliros

cannot use queue /home/hrd/.msmtp.queue : waited 240 seconds for
lockdir [ /home/hrd/.msmtp.queue/.lock ] to vanish ; giving up
if you are certain that no other instance of this script
is running, then 'rmdir' the lock dir manually

Hurray!

There must be some better way to handle this! My first impression is that feedback between msmtpq and mutt should be better dealt with, so that I don’t get “Email sent” while msmtpq is actually hanging!

I’m not sure where to file this bug report:

Replicate the bug:

  1. Have an unusable .lock directory in your msmtp queue (I have no idea how I did that)
  2. Send an email with Mutt

    Get “Email sent”

  3. msmtpq is actually not handling the email at all, not adding it to the queue (msmtp-queue says there’s no email in the queue) and not sending it to msmtp

Expected behaviour:

  1. Have an unusable .lock directory in your msmtp queue (I have no idea how I did that)
  2. Send an email with Mutt

    Get an error message

  3. Suggest to delete the lockdir


It’s very hard to see which program should be improved here. When using msmtpq scripts, the README tells me to add this in mutt:

set sendmail_wait = -1 #send in the background

So I assume that whatever I do, mutt will now always tell me “Email sent”. Isn’t there a better way?
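
One possible workaround, as a minimal sketch rather than a proper fix: a small wrapper that refuses to hand mail to msmtpq while the lock directory exists, so that the failure shows up immediately instead of disappearing silently. The queue path is taken from the error message above; the function names and the MSMTP_QUEUE_DIR variable are made up for this example and are not part of msmtp.

```shell
#!/bin/sh
# Hypothetical wrapper around msmtpq: fail fast if the queue lock is stuck.
# Assumptions: the queue lives in ~/.msmtp.queue (as in the error message
# above) and msmtpq is on PATH. MSMTP_QUEUE_DIR is an invented override
# for this sketch; msmtpq itself does not read it.

# Return 0 if the lock directory is absent, 1 (with a hint on stderr) if present.
check_queue_lock() {
    lock_dir=$1
    if [ -d "$lock_dir" ]; then
        echo "msmtpq lock dir $lock_dir exists; if no other msmtpq is running, rmdir it" >&2
        return 1
    fi
    return 0
}

# Check the lock first, then pass the message on stdin over to msmtpq.
send_checked() {
    queue_dir=${MSMTP_QUEUE_DIR:-$HOME/.msmtp.queue}
    check_queue_lock "$queue_dir/.lock" || return 1
    msmtpq "$@"
}
```

For mutt to notice the non-zero exit status, a script ending in `send_checked "$@"` would have to be set as mutt's `sendmail` program, with `sendmail_wait` set to 0 (wait for the program to finish) rather than -1, at the cost of mutt blocking while msmtpq runs. Note that this sketch treats any lock as an error, even one briefly held by another running instance.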

I’ve seen some criticism on Twitter of Google’s announcement that it will give a bit more weight to websites with https. The core of the argument is not entirely clear, but it takes various forms similar to:

You can’t applaud Google’s decision and be mad for what carriers do against network neutrality at the same time

But actually: yes, I can.

I think Google’s decision is the right one, because even though I’m far from satisfied with the way the whole CA circus runs, it’s still better to have https than no encrypted traffic at all.

But why does this have nothing to do with network neutrality? It’s simple: the Google search engine is neither a network operator nor an internet access provider! It does not even come close; fundamentally, these are entirely different activities…

Just look at how we use each of them:

When I “use” what my internet access provider provides to me: I connect my laptop to the internet; my web browser makes requests that the network operator carries back and forth for me; my web browser renders a web page; or I write an email and the network operator connects me to my email server to carry my email to it so that my email server can actually send it.

Notice something: the activity of the internet access provider is entirely generic! My basic interaction with them is not between me and them, but between the machines and software I use and their machines.[1]

Now let’s analyse how I use Google: I write a search query, Google analyses it and gives me back an answer, a list of results. Then I choose to click on a link.

To sum it up, while I make automated requests to my ISP, I ask a human-edited question to my search engine.

These are such fundamentally different activities that it makes absolutely no sense to put search engine issues and network neutrality issues in the same basket!

Every step of the way between the moment I hit “Search” and the moment a list of results is displayed to me is an entirely edited process, with complex algorithms etc. There’s nothing neutral, ever, in a search engine! (The fact that it is automated is entirely irrelevant and is purely a question of implementation.)

If you are not convinced, consider this:

  • How do we measure a good ISP? Certainly not by the “relevance” of their answers; their answer is not relevant, it’s either true or false! Any tinkering with the process is exactly the opposite of what I want them to provide, which is fast, reliable, and predictable internet connections.

    On the other hand, you measure a good search engine by how relevant the results are to you.

  • If I switch from one internet access provider to the next, for instance because I commute from home to an office or a café, I do expect the results to the queries my software makes to be exactly the same.

    However that’s absolutely not true if I change one search engine for another. The reason I choose to use one or another probably means that I actually expect different results! (Otherwise, why change? I would probably only use the one that’s faster and has a better user interface.)

And finally, there is one last big difference. An ISP is part of the infrastructure around me. In some cases it’s entirely possible that I don’t have the choice of which provider is going to provide internet access to me.

However, that’s entirely false for search engines. And in fact, in the last three years I have moved away from using Google to DuckDuckGo, and I also have installed YaCy and lately, Searx on my own servers.

So, please, if you’re unhappy for some weird reason about Google’s change to give a bit more weight to https, do not make other people confused with the issue of network neutrality.


  1. Which does not mean that it’s not important! It is fundamentally important that they do it in a way that safeguards our freedom of expression and privacy – which is why I support do-it-yourself ISPs. If you’re looking for one near you, check out this map. ↩

In late 2012, a new manifesto emerged from the free software community: The User Data Manifesto, written by Frank Karlitschek of ownCloud. Although quite similar to the Franklin Street Statement on freedom and network services, the manifesto took a different approach, which I think was a good one: identifying a new set of rights for users, or as the manifesto puts it: “defining basic rights for people to control their own data in the internet age.”

I have applauded the approach and I think the current manifesto is a good starting point, which is why I have started an effort to create a new, better version built on the first one. If you are only interested in discussing the new version, you can skip the first part of this article.

What’s wrong with the current version?

Right now, the manifesto consists of 8 points, and I think that’s probably too many. As you will see, some of these points overlap. Another thing that’s wrong with the current version is that it mixes several issues together with no hierarchy or context between them; for instance, some points are about user rights, while others are about implementation only (like point 8, Server software transparency).

So let me take some points separately:

1 - Own the data
The data that someone directly or indirectly creates belongs to the person who created it.

This one is very, very problematic. What does “belong” mean, what does “own” mean? Why is one used in the title and the other in the description? What happens when several persons “created” data? What does “create [data]” even mean? I don’t create “data”, my computer generates data when I do things and make stuff.

This point could be read like a copyright provision and thus justify current copyright laws. This is probably not the intention behind this. So this point should be fixed. This reason alone is enough to make it a necessity to update the current manifesto.

But what was the intention behind this?

I think I understand it, and I agree with it. Maybe you know the meme “All your base are belong to us”, sometimes turned into “All Your Data Are Belong to Us” in reference to Google/NSA/etc.

This is basically what we want to prevent. For a user data manifesto to be effective, it must establish that even if I use servers to store some of my data, the server admin should not feel entitled to treat it as if it were their own data.

However, a careful note is needed here. As you will notice, I’m referring to data as “my data” or “their data.” This is very important to consider. If we want a good User Data Manifesto, we need to think clearly about what makes data, “User Data.”

The current version of the manifesto says that what makes User Data is data “created by the user.” But I think that’s misleading.

Usually, there are two ways in which one might refer to data as “their data” (i.e. “their own” data):

  1. Personal data, or personally-identifiable information, are often referred to by someone as their data. But in our case, that’s not relevant, this is covered by laws such as data protection in the European Union. That’s not the scope of this manifesto, because in this case the person is called the “data subject” and typically, this person is not necessarily a “user.”

    However, it is users that we are concerned with in this manifesto. Which leads to the second case in which one usually refers to data as their own data:

  2. Data that is stored on my hard-drive or other storage apparatus. In this case, the meaning of ownership of data is an extension of the ownership of the physical layer on which it sits.

    For instance, when I refer to the books that are in my private library at home, I say that these are my books even though I have not written any of them. I own these books not because I have created them, but because I bought them.

So, for the purpose of the User Data Manifesto, how should we define User Data to convey the objective that server admins do not have the right to do as they wish with user data, i.e. our data?

I propose this:

“User data” means any data uploaded by a user and/or generated by a user, while using a service on the Internet.

This definition is meant to replace point 1 of the first version. It is consistent with our current way of referring to data as “our own data”, but it also includes the case where data is not necessarily generated by devices that we own, but is instead generated by us, for us, on devices that somebody else owns.

2 - Know where the data is stored
Everybody should be able to know: where their personal data is physically stored, how long, on which server, in what country, and what laws apply.

I have tried to improve this. This is point 2 in my version of the manifesto.

3 - Choose the storage location
Everybody should always be able to migrate their personal data to a different provider, server or their own machine at any time without being locked in to a specific vendor.

This is point 3 in my version of the manifesto.

4 - Control access
Everybody should be able to know, choose and control who has access to their own data to see or modify it.

5 - Choose the conditions
If someone chooses to share their own data, then the owner of the data selects the sharing license and conditions.

These two points are now point 1 in my version. I have merged them together. However, I have modified the part about “choosing the conditions” and instead refer to “permissions” (as in, read-only, read-write, etc.). I think the “conditions” as in licensing conditions are out of scope of this manifesto.

6 - Invulnerability of data
Everybody should be able to protect their own data against surveillance and to federate their own data for backups to prevent data loss or for any other reason.

This point was redundant with point 4 and it was drafted in a vague manner, so I have modified it and integrated it into point 1 of my version of the manifesto.

7 - Use it optimally
Everybody should be able to access and use their own data at all times with any device they choose and in the most convenient and easiest way for them.

I feel this is not within the scope of the manifesto because it describes a feature, not a right, and also because I find it a bit vague: what’s “most convenient and easiest way for them”? So I decided to leave this one out.

8 - Server software transparency
Server software should be free and open source software so that the source code of the software can be inspected to confirm that it works as specified.

This is an implementation matter related to point 3 of the current version: the right to choose any location to store one’s data and to move to another platform. So I have merged it into point 3 of my version of the manifesto, regarding the freedom to choose a platform.


That’s it. Overall, I think the manifesto was a good starting point and that it should be improved and updated. I think that we should reduce the number of points because 8 is too many; especially because some of them are redundant. We should also give more context after we lay out the rules.

This is what I have tried to do with my modifications. There is a pull request pending on GitHub. Feel free to give your impressions there.

Obviously, this is also a request for comments, criticism and improvement of my version of the manifesto.

Thanks to Jan-Christoph Borchardt, Maurice Verheesen, Okhin and Cryptie for their feedback and/or suggested improvements since April 2013.

My current proposal

User Data Manifesto, v2 DRAFT: as of today, August 26, 2014:

This manifesto aims at defining basic rights for people regarding their own data in the Internet age. People ought to be free and should not have to pay allegiance to service providers.

  1. “User data” means any data uploaded by a user and/or generated by a user, while using a service on the Internet.

Thus, users should have:

  1. Control over user data access

    Data explicitly and willingly uploaded by a user should always be under the ultimate control of the user. Users should be able to decide whom to grant (direct) access to their data and under which permissions such access should occur.

    Cryptography (e.g. a PKI) is necessary to enable this control.

    Data received, generated, collected and/or constructed from users’ online activity while using the service (e.g. metadata or social graph data) should be made accessible to these users and put under their control. If this control can’t be given, then this type of data should be anonymous and not stored for long periods.

  2. Knowledge of how the data is stored

    When the data is uploaded to a specific service provider, users should be able to know where that specific service provider stores the data, for how long, in which jurisdiction the specific service provider operates, and which laws apply.

    A solution would be for all users to be free to choose to store their own data on devices (e.g. servers) in their vicinity and under their direct control. This way, users do not have to rely on centralised services. The use of peer-to-peer systems and unhosted apps is a means to that end.

  3. Freedom to choose a platform

    Users should always be able to extract their data from the service at any time without experiencing any vendor lock-in.

    Open standards for formats and protocols, as well as access to the programs’ source code under a Free Software license, are necessary to guarantee this.

If users have these rights, they are in control of their data rather than being subjugated by service providers.

Many services that deal with user data at the moment are gratis, but that does not mean they are free. Instead of paying with money, users are paying with their allegiance to the service providers so that they can exploit user data (e.g. by selling them or building a profile for advertisers).

Surrendering privacy in this way may seem to many people a trivial thing and a small price to pay for the convenience that Internet services bring. This has made this kind of exchange common.

Service providers have thus been unwittingly compelled to turn their valuable Internet services into massive and centralised surveillance systems. It is of grave importance that people realise this, since it poses a serious threat to the freedom of humanity.

When users control access to the data they upload (Right #1), it means that data intended to be privately shared should not be accessible to the service provider, nor shared with governments. Users should be the only ones to have ultimate control over it and to grant access to it. Thus, a service should not force you to disclose private data (including private correspondence) to them.

That means the right to use cryptography[1] should never be denied. On the contrary, cryptography should be enabled by default and be put under the users’ control with Free Software that is easy to use.

Some services allow users to submit data with the intention to make it publicly available for all. Even in these cases, some amount of user data is kept private (e.g. metadata or social graph data). The user should also have control over this data, because metadata or logging information can be used for unfair surveillance. Service providers must commit to keeping these to a minimum, and only for the purpose of operating the service.

When users make data available to others, whether to a restricted group of people or to large groups, they should be able to decide under which permissions they grant access to this data. However, this right is not absolute and should not extend over others’ rights to use the data once it has been made available to them. What’s more, it does not mean that users should have the right to impose unfair restrictions on other people.

Ultimately, to ensure that user data is under the users’ control, the best technical designs include peer-to-peer or distributed systems, and unhosted applications. Legally, that means terms of service should respect users’ rights.

When users use centralised services that upload data to specific storage providers instead of relying on peer-to-peer systems, it is important to know where the providers might store data because they could be compelled by governments to turn over data they have in their possession (Right #2).

In the long term, all users should have their own server. Unfortunately, this is made very difficult by some Internet access providers that restrict their customers unfairly. Also, being your own service provider often means having to administer systems, which requires expertise and time that most people currently don’t have or are not willing to invest.

Users should not get stuck with a specific technical solution. This is why they should always be able to leave a platform and settle elsewhere (Right #3). It means users should be able to have their data in an open format, and to exchange information with an open protocol. Open standards are standards that are free of copyright and patent constraints. Obviously, without the source code of the programs used to deal with user data, this is impractical. This is why programs should be distributed under a Free Software license like the GNU AGPL-3[2].


Thanks to Sam Tuke for his feedback on the post and the manifesto!


  1. We mean effective cryptography. If the service provider enables cryptography but controls the keys or encrypts the data with your password, it’s probably snake oil. ↩

  2. The GNU AGPL-3 safeguards this right by making it a legal obligation to provide access to the modified program run by the service provider. (§ 13. Remote Network Interaction) ↩