The Internet has changed the world, and not only for the better. The digitalization of anything and everything has led to the uncontrolled spread of information, much of which would previously have remained strictly private. This is not only about “scandals, intrigues and investigations”, which with varying success help bring unscrupulous citizens to justice. The trouble is that most people have lost control of their personal data. Naked Science looks into whether that control can be regained, whether it is possible to escape the “calls from the bank’s security service”, and why the philosophy of “I have nothing to hide, and no one is interested in me” works against those who hold it in the modern world.
A huge number of details are known about every user of the Web. Some are stored in nominally closed databases, others are publicly available. This data is constantly processed by various services, applications and individuals, and not all of them are honest or harmless. The problem is that a significant part of this activity is simply invisible to the naked eye. Until the very last moment, the victim will most likely not suspect that attackers have carried out a full investigation of him or her. As a result, scammers are remarkably well informed about the most diverse aspects of the life of a person who eventually loses money: sometimes a huge sum, sometimes their last. In addition, many kinds of online activity have become, if not prohibited, then frowned upon at the state level, so even law-abiding people would do well to “cover their tracks”.
But let’s not dwell on “horror stories”. Let’s consider the issue comprehensively, simply and accessibly. We will start with terminology so that we speak the same language from here on. Two key terms are often confused or misused in the context of cybersecurity:
private data: any information that its creator, or the person it describes, wants or may want to keep confidential (that is, to limit the circle of people who have access to it);
personal data: any information that makes it possible to identify a person, or that relates to a person who can be unambiguously identified.
As the definitions show, the two concepts are overlapping sets. But there is at least a legal difference between them. Private data is partially protected by the right to privacy of correspondence. Personal data, meanwhile, has become the “new oil”, so no matter how hard politicians try to protect it under public pressure, the interests of the modern economy push in the opposite direction.
A host of digital doubles
Almost every website warns users that it uses cookies. This is a sluggish consequence of the aforementioned public pressure on politicians. Sites that follow Western norms let you choose which cookies to refuse and provide detailed information about what data the resource collects about the visitor. Portals in jurisdictions with less strict personal data legislation often simply follow the “fashion for transparency”.
Put simply, cookies are a kind of ticket that the server issues to the client. They record the user ID along with a number of related parameters, which can be anything and depend on the needs of the particular service. A site can use not only its own cookies but also those of third-party partners: for example, an advertising network, a statistics service or a search engine. By visiting one resource, the user is immediately “tagged” in several others. And this is just the tip of the iceberg. Each of these services compiles its own “dossier” on the user, which is not limited to cookies.
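To make the “ticket” idea more concrete, here is a minimal sketch in Python using the standard http.cookies module. The cookie name visitor_id, the one-year lifetime and the example domains are assumptions made for illustration; real services choose their own names and parameters, and a real browser applies many more rules than this toy model.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Dict, List, Optional, Tuple


def issue_cookie(cookie_header: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (visitor_id, Set-Cookie value), or (visitor_id, None) if the
    client already presented a ticket. 'visitor_id' is a hypothetical name."""
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)
    if "visitor_id" in jar:
        # Returning visitor: the server recognizes the existing ticket.
        return jar["visitor_id"].value, None

    # New visitor: generate a random ID and hand it out as a cookie.
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out["visitor_id"] = visitor_id
    out["visitor_id"]["path"] = "/"
    out["visitor_id"]["max-age"] = 60 * 60 * 24 * 365  # assumed lifetime
    return visitor_id, out["visitor_id"].OutputString()


def visit_page(browser_cookies: Dict[str, str],
               contacted_domains: List[str]) -> Dict[str, str]:
    """Simulate one page load: every domain the page contacts (the site
    itself plus its third-party partners) reads or sets its own cookie."""
    seen = {}
    for domain in contacted_domains:
        visitor_id, set_cookie = issue_cookie(browser_cookies.get(domain))
        if set_cookie is not None:
            # The browser keeps the name=value pair for later requests.
            browser_cookies[domain] = set_cookie.split(";", 1)[0]
        seen[domain] = visitor_id
    return seen


if __name__ == "__main__":
    browser: Dict[str, str] = {}
    domains = ["news.example", "ads.example", "stats.example"]
    print(visit_page(browser, domains))  # three different IDs are issued
    print(visit_page(browser, domains))  # the same three IDs are recognized
```

A single visit to the hypothetical news.example leaves three separate tickets in the browser, one per service, which is the “tagging in several others” described above.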
Such profiles also contain seemingly harmless data: screen resolution, language preferences, region, browser and operating system versions, installed fonts and, in some cases, even the model of the device used to visit the site. Because there is so much of it, this depersonalized information becomes suitable for (almost) error-free identification of a particular person. Thousands of people may visit a site with the same browser or the same smartphone model. For some of them these parameters will match, but not for all. Fewer still will also share the same region and operating system version. Add two or three (or a dozen) more attributes and a coincidence becomes quite unlikely; in effect, each individual data set turns into a “personal” one.
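The arithmetic behind this narrowing can be sketched in a few lines of Python. The attribute shares below are invented for illustration, and the calculation assumes the attributes are statistically independent, which is a simplification: in reality, browser, operating system and device model are correlated.

```python
import math
from typing import Sequence


def match_fraction(shares: Sequence[float]) -> float:
    """Fraction of visitors expected to share ALL the given attribute values,
    assuming (simplistically) that the attributes are independent."""
    fraction = 1.0
    for share in shares:
        fraction *= share
    return fraction


def expected_lookalikes(shares: Sequence[float], visitors: int) -> float:
    """Expected number of visitors with the same attribute combination."""
    return visitors * match_fraction(shares)


def identifying_bits(shares: Sequence[float]) -> float:
    """Information carried by the combination, in bits."""
    return -math.log2(match_fraction(shares))


if __name__ == "__main__":
    # Hypothetical share of visitors having the same value as our user.
    attributes = [
        ("browser version", 0.30),
        ("OS version", 0.20),
        ("region", 0.10),
        ("screen resolution", 0.15),
        ("language", 0.50),
        ("font set", 0.05),
    ]
    visitors = 100_000  # assumed audience size
    shares = []
    for name, share in attributes:
        shares.append(share)
        print(f"+ {name:<18} lookalikes: {expected_lookalikes(shares, visitors):10.1f}"
              f"   bits: {identifying_bits(shares):5.2f}")
```

Each added attribute multiplies the pool of lookalikes by a number below one, so even modestly distinguishing parameters quickly shrink a crowd of thousands to a handful or fewer.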
Such sets of information are called digital fingerprints, and they work even when cookies are disabled or blocked on the user’s side. Formally, they identify not the person but the device. However, it is not difficult to link a given digital fingerprint to specific account credentials, and those, in turn, are tied much more precisely to a specific person. This is how a digital identity emerges.
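A rough sketch of how a fingerprint can stand in for a cookie and then be tied to an account is shown below. It is a deliberately simplified model: the attribute names, the Tracker class and the exact-match hashing are assumptions for illustration, whereas real fingerprinting systems typically tolerate small changes in attributes rather than requiring an identical set.

```python
import hashlib
import json
from typing import Dict, Optional


def fingerprint(attributes: Dict[str, str]) -> str:
    """Reduce a set of device attributes to one stable identifier.
    Keys are sorted so the same attributes always give the same hash."""
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Tracker:
    """Hypothetical service that remembers which login was seen
    with which fingerprint. No cookies are involved at any point."""

    def __init__(self) -> None:
        self.login_by_fingerprint: Dict[str, str] = {}

    def observe(self, attributes: Dict[str, str],
                login: Optional[str] = None) -> Optional[str]:
        """Record a visit; return the login linked to this device, if any."""
        fp = fingerprint(attributes)
        if login is not None:
            # The user signed in once: device and account are now linked.
            self.login_by_fingerprint[fp] = login
        return self.login_by_fingerprint.get(fp)


if __name__ == "__main__":
    device = {
        "screen": "1920x1080",
        "language": "en-GB",
        "os": "ExampleOS 12.3",
        "browser": "ExampleBrowser 101",
        "fonts": "Arial,Georgia,Verdana",
    }
    tracker = Tracker()
    print(tracker.observe(device))                             # anonymous visit
    print(tracker.observe(device, login="alice@example.com"))  # signs in once
    print(tracker.observe(device))                             # later visit, signed out
```

After a single sign-in, every later visit from the same device configuration maps back to the same account, even though the visitor never presents a cookie.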
There is no single definition of the term “digital identity” yet; it is formulated differently depending on the field. For the purposes of this article, a digital identity is the collection of private and personal data together with a digital fingerprint. It is important to understand that one real person may have several digital identities, formed by similar (or not so similar) sets of data that different services have collected about him or her. In addition, a digital identity should not be confused with an online identity: the image of themselves that a person deliberately or unconsciously creates on the Web.
Personal data: the “new oil”
You have to know the enemy by sight, so we will briefly explain why anyone needs our digital identities at all. One of the first comparisons of such data to oil is considered to be a 2014 publication in the online magazine Wired on the future of the digital economy. That short piece contains a fairly simple idea: what can be measured can be improved, and measuring requires data. Consequently, the companies that build the most complete and effective system for collecting and analyzing information about every area of their activity will gain a huge competitive advantage.
This applies not only to the “telemetry” of a company’s internal processes: KPI, ROI and other abbreviations that every manager understands in their own way. It is much more useful to “measure” users. The better a company understands how customer behavior changes during and after the provision of a service, the better it can make the product.
The value of big data analytics is hard to convey in words. Its benefits become most evident when you work with it yourself. For example, no writing course helps an author as much as a couple of hours spent thoughtfully studying behavioral metrics in Google Analytics or Yandex Metrica. Two seemingly similar texts may have completely different figures for read-through, bounce rate and time on page. Looking at their minor structural and linguistic differences is the easiest way to understand how to write for a specific audience. And to rule out chance, hundreds of texts and thousands of readers are needed.
The conclusion is this: a lot of data is valuable, while one person’s data is dust (unless, of course, it contains some significant anomaly). That is why services have no incentive to worry about the safety of individual users’ personal information. The task of any information company is to collect as much data as possible, improve its product at minimal cost by analyzing that data, and then become indispensable to customers in its field. After that, even if it turns out that the company regularly suffers leaks, no one will want to give up its services.
However, there is a fundamental difference between real and digital oil. True, we will not be able to profit from our personal data, any more than from the “black gold” in the depths of our homeland. But regaining at least partial control over that data is quite realistic. So is making life harder for those who earn easy money from our digital identities.