Much ink has been spilled over the General Data Protection Regulation (GDPR) and its potential impact on businesses around the globe. With the deadline for compliance nearing, it’s important to take stock of where you, as an organisation, are on your journey to GDPR readiness and to understand what steps are needed to safeguard yourself from possible negative impacts after May 25th, 2018.

While the implementation date is set, it’s widely agreed that many aspects of the GDPR and its implications have yet to be fully determined; even the European Commission recognises this. While the new regulation provides a single set of rules directly applicable in all EU member states, there is still a great deal of enablement work to be done at the country level. The Commission is dedicating EUR 1.7 million to fund data protection authorities and train data protection professionals. A further EUR 2 million is available to support national authorities in reaching out to businesses, in particular small and medium enterprises (SMEs).

Who is off the hook?
Speaking of SMEs, some people claim that companies with fewer than 250 employees do not need to comply. While it is true that some obligations don’t apply to them, they still need to comply with the GDPR in general. For example, SMEs with fewer than 250 employees do not need to keep records of their data processing activities unless the processing of personal data is a regular activity, poses a threat to individuals’ rights and freedoms, or concerns sensitive data or criminal records.

Another misconception concerns the location of the organisation. Being based outside the EU does not exempt you from the GDPR. The regulation applies to any organisation that offers goods or services (paid or free) to individuals located in the EU, and to any organisation that monitors the behaviour of individuals in the EU. Since many businesses, online businesses included, operate globally, they must comply with the GDPR to continue serving their EU customers.

How should you handle personal data?
Generally speaking, under the GDPR personal data must be processed in a lawful and transparent manner. You must have specific purposes for processing the data, and you must state those purposes to individuals when collecting their personal data. You may only collect and process personal data that is necessary to fulfil that particular purpose. You must also ensure it is accurate and up to date, and provide a way to correct it if it is not. Once personal data is no longer needed for the purposes for which it was collected, you need to remove it.
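
The last rule, removing data once its purpose has lapsed, lends itself to a periodic check. The sketch below is only an illustration: the record layout, the purposes, and the retention periods are assumptions made up for the example, not values the GDPR prescribes.

```python
from datetime import date, timedelta

# Hypothetical retention periods per processing purpose (assumed values).
RETENTION = {
    "newsletter": timedelta(days=365),
    "invoice": timedelta(days=7 * 365),
}


def expired_records(records, today):
    """Return the records whose purpose-specific retention period has passed."""
    expired = []
    for record in records:
        limit = RETENTION[record["purpose"]]
        if today - record["collected_on"] > limit:
            expired.append(record)
    return expired


# Made-up sample records; a real system would read these from its data stores.
records = [
    {"id": 1, "purpose": "newsletter", "collected_on": date(2016, 1, 10)},
    {"id": 2, "purpose": "invoice", "collected_on": date(2016, 1, 10)},
]
to_delete = expired_records(records, today=date(2018, 5, 25))
print([r["id"] for r in to_delete])
```

Tying the retention period to the purpose rather than to the record type mirrors the purpose-limitation idea: the same e-mail address may legitimately live longer in one context than in another.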

Is it all bad news and useless busywork?
If you take a step back, you will find that working towards GDPR compliance can bring organisational benefits as well. It will likely enhance your security posture and better protect your company against things like data breaches and unlawful access that potentially lead to reputational damage and loss of business.

Not just a technology issue.
The old cybersecurity adage applies to the GDPR as well: it takes a combination of people, process, and technology. Compliance certainly does not fall solely within the domain of the IT department; it should mobilise the entire organisation. It should be tackled with a combination of organisational measures, up to and including the executive level, and technology.

But definitely also a technology issue.
One of the implications of the GDPR is that you must implement appropriate technical and organisational safeguards to ensure the security of personal data, including protection against unauthorised or unlawful processing and against accidental loss, destruction, or damage.

Organisations rarely store data in a single central location; in reality, data is spread across many different systems and locations. Additionally, not all data is created equal. Some is stored as structured data, in database systems for example, while other data, like documents and emails, is unstructured. The location of data also varies: some sits in internal systems, but more and more is stored externally in public cloud environments, such as SaaS-based applications.

Because of this, it seems most prudent to start with an inventory (what data do I hold?) and a classification (does it fall under personal data?) of that data and build from there. This should include checking whether you need a Data Protection Impact Assessment (DPIA), which is the case when your processing is likely to result in a high risk to the rights and freedoms of individuals.
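
To make the inventory and classification idea a bit more concrete, here is a minimal sketch. The source names, the sample contents, and the two regular expressions are all assumptions for illustration; pattern matching like this will miss plenty of personal data (names, for one) and is no substitute for a proper classification tool.

```python
import re

# Illustrative patterns only; real classification needs far more than regexes.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+\d{8,15}\b"),
}


def classify(text):
    """Return the sorted list of personal-data categories spotted in text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def inventory(sources):
    """Map each data source to the categories found in any of its samples."""
    result = {}
    for source, samples in sources.items():
        found = set()
        for sample in samples:
            found.update(classify(sample))
        result[source] = sorted(found)
    return result


# Hypothetical sources: one structured, one unstructured, one SaaS export.
sources = {
    "crm_db.contacts": ["Jane Doe, jane.doe@example.com", "call +3212345678"],
    "fileshare/minutes.txt": ["Agenda: budget review"],
    "saas_export.csv": ["support ticket from bob@example.org"],
}
report = inventory(sources)
for source, categories in report.items():
    print(source, "->", categories or "no personal data detected")
```

Even a rough report like this gives you a first map of where personal data lives, which is the starting point for deciding where a DPIA might be needed.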

When you as an organisation collect personal data, you now need to clearly inform the individuals whose data you are requesting. This most likely implies a process change, and it can happen in parallel with your data inventory and classification work. When asking for consent to collect and process data, you must indicate who you are (including your Data Protection Officer (DPO), if you have one), why you need the data (including the legal basis for the processing, if applicable), how long you will keep the data, and whether others, such as processors, will receive the data (including whether it will be sent outside the EU). At the time of collection, you also need to inform individuals about their rights regarding their personal data, such as the right to receive a copy of the data, to lodge a complaint with their local Data Protection Authority (DPA), and to withdraw consent at any time.

Now think about all the data locations and data types you have. Can you comply with the above requests in a timely manner?
Technology can certainly help, although I feel the need to point out that there is no single technological silver bullet that makes you GDPR compliant. Technology can, however, lighten some GDPR obligations, such as those pertaining to data breaches. If a data breach occurs and it is likely to pose a risk to individuals’ rights and freedoms, your organisation has to notify the supervisory authority (the DPA) within 72 hours of becoming aware of the breach. Encrypting the data may reduce the likelihood that a breach poses such a risk, and may therefore relieve you of the need to report it.
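
The 72-hour window is easy to track in an incident process. The sketch below simply adds 72 hours to the moment you became aware of the breach; the function names and the example timestamps are assumptions, and whether a given breach is notifiable at all remains a legal judgment that code cannot make.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at):
    """Latest moment to notify the authority, counted from awareness of the breach."""
    return aware_at + NOTIFICATION_WINDOW


def hours_remaining(aware_at, now):
    """Hours left before the deadline (negative once the deadline has passed)."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600


# Made-up timestamps, kept in UTC to avoid time zone surprises.
aware_at = datetime(2018, 6, 1, 9, 0, tzinfo=timezone.utc)
now = datetime(2018, 6, 2, 15, 0, tzinfo=timezone.utc)
print(notification_deadline(aware_at).isoformat())
print(hours_remaining(aware_at, now))
```

Note that the clock starts when you become aware of the breach, not when it happened, which is why detection time is the input here.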

Failure to comply
Most of the GDPR content out there starts by warning you about the potential monetary sanctions you face in case of non-compliance (up to €20 million or 4% of the business’s total annual worldwide turnover, whichever is higher). I feel this calls for some perspective, however. It is certainly true that these sanctions exist, and they should function as a deterrent, but don’t forget that the authority must ensure that the fines imposed in each individual case are effective, proportionate, and dissuasive. The authority will take into account factors such as the nature, gravity, and duration of the infringement, its intentional or negligent character, any action taken to mitigate the damage suffered by individuals, and the degree of cooperation of the organisation. In other words, if you are working towards compliance, have done your utmost, have documented your procedures, and have implemented systems and technology, you may be able to avoid the full extent of the potential fines under the GDPR.
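
For a sense of scale, the upper limit quoted above can be worked out as the higher of the two amounts. The sketch below computes only that ceiling for the top tier of fines; the turnover figures are invented, and the fine actually imposed depends on the factors listed above, not on this arithmetic.

```python
from decimal import Decimal

FIXED_CAP_EUR = Decimal("20000000")  # EUR 20 million
TURNOVER_SHARE = Decimal("0.04")     # 4% of total annual worldwide turnover


def maximum_fine(annual_worldwide_turnover_eur):
    """Upper limit of the fine: the higher of the fixed cap and 4% of turnover."""
    turnover = Decimal(annual_worldwide_turnover_eur)
    return max(FIXED_CAP_EUR, turnover * TURNOVER_SHARE)


# Invented turnover figures for two hypothetical companies.
print(maximum_fine("50000000"))    # smaller company: the fixed cap is the ceiling
print(maximum_fine("2000000000"))  # larger company: 4% of turnover is the ceiling
```

The crossover sits at EUR 500 million in turnover: below it the fixed EUR 20 million is the ceiling, above it the 4% figure takes over.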

Is the GDPR hampering technological progress?
The world’s most valuable resource is no longer oil, but data. Companies are ever more reliant on data and data processing. Think about innovations in Artificial Intelligence and Machine Learning in which models are predicated on the availability of data to draw possible conclusions. Let’s take a look at the difference between AI, Machine Learning, and Deep Learning to see where GDPR would potentially come into play.

AI involves machines that can perform tasks characteristic of human intelligence. We can further differentiate between general and narrow AI. General AI covers all the characteristics of human intelligence, whereas narrow AI focuses on a subset, like image recognition, but can’t perform other tasks generally thought of as human-like intelligence in a meaningful way.

Machine learning is the path to getting the “machine” to achieve AI. It’s defined as the machine’s ability to learn without being explicitly programmed. This is achieved by “training” the machine (the model): you feed large amounts of data to the ML algorithm and allow it to adjust itself and improve. Turning back to the image recognition example, how do you teach the machine to differentiate between a cat and a dog? It’s similar to how we as humans learn this skill, by seeing lots of examples of cats and dogs. First, you provide a sufficient set of data (training data) and you tag specific features of cats vs dogs. Next, the algorithm builds a model that can tag those features in new pictures itself, based on the training data you provided. Once its accuracy approaches that of a human, the machine has learned how to differentiate between cats and dogs.
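
A toy version of that training loop fits in a few lines. I’m using a nearest-centroid classifier here purely because it is easy to read; it is not the algorithm a real image recognition system would use, and the two features and their values are invented for the example.

```python
# Hypothetical hand-tagged features: (ear pointiness, snout length), both 0..1.
TRAINING_DATA = [
    ((0.9, 0.2), "cat"),
    ((0.8, 0.3), "cat"),
    ((0.3, 0.8), "dog"),
    ((0.4, 0.9), "dog"),
]


def train(examples):
    """Learn one centroid (average feature vector) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def predict(model, features):
    """Pick the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))


model = train(TRAINING_DATA)
print(predict(model, (0.85, 0.25)))
print(predict(model, (0.35, 0.70)))
```

Nothing in `predict` mentions cats or dogs explicitly; everything the model “knows” comes from the tagged examples, which is the point of learning without being explicitly programmed.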


With Deep Learning, however, you skip the step of manually extracting features from images. Instead, you feed these images directly into the ML algorithm, which then predicts the object.


Deep Learning is therefore classified as a subtype of Machine Learning. In the above example, it deals directly with the images and is often more complex. The amount of data needed to train a Deep Learning algorithm is typically much larger than in the classical Machine Learning case. The Convolutional Neural Network (CNN) is a network architecture for Deep Learning that, in this case, learns directly from the images provided. It is made up of several layers that process and transform an input (an image) to produce an output (the recognised object).
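
To give a feel for what one such layer does, here is a minimal sketch of a single convolution step followed by a ReLU activation, written in plain Python. The tiny image and the hand-picked edge-detecting kernel are assumptions for illustration; in a real CNN the kernel values are learned during training and many such layers are stacked.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0, v) for v in row] for row in feature_map]


# A 3x4 "image" that is dark on the left and bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = relu(convolve2d(image, vertical_edge))
print(feature_map)
```

The feature map lights up only where the edge sits, so the layer has turned raw pixels into a slightly more meaningful signal without anyone tagging features by hand.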

Seeing that we’re handing off conclusions based on data to Machine Learning algorithms, we need to make sure that, in light of the GDPR, we can explain those steps.

The GDPR states that, where applicable, the existence of automated decision-making and meaningful information about the logic involved, including the significance and envisaged consequences thereof, need to be reported to the individual. The question thus becomes whether we can still easily explain how a decision is reached in such systems and what the potential outcome is. Interesting times ahead.
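
For a simple model, that kind of explanation is easy to produce. The sketch below uses an invented linear scoring model, with made-up weights, threshold, and applicant, and reports how much each input contributed to the decision. A deep network offers no such direct breakdown, which is exactly where the difficulty lies.

```python
# Hypothetical weights of a linear scoring model (assumed, not a real model).
WEIGHTS = {"income_keur": 0.5, "years_at_address": 2.0, "missed_payments": -15.0}
THRESHOLD = 20.0


def decide(applicant):
    """Return the decision plus each feature's contribution to the score."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = sum(contributions.values())
    return {
        "approved": score >= THRESHOLD,
        "score": score,
        "contributions": contributions,
    }


# Made-up applicant data.
result = decide({"income_keur": 40, "years_at_address": 3, "missed_payments": 1})
print(result["approved"], result["score"])
for name, value in sorted(result["contributions"].items(), key=lambda kv: kv[1]):
    print(f"{name}: {value:+.1f}")
```

Listing the contributions from most negative to most positive gives the individual a plain answer to “why was I refused?”, here pointing at the missed payment.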

Want to learn more about the GDPR? Read our blog, Rubrik Cloud Data Management: Security by Design.