What is Data Loss Prevention?
How does SharePoint know what sensitive data is?
In SharePoint, sensitive information is defined by a pattern that is identified by a regular expression, e.g. a bank account number. In addition to this, the search engine contains a number of pre-defined keywords and checksums, which are used alongside a confidence level to identify sensitive information. You can view a list of the pre-defined keywords and checksums here.
For example, if a DLP policy has been configured so that a UK passport number cannot be sent, the following checks are done:
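- Pattern: nine consecutive digits
- Definition: a DLP policy is 75% confident that it has detected this type of sensitive information if, within a proximity of 300 characters, the nine-digit pattern is found together with one of the passport keywords. The corresponding rule entity begins: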
<Entity id="178ec42a-18b4-47cc-85c7-d62c92fd67f8" patternsProximity="300" recommendedConfidence="75">
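To make the check concrete, here is a minimal Python sketch of the same idea: a regular expression for nine consecutive digits, plus a keyword that must appear within 300 characters of it. This is not SharePoint's actual implementation. The function name and the keyword list are assumptions for illustration (only "passportno" comes from the sample document used later in this post), and the real engine may measure proximity differently.

```python
import re

# Nine consecutive digits that are not part of a longer run of digits.
PASSPORT_PATTERN = re.compile(r"(?<!\d)\d{9}(?!\d)")

# Hypothetical keyword list for illustration; the real list is pre-defined by SharePoint.
KEYWORDS = ("passportno", "passport number", "passport")

PROXIMITY = 300   # characters, mirrors patternsProximity="300"
CONFIDENCE = 75   # percent, mirrors recommendedConfidence="75"


def find_passport_matches(text):
    """Return (digits, confidence) for each nine-digit run with a keyword nearby."""
    lowered = text.lower()
    matches = []
    for m in PASSPORT_PATTERN.finditer(text):
        # Look PROXIMITY characters either side of the digits (an assumed reading of "proximity").
        window = lowered[max(0, m.start() - PROXIMITY):m.end() + PROXIMITY]
        if any(keyword in window for keyword in KEYWORDS):
            matches.append((m.group(), CONFIDENCE))
    return matches


sample = "loreum ipsum " * 3 + "789208725 passportno " + "loreum ipsum " * 3
print(find_passport_matches(sample))
```

Nine digits on their own, with no keyword nearby, produce no match in this sketch, which is why the test document later in the post contains both the number and a keyword.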
Hope that has given you a good understanding of what DLP is and how it works. Now I will show you how to set this up in a few easy steps.
To set up DLP on SharePoint on-premises, a few prerequisites need to be in place first:
- SharePoint Server 2016
- Search service application configured and running crawls.
- Compliance Centre
- eDiscovery Centre
- Outgoing email with emails configured on users.
Then select New Item
From the New DLP Query pop-up box, choose the template you wish to use. Above I used the passport number example, so for this demo I will use the “UK Data Protection Act” template, as below.
(Be sure to change the number at the bottom from 9 to 1.)
Give the query a name and a start and end date, and choose the source you want the DLP query to search, as below (for this demo, I will leave the source as ‘Search Everything in SharePoint’).
That’s it, the DLP query has been created. Now upload a document into SharePoint that contains nine consecutive digits and a term from the keyword list, something like the text below. Save the document into SharePoint as Loreum ipsum.
loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum 789208725 passportno loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum loreum ipsum
Run a crawl and then select Search. You should see the document appear.
As you can see, the document I uploaded, which contained nine consecutive digits and a term from the keyword list, has been flagged in the eDiscovery Centre.
Now we need to create a Policy for this DLP.
Navigate to the Compliance Centre and select ‘Data Loss Prevention Policies’
Select New Item and give the policy a name. Select the template you chose above and change the 9 to a 1, so that a single match is enough for the rule to take effect. Enter an email address so that when the DLP policy finds a match, it emails this person. Then choose what to do with the file once a match is found, i.e. show a policy tip and block the document, as below.
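The effect of changing the 9 to a 1 can be sketched as a simple threshold check. The Python below is only a toy model of the settings described in this step, not SharePoint code. The class, field names, and email address are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DlpPolicy:
    """Toy model of the policy form; all field names are hypothetical."""
    name: str
    min_matches: int          # the number changed from 9 to 1 in the form
    notify_email: str         # who receives the email when a match is found
    show_policy_tip: bool = True
    block_document: bool = True


def evaluate(policy, match_count):
    """Return the actions the policy would take for a document with match_count matches."""
    if match_count < policy.min_matches:
        return []  # below the threshold: the rule does not take effect
    actions = [f"email {policy.notify_email}"]
    if policy.show_policy_tip:
        actions.append("show policy tip")
    if policy.block_document:
        actions.append("block document")
    return actions


default_policy = DlpPolicy("UK Data Protection Act", 9, "compliance@example.com")
demo_policy = DlpPolicy("UK Data Protection Act", 1, "compliance@example.com")

print(evaluate(default_policy, 1))  # one passport number is not enough at a threshold of 9
print(evaluate(demo_policy, 1))     # the same document triggers the rule at a threshold of 1
```

With the threshold left at 9, the single passport number in the demo document would not trigger anything, which is why the value is lowered for this walkthrough.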
After the policy is created, we must assign it to a site collection. From the Compliance Centre, select DLP Policy Assignments for Site Collections.
Select New Item, then select ‘First choose a site collection’ and pick the site collection.
When a document in a library matches a policy, a policy tip is shown and the document is blocked, as below.
Now, under Manage Assigned Policies, assign your policy to the site collection.
Please note that a new policy assignment may take 24 hours to apply, but high-priority rules such as credit card and passport numbers take up to 15 minutes.
In the compliance policy, we ticked a box to enable policy tips and to block access to documents that match the DLP policy rules. This is what a policy tip looks like and how it behaves.
The policy tip displays an error icon on the document, informing the user that it is blocked (as we selected in the Compliance Centre).
The tip tells the user who the document is open to and what the problems with the document are. The owner, the last modifier or the site owner can go into the document and remove the passport number or, if they think it is an error, click Resolve.
When you click Resolve, you can override the policy, which means you are aware of the data and it is expected to be in the document. The other choice is Report an issue, for when you think the document is fine and shouldn’t trigger a policy.
When you click Override, you must give a business justification for why you want to override the rule, as below.
The rule has been overridden, and the error icon has now been removed.