Blackbox extraction of secrets from deep learning models

Fascinating paper: “The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets” by Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson and Dawn Song, at https://arxiv.org/abs/1802.08232

It turns out that your model memorizes the secrets in its training data, even when the model is a lot smaller than that data. My jaw dropped to the floor right here:

“The fact that models completely memorize secrets in the training data is completely unexpected: our language model is only 600KB when compressed, and the PTB dataset is 1.7MB when compressed. Assuming that the PTB dataset can not be compressed significantly more than this, it is therefore information-theoretically impossible for the model to have memorized all training data—it simply does not have enough capacity with only 600KB of weights. Despite this, when we repeat our experiment and train this language model multiple times, the inserted secret is the most likely 80% of the time (and in the remaining times the secret is always within the top10 most likely). At present we are unable to fully explain the reason this occurs. We conjecture that the model learns a lossy compression of the training data on which it is forced to learn and generalize. But since secrets are random, incompressible parts of the training data, no such force prevents the model from simply memorizing their exact details.”

https://arxiv.org/pdf/1802.08232.pdf
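The quote measures memorization by asking where the inserted secret ranks among all candidate secrets when each is scored by the trained model. A minimal sketch of that ranking step follows. It is not the paper's code: the add-one-smoothed character bigram model stands in for the neural language model, and the prefix, the secret "4821", the filler text and all function names are assumptions made up for illustration.

```python
import math
from collections import Counter
from itertools import product

DIGITS = "0123456789"

def train_bigram(text):
    """Toy stand-in for a language model: character bigram counts."""
    pairs = Counter(zip(text, text[1:]))   # counts of (previous char, next char)
    contexts = Counter(text[:-1])          # how often each char appears as a context
    vocab_size = len(set(text) | set(DIGITS))
    return pairs, contexts, vocab_size

def log_prob(model, prefix, s):
    """Log-probability of s following prefix, with add-one smoothing."""
    pairs, contexts, vocab_size = model
    total = 0.0
    prev = prefix[-1]
    for ch in s:
        total += math.log((pairs[(prev, ch)] + 1) / (contexts[prev] + vocab_size))
        prev = ch
    return total

def secret_rank(model, prefix, secret, candidates):
    """1 + number of candidates the model scores strictly higher than the secret."""
    target = log_prob(model, prefix, secret)
    return 1 + sum(1 for c in candidates if log_prob(model, prefix, c) > target)

# Hypothetical training text: digit-free filler plus one inserted secret.
PREFIX = "the pin is "
SECRET = "4821"
corpus = "the cat sat on the mat and " + PREFIX + SECRET

model = train_bigram(corpus)
candidates = ["".join(d) for d in product(DIGITS, repeat=4)]
rank = secret_rank(model, PREFIX, SECRET, candidates)
print(f"secret ranks {rank} of {len(candidates)} candidates")
```

A rank near the top of the candidate list is what the authors report for their inserted secret. The toy model only illustrates the measurement; it says nothing about why a neural network behaves this way.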

Irish DPA – Guide to Audit Process

v2.0 August 2014

https://www.dataprotection.ie/docimages/documents/GuidetoAuditProcessAug2014.pdf

“This guidance was originally published in 2009. This revised version has been updated to take account of legislative developments and to reflect any changes in the approach of the Office of the Data Protection Commissioner to the audit process. The guidance is designed to assist organisations selected for audit by the Office of the Data Protection Commissioner. It is hoped that this resource will provide organisations holding personal data with a simple and clear basis to conduct a self-assessment of their compliance with their obligations under Irish Data Protection Law”

 

CNIL sanctions DARTY (100,000 Euro)

Interesting case – data breach due to ticket ID enumeration in a standard software URL (developed by a service provider) – CNIL sanctions data controller.

https://www.cnil.fr/fr/darty-sanction-pecuniaire-pour-une-atteinte-la-securite-des-donnees-clients

  • CNIL was informed in February 2017 of a security vulnerability at the URL http://darty.epticahosting.com/selfdarty/register.do that would allow access to the data of several thousand DARTY customers.
  • An online check by CNIL in March 2017 revealed the vulnerability in http://darty.epticahosting.com/selfdarty/register.do, a form that lets the company’s customers submit an after-sales service request. Once the form had been filled in with an e-mail address and a password, a hyperlink containing the registration number of the request gave access to its follow-up. The identifier (ticket number) appeared in the URL as follows: http://darty.epticahosting.com/selfdarty/requests.do?id=XXX.
    By changing the ID number in this URL, an attacker could access the service request forms completed by other customers.
  • 912,938 records were potentially accessible. During the inspection, 7,417 of them were downloaded as a sample. The records exposed customers’ personal data, such as surname, first name, postal address, e-mail address and orders. At the end of the audit, the delegation contacted the company to inform it of this personal data breach.
  • An on-premises inspection by CNIL revealed that the support form had been developed by a service provider.
  • The controller should have checked the access controls and tested for vulnerabilities.
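The flaw described above, a record fetched purely by the ID in the URL, and the kind of access-control check the controller was expected to verify can be sketched as follows. This is an illustration under stated assumptions, not the service provider's actual software: the `Ticket` and `TicketStore` classes, the method names, the e-mail addresses and the ticket numbers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ticket:
    ticket_id: int
    owner_email: str
    body: str

class TicketStore:
    """Hypothetical in-memory store of after-sales service requests."""

    def __init__(self):
        self._tickets = {}

    def add(self, ticket):
        self._tickets[ticket.ticket_id] = ticket

    def get_unchecked(self, ticket_id):
        """Vulnerable pattern: anyone who knows or guesses the ID gets the record."""
        return self._tickets.get(ticket_id)

    def get_for_user(self, ticket_id, user_email):
        """Safer pattern: return the record only if the authenticated user owns it."""
        ticket = self._tickets.get(ticket_id)
        if ticket is None or ticket.owner_email != user_email:
            return None  # same answer for "does not exist" and "not yours"
        return ticket

def enumerate_ids(fetch, id_range):
    """Simulate an attacker (or a tester) walking through consecutive IDs."""
    return [t for t in (fetch(i) for i in id_range) if t is not None]

store = TicketStore()
store.add(Ticket(1001, "alice@example.com", "washing machine repair"))
store.add(Ticket(1002, "bob@example.com", "TV delivery delayed"))
store.add(Ticket(1003, "alice@example.com", "invoice copy"))

# Without an ownership check, enumeration returns every customer's ticket.
leaked = enumerate_ids(store.get_unchecked, range(1000, 1005))

# With the check, a logged-in user only ever sees their own tickets.
visible = enumerate_ids(
    lambda i: store.get_for_user(i, "bob@example.com"), range(1000, 1005)
)

print(len(leaked), "tickets exposed without the check")
print(len(visible), "ticket visible to bob@example.com with the check")
```

The `enumerate_ids` helper doubles as a very small vulnerability test: run against the real lookup function with a test account, it should never return a record owned by someone else.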