Secure big data collection and processing: Framework, means and opportunities
Peer reviewed, Journal article
Published version
Permanent link
https://hdl.handle.net/11250/3027426
Publication date
2022
Collections
- Journal articles [426]
- Publications from Cristin [138]
Original version
Journal of the Royal Statistical Society: Series A (Statistics in Society). 2022, 1-19. 10.1111/rssa.12836
Abstract
Statistical disclosure control is important for the dissemination of statistical outputs. There is an increasing need for greater confidentiality protection during data collection and processing by National Statistical Offices. In particular, various transactions and remote sensing signals are examples of useful but very detailed big data that can be highly sensitive. Moreover, possible conflicts of interest may arise for data suppliers who operate commercially. In this paper, we formulate statistical disclosure control for data collection and processing as an optimisation problem. Even when it is difficult to specify and solve the problem unequivocally, the formulation can still provide the basis for comparing different disclosure control methods. We develop a general compartmented system that adapts and implements non-perturbative methods in the related fields of linking sensitive data and secure computation. We illustrate how the system can be configured to yield variously required tables and microdata sets with sufficiently low disclosure risks.