Privacy-protecting software
A metadata removal tool, or metadata scrubber, is a type of privacy software that protects its users by removing potentially privacy-compromising metadata from files before they are shared with others, for example as e-mail attachments or as files posted on the Web.[1][2]
Overview
Metadata can be found in many types of files, such as documents, spreadsheets, presentations, images, and audio files. It can include details about the file's authors, file creation and modification dates, geographical location, document revision history, thumbnail images, and comments.[3] Users may add metadata to files themselves, but some metadata is often added automatically, without user intervention, by the authoring application or by the device used to produce the file.
Because metadata is not always clearly visible in authoring applications (depending on the application and its settings), a user may be unaware that it exists or may forget about it. If the file is then shared, private or confidential information can be exposed inadvertently. The purpose of metadata removal tools is to minimize the risk of such data leakage.[4]
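As an illustration of metadata that an authoring application writes without the user necessarily noticing, the following minimal Python sketch lists the core properties stored inside an Office Open XML file (.docx, .xlsx, .pptx). It assumes the package keeps those properties in a part named docProps/core.xml, which is the usual location; the function name read_core_properties is hypothetical and is not taken from any existing tool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_core_properties(package_bytes: bytes) -> dict:
    """Return the core properties of an Office Open XML package as a dict.

    Assumption: the properties live in the part "docProps/core.xml".
    Packages without that part yield an empty dict.
    """
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))

    properties = {}
    for child in root:
        # Tags look like "{namespace-uri}creator"; keep only the local name.
        local_name = child.tag.split("}", 1)[-1]
        properties[local_name] = (child.text or "").strip()
    return properties
```

Called on the bytes of a typical word-processor document, a function like this would be expected to return entries such as creator, lastModifiedBy, created, and modified, none of which appear in the visible text of the document.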
The metadata removal tools that exist today can be divided into four groups:
- Integral metadata removal tools, which are included in some applications, like the Document Inspector in Microsoft Office.
- Batch metadata removal tools, which can process multiple files.
- E-mail client add-ins, which are designed to remove metadata from e-mail attachments just before they are sent.
- Server-based systems, which are designed to automatically remove metadata at the network gateway.
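The batch approach can be sketched in a few lines of Python. The example below is a simplified illustration, not a description of any particular product: it walks one folder, and for every Office Open XML file it rewrites the package with an empty core-properties part and fixed archive timestamps. The names scrub_ooxml and scrub_directory are hypothetical, and the sketch assumes the properties are stored in docProps/core.xml. It does not touch other metadata sources such as docProps/app.xml, comments, or revision history, and it overwrites files in place, so it should only be tried on copies.

```python
import io
import zipfile
from pathlib import Path

CORE_PROPS = "docProps/core.xml"  # assumed location of the core properties
EMPTY_CORE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/'
    'package/2006/metadata/core-properties"/>'
).encode("utf-8")
FIXED_DATE = (1980, 1, 1, 0, 0, 0)  # avoids stamping the current time on members


def scrub_ooxml(data: bytes) -> bytes:
    """Return a copy of the package with its core properties emptied."""
    source = zipfile.ZipFile(io.BytesIO(data))
    buffer = io.BytesIO()
    with source, zipfile.ZipFile(buffer, "w") as target:
        for info in source.infolist():
            payload = EMPTY_CORE if info.filename == CORE_PROPS else source.read(info)
            member = zipfile.ZipInfo(info.filename, date_time=FIXED_DATE)
            target.writestr(member, payload, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def scrub_directory(folder: Path, suffixes=(".docx", ".xlsx", ".pptx")) -> list[Path]:
    """Scrub every matching file in `folder` in place; return the paths processed."""
    cleaned = []
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.suffix.lower() in suffixes:
            path.write_bytes(scrub_ooxml(path.read_bytes()))
            cleaned.append(path)
    return cleaned
```

An e-mail add-in or a gateway system would apply the same kind of per-file function at a different point, namely when a message is sent or when it crosses the network boundary.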
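To securely delete the metadata of a PDF file, it is important to linearize the file afterwards; otherwise the changes are reversible and the metadata can be recovered, for example when a tool records the removal as an incremental update appended to the file rather than rewriting it.[5][6]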
Metadata removal tools are also commonly used to reduce file sizes, particularly for image files posted on the Web. For example, a small image on a website may carry metadata that includes a thumbnail image, in which case it can easily contain as much metadata as image data; removing that metadata can then halve the file size.
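The size effect can be seen with a minimal Python sketch that drops metadata segments from a JPEG stream and leaves the image data untouched. It relies on the JPEG segment layout (a 0xFF marker byte, a marker code, and a two-byte big-endian length) and removes only APP1 (commonly Exif or XMP, including any embedded thumbnail), APP13 (commonly IPTC data), and COM (comment) segments. The function name strip_jpeg_metadata and the choice of which segments to drop are assumptions made for this illustration; real tools handle more cases, such as padding bytes and other segment types.

```python
import struct

# Marker codes dropped by this sketch: APP1, APP13, COM.
METADATA_MARKERS = {0xE1, 0xED, 0xFE}
START_OF_SCAN = 0xDA


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return the JPEG stream without APP1, APP13 and COM segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == START_OF_SCAN:
            # Compressed image data follows; copy everything from here on.
            out += data[pos:]
            return bytes(out)
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length  # the length field counts itself, not the marker
        if length < 2 or end > len(data):
            raise ValueError("truncated or malformed segment")
        if marker not in METADATA_MARKERS:
            out += data[pos:end]
        pos = end
    raise ValueError("no start-of-scan marker found")
```

Comparing len(data) with len(strip_jpeg_metadata(data)) for a given file shows how many bytes the removed segments occupied; for small images with an embedded thumbnail, that share can be large.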