The extension of media forensics to these novel and more realistic scenarios implies the ability to face significant technological challenges related to the (possibly multiple) uploading/sharing process, thus requiring methods that can work reliably under such more general conditions. In fact, during those steps the data are further manipulated by the platforms in order to reduce memory and bandwidth requirements. This hinders conventional forensic approaches, but it also introduces detectable patterns. Table 1 reports two examples of images shared through popular social media platforms and then downloaded from them, showing how the signal is altered in terms of size and compression quality.
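As a rough illustration of how these alterations can be inspected in practice, the following Python sketch compares the resolution and the JPEG quantization tables of an image before and after sharing; it assumes the Pillow library, and the file names are hypothetical placeholders.

```python
# Minimal sketch: compare size and JPEG compression settings of an image
# before and after sharing. Assumes Pillow; file names are hypothetical.
from PIL import Image

def summarize_jpeg(path):
    img = Image.open(path)
    # Pillow exposes the JPEG quantization tables; coarser tables
    # (larger values) indicate stronger compression.
    qtables = getattr(img, "quantization", {})
    total_steps = sum(len(t) for t in qtables.values())
    return {
        "resolution": img.size,  # (width, height)
        "num_quant_tables": len(qtables),
        "mean_quant_step": sum(sum(t) for t in qtables.values()) / max(1, total_steps),
    }

print(summarize_jpeg("original.jpg"))    # image straight from the camera
print(summarize_jpeg("downloaded.jpg"))  # same image after upload/download
```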
Afterwards, several further steps can follow, in which the object can be downloaded, re-uploaded, or re-shared through other platforms. In addition, the multimedia content is generally tied to textual information (e.g., in news, social media posts, articles).
The explosion in the usage of social network services enlarges the variability of image and video data and introduces new scenarios and challenges, especially for the source identification task, such as: determining the kind of device used for the acquisition after the upload; determining the brand and model of the device after the upload, as well as the specific device associated with a shared media object; clustering a set of data according to the device of origin; and associating different profiles belonging to different SNs. Not all of these open questions are equally covered; e.g., very few works exist on brand identification [4]. On the contrary, most works are dedicated to source camera identification, i.e., tracing back the origin of an image or a video by identifying the device or the model that acquired it. Similarly to what happens in forensic scenarios with no sharing process, the idea behind this kind of approach is that each phase of the acquisition process leaves a unique fingerprint on the digital content itself, which can be estimated and extracted. The fingerprint should be robust enough to the modifications introduced by the sharing process, so that it is not drastically affected by the uploading/downloading operations and remains detectable. Several papers use the PRNU (Photo Response Non-Uniformity) noise [5] as a fingerprint to perform source identification, as it has proven widely viable in traditional approaches. Other methods adopt variants of the PRNU extraction method, propose hybrid techniques, or consider different footprints such as video file containers. We split the source camera identification techniques into two categories, perfect knowledge methods and limited and zero knowledge methods, according to the level of information available or assumed on the forensic scenario. The first case, described in Section 3.1.1, refers to methods employing known reference databases of cameras to perform their task. In the second case (Section 3.1.2), the reference dataset can be partially known or completely unknown, and no assumption is made on the number of cameras composing it. A summary of the papers described in the following is reported in Table 2, with details regarding the techniques employed, the SNs involved and the datasets used.
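To make the principle concrete, the following Python sketch outlines PRNU-based source identification in its simplest form: a reference fingerprint is estimated by averaging noise residuals of images taken by the same camera, and a query image is attributed via normalized correlation. It is only an illustration, not the pipeline of [5]: a Gaussian filter stands in for the wavelet-based denoiser, and images are assumed to be grayscale, aligned and not resized or cropped by the sharing platform.

```python
# Sketch of PRNU-based source identification (assumes numpy and scipy).
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(img):
    # Residual = image minus its denoised version; the PRNU survives here.
    return img - gaussian_filter(img, sigma=1.5)

def camera_fingerprint(reference_images):
    # Average many residuals so scene content cancels out and the
    # sensor's PRNU pattern emerges.
    residuals = [noise_residual(img.astype(np.float64)) for img in reference_images]
    return np.mean(residuals, axis=0)

def correlation(query_img, fingerprint):
    # Normalized correlation between the query residual and the reference
    # fingerprint; a high value suggests the same source camera.
    r = noise_residual(query_img.astype(np.float64)).ravel()
    f = fingerprint.ravel()
    r = r - r.mean()
    f = f - f.mean()
    return float(np.dot(r, f) / (np.linalg.norm(r) * np.linalg.norm(f) + 1e-12))
```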
The alterations that social media platforms apply to images are further investigated in [26, 27], where their impact on tampering detection is evaluated. A number of well-established, state-of-the-art algorithms for forgery detection are compared on different datasets, including images downloaded from social media platforms. The results confirm that such operations are so disruptive that they can sometimes completely prevent a detector from successfully identifying a forgery.
In order to analyze the traces left by the sharing operations, suitable datasets must be created by reproducing the conditions of the studied scenario. For platform provenance analysis, images need to be uploaded to and downloaded from the web platforms and SNs under analysis. This can be performed automatically or manually, depending on the accessibility and regulations of the different platforms. For several platforms (such as Facebook, Twitter, Flickr [32]), APIs are available that allow sharing operations to be performed automatically with different uploading options, thus significantly speeding up the collection process. Moreover, the platforms often allow multiple files to be processed in batches, although sharing with different parameters has to be performed manually.
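A possible structure for such an automated collection procedure is sketched below; the PlatformClient wrapper and its upload/download methods are hypothetical placeholders standing in for the platform-specific API calls, which differ from one service to another and are subject to its rate limits and terms of service.

```python
# Sketch of an automated upload/download collection loop.
# `client` is a hypothetical wrapper around a platform's official API;
# its methods are placeholders, not real API calls.
import pathlib

def collect_shared_versions(client, source_dir, out_dir, quality_options):
    out_dir = pathlib.Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for img_path in sorted(pathlib.Path(source_dir).glob("*.jpg")):
        for quality in quality_options:
            # Upload the original with a given sharing option, then download
            # the platform-processed version for later forensic analysis.
            media_id = client.upload(img_path, quality=quality)  # hypothetical call
            shared_bytes = client.download(media_id)             # hypothetical call
            target = out_dir / f"{img_path.stem}_{quality}.jpg"
            target.write_bytes(shared_bytes)
```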
Useful evidence for provenance analysis can also be found in the EXIF information of JPEG files. Sharing platforms usually strip out optional metadata fields (such as acquisition time, GPS coordinates, or acquisition device), but JPEG files downloaded from different platforms retain different EXIF fields. This aspect is also explored in [37], where the authors aim at linking the JPEG headers of images acquired with Apple smartphones and shared on different apps to their acquisition device; their analysis shows that JPEG headers can be used to identify, to a certain extent, the operating system version and the sharing app used.
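As an illustration of this kind of inspection, the following sketch (assuming the Pillow library, with hypothetical file names) compares the EXIF fields of an original image with those of its version downloaded from a sharing platform.

```python
# Compare which EXIF fields survive the sharing process. Assumes Pillow.
from PIL import Image, ExifTags

def exif_fields(path):
    exif = Image.open(path).getexif()
    # Map numeric EXIF tag ids to their human-readable names.
    return {ExifTags.TAGS.get(tag, tag): value for tag, value in exif.items()}

original = exif_fields("original.jpg")
shared = exif_fields("downloaded_from_platform.jpg")
# Fields stripped by the platform vs. fields that survive the upload.
print("removed:", sorted(set(original) - set(shared)))
print("retained:", sorted(set(shared)))
```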
Lastly, the work in [62] relies only on visual information but trains in parallel different CNNs operating both in the pixel domain and in the frequency domain. The authors conjecture that frequency-based features can capture differences in image quality and compression, potentially due to repeated uploads and downloads from multiple platforms, while pixel-based features can express the semantic characteristics of images belonging to fake composite objects.
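A minimal sketch of such a two-branch design is given below; it is not the architecture of [62], but it illustrates the idea of fusing a pixel-domain CNN with a CNN operating on a frequency representation (here, the log-magnitude spectrum). It assumes PyTorch, and the layer sizes are arbitrary.

```python
# Illustrative two-stream detector: pixel branch + frequency branch.
import torch
import torch.nn as nn

class TwoStreamDetector(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.pixel_branch = branch(3)   # semantic cues from RGB pixels
        self.freq_branch = branch(3)    # compression cues from the spectrum
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        # Frequency input: log-magnitude spectrum of each channel.
        spectrum = torch.log1p(torch.abs(torch.fft.fft2(x)))
        feats = torch.cat([self.pixel_branch(x), self.freq_branch(spectrum)], dim=1)
        return self.classifier(feats)

# Example: logits = TwoStreamDetector()(torch.rand(4, 3, 128, 128))
```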
In [71], this problem is tackled by resorting to a deep multimodal representation of composite objects, which allows for the computation of a consistency score based on a reference training dataset. To this purpose, the authors create their own dataset of images, captions and other metadata downloaded from Flickr, and also test their approach on existing datasets such as Flickr30K and MS COCO. A larger and more realistic dataset called MEIR (Multimodal Entity Image Repurposing) is then collected in [72], where an improved multimodal representation is proposed and a novel architecture is designed to compare the analyzed composite object with a set of retrieved similar objects.
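The following sketch illustrates the general idea of a multimodal consistency score, without reproducing the specific representations of [71] or [72]: image and caption embeddings (to be produced by pretrained encoders, not shown here) are compared via cosine similarity, and the resulting value is ranked against a reference set of pristine image/caption pairs.

```python
# Toy multimodal consistency score based on embedding similarity (numpy only).
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def consistency_score(image_embedding, caption_embedding, reference_pairs):
    # Similarity of the query pair, compared against similarities observed
    # on a reference set of pristine (image_embedding, caption_embedding) pairs.
    score = cosine_similarity(image_embedding, caption_embedding)
    reference = [cosine_similarity(i, c) for i, c in reference_pairs]
    # Fraction of reference pairs that are less consistent than the query:
    # values close to 0 flag an anomalous (possibly repurposed) pair.
    return float(np.mean([score >= r for r in reference]))
```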
In this section, we report an annotated list of the publicly available datasets for media forensics on shared data, with reference to the specific area for which they were created (i.e., forensic analysis, platform provenance analysis, or verification analysis). They are summarized in Table 7. The first column reports the name of each dataset, together with the link for its download (if available). The considered SNs are explicitly stated, together with the number of sharing operations to which the images or videos are subjected. An indication of the size of each dataset is also provided, with a specification of the devices used.