The past decade has witnessed great success of deep learning in many disciplines, especially in computer vision and image processing. However, deep learning-based video coding remains in its infancy. We review the representative works on using deep learning for image/video coding, an actively developing research area since 2015. We divide the related works into two categories: new coding schemes that are built primarily upon deep networks, and deep network-based coding tools that are used within traditional coding schemes. For deep schemes, pixel probability modeling and the auto-encoder are the two main approaches, which can be viewed as predictive coding and transform coding, respectively. For deep tools, there have been several techniques using deep learning to perform intra-picture prediction, inter-picture prediction, cross-channel prediction, probability distribution prediction, transform, post- or in-loop filtering, down- and up-sampling, as well as encoding optimizations. To advocate research on deep learning-based video coding, we present a case study of our prototype video codec, Deep Learning Video Coding (DLVC). DLVC features two deep tools that are both based on convolutional neural networks (CNNs), namely a CNN-based in-loop filter and CNN-based block adaptive resolution coding. The source code of DLVC has been released to support future research.
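The predictive-coding view of pixel probability modeling can be illustrated with a small, self-contained sketch (the synthetic image and the left-neighbour predictor are hypothetical illustrations, not DLVC's actual models): predicting each pixel from its causal context and entropy-coding the residual needs far fewer bits than coding raw pixel values.

```python
import numpy as np

def entropy_bits(values):
    """Empirical zeroth-order entropy (bits/symbol) of an integer array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Toy 8-bit "image" with smooth horizontal gradients, as natural images have.
rng = np.random.default_rng(0)
row = np.cumsum(rng.integers(-2, 3, size=256)) + 128
img = np.clip(np.tile(row, (64, 1)) + rng.integers(-1, 2, (64, 256)), 0, 255)

# Predictive coding: predict each pixel from its left neighbour and
# entropy-code the small residual instead of the raw value.
residual = np.diff(img, axis=1)

raw_bits = entropy_bits(img)
pred_bits = entropy_bits(residual)
assert pred_bits < raw_bits  # residuals are far more compressible
```

Deep schemes replace the fixed left-neighbour predictor with a network that outputs a full conditional probability distribution per pixel, fed to an arithmetic coder.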
Ateniese et al. proposed the Provable Data Possession (PDP) model in 2007. Following that, Erway et al. adapted the model for dynamically updatable data and called it the Dynamic Provable Data Possession (DPDP) model. The idea is that a client outsources her files to a cloud server and later challenges the server to obtain a proof of the integrity of her data. Many schemes have since been proposed for this purpose, all following a similar framework. We analyze dynamic data outsourcing schemes for the cloud regarding security and efficiency, and present a general framework for constructing such schemes that encompasses existing DPDP-like schemes as different instantiations. This generalization shows that a dynamic outsourced data integrity verification scheme can be constructed given black-box access to an implicitly-ordered authenticated data structure. Moreover, for efficient blockless verification, a homomorphic verifiable tag scheme is also needed. We investigate the requirements and conditions these building blocks should satisfy, with which one can easily check the applicability of a given building block for dynamic data outsourcing. Our framework serves as a guideline, tutorial, and survey, and enables us to provide a comparison among the different building blocks that existing schemes employ.
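The blockless-verification role of a homomorphic verifiable tag can be sketched with a toy discrete-log-style construction (deliberately simplified and insecure; the actual PDP/DPDP tags are RSA-based and bind block indices with signatures, which this sketch omits):

```python
import random

# Toy parameters: a Mersenne prime modulus, far too small for real security.
p = 2**31 - 1
g = 7

blocks = [314, 159, 265, 358]              # the client's file blocks
tags = [pow(g, m, p) for m in blocks]      # tag_i = g^{m_i} mod p, stored with the data

# Homomorphic property: combining tags corresponds to adding blocks.
assert (tags[0] * tags[1]) % p == pow(g, blocks[0] + blocks[1], p)

# Challenge: the verifier picks random coefficients for a subset of blocks.
challenge = {i: random.randrange(1, 1000) for i in (0, 2, 3)}

# Server's proof: one aggregated "block" and one aggregated tag, so the
# proof stays constant-size no matter how many blocks are challenged.
mu = sum(a * blocks[i] for i, a in challenge.items())
sigma = 1
for i, a in challenge.items():
    sigma = (sigma * pow(tags[i], a, p)) % p

# Blockless verification: the verifier checks the proof without the blocks.
assert pow(g, mu, p) == sigma
```

The authenticated data structure supplies the missing piece for the dynamic case: it proves that the tags used in the proof are the current ones at the claimed positions after updates.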
The common perception in both academic literature and the industry today is that virtual machines offer better security, while containers offer better performance. However, a detailed review of the history of these technologies and the current threats they face reveals a different story. This survey covers key developments in the evolution of virtual machines and containers from the 1950s to today, with an emphasis on countering modern misperceptions with accurate historical details and providing a solid foundation for ongoing research into the future of secure isolation for multitenant infrastructures, such as cloud and container deployments.
This survey focuses on intrusion detection systems (IDS) that leverage host-based data sources for detecting attacks on enterprise networks. The host-based IDS (HIDS) literature is organized by input data source, presenting targeted sub-surveys of HIDS research leveraging system logs, audit data, the Windows Registry, file systems, and program analysis. While system calls are generally included in audit data, several publicly available system call datasets have spawned a flurry of IDS research on this topic, which merits a separate section. Similarly, a section surveying algorithmic developments that are applicable to HIDS but tested on network datasets is included, as this is a large and growing area of applicable literature. To accommodate current researchers, a supplementary section describing publicly available datasets is included, outlining their characteristics and shortcomings when used for IDS evaluation. Related surveys are organized and described. All sections are accompanied by tables concisely organizing the literature and datasets discussed. Finally, challenges, trends, and broader observations are discussed throughout the survey and in the conclusion, along with future directions of IDS research.
The sheer volume of scientific publications underscores the importance of Literature-Based Discovery (LBD) research, which is highly beneficial for accelerating knowledge acquisition and the research development process. LBD is a knowledge discovery process that automatically detects significant, implicit knowledge associations hidden in fragmented knowledge areas by analysing scientific literature. Therefore, LBD output not only assists in formulating scientifically sensible, novel research hypotheses but also encourages the development of cross-disciplinary research. Initially, the review outlines the major LBD tools and application areas to provide a general overview of the discipline. Subsequently, an in-depth analysis of the computational techniques is provided using a novel, up-to-date, and detailed classification. Moreover, the review summarises the key milestones of the discipline through a timeline of topics. We also outline the insights gathered through our statistical analysis capturing the trends in the LBD literature. To conclude, we discuss the prevailing research deficiencies in the discipline by highlighting the challenges and opportunities of future LBD research.
Stream processing handles continuous big data in memory on a process-once-arrival basis, powering latency-critical applications such as fraud detection, algorithmic trading, and health surveillance. Though the development of streaming applications has been facilitated by a variety of Data Stream Management Systems (DSMS), the problem of resource management and task scheduling is not automatically handled by the DSMS middleware and remains a heavy burden on application providers. With the advent of cloud computing supporting customised deployment on rented resources, it is of great interest to investigate novel resource management mechanisms to host streaming systems in clouds, satisfying Quality of Service (QoS) requirements while minimising resource cost. In this paper, we introduce the hierarchical structure of a streaming system, define the scope of the resource management problem, and then present a comprehensive taxonomy covering critical research topics such as resource provisioning, operator parallelisation, and task scheduling. We also review the existing works against the proposed taxonomy, which enables a better comparison of their specific properties and method features. Finally, we identify open issues and research directions towards realising an automatic, QoS-aware resource management framework for deploying stream processing systems in distributed computing environments.
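One task-scheduling building block from the taxonomy can be sketched minimally: greedy least-loaded placement of operator instances onto nodes, heaviest first (the operator names and CPU demands below are hypothetical):

```python
import heapq

def schedule_tasks(task_loads, n_nodes):
    """Greedy longest-processing-time scheduling: place each task on the
    currently least-loaded node, processing the heaviest tasks first."""
    heap = [(0.0, node) for node in range(n_nodes)]  # (current load, node id)
    heapq.heapify(heap)
    placement = {}
    for task, load in sorted(task_loads.items(), key=lambda kv: -kv[1]):
        node_load, node = heapq.heappop(heap)
        placement[task] = node
        heapq.heappush(heap, (node_load + load, node))
    return placement

# Hypothetical operator instances of a streaming job with CPU demands.
tasks = {"source": 1.0, "parse": 3.0, "join-1": 4.0, "join-2": 4.0, "sink": 2.0}
placement = schedule_tasks(tasks, n_nodes=2)

loads = {}
for task, node in placement.items():
    loads[node] = loads.get(node, 0.0) + tasks[task]
assert max(loads.values()) == 7.0  # optimal here: total demand 14 over 2 nodes
```

Real schedulers in this design space additionally weigh network locality between communicating operators, which this load-only sketch ignores.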
Deep Learning (DL) has seen immense success in recent years, leading to state-of-the-art results in various domains such as image recognition and natural language processing. One reason for this success is the increasing size of DL models and the availability of vast amounts of training data. To keep improving the performance of DL, the scalability of DL systems must increase. In this survey, we perform a broad and thorough investigation of challenges, techniques, and tools for scalable DL on distributed infrastructures. This covers infrastructures for DL, methods for parallel DL training, multi-tenant resource scheduling, and the management of training and model data. Further, we analyze and compare 11 current open-source DL frameworks and tools and investigate which of the techniques are commonly implemented in practice. Finally, we highlight future trends in DL systems that deserve further investigation.
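One core technique for parallel DL training, synchronous data parallelism with gradient averaging, can be sketched on a toy linear model (the worker count, learning rate, and synthetic data are illustrative assumptions; real systems replace the averaging step with an all-reduce across machines):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic linear-regression task: y = X @ w_true + noise.
w_true = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(512, 3))
y = X @ w_true + 0.01 * rng.normal(size=512)

def grad(w, Xb, yb):
    """Mean-squared-error gradient on one worker's data shard."""
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

n_workers, lr = 4, 0.1
w = np.zeros(3)
shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))

for step in range(200):
    # Each worker computes a gradient on its shard in parallel (simulated),
    # then an all-reduce averages the gradients before the shared update.
    grads = [grad(w, Xb, yb) for Xb, yb in shards]
    w -= lr * np.mean(grads, axis=0)

assert np.allclose(w, w_true, atol=0.05)
```

Because every worker applies the same averaged gradient, all replicas stay in sync; the survey's other parallelization axis, model parallelism, instead splits the parameters themselves across workers.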
Workflows are an application model that enables the automated execution of multiple interdependent and interconnected tasks. They are widely used by the scientific community to manage the distributed execution and data flow of complex simulations and experiments. As the popularity of scientific workflows continues to rise and their computational requirements continue to increase, multi-tenant computing platforms that offer the execution of these workflows as a service are emerging and being widely adopted. This paper discusses the scheduling and resource provisioning problems particular to this type of platform. It presents a detailed taxonomy and a comprehensive survey of the current literature and identifies future directions to foster research in the field of multiple workflow scheduling in multi-tenant distributed computing systems.
DevOps is a collaborative and multidisciplinary organizational effort to automate continuous delivery of new software updates while guaranteeing their correctness and reliability. The present survey investigates and discusses DevOps challenges from the perspective of engineers, managers, and researchers. We review the literature and develop a DevOps conceptual map, correlating the DevOps automation tools with these concepts. We then discuss their practical implications for engineers, managers, and researchers. Finally, we critically explore some of the most relevant DevOps challenges reported by the literature.
Virtually any software running on a computer has been processed by a compiler or a compiler-like tool. Because compilers are such a crucial piece of infrastructure for building software, their correctness is of paramount importance. To validate and increase the correctness of compilers, significant research efforts have been devoted to testing compilers. This survey article provides a comprehensive summary of the current state of the art of research on compiler testing. The survey covers different aspects of the compiler testing problem, including how to construct test programs, what test oracles to use for determining whether a compiler behaves correctly, how to execute compiler tests efficiently, and how to help compiler developers take action on bugs discovered by compiler testing. Moreover, we survey work that empirically studies the strengths and weaknesses of current compiler testing research and practice. Based on the discussion of existing work, we outline several open challenges that remain to be addressed in future work.
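The test-oracle problem mentioned above is often addressed by differential testing: two independent implementations of the same language must agree on every generated program, so any disagreement flags a bug without needing a ground-truth oracle. A toy sketch (the tiny arithmetic language and random generator are hypothetical stand-ins for real compilers and generators such as Csmith):

```python
import ast
import operator
import random

# Two independent "implementations" of the same language: a small evaluator
# built on the ast module, and Python's own eval acting as the reference.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def eval_ast(node):
    if isinstance(node, ast.Expression):
        return eval_ast(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        return OPS[type(node.op)](eval_ast(node.left), eval_ast(node.right))
    raise ValueError("unsupported construct")

def random_expr(rng, depth=3):
    """Random test-program generator (Csmith-style fuzzing, much simplified)."""
    if depth == 0 or rng.random() < 0.3:
        return str(rng.randint(0, 9))
    op = rng.choice(["+", "-", "*"])
    return f"({random_expr(rng, depth - 1)} {op} {random_expr(rng, depth - 1)})"

rng = random.Random(0)
for _ in range(1000):
    src = random_expr(rng)
    # Differential oracle: both implementations must agree on every program.
    assert eval_ast(ast.parse(src, mode="eval")) == eval(src), src
```

The same structure scales up to real compiler testing: generate a program, compile it with two compilers (or two optimization levels), run both binaries, and compare their outputs.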
Blockchains are a topic of immense interest in academia and industry, but their true nature is often obscured by marketing and hype. In this tutorial, we explain the fundamental elements of blockchains. We discuss their ability to achieve availability, consistency, and data integrity as well as their inherent limitations. Using Ethereum as a case study, we describe the inner workings of blockchains in detail before comparing blockchains to traditional distributed systems. In the second part of our tutorial, we discuss the major challenges facing blockchains and summarize ongoing research and commercial offerings that seek to address these challenges.
Mixed reality (MR) technology development is now gaining momentum due to advances in computer vision, sensor fusion, and realistic display technologies. With most research and development focused on delivering the promise of MR, only a few efforts address the privacy and security implications of this technology. This survey paper aims to bring these risks to light and to examine the latest security and privacy work on MR. Specifically, we list and review the different protection approaches that have been proposed to ensure user and data security and privacy in MR. We extend the scope to include work on related technologies such as augmented reality (AR), virtual reality (VR), and human-computer interaction (HCI) as crucial components, if not the origins, of MR, as well as related work from the larger area of mobile devices, wearables, and the Internet of Things (IoT). We highlight the lack of investigation, implementation, and evaluation of data protection approaches in MR. Further challenges and directions for MR security and privacy are also discussed.
The cyberworld is plagued with ever-evolving malware that readily infiltrates all defense mechanisms, operates viciously unbeknownst to users, and surreptitiously exfiltrates sensitive data. Understanding the inner workings of such malware provides leverage to effectively combat it. This understanding is pursued through dynamic analysis, conducted either manually or automatically. Malware authors, accordingly, have devised and advanced evasion techniques to thwart such analyses. In this paper, we present a comprehensive survey of malware dynamic analysis evasion techniques. In addition, we propose a detailed classification of these techniques and further demonstrate how their efficacy holds against different types of detection and analysis approaches. Our observations attest that evasive behavior is mostly concerned with detecting and evading sandboxes. The primary tactic of such malware, we argue, is fingerprinting, followed by a newer trend towards reverse Turing test tactics, which aim at detecting human interaction. Furthermore, we posit that current defensive strategies, ranging from reactive methods to transparent analysis systems, are readily foiled by zero-day fingerprinting techniques or other evasion tactics such as stalling. Accordingly, we recommend pursuing more generic defensive strategies, with an emphasis on path exploration techniques, which have the potential to thwart all evasive tactics.
Decentralized trust management is used as a referral benchmark for assisting decision making by humans or intelligent machines in open collaborative systems. During any given period of time, each participant may interact with only a few other participants, so relying solely on direct trust frequently reduces to random team formation. Thus, trust aggregation becomes critical: it leverages decentralized trust management to learn the indirect trust of every participant based on past transaction experiences. This paper investigates the design principles of decentralized trust management and its efficiency and robustness against various risks and threats. First, we study the risk factors and adverse effects of six common threat models. Second, we review representative trust aggregation models and trust metrics. Third, we present an in-depth analysis and comparison of these reference trust aggregation methods with respect to effectiveness and robustness, showing our comparative results through formal analysis and experimental evaluation. This comprehensive study advances the understanding of the adverse effects of present and future threats and of the robustness of different trust metrics. It may also serve as a guideline for the research and development of next-generation trust aggregation algorithms and services in anticipation of risk factors and malicious threats.
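One representative family of trust aggregation models propagates direct trust transitively by power iteration, in the style of EigenTrust. A minimal sketch (the trust matrix and damping parameter below are illustrative assumptions, not values from any surveyed scheme):

```python
import numpy as np

# Local (direct) trust: C[i, j] is i's normalised trust in j from past
# transactions; each row sums to 1. Peer 3 is rated poorly by everyone.
C = np.array([
    [0.0, 0.5, 0.4, 0.1],
    [0.5, 0.0, 0.4, 0.1],
    [0.5, 0.4, 0.0, 0.1],
    [0.4, 0.3, 0.3, 0.0],
])

def aggregate_trust(C, pre_trusted=None, alpha=0.15, iters=50):
    """EigenTrust-style aggregation: iterate t <- (1-a) C^T t + a p,
    where p is a prior over pre-trusted peers (uniform by default)."""
    n = len(C)
    p = np.full(n, 1.0 / n) if pre_trusted is None else pre_trusted
    t = p.copy()
    for _ in range(iters):
        t = (1 - alpha) * C.T @ t + alpha * p
    return t

t = aggregate_trust(C)
assert abs(t.sum() - 1.0) < 1e-9   # global trust remains a distribution
assert t[3] == t.min()             # the poorly rated peer ranks last
```

The damping prior is one of the robustness mechanisms compared in such studies: it limits how far a colluding clique can inflate its own global trust.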
Distributed Ledger Technologies (DLTs) and blockchain systems have received enormous academic, government, and commercial interest. Recent years have seen an enormous range of research and technologies applying DLTs to the Internet of Things (IoT). In this paper, we provide a comprehensive survey of combined DLT-IoT applications, from smart homes, smart transport, supply chains, and smart healthcare to smart energy. We also review a comprehensive selection of existing DLT solutions and platforms. We then identify issues for future research, including DLT security and scalability, multi-DLT applications, and DLT applicability in the post-quantum world.
There have been a large number of works on making classrooms smart with the help of technology. These works span a wide range of research areas, covering information and communication technology, machine learning, sensor networks, mobile computing, cloud computing, and hardware. Consequently, there have been a number of review papers on various aspects of smart classrooms. These reviews define the smart classroom from a specific perspective and review related works. While such reviews are useful and necessary for improving technology related to a specific aspect, it is hard to derive from them a general picture of how smart the smart classroom currently is. In this article, we complement the literature with a comprehensive review covering all main aspects of a smart classroom. We first provide a common definition of smart classrooms and then review works in different fields against that common definition. This multi-field review has exposed new research opportunities and challenges that need to be addressed for the synergistic integration of interdisciplinary works. We list these challenges in the paper along with a systematic review of the works. We believe the article will be useful for researchers and practitioners in the field of smart classrooms.
Model-driven game development (MDGD) introduces model-driven methodology to the game domain, shifting the focus of game development from coding to modeling in order to make game production faster and easier. Research on MDGD is concerned with both the general model-driven software development methodology and the particular characteristics of game development. MDGD implies changes in many aspects of game development, from the technology to the workflow and the organization. This article presents the state of the art of MDGD research based on a literature review covering 23 individual works in the field. The review takes four perspectives: target domain, domain framework, modeling language, and tooling. We also present our reflections gathered during the review.
The fireworks algorithm, inspired by the phenomenon of fireworks explosions, is a swarm intelligence optimization algorithm proposed in 2010. Since then, it has attracted increasing research interest and has been widely employed in many real-world problems due to its unique search manner and high efficiency. In this paper, we present a comprehensive review of its advances and applications. We begin with an introduction to the original fireworks algorithm. Then we review algorithmic research on it for single-objective and multi-objective optimization problems. After that, we present theoretical analyses of the fireworks algorithm. Finally, we give a brief overview of its applications and implementations. We hope this paper provides a useful road map for researchers and practitioners who are interested in this algorithm and inspires new ideas for its further development.
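The unique search manner of the algorithm can be sketched compactly: better fireworks receive more sparks within smaller explosion amplitudes, and the best candidates survive to the next generation. This is a simplified illustration only (the selection rule and parameter values deviate from the original 2010 formulation, which also uses Gaussian mutation sparks and distance-based selection):

```python
import numpy as np

def sphere(x):
    """Toy objective to minimise: f(x) = sum(x_i^2)."""
    return float(np.sum(x ** 2))

def fireworks(f, dim=2, n=5, m=30, a_hat=4.0, iters=100, seed=0):
    """Minimal fireworks algorithm sketch: better fireworks get more sparks
    within smaller amplitudes; the n best candidates survive each round."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-10, 10, size=(n, dim))
    eps = 1e-12
    for _ in range(iters):
        fit = np.array([f(x) for x in pop])
        worst, best = fit.max(), fit.min()
        # Spark counts grow with quality; amplitudes shrink with quality.
        w = worst - fit + eps
        s = np.maximum(1, (m * w / w.sum()).astype(int))
        amp = a_hat * (fit - best + eps) / (fit - best + eps).sum()
        sparks = [pop]  # keep parents so the best never worsens
        for i, x in enumerate(pop):
            sparks.append(x + rng.uniform(-amp[i], amp[i], size=(s[i], dim)))
        cand = np.vstack(sparks)
        order = np.argsort([f(x) for x in cand])
        pop = cand[order[:n]]  # simplified elitist selection
    return pop[0], f(pop[0])

x_best, f_best = fireworks(sphere)
```

On this smooth toy objective the sketch reliably drives the best solution close to the optimum at the origin.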
In this survey, 105 papers related to interactive clustering were reviewed according to seven perspectives: (1) on what level the interaction happens, (2) what interactive operations are involved, (3) how user feedback is incorporated, (4) how interactive clustering is evaluated, (5) which data and (6) which clustering methods have been used, and (7) what challenges are outlined. This paper serves as a comprehensive overview of the field, outlines the state of the art within the area, and identifies challenges and future research needs.
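Perspective (3), incorporating user feedback, is commonly realised as pairwise must-link/cannot-link constraints in the style of COP-KMeans. A minimal sketch of the constrained assignment step (the data points, centres, and constraint below are hypothetical):

```python
import numpy as np

def violates(i, c, labels, must_link, cannot_link):
    """Would assigning point i to cluster c break a user constraint?"""
    for a, b in must_link:
        j = b if a == i else a if b == i else None
        if j is not None and labels[j] not in (-1, c):
            return True
    for a, b in cannot_link:
        j = b if a == i else a if b == i else None
        if j is not None and labels[j] == c:
            return True
    return False

def constrained_assign(X, centers, must_link=(), cannot_link=()):
    """COP-KMeans-style assignment: nearest feasible centre for each point."""
    labels = -np.ones(len(X), dtype=int)
    for i, x in enumerate(X):
        for c in np.argsort(((centers - x) ** 2).sum(axis=1)):
            if not violates(i, int(c), labels, must_link, cannot_link):
                labels[i] = c
                break
    return labels

X = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 0.0], [5.2, 0.0]])
centers = np.array([[0.1, 0.0], [5.1, 0.0]])
# User feedback: points 0 and 1 must not share a cluster.
labels = constrained_assign(X, centers, cannot_link=[(0, 1)])
assert labels[0] != labels[1]
```

Here the cannot-link constraint overrides pure distance: point 1 sits next to the first centre but is pushed into the second cluster by the user's feedback.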
The challenges of cloud forensics have been well-documented by both researchers and government agencies (e.g. the U.S. National Institute of Standards and Technology), although many of the challenges remain unresolved. In this paper, we perform a comprehensive survey of cloud forensic literature published between Jan 1, 2007 and Dec 11, 2018, categorized using a five-step forensic investigation process. We also present a taxonomy of existing cloud forensic solutions, with the aim of better informing both the research and practitioner communities, as well as an in-depth discussion of existing conventional digital forensic tools and cloud-specific forensic investigation tools. Based on the findings from the survey, we present a set of design guidelines to inform future cloud forensic investigation processes, and a summary of digital artifacts that can be obtained from different stakeholders in the cloud computing architecture/ecosystem.