Environmental audio scene and sound event recognition for autonomous surveillance: A survey and comparative studies
In the absence of any global positioning infrastructure for indoor environments, research on supporting human indoor localization and navigation trails decades behind research on outdoor localization and navigation. The major barrier to broader progress has been the dependency of indoor positioning on environment-specific infrastructure and resulting tailored technical solutions. Combined with the fragmentation and compartmentalization of indoor environments, this poses significant challenges to widespread adoption of indoor location-based services. This article puts aside all approaches of infrastructure-based support for human indoor localization and navigation and instead reviews technical concepts that are independent of sensors embedded in the environment. The reviewed concepts rely on a mobile computing platform with sensing capability and a human interaction interface ('smartphone'). This platform may or may not carry a stored map of the environment, but does not require in-situ internet access. In this regard, the presented approaches are more challenging than any localization and navigation solutions specific to a particular, infrastructure-equipped indoor space, since they are not adapted to local context, and they may lack some of the accuracy achievable with those tailored solutions. On the other hand, only these approaches have the potential to be universally applicable.
Fairness assumptions are a valuable tool when reasoning about systems. In this paper, we classify several fairness properties found in the literature and argue that most of them are too restrictive for many applications. As an alternative we introduce the concept of justness.
Today's Cyber-Physical Systems (CPS) are facing new cyber-attacks on a daily basis. Traditional cyber security approaches and intrusion detection systems are based on old threat-knowledge and need to be updated on a daily basis to stand against new generations of cyber-threats. To update the threat-knowledge database, there is a need for proper management and processing of the generated data. In recent years, computing platforms based on representation learning methodologies have emerged as a useful resource to manage and exploit the generated data to extract meaningful information. If properly utilized, strong intrusion prevention systems can be developed to protect CPS using these platforms. In this survey, we first highlight various cyber-threats and initiatives taken by international organizations. Then we discuss various computing platforms based on representation learning models to process the generated data. We also highlight various popular data sets that can be used to train representation learning models. Recently made efforts in the representation learning domain to protect CPS against cyber-threats are also discussed in detail. Finally, we highlight limitations as research challenges when using the available data sets and representation learning techniques designed for cyber security.
The wide proliferation of various wireless communication systems and wireless devices has led to the arrival of the big data era in large scale wireless networks. Big data of large scale wireless networks has the key features of wide variety, high volume, real-time velocity, and huge value, leading to unique research challenges that are different from those of existing computing systems. In this paper, we present a survey of the state-of-the-art big data analytics (BDA) approaches for large scale wireless networks. In particular, we categorize the life cycle of BDA into four consecutive stages: Data Acquisition, Data Preprocessing, Data Storage and Data Analysis. We then present a detailed survey of the technical solutions to the challenges in BDA for large scale wireless networks according to each stage in the life cycle of BDA. Besides, we discuss the open research issues and outline the future directions in this promising area.
Although most human-technology interactions are still based on traditional desktop/mobile interfaces that involve primarily the visual and audio senses, in recent years we have witnessed progress towards multisensory experiences. Companies are proposing new additions to the multisensory world and are unveiling new products that promise to offer amazing experiences exploiting mulsemedia - multiple sensorial media - where users can perceive odors, tastes, and the sensation of wind blowing against their face. Whilst researchers, practitioners and users alike are faced with a wide range of such new devices, relatively little work has been undertaken to summarize efforts and initiatives in this area. The current paper addresses this shortcoming in two ways - firstly, by presenting a survey of devices targeting senses beyond those of sight and hearing; secondly, by describing an approach to guide newcomers and experienced practitioners alike to build their own mulsemedia environment, both in a desktop setting and in an immersive 360° environment.
Contrary to using distant and centralized cloud data center resources, employing decentralized resources at the edge of a network for processing data closer to user devices, such as smartphones and tablets, is an upcoming computing paradigm, referred to as fog/edge computing. Fog/edge resources are typically resource-constrained, heterogeneous, and dynamic compared to the cloud, thereby making resource management an important challenge that needs to be addressed. This article reviews publications from as early as 1991, with 85% of the publications appearing between 2013 and 2018, to identify and classify the architectures, infrastructure, and underlying algorithms for managing resources in fog/edge computing.
Although computer hardware is getting increasingly more powerful in line with Moore's law, nothing stops end users from demanding a more immersive viewing experience in video streaming applications. 360° videos have become a popular video format because Head-Mounted Displays (HMDs) are now mass-produced. HMDs allow viewers to naturally navigate through 360° videos by rotating their heads or moving their eyes. Streaming 360° videos over the best-effort Internet, however, imposes tremendous challenges because of the high resolution (>8K) and short response time (<100 ms) requirements. This survey presents the current literature related to 360° video streaming. We start from 360° video streaming systems built for real experiments to show the practicality and efficiency of 360° video streaming. We then present the video and viewer datasets, which may be used to drive large-scale simulations. Different optimization tools in different stages of the 360° video streaming pipeline are discussed in detail. We also present various applications enabled by 360° video streaming. This is followed by a quick review of the off-the-shelf hardware available at the time of writing. Last, future research directions are highlighted.
With the advent of fog and edge computing paradigms, computation capabilities have been moved towards the edge of the network to support the requirements of highly demanding services. To ensure the quality of such services is still met in the event of users' mobility, migrating services across different computing nodes becomes essential. Several studies have emerged recently to address service migration in different edge-centric research areas, including fog computing, multi-access edge computing (MEC), cloudlets and vehicular clouds. Since existing surveys in this area either focus on VM migration in general or migration in a single research field (e.g. MEC), the objective of this survey is to bring together studies from different, yet related, edge-centric research fields, while capturing the different facets they addressed. More specifically, we examine the diversity characterizing the landscape of migration scenarios at the edge, we present an objective-driven taxonomy of the literature, and we highlight contributions that focus instead on architectural design and implementation. Finally, we identify a list of gaps and research opportunities based on the observation of the current state of the literature. One such opportunity lies in joining efforts from both networking and computing research communities to facilitate future research in this area.
Motion Capture and whole-body interaction technologies have been experimentally proven to contribute to the enhancement of dance learning and to the investigation of bodily knowledge, innovating at the same time the practice of dance. Designing and implementing a dance interactive learning system with the aim to achieve effective, enjoyable and meaningful educational experiences is, however, a highly demanding interdisciplinary and complex problem. In this work we examine the interactive dance training systems that are described in the recent literature, proposing a framework of the most important design parameters, which we present along with particular examples of implementations. We discuss the way that the different phases of a common workflow are designed and implemented in these systems, examining aspects such as the visualization of feedback to the learner, the movement qualities involved, the technological approaches used as well as the general context of use and learning approaches. Our aim is to identify common patterns and areas that require further research and development towards creating more effective and meaningful digital dance learning tools.
Cryptographic hash functions are widely used primitives whose purpose is to ensure the integrity of data. Hash functions are also utilized in conjunction with digital signatures to provide authentication and non-repudiation services. The Secure Hash Algorithm (SHA) family has been developed over time by the National Institute of Standards and Technology for security, optimal performance, and robustness. The best-known hash standards are SHA-1, SHA-2, and SHA-3. Security is the most notable criterion for evaluating hash functions. However, the hardware performance of an algorithm serves as a tiebreaker among the contestants when all other parameters (security, software performance, and flexibility) have equal strength. A Field Programmable Gate Array (FPGA) is re-configurable hardware that supports a variety of design options, making it the best choice for implementing the hash standards. In this survey, particular attention is devoted to FPGA optimization techniques for the three hash standards. The study covers several types of optimization techniques and their contributions to the performance of FPGAs. Moreover, the article highlights the strengths and weaknesses of each of the optimization methods and their influence on performance. We are optimistic that the study will be a useful resource encompassing the efforts carried out on the SHAs and FPGA optimization techniques in a consolidated form.
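As a concrete illustration, all three hash standards are exposed by Python's standard `hashlib` module; a minimal sketch of computing and comparing digests:

```python
import hashlib

msg = b"The quick brown fox jumps over the lazy dog"

# The three best-known standards side by side; the hex digest length
# reflects the output size (SHA-1: 160 bits, SHA-256 and SHA3-256: 256 bits).
d_sha1 = hashlib.sha1(msg).hexdigest()       # 40 hex characters
d_sha2 = hashlib.sha256(msg).hexdigest()     # 64 hex characters
d_sha3 = hashlib.sha3_256(msg).hexdigest()   # 64 hex characters

# Integrity property: a one-character change in the input yields an
# unrelated digest (the avalanche effect).
d_sha2_alt = hashlib.sha256(
    b"The quick brown fox jumps over the lazy cog").hexdigest()
```

Note that SHA-2 and SHA-3 produce digests of the same length here but are entirely different constructions (Merkle-Damgard vs. sponge), which is why their hardware optimization techniques differ.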
With the high demand for wireless data traffic, WiFi networks have grown very rapidly because they provide high throughput and are easy to deploy. Recently, many papers have applied WiFi to different sensing applications. This survey presents a comprehensive review of WiFi sensing applications drawn from more than 140 papers. The survey groups WiFi sensing applications into three categories: detection, recognition, and estimation. Detection applications try to solve binary classification problems, recognition applications aim at multi-class classification problems, and estimation applications try to obtain quantitative values for different tasks. Different WiFi sensing applications have different requirements for signal processing techniques and classification/estimation algorithms. This survey gives a summary of the signal processing techniques and classification/estimation algorithms that are widely used for WiFi sensing applications. The survey also presents future WiFi sensing trends: integrating cross-layer network stack information, multi-device cooperation, and fusion of different sensors. These WiFi sensing technologies help enhance existing WiFi sensing capabilities and enable new WiFi sensing opportunities. The targets of future WiFi sensing may extend beyond humans to environments, animals, and objects.
Textual deception constitutes a major problem for online security. Many studies have argued that deceptiveness leaves traces in writing style, which could be detected using text classification techniques. By conducting an extensive literature review of existing empirical work, we demonstrate that while certain linguistic features have been indicative of deception in certain corpora, they fail to generalize across divergent semantic domains. We suggest that deceptiveness as such leaves no content-invariant stylistic trace, and textual similarity measures provide superior means of classifying texts as potentially deceptive. Additionally, we discuss forms of deception beyond semantic content, focusing on hiding author identity by writing style obfuscation. Surveying the literature on both author identification and obfuscation techniques, we conclude that current style transformation methods fail to achieve reliable obfuscation while simultaneously ensuring semantic faithfulness to the original text. We propose that future work in style transformation should pay particular attention to disallowing semantically drastic changes.
Volunteer Computing is a kind of distributed computing that harnesses the aggregated spare computing resources of volunteer devices. It provides a cheaper and greener alternative computing infrastructure that can complement the dedicated, centralized, and expensive data centers. The aggregated idle computing resources of computers are being utilized to provide the much-needed computing infrastructure for compute-intensive tasks such as scientific simulations and big data analysis. However, the use of Volunteer Computing is still dominated by scientific applications and only a very small fraction of the potential volunteer nodes are participating. This paper provides a comprehensive survey of Volunteer Computing, covering key technical and operational issues such as security, task distribution, resource management, and incentive models. The paper also presents a taxonomy of Volunteer Computing systems, together with discussions of the characteristics of specific systems in each category. In order to harness the full potential of Volunteer Computing and make it a reliable alternative computing infrastructure for general applications, we need to improve the existing techniques and devise new mechanisms. Thus, this paper also sheds light on important issues regarding the future research and development of Volunteer Computing systems with the aim of making them a viable alternative computing infrastructure.
Phase Change Memory (PCM) is an emerging memory technology with the capability to be used at the main-memory level of the memory hierarchy, owing to the poor scalability, considerable leakage power, and high cost per bit of DRAM. PCM is a new resistive memory that stores data based on resistance values. The wide resistance range of PCM allows storing multiple bits per cell (MLC) rather than a single bit per cell (SLC). Unfortunately, PCM cells suffer from a short lifetime: they can tolerate only a limited number of write operations, after which they become permanently stuck at a constant value. Limited lifetime is a significant issue for PCM memory; hence, in recent years many studies have been conducted to prolong PCM lifetime. These schemes vary widely and are applied at different architectural levels. In this survey, we review the important works among such schemes in order to give insights to those starting to research PCMs.
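One representative family of lifetime-extension schemes reduces the number of bit flips per write by optionally storing the complement of a word. A simplified sketch of this flip-on-write idea (word size, encoding, and the single flag bit are illustrative choices; real schemes operate per memory block with extra bookkeeping):

```python
def flip_on_write(stored_word, new_word, n_bits):
    """If overwriting stored_word with new_word would flip more than half
    of the n_bits cells, store the complement instead and set a flag bit.
    This halves the worst-case number of cell writes, extending lifetime."""
    mask = (1 << n_bits) - 1
    flips = bin((stored_word ^ new_word) & mask).count("1")
    if flips > n_bits // 2:
        return (~new_word) & mask, True   # store complement, flag set
    return new_word & mask, False         # store as-is, flag clear

def read_back(stored_word, flag, n_bits):
    """Undo the optional inversion on read using the flag bit."""
    mask = (1 << n_bits) - 1
    return (~stored_word) & mask if flag else stored_word
```

For example, rewriting `0b0000` to `0b1111` would flip all four cells; storing the complement (`0b0000` again, with the flag set) flips none of them.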
Understanding people's expertise is not a trivial task, since it is time-consuming when executed manually. Automated approaches have become a topic of research in recent years in various scientific fields, such as information retrieval, databases and machine learning. This article carries out a survey on automated expertise retrieval, i.e., finding data linked to a person that describes their expertise, which allows tasks such as profiling or finding people with a certain expertise. A faceted taxonomy is introduced that covers many of the existing approaches and classifies them on the basis of features chosen from studying the state-of-the-art. A list of open issues, with suggestions for future research topics, is introduced as well. It is hoped that our taxonomy and review of related works on expertise retrieval will be useful when analyzing different proposals and allow a better understanding of existing work and a systematic classification of future work on the topic.
Adaptive Authentication allows a system to dynamically select the best mechanism(s) for authenticating a user depending on contextual factors, such as location, proximity to devices, and other attributes. Though this technology has the potential to change the current password-dominated authentication landscape, research to date has not led to practical solutions that carry over into our daily lives. Motivated to find out how to improve adaptive authentication design, we provide a structured survey of the existing literature to date and analyze it to derive future research directions.
Interest in processing big data has increased rapidly to gain insights that can transform businesses, government policies and research outcomes. This has led to advancement in communication, programming and processing technologies, including Cloud computing services and technologies such as Hadoop, Spark and Storm. This trend also affects the needs of analytical applications, which are no longer monolithic but composed of several individual analytical steps running in the form of a workflow. These Big Data Workflows are vastly different in nature from traditional workflows. Researchers are currently facing the challenge of how to orchestrate and manage the execution of such workflows. In this paper, we discuss in detail orchestration requirements of these workflows as well as the challenges in achieving these requirements. We also survey current trends and research that supports orchestration of big data workflows and identify open research challenges to guide future developments in this area.
The gap between the speed of the memory systems and processors has motivated large bodies of work on hiding or lessening the delay of memory accesses. Data prefetching is a well-known and widely-used approach to hide the data access latency. It has been shown that data prefetching is able to significantly improve the performance of processors by overlapping computation with data delivery. There is a wide variety of prefetching techniques, each of which is suited to a particular class of workloads. This survey analyzes the state-of-the-art hardware data prefetching techniques and sheds light on their design trade-offs. Moreover, we quantitatively compare state-of-the-art prefetching techniques for accelerating server workloads. To have a fair comparison, we choose a target architecture based on a contemporary server processor and stack competing prefetchers on top of it. For each prefetching technique, we thoroughly evaluate the performance improvement along with the imposed overheads. The goal of this survey is to shed light on the status of the state-of-the-art data prefetchers and motivate further work on improving data prefetching techniques.
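To make the design space concrete, one of the simplest hardware schemes in this family is a per-PC stride prefetcher; a behavioral sketch (the table layout and two-state training policy are simplified for illustration, not a specific processor's design):

```python
class StridePrefetcher:
    """Behavioral sketch of a per-PC stride prefetcher.

    For each load instruction (identified by its PC) we track the last
    address and the last observed stride; once the same stride is seen
    twice in a row, the entry becomes confident and we prefetch
    last_addr + stride on every subsequent matching access."""

    def __init__(self):
        self.table = {}  # pc -> (last_addr, stride, confident)

    def access(self, pc, addr):
        """Record a demand access; return an address to prefetch, or None."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = (addr, 0, False)       # first sighting: allocate
            return None
        last_addr, stride, confident = entry
        new_stride = addr - last_addr
        if confident and new_stride == stride:
            self.table[pc] = (addr, stride, True)   # steady state
            return addr + stride
        # training: promote to confident once the stride repeats
        self.table[pc] = (addr, new_stride, new_stride == stride)
        return addr + new_stride if new_stride == stride else None
```

For a load streaming through an array with a 64-byte stride, the prefetcher stays silent for two accesses while it trains, then issues a prefetch one stride ahead on every access.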
Binary rewriting is the process of changing the semantics of a program without having the source code at hand. It is used for diverse purposes such as emulation (e.g., Qemu), optimization (e.g., DynInst), observation (e.g., Valgrind) and hardening (e.g., SecondWrite). This survey gives detailed insight into the development and state-of-the-art in binary rewriting by reviewing 56 publications from 1992 up to 2018. First, we provide an in-depth investigation of the challenges and respective solutions of the steps to successful binary rewriting. Based on our findings we establish a thorough categorization of binary rewriting approaches with respect to their use-case, applied analysis technique, code-transformation method and code generation techniques. Furthermore, we contribute a comprehensive mapping between binary rewriting tools, applied techniques and their domain of application. Our findings emphasize that although much work has been done over the last decades, most of the effort was put into improvements aiming at the x86 architecture, ignoring other instruction set architectures such as ARM or MIPS. This is of special interest as these kinds of architectures are often used in the emerging field of the Internet of Things. To the best of our knowledge, our survey is the first comprehensive overview of the complete binary rewriting process.
Model comparison has been widely used to support many tasks in model-driven software development. For this reason, many comparison techniques have been proposed in the last decades. However, academia and industry have overlooked the production of a panoramic view of the current literature. Hence, a thorough understanding of the state-of-the-art techniques remains limited and inconclusive. This article, therefore, focuses on providing a classification and a thematic analysis of studies on comparison of software design models. We carried out a Systematic Mapping Study, following well-established guidelines, for answering nine research questions. In total, 55 articles (out of 4132) were selected from ten widely recognized electronic databases after a careful filtering process. The main results are that the majority of the primary studies (1) provide coarse-grained comparison techniques of general-purpose diagrams, (2) adopt graphs as the principal data structure and compare software design models considering structural properties only, (3) pinpoint commonalities and differences between software design models, rather than score their similarity, (4) propose new techniques while neglecting the production of empirical knowledge from experimental studies, and (5) propose automatic techniques without demonstrating their effectiveness. Finally, this article highlights some challenges and further directions that might be explored in upcoming studies.
Many scientists use scripts for designing experiments since script languages deliver sophisticated data structures, simple syntax, and the ability to obtain results without spending much time on designing systems. While scripts provide adequate features for scientific programming, they fail to guarantee the reproducibility of experiments, and they present challenges for data management and understanding. These challenges include, but are not limited to: understanding each trial (experiment execution); connecting several trials to the same experiment; tracking the difference between these trials; and relating results to the experiment inputs and parameters. Such challenges can be addressed with the help of provenance, and multiple approaches have been proposed with different techniques to support collecting, managing, and analyzing provenance in scripts. In this work, we propose a classification taxonomy for the existing state-of-the-art techniques and we classify them according to the proposed taxonomy. The identification of state-of-the-art approaches followed an exhaustive protocol of forward and backward literature snowballing.
Indoor localization is essential for healthcare, security, augmented reality gaming, and other location-based services. There is currently a wealth of relevant literature on indoor localization. This paper focuses on recent advances and challenges in indoor localization methods that use spatial information and landmarks to improve the location estimation. Spatial information in the form of maps, spatial models, etc. has been used to improve localization by constraining location estimates to the navigable parts of the indoor environment. Landmarks such as doors and corners are quite useful in assisting indoor localization by calibrating a user's step length and heading. This survey gives a comprehensive review of state-of-the-art indoor localization methods using maps, spatial models, and landmarks.
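The landmark-based calibration idea can be sketched with a toy pedestrian-dead-reckoning model (the flat 2-D coordinate frame, function names, and heading convention are illustrative assumptions, not a specific system from the literature):

```python
import math

def pdr_step(pos, heading_deg, step_length):
    """One pedestrian-dead-reckoning update: advance the position estimate
    by one detected step along the estimated heading (0 degrees = north)."""
    h = math.radians(heading_deg)
    return (pos[0] + step_length * math.sin(h),   # east component
            pos[1] + step_length * math.cos(h))   # north component

def calibrate_step_length(start_pos, landmark_pos, num_steps):
    """Landmark-based recalibration: when the user is known to have reached
    a landmark (e.g. a door), the true average step length is the
    straight-line distance walked divided by the steps counted."""
    dist = math.hypot(landmark_pos[0] - start_pos[0],
                      landmark_pos[1] - start_pos[1])
    return dist / num_steps
```

The recalibrated step length then feeds back into subsequent `pdr_step` updates, bounding the drift that accumulates between landmarks.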
Deep neural networks have proven to be particularly effective in visual and audio recognition tasks. Existing models tend to be computationally expensive and memory intensive, however, and so methods for hardware-oriented approximation have become a hot topic. Research has shown that custom hardware-based neural network accelerators can surpass their general-purpose processor equivalents in terms of both throughput and energy efficiency. Application-tailored accelerators, when co-designed with approximation-based network training methods, transform large, dense and computationally expensive networks into small, sparse and hardware-efficient alternatives, increasing the feasibility of network deployment. In this article, we provide a comprehensive evaluation of approximation methods for high-performance network inference along with in-depth discussions of their effectiveness for custom hardware implementation. We also include proposals for future research based on a thorough analysis of current trends. This article represents the first survey providing detailed comparisons of custom hardware accelerators featuring approximation for both convolutional and recurrent neural networks, through which we hope to inspire exciting new developments in the field.
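A simple instance of the approximation methods discussed is post-training uniform quantization, which maps floating-point weights onto low-bit integers; a minimal symmetric-quantization sketch (per-tensor max-based scale selection is the simplest choice, used here for illustration):

```python
def quantize_uniform(weights, num_bits=8):
    """Symmetric uniform quantization: map floats to signed num_bits
    integer codes plus one shared scale factor for the whole tensor."""
    qmax = 2 ** (num_bits - 1) - 1                 # e.g. 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax    # per-tensor scale
    return [round(w / scale) for w in weights], scale

def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]
```

Each weight now occupies `num_bits` instead of 32 bits, trading a small reconstruction error for the memory and bandwidth savings that make custom hardware accelerators efficient.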
Data mining information about people is becoming increasingly important in the data-driven society of the 21st century. Unfortunately, sometimes there are real-world considerations that conflict with the goals of data mining; sometimes the privacy of the people being data mined needs to be considered. This necessitates that the output of data mining algorithms be modified to preserve privacy while simultaneously not ruining the predictive power of the outputted model. Differential privacy is a strong, enforceable definition of privacy that can be used in data mining algorithms, guaranteeing that nothing will be learned about the people in the data that could not already be discovered without their participation. In this survey, we focus on one particular data mining algorithm -- decision trees -- and how differential privacy interacts with each of the components that constitute decision tree algorithms. We analyze both greedy and random decision trees, and the conflicts that arise when trying to balance privacy requirements with the accuracy of the model.
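The core building block behind most differentially private decision-tree algorithms is the Laplace mechanism applied to counts (e.g. class counts at a node); a minimal sketch, with the inverse-CDF Laplace sampler written out so only the standard library is needed:

```python
import math
import random

def dp_count(values, predicate, epsilon, rng=random):
    """Epsilon-differentially-private count via the Laplace mechanism.
    A count query has sensitivity 1 (adding or removing one person changes
    it by at most 1), so Laplace noise with scale 1/epsilon suffices."""
    true_count = sum(1 for v in values if predicate(v))
    u = rng.random() - 0.5                          # uniform on (-0.5, 0.5)
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) \
            * math.log(1.0 - 2.0 * abs(u))          # Laplace(0, 1/epsilon)
    return true_count + noise
```

Smaller epsilon means stronger privacy but noisier counts; a differentially private decision tree spends a per-node share of a total privacy budget on each such noisy count.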
A group key agreement (GKA) protocol enables a group of users to negotiate a one-time session key and protect subsequent group-oriented communication with this session key across an unreliable network. The number of communication rounds is one of the main concerns for practical applications in which the number of group participants involved is considerable. It is critical to have a fixed, constant number of rounds in GKA protocols to secure these applications. In light of the overwhelming variety and multitude of constant-round GKA protocols, this paper surveys these protocols from a series of perspectives to supply better comprehension for researchers and scholars. Concretely, this article captures the state-of-the-art of constant-round GKA protocols by analyzing the design rationale, examining the framework and security model, and evaluating all discussed protocols in terms of efficiency and security properties. In addition, this article discusses extensions of constant-round GKA protocols, including dynamic membership updating, password-based, affiliation-hiding, and fault-tolerant variants. In conclusion, this article also points out a number of interesting future directions.
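A classic example of a constant-round GKA protocol is the two-round Burmester-Desmedt scheme; a minimal sketch over the multiplicative group mod a prime p (the parameters in the test are tiny, for illustration only, and carry no security):

```python
def bd_group_key(p, g, secrets):
    """Two-round Burmester-Desmedt group key agreement sketch (mod prime p).

    Round 1: each user i broadcasts z_i = g^r_i.
    Round 2: each user i broadcasts X_i = (z_{i+1} / z_{i-1})^r_i.
    Every user then derives the same session key
    g^(r_0 r_1 + r_1 r_2 + ... + r_{n-1} r_0)."""
    n = len(secrets)
    z = [pow(g, r, p) for r in secrets]                     # round 1
    inv = lambda x: pow(x, p - 2, p)                        # inverse mod prime
    X = [pow(z[(i + 1) % n] * inv(z[(i - 1) % n]) % p,      # round 2
             secrets[i], p) for i in range(n)]
    keys = []
    for i in range(n):                                      # local derivation
        k = pow(z[(i - 1) % n], n * secrets[i], p)
        for j in range(n - 1):
            k = k * pow(X[(i + j) % n], n - 1 - j, p) % p
        keys.append(k)
    return keys  # all n entries are equal
```

Regardless of the group size n, only two broadcast rounds are needed, which is precisely the constant-round property the surveyed protocols aim for.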
Orthogonal moments provide an efficient mathematical framework for computer vision, image analysis and pattern recognition. They are derived from polynomials that are orthogonal to each other. Orthogonal moments are more efficient than non-orthogonal moments for image representation, offering minimal attribute redundancy, robustness to noise, and invariance to rotation, translation and scaling. Orthogonal moments can be either continuous or discrete. Prominent continuous moments are Zernike, Pseudo-Zernike, Legendre and Gaussian-Hermite. This paper provides a comprehensive and comparative review of continuous orthogonal moments along with their applications.
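As a concrete example of a continuous orthogonal moment, the (m, n) Legendre moment of an image can be approximated on a discrete pixel grid; a pure-Python sketch (the normalization constant and uniform grid sampling are the standard textbook choices, used here for illustration):

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence
    k*P_k = (2k-1)*x*P_{k-1} - (k-1)*P_{k-2}."""
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(2, n + 1):
        p_prev, p_curr = p_curr, ((2 * k - 1) * x * p_curr
                                  - (k - 1) * p_prev) / k
    return p_curr

def legendre_moment(image, m, n):
    """Discrete approximation of the (m, n) Legendre moment of a square
    grayscale image, with pixel coordinates mapped onto [-1, 1] x [-1, 1].
    Requires a grid of at least 2 x 2 pixels."""
    N = len(image)
    norm = (2 * m + 1) * (2 * n + 1) / 4.0
    total = 0.0
    for i in range(N):
        x = -1.0 + 2.0 * i / (N - 1)
        for j in range(N):
            y = -1.0 + 2.0 * j / (N - 1)
            total += legendre(m, x) * legendre(n, y) * image[i][j]
    return norm * total * (2.0 / (N - 1)) ** 2   # dx*dy area element
```

Because P_1(x) = x is odd, any image symmetric about the vertical axis has a (1, 0) moment of (numerically) zero, illustrating how the orthogonal basis separates independent shape features.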
Beyond 2014: Formal methods for attack tree-based security modeling
Virtualization works as an underlying technology behind the success of cloud computing. It runs multiple operating systems simultaneously by means of virtual machines. Through virtual machine live migration, virtualization efficiently manages resources within a cloud datacenter with minimum service interruption. Precopy and postcopy are the traditional techniques of virtual machine memory live migration. Of these two techniques, precopy is widely adopted due to its reliability against destination-side crashes. A large number of migrations take place within datacenters for resource management purposes. Virtual machine live migration affects the performance of the virtual machine as well as overall system performance; hence it needs to be efficient. In this paper, several precopy-based methods for efficient virtual machine memory live migration are classified and discussed. The paper compares these methods on several parameters, such as their methods, goals, limitations, performance parameters evaluated, virtualization platform used, and workload used. Further, the paper also shows an analytical comparison between different virtualized benchmark platforms to understand the implementation aspects, and discusses some open areas related to VM live migration with their issues.
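The precopy mechanism compared throughout these works can be sketched with a toy dirty-page model (the geometric shrinking of the dirty set is a simplifying assumption for illustration; real workloads also re-dirty pages outside the previous round's dirty set):

```python
import random

def precopy_migrate(num_pages, redirty_prob, stop_threshold=50,
                    max_rounds=30, rng=random):
    """Toy iterative pre-copy: round 0 copies every page; each later round
    re-copies the pages dirtied during the previous round; when the dirty
    set is small enough (or the round budget runs out), the VM is paused
    and the remaining dirty pages are sent (stop-and-copy)."""
    pages_sent = num_pages                          # full initial copy
    dirty = {p for p in range(num_pages) if rng.random() < redirty_prob}
    rounds = 1
    while len(dirty) > stop_threshold and rounds < max_rounds:
        pages_sent += len(dirty)                    # re-send last round's dirty pages
        dirty = {p for p in dirty if rng.random() < redirty_prob}
        rounds += 1
    downtime_pages = len(dirty)                     # copied while the VM is paused
    return pages_sent + downtime_pages, downtime_pages, rounds
```

The trade-off the surveyed methods optimize is visible here: total traffic exceeds the VM size (the precopy overhead), while downtime covers only the small final dirty set.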
Semantic annotation is a crucial precondition for the semantic web and has long been a research topic among communities. Currently, the most promising results are achieved via manual or semi-supervised approaches, or a hybrid of the two. There are already many surveys targeting semantic annotators that adopt manual/semi-supervised approaches. However, a comprehensive survey targeting unsupervised semantic approaches is missing. A supervised approach means that human intervention and training examples are required. Given the vast number of documents that need to be annotated, fully automated semantic annotation is still the ultimate goal. Though fully automatic semantic annotation is hard, there are many works toward this goal. To better understand the state-of-the-art of fully automatic approaches for semantic annotation, this paper investigates the literature and presents a classification of the approaches. In contrast to existing surveys, this paper focuses on fully automatic approaches. This paper helps readers understand the existing unsupervised approaches and gain insight into the state-of-the-art.
Workflow scheduling is one of the challenging issues in emerging trends of the distributed environment that focuses on satisfying various quality of service (QoS) constraints. The cloud receives applications in the form of workflows, consisting of sets of interdependent tasks, to solve large-scale scientific or enterprise problems. Workflow scheduling in the cloud environment has been studied extensively over the years, and the paper provides a comprehensive review of the approaches. The paper analyses the characteristics of various workflow scheduling techniques and classifies them based on their objectives and execution model. In addition, recent technological developments and paradigms such as serverless computing and Fog computing are creating new requirements/opportunities for workflow scheduling in a distributed environment. Serverless infrastructures are mainly designed for processing background tasks such as Internet of Things (IoT), web applications or event-driven applications. To address the ever-increasing demands of resources and to overcome the drawbacks of the cloud-centric IoT, the Fog computing paradigm has been developed. The paper also discusses workflow scheduling in the context of these emerging trends of cloud computing.
Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. Specifically, we present trends in DNN architectures and the resulting implications on parallelization strategies. We discuss the different types of concurrency in DNNs; synchronous and asynchronous stochastic optimization; distributed system architectures; communication schemes; and performance modeling. Based on these approaches, we extrapolate potential directions for parallelism in deep learning.
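The synchronous data-parallel case mentioned above can be sketched in a few lines: each "worker" computes a gradient on its own data shard, and the gradients are averaged before every update, which corresponds to an all-reduce in a real distributed setting. The toy linear model, shard split, and learning rate here are illustrative assumptions.

```python
# Sketch of synchronous data-parallel SGD on a toy linear model.
import numpy as np

def data_parallel_step(w, shards, lr=0.1):
    grads = []
    for X, y in shards:                             # one shard per worker
        grads.append(2 * X.T @ (X @ w - y) / len(y))  # local MSE gradient
    g = np.mean(grads, axis=0)                      # synchronous all-reduce
    return w - lr * g

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                                      # noiseless targets
shards = [(X[i::4], y[i::4]) for i in range(4)]     # split data over 4 workers
w = np.zeros(3)
for _ in range(200):
    w = data_parallel_step(w, shards)
```

Asynchronous variants drop the barrier implied by the averaging step and apply possibly stale gradients as they arrive, trading statistical efficiency for hardware utilization.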
Food is essential for human life and fundamental to the human experience. Food-related studies can support manifold applications and services, such as guiding human behavior, improving human health, and understanding culinary culture. With the fast development of social networks, mobile networks, and the Internet of Things, people commonly upload, share, and record food images, recipes, cooking videos, and food diaries, leading to large-scale food data. Such data offers rich knowledge about food and can help tackle many central issues of human society. It is therefore time to bring together several disparate food-related issues under the heading of food computing. Food computing acquires and analyzes heterogeneous food data from disparate sources for the perception, recognition, retrieval, recommendation, and monitoring of food, applying computational approaches to food-related issues in various fields. Both large-scale food data and recent breakthroughs in computer science are transforming the way we analyze food data, and a vast amount of research has targeted different food-oriented tasks and applications. We formalize food computing, present a comprehensive overview of its emerging concepts, methods, and tasks, and summarize key challenges and future directions.
Blockchain offers a fundamentally different approach to storing information, making transactions, performing functions, and establishing trust in an open environment. Many consider blockchain a technology breakthrough for cryptography and cybersecurity, with use cases ranging from globally deployed cryptocurrency systems like Bitcoin, to smart contracts, smart grids over the Internet of Things, and so forth. Although blockchain has received growing interest in both academia and industry over the past five years, the security and privacy of blockchains remain at the center of the debate when deploying blockchain in different applications. This paper presents a comprehensive overview of the security and privacy of blockchain. To facilitate the discussion, we first describe the concept of blockchains for online transactions. Then we describe the basic security properties that are inherent in Bitcoin-like cryptocurrency systems and the additional security properties that are desired in many blockchain applications. Finally, we review the security and privacy techniques for achieving these security properties in blockchain-based systems. We hope that this survey helps readers gain an in-depth understanding of the security and privacy of blockchain with respect to concepts, attributes, techniques, and systems.
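The basic tamper-evidence property underlying the security discussion above can be illustrated with a toy hash chain: each block stores the hash of its predecessor, so altering any historical transaction invalidates every later link. The block fields and transaction strings are deliberately simplified; real systems add signatures, Merkle trees, and a consensus mechanism on top.

```python
# Toy hash chain illustrating blockchain tamper evidence.
import hashlib
import json

def block_hash(block):
    # canonical JSON so the hash is deterministic
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain, transactions):
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev": prev, "tx": transactions})

def verify(chain):
    # every block must reference the hash of its predecessor
    return all(chain[i]["prev"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
append_block(chain, ["alice->bob:5"])
append_block(chain, ["bob->carol:2"])
ok_before = verify(chain)
chain[0]["tx"] = ["alice->bob:500"]   # tamper with history
ok_after = verify(chain)
```

Because the first block's hash changes when its contents change, the second block's stored `prev` no longer matches, and verification fails.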
Although malicious software (malware) has been around since the early days of computers, the sophistication and innovation of malware have increased over the years. In order to protect institutions and the public from malware attacks, malicious activity must be detected as early as possible. Analyzing a suspicious file by static or dynamic analysis methods can provide relevant and valuable information regarding a file's impact on the hosting system and help determine whether the file is malicious or not. Although dynamic analysis is more robust than static analysis, existing dynamic analysis tools and techniques are imperfect, and there is no single tool that can cover all aspects of malware behavior. Over the last seven years, the computing environment has changed dramatically, with new types of malware (ransomware, cryptominers); new analysis methods (volatile-memory forensics, side-channel analysis); new computing environments (cloud computing, IoT devices); and more. The goal of this survey is to provide a comprehensive and up-to-date overview of existing methods used to dynamically analyze malware, including a description of each method, its strengths and weaknesses, and its resilience against malware evasion techniques. In addition, we present prominent studies that applied machine learning methods to enhance dynamic analysis aimed at malware detection and categorization.
The machine learning community has been overwhelmed by a plethora of deep learning based approaches. Many challenging computer vision tasks, such as detection, localization, recognition, and segmentation of objects in unconstrained environments, are being efficiently addressed by various types of deep neural networks, including convolutional neural networks, recurrent networks, adversarial networks, and autoencoders. While there have been plenty of analytical studies of the object detection and recognition domains, many new deep learning techniques have surfaced for image segmentation. This paper approaches these deep learning techniques for image segmentation from an analytical perspective. The main goal of this work is to provide an intuitive understanding of the major techniques that have made significant contributions to the image segmentation domain. Starting from some traditional image segmentation approaches, the paper progresses to describe the effect deep learning has had on the image segmentation domain. Thereafter, most of the major segmentation algorithms are logically categorized, with paragraphs dedicated to their unique contributions. With an ample amount of intuitive explanation, the reader is expected to gain an improved ability to visualize the internal dynamics of these processes.
Integration of various embedded multimedia devices with powerful computing platforms, e.g., machine learning platforms, helps to build smart cities and transforms the concept of the Internet of Things (IoT) into the Internet of Multimedia Things (IoMT). To provide security and infotainment applications to the residents of smart cities, IoMT technology will generate huge volumes of multimedia data. Managing such large-scale data is a challenging task for IoMT technology: without proper management, it is hard to maintain the consistency, reusability, and reconcilability of the multimedia data generated by embedded multimedia devices in smart cities. Various feature or representation learning techniques can be utilized to automatically classify raw multimedia data, allowing machines to learn features and perform specific tasks. In this survey, we focus on various representation learning platforms for processing and managing the multimedia data generated by different applications in smart cities. We also highlight various limitations and research challenges that smart city applications can face while processing huge volumes of multimedia data in real time.
The number of applications being developed that require access to knowledge about the real world has increased rapidly over the past two decades. Domain ontologies, which formalize the terms used in a discipline, have become essential for research in areas such as Machine Learning, the Internet of Things, Robotics, and Natural Language Processing because they enable separate systems to exchange information. The quality of these domain ontologies, however, must be assured for meaningful communication. Assessing the quality of domain ontologies for their suitability to potential applications remains difficult, even though a variety of frameworks and metrics have been developed for doing so. This paper reviews domain ontology assessment efforts to highlight the work that has been carried out and to clarify the important issues that remain. These assessment efforts are classified into six distinct evaluation approaches, and the state of the art of each is described. Challenges associated with domain ontology assessment are outlined, and recommendations are made for future research and applications.
Internet of Things (IoT) devices are gaining momentum as mechanisms to authenticate the user carrying them. It is therefore critical to ensure that such a user is not impersonated at any time. This need is known as Continuous Authentication (CA). Since 2007, a plethora of IoT-based CA academic research and industrial contributions have been proposed. We offer a comprehensive overview of 62 research papers covering the main components of a CA system. The status of the industry is studied as well, covering 37 market contributions, research projects, and related standards. Finally, we present lessons learned to foster further research in this area.
The skin offers exciting possibilities for human-computer interaction by enabling new types of input and feedback. We survey 42 research papers on interfaces that allow users to give input on their skin. Skin-based interfaces have developed rapidly over the past eight years but most work consists of individual prototypes, with limited overview of possibilities or identification of research directions. The purpose of this article is to synthesize what skin input is, which technologies can sense input on the skin, and how to give feedback to the user. We discuss challenges for research in each of these areas.
The variety of data is one of the most challenging issues for research and practice in data management systems. Data are naturally organized in different formats and models, including structured, semi-structured, and unstructured data. In this survey, we introduce the area of multi-model DBMSs, which build a single database platform to manage multi-model data. Even though multi-model databases are a newly emerging area, in recent years we have witnessed many database systems embrace this paradigm. We provide a general classification and multi-dimensional comparisons of existing multi-model databases. This comprehensive introduction to existing approaches and open problems, from both technique and application perspectives, makes this survey useful for motivating new multi-model database approaches, as well as serving as a technical reference for developing multi-model database applications.
A systematic literature review is presented that surveys the topic of cloud testing over the period 2012-2017. Cloud testing can refer either to testing cloud-based systems (testing of the cloud) or to leveraging the cloud for the purpose of testing (testing in the cloud): both approaches (and their combination into testing of the cloud in the cloud) have drawn research interest. An extensive paper search was conducted, both by automated query of popular digital libraries and by snowballing, which resulted in a final selection of 147 primary studies. Throughout the study, a classification framework was incrementally derived. The paper includes a quantitative analysis of the primary studies against this framework, as well as a discussion of their main highlights. We conclude that cloud testing is an active and variegated research field, although not all topics have so far received enough attention.
The emerging context-aware applications in ubiquitous computing demand accurate, real-time location information for humans and objects. Indoor location-based services can be delivered through different types of technology, among which a recent approach utilizes LED lighting as a medium for Visible Light Communication (VLC). The ongoing development of solid-state lighting (SSL) is driving the wide adoption of LED lights and thereby laying the ground for a ubiquitous wireless communication network built from lighting systems. Considering the recent advances in implementing Visible Light Positioning (VLP) systems, this paper presents a review of VLP systems and focuses on the performance evaluation of experimental achievements in location sensing through LED lights. We outline the performance evaluation of different prototypes by introducing new performance metrics, their underlying principles, and their notable findings. Furthermore, the study synthesizes the fundamental characteristics of VLC-based positioning systems that need to be considered, presents several technology gaps in the current state of the art for future research endeavors, and summarizes our lessons learned towards the standardization of performance evaluation.
Malware still constitutes a major threat in the cybersecurity landscape, also due to the widespread use of infection vectors such as documents and other media formats. These infection vectors hide embedded malicious code to the victim users, thus facilitating the use of social engineering techniques to infect their machines. In the last decade, machine-learning algorithms provided an effective defense against such threats, being able to detect malware embedded in various infection vectors. However, the existence of an arms race in an adversarial setting like that of malware detection has recently questioned their appropriateness for this task. In this work, we focus on malware embedded in PDF files, as a representative case of how such an arms race can evolve. We first provide a comprehensive taxonomy of PDF malware attacks, and of the various learning-based detection systems that have been proposed to detect them. Then, we discuss more sophisticated attack algorithms that craft evasive PDF malware oriented to bypass such systems. We describe state-of-the-art mitigation techniques, highlighting that designing robust machine-learning algorithms remains a challenging open problem. We conclude the paper by providing a set of guidelines for designing more secure systems against the threat of adversarial malicious PDF files.
This survey provides an overview of the scientific literature on timing verification techniques for multi-core real-time systems. It reviews the key results in the field from its origins around 2006 to the latest research published up to the end of July 2018. The survey highlights the key issues involved in providing guarantees of timing correctness for multi-core systems. A detailed review is provided covering four main categories: full integration, temporal isolation, integrating interference effects into schedulability analysis, and mapping and allocation. The survey concludes with a discussion of the advantages and disadvantages of these different approaches, identifying open issues, key challenges, and possible directions for future research.
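The schedulability analyses reviewed above typically extend the classic single-core fixed-point response-time analysis by adding interference terms for shared resources. As background, here is a minimal sketch of that single-core baseline; the example task set (worst-case execution time, period) is illustrative, with implicit deadlines equal to periods.

```python
# Classic fixed-point response-time analysis for fixed-priority,
# implicit-deadline tasks on a single core: R = C + sum over
# higher-priority tasks j of ceil(R / T_j) * C_j, iterated to a fixed point.
import math

def response_time(task_idx, tasks):
    """tasks: list of (C, T) pairs sorted by descending priority.
    Returns the worst-case response time of tasks[task_idx],
    or None if it can exceed its deadline (= period)."""
    C, T = tasks[task_idx]
    R = C
    while True:
        interference = sum(math.ceil(R / Tj) * Cj
                           for Cj, Tj in tasks[:task_idx])
        R_next = C + interference
        if R_next > T:           # deadline miss: unschedulable
            return None
        if R_next == R:          # fixed point reached
            return R
        R = R_next
```

Multi-core analyses in the "integrating interference" category add further terms to the recurrence for contention on buses, caches, and memory controllers, which is what makes them substantially harder to keep both sound and tight.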
The Information-Centric Networking paradigm is a Future Internet approach that aims to tackle the Internet's architectural problems and inefficiencies by shifting the main entity of the network architecture from hosts to content items. This paradigm change potentially enables a future Internet with better performance, reliability, scalability, and suitability for wireless and mobile communication. It also provides new intrinsic means to deal with some popular attacks on the Internet architecture, such as denial of service. However, this new paradigm also introduces new security challenges that need to be addressed to ensure its capability to support current and future Internet requirements. This paper surveys and summarizes ongoing research concerning the security aspects of information-centric networks, discussing vulnerabilities, attacks, and proposed solutions to mitigate them. We also discuss open challenges and propose future directions for research in information-centric network security.
Recent global smart city efforts resemble the establishment of electricity networks when electricity was first invented, which marked the start of a new era of selling electricity as a utility. A century later, in the smart era, the network that delivers services goes far beyond a single commodity like electricity. Supplemented by a well-established internet infrastructure that can run an endless number of applications, the abundant processing and storage capabilities of the cloud, resilient edge computing, and sophisticated data analysis such as machine learning and deep learning, an already-booming Internet of Things (IoT) movement makes this new era far more exciting. In this article, we present a multi-faceted survey of machine intelligence in modern smart city implementations. We partition smart city infrastructure into application, sensing, communication, security, and data planes and put an emphasis on the data plane as the mainstay of computing and data storage. We investigate i) centralized and distributed implementations of the data plane's physical infrastructure and ii) the complementary application of data analytics, machine learning, deep learning, and data visualization to implement robust machine intelligence in a smart city software core. We conclude with pointers to open issues and challenges.
Recent advances in Internet of Things (IoT) have enabled myriad domains such as smart homes, personal monitoring devices, and enhanced manufacturing. IoT is now pervasive: new applications are being used in nearly every conceivable environment, which leads to the adoption of device-based interaction and automation. However, IoT has also raised issues about the security and privacy of these digitally augmented spaces. Program analysis is crucial in identifying those issues, yet the application and scope of program analysis in IoT remains largely unexplored by the technical community. In this paper, we study privacy and security issues in IoT that require program-analysis techniques with an emphasis on identified attacks against these systems and defenses implemented so far. Based on a study of five IoT programming platforms, we identify the key insights that result from research efforts in both the program analysis and security communities and relate the efficacy of program-analysis techniques to security and privacy issues. We conclude by studying recent IoT analysis systems and exploring their implementations. Through these explorations, we highlight key challenges and opportunities in calibrating for the environments in which IoT systems will be used.
Color is a powerful communication component everywhere, not only as part of a message and its meaning, but also as a way of discriminating the contents therein. However, 5% of the world's population suffers from a visual impairment known as color vision deficiency (CVD), commonly called colorblindness, which constrains color perception. This condition alters the way color is perceived, compromising the reading and understanding of message contents. The issue becomes even more serious with the increasing availability of multimedia content in computational environments, mainly on the web and other Internet resources, as well as with the growth of graphical software and tools. Aware of this problem, a significant number of research works related to the CVD condition have been described in the literature over the last two decades, in particular those aimed at improving the readability of contents via color enhancement, whether they include text, images, or both. This survey mainly addresses the state of the art in recoloring algorithms for still images and identifies current trends in color adaptation techniques for colorblind people.
Anomaly detection has found applications in diverse research areas. In network security, it has been widely used for discovering network intrusions and malicious events. Detection of anomalies in quantitative data has received considerable attention in the literature and has a venerable history. By contrast, and despite the widespread use of categorical data in practice, anomaly detection in categorical data has received relatively little attention, because it is a challenging problem. One such challenge is that anomaly detection techniques usually depend on identifying representative patterns and then measuring distances between objects and these patterns; however, neither identifying patterns nor measuring distances is straightforward in categorical data. Fortunately, several papers focusing on the detection of anomalies in categorical data have been published in the recent literature. In this article, we provide a comprehensive review of research on the anomaly detection problem in categorical data. We categorize existing algorithms into different approaches based on the conceptual definition of anomalies they use. For each approach, we survey anomaly detection algorithms and then show the similarities and differences among them.
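The pattern-and-distance difficulty described above can be made concrete with a minimal frequency-based scorer for categorical records: the "representative pattern" is simply the per-attribute value frequency, and a record's score is the sum of the rarities of its values. The dataset and attribute layout are illustrative assumptions, sketching only the simplest of the approaches such a survey covers.

```python
# Minimal frequency-based anomaly scoring for categorical records.
from collections import Counter

def anomaly_scores(records):
    n = len(records)
    # one value-frequency table per attribute position
    freq = [Counter(r[i] for r in records) for i in range(len(records[0]))]
    # rarity of a value = 1 - its relative frequency; score = sum of rarities
    return [sum(1 - freq[i][v] / n for i, v in enumerate(r)) for r in records]

data = [
    ("http", "GET",  "200"),
    ("http", "GET",  "200"),
    ("http", "POST", "200"),
    ("ssh",  "GET",  "500"),   # rare protocol and rare status code
]
scores = anomaly_scores(data)
```

Treating attributes independently is exactly the limitation that motivates richer pattern definitions (frequent itemsets, information-theoretic measures, and so on), since a record can be anomalous through an unusual combination of individually common values.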