This article comprehensively surveys Arabic Online Handwriting Recognition (AOHR). We address the challenges posed by online handwriting recognition, including ligatures, dots and diacritics, online/offline touching of text, and geometric variations. We then present a general model of an AOHR system that incorporates its different phases. We summarize the main AOHR databases and identify their uses and limitations. Preprocessing techniques used in AOHR, viz. normalization, smoothing, de-hooking, baseline identification, and delayed-stroke processing, are presented with illustrative examples. We discuss different techniques for Arabic online handwriting segmentation at the character and morpheme levels and identify their limitations. Feature extraction techniques used in AOHR are discussed and their challenges identified. We address the classification of non-cursive (characters and digits) and cursive Arabic online handwriting and analyze its applications. We discuss different classification techniques, viz. structural approaches, SVMs, fuzzy SVMs, neural networks, HMMs, genetic algorithms, decision trees, and rule-based systems, and analyze their performance. Post-processing techniques are also discussed. Several tables summarizing the surveyed publications are provided for ease of reference and comparison. In the conclusions, we summarize the current limitations and difficulties of AOHR and outline future directions of research.
The twenty-first century has ushered in the age of the data economy, in which data DNA becomes an intrinsic constituent of all data-based organisms and carries important knowledge and insights. An appropriate understanding of data DNA and its organisms relies on a new field: data science and its keystone, analytics. Although it is widely debated whether big data is mere hype and buzz and data science is still in its very early phase, significant challenges and opportunities are emerging from, or inspired by, the research, innovation, business and education of data science and analytics. This paper provides a comprehensive survey and tutorial of the fundamental aspects of data science and analytics: the evolution from data analysis to data science, the concepts of data science, a big picture of the era of data science, major challenges and directions in data innovation, the nature of data analytics, new industrialization and service opportunities in the data economy, the professions and competencies involved in data education, and typical pitfalls in data science. This article is the first in the field to draw such a comprehensive big picture, in addition to offering rich observations, lessons and reflections on data science and analytics.
We survey foundational features underlying modern graph query languages. We first discuss two popular graph data models: edge-labelled graphs, where nodes are connected to other nodes by directed, labelled edges; and property graphs, where nodes and edges can have attributes. Next we discuss the two most basic graph querying functionalities: graph patterns and navigational expressions. We start with graph patterns, in which a graph-structured query is matched against the data. Thereafter we discuss navigational expressions, in which patterns can be matched recursively against the graph to navigate paths of arbitrary length; we give an overview of the kinds of expressions that have been proposed and how such expressions can be combined with graph patterns. We also discuss a variety of semantics under which queries using these features can be evaluated, the effects that the introduction of additional features and the selection of semantics have on complexity, and examples of these features in three modern languages that can be used to query graphs: SPARQL, Cypher and Gremlin. We conclude with a discussion of the importance of formalisation for graph query languages, as well as possible future directions in which such languages can be extended.
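To make the graph-pattern idea concrete, the following minimal Python sketch matches a small pattern (triples in which names starting with `?` are variables) against an edge-labelled graph represented as a set of (source, label, target) triples. The graph, the pattern, and the `match_pattern` helper are all invented here for illustration; real engines such as SPARQL or Cypher evaluators use far more sophisticated plans.

```python
# Toy edge-labelled graph: a set of (source, label, target) triples.
graph = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
}

def match_pattern(graph, pattern):
    """Naive graph-pattern matching: strings starting with '?' are
    variables; returns every consistent variable binding (a dict)."""
    bindings = [{}]
    for ps, pl, pt in pattern:
        new_bindings = []
        for binding in bindings:
            for s, l, t in graph:
                trial = dict(binding)
                ok = True
                for pat_term, data_term in ((ps, s), (pl, l), (pt, t)):
                    if pat_term.startswith("?"):
                        # Variable: must agree with any earlier binding.
                        if trial.get(pat_term, data_term) != data_term:
                            ok = False
                            break
                        trial[pat_term] = data_term
                    elif pat_term != data_term:
                        # Constant: must match the data exactly.
                        ok = False
                        break
                if ok:
                    new_bindings.append(trial)
        bindings = new_bindings
    return bindings

# Pattern: whom does alice know, and whom do they know in turn?
results = match_pattern(graph, [("alice", "knows", "?x"), ("?x", "knows", "?y")])
# With the toy graph above, ?x binds to "bob" and ?y to "carol".
```

This sketch evaluates the pattern under homomorphism-based semantics; as the survey discusses, alternative semantics (e.g., requiring distinct nodes per variable) change both results and complexity.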
Automated Vehicle Classification (AVC) based on vision sensors has received active attention from researchers, owing to heightened security concerns in Intelligent Transportation Systems. In this work, we propose a categorization of AVC studies based on the granularity of classification, namely Vehicle Type Recognition (VTR), Vehicle Make Recognition (VMR) and Vehicle Make and Model Recognition (VMMR). For each category of AVC system, we present a comprehensive review and comparison of feature extraction, global representation, and classification techniques. The various datasets proposed over the years for AVC are also compared in light of the real-world challenges they represent, and those they do not. The major challenges involved in each category of AVC system are presented, highlighting open problems in this area of research. Finally, we conclude by providing future directions of research in this area, paving the way towards efficient large-scale AVC systems. This survey will help researchers interested in the area to analyze the work completed so far in each category of AVC, focusing on the techniques proposed for each module, and to chalk out strategies for advancing the state of the art.
During a processor development cycle, validation is performed on the first fabricated chip to detect and fix design errors. Design errors due to functional issues occur when a unit in a design does not meet its specification, and their chances of occurrence are high when new features are added to a processor. For multicore architectures, the task of verifying each unit's functionality, both independently and in coordination with other units, therefore grows considerably. Several new techniques are being proposed in the field of functional validation. In this paper, we undertake a survey of these techniques to identify areas that need to be addressed for multicore designs. We start with an analysis of design errors in two multicore architectures. We then survey different functional validation techniques based on hardware, software and formal methods, and propose a comprehensive taxonomy for each of these approaches. We also perform a critical analysis to identify gaps in existing research and propose new research directions for the validation of multicore architectures.
An enormous amount of research has been conducted in the area of positioning systems, which calls for a detailed literature review of recent localization systems. This paper focuses on recent developments in non-Global Positioning System (GPS) localization/positioning systems. We present a new hierarchical method to classify various positioning systems, together with a comprehensive performance comparison of the techniques and technologies against multiple performance metrics, including their limitations. Finally, we present a few indoor positioning systems that have emerged as more successful in particular application environments than others.
Ray tracing has long been considered the next-generation technology for graphics rendering. Recent years have witnessed strong momentum towards adopting ray-tracing-based rendering techniques on consumer-level platforms, as increasing display resolution can no longer further enhance the user experience. On the other hand, the computing workload of ray tracing is still overwhelming: a 10-fold performance gap has to be narrowed for real-time applications, even on the latest graphics processing units (GPUs). As a result, hardware acceleration techniques are critical to delivering a satisfying level of performance while meeting an acceptable power budget. A large body of research on ray tracing hardware has been produced over the past decade. This paper aims to provide a timely survey of hardware techniques for accelerating the ray tracing algorithm. A quantitative profiling of the ray tracing workload is first presented. We then review hardware techniques for the main functional blocks in a ray tracing pipeline. On this basis, ray tracing microarchitectures for both ASICs and programmable processors are surveyed following a systematic taxonomy.
The aim of this article is to provide an understanding of social networks as a useful addition to the standard toolbox of techniques used by system designers. To this end, we give examples of how data about social links have been collected and used in different application contexts. We develop a broad taxonomy-based overview of common properties of social networks, review how they might be used in different applications, and point out potential pitfalls where appropriate. We propose a framework distinguishing between two main types of social-network-based user selection: personalised user selection, which identifies target users who may be relevant for a given source node, using the social network around the source as a context; and generic user selection, or group delimitation, which filters for a set of users who satisfy a set of application requirements based on their social properties. Using this framework, we survey applications of social networks in three typical kinds of application scenarios: recommender systems, content-sharing systems (e.g., P2P or video streaming), and systems which defend against users who abuse the system (e.g., via spam or sybil attacks). In each case, we discuss potential directions for future research that involve using social network properties.
The expressiveness of programming languages is limited by their paradigms, which focus on solving abstraction problems without considering the expressiveness of abstractions described in natural language. Consequently, authors have developed tools for natural-language software development. In this paper, we review many works comprising tools that use some level of natural language, as well as domain-specific languages whose expressiveness approaches that of natural languages. The goal of the paper is to present this review and to highlight both the problems that have been solved and those left aside. We also note that a naturalistic language based on a model has not yet been reported.
Technological advances allow more physical objects to connect to the Internet and provide their services on the Web as resources. Search engines are the key to fully utilizing this emerging Web of Things, as they bridge users and applications with the resources needed for their operation. Developing these systems is challenging due to the diversity of the Web of Things resources that they work with. Each combination of resources in the query resolution process requires a different type of search engine, with its own technical challenges and usage scenarios. This diversity complicates both the development of new systems and the assessment of the state of the art. In this article, we present a systematic survey of Web of Things Search Engines (WoTSE), focusing on the diverse forms these systems take. We collect and analyze over 200 related academic works to build a flexible conceptual model for WoTSE. We develop an analytical framework on this model to review the development of the field and its current status, reflected by 30 representative works in the area. We conclude our survey with a discussion of open issues to bridge the gap between existing progress and an ideal WoTSE.
Locality of information is a major concern in the design of distributed algorithms. With the LOCAL model, theoretical research has already established a common model of locality, but it has gained little practical relevance; as a result, practical research de facto lacks any common locality model. The only common denominator among practitioners is that a local algorithm is distributed with a limited scope of interaction. This paper closes this gap by introducing four practically motivated classes of locality that successively weaken the strict requirements of the LOCAL model. These classes are applied to categorize and survey 32 local algorithms from nine different application domains. A detailed comparison shows the practicality of the classification and provides interesting insights. For example, the majority of algorithms limit the scope of interaction to at most two hops, independent of their locality class. Moreover, the application domain of an algorithm tends to influence its degree of locality.
Complex Event Recognition applications exhibit various types of uncertainty, ranging from incomplete and erroneous data streams to imperfect complex event patterns. We review Complex Event Recognition techniques that handle uncertainty to some extent. We examine techniques based on automata, probabilistic graphical models and first-order logic, which are the most common, as well as approaches based on Petri Nets and Grammars, which are less frequently used. A number of limitations are identified with respect to the employed languages, their probabilistic models and their performance, as compared to the purely deterministic cases. Based on these limitations, we highlight promising directions for future work.
Detecting and analyzing dense groups or communities in social and information networks has attracted immense attention over the last decade due to its enormous applicability in different domains. Community detection is an ill-defined problem, as the nature of the communities is not known in advance. The problem is further complicated by the fact that communities emerge in networks in various forms: disjoint, overlapping, hierarchical, etc. Various heuristics have been proposed depending upon the application at hand. These heuristics have been materialized in the form of new metrics, which in most cases are used as optimization functions for detecting the community structure, or which provide an indication of the goodness of detected communities during evaluation. This gives rise to the need for an organized and detailed survey of the metrics proposed for community detection and evaluation. This paper presents a detailed discussion of the state-of-the-art metrics used for the detection and evaluation of community structure. Finally, experiments are conducted on synthetic and real networks to present a comparative analysis of these metrics in measuring the goodness of the detected community structure.
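One widely used metric of this kind is Newman-Girvan modularity, Q = (1/2m) Σ_ij [A_ij − k_i·k_j/2m] δ(c_i, c_j), which scores a partition by how many more intra-community edges it contains than expected by chance. The pure-Python sketch below computes Q for a toy undirected graph; the graph and partition are invented here purely for illustration.

```python
def modularity(edges, communities):
    """Newman-Girvan modularity of an undirected graph.
    edges: list of (u, v) pairs; communities: dict node -> community id."""
    m = len(edges)  # total number of edges
    # Node degrees.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Symmetric adjacency counts A_ij.
    adj = {}
    for u, v in edges:
        adj[(u, v)] = adj.get((u, v), 0) + 1
        adj[(v, u)] = adj.get((v, u), 0) + 1
    # Sum A_ij - k_i*k_j/(2m) over same-community node pairs.
    q = 0.0
    nodes = list(degree)
    for i in nodes:
        for j in nodes:
            if communities[i] == communities[j]:
                q += adj.get((i, j), 0) - degree[i] * degree[j] / (2 * m)
    return q / (2 * m)

# Toy example: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
# Partitioning into the two triangles gives Q = 5/14, roughly 0.357.
```

Used as an optimization function, a detection algorithm searches over partitions to maximize Q; used in evaluation, Q indicates the goodness of an already-detected community structure.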
Algorithmic debugging is a technique proposed in 1982 by E.Y. Shapiro in the context of logic programming. This survey shows how the initial ideas have developed into a widespread debugging schema fitting many different programming paradigms, with applications beyond the program debugging field. We describe the general framework and the main issues related to implementations in different programming paradigms, and discuss several proposed improvements and optimizations. We also review the main algorithmic debugging tools that have been implemented so far and compare their features. From this comparison, we elaborate a summary of desirable characteristics that should be considered when implementing future algorithmic debuggers.
This article presents an annotated bibliography on automatic software repair. Automatic software repair consists of automatically finding a solution to software bugs, without human intervention. The uniqueness of this article is that it spans the research communities that contribute to this body of knowledge: software engineering, dependability, operating systems, programming languages and security. Furthermore, it provides a novel and structured overview of the diversity of bug oracles and repair operators used in the literature.
The task of quantification consists in providing an aggregate estimation (e.g., the class distribution in a classification problem) for unseen test sets, applying a model that is trained on a training set with a different data distribution. Several real-world applications demand this kind of method, which does not require predictions for individual examples and focuses instead on obtaining accurate estimates at an aggregate level. During the past few years, several quantification methods have been proposed from different perspectives and with different goals. This paper presents a unified review of the main approaches, with the aim of serving as an introductory tutorial for newcomers to the field.
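To make the task concrete, the simplest baseline quantifier, commonly called "classify and count", applies a trained classifier to every test item and reports the resulting class proportions rather than the individual predictions. The sketch below illustrates this at an aggregate level; the one-feature threshold classifier and the test data are made up for illustration.

```python
from collections import Counter

def classify_and_count(classifier, test_items):
    """Baseline quantifier: predict each item, then report only the
    aggregate class proportions (prevalences), not the predictions."""
    predictions = [classifier(x) for x in test_items]
    counts = Counter(predictions)
    n = len(test_items)
    return {label: c / n for label, c in counts.items()}

# Made-up classifier and unlabeled test set for illustration.
classifier = lambda x: "pos" if x >= 0.5 else "neg"
test_items = [0.9, 0.1, 0.7, 0.4, 0.8]
prevalence = classify_and_count(classifier, test_items)
# prevalence == {"pos": 0.6, "neg": 0.4}
```

Because the classifier's errors bias these raw proportions, more refined methods, such as adjusted count approaches, correct the estimate using the classifier's estimated true- and false-positive rates.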
Recently, multimedia researchers have added several so-called new media (e.g., olfaction, haptics and gustation) to the traditional multimedia components. The inclusion of such stimuli alongside traditional media components is typically labeled multiple sensorial media, or mulsemedia. Capturing user-perceived multimedia Quality of Experience (QoE) is already non-trivial, and the addition of multiple sensorial media components increases this challenge. No standardized methodology exists for conducting subjective quality assessments of multiple sensorial media applications. To date, researchers have employed different aspects of audiovisual standards to assess user QoE of multiple sensorial media applications; thus, a fragmented approach exists. In this paper, the authors highlight issues researchers face from numerous perspectives, including the applicability (or lack thereof) of existing audiovisual standards for evaluating user QoE, the lack of result comparability due to varying approaches, the specific requirements of olfactory-based multiple sensorial media applications, and the novelty associated with these applications. Finally, based on the diverse approaches in the literature and the collective experience of the authors, this paper provides a tutorial and recommendations on the key steps for conducting olfactory-based multiple sensorial media QoE evaluation.
Modeling pedestrian dynamics and implementing such models in a computer are challenging and important issues in the knowledge areas of transportation and computer simulation. The aim of this paper is to provide a bibliographic outlook so that the reader can have quick access to the most relevant works related to this problem. We have used three main axes to organise the paper contents: pedestrian models, validation techniques and multiscale approaches. The backbone of the paper is the classification of existing pedestrian models; we have organised the works in the literature into five categories, according to the techniques used for the operational level in each pedestrian model. Then, the main existing validation methods, oriented towards evaluating the behavioural quality of the simulation systems, are reviewed. Furthermore, we review the key issues that arise when facing multiscale pedestrian modeling, where we focus firstly on the behavioural scale (combinations of micro and macro pedestrian models) and secondly on the scale size (from individuals to crowds). Finally, the paper concludes with a discussion of the contributions that different knowledge fields can make to this exciting area in the near future.
Digital advances have transformed the face of automatic music generation since its beginnings at the dawn of computing. Despite the many breakthroughs, issues such as the musical tasks targeted by different machines and the degree to which they succeed remain open questions. We present a functional taxonomy for music generation systems with reference to existing systems according to the purposes for which they were designed. The taxonomy also reveals the inter-relatedness among the systems. This design-centred approach contrasts with predominant methods-based surveys, and facilitates the identification of grand challenges so as to set the stage for new breakthroughs.