Vol.16 No.7&8 December 1, 2017
A Taxonomy of Web Effort Predictors
(pp541-570)
Ricardo Britto, Muhammad Usman, and Emilia Mendes
Web engineering as a field has emerged
to address challenges associated with developing Web applications. It is
known that the development of Web applications differs from the
development of non-Web applications, especially regarding aspects
such as Web size metrics. The classification of existing Web engineering
knowledge would be beneficial for both practitioners and researchers in
many different ways, such as finding research gaps and supporting
decision making. In the context of Web effort estimation, a taxonomy was
proposed to classify the existing size metrics, and more recently a
systematic literature review was conducted to identify aspects related
to Web resource/effort estimation. However, there is no study that
classifies Web predictors (both size metrics and cost drivers). The main
objective of this study is to organize the body of knowledge on Web
effort predictors by designing and using a taxonomy, aiming at
supporting both research and practice in Web effort estimation. To
design our taxonomy, we used a recently proposed taxonomy design method.
As input, we used the results of a previously conducted systematic
literature review (updated in this study), an existing taxonomy of Web
size metrics and expert knowledge. We identified 165 unique Web effort
predictors from a final set of 98 primary studies; they were used as one
of the bases for designing our hierarchical taxonomy. The taxonomy has three
levels, organized into 13 categories. We demonstrated the utility of the
taxonomy and body of knowledge by using examples. The proposed taxonomy
can be beneficial in the following ways: i) it can help to identify
research gaps and some literature of interest and ii) it can support the
selection of predictors for Web effort estimation. We also intend to
extend the taxonomy to include effort estimation techniques and accuracy
metrics.
An SMIL-Timesheets based temporal behavior model for the visual
development of Web user interfaces
(pp571-594)
M. Linaje, J.C. Preciado, and R. Rodriguez-Echeverria
Temporal behaviors are being incorporated into the user interfaces
of Web applications making them look more and more like multimedia
applications, the so-called Rich Internet Application (RIA) user
interfaces. Due to RIA complexity, some research communities have
proposed models to ease its development. However, there is a gap to
cover between formal temporal relationships and the current state of the
art in RIA model-driven development techniques. The purpose of this
paper is to specify a temporal behavioral model for data-intensive RIA
user interfaces with three main objectives. The first one is that the
model must be usable by non-experts in engineering specifications (e.g.,
Web designers). The second one is that the model must be suitable to be
implemented in a CASE tool integrating temporal behaviors in the RIA
model-driven development workflow. The third one is that the temporal
behaviors specified must run in current Web browsers. The approach
presented here is based on SMIL Timesheets, a standard that can be used as a
foundation to extend RIA user interface model-driven proposals.
Service Recommendation Based on
Separated Time-aware Collaborative Poisson Factorization
(pp595-618)
Shuhui Chen, Yushun Fan, Wei Tan, Jia Zhang, Bing Bai, and Zhenfeng Gao
With the boom in web service
ecosystems, finding suitable services and making service compositions
have become a principal challenge for inexperienced developers.
Therefore, recommending services based on service composition queries
turns out to be a promising solution. Many recent studies apply Latent
Dirichlet Allocation (LDA) to model the queries and service descriptions.
However, limited by the restrictive Dirichlet-Multinomial distribution
assumption, LDA cannot generate high-quality latent representations, so the
accuracy of recommendation is not quite satisfactory. Based on our previous
work, we propose Separated Time-aware Collaborative Poisson Factorization
(STCPF) to tackle this problem. STCPF takes Poisson Factorization as the
foundation to model mashup queries and service descriptions separately, and
combines them with the historical usage data by using collective matrix
factorization. Experiments on real-world data show that our model
outperforms state-of-the-art methods (e.g., Time-aware collaborative domain
regression) in terms of mean average precision, and takes much less time on
the sparse but massive data from web service ecosystems.
A Metric Based Automatic Selection of Ontology Matchers Using Bootstrapped
Patterns
(pp619-652)
B. Sathiya, Geetha T V, and Vijayan Sugumaran
The ontology matching
process has become a vital part of the (semantic) web, enabling
interoperability among heterogeneous data. To enable interoperability,
similar entity pairs across heterogeneous data are discovered using a
static set of matchers consisting of linguistic, structural and/or
instance matchers that discover similar entities. Numerous sets of
matchers exist in the literature; however, none of the matcher sets are
capable of achieving good results across all data. In addition, it is
both tedious and painstaking for domain experts to select the best set
of matchers for the given data to be matched. In this paper, we propose
two bootstrapping-based approaches, Bottom-up and Top-down, to
automatically select the best set of matchers for the given ontologies
to be matched. The selection is based on the characteristics
of the ontologies, which are quantified by a set of quality metrics. Two
new structural quality metrics, the Concept External Structural Richness
(CESR) and the Concept Internal Structural Richness (CISR), have also
been proposed to better quantify the structural characteristics of the
ontology. The best set of matchers is chosen using the sets of patterns
learned through the proposed Bottom-up and Top-down bootstrapping
approaches. The proposed metrics and the patterns constructed using
these approaches are evaluated using the COMA matching tool with
existing benchmark ontologies (Benchmark, Conference and Benchmark2
tracks of the OAEI 2011). The proposed Bottom-up based patterns, along
with the two proposed quality metrics, achieved better effectiveness
(F-measure) in selecting the best set of matchers in comparison with the
static set of matchers, supervised ML algorithms and the existing
automatic matching. Specifically, the proposed Bottom-up patterns
achieve a 14.6% Average Gain/Task and a significant improvement of 129%
in comparison with the existing KNN model’s Average Gain/Task.
Discover Semantic Topics in Patents within a Specific Domain
(pp653-675)
Wen Ma, Xiangfeng Luo, Junyu Xuan, and Ruirong Xue
Patent topic discovery is critical for
innovation-oriented enterprises to hedge the patent application risks
and raise the success rate of patent applications. Topic models are
commonly recognized as an efficient tool for this task by researchers
from both academia and industry. However, many existing well-known topic
models, e.g., Latent Dirichlet Allocation (LDA), which are particularly
designed for documents represented by word vectors, exhibit low
accuracy and poor interpretability on the patent topic discovery task. The
reasons are that 1) the semantics of documents are still under-explored in
a specific domain and 2) the domain background knowledge is not
successfully utilized to guide the process of topic discovery. In order
to improve the accuracy and the interpretability, we propose a new
patent representation and organization with additional inter-word
relationships mined from the titles, abstracts, and claims
of patents. The representation can endow each patent with more semantics
than a word vector. Meanwhile, we build a Backbone Association Link
Network (Backbone ALN) to incorporate domain background semantics to
further enhance the semantics of patents. With new semantic-rich patent
representations, we propose a Semantic LDA model to discover semantic
topics from patents within a specific domain. It can discover semantic
topics with association relations between words rather than a single
word vector. Finally, the accuracy and interpretability of the proposed
model are verified on real-world patent datasets from the United States
Patent and Trademark Office. The experimental results show that the Semantic
LDA model yields better performance than other conventional models
(e.g., LDA). Furthermore, our proposed model can be easily generalized
to other related text mining corpora.
A Hybrid Approach for
Automatic Mashup Tag Recommendation
(pp676-692)
Min Shi, Jianxun Liu, and Dong Zhou
Tags have been extensively utilized to annotate
Web services, which is beneficial to the management, classification and
retrieval of Web service data. In the past, plenty of work has been
done on tag recommendation for Web services and their compositions (e.g.,
mashups). Most of these approaches mainly exploit the tag-service matrix and
the textual content of Web services. In the real world, multiple relationships
could be mined from the tagging systems, such as composition relationships
between mashups and Application Programming Interfaces (APIs), and
co-occurrence relationships between APIs. This auxiliary information
could be utilized to enhance the current tag recommendation approaches,
especially when the tag-service matrix is sparse and the textual content
of Web services is absent. In this paper, we propose a hybrid
approach for mashup tag recommendation. Our hybrid approach consists of
two consecutive processes: API selection and tag ranking. We first
select the most important APIs of a new mashup based on a probabilistic
topic model and a weighted PageRank algorithm. The topic model
simultaneously incorporates the composition relationships between
mashups and APIs as well as the annotation relationships between APIs
and tags to elicit the latent topic information. Then, tags of the chosen
important APIs are recommended to this mashup. In this process, a tag
filtering algorithm is employed to further select the most
relevant and prevalent tags. The experimental results on a real-world
dataset show that our approach outperforms several state-of-the-art
methods.