In both the private and government sectors, the semantic heterogeneity problem constitutes an important roadblock to organizations’ efforts to implement systems interoperability. Semantic heterogeneity originates from application systems designed with different vocabularies or data models within an enterprise. Systems interoperability represents a crucial capability for the industry and government sectors. The scientific community has yet to propose a solution to this problem (Doan et al., 2012) (Olivé, 2017). The problem has a financial impact, since it consumes IT expenditure that could otherwise be used for more productive functionality (M. Dietrich et al., 2013) (Lemcke, 2009) (Brodie, 2010) (Jhingran et al., 2002). Furthermore, there may be consequences in terms of human life, since valuable medical and pharmaceutical research funds are spent on addressing semantic heterogeneity (Lenz et al., 2012). In (Williams et al., 2012) and (Mirhaji et al., 2009) the authors stress that data integration efforts pose significant challenges in biomedical research and hinder the knowledge discovery critically needed to develop new drugs.
One solution to the semantic heterogeneity problem is data integration using semantic web-capable technologies (De Giacomo et al., 2018). Data integration is a capability that allows harmonizing the meaning of data originating from various sources in a seamless manner, as if the data came from one single source (Jirkovský et al., 2017). (Daniel Fitzpatrick et al., 2013) propose a knowledge management model: the Reference Architecture – Enterprise Knowledge Architecture (RA-EKI), which comprises high-level specifications for several ontology-driven applications such as Natural Language Processing (NLP), knowledge extraction and data integration. RA-EKI comprises a mid-level ontology called the multi-domain ontology, a form of ontology more specific than a foundational ontology but more generic than a domain ontology (Obrst, Chase, & Markeloff, 2012) (Zuanelli, 2017). The multi-domain ontology is designed to fulfill the requirements of various semantic web-based applications, such as inferential or cognitive applications.
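To make the notion of data integration described above more concrete, the following minimal sketch shows two sources that use different vocabularies for the same concepts being harmonized into one shared vocabulary. The source names, field names and shared concept names are all hypothetical and chosen only for illustration; this is a plain dictionary mapping, not the RA-EKI approach or an ontology-based implementation.

```python
# Hypothetical mappings from source-specific field names to shared concepts.
# None of these names come from the paper; they only illustrate the idea.
CRM_TO_SHARED = {"cust_name": "party_name", "tel": "phone_number"}
BILLING_TO_SHARED = {"client": "party_name", "phone": "phone_number"}


def harmonize(record, mapping):
    """Rename the fields of one source record to the shared vocabulary.

    Fields without a mapping are dropped, because their meaning is unknown
    to the shared vocabulary.
    """
    return {mapping[key]: value for key, value in record.items() if key in mapping}


crm_record = {"cust_name": "Acme Inc.", "tel": "555-0100"}
billing_record = {"client": "Globex", "phone": "555-0199", "internal_id": 7}

# After harmonization, both records can be read as if they came from one source.
integrated = [
    harmonize(crm_record, CRM_TO_SHARED),
    harmonize(billing_record, BILLING_TO_SHARED),
]

for row in integrated:
    print(row)
```

In this sketch the shared vocabulary plays the role that a common ontology would play in a semantic web setting: each source keeps its own field names, and only the mapping needs to know how they relate.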
State of the art
As stated in the previous section, this project aims to elicit agnostic CODPs from data model patterns. After this project, these agnostic CODPs are eventually to be axiomatized and developed as a multi-domain ontology for performing data integration. A dual method qualitative research process is proposed to perform the required elicitation of agnostic CODPs. Although no similar dual method qualitative research with the purpose of eliciting agnostic CODPs was found, related publications were extracted and examined, as indicated in this section.
(Simsion, Milton, & Shanks, 2012) and (Anglim, Milton, Rajapakse, & Weber, 2009) used qualitative research approaches based on interviews or surveys to acquire insight from data modeling professionals. (Anglim et al., 2009) studied the current and expected practice in data modeling. Anglim and co-authors elicited insight into high-level data modeling from experienced data modelers. Their approach, with a documented method, involved semi-structured interviews. This research reached out to practitioners by contacting professional associations. (Simsion et al., 2012) directly addressed the issue of the purpose of data modeling, i.e. descriptive versus design, which this project intends to explore in a future phase as a variable that may be associated with the semantic heterogeneity problem. Simsion and his co-authors also diligently documented the research method, which used surveys intended for practitioners and semi-structured interviews intended for data modeling «thought leaders» identified by name in the publication. The research design does not explain how the «thought leaders» were selected. This research attempted to identify the purpose of data modeling: either descriptive, i.e. to foster communication of requirements, or design, i.e. to create semantic structures such as databases. Following the synthesis of the survey and interview data, (Simsion et al., 2012) concluded that data modeling was better characterized as design.
In (Olivé, 2017), the author covers a new variation of the notion of ontological agnosticism, a concept similar to the multi-domain ontology. This research proposes the concept of a universal ontology. The paper elicits positive and negative reactions from the scientific community with regard to an ontology that is intended to solve semantic integration, which we interpreted as semantic heterogeneity.
With respect to SLRs, only seven papers that used the SLR approach on the broad subject of ontologies were identified, using the following search query in the Google Scholar publication database: «allintitle: ontology «systematic literature survey» OR «systematic survey» OR «systematic literature review» OR «systematic review»».
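The search query quoted above can be assembled programmatically so that the list of phrases is easy to review or extend. The sketch below is only an illustration: the function name is hypothetical, and it assumes that the guillemets around each phrase stand for the straight double quotes normally used for exact-phrase search.

```python
# Phrases taken from the query quoted above.
PHRASES = [
    "systematic literature survey",
    "systematic survey",
    "systematic literature review",
    "systematic review",
]


def build_allintitle_query(keyword, phrases):
    """Return an allintitle query: the keyword followed by OR-joined quoted phrases.

    Assumption: exact phrases are delimited with straight double quotes.
    """
    quoted = [f'"{phrase}"' for phrase in phrases]
    return f"allintitle: {keyword} " + " OR ".join(quoted)


query = build_allintitle_query("ontology", PHRASES)
print(query)
```

Keeping the phrases in a list documents the practical screen explicitly, which helps when the search has to be repeated or audited later.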
Overview of the research process design
To answer the research question that pertains to eliciting agnostic CODPs to solve the semantic heterogeneity problem, the project is using a dual method qualitative research process. This dual method research process, while attempting to solve the problem, also intends to satisfy the trustworthiness criteria.
The research protocol used for both the SLR and phenomenological methods follows the same techniques for the analysis and synthesis stages. The exceptions, i.e. the differences between the SLR and phenomenological methods, are:
• The techniques used to select the knowledge sources. In the case of the SLR, a practical screen is designed to systematically and rigorously select the publications to be studied to answer the research question. In the case of the phenomenological method, the selection criterion, for example, targeted practitioners with a minimum of eight years’ experience in conceptualizing who speak either French or English;
• The elicitation of the knowledge from the knowledge sources. In the case of the SLR, a note-taking approach allows the sought concepts to be extracted from publications. In the case of the phenomenological method, notes are taken and the conversations are recorded.
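The participant selection criterion mentioned in the first item above (a minimum of eight years’ experience and the ability to speak French or English) can be expressed as a simple filter. This is a minimal sketch with invented candidate data; the class, field and function names are hypothetical, and the actual protocol may apply further criteria that are not listed here.

```python
from dataclasses import dataclass, field

# Thresholds taken from the selection criterion described above.
MIN_YEARS_EXPERIENCE = 8
ACCEPTED_LANGUAGES = {"French", "English"}


@dataclass
class Practitioner:
    """Hypothetical record describing one candidate participant."""
    name: str
    years_experience: int
    languages: set = field(default_factory=set)


def is_eligible(candidate):
    """True if the candidate has enough experience and speaks an accepted language."""
    has_experience = candidate.years_experience >= MIN_YEARS_EXPERIENCE
    shares_language = bool(candidate.languages & ACCEPTED_LANGUAGES)
    return has_experience and shares_language


# Invented candidates, for illustration only.
candidates = [
    Practitioner("P1", 12, {"French"}),
    Practitioner("P2", 5, {"English"}),
    Practitioner("P3", 20, {"Spanish"}),
    Practitioner("P4", 8, {"English", "French"}),
]

selected = [c.name for c in candidates if is_eligible(c)]
print(selected)
```

Writing the criterion as an explicit predicate also supports the audit trail, since the same rule can be re-applied to the full candidate list at any time.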
Conclusion and future work
The research question motivated the inquiry into the elicitation of agnostic concepts that can be used as agnostic CODPs in a multi-domain ontology. Although positivist or hypothetico-deductive criteria of validation cannot apply here in qualitative research (Guba & Lincoln, 2001), evidence is emerging that the findings of this paper’s phenomenological research method are significantly consistent, in their similarity, with two other sources: this paper’s companion publication (Fitzpatrick, Ratté, et al., 2018a) and the best practice research on CODPs in (Blomqvist, 2010). This significant similarity in the outcome of qualitative research, as in the case of this project’s two companion papers along with Blomqvist’s research on CODP best practices, is referred to as triangulation. Anney in (Anney, 2014) recommends that one or two such triangulations be demonstrated as a criterion to establish the research’s trustworthiness. The authors posit that, although this is an initial phase of a multi-phase project, the outcome of this phenomenological study demonstrated a credible inductive process in eliciting data model patterns from experienced practitioners, twenty out of twenty-two of whom may be considered experts based on the criteria established in (S. Ahmed, Hacker, & Wallace, 2005). Furthermore, the companion SLR is also followed by two use case papers: (Fitzpatrick, Coallier, et al., 2018) and (Fitzpatrick, Ratté, et al., 2018d). These use cases allow the transferability of the SLR to be determined. (Anney, 2014) indicates that transferability is the equivalent, for qualitative research, of positivism’s generalizability criterion. Anney also posits that «thick description» and purposeful sampling facilitate transferability. Along with the involvement of several co-researchers in the execution of the phenomenological protocol (use of peer debriefing) (C. Moustakas, 1994) (Anney, 2014), an audit trail, thick documentation and the application of Okoli’s best practice approach for conducting qualitative research, this research has shown evidence of trustworthiness following the guidelines established in (Guba & Lincoln, 2001).