|Thesis Title||Development of Data Quality Framework for Linked Data Readiness Assessment of Thailand Open Government Data|
|Advisor||Assoc. Prof. Dr. Poonpong Boonbrahm|
In Thailand, the Open Government Data (OGD) project has been established, and many government sectors have participated in it. As with other OGD projects, data quality problems have been reported in Thailand's open government data (TOGD). To understand the underlying problems within the data, data quality issues should be systematically assessed. Moreover, existing data quality assessment frameworks have mostly focused on content quality. To the best of our knowledge, none has provided an assessment framework for open data readiness with respect to the concept of linked open data (LOD). This research therefore aims to determine the current quality status of TOGD and to develop a quantitative assessment model that evaluates the readiness level for linking data across datasets. Improving the data linking readiness level could be a useful first step towards moving OGD to LOD. The base assessment model for data quality is the open data quality assessment proposed by Vetro et al. (2016). Additional aspects are added to focus more on the schema quality of the open data, since the schema plays a crucial role in linking datasets. This work also proposes an assessment model for semantic type degree and for dataset linking types towards linked open data, with linkability metrics for each type. A semantic type in a dataset is a conceptual meaning that may be used as a key to match data of the same semantic type in another dataset; hence, it may be used as a means of representing linking potential.
Moreover, four types of data linking are defined: merging, merging (added), inlinking and ex-linking. An assessment framework to measure the linkability degree of each type is developed. To demonstrate the usefulness of the developed assessment models, they were applied to datasets from TOGD, i.e., Data.go.th. The assessment results reflected the data quality and linkability of TOGD. Furthermore, correlations between data quality, semantic type degree and linkability were measured. The correlation scores indicated that both data quality and semantic type degree were positively correlated with linkability, with semantic type degree showing the stronger correlation. This result suggests that, while both data quality and semantic type degree can effectively be used to assess the linking potential of datasets, the semantic type degree of a dataset has a greater influence on its linkability degree.
Keywords: data quality assessment, open government data, linked open data, data characteristic