4.7 Article

Open Data Categorization Based on Formal Concept Analysis

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TETC.2019.2919330

Keywords

Portals; Metadata; Government; Text categorization; Software; Formal concept analysis; Machine learning; e-government; open data; data categorization; formal concept analysis


Government institutions have released a large number of datasets on their open data portals, categorized by different criteria. However, missing information makes it difficult to find datasets by every one of those criteria, and the problem worsens as the number of datasets on the portals grows. The EODClassifier framework is introduced to suggest the best-matching category for a dataset; it uses formal concept analysis to categorize uncategorized open datasets.
Government institutions have released a large number of datasets on their open data portals, in line with data transparency and open government initiatives. To make the datasets more accessible and visible, these portals categorize them by different criteria such as publishers, categories, formats, and descriptions. However, some of this information is often missing, which makes it impossible to find datasets by every one of these criteria. As a result, as the number of datasets on the portals keeps growing, it is getting harder to obtain the desired information. This paper addresses the issue by introducing the EODClassifier framework, which suggests the best-matching category to which a dataset should belong. It relies on formal concept analysis to generate a data structure that reveals the shared conceptualization originating from the usage of tags, and it uses that structure as a knowledge base to categorize uncategorized open datasets.
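
To make the idea of using formal concept analysis over tags more concrete, the following is a minimal illustrative sketch, not the EODClassifier implementation. It assumes a toy formal context in which the objects are datasets and the attributes are their tags, enumerates the formal concepts by brute force, and then suggests a category for an untagged-by-category dataset by finding the most specific concept whose shared tags are all present on that dataset. The dataset names, tags, categories, and the function names `formal_concepts` and `suggest_category` are hypothetical, and the majority-vote matching rule is an assumption made for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy context (not from the paper): dataset -> set of tags.
CONTEXT = {
    "air_quality": {"environment", "pollution", "air"},
    "water_quality": {"environment", "pollution", "water"},
    "school_list": {"education", "schools"},
    "budget_2018": {"finance", "budget"},
}

# Hypothetical categories of the already categorized datasets.
CATEGORIES = {
    "air_quality": "Environment",
    "water_quality": "Environment",
    "school_list": "Education",
    "budget_2018": "Finance",
}


def extent(tags, context):
    """Return the datasets that carry every tag in `tags`."""
    return frozenset(d for d, t in context.items() if set(tags) <= t)


def intent(datasets, context):
    """Return the tags shared by every dataset in `datasets`."""
    if not datasets:
        # By convention, the empty set of objects shares all attributes.
        return frozenset().union(*context.values())
    return frozenset(set.intersection(*(set(context[d]) for d in datasets)))


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of datasets.

    Brute force and exponential in the number of datasets; acceptable only
    for a toy context like the one above.
    """
    concepts = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            shared = intent(subset, context)
            concepts.add((extent(shared, context), shared))
    return concepts


def suggest_category(tags, context, categories):
    """Suggest a category from the most specific concept matched by `tags`.

    Assumed rule: a concept matches when its non-empty intent is contained
    in `tags`; the suggestion is the majority category within its extent.
    """
    tags = set(tags)
    matches = [
        (ext, shared)
        for ext, shared in formal_concepts(context)
        if ext and shared and shared <= tags
    ]
    if not matches:
        return None
    # Prefer the concept with the largest intent; sort keys keep ties stable.
    ext, _ = max(matches, key=lambda c: (len(c[1]), sorted(c[1]), sorted(c[0])))
    votes = Counter(categories[d] for d in ext if d in categories)
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    print(suggest_category({"environment", "pollution", "noise"}, CONTEXT, CATEGORIES))
```

In this toy context, a dataset tagged `environment`, `pollution`, and `noise` matches the concept whose intent is `{environment, pollution}`, so the sketch suggests the category shared by the two datasets in that concept's extent. A real portal would need a scalable lattice-construction algorithm rather than subset enumeration.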

Authors

