Conference Paper, Full Review

Converting data organised for visual perception into machine-readable formats

Document type

Text/Conference Paper

Date

2024

Publisher

Gesellschaft für Informatik e.V.

Abstract

Spreadsheets store an extraordinary amount of important data. They are popular largely because they are easy to use and give users a great deal of flexibility in how they store their data. Users often apply a variety of layout techniques to make the data easy for humans to understand, but these layouts create problems for traditional Extract-Transform-Load (ETL) tools. We propose a program that allows users to extract data from Excel files by selecting the cells that contain the data and metadata, thereby determining the data hierarchy. We have used this program to extract data from the Agricultural Structure Survey on land use and livestock in Germany. The survey does not follow a nationwide standard, so the structure of the data differs greatly between the federal states, which makes it a good benchmark.
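
To illustrate the idea of deriving a data hierarchy from selected data and metadata cells, the following is a minimal sketch and not the authors' program. It assumes the sheet has already been read into a list of rows, that merged header cells appear as a label followed by empty (`None`) cells, and that the user's selection is passed in as row and column indices. The function names, parameters, and sample values are hypothetical and chosen only for illustration.

```python
from typing import Any

Grid = list[list[Any]]


def forward_fill(row: list[Any]) -> list[Any]:
    """Propagate a merged header label across the empty cells to its right."""
    filled, last = [], None
    for cell in row:
        if cell is not None:
            last = cell
        filled.append(last)
    return filled


def to_records(
    grid: Grid,
    header_rows: list[int],
    label_col: int,
    data_rows: list[int],
    data_cols: list[int],
) -> list[dict[str, Any]]:
    """Turn a visually laid-out grid into one flat record per data cell.

    header_rows and label_col are the selected metadata cells;
    data_rows and data_cols delimit the selected data cells.
    """
    headers = [forward_fill(grid[r]) for r in header_rows]
    records = []
    for r in data_rows:
        for c in data_cols:
            value = grid[r][c]
            if value is None:  # skip empty data cells
                continue
            records.append(
                {
                    "row_label": grid[r][label_col],
                    # header path from the outermost to the innermost level
                    "column_path": tuple(h[c] for h in headers),
                    "value": value,
                }
            )
    return records


# Made-up sample sheet with a two-level column header (not survey data).
grid = [
    [None, "Arable land", None, "Livestock"],
    [None, "Wheat", "Maize", "Cattle"],
    ["Region A", 120, 80, 300],
    ["Region B", 95, None, 210],
]

records = to_records(
    grid, header_rows=[0, 1], label_col=0, data_rows=[2, 3], data_cols=[1, 2, 3]
)
for record in records:
    print(record)
```

Each record carries its full header path, so tables that are laid out differently can be mapped onto the same flat structure once the metadata cells have been identified.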

Description

Aue, Alexander; Ackermann, Andrea; Röder, Norbert (2024): Converting data organised for visual perception into machine-readable formats. 44. GIL - Jahrestagung, Biodiversität fördern durch digitale Landwirtschaft. DOI: 10.18420/giljt2024_59. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-738-8. pp. 179-184. Stuttgart. 27-28 February 2024

Keywords

semi-structured data, ETL, no-code, Excel, spreadsheets, data harmonisation
