StreamConnect: Ingesting Historic and Real-Time Data into Unified Streaming Architectures

P. Zehnder and D. Riemer
Joint Proceedings of the Web Stream Processing workshop (WSP 2017) and the 2nd International Workshop on Ontology Modularity, Contextuality, and Evolution (WOMoCoE 2017)
The Web of Things provides a steadily increasing number of both real-time and historic data sources. Yet widespread standards are missing, and the heterogeneity of data formats and communication protocols makes the integration of such sources a challenging task that often requires manual programming effort. This paper presents a novel, lightweight, semantics-based approach to quickly connect heterogeneous data sources to stream processing systems. Our main contributions are i) a new model to represent characteristics of data streams and data sets, such as schema and quality, independently of the actual run-time format, ii) generic data adapters and methods to automatically discover these characteristics at run time, and iii) a distributed architecture to pre-process (e.g., clean and filter) raw data coming from these adapters directly on the edge before the data is processed by a stream processing engine. Our contribution eases the ingestion of batch and real-time data into unified streaming architectures.
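The three contributions named in the abstract can be illustrated with a minimal Python sketch. It is not the StreamConnect implementation or its API: every name here (`csv_adapter`, `json_lines_adapter`, `discover_schema`, `preprocess`) and the sample data are hypothetical, and the schema model is reduced to a mapping from field name to type name. Under those assumptions, the sketch shows a historic CSV data set and a live JSON stream being turned into the same format-independent events, their schema being discovered at run time, and the events being cleaned and filtered before they would reach a stream processing engine.

```python
import csv
import io
import json
from itertools import chain
from typing import Any, Callable, Dict, Iterable, Iterator

Event = Dict[str, Any]  # format-independent event representation (assumption)


def _coerce(value: str) -> Any:
    """Turn numeric strings into floats; leave everything else unchanged."""
    try:
        return float(value)
    except ValueError:
        return value


def csv_adapter(text: str) -> Iterator[Event]:
    """Hypothetical adapter for a historic data set stored as CSV."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {key: _coerce(value) for key, value in row.items()}


def json_lines_adapter(text: str) -> Iterator[Event]:
    """Hypothetical adapter for a real-time source emitting one JSON object per line."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


def discover_schema(events: Iterable[Event]) -> Dict[str, str]:
    """Infer field names and type names from observed events."""
    schema: Dict[str, str] = {}
    for event in events:
        for key, value in event.items():
            schema.setdefault(key, type(value).__name__)
    return schema


def preprocess(events: Iterable[Event],
               predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Drop events with missing values (clean), then apply a predicate (filter)."""
    for event in events:
        if all(value is not None for value in event.values()) and predicate(event):
            yield event


# Made-up sample data: one historic batch and one live stream.
historic = "sensor,temp\na,21.5\nb,99.0\n"
live = '{"sensor": "a", "temp": 22.0}\n{"sensor": "c", "temp": null}\n'

schema = discover_schema(csv_adapter(historic))
unified = list(preprocess(
    chain(csv_adapter(historic), json_lines_adapter(live)),
    lambda event: event["temp"] < 50,
))
```

Both adapters emit plain dictionaries, so the cleaning and filtering step does not need to know whether an event came from the batch or from the live source. That is the sense in which the sketch treats historic and real-time data uniformly.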
Entered by
Philipp Zehnder