Change Detection in High Dimensional Data Streams

Master's Thesis

Topic focus: Big Data and Service Science, Cloud Computing and Cloud Services, Machine Learning, Real Time Data Management, Software Development, Knowledge and Information Services
Degree programs: Information Technology, Information Engineering and Management, Mathematics, Mechatronics, Related degree programs, Business Informatics, Business Mathematics

Context

In real-world applications such as manufacturing, environmental analysis, and e-commerce, data streams are high-dimensional and evolve over time: Weather stations record multivariate data that exhibits natural seasonality. Online retailers track customer behavior through various indicators, which may change with the customer's interests. Cars and production machines monitor their internal state by analyzing multivariate sensor data. Detecting change in data streams, i.e., concept drift, is a key component in designing reliable and adaptive systems, because a system can only react to a change it has detected.

Although numerous techniques for detecting concept drift exist, most of them handle only low-dimensional data. They are thus unable to deal with the two key challenges that this thesis addresses: First, data streams are generally high-dimensional, and concept drift might be visible only in selected subspaces of the data. Second, different types of drift may occur simultaneously at various time scales and in arbitrary subspaces. To the best of our knowledge, no existing approach addresses both of these challenges. This thesis therefore aims to develop a framework that identifies simultaneous concept drifts in multivariate data streams. In the process, the student acquires state-of-the-art knowledge and practical experience in data stream monitoring and related fields.
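To make the first challenge concrete, the following minimal Python sketch simulates a stream in which only 2 of 20 dimensions drift and then compares two windows dimension by dimension. It is an illustration only, not the framework to be developed in the thesis. All names (`make_stream`, `window_z_score`, `drifting_dimensions`), the synthetic Gaussian data, the known change point, and the threshold of 5 are assumptions chosen for the example. It also considers only one-dimensional subspaces.

```python
import random
import statistics


def make_stream(n=400, dims=20, drift_at=200, drift_dims=(3, 7), shift=2.0, seed=0):
    """Synthetic stream (assumption): standard Gaussian noise in every dimension,
    with a mean shift in `drift_dims` from time step `drift_at` onward."""
    rng = random.Random(seed)
    stream = []
    for t in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(dims)]
        if t >= drift_at:
            for d in drift_dims:
                row[d] += shift
        stream.append(row)
    return stream


def window_z_score(reference, current):
    """Absolute difference of the two window means, divided by its standard error."""
    mean_ref = statistics.fmean(reference)
    mean_cur = statistics.fmean(current)
    var_ref = statistics.variance(reference)
    var_cur = statistics.variance(current)
    se = (var_ref / len(reference) + var_cur / len(current)) ** 0.5
    return abs(mean_cur - mean_ref) / se if se > 0 else 0.0


def drifting_dimensions(stream, split, threshold=5.0):
    """Return the dimensions whose mean differs markedly before and after `split`."""
    reference, current = stream[:split], stream[split:]
    flagged = []
    for d in range(len(stream[0])):
        z = window_z_score([row[d] for row in reference], [row[d] for row in current])
        if z > threshold:
            flagged.append(d)
    return flagged


stream = make_stream()
flagged = drifting_dimensions(stream, split=200)
print(flagged)
```

The sketch assumes that the change point is known and that drift is a simple mean shift in individual dimensions. A realistic detector would have to find the change point itself, cope with drift that appears only in combinations of dimensions, and handle several drifts at different time scales, which is exactly the gap described above.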

Tasks

  • In-depth literature review of concept drift detection in multivariate data streams and related fields
  • Development of a framework for the detection of concept drift in multivariate data streams
  • Experimental evaluation of the developed approach against baselines and state-of-the-art algorithms

We offer

  • Continuous and thorough mentoring of the student
  • A highly motivated, fun team and constructive teamwork

We expect

  • Basic knowledge of Python programming and data mining
  • Ability to plan and work independently
  • Very good knowledge of German or English
  • High level of motivation and enthusiasm

Application

We look forward to receiving your application. Please send it to Marco Heyden (heyden@dont-want-spam.fzi.de).

Please include your transcript of records and your CV.

Further information

  • Supervision in cooperation with Edouard Fouché from IPD
  • Responsible institute at KIT for WIWI faculty: AIFB | Prof. Dr. York Sure-Vetter
  • Responsible institute at KIT for Computer Science faculty: IPD | Prof. Dr.-Ing. Klemens Böhm