With the growing adoption of machine learning (ML) across applications and disciplines, a synergy between the database (DB) systems and ML communities has emerged. Steps involved in an ML pipeline, such as data preparation and cleaning, feature engineering, and management of the ML lifecycle, can benefit from research conducted by the data management community. For example, the management of the ML lifecycle requires mechanisms for modeling, storing, and querying ML artifacts. Moreover, in many use cases, pipelines require a mixture of relational and linear algebra operators, raising the question of whether a seamless integration between the two algebras is possible. In the opposite direction, ML techniques are being explored in core components of database systems, e.g., query optimization, indexing, and monitoring. Traditionally hard problems in databases, such as cardinality estimation, or problems requiring heavy human supervision, such as DB administration, might benefit more from learning algorithms than from rule-based or cost-based approaches.

The workshop aims to bring together researchers and practitioners at the intersection of DB and ML research, providing a forum for DB-inspired or ML-inspired approaches that address challenges encountered in each of the two areas. In particular, we welcome new research topics combining the strengths of both fields.

Topics of particular interest in the workshop include, but are not limited to:

  • Data collection and preparation for ML applications
  • Declarative machine learning on databases, data warehouses or data lakes
  • Hybrid optimization techniques for databases and machine learning
  • Model-aware data discovery, cleaning, and transformation
  • Benchmarking ML-oriented data management systems (data augmentation, data cleaning, etc.)
  • Data management during the lifecycle of ML models
  • Novel data management systems for accelerating training and inference of ML models
  • DB-inspired techniques for modeling, storage and provenance of ML artifacts
  • Learned database design, configuration and tuning
  • Machine learning for query optimization
  • Applied machine learning/deep learning for data integration
  • ML-enabled data exploration and discovery in data lakes
  • ML functionality inside DBMS


The workshop will accept both regular papers and short papers (work in progress, vision/outrageous ideas). All submissions must be prepared in accordance with the IEEE template available here. The following are the page limits:

Regular papers: 8 pages
Short papers: 4 pages

All submissions (in PDF format) should be submitted via EasyChair.


All deadlines are 11:59 PM PST.

Submission deadline: 27 January 2022 (extended from 14 January 2022)
Author notification: 22 February 2022
Camera-ready version: 8 March 2022
Workshop day: 9 May 2022


Martin Grohe


RWTH Aachen University, Germany

Martin Grohe is a computer scientist known for his research on parameterized complexity, mathematical logic, finite model theory, the logic of graphs, database theory, and descriptive complexity theory. He is a University Professor of Computer Science at RWTH Aachen University, where he holds the Chair for Logic and Theory of Discrete Systems. Grohe won the Heinz Maier-Leibnitz Prize, awarded by the German Research Foundation, in 1999. He was elected an ACM Fellow in 2017 for "contributions to logic in computer science, database theory, algorithms, and computational complexity."

Paul Groth


University of Amsterdam, The Netherlands




Program committee:

  • Hazar Harmouch - Hasso Plattner Institute, Germany
  • Roee Shraga - Technion - Israel Institute of Technology, Israel
  • Syed Muhammad Fawad Ali - Poznan University of Technology, Poland
  • Rana Alotaibi - University of California San Diego, USA
  • Christos Koutras - Delft University of Technology, The Netherlands
  • Zoi Kaoudi - Qatar Computing Research Institute, Qatar
  • Marios Fragkoulis - University of Ioannina, Greece
  • Nikolaos Vasiloglou - RelationalAI
  • Stefan Manegold - CWI, The Netherlands

Attendance Support

We are happy to announce attendance support for students attending DBML 2022, which provides free workshop registration for virtual attendees. Because funding is limited, there is a strong focus on students from universities in developing countries (as listed by ACM).

Who can apply

First authors of papers accepted at DBML 2022 who are full-time students (graduate or undergraduate) affiliated with a university can apply.

How to apply

After the paper notification, please send your paper number and student certificate to Dr. Hai (R.Hai@tudelft.nl).