Semantic Synthesis of Analytical Pipelines: Automated Workflow Composition for Predictive Data Science
- Authors
Muneeb Uddin Syed
- Keywords:
- Automated workflow synthesis, Data science pipelines, Ontology modeling, SAT solving, Answer Set Programming, Jupyter notebooks
- Abstract
The proliferation of data science libraries has created a complexity barrier for practitioners seeking to construct effective analytical workflows. This paper introduces a constraint-driven framework for automated pipeline generation that operates at a semantic abstraction layer above technical implementations. By modeling domain-specific ontologies that cover data manipulation, feature engineering, and machine learning operations, the system synthesizes executable Jupyter notebook workflows through SAT-based reasoning. An alternative Answer Set Programming backend extends synthesis to knowledge-intensive search problems. Evaluation on three canonical tasks (housing price regression, survival classification, and sentiment analysis) demonstrates the framework's ability to generate valid workflows and reveals trade-offs between automation flexibility and user control. The approach enables domain experts to rapidly prototype analytical solutions without deep programming expertise, accelerating the data science lifecycle from hypothesis formulation to model evaluation.
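To make the idea of constraint-driven synthesis concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a toy ontology in which each operation maps one abstract data type to another, and it replaces the SAT and Answer Set Programming backends with plain exhaustive enumeration. All operation names, type names, and the `synthesize` helper are hypothetical.

```python
from itertools import product

# Hypothetical mini-ontology: operation name -> (input type, output type).
OPERATIONS = {
    "load_csv": ("FilePath", "RawTable"),
    "impute_missing": ("RawTable", "CleanTable"),
    "drop_missing": ("RawTable", "CleanTable"),
    "encode_features": ("CleanTable", "FeatureMatrix"),
    "train_regressor": ("FeatureMatrix", "Model"),
    "evaluate_model": ("Model", "Report"),
}


def synthesize(source, target, max_len, must_use=()):
    """Enumerate type-correct operation chains from source to target.

    Brute-force stand-in for a solver: every sequence of up to max_len
    operations is checked against the typing and usage constraints.
    """
    solutions = []
    for length in range(1, max_len + 1):
        for seq in product(OPERATIONS, repeat=length):
            current = source
            valid = True
            for op in seq:
                in_type, out_type = OPERATIONS[op]
                if in_type != current:  # input type must match the previous output
                    valid = False
                    break
                current = out_type
            # Keep chains that reach the target and include every required operation.
            if valid and current == target and all(op in seq for op in must_use):
                solutions.append(list(seq))
    return solutions


all_workflows = synthesize("FilePath", "Model", max_len=4)
constrained = synthesize("FilePath", "Model", max_len=4, must_use=("impute_missing",))
```

In this sketch the unconstrained query yields two candidate workflows, one per missing-value strategy, and the `must_use` constraint narrows them to one. That mirrors, in miniature, the trade-off between automation flexibility and user control mentioned above. A real solver backend would encode the same typing and usage constraints declaratively rather than enumerating sequences.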
- Published
- 2026-06-01
- Issue
- Vol. 1 No. 2 (2026)
- Section
- Articles
- License
Copyright (c) 2026 International Journal of Intelligent Systems and Data Science

This work is licensed under a Creative Commons Attribution 4.0 International License.
