Software Engineering & AI (SE&AI)
Description
This research area explores the synergies between Software Engineering and Artificial Intelligence (AI) systems, with special emphasis on Machine Learning-Based Systems (MLS). Our main aim is to apply Software Engineering principles and knowledge to master the development of MLS. We are also exploring how Generative AI (Gen AI) can support Software Engineering activities.
Team
Outcomes/Main Contributions
Research:
- Applying Software Engineering principles and knowledge to improve the design and development of Machine Learning-Based Systems (MLS).
- Assessment of the application of software design principles to the development of Machine Learning pipelines, with the aim of improving ML pipeline quality. López, Lidia, et al. “Insights on the Use of Software Design Principles in Machine Learning Pipelines” (to appear in PROFES 2024, no link available yet).
- Assessment of the trustworthiness of ML models. Manzano, M., Ayala, C.P., Gómez, C.: TrustML: A Python package for computing the trustworthiness of ML models. SoftwareX 26: 101740 (2024).
- Supporting end-to-end schema integration from heterogeneous and semi-structured data sources for conducting comprehensive data analysis for MLS. Flores, Javier, et al. "Incremental schema integration for data wrangling via knowledge graphs." Semantic Web 15.3 (2024): 793-830.
- Software engineering artifacts (e.g., requirements, architecture, changes) beyond an ML component and its standard constituents:
- State of the art on Software Engineering for AI-Based Systems: A Survey.
- Research Directions for Developing and Operating Artificial Intelligence Models in Trustworthy Autonomous Systems.
- Requirements engineering for AI.
- Software Architecture and Software Design Decisions for AI.
- Software changes in AI systems.
- Testing of AI systems.
- Experiences with the development and operation of Machine Learning Software Systems as a whole, specifically in the following applications:
- AI for Software Engineering:
Technology Transfer:
- Development of MLS-Toolbox, a set of tools to support ML pipeline development:
- MLS-Toolbox on GitHub: Includes a low-code application for ML pipeline code generation where the user can define a pipeline graphically and generate Python code.
- A preliminary version of a quality assessment tool to assess ML pipelines written in Python.
- TrustML: A Python package for computing the trustworthiness of ML models. This package supports evaluating ML models' trustworthiness both during their development process and in production environments.
Teaching:
- Teaching MLOps in Higher Education through Project-Based Learning:
- Experiences from Training Future Machine Learning Engineers with Software Engineering Practices.
- The state-of-the-art outcomes of this research area (and of the Sustainability of AI Systems research area) are integrated into two subjects:
- “Machine Learning Systems in Production” (MLOps) of the Master of Data Science at UPC.
- “Advanced Topics of Data Engineering 2” (TAED2) of the Bachelor's degree in Data Science and Engineering at UPC.
Community:
- Involvement in SE&AI conferences:
Projects Overview
Title | Project Type | Goal |
---|---|---|
HIVEMIND (2025-2027) | Research | HIVEMIND's main goal is to promote responsible software engineering practices that accelerate all stages of the software development lifecycle (SDLC), leveraging novel AI and data technologies. |
DOGO4ML (2021-2025) | Research | DOGO4ML proposes a holistic end-to-end framework to develop, operate and govern ML-based software systems (MLSS) and their data. This framework revolves around the DevDataOps lifecycle, which unifies two software lifecycles: a DevOps lifecycle and a DataOps lifecycle. |
AI4Software (2023-2025) | Research Network | Fostering collaboration between national research groups to enhance the application of AI methods within the software development lifecycle. |
Collaborations
Contact
Lidia López
Silverio Martínez-Fernández