The overarching objective of NFDI4DS is the development, establishment, and sustainment of a national research data infrastructure (NFDI) for the Data Science and Artificial Intelligence community in Germany. This will also deliver benefits for a wider community requiring data analytics solutions, within the NFDI and beyond. The key idea is to work towards increasing the transparency, reproducibility and fairness of Data Science and Artificial Intelligence projects, by making all digital artifacts available, interlinking them, and offering innovative tools and services.
The NFDI4DS Lecture Series fosters collaboration, exchange of ideas, and discussions among various national and international stakeholders towards increasing transparency, reproducibility, and fairness of Data Science and Artificial Intelligence projects.
Find recordings of previous lectures in the TIB AV portal and on our NFDI4DS YouTube channel.
Overview of all Lectures
- Lecture 6 by Yiannis Papadopoulos: Safety of AI Systems with Executable Causal Models and Statistical Data Science
- Lecture 5 by Suchith Anand: The Ethics of AI and Data in Higher Education
- Lecture 4 by Marco Jahn: Software ate the world - and Open Source is eating software
- Lecture 3 by Beatriz Serrano-Solano: Introduction to AI4Life
- Lecture 2 by Michael Barton: The Open Modeling Foundation
- Lecture 1 by Silvio Peroni: OpenCitations
Lecture 6: Safety of AI Systems with Executable Causal Models and Statistical Data Science
AI systems that learn from data present a unique challenge for safety, as there is no specific design artifact, model, or code to analyse and verify. The safety assurance challenges become even more complex in cooperative intelligent systems, like collaborative robots and autonomous vehicles. These systems are often loosely interconnected, allowing them to form and dissolve configurations dynamically. Evaluating the consequences of failures in largely unpredictable configurations is a daunting task. Intentional or unintentional interactions between systems, along with newly learned behaviours and varying environmental conditions, can lead to unpredictable or emergent behaviours. Achieving complete safety assurance of such systems of systems at the design stage through traditional model-based methods is unfeasible. In this talk, I will explore these challenges and introduce executable causal models and statistical techniques that may help address these emerging issues.
Speaker: Yiannis Papadopoulos
Professor Papadopoulos is a foremost international expert on safety of computer systems including safety of AI and intelligent systems. He is leading a research group on Dependable Intelligent Systems and has pioneered a method and set of tools for model-based safety and reliability assessment and evolutionary optimisation of complex engineering systems known as Hierarchically Performed Hazard Origin and Propagation Studies.
Professor Papadopoulos is currently developing new model-based and data-driven technologies for dynamic safety assurance of autonomous and cooperative systems, including swarms of robots and autonomous cars. This work uses cutting-edge statistical methods for improving the safety of AI, including the safety of Machine Learning, Deep Learning and Large Language Models.
Lecture 5: The Ethics of AI and Data in Higher Education
The presentation will introduce the Ethical Data Initiative, which provides a neutral space to bring together diverse actors and stakeholders, shaping the future of data governance. In doing so, we aim to increase equality and inclusivity in the data space, building data confidence and empowering the digital citizens of tomorrow. The presentation will also share information about the Campaign for Data Ethics in Education. The Campaign advocates for the integration of data ethics in all higher education courses focused on data science and research. It aims to educate the next generation of data and research professionals about their legal and ethical obligations when it comes to using, reusing, and sharing data.
Speaker: Suchith Anand
Dr Suchith Anand is an internationally recognised expert in sustainable development and geospatial science, providing guidance and advice to governments and international organisations on data science, data ethics, open education, open data and open science policies. He has authored a wide range of publications, from journal papers, scientific reports, and book chapters to international strategy documents. He is passionate about education enabling an inclusive society which supports a full commitment to equality, diversity and the public good. He has positioned his work to serve as a bridge between academia and the worlds of policy and practice. He is a UN SDG Volunteer and Advocate. His recent research has focused on ‘Leadership for a More Ethical, Equitable, and Just World’.
Slides: https://zenodo.org/doi/10.5281/zenodo.10721178
Lecture 4: Software ate the world - and Open Source is eating software!
“Software is eating the world” [1] - The famous quote and article by Marc Andreessen, co-founder of Netscape, is now 12 years old and it is fair to say: he was right. Software is the driver of the modern economy and pervasive throughout all industries. Taking it one step further, we argue that while software has eaten the world, open source is eating software. Open source makes up 80% - 90% of applications, and if we think about it, it is clear that the modern IT industry would not be where it is today without open source. To name just one example, the internet as we know it is based on open source technology. This lecture will give an introduction to open source and will investigate how open source can be applied successfully, in industry and research alike. We will cover aspects such as licenses, governance, best practices, success stories and the role of open source foundations. Last but not least, we will look at how we can increase the impact of research projects with open source. [1] https://a16z.com/why-software-is-eating-the-world/
Speaker: Marco Jahn
Marco Jahn is Senior Research Project Manager at the Eclipse Foundation. He obtained his diploma in computer science from Ulm University in 2006 and his PhD from RWTH Aachen in 2016. He worked as a software developer at denkwerk GmbH before moving to Fraunhofer FIT in 2009. There he worked as a researcher and (technical) project manager in various European research projects in the areas of IoT and Smart Cities and led the IoT Platforms team. He joined the Eclipse Foundation in 2019 to help turn innovations into successful open source projects.
Video: https://doi.org/10.5446/65467
Slides: https://doi.org/10.5281/zenodo.10259442
Lecture 3: Introduction to AI4Life
Machine learning (ML) has enabled and accelerated frontier research in the life sciences, but democratised access to such methods is, unfortunately, not a given. Access to necessary hardware and software, knowledge and training, is limited, while methods are typically insufficiently documented and hard to find. Furthermore, even though modern AI-based methods typically generalize well to unseen data, no standard exists to enable sharing and fine-tuning of pre-trained models between different analysis tools. Existing user-facing platforms operate entirely independently from each other, often failing to comply with FAIR data and Open Science standards. The field of AI and ML is developing at a staggering pace, making it impossible for non-specialists to stay up to date. To enable the life science communities to benefit from AI/ML-powered image analysis methods, AI4Life will build bridges, providing urgently needed services on the common European research infrastructures. We will build an open, accessible, community-driven repository of FAIR pre-trained AI models and develop services to deliver these models to life scientists, including those without substantial computational expertise. Our direct support and ample training activities will prepare life scientists for the responsible use of AI methods, while contributor services and open standards will drive community contributions of new models and interoperability between analysis tools. Open calls and public challenges will provide state-of-the-art solutions to yet unsolved image analysis problems in the life sciences. Our consortium brings together AI/ML researchers, developers of popular open-source image analysis tools, providers of European scale storage and compute services and European life sciences Research Infrastructures – all united behind the common goal to enable life scientists to fully benefit from the untapped but potentially tremendous power of AI-based analysis methods.
Beatriz Serrano-Solano holds the position of Scientific Project Manager in the Euro-BioImaging ERIC. Her responsibilities encompass the management of the work package “WP7 – Communication, Outreach and Training” and active contributions to “WP6 – Support for Open Calls, Challenges and New Services” in AI4Life (this Horizon Europe-funded project started in September 2022). With a background in Computer Science, Beatriz earned her PhD in Computational Biology, further refining her expertise during her postdoc in image analysis. Following this experience, she assumed the role of community manager for the European Galaxy project. In this role, she was deeply involved in organizing outreach, training, and community engagement activities for the global Galaxy Community.
Video: https://doi.org/10.5446
Slides: https://zenodo.org/doi/10.5281/zenodo.10563414
Lecture 2: The Open Modeling Foundation: a Global Community for Standards-Based Modeling of Human and Natural Systems
Computation is ubiquitous across all areas of science, policy, and daily life in a diverse array of applications. Modeling is one such application that has become critical to a wide range of research and policy issues, spanning multiple scientific disciplines. These computational tools allow researchers to study and forecast complex, dynamic interactions of multiple social and natural processes in ways not possible with more traditional means. While scientists share the results of model-based research with policymakers and others in respected, peer-reviewed journals and conferences, following widely understood and accepted scientific norms, equivalent practices for documenting, evaluating, and sharing the code of the models that produced such research findings have lagged behind. This is especially critical when this technology is urgently needed to help humanity confront the challenge of successfully and sustainably managing a planetary socioecological system, in which a highly complex, telecoupled, global society is tightly coupled with diverse biophysical systems. Over the past eight years, a grass-roots initiative of the international modeling community led to the formation of the Open Modeling Foundation (OMF). The OMF is a global alliance of modeling organizations that coordinates and administers a common, community-developed body of standards and best practices among diverse communities of modeling scientists. As an international open science community, the OMF works to enable the next generation of modeling of human and natural systems.
Michael Barton is a Professor in the School of Complex Adaptive Systems and in the School of Human Evolution & Social Change, and Director of the Center for Social Dynamics & Complexity at Arizona State University (USA). He is Executive Director of the Open Modeling Foundation, a global consortium of organizations to promote standards and best practices for computational modeling across the social and natural sciences. He also directs the Network for Computational Modeling in Social and Ecological Sciences (CoMSES.Net), an international scientific network to enable accessibility, open science, and best practices for computation in the socio-ecological sciences. Barton received his BA from the University of Kansas in Anthropology/Archaeology, and MA and PhD from the University of Arizona in Anthropology/Archaeology and Geosciences.
His research centers around long-term human ecology and landscape dynamics, integrating computational modeling, geospatial technologies, and data science with geoarchaeological field studies. Barton has directed transdisciplinary research on hunter-gatherers and small-holder farmers in the Mediterranean and North America for over three decades, and directs research on human-environmental interactions in the modern world. He is a member of the open-source GRASS GIS Development Team and Project Steering Committee, dedicated to making advanced geospatial technologies openly accessible to the world.
Video: http://doi.org/10.5446/62444
Lecture 1: Introduction to NFDI4DS and OpenCitations
Zeyd Boukhers is a data scientist and AI specialist at the Institute for Web Science and Technologies. He is Co-Leader of the group FAIR Data & Distributed Analytics at Fraunhofer-Institut für Angewandte Informationstechnik FIT.
Silvio Peroni holds a Ph.D. in Computer Science and is an Associate Professor at the Department of Classical Philology and Italian Studies, University of Bologna. He is an expert in document markup and semantic descriptions of bibliographic entities using Semantic Web technologies, and Co-Director of OpenCitations.
Slides: Peroni, S. (2023, April). OpenCitations, an open infrastructure organization for bibliographical data. https://doi.org/10.5281/zenodo.7920424