NFDI4DS

Datasets

2025-02-01

Find our datasets, including the data for our shared tasks, in the repositories listed below.

2025

| Type | Title | Year | URL / DOI |
| --- | --- | --- | --- |
| dataset | Crosswalks from the NFDI4DS hackathon “machine-actionable Data Management Plan for NFDI” | 2025 | https://doi.org/10.5281/zenodo.15129829 |
| collection | The Dagstuhl Artifacts Series DARTS (Evaluated Artifacts) | 2025 | https://drops.dagstuhl.de/entities/journal/DARTS |
| dataset | NERdME | 2025 | https://doi.org/10.5281/zenodo.15495960 |
| dataset | SciVQA | 2025 | https://huggingface.co/datasets/katebor/SciVQA |
| dataset | TableEval | 2025 | https://huggingface.co/datasets/katebor/TableEval |
| dataset | ClimateCheck | 2025 | https://huggingface.co/datasets/rabuahmad/climatecheck |
| dataset | Hugging Face Model Cards Metadata Dataset | 2025 | https://doi.org/10.5281/zenodo.14652169 |
| dataset | GESIS Knowledge Graph | 2025 | https://doi.org/10.7802/2878 |
| dataset | Joint Named Entity Recognition and Relation Extraction for Software Mentions (SOMD 2025) | 2025 | https://www.codabench.org/competitions/5840/ |
| dataset | TeleScope: A Longitudinal Dataset for Aggregated User Interactions and Information Dissemination on Telegram | 2025 | https://data.gesis.org/telescope/ |
| dataset | Tweetplomacy 23 – An Annotated Collection of Tweets Outlining Strategies of Political Risk Communication during Global Crises (2018-2023) | 2025 | https://doi.org/10.7802/2860 |

2024

| Type | Title | Year | URL / DOI |
| --- | --- | --- | --- |
| dataset | FAIR4ML metadata schema | 2024 | https://w3id.org/fair4ml |
| dataset | FoRC Shared Task Subtask I | 2024 | https://zenodo.org/records/10777735 |
| dataset | FoRC4CL | 2024 | https://zenodo.org/records/10777674 |
| dataset | GESIS KG | 2024 | (to come) |
| dataset | GESIS MethodsHub KG | 2024 | (to come) |
| dataset | GSAP-NER | 2024 | https://github.com/ottowg/gsap-ner/tree/emnlp_submission/data |
| dataset | Hybrid Scholarly Question Answering (QA) dataset | 2024 | https://codalab.lisn.upsaclay.fr/competitions/19747 |
| dataset | KG on AI & DS Methods | 2024 | (to come) |
| dataset | LLMs4OL 2024 @ ISWC Challenge dataset | 2024 | https://sites.google.com/view/llms4ol |
| dataset | machine-actionable Software Management Plan Ontology (maSMP Ontology) | 2024 | https://doi.org/10.5281/zenodo.7806638 |
| dataset | Metadata Extraction | 2024 | (to come) |
| dataset | NFDI4DS KG | 2024 | (to come) |
| dataset | SOMD - Software Mention Detection | 2024 | https://zenodo.org/records/10974890 |
| dataset | Usage guidance (aka profiles) for the machine-actionable Software Management Plan Ontology | 2024 | https://doi.org/10.5281/zenodo.10582121 |
| dataset | Collection Supplementary Materials corresponding to research articles published by Dagstuhl Publishing | 2024 | https://drops.dagstuhl.de/entities/collection/supplementary-materials |

2023

| Type | Title | Year | URL / DOI |
| --- | --- | --- | --- |
| dataset | DBLP-QuAD: A Question Answering Dataset over the DBLP Scholarly Knowledge Graph | 2023 | https://zenodo.org/records/7643971 |
| dataset | GESIS Datasearch KG | 2023 | https://data.gesis.org/gesisdatasearchkg |
| dataset | Metadata crosswalks for software management plans at NFDI4DS hackathon maSMP 2023 | 2023 | https://doi.org/10.5281/zenodo.10275895 |
| dataset | Open Source Large Language Models | 2023 | https://github.com/Jamarpaul/OSLLMs/tree/main/Datasets |
| dataset | SciQA benchmark: Dataset and RDF dump | 2023 | https://doi.org/10.5281/zenodo.7707888 |
| dataset | SOTA? Tracking the State-of-the-Art in Scholarly Publications | 2023 | https://github.com/jd-coderepos/sota/ |
| dataset | TD4CLTabs Corpus | 2023 | https://zenodo.org/records/10972922 |
| dataset | Towards metadata for machine learning - Crosswalk tables | 2023 | https://doi.org/10.5281/zenodo.10407320 |

2022

| Type | Title | Year | URL / DOI |
| --- | --- | --- | --- |
| dataset | ClaimsKG | 2022 | https://data.gesis.org/claimskg/ |
| dataset | dblp KG RDF (dump download) | 2022 | https://dblp.org/rdf/release/ |
| dataset | dblp XML (dump download) | 2022 | https://dblp.org/xml/release/ |
| dataset | SoftwareKG | 2022 | https://data.gesis.org/softwarekg/ |
| dataset | TweetsCOV19KG | 2022 | https://data.gesis.org/tweetscov19/ |
| dataset | TweetsKB | 2022 | https://data.gesis.org/tweetskb/ |