
Find our datasets, including data for our shared tasks, in our repositories.
2025
Type | Title | Year | URL / DOI |
---|---|---|---|
dataset | Crosswalks from the NFDI4DS hackathon “machine-actionable Data Management Plan for NFDI” | 2025 | https://doi.org/10.5281/zenodo.15129829 |
collection | The Dagstuhl Artifacts Series DARTS (Evaluated Artifacts) | 2025 | https://drops.dagstuhl.de/entities/journal/DARTS |
dataset | NERdME | 2025 | https://doi.org/10.5281/zenodo.15495960 |
dataset | SciVQA | 2025 | https://huggingface.co/datasets/katebor/SciVQA |
dataset | TableEval | 2025 | https://huggingface.co/datasets/katebor/TableEval |
dataset | ClimateCheck | 2025 | https://huggingface.co/datasets/rabuahmad/climatecheck |
dataset | Hugging Face Model Cards Metadata Dataset | 2025 | https://doi.org/10.5281/zenodo.14652169 |
dataset | GESIS Knowledge Graph | 2025 | https://doi.org/10.7802/2878 |
dataset | Joint Named Entity Recognition and Relation Extraction for Software Mentions (SOMD 2025) | 2025 | https://www.codabench.org/competitions/5840/ |
dataset | TeleScope: A Longitudinal Dataset for Aggregated User Interactions and Information Dissemination on Telegram | 2025 | https://data.gesis.org/telescope/ |
dataset | Tweetplomacy 23 – An Annotated Collection of Tweets Outlining Strategies of Political Risk Communication during Global Crises (2018-2023) | 2025 | https://doi.org/10.7802/2860 |
2024
Type | Title | Year | URL / DOI |
---|---|---|---|
dataset | FAIR4ML metadata schema | 2024 | https://w3id.org/fair4ml |
dataset | FoRC Shared Task Subtask I | 2024 | https://zenodo.org/records/10777735 |
dataset | FoRC4CL | 2024 | https://zenodo.org/records/10777674 |
dataset | GESIS KG | 2024 | (to come) |
dataset | GESIS MethodsHub KG | 2024 | (to come) |
dataset | GSAP-NER | 2024 | https://github.com/ottowg/gsap-ner/tree/emnlp_submission/data |
dataset | Hybrid Scholarly Question Answering (QA) dataset | 2024 | https://codalab.lisn.upsaclay.fr/competitions/19747 |
dataset | KG on AI & DS Methods | 2024 | (to come) |
dataset | LLMs4OL 2024 @ ISWC Challenge dataset | 2024 | https://sites.google.com/view/llms4ol |
dataset | machine-actionable Software Management Plan Ontology (maSMP Ontology) | 2024 | https://doi.org/10.5281/zenodo.7806638 |
dataset | Metadata Extraction | 2024 | (to come) |
dataset | NFDI4DS KG | 2024 | (to come) |
dataset | SOMD - Software Mention Detection | 2024 | https://zenodo.org/records/10974890 |
dataset | Usage guidance (aka profiles) for the machine-actionable Software Management Plan Ontology | 2024 | https://doi.org/10.5281/zenodo.10582121 |
dataset | Collection of Supplementary Materials corresponding to research articles published by Dagstuhl Publishing | 2024 | https://drops.dagstuhl.de/entities/collection/supplementary-materials |
2023
Type | Title | Year | URL / DOI |
---|---|---|---|
dataset | DBLP-QuAD: A Question Answering Dataset over the DBLP Scholarly Knowledge Graph | 2023 | https://zenodo.org/records/7643971 |
dataset | GESIS Datasearch KG | 2023 | https://data.gesis.org/gesisdatasearchkg |
dataset | Metadata crosswalks for software management plans at NFDI4DS hackathon maSMP 2023 | 2023 | https://doi.org/10.5281/zenodo.10275895 |
dataset | Open Source Large Language Models | 2023 | https://github.com/Jamarpaul/OSLLMs/tree/main/Datasets |
dataset | SciQA benchmark: Dataset and RDF dump | 2023 | https://doi.org/10.5281/zenodo.7707888 |
dataset | SOTA? Tracking the State-of-the-Art in Scholarly Publications | 2023 | https://github.com/jd-coderepos/sota/ |
dataset | TD4CLTabs Corpus | 2023 | https://zenodo.org/records/10972922 |
dataset | Towards metadata for machine learning - Crosswalk tables | 2023 | https://doi.org/10.5281/zenodo.10407320 |
2022
Type | Title | Year | URL / DOI |
---|---|---|---|
dataset | ClaimsKG | 2022 | https://data.gesis.org/claimskg/ |
dataset | dblp KG RDF (dump download) | 2022 | https://dblp.org/rdf/release/ |
dataset | dblp XML (dump download) | 2022 | https://dblp.org/xml/release/ |
dataset | SoftwareKG | 2022 | https://data.gesis.org/softwarekg/ |
dataset | TweetsCOV19KG | 2022 | https://data.gesis.org/tweetscov19/ |
dataset | TweetsKB | 2022 | https://data.gesis.org/tweetskb/ |