Mark up the datasets you publish online (research data, statistics, downloadable data files) so that Google can display them as rich results and users can discover and explore them easily.
What is a Dataset?
A dataset is a collection of data, usually organized as tables or spreadsheets. It can be raw data, research data, or any other structured data published online for public use.
Benefits of Adding Dataset Structured Data
- Helps Google understand your dataset better.
- Enables rich result display with info like dataset name, description, creator, keywords, and download links.
- Improves discoverability in Google Search and Google Dataset Search.
Required Properties (minimum for eligibility)
| Property | Type | Description |
| --- | --- | --- |
| name | Text | Name/title of the dataset |
| description | Text | Short description of the dataset |
Recommended Properties (to enrich your markup)
| Property | Type | Description |
| --- | --- | --- |
| distribution | DataDownload | How the dataset is published (download URL, format, etc.) |
| creator | Person/Organization | Who created the dataset |
| keywords | Text or array | Relevant keywords/tags related to the dataset |
| license | URL/Text | License under which the dataset is published |
| datePublished | Date | When the dataset was published |
| temporalCoverage | Date/Period | Time period the dataset covers (ISO 8601 date or interval) |
| spatialCoverage | Place | Geographic area the dataset covers |
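To illustrate the value formats the license and coverage properties usually take, here is a minimal sketch for the sample population dataset used in this guide. The license URL and both coverage values are assumptions chosen for the example, not facts about a real dataset. temporalCoverage is written as an ISO 8601 interval (start/end), and spatialCoverage as a named Place.

```html
<!-- Illustrative values only: the license URL and the coverage values are assumptions -->
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "India Population Statistics 2025",
  "description": "Detailed population data of India by state for the year 2025.",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "temporalCoverage": "2025-01-01/2025-12-31",
  "spatialCoverage": {
    "@type": "Place",
    "name": "India"
  }
}
</script>
```

A single year such as "2025" is also a valid ISO 8601 value for temporalCoverage when the dataset covers one calendar year.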
How to Add Dataset Structured Data?
Example JSON-LD for a simple dataset with a download link:
```html
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "India Population Statistics 2025",
  "description": "Detailed population data of India by state for the year 2025.",
  "creator": {
    "@type": "Organization",
    "name": "FSIDM Data Research Team"
  },
  "keywords": ["population", "India", "statistics", "2025"],
  "distribution": {
    "@type": "DataDownload",
    "encodingFormat": "CSV",
    "contentUrl": "https://fsidm.in/datasets/india-population-2025.csv"
  },
  "datePublished": "2025-07-01"
}
</script>
```
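The script block can sit in either the <head> or the <body> of the page that describes the dataset. As a rough sketch of the placement, the page below carries a pared-down version of the markup above. The page title, heading, and link text are placeholders assumed for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Placeholder page title, assumed for illustration -->
  <title>India Population Statistics 2025</title>
  <!-- Dataset markup; it could equally be placed inside <body> -->
  <script type="application/ld+json">
  {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "India Population Statistics 2025",
    "description": "Detailed population data of India by state for the year 2025.",
    "distribution": {
      "@type": "DataDownload",
      "encodingFormat": "CSV",
      "contentUrl": "https://fsidm.in/datasets/india-population-2025.csv"
    }
  }
  </script>
</head>
<body>
  <!-- Visible content should describe the same dataset as the markup -->
  <h1>India Population Statistics 2025</h1>
  <p>Detailed population data of India by state for the year 2025.</p>
  <a href="https://fsidm.in/datasets/india-population-2025.csv">Download CSV</a>
</body>
</html>
```

Keeping the visible heading, summary, and download link in step with the markup helps the page and its structured data describe the same dataset.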
Guidelines & Best Practices
- Make sure the dataset is publicly accessible and not behind a login or paywall.
- Use correct URLs and file formats in the distribution property (e.g., CSV, JSON, XML).
- Keep the description concise but informative; Google’s Dataset guidelines expect a summary of 50 to 5,000 characters.
- Avoid duplicate dataset markup on multiple URLs unless they represent distinct versions.
- Validate with Google’s Rich Results Test or Schema Markup Validator.
- Ensure your page is crawlable and not blocked by robots.txt or noindex.
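For the distribution guideline above: when the same dataset is offered in several file formats, the distribution property can hold an array of DataDownload objects, one per file. The sketch below assumes hypothetical JSON and XML files alongside the CSV. Only the CSV URL comes from this guide; the other two URLs are invented for illustration.

```html
<!-- The .json and .xml URLs are hypothetical; only the .csv URL appears in this guide -->
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "India Population Statistics 2025",
  "description": "Detailed population data of India by state for the year 2025.",
  "distribution": [
    {
      "@type": "DataDownload",
      "encodingFormat": "CSV",
      "contentUrl": "https://fsidm.in/datasets/india-population-2025.csv"
    },
    {
      "@type": "DataDownload",
      "encodingFormat": "JSON",
      "contentUrl": "https://fsidm.in/datasets/india-population-2025.json"
    },
    {
      "@type": "DataDownload",
      "encodingFormat": "XML",
      "contentUrl": "https://fsidm.in/datasets/india-population-2025.xml"
    }
  ]
}
</script>
```

Each contentUrl should point directly at the downloadable file, so that the declared format matches what is actually served.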