Hugging Face Data for Research

A resource hub for researchers studying AI ecosystem development and adoption using data from the Hugging Face platform.

Why the Hub matters for research

The Hugging Face Hub offers a rich source of data for understanding how the AI ecosystem evolves. Information about models, datasets, Spaces, papers, and community activity is publicly accessible, which makes it possible to analyze trends in model development, dataset usage, research directions, and adoption patterns over time.

How to access the data

Pre-compiled datasets provide the simplest entry point. We recommend starting with community-maintained snapshots such as:

For custom views or real-time access, use the Hub API. The API supports programmatic access to repository metadata, search, and more. See the OpenAPI specification and documentation for details. Python users can rely on the huggingface_hub client.
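As a rough illustration of the Python route, the sketch below pulls public model metadata for one author and flattens it into plain dictionaries. It assumes the huggingface_hub package is installed and that its HfApi.list_models method accepts author and limit arguments. The helper names fetch_models and to_rows are hypothetical, and the fields available on each returned object may vary between client versions, so missing ones are read as None.

```python
from typing import Any, Iterable


def fetch_models(author: str, limit: int = 20) -> list[Any]:
    """Query the Hub for public models under one author or organization.

    Hypothetical helper: requires network access and the huggingface_hub client.
    """
    # Imported lazily so to_rows() below stays usable without the client installed.
    from huggingface_hub import HfApi

    api = HfApi()
    # list_models yields one metadata object per matching repository.
    return list(api.list_models(author=author, limit=limit))


def to_rows(models: Iterable[Any]) -> list[dict[str, Any]]:
    """Flatten model metadata objects into plain dicts for analysis."""
    rows = []
    for model in models:
        rows.append(
            {
                # getattr with a default tolerates fields that a given
                # client version or API response does not populate.
                "id": getattr(model, "id", None),
                "downloads": getattr(model, "downloads", None),
                "likes": getattr(model, "likes", None),
                "pipeline_tag": getattr(model, "pipeline_tag", None),
            }
        )
    return rows
```

A typical call would be to_rows(fetch_models("some-organization")), where the organization name is a placeholder. The resulting list of dictionaries can be loaded into a dataframe or written to disk alongside the retrieval date, since Hub metadata changes over time.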

Understanding data limitations

Metrics such as download counts are useful but imperfect. They reflect a complex process shaped by infrastructure, caching, and even repository type; the Hub documentation describes how downloads are counted. They work well for:

They are less reliable for fine-grained rankings or absolute comparisons. When designing studies, consider what the data actually measures and how infrastructure and noise may affect your conclusions.
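One possible way to act on this caution is to compare repositories by order of magnitude rather than by exact rank. The sketch below is an illustrative approach, not an official recommendation: the function names and the sample repository names and counts are all hypothetical.

```python
def magnitude_bucket(downloads: int) -> int:
    """Return the order of magnitude of a download count.

    Counts of 1-9 map to 0, 10-99 to 1, and so on.
    Zero or negative counts map to -1 (no recorded downloads).
    """
    if downloads < 1:
        return -1
    # Digit count avoids floating-point edge cases at exact powers of ten.
    return len(str(downloads)) - 1


def group_by_magnitude(counts: dict[str, int]) -> dict[int, list[str]]:
    """Group repository names by the order of magnitude of their downloads."""
    groups: dict[int, list[str]] = {}
    for name, downloads in counts.items():
        groups.setdefault(magnitude_bucket(downloads), []).append(name)
    # Sort names within each bucket so the output does not imply a ranking.
    return {bucket: sorted(names) for bucket, names in sorted(groups.items(), reverse=True)}


# Hypothetical sample data, not real Hub statistics.
sample_counts = {
    "org-a/model-x": 1_250_000,
    "org-b/model-y": 1_190_000,
    "org-c/model-z": 4_300,
}
grouped = group_by_magnitude(sample_counts)
```

Here the first two repositories land in the same bucket, which reflects that a difference of a few percent in download counts is probably within the noise, while the gap to the third is large enough to be meaningful.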

Explore and connect

We welcome researchers interested in responsible use of Hub data for ecosystem studies.