# HAPI

> History of APIs (HAPI) is a large-scale, longitudinal database of commercial ML API predictions. It contains 1.7 million predictions collected from 2020 to 2022 and spanning APIs from Amazon, Google, IBM, and Microsoft. The database includes diverse

**Wikidata**: [Q127485387](https://www.wikidata.org/wiki/Q127485387)  
**Source**: https://4ort.xyz/entity/hapi-q127485387

## Summary
HAPI (History of APIs) is a large-scale, longitudinal database focused on commercial Machine Learning (ML) API predictions. It contains 1.7 million predictions collected between 2020 and 2022, spanning services offered by Amazon, Google, IBM, and Microsoft.

## Key Facts
- **Full Name:** History of APIs (HAPI).
- **Data Volume:** Contains 1.7 million predictions.
- **Timeframe:** Data collection spans from 2020 to 2022.
- **Data Type:** Commercial Machine Learning (ML) API predictions.
- **Providers Covered:** Amazon, Google, IBM, and Microsoft.
- **License:** Apache Software License 2.0.
- **Classification:** Instance of software and free software.
- **Repository:** Source code is available at https://github.com/lchen001/HAPI.

## FAQs
### Q: What specific type of data does HAPI contain?
A: HAPI contains predictions returned by commercial Machine Learning (ML) APIs. It serves as a longitudinal record of what those APIs predicted between 2020 and 2022.

### Q: Which commercial providers are included in the HAPI database?
A: The database spans APIs from four major technology companies: Amazon, Google, IBM, and Microsoft.

### Q: Is HAPI free to use?
A: Yes, HAPI is classified as free software and is distributed under the Apache Software License 2.0.

### Q: What is the time range of the data in HAPI?
A: The database covers predictions collected from 2020 to 2022, a span of three calendar years.

## Why It Matters
HAPI is a resource for analyzing the stability and consistency of commercial Machine Learning services over time. Providers such as Amazon, Google, IBM, and Microsoft can update the models behind their APIs, so researchers and developers may see an API's behavior change without notice, which is sometimes loosely described as "model drift".

By providing a longitudinal record of 1.7 million predictions from 2020 to 2022, HAPI allows these changes to be studied systematically. It adds transparency to the "black box" of commercial ML, supporting reproducible experiments and a clearer picture of how cloud-based AI predictions evolve. For projects that rely on commercial APIs over long periods, it offers a historical baseline that providers themselves rarely expose.
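One simple way to quantify such changes is to compare what an API returned for the same inputs at two points in time. The sketch below is illustrative only: the function name `changed_fraction`, the dictionary layout, and the toy labels are assumptions made for this example and are not taken from HAPI's code or data format.

```python
from typing import Dict


def changed_fraction(earlier: Dict[str, str], later: Dict[str, str]) -> float:
    """Return the fraction of shared inputs whose predicted label differs.

    Both arguments map an input ID to the label an API returned for it.
    Only inputs present in both snapshots are compared.
    """
    shared = earlier.keys() & later.keys()
    if not shared:
        raise ValueError("snapshots share no inputs")
    changed = sum(1 for key in shared if earlier[key] != later[key])
    return changed / len(shared)


# Hypothetical toy snapshots (not real HAPI data): input ID -> predicted label.
snapshot_2020 = {"img_001": "cat", "img_002": "dog", "img_003": "car", "img_004": "tree"}
snapshot_2022 = {"img_001": "cat", "img_002": "wolf", "img_003": "car", "img_004": "tree"}

rate = changed_fraction(snapshot_2020, snapshot_2022)
print(f"{rate:.2f}")  # 0.25: one of four shared inputs changed label
```

A change in predictions is not the same as a change in accuracy; measuring the latter would additionally require ground-truth labels for each input.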

## Notable For
- **Scale:** It is a large-scale database, documenting 1.7 million individual predictions.
- **Longitudinal Analysis:** It offers a rare, long-term view (2020–2022) of API behavior rather than a single snapshot.
- **Provider Diversity:** It aggregates data from four major cloud AI providers (Amazon, Google, IBM, Microsoft) into a single dataset.
- **Open Access:** It is available as free software under the Apache 2.0 license, encouraging academic and commercial research.

## Body
### Nature and Scope
History of APIs (HAPI) is a database that archives the outputs of commercial Machine Learning APIs so they can be analyzed later. Unlike a single-snapshot dataset, HAPI is a longitudinal record: it captures how live APIs behaved across a defined timeframe.

### Data Composition
The core of the database consists of 1.7 million predictions, collected from live commercial services between 2020 and 2022. The data spans APIs from four providers:
- Amazon
- Google
- IBM
- Microsoft

The database is described as diverse, but the description available here does not enumerate the inputs; they depend on which commercial ML APIs were tracked during that period.
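The composition described above can be pictured as a collection of records, each tying one prediction to a provider and a collection year. The following is a minimal sketch under that assumption: the `PredictionRecord` class, its field names, and the sample rows are hypothetical and do not reflect HAPI's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class PredictionRecord:
    """One archived prediction (hypothetical layout, not HAPI's schema)."""
    provider: str         # e.g. "Amazon", "Google", "IBM", "Microsoft"
    year: int             # year the prediction was collected
    input_id: str         # identifier of the input sent to the API
    predicted_label: str  # label the API returned


def count_by_provider_and_year(rows: Iterable[PredictionRecord]) -> Counter:
    """Count how many predictions exist for each (provider, year) pair."""
    return Counter((row.provider, row.year) for row in rows)


# Made-up sample rows for illustration only.
records = [
    PredictionRecord("Amazon", 2020, "x1", "positive"),
    PredictionRecord("Google", 2020, "x1", "positive"),
    PredictionRecord("IBM", 2021, "x1", "negative"),
    PredictionRecord("Microsoft", 2022, "x1", "positive"),
    PredictionRecord("Google", 2022, "x1", "negative"),
]

counts: Counter = count_by_provider_and_year(records)
for (provider, year), n in sorted(counts.items()):
    print(provider, year, n)
```

Keying the counts by provider and year makes it easy to check that every provider is represented across the full collection period before comparing snapshots.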

### Technical Accessibility
HAPI is distributed as free software under the Apache Software License 2.0. The project is publicly accessible, with its source code and data housed in a GitHub repository at `https://github.com/lchen001/HAPI`. It is classified as an instance of software and free software, and is intended for research and development work that requires historical API data.