Detailed usage and pinned models

API Usage dashboard

The API Usage dashboard (beta) shows the historical number of requests and input characters per model for an API Token.

Note that each user account and each organization has its own API Token. Lab, Startup and Enterprise subscriptions are billed according to the usage of the organization API Token. By default, no action is needed on your part. However, if you have any doubt about what is shown, or you have a complex setup (user subscription, multiple organizations and so on), please contact api-enterprise@huggingface.co.

[Image: example of the API Usage dashboard]

Pinned models

A pinned model is a model which is preloaded for inference and instantly available for requests authenticated with an API Token.

Lab, Startup and Enterprise organization subscriptions can have a number of models pinned to their organization API Token - see pricing for details.

You can set the pinned models for your API Token in the API Usage dashboard.

Model pinning is also accessible directly from the API. Here is how to list your currently pinned models:

import requests

API_TOKEN = "YOUR_API_TOKEN"  # placeholder: replace with your own API Token
api_url = "https://api-inference.huggingface.co/usage/pinned_models"
headers = {"Authorization": f"Bearer {API_TOKEN}"}
response = requests.get(api_url, headers=headers)
print(response.json())
# {"pinned_models": [...], "allowed_pinned_models": 5}
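As a minimal sketch, the decoded response can be used to check how many pinning slots are still free. It assumes only the response shape shown in the comment above, with `pinned_models` being a list. The helper name `remaining_pin_slots` is hypothetical and not part of the API, and the example payload is hand-written rather than real API output.

```python
def remaining_pin_slots(usage: dict) -> int:
    """Return how many more models can be pinned, given the decoded GET response.

    Assumes the shape {"pinned_models": [...], "allowed_pinned_models": N}.
    """
    pinned = usage.get("pinned_models", [])
    allowed = usage.get("allowed_pinned_models", 0)
    # Never report a negative number if the list is longer than the quota.
    return max(allowed - len(pinned), 0)


# Hand-written payload for illustration (the entry shape is an assumption).
example = {
    "pinned_models": [{"model_id": "gpt2", "compute_type": "cpu"}],
    "allowed_pinned_models": 5,
}
print(remaining_pin_slots(example))  # 4
```

In practice, `usage` would be `response.json()` from the GET request above.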

Models are pinned with a POST request to the same endpoint. Be careful: each request replaces the previous list, so you must specify ALL the models you want pinned every time.

import json

import requests

API_TOKEN = "YOUR_API_TOKEN"  # placeholder: replace with your own API Token
api_url = "https://api-inference.huggingface.co/usage/pinned_models"
headers = {"Authorization": f"Bearer {API_TOKEN}"}
# Note: list ALL the models you want pinned in a single request; the list
# you send overrides the previous values.
data = json.dumps({"pinned_models": [{"model_id": "gpt2", "compute_type": "cpu"}]})
response = requests.post(api_url, headers=headers, data=data)
print(response.json())
# {"ok": "Pinned 1 models, please wait while we load them."}
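Because each POST replaces the whole list, adding one model without dropping the others means starting from the current list and sending it back with the new entry included. The sketch below shows that step as a pure function. It assumes that the entries returned by the GET request have the same `model_id` / `compute_type` shape as those sent in the POST, which this page does not confirm. `with_pinned_model` is a hypothetical helper name, and `distilgpt2` is only an example model id.

```python
import json


def with_pinned_model(current: list, model_id: str, compute_type: str = "cpu") -> list:
    """Return a new pin list: `current` plus the given model.

    An existing entry with the same model_id is replaced rather than duplicated.
    The input list is not modified.
    """
    kept = [m for m in current if m.get("model_id") != model_id]
    return kept + [{"model_id": model_id, "compute_type": compute_type}]


# Hand-written list standing in for the "pinned_models" field of the GET response.
current = [{"model_id": "gpt2", "compute_type": "cpu"}]
data = json.dumps({"pinned_models": with_pinned_model(current, "distilgpt2")})
print(data)
```

The resulting `data` string can be passed to `requests.post` exactly as in the snippet above.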