How to use the HTTP cache and rate limiter
The stack has an HTTP cache and a rate limiter shared across all child projects (mitigation of failure mode F05). This guide covers how to inspect, configure, and clear them so you don't exhaust quotas or refetch DOIs.
Where they live
Two stdlib file-based components under ~/.belico/ (override with BELICO_CACHE_DIR):
- ~/.belico/http_cache.db: HTTP cache with per-host TTL (SQLite).
- ~/.belico/rate_limiter.db: persistent per-host token bucket (SQLite).
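The path resolution described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `cache_dir` and `db_paths` are hypothetical, and the real stack may resolve the override differently (for example, it may not expand `~`).

```python
import os
from pathlib import Path


def cache_dir(env=None) -> Path:
    """Return the directory that holds both SQLite files.

    Hypothetical helper: BELICO_CACHE_DIR wins when set, otherwise ~/.belico.
    """
    env = os.environ if env is None else env
    override = env.get("BELICO_CACHE_DIR")
    if override:
        # Assumption: a leading "~" in the override is expanded.
        return Path(override).expanduser()
    return Path.home() / ".belico"


def db_paths(env=None) -> dict:
    """Map each component to its database file inside the cache directory."""
    base = cache_dir(env)
    return {
        "http_cache": base / "http_cache.db",
        "rate_limiter": base / "rate_limiter.db",
    }
```

To see the paths the stack actually uses, prefer `python tools/cache_inspector.py paths`.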
Two invocations of different tools share the same cache: if find_top_sources already downloaded a work, verify_citations reads it from cache without hitting OpenAlex.
TTL per host
| Host | TTL | Reason |
|---|---|---|
| api.openalex.org | 24h | Metadata stable day to day |
| api.semanticscholar.org | 24h | Same |
| api.crossref.org | 7d | DOI metadata nearly immutable |
| api.elsevier.com | 30d | Paid quota; annual snapshots |
| api.zotero.org | 1h | Personal library changes |
| default | 24h | Conservative |
Not cached: status ≥ 500, status 429, Cache-Control: no-store, POST without X-Cache-Idempotent: true.
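The TTL table and the "not cached" rules can be expressed as a small lookup plus a predicate. This is a sketch, not the stack's code: the names `ttl_for` and `is_cacheable` are hypothetical, and it assumes that `Cache-Control: no-store` is read from the response while `X-Cache-Idempotent` is read from the request.

```python
from urllib.parse import urlparse

HOUR = 3600
DAY = 24 * HOUR

# Values copied from the TTL table, converted to seconds.
HOST_TTL_SECONDS = {
    "api.openalex.org": 24 * HOUR,
    "api.semanticscholar.org": 24 * HOUR,
    "api.crossref.org": 7 * DAY,
    "api.elsevier.com": 30 * DAY,
    "api.zotero.org": 1 * HOUR,
}
DEFAULT_TTL_SECONDS = 24 * HOUR


def ttl_for(url: str) -> int:
    """Return the TTL in seconds for the host of `url`."""
    host = urlparse(url).hostname or ""
    return HOST_TTL_SECONDS.get(host, DEFAULT_TTL_SECONDS)


def _lowercase_keys(headers: dict) -> dict:
    """Header names are case-insensitive, so normalize them."""
    return {k.lower(): v for k, v in headers.items()}


def is_cacheable(method, status, response_headers, request_headers) -> bool:
    """Apply the four exclusion rules listed above."""
    if status >= 500 or status == 429:
        return False
    # Assumption: no-store is checked on the response headers.
    cache_control = _lowercase_keys(response_headers).get("cache-control", "")
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives:
        return False
    if method.upper() == "POST":
        # Assumption: the opt-in header is sent on the request.
        flag = _lowercase_keys(request_headers).get("x-cache-idempotent", "")
        return flag.lower() == "true"
    return True
```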
Rate limiter limits
| Host | Capacity | Refill/s | Polite pool |
|---|---|---|---|
| api.openalex.org | 10 | 10.0 | 100 |
| api.semanticscholar.org | 1 | 0.33 | 100 (with key) |
| api.crossref.org | 50 | 50.0 | 50 |
| api.elsevier.com | 10 | 10.0 | 10 |
| default | 5 | 1.0 | 5 |
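To make the Capacity and Refill/s columns concrete, here is a minimal in-memory token bucket. It is a sketch only: the class and method names are hypothetical, and the stack's limiter additionally persists its state in SQLite, which this example leaves out.

```python
import time
from typing import Optional


class TokenBucket:
    """In-memory token bucket: `capacity` tokens, refilled at `refill_per_s`."""

    def __init__(self, capacity: float, refill_per_s: float,
                 now: Optional[float] = None):
        self.capacity = float(capacity)
        self.refill_per_s = float(refill_per_s)
        self.tokens = float(capacity)  # start full
        self.updated = time.monotonic() if now is None else now

    def _refill(self, now: float) -> None:
        """Add the tokens earned since the last update, capped at capacity."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_s)
        self.updated = now

    def try_acquire(self, now: Optional[float] = None) -> float:
        """Take one token.

        Returns 0.0 on success, otherwise the number of seconds to wait
        before one full token will be available.
        """
        now = time.monotonic() if now is None else now
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.refill_per_s
```

With a capacity of 1 and a refill rate of 0.33, as in the Semantic Scholar row, a second request issued right after the first has to wait about three seconds. A blocking `acquire` could be built on top by sleeping for the returned wait time and retrying.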
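Task: enable the polite pool
If you set BELICO_API_EMAIL, OpenAlex and Semantic Scholar recognize the client and the limiter uses the polite-pool capacity from the table above (for OpenAlex, 10 becomes 100; the Semantic Scholar value also requires an API key). Add to .env, replacing the placeholder address with your own:
BELICO_API_EMAIL=you@example.com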
Task: inspect state
python tools/cache_inspector.py stats # cache state
python tools/cache_inspector.py rate-stats # available tokens, throttled count
python tools/cache_inspector.py paths # DB paths
Task: clear the cache (stale / old data)
# Clear a specific host's cache
python tools/cache_inspector.py clear --host api.openalex.org
# Clear entries older than N days
python tools/cache_inspector.py clear --older-than 7d
Stale cache
If you updated metadata on OpenAlex/Zotero but the tool still sees the old value, it's the TTL. Clear the host with clear --host or wait for expiry.
Automatic behavior
tools/adapters/rest_json.py (get/post/request_raw) does the following, in order:
1. cache_get(): on a hit, returns without touching the network or the rate limiter.
2. rate_limiter.acquire(host): blocks until a token is available.
3. Sends the request and handles the result:
   - Success → cache_put() with the host TTL.
   - 429/5xx → exponential backoff with jitter, up to 3 retries. Not cached.
Per-call opt-out: get(url, use_cache=False) or get(url, rate_limited=False).
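The flow above can be sketched as a single function. Everything here is illustrative: `fetch`, `backoff_delay`, and `RetryableError` are hypothetical names, the cache is a plain dict, the base delay and cap are made-up values, and the sketch assumes a new token is acquired before every attempt, which the real adapter may or may not do.

```python
import random
import time


class RetryableError(Exception):
    """Stand-in for a 429 or 5xx response."""


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Exponential backoff with full jitter.

    Returns rng() * min(cap, base * 2**attempt); with the default rng this
    is a random delay below that ceiling. `base` and `cap` are assumed values.
    """
    return rng() * min(cap, base * (2 ** attempt))


def fetch(url, cache, acquire, send, max_retries=3, sleep=time.sleep):
    """Cache lookup, then rate-limited send with retries on RetryableError."""
    # Step 1: a cache hit skips both the network and the rate limiter.
    if url in cache:
        return cache[url]
    # One initial attempt plus up to `max_retries` retries.
    for attempt in range(max_retries + 1):
        # Step 2: wait for a token (assumed to happen on every attempt).
        acquire()
        try:
            body = send(url)
        except RetryableError:
            if attempt == max_retries:
                raise  # retries exhausted; nothing is cached
            sleep(backoff_delay(attempt))
            continue
        # Step 3: only successful responses are stored.
        cache[url] = body
        return body
```

Passing `sleep` and `rng` as parameters keeps the sketch easy to test without real delays.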
See also
- Troubleshooting — rate-limit and stale-cache symptoms.
- Stack FMEA — the F05 failure this mitigates.
Canonical source
Derives from docs/shared/CACHE_AND_RATE_LIMIT.md.