High-volume deployments see LiteLLM_SpendLogs grow unbounded because
retention via DELETE leaves dead tuples that autovacuum cannot reclaim
fast enough. With a range-partitioned table, retention drops whole
partitions instead: an instant metadata operation that returns disk to
the OS immediately.
The feature is gated behind general_settings.use_spend_logs_partitioning
(default false). With the flag off, the cleanup job never queries the
catalog and behaves exactly as today. With it on, the job verifies the
table is partitioned, pre-creates upcoming partitions, and drops expired
ones; expired rows the drops cannot reach (DEFAULT partition, partitions
spanning the cutoff) are still deleted row-wise so retention is never
bypassed. If the table is not partitioned it falls back to batched
DELETE only.
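
The split between "drop the partition" and "delete rows inside it" can be sketched as below. This is a minimal illustration, not the cleanup job's actual code: the `Partition` type and `plan_retention` helper are hypothetical names, bounds are assumed to be inclusive-lower and exclusive-upper, and the DEFAULT partition (which has no bounds and always needs the row-wise path) is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class Partition:
    """Hypothetical view of one range partition's bounds."""
    name: str
    lower: datetime  # inclusive lower bound
    upper: datetime  # exclusive upper bound


def plan_retention(
    partitions: List[Partition], cutoff: datetime
) -> Tuple[List[str], List[str]]:
    """Return (partitions to drop, partitions needing row-wise DELETE)."""
    # A partition is droppable only if every row it can hold is older than the cutoff.
    droppable = [p.name for p in partitions if p.upper <= cutoff]
    # A partition that spans the cutoff holds both expired and live rows.
    spanning = [p.name for p in partitions if p.lower < cutoff < p.upper]
    return droppable, spanning
```

With daily partitions and a cutoff in the middle of a day, the older days land in the first list and the day containing the cutoff lands in the second.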
Converting an existing table is a manual, documented operation in
db_scripts/partition_spend_logs.sql; db_scripts/unpartition_spend_logs.sql
rolls it back. Both scripts rename the old table's indexes aside before
recreating them: index names are unique per schema and survive a table
rename, so the CREATE INDEX IF NOT EXISTS block would otherwise be
silently skipped.
Granularity and pre-create lookahead are tunable via
SPEND_LOG_PARTITION_INTERVAL (day/week/month; invalid values fall back to
day) and SPEND_LOG_PARTITION_PRECREATE_AHEAD.
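
A rough sketch of how the interval setting could map to partition bounds follows. The helper names are hypothetical, and two details are assumptions rather than facts from this change: the value is matched case-insensitively, and weekly partitions start on Monday.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

VALID_INTERVALS = ("day", "week", "month")


def resolve_interval(raw: Optional[str]) -> str:
    """Normalize SPEND_LOG_PARTITION_INTERVAL; anything invalid falls back to 'day'."""
    value = (raw or "").strip().lower()  # case-insensitive matching is an assumption
    return value if value in VALID_INTERVALS else "day"


def partition_bounds(day: date, interval: str) -> Tuple[date, date]:
    """Return the [start, end) bounds of the partition that contains `day`."""
    if interval == "month":
        start = day.replace(day=1)
        if start.month == 12:
            end = start.replace(year=start.year + 1, month=1)
        else:
            end = start.replace(month=start.month + 1)
        return start, end
    if interval == "week":
        # Assumption: weeks are aligned to Monday.
        start = day - timedelta(days=day.weekday())
        return start, start + timedelta(days=7)
    return day, day + timedelta(days=1)
```

In practice the raw value would come from something like `os.environ.get("SPEND_LOG_PARTITION_INTERVAL")`, and pre-creation would call `partition_bounds` repeatedly, stepping forward from each partition's end date.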
- proxy_server.py: disable allow_credentials when allow_origins=['*'] (wildcard
+ credentials is a browser security misconfiguration). Add LITELLM_CORS_ORIGINS
env var to configure explicit allowed origins.
- create_views.py: narrow broad 'except Exception' to only catch genuine
'view does not exist' errors; re-raise all other DB errors (auth, connection,
etc.) that were previously silently swallowed.
- spend_log_cleanup.py: validate execute_raw() return type is int before using
it as a deletion count; break loop safely on unexpected types to prevent
infinite deletion loops.
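
The return-type guard in the spend_log_cleanup.py item can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: `execute_batch` is a hypothetical stand-in for the call that wraps execute_raw(), and the `max_batches` cap is added here only as an extra safety net.

```python
from typing import Any, Callable


def delete_in_batches(execute_batch: Callable[[], Any], max_batches: int = 1000) -> int:
    """Run batched deletes until nothing is left or the result looks wrong."""
    total = 0
    for _ in range(max_batches):
        deleted = execute_batch()
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(deleted, bool) or not isinstance(deleted, int):
            break  # unexpected type: stop rather than loop forever
        if deleted <= 0:
            break  # nothing more to delete
        total += deleted
    return total
```

Breaking on an unexpected type trades completeness for safety: a later run can pick up the remaining rows, whereas a loop that never sees a zero count would not terminate.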
- Release the distributed lock in the finally block only if it was actually
acquired; this prevents spurious Redis release_lock calls on early returns.
- Treat a bare integer maximum_spend_logs_retention_period as days (e.g. 3 → "3d")
instead of failing with a ValueError.
- Elevate the "Skipping cleanup" log from info to error so misconfigured
retention settings are visible without verbose logging.
- Add tests for all three fixes
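
The bare-integer handling above could look roughly like the sketch below. `normalize_retention_period` is a hypothetical helper name, and treating a digit-only string such as "3" the same way as the integer 3 is an assumption beyond what the change describes.

```python
from typing import Union


def normalize_retention_period(value: Union[int, str]) -> str:
    """Turn a bare integer retention period into a day-suffixed string."""
    if isinstance(value, bool):
        # bool is an int subclass; it is never a meaningful retention period.
        raise ValueError("retention period must be an int or a string like '3d'")
    if isinstance(value, int):
        return f"{value}d"
    text = str(value).strip()
    # Assumption: a digit-only string is also interpreted as days.
    return f"{text}d" if text.isdigit() else text
```

Values that already carry a unit, such as "7d", pass through unchanged and are left for the existing duration parser to validate.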
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix: Add support for GOOGLE_API_KEY environment variables for Gemini API authentication
* Add test cases
* Incorporate feedback to make it more maintainable
* Fix failing lint CI