claude-cache-savings

Name	claude-cache-savings
Version	0.1.0
home_page	https://github.com/jojje/claude-cache-savings
Summary	Calculates Anthropic Claude cost savings from use of prompt caching
upload_time	2025-10-17 20:22:38
maintainer	None
docs_url	None
author	jojje
requires_python	<4.0,>=3.10
license	MIT
keywords	athropic claude cost cache caching price usage statistics stats
requirements	No requirements were recorded.

            
# claude-cache-savings

Calculates Anthropic Claude cost savings from use of prompt caching

Provides lightweight analytics about how valuable "prompt caching" is to _your_ specific use of the Claude models, which models you gravitate towards, and how that use translates into what you pay.

# Installation

`pip install claude-cache-savings`

# Usage

1. Export API usage CSV files from Anthropic's [API usage dashboard](https://console.anthropic.com/usage?group_by=token_type). Click the `Export` button.
2. Have the script analyze one or several of those files, each covering a period of up to one month of usage: `claude-cache-savings *.csv`

## Options

```
usage: claude-cache-savings [-m] files

Provides Anthropic Claude caching, model use and cost statistics

positional arguments:
  file        Exported API usage CSV files from the console

options:
  -m          Show per-model breakdown
```


## Examples

Aggregate statistics

```
claude-cache-savings claude_api_tokens_*.csv

==[ Total ]====================================

Input cache hit   71 %
Input savings     52 %
Total savings     44 %

Total cost        $ 41.12
Est cache-less    $ 74.04

Input tokens      11,146,779   | $ 29.87
Output tokens     409,695      | $ 11.25
Input/output tok  27x
```

Per-model break-down, ordered by each model's share of the total cost, with the highest spend listed first.

```
claude-cache-savings -m claude_api_tokens_*.csv

==[ Total ]====================================

Input cache hit   71 %
Input savings     52 %
Total savings     44 %

Total cost        $ 41.12
Est cache-less    $ 74.04

Input tokens      11,146,779   | $ 29.87
Output tokens     409,695      | $ 11.25
Input/output tok  27x

==[ Per model ]================================

claude-opus-4-1-20250805    (use-cost: 22% 63%)
  Input cache hit   60 %
  Input savings     46 %
  Total savings     39 %

  Total cost        $ 25.99
  Est cache-less    $ 42.45

  Input tokens      2,407,472    | $ 19.65
  Output tokens     84,467       | $ 6.34
  Input/output tok  29x

claude-sonnet-4-20250514    (use-cost: 78% 36%)
  Input cache hit   74 %
  Input savings     62 %
  Total savings     52 %

  Total cost        $ 14.80
  Est cache-less    $ 30.97

  Input tokens      8,701,412    | $ 9.93
  Output tokens     324,578      | $ 4.87
  Input/output tok  27x

claude-opus-4-20250514      (use-cost: 0% 1%)
  Input cache hit   66 %
  Input savings     51 %
  Total savings     47 %

  Total cost        $ 0.33
  Est cache-less    $ 0.62

  Input tokens      37,895       | $ 0.28
  Output tokens     650          | $ 0.05
  Input/output tok  58x
```

## Fields

* `Input cache hit`: Ratio of input tokens that were retrieved from previously cached data (at a significantly discounted price) to the total number of input tokens. The total includes tokens that were not previously cached and incurred a 25% extra cost to be written to the Anthropic prompt cache for subsequent reference.
* `Input savings`: Percentage of money saved on input token processing by using caching, compared to not using caching at all. This takes into account the higher price of the initial cache writes. The worst case is a negative percentage, where every request is a cache miss and there are no cache hits, since the input cost is then higher than without caching. As long as this percentage is positive, money is saved.
* `Total savings`: As the term implies, this factors in the output token cost as well. It's `1 - (cached-input-cost + output-cost) / (uncached-input-cost + output-cost)`. In the aggregate example above that is `1 - 41.12 / 74.04`, or 44%.
* `Total cost`: What Anthropic charged overall. This doesn't use Anthropic's billing statement but derives the cost from the published pricing information that has been translated into the pricing configuration for the project. As long as this number lines up with your billing statement for the period the CSV files cover, you know the pricing information this project uses is correct. If there is a difference, Anthropic has likely changed their pricing, so the pricing information for the project should be adjusted accordingly.
* `Est cache-less`: An estimate of what the total cost would have been if caching had not been used at all.
* `Input tokens`: Total number of input tokens processed during the covered period. The dollar value at the end is the calculated actual cost for those input tokens.
* `Output tokens`: Total number of output tokens generated during the covered period. The dollar value at the end is the calculated actual cost for those output tokens.
* `Input/output tok`: Ratio between input and output tokens. Useful to get a sense of how you use different models, or models overall: whether your usage is context heavy, where you want models to answer based on information you provide or engage in long conversations, or mostly context-free zero-shot, like "tell me a joke". It's mostly a trivia statistic about _you_ and your model use.
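As a rough check on how these fields relate to each other, the following sketch recomputes the percentages of the aggregate example from its dollar figures. It is not the package's actual code: the function name `savings_pct` is made up for illustration, and the $3.00 base price in the worst-case part is a hypothetical value.

```python
def savings_pct(actual_cost: float, cacheless_cost: float) -> float:
    """Percent saved relative to the estimated cost without caching."""
    return (1 - actual_cost / cacheless_cost) * 100


# Dollar figures and token counts taken from the aggregate example above.
total_cost = 41.12      # Total cost
est_cacheless = 74.04   # Est cache-less
input_cost = 29.87      # cost of the input tokens
output_cost = 11.25     # cost of the output tokens

# Output tokens cost the same with or without caching, so the cache-less
# input cost is the cache-less total minus the output cost.
cacheless_input_cost = est_cacheless - output_cost

input_savings = savings_pct(input_cost, cacheless_input_cost)
total_savings = savings_pct(total_cost, est_cacheless)
io_ratio = 11_146_779 / 409_695

print(f"Input savings     {input_savings:.0f} %")
print(f"Total savings     {total_savings:.0f} %")
print(f"Input/output tok  {io_ratio:.0f}x")

# Worst case: every input token is a cache write (25% surcharge) and nothing
# is ever read back from the cache, so the savings turn negative.
base_price = 3.00                # hypothetical $/1M input tokens
write_price = base_price * 1.25  # cache write price with the 25% surcharge
worst_case = savings_pct(write_price, base_price)
print(f"All cache misses  {worst_case:.0f} %")
```

The cache hit ratio itself cannot be recomputed this way, since the report does not show how the input tokens split into cache reads, cache writes and uncached tokens.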

Of all the statistics, `Input cache hit` and `Input savings` are the most important. The former tells you how well (or poorly) you use Anthropic's caching, something you can affect with your model use behavior. The latter tells you how valuable caching is to your specific use of the feature. The other stats are mostly derived from those two.

Finally, in the per-model break-down, two percentages in parentheses follow the model name, such as `(use-cost: 22% 63%)`. They give that model's share of the total token use (input + output) and of the total cost, respectively. In the example, 22% means that `claude-opus-4-1-20250805` accounted for 22% of all tokens processed by Anthropic during the period. The second value, 63%, is the share of the cost those tokens represent across all models used during the period.
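The two percentages can be reproduced from the per-model figures in the example output. The sketch below does so; the variable names are illustrative and the package itself may compute or round the shares differently.

```python
# (input tokens, output tokens, total cost in $) per model, from the example above.
usage = {
    "claude-opus-4-1-20250805": (2_407_472, 84_467, 25.99),
    "claude-sonnet-4-20250514": (8_701_412, 324_578, 14.80),
    "claude-opus-4-20250514": (37_895, 650, 0.33),
}

total_tokens = sum(inp + out for inp, out, _ in usage.values())
total_cost = sum(cost for _, _, cost in usage.values())

shares = {}
for model, (inp, out, cost) in usage.items():
    use_pct = round(100 * (inp + out) / total_tokens)  # share of all tokens
    cost_pct = round(100 * cost / total_cost)          # share of total spend
    shares[model] = (use_pct, cost_pct)

# Highest spend first, as in the per-model report.
for model in sorted(usage, key=lambda m: usage[m][2], reverse=True):
    use_pct, cost_pct = shares[model]
    print(f"{model:<27} (use-cost: {use_pct}% {cost_pct}%)")
```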

These numbers are useful to get a sense of "bang for buck". For instance, is the output of Opus really worth the extremely high cost, or are you perhaps using Opus for things where Sonnet might have sufficed? 

As with the input/output token ratio, these numbers are mostly about you and _your_ behavioral use of the models. They try to surface dimensions that _might_ save you even more money if you changed your model use slightly. For a very model-conscious user they would of course mean something slightly different. In the example above, they would mean that there is a significant number of tasks or questions for which Sonnet simply isn't good enough, and where one has to _pay through the nose_ by using Opus instead to get the required output quality. In other words, these numbers just help surface usage information and how that usage contributes to what you pay. What the numbers mean depends entirely on what you prioritize and need. They are just statistics about your use and the relative cost of that specific use.

## Updating pricing / Staying up-to-date

This script uses a pricing file derived from Anthropic's published model pricing pages, since Anthropic does not provide an API for fetching this information automatically. As such, this snapshot is likely to become stale if Anthropic either introduces new models or changes the pricing for any of them, including the caching discounts.

For that reason, the embedded pricing information is saved as a JSON file in your home directory upon first use: `$HOME/.config/claude-cache-savings/config.json`, or on Windows `%HOMEPATH%\.config\claude-cache-savings\config.json`.

If pricing changes, you can just revise the numbers in that file.

### Config file (& missing model-id warning)

The config file has two elements to it:
1. The pricing dimensions for each model **family**, in $ (USD) / 1M tokens as shown on the provider's pricing related web pages.
2. A mapping between a specific model ID and a model family.

The reason for this separation is to make it as simple as possible to deal with new model variants in a given family. Pricing is set for an entire model family and not for a specific model in that family, i.e. there is a many-to-one relationship between models and a model family.

If the pricing changes, then it changes for an entire family, so you just have to revise the changed pricing value for _that_ specific family and it will take effect for all models of that family. If a new model is introduced, such as an update to Sonnet, then that model will have a _new_ model ID, so you just add that new model ID to the second section, and the Sonnet family pricing will automatically be applied to this model as well.

If you've used a model ID not covered by this script, the script will tell you about it when run. For instance: `[!] Warning: the following model-ids do not have a pricing link established and will be excluded from the statistics calculation: {'claude-opus-4-1-20250805'}`. This means you have to add `claude-opus-4-1-20250805` to the Opus family in the config file so the script knows what price to assign to this model's API usage. Once done, the warning goes away and the usage and cost incurred by this model are included in the statistics as well. I.e. link this model ID to the family by adding the following to the `pricing` section of the config file: `"claude-opus-4-1-20250805": "Opus"`

The pricing information specified and required for a model family is as follows:

* `input`: Normal input token pricing without caching.
* `output`: Output token generation cost.
* `cache_write`: Cost of input tokens that aren't already cached when you leverage the Claude caching feature in your programs / APIs.
* `cache_read`: Cost of input tokens that were retrieved from the Claude cache (cache hits).
* `cache_write_1h`: A more recent caching variant with separate pricing, where Anthropic caches the input tokens for 1 hour instead of the default 5 minutes.

All prices are in $/1M tokens.

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/jojje/claude-cache-savings",
    "name": "claude-cache-savings",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": null,
    "keywords": "Athropic, Claude, Cost, Cache, Caching, Price, Usage, Statistics, Stats",
    "author": "jojje",
    "author_email": "tinjon+pypi@gmail.com",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Calculates Anthropic Claude cost savings from use of prompt caching",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/jojje/claude-cache-savings",
        "Issues": "https://github.com/jojje/claude-cache-savings/issues",
        "Repository": "https://github.com/jojje/claude-cache-savings"
    },
    "split_keywords": [
        "athropic",
        " claude",
        " cost",
        " cache",
        " caching",
        " price",
        " usage",
        " statistics",
        " stats"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "84cb396159b18a3b75060137dbb6ead4a04f3aba0414ef638b0e572b64e33fbd",
                "md5": "a119c81c8dae60c076622d8d9b86b7e7",
                "sha256": "89cbead72620d6bbee4760ba8e8e1f89399c86de314d536b8235bb5e44b3bbb3"
            },
            "downloads": -1,
            "filename": "claude_cache_savings-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a119c81c8dae60c076622d8d9b86b7e7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 8936,
            "upload_time": "2025-10-17T20:22:38",
            "upload_time_iso_8601": "2025-10-17T20:22:38.906831Z",
            "url": "https://files.pythonhosted.org/packages/84/cb/396159b18a3b75060137dbb6ead4a04f3aba0414ef638b0e572b64e33fbd/claude_cache_savings-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-17 20:22:38",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "jojje",
    "github_project": "claude-cache-savings",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "claude-cache-savings"
}

jojje