Model | Reserved for Input (tokens) | Reserved for Output (tokens) | Cost / Input Token (USD) | Cost / Output Token (USD) | Total Cost per Query (USD) |
---|---|---|---|---|---|
GPT-3.5 | 2800 | 1200 | 0.0000005 | 0.0000015 | 0.0032 |
GPT-3.5-16k | 13600 | 2400 | 0.0000005 | 0.0000015 | 0.0104 |
GPT-4o-mini-1k | 800 | 200 | 0.00000015 | 0.0000006 | 0.00024 |
GPT-4o-mini-2k | 1600 | 400 | 0.00000015 | 0.0000006 | 0.00048 |
GPT-4o-mini-4k | 2800 | 1200 | 0.00000015 | 0.0000006 | 0.00114 |
GPT-4o-mini-8k | 5600 | 2400 | 0.00000015 | 0.0000006 | 0.00228 |
GPT-4o-mini-16k | 12800 | 3200 | 0.00000015 | 0.0000006 | 0.00384 |
GPT-4o-mini-32k | 28000 | 4000 | 0.00000015 | 0.0000006 | 0.0066 |
GPT-4o-mini-64k | 60000 | 4000 | 0.00000015 | 0.0000006 | 0.0114 |
GPT-4o-1k | 800 | 200 | 0.0000025 | 0.00001 | 0.004 |
GPT-4o-2k | 1600 | 400 | 0.0000025 | 0.00001 | 0.008 |
GPT-4o-4k | 2800 | 1200 | 0.0000025 | 0.00001 | 0.019 |
GPT-4o-8k | 5600 | 2400 | 0.0000025 | 0.00001 | 0.038 |
GPT-4o-16k | 12800 | 3200 | 0.0000025 | 0.00001 | 0.064 |
GPT-4o-32k | 28000 | 4000 | 0.0000025 | 0.00001 | 0.11 |
GPT-4o-64k | 60000 | 4000 | 0.0000025 | 0.00001 | 0.19 |
GPT-4-1106-1k | 800 | 200 | 0.00001 | 0.00003 | 0.014 |
GPT-4-1106-2k | 1600 | 400 | 0.00001 | 0.00003 | 0.028 |
GPT-4-1106-4k | 2800 | 1200 | 0.00001 | 0.00003 | 0.064 |
GPT-4-0125-8k | 5600 | 2400 | 0.00001 | 0.00003 | 0.128 |
GPT-4-1106-16k | 12800 | 3200 | 0.00001 | 0.00003 | 0.224 |
GPT-4-1106-32k | 28000 | 4000 | 0.00001 | 0.00003 | 0.4 |
GPT-4-1106-64k | 60000 | 4000 | 0.00001 | 0.00003 | 0.72 |

Workflow | Average Estimated Input (tokens) | Average Estimated Output (tokens) | Cost / Input Token (USD) | Cost / Output Token (USD) | Estimated Cost per Run (USD) |
---|---|---|---|---|---|
AI Agent intent generation (gpt-4-1106-preview) | 600 | 450 | 0.00001 | 0.00003 | 0.0195 |
Query intent classification (gpt-3.5-turbo-1106) | 1000 | 50 | 0.000001 | 0.000002 | 0.0011 |
Variables extraction (gpt-3.5-turbo-1106) | 1000 | 100 | 0.000001 | 0.000002 | 0.0012 |
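
The totals in both tables appear to follow one rule: input tokens times the per-input-token price, plus output tokens times the per-output-token price. The sketch below reproduces a few rows under that assumption. The function name `cost_per_query` and the use of `Decimal` (to avoid floating-point noise at these magnitudes) are illustrative choices, not part of the source.

```python
from decimal import Decimal


def cost_per_query(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of one call: tokens times per-token price, summed.

    Prices are passed as strings so Decimal keeps them exact.
    """
    return input_tokens * Decimal(input_price) + output_tokens * Decimal(output_price)


# (label, input tokens, output tokens, USD per input token, USD per output token)
# Values are copied from rows of the tables above.
rows = [
    ("GPT-3.5", 2800, 1200, "0.0000005", "0.0000015"),
    ("GPT-4o-16k", 12800, 3200, "0.0000025", "0.00001"),
    ("Variables extraction", 1000, 100, "0.000001", "0.000002"),
]

for label, n_in, n_out, p_in, p_out in rows:
    total = cost_per_query(n_in, n_out, p_in, p_out)
    print(f"{label}: {total} USD")
```

Because output tokens are priced higher than input tokens for every model listed, the output reservation can account for a large share of the total even when it is much smaller than the input reservation.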