OpenAI's credits model in <100 lines of code

Kshitij Grover

AI platforms like OpenAI, Anthropic, and Perplexity deal with a ton of abuse and fraud. In a competitive market with costs that scale linearly with usage, even a small percentage of users using an uncollectible payment method to get access to "free compute" can quickly become a risk to business viability.

Although the traditional usage-based motion for infrastructure providers involves paying at the end of the month for incurred charges (e.g. AWS), OpenAI requires you to load dollars up front, which are then drawn down as you call the API. Optionally, the system automatically tops up the balance to a target amount as soon as it dips below a configured threshold. You might recognize this prepaid behavior as Twilio’s from over a decade ago – unsurprising given the amount of fraud there is in telecom.

Implementing this billing model with a system like Stripe Billing can be incredibly painful. Billing systems designed for seats and licenses don’t have the abstractions to deal with a credit balance, aren’t built to be reactive enough to issue alerting or webhooks, and certainly don’t correctly track this revenue in their accounting reports.

Implementing OpenAI’s credits and top-ups with Orb

Let’s walk through how Orb allows you to implement this model with fewer than 100 lines of code. Orb features prepaid credits as a native abstraction.

There are a few items we’ll want to handle using the Orb SDK:

  1. Incrementing credits and issuing an invoice when a user loads credits ad hoc
  2. Configuring usage to automatically draw down from the user’s credit balance
  3. Top-up behavior to ensure that credits are reloaded appropriately
  4. Displaying the user’s remaining credits

In order to handle your user buying credits, simply make an increment call with the desired amount. Note that Orb allows you to pass a cost basis, which determines the accounting treatment for this revenue.


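from flask import Flask, jsonify, request

app = Flask(__name__)
# `orb` is assumed to be an already-initialized Orb SDK client.

@app.route("/increment-credits", methods=["POST"])
def increment_credits():
    # Flask exposes the parsed JSON body on the request object
    request_body = request.get_json()
    customer_id = request_body["customer_id"]
    credit_amount = request_body["credit_amount"]

    # Add credits to the customer's ledger and invoice them for the purchase
    orb.credit.create(
        customer_id,
        {
            "entry_type": "increment",
            "amount": credit_amount,
            "per_unit_cost_basis": "1",
            "description": "Credits purchase",
            "invoice_settings": {
                "auto_collection": True,
                "net_terms": 30,
            },
        },
    )
    return jsonify({"status": "ok"})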

Orb will automatically deduct credits from your customer’s current balance as soon as usage is sent to the system. Importantly, you don’t tell Orb how many credits to deduct each time. Instead, properties of the event can be used to compute a “usage quantity”, which can then be translated into the final cost factoring in tiers or a free allocation. For example, you might build the following event structure:


{
  "event_name": "inference_call",
  "timestamp": "2024-01-01T00:01:00Z",
  "properties": {
    "model_type": "gpt-4-1106-preview",
    "num_input_tokens": 8201,
    "num_output_tokens": 29102
  },
  "external_customer_id": "customer-123",
  "idempotency_key": "..."
}

Orb is the only billing system that gives you the full power of SQL when aggregating these events (e.g. into a sum of “total input tokens”).
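
To make the deduction flow concrete, here is a minimal sketch of how events like the one above might be turned into a usage quantity and then into a cost. This is not Orb's implementation: the aggregation is written in plain Python rather than SQL, and the free allocation, the `price_per_1k_tokens` rate, and both function names are hypothetical values chosen for illustration.

```python
from decimal import Decimal

def total_input_tokens(events, customer_id):
    """Sum num_input_tokens across one customer's inference_call events."""
    return sum(
        e["properties"]["num_input_tokens"]
        for e in events
        if e["event_name"] == "inference_call"
        and e["external_customer_id"] == customer_id
    )

def cost_in_credits(quantity, free_allocation, price_per_1k_tokens):
    """Price only the usage beyond the free allocation (hypothetical flat rate)."""
    billable = max(quantity - free_allocation, 0)
    return (Decimal(billable) / 1000) * price_per_1k_tokens

events = [
    {
        "event_name": "inference_call",
        "external_customer_id": "customer-123",
        "properties": {"num_input_tokens": 8201, "num_output_tokens": 29102},
    },
    {
        "event_name": "inference_call",
        "external_customer_id": "customer-123",
        "properties": {"num_input_tokens": 1799, "num_output_tokens": 512},
    },
]

quantity = total_input_tokens(events, "customer-123")
cost = cost_in_credits(
    quantity, free_allocation=2000, price_per_1k_tokens=Decimal("0.01")
)
```

The point of the split is that the event carries only raw facts (token counts), while pricing decisions such as the free allocation live in configuration and can change without touching the code that emits events.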

It’s also very easy to configure the top-up behavior. Simply listen to Orb’s customer.credit_balance_dropped webhook – configured for the customer at the desired threshold – and increment credits in response.


from flask import request

@app.route("/webhook-handler", methods=["POST"])
def handle_webhook():
    # ...Confirm Signature...
    webhook_event = request.get_json()
    if webhook_event["type"] == "customer.credit_balance_dropped":
        customer_id = webhook_event["customer"]["id"]
        # Reload a fixed 50 credits and invoice the customer for them
        orb.credit.create(
            customer_id,
            {
                "entry_type": "increment",
                "amount": 50,
                "per_unit_cost_basis": "1",
                "description": "Credit balance top-up",
                "invoice_settings": {
                    "auto_collection": True,
                    "net_terms": 30,
                },
            },
        )
    # Acknowledge receipt of the webhook
    return "", 200
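
The handler above reloads a fixed 50 credits. The OpenAI behavior described earlier is slightly different: it restores the balance to a target amount. A minimal sketch of that calculation follows. The function name `top_up_amount` is hypothetical, and how you obtain the current balance (from the webhook payload or a separate fetch) is left open here.

```python
from decimal import Decimal

def top_up_amount(balance, threshold, target):
    """Return the credits to purchase so the balance returns to `target`.

    Returns 0 while the balance is still at or above `threshold`.
    """
    if target < threshold:
        raise ValueError("target must be at least the threshold")
    if balance >= threshold:
        return Decimal("0")
    return target - balance

# A balance of 4.25 has dipped below the threshold of 5, so reload to 50.
amount = top_up_amount(Decimal("4.25"), Decimal("5"), Decimal("50"))
```

The result would then be passed as the "amount" of the increment in place of the fixed 50.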

Note that Orb can also send you a webhook when the credit balance is entirely depleted, so you can simply disable programmatic access to your product and send an email indicating that credits have run out.

Finally, in your portal, you’ll display the remaining usage credits by fetching this from the Orb API. Orb allows you to split credits into blocks, which could each be effective or expire at a different time. It only draws down usage from blocks eligible at the logical timestamp of usage, which is important when handling late-arriving events.


from flask import jsonify

@app.route("/credit-balance/<customer_id>", methods=["GET"])
def fetch_credit_balance(customer_id):
    res = orb.credit.fetch(customer_id)

    # res.credits is a list of the unexpired, non-zero credit blocks for the customer
    total_balance = sum(block.balance for block in res.credits)
    return jsonify({"total_balance": total_balance})
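
To illustrate why block eligibility matters for late-arriving events, here is a rough sketch of drawing down usage only from blocks that were active at the usage timestamp. It is a simplified model, not Orb's ledger: the `CreditBlock` class, the `draw_down` function, and the "soonest-expiring block first" ordering are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional

@dataclass
class CreditBlock:
    balance: Decimal
    effective_at: datetime
    expires_at: Optional[datetime] = None  # None means the block never expires

def is_eligible(block, usage_time):
    """A block is eligible if usage_time falls in [effective_at, expires_at)."""
    if usage_time < block.effective_at:
        return False
    return block.expires_at is None or usage_time < block.expires_at

def draw_down(blocks, amount, usage_time):
    """Deduct `amount` from eligible blocks; return whatever could not be covered."""
    eligible = [b for b in blocks if b.balance > 0 and is_eligible(b, usage_time)]
    # Assumed ordering: soonest-expiring blocks first, never-expiring blocks last.
    eligible.sort(key=lambda b: (b.expires_at is None, b.expires_at or datetime.max))
    remaining = amount
    for block in eligible:
        if remaining <= 0:
            break
        used = min(block.balance, remaining)
        block.balance -= used
        remaining -= used
    return remaining

january = CreditBlock(Decimal("10"), datetime(2024, 1, 1), datetime(2024, 2, 1))
february = CreditBlock(Decimal("10"), datetime(2024, 2, 1))

# An event that arrives late but happened on January 31 draws from the January block.
uncovered = draw_down([january, february], Decimal("4"), datetime(2024, 1, 31))
```

Because eligibility is keyed on when the usage happened rather than when the event arrived, the February block is left untouched in this example.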

Beyond the OpenAI model

The OpenAI model of credit purchases and reloads is a simple starting point, but it’s important that your billing system is flexible enough to handle a broad range of scenarios.

For example, Perplexity Pro costs $20/month and comes with $5 of credits each month – these are automatically added to your credit balance and continually accrue.

Orb features the concept of an allocation that is tied to your user’s subscription. Allocations are built on top of the credits ledger, and make it easy to set up recurring credit purchases. Allocations ensure that credits become effective and expire at the correct time and that you get the proper cancellation or change behavior when the subscription state changes. Without them, your integration would have to increment credits at the end of each billing cycle and check whether the subscription is still active; with allocations, Orb handles that for you.

When using Orb, you get a billing system that’s built for these prepaid models, so you’ll also have access to multiple credit units for isolated balances, support for prioritizing specific credit blocks, and more. Most importantly, these concepts are built with revenue reporting in mind, so the finance team has all the data they need to close the books each month.

Fraud: other strategies

Prepaid credits are just one strategy to combat fraud. It’s also common to use different triggers that effectively cap your credit risk. For example, Orb is the only billing system that supports threshold billing on real-time usage, triggering an invoice as soon as a dollar cost is hit on a subscription.

Suppose you set a $50 cost threshold. Self-serve customers can start without any purchase, and your platform can shut down access if the incremental $50 invoice isn’t paid after multiple attempts. This is built on top of Orb’s alerts, and ties in seamlessly with the billing engine. As you build a relationship with customers, you can extend them larger thresholds.
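
As a rough illustration of the threshold mechanic, the sketch below accumulates usage cost and issues an invoice the moment the uninvoiced total reaches the threshold. The `ThresholdBiller` class is hypothetical and is not how Orb is implemented. It also assumes the invoice covers the full accrued amount rather than exactly the threshold, which the description above does not specify.

```python
from decimal import Decimal

class ThresholdBiller:
    """Toy model: invoice accrued usage once it reaches a cost threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.uninvoiced = Decimal("0")
        self.invoices = []

    def record_usage(self, cost):
        """Add usage cost; return True if this usage triggered an invoice."""
        self.uninvoiced += cost
        if self.uninvoiced >= self.threshold:
            # Assumption: invoice everything accrued so far, then start over.
            self.invoices.append(self.uninvoiced)
            self.uninvoiced = Decimal("0")
            return True
        return False

biller = ThresholdBiller(threshold=Decimal("50"))
triggered = [biller.record_usage(Decimal(c)) for c in ("20", "25", "10")]
```

In this model the outstanding exposure stays below the threshold plus one usage event, which is what caps the credit risk for a customer who never pays.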

Billing built for evolution, not stasis

Billing is ultimately a reflection of both the value your product provides and the market dynamics of your industry. An industry like AI – growing at breakneck pace – demands business systems that can keep up from the perspective of scale, reliability, and flexibility in the face of new market pressures. Orb can help you get back to building, not billing.

Posted: January 16, 2024
Category: Guide
