Building A Paid API Offering



In this post I am going to talk about how we built the backend infrastructure to support our Products API, Semantics3’s core product. It can also be read as a follow-up, or maybe even a reply, to a post Zemanta published some time back about building public APIs.

In that post, the author suggests using 3rd party API support vendors like 3Scale and Mashery to handle all API backend requirements. We seriously considered those two services and concluded that they weren’t appropriate for us because:

  • Our API is our core product offering and we didn’t feel comfortable outsourcing such a critical part to a 3rd-party service.
  • Those services aren’t cheap.
  • We are engineers. We love to build things (while keeping a wary eye on reinventing the wheel).


Our core product is our Products API, which gives developers immediate access to constantly updated data on millions of products. A sample query to the API may read: “return LCD televisions with price >= USD 400, with length >= 65 cm, of brand Samsung”.

One can think of an API as a utility service. A paid API is essentially the sale of some sort of utility, either on a usage-based model or a monthly subscription model. In our case, the utility we are selling is realtime access to product data.

In this post, I am going to focus mainly on the technical aspects of the administrative parts of running such a service. “Administrative parts” includes the delivery, the metering, the billing of customers, authentication of users, etc. How we built the actual utility service is a discussion for another day.

It took the three of us (Govind, Vinoth and me) three weeks to build the API support infrastructure. In terms of work allocation, Govind worked on the actual piping of the data, Vinoth worked on the billing and the frontend analytics dashboard, and I worked on the architecture, authentication, throttling and metering of customers.

So, what are some of the things you need to consider when building your own paid API offering?

0. Design Of The API

REST-based APIs have become the de facto standard, so you probably want to build yours on these principles. Beware: it’s very easy to design an un-RESTful API, so do spend plenty of time planning your endpoints. These resources from 3Scale helped us better understand the technicalities of designing a RESTful service. JSON has also become the most popular delivery format, so ensure that you support it.

1. The Language/Platform To Build The API

The language/platform you pick is of critical importance. You’d ideally want to choose something that can handle a large number of concurrent requests and scale well. Our original plan was to go with Perl, which is our predominant language of choice. However, after some investigation, it didn’t seem to be a great option (Plack unfortunately is lacking in good documentation). We evaluated Python (Tornado) and Node.js and eventually chose the latter, because Node.js has baked-in async functionality, native JSON support (duh) and a great community behind it. Our API is built using the really good Restify framework.

Unfortunately, code in Node.js doesn’t run in an async manner automatically. You often need to write it with (ugly, spaghetti-like) callbacks for that to happen. Debugging JavaScript is also not a very pleasant experience.

Two other languages that you may want to consider are Golang and Erlang. In hindsight, I probably would have picked Golang.

2. API Key Generation and User Management

You need a robust system for generating keys and secrets, since these credentials are used to authenticate the customers who access your service. The credentials have to be secure, unique and one-way (it must not be possible to recover the inputs from them). Our algorithm for generating keys takes user details and a cryptographically random number, runs them through some well-known hashing algorithms, and base64-encodes the output.
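
A minimal sketch of this idea, written in Python for brevity and not the actual implementation: the choice of SHA-256, the 32-byte nonce, the `key:`/`secret:` prefixes and the function name `generate_credentials` are all assumptions made for illustration.

```python
import base64
import hashlib
import secrets


def generate_credentials(user_id: str, user_email: str):
    """Return a hypothetical (api_key, api_secret) pair for one user."""
    # Fresh random bytes make the result unique and unguessable, even for
    # someone who knows the user details.
    nonce = secrets.token_bytes(32)
    material = f"{user_id}:{user_email}".encode("utf-8") + nonce

    # Hashing is one-way: the user details cannot be recovered from the output.
    # Distinct prefixes keep the key and the secret unrelated to each other.
    key_digest = hashlib.sha256(b"key:" + material).digest()
    secret_digest = hashlib.sha256(b"secret:" + material).digest()

    # URL-safe base64 without padding is convenient in headers and query strings.
    api_key = base64.urlsafe_b64encode(key_digest[:16]).rstrip(b"=").decode("ascii")
    api_secret = base64.urlsafe_b64encode(secret_digest).rstrip(b"=").decode("ascii")
    return api_key, api_secret
```

Because the nonce is drawn fresh on every call, generating credentials twice for the same user yields two unrelated pairs, which is what makes key rotation possible.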

3. Authentication

We decided to go with two-legged OAuth 1.0 as our primary authentication scheme. The other scheme we support is basic authentication (just send an HTTP request header with your key present), but it’s restricted to our test endpoints.

Since Node.js didn’t have a suitable OAuth server library, we ended up writing our own (we will open source it at some point in the future). We then had to test it with the popular OAuth client libraries for all the major scripting languages: Perl, Python, Ruby, PHP and Node.js.

4. Metering

This is the most critical part of a paid API offering. Metering is used to track exactly how many resources each customer has used. It’s also the starting point for debugging your system.

When designing your system, try to capture as much information as possible from each API query. For each request made to our API, we log 25 different parameters related to the call, giving us more than enough data to hunt down even the hairiest of bugs. This information can later come in handy for analyzing your customers’ usage patterns, e.g., how many requests are being made, and how frequently? Which resources are requested most often?

All API calls are logged on a MongoDB server. We then run map-reduce jobs to aggregate the number of calls made with each API key, to determine the daily usage of each of our customers.
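
The shape of that aggregation can be illustrated without a database. The following Python sketch runs the same map and reduce steps over an in-memory list; the record fields `api_key` and `ts` (a Unix timestamp) and the function names are assumptions for illustration, not the real log schema.

```python
from collections import Counter
from datetime import datetime, timezone


def map_call(record):
    """Map step: emit ((api_key, day), 1) for one logged call."""
    day = datetime.fromtimestamp(record["ts"], tz=timezone.utc).date().isoformat()
    return (record["api_key"], day), 1


def daily_usage(records):
    """Reduce step: sum the emitted counts per (api_key, day)."""
    usage = Counter()
    for record in records:
        group, count = map_call(record)
        usage[group] += count
    return usage


# Two calls by one key on the same UTC day, one call by another key a day later.
calls = [
    {"api_key": "alice", "ts": 0},
    {"api_key": "alice", "ts": 3600},
    {"api_key": "bob", "ts": 86400},
]
usage = daily_usage(calls)
```

Grouping by UTC day is a design choice; customers in other time zones may expect their own day boundaries on invoices.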

5. Throttling API usage

API throttling is critical, because you don’t want your service (especially your free plan) to, bluntly put, become a free-for-all unlimited buffet. We use the leaky bucket algorithm to throttle API requests based on the tier of the plan. For example, our free plan is capped at 1000 calls for any given 24-hour period, counted from the time the first call was made. We use a patched version of the implementation of this algorithm that ships with Restify, the API framework we use.

Here is a great Stack Overflow thread that discusses request throttling.
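
As a rough illustration of the leaky bucket idea (a Python sketch, not the Restify code), each API key gets a bucket that fills by one unit per call and drains at a steady rate; a call is rejected when it would overflow the bucket. The class name, the injectable `clock` and the numbers used in the tier mapping are assumptions.

```python
import time


class LeakyBucket:
    """Leaky bucket used as a meter: calls fill it, time drains it."""

    def __init__(self, capacity, leak_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.leak_per_second = leak_per_second
        self.clock = clock  # injectable so the bucket can be tested
        self.level = 0.0
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Drain whatever leaked out since the previous call.
        leaked = (now - self.last) * self.leak_per_second
        self.level = max(0.0, self.level - leaked)
        self.last = now
        if self.level + cost > self.capacity:
            return False  # bucket would overflow: throttle this call
        self.level += cost
        return True


# One assumed mapping for a free tier: bursts of up to 1000 calls,
# draining at 1000 calls per 24 hours.
free_tier = LeakyBucket(capacity=1000, leak_per_second=1000 / 86400)
```

Note that this mapping permits a burst followed by a steady trickle, so it approximates a daily cap rather than enforcing one exactly; a strict "1000 calls per 24-hour window" rule would need a windowed counter instead.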

6. Billing of Customers

We are based in Singapore, and hence have no access to Stripe :( . As a result we ended up choosing the not-so-easy-to-integrate PayPal API, which took Vinoth a good week to integrate, what with the PayPal API’s creaky URL-callback system, poor documentation and buggy sandbox environment. It’s quite amusing that PayPal doesn’t even provide a REST API!

We support two types of payments. One is a monthly subscription (get a fixed number of calls per month) and the other is a bulk call purchase (purchase X number of calls at Y dollars).
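
One way the two payment types could be tracked side by side is sketched below in Python. This is an illustration under stated assumptions, not the actual billing code: the `Account` class, its field names, the rule that subscription calls are consumed before purchased ones, and the rule that purchased calls never expire are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Account:
    monthly_quota: int = 0    # calls included in the subscription each month
    used_this_month: int = 0  # subscription calls consumed so far
    prepaid_calls: int = 0    # calls bought in bulk (assumed not to expire)

    def buy_calls(self, count: int) -> None:
        """Record a bulk purchase of `count` calls."""
        self.prepaid_calls += count

    def charge_call(self) -> bool:
        """Consume one call; return False when nothing is left to spend."""
        if self.used_this_month < self.monthly_quota:
            self.used_this_month += 1
            return True
        if self.prepaid_calls > 0:
            self.prepaid_calls -= 1
            return True
        return False

    def start_new_month(self) -> None:
        """Reset the subscription counter at the start of a billing cycle."""
        self.used_this_month = 0
```

Keeping the two balances separate means a subscription renewal never touches calls the customer has already paid for in bulk.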

7. Dashboard/Analytics Platform for Customers

Finally, you want to build a visual front-end so that your customers can track and monitor their usage.

Our analytics dashboard allows users to view the total number of calls they made on each day of their chosen date range. We also display the last 100 calls made on the last day of the chosen date range.

We used client-side rendering to build the dashboard, using the excellent ICanHaz.js library (which comes with built-in Mustache template support). Client-side rendering is a great strategy when building dashboards, because it makes work division between frontend and backend devs really convenient. More importantly, it allows changes and new features to be introduced more easily, as the various aspects of the code (front-end display and back-end data generation) are clearly demarcated.

The JavaScript library we used for rendering the graphs (bar graph and pie chart) was jQuery Flot. jQuery DataTables was used to display the table of calls. Finally, we used the Bootstrap datepicker library to let users select their date range.

Conclusion

Building a paid API offering involves several considerations that need to be planned thoroughly. I hope this blog post serves as a simple guide for those looking to build something similar for their own startups. That said, if your API is just a non-critical add-on service and not your core offering, using a third party service like 3Scale or Mashery may be a much better choice.

On a side note, if our current idea doesn’t take off, we may set up a Mashery competitor (just kidding ;) ).

PS: We just launched a closed beta of our API. We would be most glad if you could give it a try. Here is the signup link (it comes preloaded with the invitation code). Don’t hesitate to share it with your developer friends.

Sivamani Varun

Recent San Francisco transplant from Singapore. Less recent graduate from the National University of Singapore. One-time hardware engineer. Now, a recovering Perl hacker. Part-time business guy at Semantics3.


  • http://www.3scale.net/ Guillaume

    Hello Sivamani,

    Interesting post :-) And thank you for mentioning 3scale.

    However I strongly disagree with some of your statements regarding 3scale and the use of an external API Management platform like 3scale or Mashery.

    Let me go through them:

    a. Our API is our core product offering and we didn’t feel comfortable outsourcing such a critical part to a 3rd-party service
    3scale has tons of customers for whom their API is core and who use 3scale to benefit of all these elements (and more actually!) you mention:

      • API Key generation and user management
      • Authentication
      • Metering customer usage
      • Throttling API usage
      • Billing of customers
      • Dashboard/Analytics platform for customers

    The reason why 3scale is absolutely compatible with an API being core to a company is that 3scale unique architecture doesn’t take over the control of your API: 3scale doesn’t host the API, doesn’t filter the traffic to your API and offer you tons of APIs to integrate 3scale the way you want and keep absolute control on your infrastructure stack.

    b. Those services aren’t cheap

    With a service starting at a few thousands dollars per month I agree Mashery is clearly an expensive solution for startups and medium sized businesses. But 3scale expensive?? Would you have missed the Free (for ever) version of our platform? Would you have missed the evolutive pricing model that is just around $200 per month (cf. http://www.3scale.net/pricing)?

    c. We are engineers. We love to build things

    Sure! And that’s awesome! Actually I’d love to be like you guys and to be able to build things. Unfortunately I am just an industrial engineer and my programming skills are rather limited. 
    However, what I am sure of is that you have very likely spent tons of time (how many weeks and how many persons?) building something that is actually not core to your business: generating access token, analytics, etc…
    Plus in the scenario you would have been using 3scale, let’s say the $125/month plan that includes monetization I am pretty sure you would have saved a lot of money and that you have had a tremendous opportunity cost (http://en.wikipedia.org/wiki/Opportunity_cost).

    So my conclusion is: don’t spend time re-inventing the wheel specially when there are Free or ultra economic solutions like 3scale API Management platform that can save you time, money, energy and help you focus on your core business or techno ;-)

    • http://twitter.com/govind201 Govind Chandrasekhar

      Hey Guillaume,

      Thanks for your post.

      Yes, I agree with your point that 3Scale is a more viable option than Mashery and other players in the market. And we do appreciate that you have a transparent pricing model out – we’d found it quite frustrating that a lot of the other companies out there had a rather difficult to locate pricing page. However, both in terms of cost and control, an in-house solution seemed to make more sense. 

      Focusing on cost in particular, if you take a look at our pricing page, you’ll see that our beta customers can get 100K API calls/day for $299/month. Were we to use 3scale’s pro plan ($750/month) to support 30 customers, then our expenses per month will be ~8.3% of our revenue (please correct me if I’m interpreting 3scale’s pricing page wrong). 8.3% is a significant percentage, especially since we’re looking at the long haul, i.e., several years hopefully. 

      Building all of this took us 3 weeks * 3 people = 9 weeks of work. Seeing as we’re aiming for high volume traffic, the numbers just didn’t add up while considering third-party services.

      • http://www.3scale.net/ Guillaume

        Thank you for considering 3scale as a more viable option than Mashery ;-)

        Thank you also for sharing the specifics of the implementation (resources and cost).

        Doing some back-of-the-envelope calculation and assuming that a good engineer costs $100K in the valley, the implementation of your in-house solution cost you $16,000. That is 22 months of 3scale PRO solution at $750!

        And one thing that would need to be considered in your cost structure is the maintenance of your in-house solution. My assumption is that a minimum of 3 days a month for 1 person should be ok: that is ($100K / 12 months / 21 days) * 3 days = $1,190 / month for maintenance.

        So cost of implementation and cost of maintenance together, you could have been using 3scale PRO up to 10 Million calls at $1,750/month (without discount ;-) during 22 months and still save $200 per month ;-)

        Oh, and I think I forgot the resources you will need for upgrades, improvements, changes, new features, etc.

        • http://twitter.com/govind201 Govind Chandrasekhar

          That salary estimate doesn’t quite apply to a bootstrapped start-up, living off its own revenue. That’s setting aside the fact that we’ve had great interns (interns have a different pay structure). Moreover, it’s difficult to put a number on the learning that we concurrently gained (technical + market insights), team dynamic developed (which might otherwise take time to achieve) and so on.

          With regards to maintenance and upgrades, we haven’t faced too many problems yet, but I must admit, we haven’t been around for long enough to put a definitive number on that.

          I know our discussions here so far have only been about cost … but I guess Varun’s point in this post is that there isn’t a one-size fits all. For those startups looking to build their own solutions, some of the guidelines mentioned here could prove to be useful.

        • Anon

          Using 3scale isn’t free. A developer also has to consider the costs of integrating 3scale and managing that relationship, and the risks associated with depending on 3scale.

  • GUSTAVO

    3scale provides a hybrid, full-featured API Management solution: API traffic management on-premise, with API administration, traffic reports and the developer portal cloud-based.

    http://www.3scale.net/api-management/