@jslack said:
@comictagger: It was necessary for e3, and we were seeing way too much abuse on comicvine with our API. CV Scraper, and other apps that share an API key are extremely aggressive, and don't do any kind of throttling when requests fail.
Your app should be storing data after retrieving data from the API, and each client should not be doing live requests to the API on every page load.
@jslack: re: extremely aggressive: Now that I have an error code for the "rate limit" failure, the upcoming release of CV Scraper will stop scraping as soon as the first rate-limit failure occurs. That should tone things down in the near future. Right now, many users are only discovering the new rate limit after they've come back from a scrape of a zillion comics and noticed that every one of them failed. I'll also see if I can do a better job of cancelling the entire operation whenever a series of failures occurs, so that CV Scraper doesn't keep requesting data when something is clearly wrong.
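The bail-out behavior described above can be sketched roughly like this. The status code constant and the `fetch` callback are hypothetical stand-ins, not the actual Comic Vine API values or CV Scraper internals:

```python
RATE_LIMIT_STATUS = 107       # hypothetical "rate limit exceeded" status code
MAX_CONSECUTIVE_FAILURES = 3  # hypothetical cancel threshold

def scrape_batch(items, fetch):
    """Process a batch, stopping at the first rate-limit response and
    cancelling after a run of consecutive failures of any kind.

    `fetch(item)` is assumed to return (status_code, data) or raise
    OSError on a network error."""
    results = {}
    failures = 0
    for item in items:
        try:
            status, data = fetch(item)
        except OSError:
            failures += 1
            if failures >= MAX_CONSECUTIVE_FAILURES:
                break  # something is clearly wrong; cancel the whole batch
            continue
        if status == RATE_LIMIT_STATUS:
            break  # no point hammering the API once we're throttled
        failures = 0
        results[item] = data
    return results
```

The key design point is that a rate-limit response aborts the entire batch immediately, rather than letting every remaining request fail one by one.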
CV Scraper definitely does cache each page--there are no "page loads" per se, since it is a batch updater, but it doesn't load any data a second time, unless the user explicitly "re-scrapes" a file at some point in the future.
Do you think it would be helpful if I throttled CV Scraper so that individual batches of scraped files are processed more slowly? I could easily put in a delay after every 5th or 6th request.
As I mentioned in my PM, the limit of 200 hits every 15 minutes is falling just under what many of my (not heavy) users need. Would you consider switching it to be 800 hits every 60 minutes, instead? That would allow more breathing room for most users, and I don't think it would lead to significantly more load on your servers.
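Both limits work out to the same average rate (200 x 4 = 800 per hour); the longer window just tolerates a bigger burst before cutting users off. If the server side tracks requests in a sliding window (an assumption; the actual mechanism isn't documented here), the difference is only the `limit` and `window` parameters:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window_seconds` interval."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.stamps = deque()  # timestamps of recent allowed requests

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False

# current policy: SlidingWindowLimiter(200, 15 * 60)
# proposed policy: SlidingWindowLimiter(800, 60 * 60)
```

Under either policy a steady client averaging one request every 4.5 seconds never gets throttled; the proposed one just stops penalizing a user who front-loads a few hundred requests in the first quarter hour.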