If anyone would like to hack his scraper, commenting out a single line of code makes the search work much better. It seems that putting an " AND " between each search term is what causes the problems.
I commented out line 66 in C:\Users\<your username>\AppData\Roaming\cYo\ComicRack\Scripts\Comic Vine Scraper\cvconnection.py:
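I don't have the file open right now, so the snippet below is only a rough sketch from memory (the variable name is made up, and your line 66 may read differently), but the line in question joins the search terms with " AND ", and prefixing it with '#' leaves the terms space-separated:

    # sketch only -- names are hypothetical, the real line 66 may differ
    #search_terms_s = " AND ".join(search_terms_s.split())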
It doesn't seem to find half of the stuff at the moment, for example plenty of Spider-Man books (Amazing Spider-Man Epic Collection, etc.). When I enter the same search query on the website, I find those books without any problems.
So as a workaround I have to come here, retry the search on the website and then paste the resulting ID into the CV Scraper search dialog to find the entry.
I am experiencing something much worse. Whenever I use the ComicVine scraper I can't use the site anymore. It seems like my IP address gets blacklisted on the whole CV site immediately: nothing else works for me anymore, and I keep getting timeouts whenever I try to load the page or the forum. If I change my IP, the site works again until I try to scrape a comic book, then it stops working again. Really weird.
Limiting on the server side is doable; constantly asking the API users to change their software just makes no sense. Next week you are going to tell them that the delay should now be 2 seconds, or that something else needs to be done. Taking matters into your own hands keeps it tweakable. You wouldn't tell website visitors to only click on 3 links per minute, would you? Same thing. Limit or queue it on your side.
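Just to illustrate what I mean by limiting on your side, here is a minimal sketch (the names and the one-request-per-second figure are my own assumptions, not your actual stack) of a per-key throttle an API endpoint could apply before doing any work:

    import time
    from collections import defaultdict

    # sketch only: per-API-key throttle, assuming one request per second is the target
    MIN_INTERVAL_S = 1.0
    _last_request = defaultdict(float)  # api_key -> timestamp of the last accepted request

    def allow_request(api_key):
        """Return True if this key may be served now, False if it should be rejected or queued."""
        now = time.time()
        if now - _last_request[api_key] < MIN_INTERVAL_S:
            return False  # too soon: reject (or queue) instead of serving
        _last_request[api_key] = now
        return True

    if __name__ == "__main__":
        # a burst of five calls from the same key only gets one through
        print([allow_request("demo-key") for _ in range(5)])  # [True, False, False, False, False]

Rejected calls could get a 429 with a Retry-After header, or go into a queue, and the interval can be tuned on your side whenever you like without anyone having to update their scraper.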
What do you expect to gain from the 1s delay? It is going to take us longer to get our books tagged. Hence more users will be online at the same time which--again--is going to increase your load. This isn't a solution, you are just shifting the problem. You need a good strategy and not trial-and-error development. Load & performance tests don't hurt.