Development of Brand New Screenshot Capture Generators

puravida
Jedi Warrior
Joined: 09/01/2007

Hello,

Over the past few months, we have noticed an increase in failed or incomplete captures for working web pages. These failures occur on a variety of web pages but seem to occur most frequently on pages secured by Comodo CA SSL certificates (an old problem that was resolved but has returned) and on pages using HTML5 embedded video.

In our extensive testing of the current capture generators, we have been unable to identify the cause of the failures or find any way to overcome them. Therefore, we have resorted to a complete redesign and rebuild from scratch. As of yesterday, September 15th, 2015, we have completed a new build that overcomes all known limitations and, so far, has a 100% success rate in capturing web pages. It also captures roughly 100%-150% faster than the current capture generators.

We will be working diligently to bring the new capture generators into our generator pool as soon as we have fully tested them and built out a deployment script for them. There will be no downtime as a result of this migration. This notice is just to let you know that we are aware of the increasing number of failures, which may or may not have affected you, and that a much improved system is coming in the near future.

As always, we value you as a loyal ShrinkTheWeb user!

Best regards,

Brandon

puravida

The new capture process has been very successful. It resolved a lot of lingering issues, as well as newer issues that had shown up for no discernible reason. The issues we noted were beyond our control, so we felt it necessary to start over.

For the past month, we have been running on about 15% of our normal capacity, using only the new capture generator technology. It was enough to keep up with demand, so we used the opportunity to provide more load to the new generators for testing and monitoring.

As of today, all capture generators have been updated and launched.

The only error we found with the new capture process, which may also have been present in the old process but gone undetected, is that a number of URLs were failing with a blank image captured. This was happening on very slow-loading sites that, for some unknown reason, take even longer to load from our primary hosting datacenters. It may be something specific to the network of our primary hosting company, but they refuse to admit there is a problem because the delay only occurs in our capture process (ping, curl, traceroute, etc. all reach the sites very quickly). However, the same sites load quickly from our secondary hosting datacenters, which tells us that our process is not at fault. Perhaps there is some conflict between the primary datacenter and our process, but I have no clue where to even start looking for that possible culprit.

In any case, we have no control over this issue, so we resorted to extending the timeout to allow more time for those types of sites to load. This ensures a proper capture instead of the BLANK_DETECTED error. We identified 250,000 BLANK_DETECTED captures in our cache, so I have set them all to retry in case a good many are valid sites that suffered from this issue.
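
For readers curious what "set them all to retry" with a longer timeout could look like, here is a minimal sketch. It is not ShrinkTheWeb's actual code: the `CacheEntry` record, the `is_blank` check, the `flag_blank_for_retry` helper, and the 30 and 90 second timeout values are all hypothetical and chosen only for illustration. The only name taken from the post is the BLANK_DETECTED status.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Status name taken from the post; everything else here is assumed.
BLANK_DETECTED = "BLANK_DETECTED"


@dataclass
class CacheEntry:
    """Hypothetical cache record for one captured URL."""
    url: str
    status: str
    retry: bool = False
    timeout_s: float = 30.0  # assumed default timeout, in seconds


def is_blank(pixels: Iterable[int], tolerance: int = 2) -> bool:
    """Treat a capture as blank when all grayscale values (0-255)
    fall within a narrow band, i.e. the image is one flat color."""
    values = list(pixels)
    if not values:
        return True  # nothing rendered at all
    return max(values) - min(values) <= tolerance


def flag_blank_for_retry(entries: List[CacheEntry],
                         extended_timeout_s: float = 90.0) -> int:
    """Mark every BLANK_DETECTED entry for retry with a longer timeout.

    Returns the number of entries that were flagged.
    """
    flagged = 0
    for entry in entries:
        if entry.status == BLANK_DETECTED:
            entry.retry = True
            # Never shorten a timeout that is already longer.
            entry.timeout_s = max(entry.timeout_s, extended_timeout_s)
            flagged += 1
    return flagged
```

In this sketch, a slow site that previously produced a flat, single-color image would be re-queued with the longer timeout, while entries with any other status are left untouched.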

Aside from that, our capture success rate should be at an all-time high, and throughput per generator appears to have increased as well. Under heavy load, the capture generators perform "as fast" as the old process. Under light load, they perform 200-300% faster.
