Here’s an example. If I want to search Getty Images, there is a search string that looks like this:
That looks very useable for an application like DevonAgent. The parameters are nicely named and there aren’t that many of them. But if I try to scrape or extract image elements from the resulting page, very little is returned. Go ahead, take a second to look at the source of that page. I’ll wait…
That forces me to deploy heavier artillery, like the Fake app or Python libraries that actually run a browser instance. Maybe that’s the point: if scraping is difficult, I will just browse the site and look at the ads.
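To make the problem concrete, here is a minimal sketch of what a plain HTML extractor sees. It uses Python’s built-in `html.parser` rather than BeautifulSoup so that it runs without installing anything. The two page sources are made-up stand-ins, not Getty’s actual markup, and the names `ImageCollector` and `extract_image_sources` are my own for illustration. The assumption is that a script-driven page ships an empty container and fills in the images later, which a static parser never gets to see.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag found in static HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_sources(html):
    """Return the src values of all <img> tags present in the markup."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


# Hypothetical source: an empty container that a script populates at runtime.
SCRIPT_DRIVEN_PAGE = """
<html><body>
  <div id="gallery"></div>
  <script>/* images are requested and inserted here at runtime */</script>
</body></html>
"""

# Hypothetical source with the images written directly into the markup.
STATIC_PAGE = """
<html><body>
  <img src="/thumbs/one.jpg" alt="one">
  <img src="/thumbs/two.jpg" alt="two">
</body></html>
"""

print(extract_image_sources(SCRIPT_DRIVEN_PAGE))  # []
print(extract_image_sources(STATIC_PAGE))  # ['/thumbs/one.jpg', '/thumbs/two.jpg']
```

The parser only reads the text the server sent. Anything a script adds afterward exists only inside a running browser, which is why the heavier tools become necessary.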
- I don’t want to dive into the application. It’s very complex and could take several hundred words just to describe what’s possible with DevonAgent. ↩
- There’s also the Selenium package for Python and other languages. Browser-driving tools like these are much more complex than something like BeautifulSoup, which is fast at extracting from an HTML or XML hierarchy. ↩