In July 2016, Cloudflare integrated with Eager, an apps platform. During this integration, we made several decisions to ensure an optimal experience installing apps. We wanted site owners on Cloudflare to be able to customize and install an app in as few clicks as possible. Customizability often adds complexity and clicks for the user, and we’ve been tinkering to find the right balance between user control and simplicity ever since.
When installing an app, a site owner must select where (that is, on which URLs of their site) each app should be installed. Our original plan for selecting those URLs took a few twists and turns. In the end, we decided to use our Always Online crawler to pre-populate a tree of the user’s site. Always Online is a feature that crawls Cloudflare sites and serves pages from our cache if the site goes down.
The benefits of this original setup were:
1. Only valid pages appear
An app can only be installed on HTML pages. For example, since injecting JavaScript into a JPEG image isn’t possible, we prevented the installer from trying it by not showing that path. Blocking that kind of invalid installation up front spares the user from confusion later when it doesn’t work.
2. The user was not required to know any URL of their site
The URLs are available right there in the UI. With the click of a check mark, the user did not have to type a thing.
The disadvantage of this setup was its dependency on the Always Online crawler.
First off, some users do not wish to have Always Online turned on. Without the site owner’s consent to crawl the site via Always Online, the page tree would not load, and the user had no pages to choose from when installing an app.
When a user does have Always Online enabled properly, the crawler might not crawl every page the site owner wishes to install an app on.
The duty of Always Online is to make sure that in the most catastrophic event for a site owner, their site being down, users can still see a version of the site via cached static HTML. Before Always Online v2, we actually used the activity of Googlebot and other search engine crawlers to decide what to cache for the Always Online feature. We found that implementing our own crawler made more sense. Our goal is to make sure the most vital pages of a site are crawled and stored in our cache, whereas a search engine crawler’s priority is to get as much information as possible from the site, going “deep” into the site map.
The duty of an app install on Cloudflare’s Apps platform is to seamlessly enable users to select the pages into which to inject JavaScript, HTML, CSS, and, in the near future, Cloudflare Service Workers. Since the objectives of the Always Online crawler differ from those of the Cloudflare Apps platform, there were inevitable consequences. Here are some examples where a page would not be crawled:
The page’s subdomain was not "orange-clouded".
The page was not accessible from the site's homepage via links.
The site’s homepage had too many links for us to follow.
The page was password-protected, preventing us from accessing it and adding it to the site map.
The page was added before we had a chance to crawl the site.
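To make a couple of these cases concrete, here is a toy model of a breadth-first crawl that starts at the homepage and follows only a limited number of links per page. It is not the Always Online crawler; the `crawl` function, its `max_links_per_page` and `max_pages` limits, and the example paths are all hypothetical, chosen only to show how an unlinked page or a page beyond the link limit ends up missing from the tree.

```python
from collections import deque


def crawl(links, home, max_links_per_page, max_pages):
    """Breadth-first crawl over an in-memory link graph (illustrative only).

    links maps a page path to the list of paths it links to.
    Returns the pages discovered, in the order they were found.
    """
    found = [home]
    seen = {home}
    queue = deque([home])
    while queue and len(found) < max_pages:
        page = queue.popleft()
        # Follow only the first few links on each page (hypothetical budget).
        for target in links.get(page, [])[:max_links_per_page]:
            if target not in seen and len(found) < max_pages:
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found


# A made-up site: "/landing" exists but nothing links to it.
site_links = {
    "/": ["/about", "/blog", "/contact"],
    "/blog": ["/blog/post-1"],
}

crawled = crawl(site_links, "/", max_links_per_page=2, max_pages=10)
print(crawled)
```

In this sketch, `/contact` is skipped because the homepage exceeds the per-page link limit, and `/landing` is never reached because no crawled page links to it. Neither page would be offered in a tree built from this crawl.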
Although our custom crawler works well for the Always Online feature, it limited control for our customers who were installing apps. We decided to do something about it. By combining the crawler data we already had with the ability to enter any URL during an install, we created the best of both worlds.
Now, site owners can type in whatever URL they wish to install an app on. There is also an option to select either an entire directory or strictly that one page. For simplicity, no regex patterns are supported.
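As a rough illustration of how "this page only" versus "entire directory" matching can work without regex, here is a minimal sketch. It is an assumption about one reasonable way to do it, not the platform's actual matching logic; the `matches` and `normalize_path` helpers and the `example.com` URLs are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_path(path):
    """Drop trailing slashes so '/blog/' and '/blog' compare equal; root stays '/'."""
    return path.rstrip("/") or "/"


def matches(rule_url, include_directory, page_url):
    """Return True if page_url is covered by the install rule.

    rule_url is the URL the site owner typed in. When include_directory is
    False, only that exact page matches; when True, everything underneath
    that path matches as well. Plain string comparison, no regex.
    """
    rule = urlsplit(rule_url)
    page = urlsplit(page_url)
    if rule.netloc.lower() != page.netloc.lower():
        return False
    rule_path = normalize_path(rule.path)
    page_path = normalize_path(page.path)
    if page_path == rule_path:
        return True
    if not include_directory:
        return False
    # Compare against "<path>/" so that "/blog" does not match "/blogroll".
    prefix = rule_path if rule_path == "/" else rule_path + "/"
    return page_path.startswith(prefix)


print(matches("https://example.com/blog", False, "https://example.com/blog/post-1"))
print(matches("https://example.com/blog", True, "https://example.com/blog/post-1"))
```

Matching on a path prefix that ends at a slash keeps the rule predictable for the person typing it: selecting the `/blog` directory covers `/blog/post-1` but not an unrelated page such as `/blogroll`.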
As the apps on the Cloudflare Apps platform advance, it is vital that the platform itself advance too. In the near future, the Apps platform will have the power of Cloudflare Workers, local testing, and much more.