Background of Multi-site Setup
It is pretty easy to set up Sitecore for a single-instance, multi-site environment. Sitecore is robust enough to handle hundreds or even thousands of websites within the same content tree. All you have to do is specify a different node as the homepage of each website. By default, the Home node is the main website and this is defined in the web.config <sites> section.
The format of site config entries is straightforward. Give it a name, a path, and the start item (the homepage item). To add a new site, all you have to do is add a new <site> entry and save the file. If your website is running in Live Mode (reading from the master database instead of web), you also have to add a new entry to the LiveMode.config file.
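To make the entry format concrete, here is a small Python sketch that parses a sample <sites> fragment and lists what each entry points to. The attribute names (name, hostName, rootPath, startItem, database) follow the stock <site> entry; the "microsite" entry and its hostname are made up for illustration, and real entries carry more attributes than shown here.

```python
import xml.etree.ElementTree as ET

# Sample <sites> fragment. The "microsite" entry is hypothetical;
# "website" stands in for the default site definition.
SITES_XML = """
<sites>
  <site name="microsite" hostName="micro.example.com"
        rootPath="/sitecore/content" startItem="/Microsite" database="web" />
  <site name="website"
        rootPath="/sitecore/content" startItem="/home" database="web" />
</sites>
"""

def list_sites(xml_text):
    """Return (name, hostname, full start path) for each <site> entry, in order."""
    root = ET.fromstring(xml_text)
    sites = []
    for site in root.findall("site"):
        # The homepage item is the root path plus the start item.
        start_path = site.get("rootPath", "") + site.get("startItem", "")
        sites.append((site.get("name"), site.get("hostName", ""), start_path))
    return sites

for name, host, start in list_sites(SITES_XML):
    print(name, host or "(any host)", start)
```

Note that the default entry has no hostname at all, which is why it can serve as the fallback for any request.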
This is the traditional out-of-the-box way to add a new site. It is fine for a few sites, but what if we have hundreds? Technically, we could repeat this process, but the app pool has to be recycled each time a new site is added because config changes require an app pool restart. That means slow response times during the restart, even if it only lasts a few seconds.
So how do we handle this in an elegant manner? It's not that the traditional way is difficult but it requires developer intervention. There are downloadable modules from the marketplace but there isn't a lean and simple solution.
SiteResolver Process (HttpRequest Pipeline)
To understand what we have to do, we have to understand what Sitecore does behind the scenes when it encounters a multi-site setup. Inspect Sitecore.Pipelines.HttpRequest.SiteResolver in a decompiler. You should see the following steps in a nutshell:
1) Sitecore grabs and stores all the <site> definitions
2) The URL parameter "sc_site" is checked. If a website is defined with that name, we have found a matching site!
3) If "sc_site" is not available, then we check the hostname (domain name). If a website is defined with that hostname, we have found a matching site!
These steps assume that the sites are all defined in the config files. What if they are not? Sitecore tries to match website names and hostnames from top to bottom in the list of definitions, and the search stops at the first match. If an entry doesn't match, it moves on to the next one until it hits the last one, which is, AND SHOULD BE, the default "website". This ensures that at least the default website will load if nothing else matches.
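The lookup order described above can be modeled in a few lines of Python. This is only a simplified model of the behavior, not Sitecore's actual code: the SiteDef class, the resolve_site function, and the sample hostnames are all hypothetical. What happens when "sc_site" names an unknown site is also an assumption here; the sketch simply falls through to the hostname check.

```python
from dataclasses import dataclass

@dataclass
class SiteDef:
    name: str
    host_name: str = ""  # empty means "matches any host"

def resolve_site(sites, host, sc_site=None):
    """Model of the lookup order: sc_site first, then hostname, top to bottom."""
    if sc_site:
        for site in sites:
            if site.name.lower() == sc_site.lower():
                return site
    for site in sites:
        # An entry without a hostname acts as a catch-all, so the
        # default "website" has to stay last in the list.
        if not site.host_name or site.host_name.lower() == host.lower():
            return site
    return None

SITES = [
    SiteDef("microsite", "micro.example.com"),
    SiteDef("website"),  # default site, kept last
]

print(resolve_site(SITES, "micro.example.com").name)
print(resolve_site(SITES, "unknown.example.com").name)
```

If the catch-all entry were moved to the top of the list, it would win every hostname match, which is why the default site belongs at the bottom.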
Knowing all of this, let's go ahead and build our own custom site resolver process to replace the default one.
Build CustomSiteResolver Process
Create a new class that inherits from the original site resolver.
All we have to do is override the Process method and make sure that it functions almost the same way as the original method.
The original Process performs the following:
1) Find a matching site based on the URL parameter
2) If not found, match on hostname
3) Find site and update the start item so we know where the homepage is
The new Process should be:
1) Find matching site based on URL parameter or hostname
2) If nothing matches or if the only website that matches is the default website, then we take another approach
3) Parse the <site> definitions manually and look for the default website definition. We will base all dynamic websites on the default website definition.
4) Check the URL parameter for the site name. If available, iterate through all the child nodes in "/sitecore/content" and look for the item where the URL parameter value matches the value of the field "Site Name". If there is a match, we have found the start item for this website.
5) If the site name does not exist or does not return a matching site, then the hostname is checked. Iterate through all the child nodes in "/sitecore/content" and look for the item where the hostname matches the value of the field "Site Hostname". If there is a match, we have found the start item for this website.
6) If no matches are found, we use the default website.
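Steps 4 through 6 of the new Process can be sketched as a plain Python model. The Item class, the find_start_item function, and the sample items are hypothetical stand-ins for the children of "/sitecore/content"; only the field names "Site Name" and "Site Hostname" come from the steps above. The default start path is an assumption as well.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    path: str
    fields: dict = field(default_factory=dict)

# Assumed path of the default website's home node.
DEFAULT_START = "/sitecore/content/Home"

def find_start_item(children, host, sc_site=None):
    """Return the start item path for a request, or the default one."""
    # Step 4: match the URL parameter against the "Site Name" field.
    if sc_site:
        for item in children:
            if item.fields.get("Site Name", "").lower() == sc_site.lower():
                return item.path
    # Step 5: match the request hostname against "Site Hostname".
    for item in children:
        site_host = item.fields.get("Site Hostname", "")
        if site_host and site_host.lower() == host.lower():
            return item.path
    # Step 6: nothing matched, so use the default website.
    return DEFAULT_START

CHILDREN = [
    Item("/sitecore/content/Home"),
    Item("/sitecore/content/Brand A",
         {"Site Name": "branda", "Site Hostname": "a.example.com"}),
    Item("/sitecore/content/Brand B",
         {"Site Name": "brandb", "Site Hostname": "b.example.com"}),
]

print(find_start_item(CHILDREN, "b.example.com"))
print(find_start_item(CHILDREN, "b.example.com", sc_site="branda"))
```

Items with empty fields, such as the default Home node, are skipped by both checks, so they can never shadow a dynamic site.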
Step-by-Step Implementation
It is much simpler when explained with pictures, so here goes. First, make sure you modify the template of each website's start item, either by template inheritance or by simply adding new fields, so that it has these two fields: "Site Name" and "Site Hostname".
Modify your custom site resolver to check for sites:
Check the site config file and get the definition for the default website as a single XML node.
Get a list of all the child node items in the content tree.
Perform the match on the item fields and return a site context (GetSite method).
If a match is found, use the default website definition but change the start item to match the item just found via the field search. Also, set the context site to be the site that was just defined. It is very important to set the context site so that Page Editor mode will be preserved and works as expected.
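The "reuse the default definition, swap the start item" idea can be illustrated with a short Python sketch over a sample <sites> fragment. The derive_site function and the sample XML are hypothetical, and setting the context site is not modeled here; the sketch only covers locating the default definition as a single XML node and overriding its start item.

```python
import copy
import xml.etree.ElementTree as ET

# Sample fragment; real definitions carry more attributes.
SITES_XML = """<sites>
  <site name="shell" rootPath="/sitecore/content" startItem="/home" />
  <site name="website" rootPath="/sitecore/content" startItem="/home" database="web" />
</sites>"""

def derive_site(xml_text, start_item, default_name="website"):
    """Copy the default <site> node and point it at a different start item."""
    root = ET.fromstring(xml_text)
    default = root.find(f"site[@name='{default_name}']")
    if default is None:
        raise LookupError(f"no <site> named {default_name!r}")
    # Work on a copy so the default definition itself stays untouched.
    derived = copy.deepcopy(default)
    derived.set("startItem", start_item)
    return derived

site = derive_site(SITES_XML, "/Brand A")
print(ET.tostring(site, encoding="unicode").strip())
```

Everything except the start item (database, root path, and so on) is inherited from the default definition, which is what keeps this approach lean.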
Now, replace the original processor in the pipeline with the definition of the new custom processor.
Possible Enhancements
A very obvious enhancement would be to Solr-ize (or Lucene-ize) the child nodes of the content tree. That way, we can perform quick searches on any of those items in the "Site Name" and "Site Hostname" fields and obtain a matching item quickly.
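The gain from indexing can be modeled with a plain in-memory dictionary. This is not Solr or Lucene, only a stand-in that shows the shape of the idea: scan the site items once, then answer each request with a keyed lookup instead of a loop. The function names and sample data are hypothetical.

```python
def build_site_index(children):
    """Scan (path, fields) pairs once and index them by site name and hostname."""
    by_name, by_host = {}, {}
    for path, fields in children:
        name = fields.get("Site Name", "").lower()
        host = fields.get("Site Hostname", "").lower()
        # setdefault keeps the first item seen, mirroring a top-to-bottom scan.
        if name:
            by_name.setdefault(name, path)
        if host:
            by_host.setdefault(host, path)
    return by_name, by_host

def lookup(index, host, sc_site=None):
    """Return the start item path, or None when no dynamic site matches."""
    by_name, by_host = index
    if sc_site and sc_site.lower() in by_name:
        return by_name[sc_site.lower()]
    return by_host.get(host.lower())

CHILDREN = [
    ("/sitecore/content/Home", {}),
    ("/sitecore/content/Brand A",
     {"Site Name": "branda", "Site Hostname": "a.example.com"}),
    ("/sitecore/content/Brand B",
     {"Site Name": "brandb", "Site Hostname": "b.example.com"}),
]

INDEX = build_site_index(CHILDREN)
print(lookup(INDEX, "a.example.com"))
```

A real search index would also need to be refreshed when a site item is added or its fields change, which this sketch leaves out.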
Summary
I am not sure exactly how the other modules work, but it is safe to say they all work in a similar way to the method above. Those modules might be more robust, but the method I just described is definitely leaner and doesn't require a set of "global" items to be created to store site definitions. We use a workaround where we take the default website definition and just change its start path. This also assumes that no specific port numbers are used. Again, the goal is to have site definitions checked dynamically from a list of website home nodes so developers do not need to update the config files constantly.