Posts Tagged ‘apache web servers’

About the .htaccess file

December 20, 2012 2 comments

Many of you have a fair idea of what an .htaccess file is and what can be done with it. In my case that knowledge was not very accurate so I decided to learn more and to write this brief introduction to the subject.

.htaccess files are hidden text files that contain Apache directives. They are hidden for most operating systems because of the dot at the start of its filename. And they are also hidden on the web because most Apache web servers are configured to ignore them.

.htaccess files provide a way to make configuration changes on a per-directory basis as opposite to a per-server basis. It means that, when an .htaccess file is placed in a particular document directory, the configuration directives contained in the file apply to that directory and all its sub-directories, overriding the setup of those directives in the main configuration files. We have to pay attention to this recursive mechanism because a given .htaccess can override some behavior defined in a different .htaccess placed higher in the directory tree hierarchy.

We said that normally .htaccess files are ignored by the web servers. Why does it happen? Well, .htaccess files are not recommended for several reasons, mainly performance and security.

First, everything that can be done with an .htaccess file can also be done with the main configuration files of the server so, in principle, is not a good idea to put .htaccess files here and there because it makes more difficult to know the real configuration of the server.

Second, if they are enabled, every time the server receives a request, it will look for .htaccess files in every requested directory and its ancestors, and will process the found files in order to know which directives it must override. No caching mechanism here, it happens every time a request is received. In contrast, the main configuration has to be loaded just once. Even worse, if you are using RewriteRule directives in your .htaccess file, then the regex are re-compiled with every request to the directory (the main server is much more efficient as it compiles the regex just once and caches them).

So in general .htaccess files should be avoided: anything that goes into an .htaccess file can go into the main configuration files (using the same syntax in most cases) and performs worse… but they exist for a reason.

A typical case in which .htaccess files are used is that of an ISP hosting multiple sites on a single machine. If the server administrator wants to give users (i.e., content providers) the possibility of changing the configuration of their site without having access to the main configuration files, then .htaccess files are the way to go (of course, it implies security risks because people other than the service administrator will be able to change part of the server configuration). Also many popular CMSs like WordPress, Joomla or Drupal use .htaccess files.

Just one more thing. In order to use .htaccess files, the AllowOverride directive must be set to something different than None. This directive determines which directives are allowed in the file so we have to setup it accordingly to our needs. If AllowOverride is set to None then the .htaccess files are not even read by the web server.

That’s all about it. I’m not going to talk here about all the sorcery and tricky things that you can do using .htaccess files (you can do the same than with the main configuration files and using the very same syntax in most cases). In future posts I’ll talk about using the mod_rewrite module in .htaccess files and how it differs from using that module in the main configuration files.