Apache, PHP and MultiViews
A dedicated webmaster
As a dedicated webmaster and a true geek, I try to keep the code and organization of bmt-online clean and straightforward. So, after installing Apache 1.3 on my Windows box, installing PHP4, and validating all the content with W3C, I started to think about how to organize my documents on the site.
While doing the validation, I came across an interesting article on the W3C site entitled "Cool URIs don't change". A whole program! Suffice it to say that I was convinced, and decided to take advantage of the facilities offered by Apache.
Thus, the current page, as I am writing this article, is a PHP script. Its canonical URL is:
http://www.bmt.dnsalias.org/geekisms
However, it would be equally valid to access it at:
http://www.bmt.dnsalias.org/geekisms.php
All this is fine, and I am pretty happy with my setup, except for one little detail: CSS validation of the documents, when referred to by the "good looking" URIs, was broken. Instead of congratulating me on my valid CSS, the validator now spat a short, hard error in my face:
I/O Error: http://www.bmt.dnsalias.org/: Not Acceptable .
Investigation...
After investigation, it turns out that this is the result of the conjunction of several independent factors. So, let's look at the technical details of my setup:
-
I set up PHP for Apache as a module. This apparently requires that a pseudo MIME type, application/x-httpd-php, be declared so that the module correctly handles PHP files. The directive
AddType application/x-httpd-php .php
in my Apache configuration file in effect makes all files with extension .php be considered of type application/x-httpd-php.
-
To achieve the "good looking" URIs, I used the MultiViews option of Apache. The description, taken verbatim from the Apache documentation, is:
"The effect of MultiViews is as follows: if the server receives a request for /some/dir/foo, if /some/dir has MultiViews enabled, and /some/dir/foo does not exist, then the server reads the directory looking for files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client's requirements."
The last part is important. Remembering that we attributed the media type application/x-httpd-php to PHP files, we see that while processing the content negotiation request of the client, the server considers the PHP files as being of their own private type.
This is somewhat off, given that the actual output of the PHP script is a genuine HTML page, with a correctly set text/html content type, but, oh well, it doesn't do much harm. Or does it?
-
The CSS validator at W3C is a picky web client. A capture of the packets exchanged between the validator and my server shows that it sends a quite peculiar Accept: field in its HTTP request.
GET /geekisms HTTP/1.1
Cache-Control: no-cache
Date: Thu, 10 Jul 2003 00:06:02 GMT
Pragma: no-cache
Accept: text/css,text/html,text/xml, \
 application/xhtml+xml,application/xml, \
 image/svg+xml,*/*;q=0
Accept-Language: en-us
Host: www.bmt.dnsalias.org
User-Agent: Jigsaw/2.2.0 W3C_CSS_Validator_JFouffa/2.0
So, wait a minute, what? This means that the validator accepts CSS, HTML, XML and a few other standard web publishing formats, and, with */*;q=0, nothing else?!?
Yes indeed, that's what it means. This client is picky, but it is within its rights. It is not violating any standard, nor any rule of good conduct. It is just enforcing the fact that, being a CSS validator, it has to be served an *ML document.
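To make the reading of that Accept: field concrete, here is a minimal Python sketch of how a server might turn it into quality values. It is a simplification under stated assumptions, not Apache's actual code: the helper names parse_accept and quality_for are made up for this illustration, only the q parameter is looked at, and the rule "most specific matching range wins" follows the HTTP/1.1 description of the header.

```python
def parse_accept(header):
    """Map each media range in an Accept header to its quality value."""
    ranges = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        ranges[media_range] = q
    return ranges


def quality_for(media_type, ranges):
    """Quality the client gives media_type; the most specific range wins."""
    main_type = media_type.split("/")[0]
    for candidate in (media_type, main_type + "/*", "*/*"):
        if candidate in ranges:
            return ranges[candidate]
    return 0.0  # an Accept header was sent and nothing in it matches


# The Accept value captured from the validator, with the line breaks removed.
VALIDATOR_ACCEPT = ("text/css,text/html,text/xml,"
                    "application/xhtml+xml,application/xml,"
                    "image/svg+xml,*/*;q=0")

ranges = parse_accept(VALIDATOR_ACCEPT)
print(quality_for("text/html", ranges))                # listed explicitly
print(quality_for("application/x-httpd-php", ranges))  # only */*;q=0 matches
```

The first lookup yields a quality of 1.0 and the second 0.0: anything not named explicitly falls through to the wildcard, which the validator has marked as unacceptable.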
Now, during content negotiation, triggered by the fact that the client asked for a content-type-generic URI, the server found out that it was asked for *ML only content, while it was only able to deliver application/x-httpd-php content. Thus, complying with the HTTP/1.1 standard, it returned a 406 Not Acceptable response to the client, which in turn displayed it on its result page as the dry error above.
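The whole chain can be sketched in a few lines of Python. This is only a toy model of the behaviour described above, not what Apache really does internally: the TYPES table stands in for AddType and mime.types, the function names are hypothetical, and the selection ignores languages, encodings and multi-extension files.

```python
import os  # only used to split file extensions

# Hypothetical extension-to-type table, standing in for AddType / mime.types.
TYPES = {
    ".php": "application/x-httpd-php",
    ".html": "text/html",
    ".css": "text/css",
}

# Media ranges and q-values taken from the validator's Accept header.
ACCEPT = {
    "text/css": 1.0, "text/html": 1.0, "text/xml": 1.0,
    "application/xhtml+xml": 1.0, "application/xml": 1.0,
    "image/svg+xml": 1.0, "*/*": 0.0,
}


def fake_type_map(name, filenames):
    """List (filename, media type) variants for files named name.*"""
    variants = []
    for filename in filenames:
        base, ext = os.path.splitext(filename)
        if base == name and ext in TYPES:
            variants.append((filename, TYPES[ext]))
    return variants


def quality(media_type, accept):
    """Quality for media_type; the most specific matching range wins."""
    main_type = media_type.split("/")[0]
    for candidate in (media_type, main_type + "/*", "*/*"):
        if candidate in accept:
            return accept[candidate]
    return 0.0


def negotiate(name, filenames, accept):
    """Return (status, filename) for a request on an extensionless name."""
    variants = fake_type_map(name, filenames)
    if not variants:
        return 404, None
    best = max(variants, key=lambda v: quality(v[1], accept))
    if quality(best[1], accept) <= 0:
        return 406, None  # variants exist, but none is acceptable
    return 200, best[0]


print(negotiate("geekisms", ["geekisms.php", "index.php"], ACCEPT))
print(negotiate("geekisms", ["geekisms.html"], ACCEPT))
```

With only geekisms.php in the directory, the single variant carries the private PHP type and the sketch answers 406; a static geekisms.html in the same place would have been served normally.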
Where to go from there?
When I found out about the problem, I started, of course, researching it. Several FAQs and bug reports describe it:
-
W3C reports the problem in its CSS validator FAQ. In short, they say that creating a specific MIME type for PHP files (with the AddType directive) is a misconfiguration of the server, and that the server should be configured to use the PHP module as a handler instead (with the AddHandler directive).
However, as far as I can tell, there is no mention anywhere in the PHP documentation that it has a handler ability when used as a module for Apache 1.3.
-
There is a bug report on the PHP web site describing the problem, where the submitter recommends that the MIME type for PHP files be text/x-httpd-php instead of application/x-httpd-php.
However, the bug report was closed without a fix, on the grounds that the problem is solved with Apache 2.0. I do not think this is the right solution anyway. A PHP script is basically allowed to produce output of any content type. If the server has to do anything, it is to analyze the output of the script and classify the document accordingly during content negotiation. Talk about heavy duty...
I have tried to bring up the matter on comp.lang.php, to
no avail. On the #php channel on IRC, I have been told to read the
fscking manual :-(.
In any case, nobody was able to tell me if there was a known
workaround for the issue, or if there was a way to install PHP for
Apache as a module invoked as a handler, or any way to modify the
dreaded mime type application/x-httpd-php and still have
the scripts correctly processed by PHP.
I have had a look at the code of the PHP module. The mime type is very much hard-coded in there, as evidenced by the following excerpt from the code:
handler_rec php_handlers[] =
{
{"application/x-httpd-php", send_parsed_php},
{"application/x-httpd-php-source", send_parsed_php_source},
{"text/html", php_xbithack_handler},
{NULL}
};
I thought there might be some hope with the line containing text/html. It handles the so-called x bit hack, which allows PHP to "parse files with executable bit set as PHP regardless of their file ending". However, it is no more than a hack, as its name indicates, and it does not work as expected under a Windows system.
A story in the making
As it is, I still haven't found a satisfactory way of solving the issue. If a certified PHP geek is able to devise a viable workaround, he is more than welcome to contact me and let me know about it.
wed 2003-07-09
