GMail
I've got 6 GMail invites; comment if you'd like one.
If you are of the opposite sex and you have long straight blond hair, bright blue eyes, Afrikaans/Dutch speaking, have no boyfriend, and currently residing in the Garden Route area, you will get precedence and a free cup of coffee. ;-)
Contact Details
This post has become redundant. All of the information is now in the navigation.
Open Source
Stuff has been going crazy on this side. I barely have enough time to post. When I'm not buried in studying, I'm buried in work or some other project.
Recently I have done quite a bit of investigating into Open Source, with our new GROSS User Group.
Last week I properly used Linux for the first time. By chance, it was Fedora. Only one word: Excellent. I like the Gnome user interface (more than Windows 2000). It's easy to install and use.
As a Linux first-timer, I could already install Apache, PHP, MySQL, Mono (with ASP.NET on Apache), Firefox, and Postnuke. Everything worked, except for one small glitch in Postnuke.
I want to completely switch myself over from proprietary-based software to open source in the next year or so. I'm also going to join Jabber and try to use that for my instant messaging instead of MSN Messenger and ICQ like I'm using now. (I'll still be available on those networks for people not using Jabber.)
I know that many people are moaning about Linux, but I hear just as much (if not more) moaning about Windows.
The whole point of the user group is to share information and help each other, so that we can convert the area to open source.
If I remember correctly, a few hundred million Rands (local South African currency) leave this country every year because of software licenses. For a third-world country with a not-too-strong currency, that's not very good news. If we can convert people to open source, we could try to minimise that.
Local talent can be sourced to aid in installing and using, and thus the money can be kept inside of the country, thereby strengthening the local economy.
Am I sounding like a journalist now?
Apparently, the South African Government has also committed itself to converting to open source.
Well, I don't know about the government, but I will definitely be trying to do that. :-)
Sorry for this long post; I know this has nothing to do with web standards, but one thing that web standards and open source seem to have in common is this: freedom.
Native UTF-8 in PHP
Update: Robbert sent me some more information, so I updated this post.
Robbert Broersma e-mailed me with the following (very interesting) information:
Apparently you can get the following in PHP 5.0.1:
./configure --help:
=========================================
Zend:
--enable-zend-multibyte Compile with zend multibyte support.
=========================================
You can also overload the standard string functions by enabling the mbstring extension and putting this line in your php.ini file:
mbstring.func_overload = 7
But remember to add the following line too:
mbstring.internal_encoding = UTF-8
All of this will make it a lot easier to output UTF-8 (fewer calls to utf8_encode).
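To see why multibyte-aware string functions matter: a byte-oriented function counts bytes, while a multibyte-aware one counts characters. Here is a minimal sketch of that difference, written in Python rather than PHP purely for illustration. The sample string is just an example, and nothing here is PHP's actual implementation.

```python
# Illustration only: Python stands in for PHP here.
text = "Iñtërnâtiônàlizætiøn"

# The raw UTF-8 bytes are what a byte-oriented string function would see.
utf8_bytes = text.encode("utf-8")

char_count = len(text)        # number of characters
byte_count = len(utf8_bytes)  # number of bytes

# Every accented letter takes more than one byte in UTF-8,
# so the byte count is larger than the character count.
print(char_count, byte_count)
```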
'Class' and 'ID' Attributes as Signatures
Signatures are an excellent idea: they let users style your site individually in their user style sheets.
To do this on your site, you need to have an id attribute on either your root element or your body element with a value that's unique to your site.
For this unique value, the current convention is to take your hostname and replace all of the dots with dashes. For example: charlvn-blogspot-com
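As a rough sketch of that convention (in Python, and with a function name that is purely my own invention), deriving the signature from a hostname could look like this:

```python
def site_signature(hostname: str) -> str:
    """Turn a hostname into a signature by replacing every dot with a dash."""
    return hostname.lower().replace(".", "-")


print(site_signature("charlvn.blogspot.com"))  # charlvn-blogspot-com
```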
If you're using XHTML 1.0, you can put this id attribute on your root (html) element, like for example this:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" id="charlvn-blogspot-com">
If you're using HTML 4 or XHTML 1.1, put it on your body element (since it's illegal on your html element in those DTD's). For example:
<body id="charlvn-blogspot-com">
Do this on all of your pages in your site, and you're through!
Then the user can put the following into his/her user style sheet if he/she wants to:
#charlvn-blogspot-com { display: none }
Notice that the above is actually an absurd example, and you never want that to happen to one of your sites, but that should give you an idea.
What I showed you so far seems to be the new convention. I, however, prefer the following:
Instead of using the id attribute for your site identifier, rather use the class attribute. And then use the id attribute to identify individual pages on your site.
For example, on the front page you could have something like this:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" id="frontpage" class="charlvn-blogspot-com">
And then on the permalink (for this particular post), you could have this:
<body id="post-2004-08-24-class-and-id-attributes-on-root" class="charlvn-blogspot-com">
(The id starts with a letter because ID values in HTML 4 and XHTML must, and an unescaped CSS ID selector cannot begin with a digit.)
So now, if the user wants to apply a rule on all of your pages, he/she can do this:
.charlvn-blogspot-com { display: none }
But if he/she wants to style only one specific page, he/she could do something like this:
.charlvn-blogspot-com#post-2004-08-24-class-and-id-attributes-on-root { display: none }
Of course this might also aid you yourself in your site style sheets.
What you have in the id attribute doesn't really matter (in this case) anyway, as long as it's unique to that specific page on your site.
If you're not using a content management system that you can build something like this into, it will probably not be worth converting all of your pages to this format. But if you could add an extra few lines of code to do this on every page automatically, well then you could add some interesting functionality into your site, which could be handy not only to your users, but to yourself as well!
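Those "extra few lines of code" could be sketched roughly as follows. This is Python for illustration only, not what any particular content management system does. The function name and the "post-" prefix are assumptions of mine; the prefix is there because ID values in HTML 4 and XHTML have to start with a letter.

```python
from html import escape


def body_open_tag(hostname: str, page_key: str) -> str:
    """Build a <body> start tag with a site-wide class and a per-page id.

    The "post-" prefix is an assumption of this sketch: it keeps the id
    starting with a letter even when the page key begins with a date.
    """
    site_class = hostname.lower().replace(".", "-")
    page_id = "post-" + page_key
    return '<body id="%s" class="%s">' % (
        escape(page_id, quote=True),
        escape(site_class, quote=True),
    )


print(body_open_tag("charlvn.blogspot.com",
                    "2004-08-24-class-and-id-attributes-on-root"))
```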
I don't really think that this functionality would be practical in most cases, and maybe it's going a little overboard. I don't know. But it's an interesting thought anyway.
You can see a working example of this too if you like.
On this site, I'm only going to add the class attribute for now (without any id attributes). Then I can add id attributes later if I want to, without breaking any user style sheets.
Please note that the class attribute doesn't seem to be legal on the root element in XHTML 1.0. (And of course not in HTML 4 and XHTML 1.1!) Therefore you'll be stuck with only styling the body element. That's not ideal, but it's not the end of the world either since very few people style their html element at the moment. This should be fixed in XHTML 2.0, though!
Some resources:
- Anne's Weblog about Markup & Style: Signatures on root
- Anne's Weblog about Markup & Style: Styling different sections different
- A better solution is coming? Thank goodness!
Kittenfight
Kittenfight via Anne van Kesteren.
Here's my own one:
XHTML 1.1 versus XHTML 1.0 Strict: XHTML 1.1 wins!
HTML versus XHTML: HTML wins!
Now what? :-)
Miscellaneous
Just some random stuff:
I liked the original design of Mezzoblue. That other design Dave came up with sucked in my opinion, but his latest one is on fire! Reputation restored. :-)
I honestly don't know why I'm still using entities on this weblog, since it's already on UTF-8. Viva Iñtërnâtiônàlizætiøn!
They actually publish your e-mail address? It wouldn't bother me though; I get spammed so much anyway it wouldn't make a difference. :-) [Via Anne van Kesteren]
Free Website Statistics
Proper statistics are a must-have for any website owner. Especially if you have invested a lot of money into your site, you want to know how many people are actually looking at it.
It is also very important for your search engine optimization. You want to look at who finds you under which keywords, so that you know which keywords must receive more attention.
The problem is that web hosts typically see web statistics as an add-on, or you only get it when you are signed up for their "Premium" hosting option.
However, here is the answer for you: Free website statistics.
There are actually a surprising number of companies willing to deliver this service. Just look on Google.
I went for the first result, namely StatCounter.com. Only one word: Excellent.
No ads, banners, buttons, or whatever on your site. Completely invisible. You can have a counter if you like, and you can support them with a button, but neither is required.
You get full statistical analysis comparable to that of Webalizer.
Of course there is a little snippet of code that you need to put on your site. However, they actually give you the option of having standards-compliant XHTML!
So, I have it installed on this weblog now, and am planning on installing it on all of my sites soon.
Broken Validator?
Recently, while attempting to validate this weblog, I had the following in my code:
Weblog about Web Design & Programming
Notice the unencoded ampersand. Yet the W3C Validator said that I have Valid XHTML 1.0 Strict.
Any conforming XML parser would have choked on this, as far as I know.
It did actually detect that mistake, since it listed it as a warning. However, if an XML parser will choke on it, shouldn't it rather be an error?
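A quick way to check the "choking" claim is to feed the text to a real XML parser. The sketch below uses Python's standard library parser; the h1 wrapper element and the helper name are assumptions of mine, since only the text itself is quoted above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if an XML parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = "<h1>Weblog about Web Design & Programming</h1>"
fixed = "<h1>Weblog about Web Design &amp; Programming</h1>"

print(is_well_formed(raw))    # False: the bare ampersand is not well-formed
print(is_well_formed(fixed))  # True
```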
Weblog Validation
Ok, the front page and the archive pages now validate. However, I still can't get the other pages to validate.
The ampersands are not encoded in the URI's that Blogger inserts. I have no control over this. The only way to get past it is by disabling comments completely, which is not worth it.
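For what it's worth, the fix on the generating side is small: escape the URI before writing it into an attribute. A minimal sketch in Python follows; the URI and its parameter names are made up for the example and are not Blogger's real ones.

```python
from html import escape


def attribute_safe(uri: str) -> str:
    """Escape a URI so it can be placed inside an HTML or XHTML attribute."""
    return escape(uri, quote=True)


uri = "http://example.com/comment?blogID=1&postID=2"
print(attribute_safe(uri))  # http://example.com/comment?blogID=1&amp;postID=2
```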
Also, Blogger transforms the HTML in comments to uppercase instead of lowercase. Can you believe that?
I can't go back to HTML 4.0 Strict, since in some other code Blogger inserts (which I also have no control over), the empty elements are closed with a trailing slash. So I would actually be moving further away from validation by doing so. I know that the trailing slash is technically legal in HTML too, but there it means something totally different.
In the meantime, I have chosen to mark the non-validating pages as tag soup by removing the doctype. I know that they are parsed as tag soup anyway, but now I declare them as tag soup too.
There is also one last problem with the other pages, and that is that when no comments are made yet, you have an empty dl element. Also illegal in both XHTML and HTML.
PS: Notice that I am now using URI instead of URL because in the past I was talking nonsense (as usual).
And one last thing... you will now know which pages on this weblog validate and which don't by checking to see if there's an ugly little W3C sticker on the right. As soon as all of my pages validate (yes, I am quite optimistic after yesterday), I will remove that little sticker since it's just ugly (IMHO).
Things are looking good at blogger...
The other pages are still broken though due to invalid code inserted by Blogger itself. :-(
Anonymous Comments
Some people have been making derogatory (anonymous) comments on this weblog recently. For this reason I have now set the option that you must be registered with Blogger in order to comment.
I am sorry that I have to do this, because I am as much for freedom of speech and privacy on the Internet as any other person, but if you want to make derogatory comments then at least put a name to it. I do.
So if your (anonymous) comment has been deleted, you are welcome to re-comment using a real name, and unless profanity sets in, I will not (normally) delete your comment.
Thanks!
PS: I hate having to force users to register with Blogger first, but I have to. Otherwise this is going to get out of hand.
Afrikaans.mozdev.org
Although I love to discuss web design, I enjoy actually jumping in and designing even more.
My latest release: http://afrikaans.mozdev.org
Some specs for your enjoyment:
- Valid XHTML 1.1
- Valid CSS
- UTF-8
- link elements (only until we get nl in XHTML 2.0)
- accesskey's and title's on some links
- Semantic markup
- Solid code structure (I think)
If you see anything that I missed when creating that site, comments are much appreciated.
The site is the i18n site for Firefox and Thunderbird in Afrikaans. I am working together with Translate.org.za on this (they are doing the Afrikaans translations for the Mozilla Application Suite).
Since the website is about Afrikaans translations, the whole website is accordingly in Afrikaans. And thus if you don't read Afrikaans you might have some trouble understanding what's going on in that particular site.
The look and feel has mainly been borrowed from Mozilla.org since I saw the Dutch translation's website was also like that (ugly use of frames on the Dutch translation's site though) and I like the design. I will be doing some more designing myself on that, since I want to exercise my graphical design skills.
I will be starting on za.mozdev.org soon and then I can get translating!!!
UTF-8 in PHP
Update: Since I can't read I got htmlspecialchars and htmlentities mixed up. Mistake has been corrected, thanks to Anne.
First of all, don't encode any of your PHP source code in UTF-8. I tried this, and believe me it doesn't work (or at least not on my setup).
There are a number of functions you can use to convert your output to UTF-8. For this post I will assume that you are trying to output a particular string containing no entities and lots of special characters.
First of all, you can use the htmlspecialchars function to encode your less-than signs, greater-than signs, ampersands, and quotes.
If you want to, you can also use htmlentities to convert all of those special characters like 'é', 'ë', 'ü', etc to their respective entities. However, when using UTF-8, there's really no need to, since it supports all of those characters. Read the next paragraph.
To convert your special characters directly to UTF-8, you can also use the utf8_encode function, which converts an ISO-8859-1 string to UTF-8. It will probably be best to call this function every time you output special characters, and also when you're not sure whether a string contains any.
For example, 'é' has different codes in ISO-8859-1 and in UTF-8. So always call that function unless you're sure the string is already UTF-8.
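The difference in codes is easy to see at the byte level. The sketch below is Python, used only as an illustration of what an ISO-8859-1 to UTF-8 conversion does; the helper name is my own. It also shows why it matters to know what encoding a string is in: converting bytes that are already UTF-8 a second time garbles them.

```python
e_acute = "\u00e9"  # the character 'é'

latin1_bytes = e_acute.encode("iso-8859-1")  # b'\xe9'
utf8_bytes = e_acute.encode("utf-8")         # b'\xc3\xa9'


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


# One byte in ISO-8859-1 becomes two bytes in UTF-8.
print(latin1_to_utf8(latin1_bytes) == utf8_bytes)  # True

# Feeding in bytes that are already UTF-8 produces "double encoding".
print(latin1_to_utf8(utf8_bytes))  # b'\xc3\x83\xc2\xa9'
```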
And lastly, be careful when comparing strings with each other when they're in different character encodings. Even when not using special characters, it didn't want to work at all on my side.
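One way around the comparison problem is to bring both strings to a common form before comparing them. A minimal sketch, again in Python for illustration and assuming you know which encoding each string arrived in (the function name is hypothetical):

```python
def same_text(a: bytes, a_encoding: str, b: bytes, b_encoding: str) -> bool:
    """Compare two byte strings as text after decoding each from its own encoding."""
    return a.decode(a_encoding) == b.decode(b_encoding)


latin1 = "caf\u00e9".encode("iso-8859-1")
utf8 = "caf\u00e9".encode("utf-8")

print(latin1 == utf8)                                  # False: the raw bytes differ
print(same_text(latin1, "iso-8859-1", utf8, "utf-8"))  # True: the text is the same
```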
Hope this is useful!
Abbreviations
According to the WCAG 1.0, you must mark up your abbreviations properly. You must also give them a relevant title the first time they get used on a page.
However, when the user enters the web page through a bookmark, he/she/it might miss the first mention of any particular abbreviation. So shouldn't a proper title be provided every time?
But then, if you have to type the full title out every time, what's the point of using abbreviations in the first place? At the end of the day you will probably have to leave this to your CMS. Just a pity that none(?) of the systems currently out there provide this functionality (yet).
Lastly, some people pronounce SQL differently. I personally prefer to spell it out, and so do many others. However, every so often I hear somebody else pronouncing it as "sequel".
So, how must I mark that up? If I use <abbr class="spell-out"> then some people might get confused, and if I use <abbr class="run-through"> then some other people might get confused. So what do I do? Does the W3C have any answers? Or is this eventually left up to myself to decide?
Although probably not very uniform, eventually it probably doesn't matter that much. When I hear somebody speaking about "sequel" then I know what they're talking about. And then when they hear me speaking about "SQL" then they'll also probably know what I'm talking about.
Just as a last remark, having to type out an abbr element with proper class and title attributes is simply too much work and not practical. This is a computer's job. Come on, CMS's, get on with technology!
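What I have in mind for a CMS could be sketched like this. It is only a rough Python illustration: the glossary and the function name are hypothetical, and it assumes the input is plain text without any existing markup in it.

```python
import re
from html import escape

# Hypothetical glossary; a real CMS would load this from its own settings.
ABBREVIATIONS = {
    "SQL": "Structured Query Language",
    "CMS": "Content Management System",
}


def mark_up_abbreviations(text: str, glossary=ABBREVIATIONS) -> str:
    """Wrap every known abbreviation in an abbr element with a title."""
    if not glossary:
        return text
    alternatives = "|".join(re.escape(key) for key in glossary)
    pattern = re.compile(r"\b(" + alternatives + r")\b")

    def wrap(match):
        abbr = match.group(1)
        title = escape(glossary[abbr], quote=True)
        return '<abbr title="%s">%s</abbr>' % (title, abbr)

    return pattern.sub(wrap, text)


print(mark_up_abbreviations("I use SQL in my CMS."))
```

Because every occurrence gets a title, it no longer matters where on the page the reader starts.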
Notice that I used he/she/it in this post so that I don't discriminate against any particular gender and/or species. I have recently been accused of being sexist, and I don't want to be accused by some cat or dog as being "specie-ist" (spelling?) either. :-)
Times and Dates in Permalinks
This is applicable both to permalinks of posts and of comments.
If you time/date code your posts/comments in the permalinks, which is not always a bad idea, you must take care when it comes to timezones.
If I post something early in the morning in South Africa, let's say 2 o'clock in the morning on 1 January 2005 for instance, it is still 31 December 2004 in the USA.
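The date shift can be checked with a few lines of Python. This is a sketch with fixed offsets: UTC+2 for South Africa, and UTC-5 as an assumed example for the east coast of the USA in January.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for illustration; the US offset is an assumption (Eastern, winter).
SAST = timezone(timedelta(hours=2), "SAST")
EST = timezone(timedelta(hours=-5), "EST")

posted = datetime(2005, 1, 1, 2, 0, tzinfo=SAST)
seen_in_usa = posted.astimezone(EST)

print(posted.isoformat())       # 2005-01-01T02:00:00+02:00
print(seen_in_usa.isoformat())  # 2004-12-31T19:00:00-05:00
```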
Now normally this wouldn't be a problem, but if you use the year, month, and a SLUG in your permalink and somebody in the USA tries to recall the link from memory, he will possibly regurgitate the following:
http://charlvn.blogspot.com/2004/12/somenewpost
However, that will return an error screen, because actually the link was this:
http://charlvn.blogspot.com/2005/01/somenewpost
You see, it is easy to remember somenewpost, but not so easy to remember the date, unless you think about which day you read the post on.
2005/01 will seem unlikely to you if you live in an area where it is still New Year's Eve. Unless, of course, you are trained to think internationally. :-)
But even if the previous example seems a little unlikely to you, keep the following in mind:
On this weblog, all of the times are according to South African time.
However, say for instance I want to move to the USA. Now I would obviously like to adjust my time zone to my new location. So I set this new time zone on my weblog. What will happen now? Will some of my permalinks change? If I don't want that, either some of my dates (and all of my times) will be incorrect or some of the dates in the permalinks will not match up with the dates in the respective posts. If you want to change the permalinks, you might need quite a few redirects. I hope you have a script to do that. :-)
Otherwise you can just leave all of your old posts as South African time and just start with USA time from there on. Just remember to indicate this on all of your posts (on which timezone you're using for that particular post). But you should probably be doing that anyway.
To keep stuff simple: Use standard GMT in posts and comments themselves and their respective permalinks for all dates and times. Why not?
Automatically displaying the time in the user's own timezone (depending on where he's viewing from) is also an option, but then you'll need access to lists of IP ranges and where their owners are physically located (and therefore which timezone they'll be using). But that might still cause confusion. And of course this must not influence permalinks.
So for all practical reasons, like I said, use GMT. :-) And clearly indicate that you're doing so as well.
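A minimal sketch of the GMT approach in Python: convert the posting time to UTC before building the permalink, so that the path depends on the instant of posting and not on whichever timezone the weblog happens to be set to. The function name, the slug, and the fixed offsets are all assumptions for the example.

```python
from datetime import datetime, timedelta, timezone


def utc_permalink(posted: datetime, slug: str) -> str:
    """Build a /year/month/slug path from the UTC date of a timezone-aware time."""
    if posted.tzinfo is None:
        raise ValueError("posted must be timezone-aware")
    utc = posted.astimezone(timezone.utc)
    return "/%04d/%02d/%s" % (utc.year, utc.month, slug)


SAST = timezone(timedelta(hours=2))   # South African time
EST = timezone(timedelta(hours=-5))   # assumed US offset for the example

# The same instant, expressed in two different local times.
in_south_africa = datetime(2005, 1, 1, 1, 30, tzinfo=SAST)
in_usa = datetime(2004, 12, 31, 18, 30, tzinfo=EST)

print(utc_permalink(in_south_africa, "somenewpost"))  # /2004/12/somenewpost
print(utc_permalink(in_usa, "somenewpost"))           # /2004/12/somenewpost
```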
Charl's Perfect Permalink
I'm specifically specifying "Charl's" because "perfect" is highly subjective. Everybody will have their own version of perfection. So what you read in this post is only mine. :-)
I have been thinking, and I would actually like to go back on something I said yesterday. If you haven't read it, it's probably best to read it first so that what I say here will make sense.
At the moment, I'm sitting with this:
http://charlvn.blogspot.com/2004/08/africanwildlifeartzanet.html
But actually, my ideal would be this:
http://charlvn.blogspot.com/africanwildlifeartzanet
But if I don't have access to redirects, I want this:
http://charlvn.blogspot.com/2004/08/16-27
16 is the hour of the day the post has been made. 27 is the minute in that hour.
So the limitation is that you can only make one post per minute, but who needs more?
So why this? Because it will never change, no matter what I do with my title.
Notice the dash between the hour and the minutes. This is nicer to me than having something like 1627.
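Generating that kind of permalink takes one line. A small Python sketch follows; the function name is my own and the day of the month in the example is arbitrary. Note that the path, followed literally, contains no day, so two posts made at the same hour and minute on different days of the same month would end up with the same path.

```python
from datetime import datetime


def time_permalink(posted: datetime) -> str:
    """Build a /year/month/hour-minute path from the posting time."""
    return posted.strftime("/%Y/%m/%H-%M")


# The day (24) is arbitrary here; it does not appear in the path.
print(time_permalink(datetime(2004, 8, 24, 16, 27)))  # /2004/08/16-27
```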
But why don't I like a mixture of the above like /2004/08/africanwildlifeartzanet?
Well, africanwildlifeartzanet is easy to remember. However, 2004/08 is not. So why have something that is easy to remember only to bugger it up with something like 2004/08? Precisely, it defeats the object.
So I'd rather either have all numbers or only have a SLUG (or whatever it's called). Both have their advantages and disadvantages, but mixing the two carries none of these advantages. Or am I missing something?
Weblog URL Design
Update: When I speak about human readable, I actually mean human rememberable more than only readable. Just a quick clarification.
First of all, I don't know what the point is of referring to URI's when you actually mean URL's.
The URL of a typical post on Blogger looks like this:
http://charlvn.blogspot.com/2004/08/africanwildlifeartzanet.html
There are two things I don't like, and one I do. I dislike the fact that there is a file extension (.html) at the end of the URL. Another thing I don't like is the text africanwildlifeartzanet, but more about that later. Lastly, I do like the fact that there is no archives bit in the URL.
Why no file extensions? Well, there is another thing I dislike: http://. Both of these expose the technical detail of the technologies used, which I believe, in my personal philosophy, the user should not be in contact with. However, in the case of http://, that is not something which we are quickly going to change, and it isn't really that harmful since most user agents will fill that in for the user automatically. It's a different story with .html, however at least it's better than .php or something like that.
So, what's wrong with africanwildlifeartzanet (for example)? Well, that is a kind of abbreviation for my post title. Why is this evil? Well, it's subject to change. Not this particular one, but some other ones I have made in the past. Sometimes I make a typing mistake, or the post content changes to such an extent that the post title isn't relevant anymore. This has happened to me more often than you'd think. If I change the post title, I don't want the mistake to be present in the URL. And that breaks permalinks. Of course I could just set up a redirect, but many (if not most) people can't (or don't know how to) do this. I don't think Blogger allows me to do this (maybe another suggestion...)
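For those who can set up redirects, the logic is not complicated. Here is a rough Python sketch of a lookup table of renamed posts; the paths and names are invented for the example, and how you hook this into your server is a separate matter.

```python
# Hypothetical table of renamed posts: old path -> new path.
REDIRECTS = {
    "/2004/08/oldtitle": "/2004/08/newtitle",
}


def resolve(path: str, redirects=REDIRECTS):
    """Return (status, path): 301 and the new path for a renamed post, else 200."""
    seen = set()
    status = 200
    # Follow chains of renames, but stop if the table loops back on itself.
    while path in redirects and path not in seen:
        seen.add(path)
        path = redirects[path]
        status = 301
    return status, path


print(resolve("/2004/08/oldtitle"))  # (301, '/2004/08/newtitle')
print(resolve("/2004/08/newtitle"))  # (200, '/2004/08/newtitle')
```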
So what would I prefer instead of africanwildlifeartzanet for instance? Rather a never-need-to-change URL consisting only of numbers (the hour, minute, and second that the post was made, with the limitation that only one post can be made per second, but that's probably OK). But wait, isn't africanwildlifeartzanet much more human readable and rememberable? Of course! But what about the 2004/08 in front of that? Exactly, it isn't very human readable anyway!
If you really want to insist on having easy-to-remember URL's for your posts, why not leave off the year and month? Because if you're going to have two SLUG's that are the same but were posted in two different months, well, wouldn't that be confusing and defeat the object of making it easier to remember?
Ok, so why no archives in the URL? Well, simply because it is unnecessary. Short URL's rock!
Lastly, I have respect for the fact that you might want to keep the month, year, and archives in your permalinks for added URL structure. That is indeed very important. However, sometimes I prefer size over structure. But then again, I could be wrong.
Oh yes, and coming back to the URI versus URL thing, why are the people in the standards-compliant community obsessed with using URI above URL? URL is a subclass of URI, and I believe that if you want to talk about URI's that are more specifically URL's, call them URL's for all clarity reasons.
AfricanWildlifeArt.za.net
Yes, I just finished up with that site.
Some specs in case you're interested:
- Valid XHTML 1.1 with XML Declaration and xml-stylesheet prolog as application/xhtml+xml to user agents with that in their request header.
- Valid HTML 4.01 Strict with <link rel="stylesheet"> to other user agents.
- Valid CSS
- UTF-8
- Clean URI's without file extensions (even the images and stylesheet). Like it or not.
- Full PHP database-driven background. Yes, I had to do some magic to get the clean URI's. Just proves that it can be done (thanks to htaccess files.)
- Solid site and code structure.
- accesskey's and link's.
- Languages specified.
- Lots (and lots!) of metadata and keyword optimization for search engines.
- Semantic markup (I think)
- I can't remember using any span's or div's. I can be wrong though.
- Oh yes, and NO WWW!
Note that it doesn't matter if you use span's or div's, but it's just fun not to have them. ;-)
Of course I just had to go and use XHTML 1.1 to irritate some people, especially Anne ;-) I do send HTML 4.01 to other user agents like Internet Explorer and search engines, though!
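The switching between the two formats comes down to looking at the Accept request header. The site itself does this in PHP; the sketch below is only a Python illustration of the idea, with a function name of my own, and it deliberately ignores q-values to stay short.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick XHTML for agents that list application/xhtml+xml, otherwise HTML.

    Simplification: q-values are ignored, so "application/xhtml+xml;q=0"
    would still count as accepted in this sketch.
    """
    for item in accept_header.split(","):
        media_range = item.split(";")[0].strip().lower()
        if media_range == "application/xhtml+xml":
            return "application/xhtml+xml"
    return "text/html"


print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
print(choose_content_type("image/gif, image/jpeg, */*"))
```

The first call prints application/xhtml+xml and the second prints text/html.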
Well, comments are open for all your compliments, insults, and flames. While you're at it, why not flame me for daring to validate and list that it does? :-)
Copyright © 2004-2008 Charl van Niekerk. All articles are released under the Creative Commons Attribution 2.5 South Africa licence, unless where otherwise stated.

