Charl van Niekerk » Blog

Petition Against Public Holidays

Photo of whiteboard

A friend and former colleague of mine today uploaded to Flickr a photo of a whiteboard that he took a couple of years back in the office we used to work from. We were both protesting against all the public holidays we have in South Africa, each of which costs the economy lots of money. I mean, as if having two sevenths of your days off (the weekends) is not enough. Yes, we were geeks without broadband at home.

The whiteboard reads something like this. I don't know, maybe this is abuse of the blockquote element.

Petition against silly public holidays!

  • Charl
  • Aryan

We shall refuse to stay home!

We shall come to work by force!

Yes, we are geeks and we are militant. :P

Good Morning

XKCD: Good Morning

People that know me well will so appreciate this. :)

Django Apache Configuration

I heard about Django back in 2005, but because I had trouble finding decent local hosting for Python-based applications, I ended up leaving it and almost forgot about it. However, this situation has changed and there are now a number of companies that can accommodate this.

Firstly, I do want to say that I would definitely not use Django for everything. This is not because I think it's not a very capable platform, but I really hate using a hammer to kill an ant (to use the Afrikaans expression).

For a typical "database application" where 90% of your work is the usual CRUD, though, this can certainly be quite a useful framework.

One of the questions I get most often from other Django beginners is about the Apache configuration. I'll show you an example of my configuration to help you get going.

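<VirtualHost *>
  ServerName myapp.example.com
  # Hand every request to mod_python, which passes it on to Django
  SetHandler python-program
  PythonHandler django.core.handlers.modpython
  # Python module path of the project's settings file
  SetEnv DJANGO_SETTINGS_MODULE myapp.settings
  # Shows tracebacks in the browser; switch this off in production
  PythonDebug On
  # Makes both the django checkout and myapp importable
  PythonPath "['/var/www/django'] + sys.path"
  # Serve the admin media straight from disk, bypassing Django
  Alias /media /var/www/django/django/contrib/admin/media
  <Location "/media">
    SetHandler None
  </Location>
</VirtualHost>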

As I'm sure you'll agree, /var/www/django is most certainly not the proper place to put my Django code, but due to the way I do things with backups etc., this is the easiest for me. I don't have anything pointing to /var/www/ directly; I made very sure of that.

Basically, /var/www/django/django is my central Subversion checkout, and I would have the code for "myapp" sitting in /var/www/django/myapp. I'm sure this directory structure could have been better planned, but as I said, I'm still starting out with the platform, so I will probably organise this better soon.
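To make the link between the config and the directory layout a bit more concrete, here is a minimal Python sketch of what PythonPath and DJANGO_SETTINGS_MODULE roughly amount to on the Python side. It is an illustration only, not what mod_python literally runs: resolve_settings is a hypothetical helper name, and a temporary directory stands in for /var/www/django so the snippet can run anywhere.

```python
import importlib
import os
import sys
import tempfile


def resolve_settings(base_dir, settings_module):
    """Import settings_module with base_dir at the front of sys.path.

    This mirrors the effect of PythonPath "['<base_dir>'] + sys.path"
    combined with DJANGO_SETTINGS_MODULE (hypothetical helper).
    """
    sys.path.insert(0, base_dir)
    try:
        return importlib.import_module(settings_module)
    finally:
        sys.path.remove(base_dir)


# A temporary directory plays the role of /var/www/django (assumption).
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, "myapp"))
open(os.path.join(base, "myapp", "__init__.py"), "w").close()
with open(os.path.join(base, "myapp", "settings.py"), "w") as f:
    f.write("DEBUG = True\n")

# "myapp.settings" resolves to <base>/myapp/settings.py
settings = resolve_settings(base, "myapp.settings")
print(settings.__file__)
```

The point is simply that the dotted name in DJANGO_SETTINGS_MODULE is looked up relative to whatever directories PythonPath adds, which is why myapp has to sit directly inside /var/www/django in my setup.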

Raising funds for Autism Western Cape

Neville Newey of muti.co.za came up with a really great idea. For each vote on this post, muti will donate R1 to Autism Western Cape.

Just after I heard about the idea, I went into town and spoke to Jayx. We both decided to back Neville up and each match his donation, and now we are encouraging others to do the same. Some more people have already joined us; if you want to as well, please comment on Jayx's post and help us to contribute to this very worthy cause. :)

When to release the code?

I just read a very interesting post on Stii's new Django-based blog. Towards the end he writes:

This here Django platform, it will be a work in progress and I'll release the source some day! At the moment, there is just too much stuff still hard coded and too many features lacking for me to bother even! To those who have asked, please just bear with me another week or so!

This is something that has been plaguing me for years.

My typical cycle goes like this:

  1. Write some cool code.
  2. Feel that I want to release it.
  3. Convince myself the code isn't good enough (yet, of course).
  4. Write some more code, mainly adding features or improving existing ones, things the user will notice.
  5. Realise the code now looks even worse than before.
  6. Procrastinate on cleaning up the code.
  7. Forget about the project and never release anything.

At least recently I started changing this. Now, the last two (steps #6 and #7) have been removed and the following cycle starts:

  1. Start aggressively refactoring the code, at least until it's reasonably secure and structured.
  2. Release the code.
  3. Go back, clean the code and document.

Only when the first code has been released do I go back and add more features and do real cleaning.

Now this last part feels much more satisfying. :)

Now of course, I am not saying Stii is doing the same; his post just made me think of my own bad habits and therefore prompted this blog post.

I think most good developers take their code quite seriously. I also think that it really boils down to pride. You don't want to show others the mess you've made. :) Realistically, most of our code ends up being a mess at some point or other. The people that tend to wait until their code is "perfect" tend not to release much, ever. And it really is not shameful to release something that isn't completely there yet; rather that than not getting anything out at all. That's also why sniggering at others' messy code is not good: it makes you scared to release anything yourself.

I do fundamentally suffer from perfectionism. Therefore, I constantly need to keep reminding myself that "perfection is the enemy of the good". :)

I think this counts for most things in life, not just code. People who keep starting their own projects instead of contributing to existing ones, because the other projects are never good enough (although sometimes they really aren't!), are another manifestation of the same "root" issue. The same goes for blogging and everything else you do in life. You should definitely be proud of your work, try your best to do a good job, care about what you do and go at it with a passion, but somewhere along the line you must come to some type of compromise. :)

Parsing Hashtags

I really don't like this. Check out the page on Hashtags on the Twitter Fan Wiki and you'll see what I mean.

Basically, consider the following four cases.

  1. #downtown's becomes #downtown
  2. #www.downtown.com. becomes #www.downtown.com
  3. #downtown_ stays #downtown_
  4. #taxonomy:binomial=Alcedo_atthis stays #taxonomy:binomial=Alcedo_atthis

This makes parsing pesky. Basically you can't just use a simple regular expression. You have to include various non-alphanumeric characters but trim only some of them off the end later. (As far as I understand, "the end" means either the end of the string or a position that is followed by whitespace.) Why can't we just allow alphanumeric characters, if necessary with underscores included?
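To show what the two-step "match greedily, then trim" approach looks like in practice, here is a rough Python sketch. It is based purely on the four cases above, not on any real spec: the set of trailing characters to trim and the handling of the possessive 's are my own assumptions, and parse_hashtags is a made-up name.

```python
import re

# Assumption: a candidate tag is '#' followed by a run of non-whitespace.
CANDIDATE = re.compile(r"#\S+")

# Assumption: trailing punctuation that gets trimmed off the end.
# The underscore is deliberately NOT in this set (case 3 above).
TRAILING = ".,;:!?'\""


def parse_hashtags(text):
    """Return the hashtags in text, trimmed according to the guesses above."""
    tags = []
    for match in CANDIDATE.finditer(text):
        tag = match.group(0).rstrip(TRAILING)
        # Drop a possessive suffix: #downtown's becomes #downtown
        if tag.endswith("'s"):
            tag = tag[:-2]
        # Skip a lone '#' that is left with nothing after trimming
        if len(tag) > 1:
            tags.append(tag)
    return tags


print(parse_hashtags("See #downtown's map at #www.downtown.com. today"))
```

Note that the single regular expression only finds candidates; the actual rules live in the trimming code afterwards, which is exactly the kind of thing different implementations will get wrong in different ways.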

Let's face it. We see this every time with the web. Make things complicated, and people either implement it wrong or don't implement it at all.

That document is hardly a spec. When will somebody come up and write us a proper spec with an example parsing algorithm and regex?

ISDSL International Weirdness

The last few days, the international access on ISDSL has been really unusual. Often, I can't reach any of Google's services (and some other sites), while other internationally hosted sites like muti are much faster than previously.

Check out the following traceroutes to see what I mean...

When I do a TCP traceroute to port 21 or 22 for mail.charlvn.za.net (on Google), I get something like the following:

charlvn@charlvn-desktop:~$ tcptraceroute -f 3 mail.charlvn.za.net ftp
Selected device eth0, address 10.0.0.2, port 59392 for outgoing packets
Tracing the path to mail.charlvn.za.net (72.14.207.121) on TCP port 21 (ftp), 30 hops max
 3  cdsl1-rba-vl2257.ip.isnet.net (196.38.73.237)  63.065 ms  63.408 ms  62.633 ms
 4  curp4-rba-gi2.isdsl.net (196.38.73.114)  63.974 ms  63.263 ms  62.496 ms
 5  cdsl1-rba-vl50.ip.isnet.net (196.38.73.109)  61.975 ms  62.433 ms  62.679 ms
 6  core2b-rba-7600-gi1-0-10.ip.isnet.net (168.209.1.182)  62.701 ms  63.186 ms  62.852 ms
 7  168.209.161.234  319.134 ms  321.461 ms  321.422 ms
 8  168.209.163.61  334.373 ms  334.761 ms  334.459 ms
 9  168.209.225.178  334.441 ms  332.068 ms  334.446 ms
10  core2a-ny-gi1-0-19-22.ip.isnet.net (168.209.163.43)  329.959 ms  329.498 ms  330.032 ms
11  cp1-ny-gi0-1.ip.isnet.net (168.209.244.7)  316.938 ms  314.323 ms  317.648 ms
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *

If I do a TCP traceroute to mail.charlvn.za.net but on port 80, I get this:

charlvn@charlvn-desktop:~$ tcptraceroute -f 3 mail.charlvn.za.net
Selected device eth0, address 10.0.0.2, port 50716 for outgoing packets
Tracing the path to mail.charlvn.za.net (72.14.207.121) on TCP port 80 (www), 30 hops max
 3  cdsl1-rba-vl2257.ip.isnet.net (196.38.73.237)  62.350 ms  62.440 ms  62.353 ms
 4  curp4-rba-gi2.isdsl.net (196.38.73.114)  63.853 ms  65.709 ms  64.544 ms
 5  cdsl1-rba-vl50.ip.isnet.net (196.38.73.109)  62.872 ms  63.260 ms  61.563 ms
 6  core2b-rba-7600-gi1-0-10.ip.isnet.net (168.209.1.182)  62.889 ms  65.847 ms  63.798 ms
 7  168.209.161.234  318.726 ms  319.048 ms  321.106 ms
 8  168.209.163.61  334.228 ms  333.603 ms  334.144 ms
 9  168.209.225.178  334.544 ms  332.639 ms  334.694 ms
10  core2a-ny-gi1-0-19-22.ip.isnet.net (168.209.163.43)  331.339 ms  330.550 ms  331.198 ms
11  cp1-ny-gi0-1.ip.isnet.net (168.209.244.7)  314.505 ms  317.200 ms  314.487 ms
12  core1-0-2-0.lga.net.google.com (198.32.160.130)  319.171 ms  315.431 ms  317.610 ms
13  72.14.236.219  335.574 ms  334.892 ms  333.395 ms
14  72.14.232.111  347.598 ms  349.083 ms  347.671 ms
15  66.249.94.50  335.340 ms  348.494 ms  342.940 ms
16  eh-in-f121.google.com (72.14.207.121) [open]  332.557 ms * 340.430 ms

However, if I do a TCP traceroute to muti.co.za on port 80, I hit a transparent proxy:

charlvn@charlvn-desktop:~$ tcptraceroute -f 3 muti.co.za
Selected device eth0, address 10.0.0.2, port 58344 for outgoing packets
Tracing the path to muti.co.za (66.98.208.42) on TCP port 80 (www), 30 hops max
 3  cdsl1-rba-vl2257.ip.isnet.net (196.38.73.237)  63.093 ms  62.394 ms  62.858 ms
 4  curp1-rba-gi6.isdsl.net (196.38.72.238)  63.891 ms  64.535 ms  62.691 ms
 5  cdsl1-rba-vl43.ip.isnet.net (196.38.72.229)  61.971 ms  62.109 ms  62.877 ms
 6  core2b-rba-7600-gi1-0-10.ip.isnet.net (168.209.1.182)  62.923 ms  64.416 ms  62.611 ms
 7  web7.webfaction.com (66.98.208.42) [open]  62.915 ms * 64.618 ms

But if I do a TCP traceroute to muti.co.za on port 21 or 22, I still get something quite reasonable:

charlvn@charlvn-desktop:~$ tcptraceroute -f 3 muti.co.za ftp
Selected device eth0, address 10.0.0.2, port 34096 for outgoing packets
Tracing the path to muti.co.za (66.98.208.42) on TCP port 21 (ftp), 30 hops max
 3  cdsl1-rba-vl2257.ip.isnet.net (196.38.73.237)  62.441 ms  63.537 ms  63.576 ms
 4  curp1-rba-gi6.isdsl.net (196.38.72.238)  63.970 ms  63.199 ms  62.891 ms
 5  cdsl1-rba-vl43.ip.isnet.net (196.38.72.229)  61.980 ms  62.318 ms  63.922 ms
 6  core2b-rba-7600-gi1-0-10.ip.isnet.net (168.209.1.182)  62.629 ms  63.803 ms  62.443 ms
 7  168.209.161.234  320.373 ms  318.964 ms  321.274 ms
 8  168.209.163.61  334.419 ms  333.667 ms  334.552 ms
 9  168.209.225.178  333.134 ms  332.395 ms  331.357 ms
10  core2a-ny-gi1-0-19-22.ip.isnet.net (168.209.163.43)  327.729 ms  328.672 ms  328.937 ms
11  core1b-ny-gi0-0-2.ip.isnet.net (168.209.244.3)  313.526 ms  313.098 ms  313.177 ms
12  ge-3-11.r02.nycmny03.us.bb.gin.ntt.net (129.250.192.213)  427.159 ms  312.978 ms  311.066 ms
13  ae-1.r01.nycmny03.us.bb.gin.ntt.net (129.250.5.101)  311.533 ms  313.957 ms  314.680 ms
14  as-1.r20.dllstx09.us.bb.gin.ntt.net (129.250.3.42)  363.205 ms  379.074 ms  360.845 ms
15  po-1.r02.dllstx09.us.bb.gin.ntt.net (129.250.2.154)  360.998 ms  361.583 ms  360.685 ms
16  xe-4-4.r03.dllstx09.us.ce.gin.ntt.net (157.238.225.6)  354.532 ms  380.936 ms  354.398 ms
17  et1-1.ibr01.hstntx2.theplanet.com (70.87.253.50)  366.245 ms  369.172 ms  367.611 ms
18  et3-3.ibr01.hstntx1.theplanet.com (70.87.253.153)  352.322 ms  351.790 ms  350.627 ms
19  ivhou-207-218-223-22.theplanet.com (207.218.223.22)  354.475 ms  355.898 ms  356.514 ms
20  web7.webfaction.com (66.98.208.42) [open]  350.820 ms * 350.236 ms

Normally, the international latency would be much higher, more like 600 ms or above. However, look at it now. Just a little over 300 ms? It seems like we moved from satellite to landline. Also, what's up with that isnet.net router in New York and the transparent proxy server? Weird stuff.
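A quick back-of-the-envelope check supports the satellite suspicion. The sketch below assumes a geostationary satellite (altitude of about 35,786 km) sitting directly overhead, which is the best possible case, and ignores all routing and processing delays. Whether the old link really was geostationary is my assumption; the function name is made up for the example.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
GEO_ALTITUDE_KM = 35_786           # geostationary orbit altitude


def min_geo_rtt_ms(altitude_km=GEO_ALTITUDE_KM):
    """Lower bound on the round-trip time over one geostationary hop.

    A request goes up and down once, and so does the reply, so the
    signal covers the ground-to-satellite distance four times.
    """
    path_km = 4 * altitude_km
    return path_km / SPEED_OF_LIGHT_KM_S * 1000


print(round(min_geo_rtt_ms()))  # roughly 477 ms
```

So a round trip via such a satellite can never get below roughly 477 ms, which fits the 600 ms or more I used to see, while the 300-odd ms in the traces above is simply not possible over that kind of link.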

Also, even though the traceroutes to port 80 look reasonable, sometimes Google services are totally inaccessible over HTTP while HTTPS works fine. It definitely smells like shaping/proxy issues. SSH tunnels to overseas hosts work without any issues too.
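If you don't have tcptraceroute handy, a crude way to compare ports is to time a plain TCP connect to each one. The following is only a sketch with made-up function names; it measures nothing more than the TCP handshake, so it cannot tell you where along the path a proxy or shaper sits.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=5.0):
    """Return the TCP connect time in milliseconds, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000
    except OSError:
        # Covers refused connections, timeouts and DNS failures
        return None


def compare_ports(host, ports, timeout=5.0):
    """Map each port to its connect time (or None if unreachable)."""
    return {port: tcp_connect_ms(host, port, timeout) for port in ports}
```

Calling compare_ports("mail.charlvn.za.net", [80, 443]) a few times in a row would show whether port 80 behaves differently from port 443. A connect time on port 80 that is far lower than on the other ports would be consistent with a local transparent proxy answering on the remote server's behalf.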

Evolution a Theory?

I was just researching and found something interesting. There are some parts of evolution that are theory and some parts that are fact. Please see the Wikipedia page Evolution as theory and fact for more information.

However, if you take a look at the Wikipedia page on Theory, you get the following:

In science a theory is a testable model of the manner of interaction of a set of natural phenomena, capable of predicting future occurrences or observations of the same kind, and capable of being tested through experiment or otherwise verified through empirical observation.

By this definition, how can evolution be a scientific theory? Can it be tested or verified through empirical observation?

Maybe, but it would make a flipping interesting discussion. Also see Predictive power. As always, comments are open. :)

IPv6 in Africa

For the lazy or time-deprived, this post can be summarised with the following line: Get a flipping move-on with IPv6!

This was actually a discussion that started the other day on IRC and then Andrew Alston (Symmetria) pointed out that he already wrote an article about this on IS Labs. He makes some excellent points there and reminds us of the urgency of the matter.

I have been researching IPv6 for more than 6 years now and am frankly getting quite fed up with the lack of action, not only locally in Africa but also internationally. It seems like companies are taking a very crappy "we don't care - we NAT" attitude which is frustrating beyond belief.

When will people start to realise that NATting is not at all an elegant solution? Yes, it does plug the hole in the short run, and even gives some people a false sense of security. It is true that NAT doesn't generally create lots of issues. However, personally I am a major supporter of true peer-to-peer computing, and NAT together with ridiculous firewalls are major hurdles to overcome.

With rapid miniaturisation, one day not too far into the future, your wrist watch will integrate a full IP stack together with support for VoIP and NTP to synchronise the time. Yes, to use the now-lame example of "your fridge and your toaster will be internet connected", we will need more and more IP addresses.

Skype set out some interesting examples for us. It allows two users both behind a NAT router/firewall to have a conversation by channeling their traffic through other people's computers that happen to be directly internet-connected and with a public IP address and not firewalled in too strictly.

This is obviously really bad for Africans that have to pay a fortune to get access to minimal bandwidth or that have to contend with strict bitcaps.

For now, the situation is still working because there are enough "suckers" out there whose bandwidth can be (ab)used. However, if 99% of all the normal desktops get NATted, then we are screwed because the remaining 1% won't be able to keep up.

Even if we take peer-to-peer out of the equation, with the number of servers that are being put up all over the world, we are still going to get into serious trouble eventually.
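To put the address shortage in numbers, here is a small sketch using Python's standard ipaddress module. The 10.0.0.2 address is simply the one my own desktop shows in the traceroutes in the ISDSL post above; everything else follows from the sizes of the two address spaces.

```python
import ipaddress

# Total number of addresses in each protocol's address space
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(ipv4_total)                         # 4294967296, i.e. 2**32
print(ipv6_total == 2**128)               # True
print(ipv6_total // ipv4_total == 2**96)  # True

# A typical NATted desktop address falls in a private (RFC 1918)
# range and is not reachable from the public internet.
print(ipaddress.ip_address("10.0.0.2").is_private)  # True
```

Roughly 4.3 billion IPv4 addresses is less than one per person on the planet, before we even start counting servers, watches, fridges and toasters.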

It seems like most people are still in the mode of thinking "oh well, it isn't a problem right now, so we don't worry about it". This very type of short-term thinking is what caused serious problems for the world in general in the past.

So, please ISPs, wake up and do something now before it's too late!

Thoughts on URL Shorteners

For some reason, I sometimes seem to be fascinated by simple things. No, not the simple things in life like the air we breathe blah blah, but simple modern web services, particularly microblogging, URL shorteners, etc.

I've always hated TinyURL just because the site looks so ugly. The first nice one I discovered was urlTea. Not too long after that, we got some South African equivalents, particularly TinyLink and TimesURL.at, the latter even being hosted locally.

It seems like urlTea is currently down, but its source code is available (see lower down).

Let's just get one thing straight though: although I might think URL shorteners are cool, I am not a mad fan of their use. As a matter of fact, more often than not, I feel that they are being overused and often abused.

To me, the only times a URL shortener is really handy are when you need to send a long URL over SMS or, sometimes, in physical print media. To do so in normal web media seems generally pretty ridiculous.

There are also times when they can be a real pain. Let's say a social bookmarking service bans a particular domain from being submitted because of repeated spam. The spammers can then just use a URL shortener to circumvent the ban. Obviously, URL shorteners should themselves be banned from such services, but there are so many at the moment that it becomes a pain to keep track. Maybe somebody should create a central list of them in an easily machine-parsable format.
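Given such a list, the check on the bookmarking side would be trivial. Here is a minimal Python sketch; the list contents are placeholders (short.example is a made-up domain), and in practice the set would be loaded from whatever shared file or feed somebody ends up maintaining.

```python
from urllib.parse import urlparse

# Placeholder list (assumption); a real one would be loaded from a
# shared, regularly updated source.
SHORTENER_DOMAINS = {"tinyurl.com", "short.example"}


def is_shortened(url, domains=SHORTENER_DOMAINS):
    """Return True if url points at a known URL shortener domain."""
    host = (urlparse(url).hostname or "").lower()
    # Treat www.tinyurl.com the same as tinyurl.com
    if host.startswith("www."):
        host = host[4:]
    return host in domains


print(is_shortened("http://tinyurl.com/abc123"))  # True
print(is_shortened("http://muti.co.za/"))         # False
```

A service could then either reject such submissions outright or follow the redirect and apply its domain bans to the final destination instead.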

Also, my biggest pain is still my original pain - the loss of semantics. Very often, I hover over a link to see which address it points to. Often it gives me a good indication of what the content is about, especially if the URLs have been well designed.

But sometimes, you really do need a URL shortener for your own purposes. For example, muti incorporates its own built-in URL shortener for links posted to Twitter. This is so that if the link is deleted off muti because it's spam, the shortened URL will automatically stop working as well and the spammer won't get any benefit.
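The idea of tying the short URL to the link record itself can be sketched in a few lines of Python. To be clear, this is not muti's actual code, which I am not reproducing here; the LinkStore class, the in-memory dictionary and the base 62 encoding of the record ID are all assumptions made for the sake of the example.

```python
import string

ALPHABET = string.digits + string.ascii_letters  # 62 characters


def encode_id(n):
    """Encode a non-negative integer ID as a short base 62 string."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_code(code):
    """Turn a base 62 string back into the integer ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


class LinkStore:
    """In-memory stand-in for the site's table of submitted links."""

    def __init__(self):
        self._links = {}
        self._next_id = 1

    def submit(self, url):
        """Store a link and return its short code."""
        link_id = self._next_id
        self._next_id += 1
        self._links[link_id] = url
        return encode_id(link_id)

    def delete(self, code):
        """Remove a link, for example because it was flagged as spam."""
        self._links.pop(decode_code(code), None)

    def resolve(self, code):
        """Return the target URL, or None if the link no longer exists."""
        return self._links.get(decode_code(code))


store = LinkStore()
code = store.submit("http://example.com/some/very/long/path")
print(store.resolve(code))  # the original URL
store.delete(code)
print(store.resolve(code))  # None: the short URL died with the link
```

Because the short code is derived from the link's own ID rather than stored in a separate service, there is nothing extra to clean up when a spam link gets removed.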

Anyway, earlier tonight I went and did some research on existing open source URL shorteners. I was quite surprised to find four projects on Google Code that look quite interesting. The first three are shortur, urltea and shorty-links. These are all PHP-based systems. I then found a Django-based one called django-shorturl, and even urly, which is based on Google App Engine.

I'm sure if I go and do some more digging I'll even find more. Of course, I also have a couple of aces up my own sleeve, but more about that later. :)

Copyright © 2004-2009 Charl van Niekerk. All articles are released under the Creative Commons Attribution 2.5 South Africa licence, unless where otherwise stated.