<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Joy of Hack &#187; Programming</title>
	<atom:link href="http://www.aijazansari.com/tag/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.aijazansari.com</link>
	<description>For people who like to make things</description>
	<lastBuildDate>Tue, 20 Jul 2010 13:20:08 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>The Performance Cost of Using WordPress</title>
		<link>http://www.aijazansari.com/2010/03/31/performance-cost-of-using-wordpress/</link>
		<comments>http://www.aijazansari.com/2010/03/31/performance-cost-of-using-wordpress/#comments</comments>
		<pubDate>Wed, 31 Mar 2010 17:57:48 +0000</pubDate>
		<dc:creator>Aijaz Ansari</dc:creator>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[Apache]]></category>
		<category><![CDATA[Caching]]></category>
		<category><![CDATA[DNS]]></category>
		<category><![CDATA[Efficiency]]></category>
		<category><![CDATA[Etag]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[Metrics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Safari Web Browser]]></category>
		<category><![CDATA[Spiders]]></category>
		<category><![CDATA[TaskForest]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[Websites]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.aijazansari.com/?p=531</guid>
		<description><![CDATA[I recently tried to switch a website over from a homegrown content management system to WordPress.  The results were thoroughly disheartening.  This post illustrates the steps I took, and how I managed to move in the opposite direction and optimize my site even more.  ]]></description>
			<content:encoded><![CDATA[<p>Happy with my experience with a custom WordPress installation for this blog, I decided to try using the blogging platform for the <a title="TaskForest Job Scheduler" href="http://www.taskforest.com/">TaskForest</a> website.  The two main reasons were the ease of creating RSS feeds and the ability for users to comment on posts or articles.  After a few days of tinkering around, I&#8217;ve come to the conclusion that, at least for TaskForest, WordPress would cause more problems than it would solve. Here&#8217;s how I came to that conclusion:</p>
<h2>Setting up a Sandbox Domain</h2>
<p>The first step in trying out WordPress was to set up a new domain just for testing out the WordPress installation.  This way, I wouldn&#8217;t affect the taskforest.com domain during my experiments.  I happen to run my own name servers using Daniel Bernstein&#8217;s <a href="http://cr.yp.to/djbdns.html"><em>tinydns</em></a>, so I decided to create a new domain called <em>tf.enoor.com</em>, a subdomain of my defunct company&#8217;s domain.  Since I use bluehost.com&#8217;s WordPress hosting service, I had to make their name servers responsible for the domain.  All that&#8217;s needed is adding the following two lines to <em>/etc/tinydns/root/data</em>:</p>
<pre class="brush: plain;">
&amp;tf.enoor.com:74.220.195.31:ns1.bluehost.com:300
&amp;tf.enoor.com:69.89.16.4:ns2.bluehost.com:300
</pre>
<h2>Selecting a Theme</h2>
<p>After setting up WordPress, the next step was to select a theme.  I wanted something similar to TaskForest&#8217;s current design, so I chose the remarkably customizable <a title="Suffusion" href="http://www.aquoid.com/news/themes/suffusion/">Suffusion</a> theme.  After some tinkering I was able to get a site that was quite similar to the original, with one compromise &#8211; I could get either the logo or the site&#8217;s name in the header, but not both the way they appear on the current site.  It&#8217;s quite important that the site&#8217;s name appear in the header, because that helps with search engine ranking.  So with a heavy heart I decided to omit the logo.</p>
<h2>The Problem with Content</h2>
<p>TaskForest ships with its own web server to support its <a href="http://en.wikipedia.org/wiki/Representational_State_Transfer">REST</a> interface.  Part of the included website is all the documentation for the system. Just like the code, this documentation is under source control, and it&#8217;s also used to populate the taskforest.com website.  This way, I can ensure that both the taskforest.com website and a user&#8217;s local install have the most up-to-date docs, as long as the user is running the latest version of the software.  What this also means is that I have a few dozen webpages that need to be transferred to WordPress before the new site can go live.  I was already resigned to the fact that the URLs of these pages would be different &#8211; the current site has URLs that look like <em>http://www.taskforest.com/about.html</em>, but the default WordPress installation would use URLs that look like <em>http://www.taskforest.com/about/</em>.  It&#8217;s not a huge deal, but I prefer my way.</p>
<p>The bigger issue is that when a new version of the software is released, the pages change.  The current build process ensures that the client website and the taskforest.com website stay in sync.  Now if I use WordPress, I don&#8217;t want to manually edit the pages using the WP admin site.  I need to install a new plugin that handles inclusion of files.  So, I installed the WP Include plugin.  I&#8217;d have to change my build process, but I could get it to work.</p>
<h2>Putting It All Together</h2>
<p>Okay, so I had the themes and plugins installed and the process worked out.  It was then time to try a proof of concept with a single page.  It worked just as expected, but the site seemed very sluggish.  I made sure I didn&#8217;t have anything enabled that I didn&#8217;t need.  Still, the site was noticeably slower than the existing site, and that&#8217;s with only one non-blank page and zero blog posts.  I thought that maybe the problem was that the bluehost.com shared server was too slow.  I just happened to have an unused server of the same kind in the same data center that hosts taskforest.com.  It took the better part of the morning, but I installed PHP, MySQL and WordPress on that server.  I installed PHP as a static module within Apache, for optimal performance.  Even then, on a pristine machine running nothing else, it was <strong>slow</strong>.</p>
<h2>How Slow is Slow?</h2>
<p>Anyone who knows me knows that I&#8217;m an Engineer, and as an Engineer I like metrics.  I wanted to know that it wasn&#8217;t just my own bias that was penalizing WordPress over my established way of doing things.  So, I looked at the Safari web browser on the Mac.  Safari has a really useful feature called the Web Inspector that, among other things, displays the amount of time it takes for different parts of a page to load.  The numbers were very surprising.  With WordPress, a new page would take <strong>1.75</strong> seconds to load &#8211; an eternity on a high-speed broadband connection.  A subsequent request of the same page would take about <strong>700ms</strong>. With the existing TaskForest website&#8217;s strategy, a new page took about <strong>800ms</strong> and a cached page about <strong>275ms</strong>.  That&#8217;s more than <strong>twice</strong> as fast!  The TaskForest website used a preprocessor that allowed for including header files.  It ran under mod_perl.  I decided to write a spider that crawled the website and converted everything to static pages, so that I could serve those pages without mod_perl.  With this enhancement, the results were even better.  I saw a <strong>400% to 800% speed increase</strong>, with cached pages taking just <strong>140ms</strong> to load.</p>
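<p>To give an idea of what such a spider involves, here is a minimal sketch in Python.  It is not the spider I used for taskforest.com; the names <em>crawl</em>, <em>local_path</em> and <em>fetch_page</em> are made up for illustration.  It assumes every link on the site points to an HTML page encoded as UTF-8, that the output directory name is not empty, and it does no error handling for failed requests.</p>
<pre class="brush: python;">
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_page(url):
    """Default fetcher: download a page and decode it as UTF-8."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def local_path(url, out_dir):
    """Map a URL path such as /about.html to a file under out_dir."""
    path = urlparse(url).path.lstrip("/")
    if path == "" or path.endswith("/"):
        path += "index.html"
    return os.path.join(out_dir, *path.split("/"))


def crawl(start_url, out_dir, fetch=fetch_page):
    """Breadth-first crawl of one host, saving each page as a static file."""
    host = urlparse(start_url).netloc
    queue = [start_url]
    seen = {start_url}
    while queue:
        url = queue.pop(0)
        html = fetch(url)
        target = local_path(url, out_dir)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "w", encoding="utf-8") as handle:
            handle.write(html)
        # Queue every link that stays on the same host, ignoring fragments.
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)
</pre>
<p>Passing the fetcher in as a parameter keeps the crawl logic separate from the network, so the same function can be pointed at a live site or fed canned pages while testing.</p>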
<p>One of the reasons WordPress&#8217;s numbers were so poor is that, if articles like <a href="http://www.codinghorror.com/blog/2008/04/behold-wordpress-destroyer-of-cpus.html">this one</a> are still correct, WordPress does not cache the content of pages by default.  Resources within the HTML page, like images or JavaScript files, seem to be cached, but every page still does up to 120 database accesses depending on the setup.  Apache is a lot better at caching requests for static HTML pages.  Even the TaskForest webserver used for the REST API caches data very intelligently using the HTTP headers.</p>
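<p>As a rough illustration of caching with HTTP headers, the Python sketch below shows the general idea behind an ETag check: the server tags each response with a fingerprint of its body, and when the browser sends that tag back in an <em>If-None-Match</em> header the server can answer <em>304 Not Modified</em> without resending the page.  This is not the TaskForest server&#8217;s actual code; the function names and the <em>max-age</em> value of 300 seconds are assumptions made for the example, and weak tags and the <em>*</em> wildcard are not handled.</p>
<pre class="brush: python;">
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the bytes of the response body."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, payload) for a GET of a static page.

    if_none_match is the value of the client's If-None-Match header,
    or None on a first visit.
    """
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=300"}
    if if_none_match is not None and etag in [
        tag.strip() for tag in if_none_match.split(",")
    ]:
        # The client's copy is current: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body
</pre>
<p>A first request gets the full page and its tag; a repeat request that presents the same tag costs only a few header bytes, which is where most of the savings on cached pages come from.</p>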
<h2>What Does This All Mean?</h2>
<p>Given the number of compromises I would have to make in the design, layout and build process, switching to WordPress would have been very painful.  Add the performance hit the site would be taking, and it seems unlikely that I&#8217;ll switch to WP for the TaskForest website any time soon.  I think I&#8217;d rather write a script to generate RSS feeds on demand and automatically submit the feed to feed notification sites like pingomatic. I think this was a great learning experience that will help me with similar decisions in the future.  And I also got to write a cool spider and increase the speed of the TaskForest web site.</p>
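<p>For what it&#8217;s worth, generating a basic RSS 2.0 feed does not take much code.  The Python sketch below is only an assumption about what such a script might look like, not something I have deployed; the function name <em>build_rss</em> and the shape of the <em>items</em> list are made up for the example, and notifying sites like pingomatic is left out.</p>
<pre class="brush: python;">
import xml.etree.ElementTree as ET
from email.utils import format_datetime


def build_rss(title, link, description, items):
    """Build an RSS 2.0 document as a string.

    items is a list of dicts with the keys "title", "link" and "date",
    where "date" is a timezone-aware datetime.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "guid").text = entry["link"]
        # RSS expects dates in the RFC 822 style used by e-mail headers.
        ET.SubElement(item, "pubDate").text = format_datetime(entry["date"])
    return ET.tostring(rss, encoding="unicode")
</pre>
<p>Because the build process already knows which pages changed in a release, it could hand that list to a function like this one and write the result next to the static pages.</p>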
<div id="attachment_559" class="wp-caption alignnone" style="width: 595px"><a rel="attachment wp-att-559" href="http://www.aijazansari.com/2010/03/31/performance-cost-of-using-wordpress/webinspector/"><img class="size-large wp-image-559" title="Safari's Web Inspector" src="http://www.aijazansari.com/wp-content/uploads/2010/03/WebInspector-585x351.png" alt="" width="585" height="351" /></a><p class="wp-caption-text">Safari&#39;s Web Inspector</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.aijazansari.com/2010/03/31/performance-cost-of-using-wordpress/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Sometimes Text Files Are Better Than Databases</title>
		<link>http://www.aijazansari.com/2010/03/08/sometimes-text-files-are-better-than-databases/</link>
		<comments>http://www.aijazansari.com/2010/03/08/sometimes-text-files-are-better-than-databases/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 03:08:33 +0000</pubDate>
		<dc:creator>Aijaz Ansari</dc:creator>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Diff]]></category>
		<category><![CDATA[Editors]]></category>
		<category><![CDATA[Grep]]></category>
		<category><![CDATA[GUI]]></category>
		<category><![CDATA[Persistence]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[RDBMS]]></category>
		<category><![CDATA[Representation]]></category>
		<category><![CDATA[SSH]]></category>
		<category><![CDATA[TaskForest]]></category>
		<category><![CDATA[Text]]></category>

		<guid isPermaLink="false">http://www.aijazansari.com/?p=499</guid>
		<description><![CDATA[There are many classes of applications for which text files are the preferred means of storing data.  One of the main reasons is that when data is stored in a relational database, editing it is not a trivial task.  A well-normalized database is not easily updated via an SQL command line.  More often than not, a dedicated, graphical editor is needed to model the complex relationships.    ]]></description>
			<content:encoded><![CDATA[<p><a rel="attachment wp-att-504" href="http://www.aijazansari.com/2010/03/08/sometimes-text-files-are-better-than-databases/file/"><img class="alignleft size-full wp-image-504" title="A File" src="http://www.aijazansari.com/wp-content/uploads/2010/03/File.png" alt="A File" width="108" height="136" /></a>I remember in my first Computer Programming class in college, the instructors wanted to make sure we understood the concept of persistence by saving application data to disk.  To keep things simple we would serialize data and save it to text files.  Once we learned advanced concepts we migrated to using relational databases.  As a professional, most of the apps I see use an RDBMS like DB2, PostgreSQL, Sybase or Oracle.  Text files have been relegated to the simple homework assignments of Programming 101.</p>
<p>There are, however, many classes of applications for which text files are the preferred means of storing data.  One of the main reasons is that <strong>when data is stored in a relational database, editing it is not a trivial task</strong>.  A well-normalized database is not easily updated via an SQL command line.  More often than not, a dedicated, graphical editor is needed to model the complex relationships.</p>
<p>Several years ago, when I wrote <a href="http://www.taskforest.com/">TaskForest</a>, one of the initial design requirements was that it be easily configurable with just a shell prompt and one&#8217;s favorite text editor. Many of the servers I maintained for schools and non-profits were old boxes which I administered by logging into them via ssh. So when it came to designing job definitions and dependencies, I chose a text file representation. The benefits of text files over a graphical user interface for this include:</p>
<dl>
<dt>Easy Remote Access</dt>
<dd>All you need is the ability to get to a command line and a text editor on the machine that holds the data files. With such low client access requirements, virtually any old machine that has internet access and an ssh client can be used to administer the system. I have often worked on my own taskforestd server from a local internet cafe using a copy of Putty.exe downloaded minutes earlier.</dd>
<dt>Mobile Access</dt>
<dd>Text files also make work relatively easy using a mobile ssh client like Idokorro Mobile SSH. A dedicated mobile client would be ideal, but short of that, the text file approach assures low bandwidth usage and easy-to-make changes.</dd>
<dt>Flexibility</dt>
<dd>The simple, easily parseable format of text files allows us to build richer graphical clients later that would use a graphical interface to specify relationships between jobs.</dd>
<dt>Source Control</dt>
<dd> The text based format makes it easy to place the data files under source control. You can also easily <em>diff</em> different versions of the same data file.</dd>
<dt><em>Grep</em></dt>
<dd> When you have dozens of job group files and hundreds of jobs, you may need to answer questions like: &#8220;Are we still running Job J?&#8221; This can easily be answered by <em>grep</em>ping the files for job J.</dd>
<dt>Low footprint</dt>
<dd>When you&#8217;re designing an open-source application you may want to minimize the complexity of the system by not forcing dependencies on major subsystems like GUI libraries and Relational Databases.  Of course there is a point at which such dependencies are inevitable &#8211; you have to periodically re-evaluate your decisions and determine whether decisions that were correct, say, a year ago are still correct today.</dd>
</dl>
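<p>The <em>grep</em> benefit is easy to reproduce in a few lines of code as well, which is part of what makes plain text so flexible.  The Python sketch below searches every file in a directory for a job name as a whole word.  It is only an illustration: the function name <em>grep_files</em> is made up, and it assumes nothing about the TaskForest file format beyond job names being alphanumeric words in UTF-8 text files.</p>
<pre class="brush: python;">
import os
import re


def grep_files(directory, word):
    """Return (file name, line number, line) for each line containing word."""
    # Word boundaries keep a search for J1 from also matching J10.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as handle:
            for number, line in enumerate(handle, start=1):
                if pattern.search(line):
                    matches.append((name, number, line.rstrip("\n")))
    return matches
</pre>
<p>An empty result answers the question &#8220;Are we still running Job J?&#8221; with a no; anything else tells you exactly which file and line to look at.</p>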
<h2>Choosing A Text Format</h2>
<p>In the case of the TaskForest project, the most difficult task by far was choosing which text format to use.  I went through several iterations trying to find one that was simple to read and write, and yet rich enough to model the domain space completely.  What worked for me (and might work for you) was to ask myself how I would represent the data I&#8217;m trying to save given just a pencil and paper.  Drawing in a notebook gave me the flexibility to sketch and edit easily, and once I had a good representation, converting that to a text file was a simple task.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.aijazansari.com/2010/03/08/sometimes-text-files-are-better-than-databases/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Most Dangerous Programming Errors</title>
		<link>http://www.aijazansari.com/2010/02/26/the-most-dangerous-programming-errors/</link>
		<comments>http://www.aijazansari.com/2010/02/26/the-most-dangerous-programming-errors/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 15:00:10 +0000</pubDate>
		<dc:creator>Aijaz Ansari</dc:creator>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[Bugs]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://www.aijazansari.com/?p=478</guid>
		<description><![CDATA[The Common Weakness Enumeration (CWE) has released their list of Top 25 Most Dangerous Programming Errors. This list and the explanations of the errors are very instructive and should help both novice and expert programmers.  If you&#8217;re a developer, I strongly urge you to read this document and make sure you understand the concepts it [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_481" class="wp-caption alignleft" style="width: 295px"><a rel="attachment wp-att-481" href="http://www.aijazansari.com/2010/02/26/the-most-dangerous-programming-errors/mac-00117-img_6318_2/"><img class="size-medium wp-image-481" title="Streams at the Great Smoky Mountains" src="http://www.aijazansari.com/wp-content/uploads/2010/02/Mac-00117-IMG_6318_2-285x190.jpg" alt="Streams at the Great Smoky Mountains" width="285" height="190" /></a><p class="wp-caption-text">Streams at the Great Smoky Mountains</p></div>
<p>The Common Weakness Enumeration (CWE) has released their list of <a href="http://cwe.mitre.org/top25/">Top 25 Most Dangerous Programming Errors</a>. This list and the explanations of the errors are very instructive and should help both novice and expert programmers.  If you&#8217;re a developer, I strongly urge you to read this document and make sure you understand the concepts it covers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.aijazansari.com/2010/02/26/the-most-dangerous-programming-errors/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
