Finally! I have blogged 100 thousand words.

Posted by Tom Moertel Mon, 30 Jan 2012 04:48:00 GMT

I have finally done it! With my recent post on tree traversals, I have managed to write 100 thousand words for my blog:

>> Article.find(:all).inject(0) { |sum,a| sum +=
?>        (a.body + a.extended.to_s).split(/\s+/).length }
=> 100334

That sounds impressive until you realize that my first blog post, Fun with Asterisk, was about nine years ago. So we’re only talking, on average, about 11 thousand words per year. And that’s not hard, if you stick with it.

For me, the trick has been sticking with it. I joined a startup at the end of 2007, and my blogging abruptly lost about four fifths of its pace:

[Figure: Tom's spotty writing record for blog.moertel.com]

So I need to discipline myself to blog more frequently. I hope the next 100 thousand words won’t take so long to write.

Finally, I’d like to take this opportunity to thank you for reading and commenting. You’re the reason I wrote those words in the first place. You made the first 100 thousand words fun.

Thank you!

Your pal,
Tom Moertel


Almost there: 100K words

Posted by Tom Moertel Thu, 21 Jul 2011 00:16:00 GMT

In 2007, I counted how many words I had written for my blog and was surprised to discover it was about 66 thousand, a short novel’s worth. Just now, I stumbled upon that earlier count and wondered what the count was today.

>> Article.find(:all).inject(0) { |sum,a| sum +=
?>        (a.body + a.extended.to_s).split(/\s+/).length }
=> 98702

I’m about 1300 words shy of 100 thousand.

Now I’ve got to write a few more posts for the blog.

:-)


Most popular articles on my blog for 2010: the old stuff rules

Posted by Tom Moertel Tue, 28 Dec 2010 19:02:00 GMT

What did people read on my blog in 2010? Mostly, it was older content. Here are the ten most-popular pages, ordered by unique page views relative to that of the home page (1.0):

1. A Coder’s Guide to Coffee (2002, popularity = 5.30). This oldie continues to be popular mainly because coders still drink coffee – and because the Guide gets rediscovered every few months and posted to Reddit or Hacker News. This year it got an additional boost from being the cover story of Hacker Monthly #4.

2. Never store passwords in a database! (2006, popularity = 3.18). Despite being 4 years old, this article gets a steady flow of readers because lots of programmers are still storing passwords in databases. And getting owned.

3. Ruby 1.9 gets handy new method Object#tap (2007, popularity = 1.37). I’m not sure why this article keeps getting the hits, but it does. People just love Object#tap, I guess.

4. Wondrous oddities: R’s function-call semantics (2006, popularity = 1.22). This article’s popularity is easy to explain: R continues to steamroll just about everything else in statistical computing and has a continuous influx of new, curious users who want to know more about R’s inner workings.

5. Verizon FiOS fiber-optic Internet service: a first look (2005, popularity = 1.05). I think this article is popular because I was an early adopter of FiOS and had one of the first hands-on reviews. It gets lots of search hits.

6. A couple of tips for writing Puppet manifests (2007, popularity = 1.02). I’m not sure either of these tips is still relevant. Still, this article brings in readers.

7. How I stopped missing Darcs and started loving Git (2007, popularity = 1.01). Programmers love to talk about DVCSs, Git and Darcs especially. Plus, if you search on “darcs git”, this article is one of the first results.

8. A type-based solution to the ‘strings problem’: a fitting end to XSS and SQL-injection holes? (2006, popularity = 1.00). This article remains popular because it gets readers from two sources: from religious wars over typing systems and from discussions of what to do about XSS vulnerabilities.

9. Don’t let password recovery keep you from protecting your users (2007, popularity = 0.93). This article is a follow-up to Never store passwords…! and tends to pick up a share of its sibling’s traffic.

10. On the evidence of a single coin toss (2010, popularity = 0.78). This short article raises a simple question: If I hand you a coin and claim that it always comes up heads, and you toss the coin and it does come up heads, how much more should you believe my claim compared to before the coin toss? This kind of question is irresistible to anyone even remotely Bayesian, so it ended up on Hacker News and got a lot of traffic in a few days. (The follow-up article is also popular, but didn’t make the top-ten list.)

So, once again, it looks like the old content dominates. Only one article from 2010 made the top ten, and just barely at that.
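The popularity scores above are ratios: each page's unique page views divided by the home page's. Here is a minimal Ruby sketch of that calculation. The paths and view counts are made up purely for illustration (they are not my real traffic numbers), and `relative_popularity` is a hypothetical helper, not something from my analytics setup.

```ruby
# Unique page views per page. These counts are invented for illustration only.
views = {
  "/" => 1000,             # home page, the baseline (popularity 1.0)
  "/coffee-guide" => 5300, # hypothetical path
  "/passwords" => 3180,    # hypothetical path
}

# Return [path, score] pairs, where score is the page's unique views
# divided by the baseline page's unique views, most popular first.
def relative_popularity(views, baseline = "/")
  base = views.fetch(baseline).to_f
  views.reject { |path, _| path == baseline }
       .map { |path, count| [path, (count / base).round(2)] }
       .sort_by { |_, score| -score }
end

relative_popularity(views).each do |path, score|
  puts "#{path}: popularity = #{score}"
end
```

A score above 1.0 means the page drew more unique views than the home page did.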


Blog updates: faster, mathier, and more cacheable

Posted by Tom Moertel Mon, 27 Dec 2010 00:14:00 GMT

After much neglect, my poor blog is finally getting some much-needed care.

The first improvement I wanted to make was to add support for TeX mathematics. Last week’s article on how to update your beliefs after observing a coin toss contained enough painstakingly entered mathematical notation to provide the necessary motivation. The solution I preferred was MathJax, a JavaScript library that runs in the browser to render TeX markup into mathematical notation after a page is loaded. But that solution created a new problem.

MathJax, you see, has a rather large footprint. And my blog runs on a decrepit server that is already overtaxed. So, to use MathJax, I first had to put a reasonably tuned cache in front of my blog to offload the byte-slinging duties soon to be imposed.

Varnish

Enter Varnish, an efficient, highly tunable, caching HTTP proxy. I set it up on a front-end server and told it to cache anything mostly static on the blog’s server:

# allow caching of mostly static resources
sub vcl_fetch {
 if (req.url ~ "\.(ico|png|gif|jpg|swf|css|js)$" ||
     req.url ~ "^/xml/.*\.xml$" ||
     req.url ~ "^/$") {
   set obj.ttl = 600s;
 }
 if (req.url ~ "/javascripts/MathJax/") {
   set obj.ttl = 3600s;
 }
 if (req.url ~ "\.(ico|png|gif|jpg|swf|css|js)\?[0-9]+$") {
   set obj.ttl = 1d;
 }
}

Basically, that bit of Varnish Configuration Language says that after the proxy fetches a resource from the back-end blog server, if it’s an image, stylesheet, script, feed, the home page, or a MathJax resource, it should be given some reasonable amount of time to live in the cache. Once a resource is in the cache, Varnish will serve it up until its time to live expires, at which point Varnish will finally go back to the old blog server for a fresh copy.
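To see which rule wins for a given URL, here is a rough Ruby sketch that mirrors the three `if` blocks in the VCL above. It is only a model for reasoning about the rules, not anything Varnish runs; `ttl_for` is a hypothetical helper, and the example URLs in the usage lines are assumptions about what my blog's paths look like.

```ruby
# Return the cache TTL in seconds that the VCL rules would assign to a URL,
# or nil when no rule matches. The checks run in the same order as the VCL,
# so a later matching rule overrides an earlier one.
def ttl_for(url)
  ttl = nil
  # Mostly static resources, feeds, and the home page: 600 seconds.
  ttl = 600 if url =~ /\.(ico|png|gif|jpg|swf|css|js)$/ ||
               url =~ %r{^/xml/.*\.xml$} ||
               url =~ %r{^/$}
  # Anything under the MathJax directory: one hour.
  ttl = 3600 if url =~ %r{/javascripts/MathJax/}
  # Static assets with a numeric cache-busting query string: one day.
  ttl = 86_400 if url =~ /\.(ico|png|gif|jpg|swf|css|js)\?[0-9]+$/
  ttl
end

puts ttl_for("/javascripts/MathJax/MathJax.js")    # matches rules 1 and 2; rule 2 wins
puts ttl_for("/stylesheets/site.css?1293408000")   # matches rule 3 only
```

Note the ordering: a MathJax script matches the first rule as well, but the second rule runs later and sets the longer TTL. A URL that matches none of the rules gets no TTL here, which in Varnish means it falls back to the proxy's defaults and whatever caching headers the back end sends.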

This little change made a big difference in my blog’s responsiveness. It feels much snappier now. (Let me know if you agree.)

MathJax

The front-end cache done, I moved on to installing MathJax. Basically, I downloaded a couple of Zip archives, decompressed them, and dropped the resulting files onto my blog’s server. Then I tweaked the blog’s default page template to load the root MathJax JavaScript file. That’s it.

Now I can have fun with TeX-markup mathematical formulas on the blog: $$1 + x_1 + x_2 + \cdots$$

The only downside to using a client-side library like MathJax is that it will probably not go so well for readers using Instapaper and e-readers. (If you’re one of them, let me know how it goes for you.)


Fun with statistics: estimating blog readership (a do-it-yourself recipe)

Posted by Tom Moertel Thu, 23 Aug 2007 01:34:00 GMT

As everybody knows, statistics is fun. Is there anything cooler than crushing a heap of seemingly uninteresting numbers into gleaming jewels of meaning? Of course not! Models, data-visualization plots, and fat data sets are way cool. So, let’s find an excuse to play with them.

Here’s an excuse – I mean, an important and highly relevant question that many of us share: How many people actually read our blogs? To answer the question, we will need to use statistics, data, and cool plots. Further, if you’ve got the raw data for your blog, you can follow along with your own analysis. Even more fun!

We’ll start with a simple inspection of common web-log data, using command-line tools. After developing a rough understanding of what useful information we can extract, we’ll analyze the raw data using a series of successively more sophisticated techniques. In the end, we will derive a simple formula for estimating readership from easily obtainable data.

Sound good? Then let’s get rocking.

But first, a preemptive strike on would-be pooh-poohers: I know all about FeedBurner. I know they will track my blog’s subscribers and use their mystical powers to infer the number of “real” subscribers I have. I know it’s all so easy. But easy isn’t the point. I want to understand what’s going on. Just taking somebody’s word for it isn’t nearly as satisfying as figuring it out yourself – nor as fun.

OK. For real this time, let’s get rocking.

Read more...


I have written a short novel's worth of content for my blog

Posted by Tom Moertel Fri, 30 Mar 2007 04:34:00 GMT

How much content have I written for my blog? Let’s find out.

My blog runs on Typo, which is built upon Ruby on Rails. Let’s fire up the Rails console and gather a quick word count:

$ cd ~/blog
$ ruby script/console 
Loading development environment.
>> require 'article'
=> true
>> Article.find(:all).inject(0) { |sum,a| sum +=
       (a.body + a.extended.to_s).split(/\s+/).length }
=> 66665

So I have written about 66 kilo-words, which is entering novel territory. Paperback-wise, it’s about 190 pages.
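If you want to try the same count without a Rails console, here is a minimal, self-contained sketch. The `Article` struct is a stand-in I made up for Typo's model (it only has the two fields the one-liner touches), and the sample articles are invented; `word_count` is a hypothetical helper name.

```ruby
# Stand-in for Typo's Article model: just the two text fields the count uses.
# (Hypothetical; the real articles live in the blog's database.)
Article = Struct.new(:body, :extended)

# Sum the whitespace-separated words in each article's body plus its
# optional extended text. extended may be nil, hence the to_s.
def word_count(articles)
  articles.inject(0) do |sum, a|
    sum + (a.body + a.extended.to_s).split(/\s+/).length
  end
end

# Invented sample data, for illustration only.
articles = [
  Article.new("Hello brave new world", nil),
  Article.new("Two words", " plus three more"),
]

puts word_count(articles)
```

Two caveats carry over from the original one-liner: body and extended text are joined with no separator, so the last word of one and the first word of the other merge unless whitespace sits between them, and a text that begins with whitespace counts one extra (empty) word. As for the page estimate, 66,665 words at roughly 350 words per page (my assumed figure) works out to about 190 pages.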

All I need now is a villain and some cool cover art.

;-)
