Posted by Tom Moertel
Mon, 30 Jan 2012 04:48:00 GMT
I have finally done it! With my recent post on tree traversals, I have managed to write 100 thousand words for my blog:
>> Article.find(:all).inject(0) { |sum,a| sum +=
?> (a.body + a.extended.to_s).split(/\s+/).length }
=> 100334
That sounds impressive until you realize that my first blog post, Fun with Asterisk, was about nine years ago. So we’re only talking, on average, about 11 thousand words per year. And that’s not hard, if you stick with it.
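For anyone who wants to check the arithmetic, here is a minimal Ruby sketch. The helper name `words_per_year` is made up for illustration, and the nine-year span is the rough figure from the text, not an exact date difference.

```ruby
# Hypothetical helper: average words per year over an approximate span.
def words_per_year(total_words, years)
  total_words / years.to_f
end

# 100_334 is the console result above; 9 is "about nine years".
puts words_per_year(100_334, 9).round   # prints 11148
```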
For me, the trick has been sticking with it. I joined a startup at the end of 2007, and my blogging abruptly lost about four fifths of its pace:

So I need to discipline myself to blog more frequently. I hope the next 100 thousand words won’t take so long to write.
Finally, I’d like to take this opportunity to thank you for reading and commenting. You’re the reason I wrote those words in the first place. You made the first 100 thousand words fun.
Thank you!
Your pal,
Tom Moertel
Posted in site news
Tags 100k, blog, blogging, statistics, writing

Posted by Tom Moertel
Thu, 21 Jul 2011 00:16:00 GMT
In 2007, I counted how many words I had written for my blog and was surprised to discover it was more than 66 thousand, about a short novel’s worth. Just now, I stumbled upon that earlier count and wondered what the count was today.
>> Article.find(:all).inject(0) { |sum,a| sum +=
?> (a.body + a.extended.to_s).split(/\s+/).length }
=> 98702
I’m about 1300 words shy of 100 thousand.
Now I’ve got to write a few more posts for the blog.
:-)
Posted in writing
Tags blog, writing

Posted by Tom Moertel
Tue, 28 Dec 2010 19:02:00 GMT
What did people read on my blog in 2010? Mostly, it was older content. Here are the ten most-popular pages, ordered by unique page views relative to the home page’s (home page = 1.0):
1. A Coder’s Guide to Coffee (2002, popularity = 5.30). This oldie continues to be popular mainly because coders still drink coffee – and because the Guide gets rediscovered every few months and posted to Reddit or Hacker News. This year it got an additional boost from being the cover story of Hacker Monthly #4.
2. Never store passwords in a database! (2006, popularity = 3.18). Despite being 4 years old, this article gets a steady flow of readers because lots of programmers are still storing passwords in databases. And getting owned.
3. Ruby 1.9 gets handy new method Object#tap (2007, popularity = 1.37). I’m not sure why this article keeps getting the hits, but it does. People just love Object#tap, I guess.
4. Wondrous oddities: R’s function-call semantics (2006, popularity = 1.22). This article’s popularity is easy to explain: R continues to steamroll just about everything else in statistical computing and has a continuous influx of new, curious users who want to know more about R’s inner workings.
5. Verizon FiOS fiber-optic Internet service: a first look (2005, popularity = 1.05). I think this article is popular because I was an early adopter of FiOS and had one of the first hands-on reviews. It gets lots of search hits.
6. A couple of tips for writing Puppet manifests (2007, popularity = 1.02). I’m not sure either of these tips is still relevant. Still, this article brings in readers.
7. How I stopped missing Darcs and started loving Git (2007, popularity = 1.01). Programmers love to talk about DVCSs, Git and Darcs especially. Plus, if you search on “darcs git”, this article is one of the first results.
8. A type-based solution to the ‘strings problem’: a fitting end to XSS and SQL-injection holes? (2006, popularity = 1.00). This article remains popular because it gets readers from two sources: from religious wars over typing systems and from discussions of what to do about XSS vulnerabilities.
9. Don’t let password recovery keep you from protecting your users (2007, popularity = 0.93). This article is a follow-up to Never store passwords…! and tends to pick up a share of its sibling’s traffic.
10. On the evidence of a single coin toss (2010, popularity = 0.78). This short article raises a simple question: If I hand you a coin and claim that it always comes up heads, and you toss the coin and it does come up heads, how much more should you believe my claim compared to before the coin toss? This kind of question is irresistible to anyone even remotely Bayesian, so it ended up on Hacker News and got a lot of traffic in a few days. (The follow-up article is also popular, but didn’t make the top-ten list.)
So, once again, it looks like the old content dominates. Only one article from 2010 made the top ten, and just barely at that.
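The popularity scores above are ratios: a page's unique page views divided by the home page's. A minimal Ruby sketch of that calculation follows. The function name `relative_popularity` and the view counts are hypothetical, since the post does not publish raw numbers.

```ruby
# Hypothetical helper: score each page by unique views relative to a
# baseline page, then list pages from most to least popular.
def relative_popularity(views, baseline)
  base = views.fetch(baseline).to_f
  views
    .reject { |page, _| page == baseline }
    .map { |page, count| [page, (count / base).round(2)] }
    .sort_by { |_, score| -score }
end

# Made-up counts, chosen only to illustrate the ratio.
views = { "home page" => 1_000, "article A" => 5_300, "article B" => 780 }
relative_popularity(views, "home page").each do |page, score|
  puts "#{page}: popularity = #{score}"
end
```

A score above 1.0 means the page drew more unique views than the home page did.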
Posted in site news
Tags blog, content, popularity, statistics

Posted by Tom Moertel
Mon, 27 Dec 2010 00:14:00 GMT
After much neglect, my poor blog is finally getting some much-needed care.
The first improvement I wanted to make was to add support for TeX mathematics. Last week’s article on how to update your beliefs after observing a coin toss contained enough painstakingly entered mathematical notation to provide the necessary motivation. The solution I preferred was MathJax, a JavaScript library that runs in the browser to render TeX markup into mathematical notation after a page is loaded. But that solution created a new problem.
MathJax, you see, has a rather large footprint. And my blog runs on a decrepit server that is already overtaxed. So, to use MathJax, I first had to put a reasonably tuned cache in front of my blog to offload the byte-slinging duties soon to be imposed.
Varnish
Enter Varnish, an efficient, highly tunable, caching HTTP proxy. I set it up on a front-end server and told it to cache anything mostly static on the blog’s server:
# allow caching of mostly static resources
sub vcl_fetch {
    if (req.url ~ "\.(ico|png|gif|jpg|swf|css|js)$" ||
        req.url ~ "^/xml/.*\.xml$" ||
        req.url ~ "^/$") {
        set obj.ttl = 600s;
    }
    if (req.url ~ "/javascripts/MathJax/") {
        set obj.ttl = 3600s;
    }
    if (req.url ~ "\.(ico|png|gif|jpg|swf|css|js)\?[0-9]+$") {
        set obj.ttl = 1d;
    }
}
Basically, that bit of Varnish Configuration Language says that after the proxy fetches a resource from the back-end blog server, if it’s an image, stylesheet, script, feed, the home page, or a MathJax resource, it should be given some reasonable amount of time to live in the cache. Once in the cache, Varnish will serve it up until its time to live expires, when Varnish will finally ask the old blog server for a fresh copy.
This little change made a big difference in my blog’s responsiveness. It feels much snappier now. (Let me know if you agree.)
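To see which time to live a given URL ends up with under the VCL rules above, the three conditions can be modeled in plain Ruby. This is only a sketch of the matching logic, not anything Varnish runs: the function name `ttl_for` and the sample URLs are made up, and `nil` stands for "these rules leave the TTL untouched".

```ruby
# Hypothetical model of the three VCL conditions. The checks run in order,
# so a later match overrides an earlier one, just as successive
# "set obj.ttl" statements would. The patterns are copied from the VCL;
# URLs contain no newlines, so Ruby's line anchors behave the same here.
STATIC_EXT = /\.(ico|png|gif|jpg|swf|css|js)$/
FEED       = %r{^/xml/.*\.xml$}
HOME       = %r{^/$}
MATHJAX    = %r{/javascripts/MathJax/}
VERSIONED  = /\.(ico|png|gif|jpg|swf|css|js)\?[0-9]+$/

def ttl_for(url)
  ttl = nil
  ttl = 600    if url.match?(STATIC_EXT) || url.match?(FEED) || url.match?(HOME)
  ttl = 3_600  if url.match?(MATHJAX)
  ttl = 86_400 if url.match?(VERSIONED)   # 1d expressed in seconds
  ttl
end

# Sample URLs are invented for illustration.
["/", "/images/logo.png", "/javascripts/MathJax/MathJax.js",
 "/stylesheets/site.css?1293408000", "/articles/some-post"].each do |url|
  puts "#{url} -> #{ttl_for(url).inspect}"
end
```

Note that a MathJax script matches the first rule as well as the second, so it is the ordering of the rules that gives it the longer one-hour lifetime.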
MathJax
The front-end cache done, I moved on to installing MathJax. Basically, I downloaded a couple of Zip archives, decompressed them, and dropped the resulting files onto my blog’s server. Then I tweaked the blog’s default page template to load the root MathJax JavaScript file. That’s it.
Now I can have fun with TeX-markup mathematical formulas on the blog: $$1 + x_1 + x_2 + \cdots$$
The only downside to using a client-side library like MathJax is that it will probably not go so well for readers using Instapaper and e-readers. (If you’re one of them, let me know how it goes for you.)
Posted in site news
Tags blog, caching, mathjax, performance, tex, varnish

Posted by Tom Moertel
Thu, 23 Aug 2007 01:34:00 GMT
As everybody knows, statistics is fun. Is there anything cooler than crushing a heap of seemingly uninteresting numbers into gleaming jewels of meaning? Of course not! Models, data-visualization plots, and fat data sets are way cool. So, let’s find an excuse to play with them.
Here’s an excuse – I mean, an important and highly relevant question that many of us share: How many people actually read our blogs? To answer the question, we will need to use statistics, data, and cool plots. Further, if you’ve got the raw data for your blog, you can follow along with your own analysis. Even more fun!
We’ll start with a simple inspection of common web-log data, using command-line tools. After developing a rough understanding of what useful information we can extract, we’ll analyze the raw data using a series of successively more sophisticated techniques. In the end, we will derive a simple formula for estimating readership from easily obtainable data.
Sound good? Then let’s get rocking.
But first, a preemptive strike on would-be pooh-poohers: I know all about FeedBurner. I know they will track my blog’s subscribers and use their mystical powers to infer the number of “real” subscribers I have. I know it’s all so easy. But easy isn’t the point. I want to understand what’s going on. Just taking somebody’s word for it isn’t nearly as satisfying as figuring it out yourself – nor as fun.
OK. For real this time, let’s get rocking.
Posted in statistics
Tags blog, fun, modeling, R, statistics

Posted by Tom Moertel
Fri, 30 Mar 2007 04:34:00 GMT
How much content have I written for my blog? Let’s find out.
My blog runs on Typo, which is built upon Ruby on Rails. Let’s fire up the Rails console and gather a quick word count:
$ cd ~/blog
$ ruby script/console
Loading development environment.
>> require 'article'
=> true
>> Article.find(:all).inject(0) { |sum,a| sum +=
(a.body + a.extended.to_s).split(/\s+/).length }
=> 66665
So I have written about 66 kilo-words, which is entering novel territory. Paperback-wise, it’s about 190 pages.
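The console one-liner above needs a running Typo install, but the counting itself can be sketched in plain Ruby. In this sketch the `Article` struct is a stand-in for Typo's model (only the two fields the one-liner reads), and `total_words` is a made-up name. It also differs from the one-liner in two small ways, flagged in the comments, so its totals may not match the console's exactly.

```ruby
# Stand-in for Typo's Article model: just the two text fields.
Article = Struct.new(:body, :extended)

# Hypothetical helper: total whitespace-separated words across articles.
def total_words(articles)
  articles.sum do |a|
    # Join with a space so the last word of the body and the first word of
    # the extended text are not fused (the one-liner concatenates directly).
    text = "#{a.body} #{a.extended}"
    # String#split with no argument ignores leading whitespace, whereas
    # split(/\s+/) would yield an extra empty string for it.
    text.split.length
  end
end

articles = [
  Article.new("one two", "three"),
  Article.new(" four five ", nil),   # extended may be nil, as in the one-liner
]
puts total_words(articles)   # prints 5
```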
All I need now is a villain and some cool cover art.
;-)
Posted in site news, rails, writing
Tags blog, rails, word_count, words, writing
