Downloads of The Machinery of Freedom
I was still curious, so I went to the logs folder on my site and downloaded access_log, which was current. It's a text file, so I loaded it into Word, found the text representing a download of the Machinery pdf, and used Word's search and replace feature to count how many times the text appeared. To my astonishment, the result was over fourteen thousand.
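The same count can be made without Word. What follows is only a minimal sketch: it assumes the log is in the usual Apache Common or Combined Log Format, and the file name `Machinery.pdf` and the sample lines are made up for illustration, not taken from my actual log.

```python
import re

# Matches the request portion of a Common/Combined Log Format line, e.g.
#   "GET /Machinery.pdf HTTP/1.1" 200 529408
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def count_hits(lines, filename):
    """Count log lines whose requested path (query string ignored) ends with filename."""
    hits = 0
    for line in lines:
        m = REQUEST_RE.search(line)
        if m and m.group("path").split("?")[0].endswith(filename):
            hits += 1
    return hits

# Made-up sample lines; the file name is a hypothetical stand-in.
sample = [
    '1.2.3.4 - - [20/Jun/2010:10:00:00 -0700] "GET /Machinery.pdf HTTP/1.1" 200 529408 "-" "Mozilla/5.0"',
    '1.2.3.5 - - [20/Jun/2010:10:00:05 -0700] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '1.2.3.6 - - [20/Jun/2010:10:00:09 -0700] "GET /Machinery.pdf HTTP/1.1" 206 65536 "-" "Mozilla/5.0"',
]
sample_hits = count_hits(sample, "Machinery.pdf")

# To run against a real log, something like:
# with open("access_log", errors="replace") as f:
#     print(count_hits(f, "Machinery.pdf"))
```

Like the search-and-replace count, this counts every request for the file, whether or not the whole file was actually transferred.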
I wasn't sure I believed it, so I located a Mac program for analyzing weblogs ("Traffic Report"), and downloaded it—the program permits a free trial for a month. It agreed with my previous result, showing 14,526 hits for the file. Since it's a single pdf, hits ought to be downloads. I then sent enthusiastic emails to Patri, my agent, and my contact at Open Court, which published Machinery—and until recently wouldn't let me web it.
This morning, I took a look at the summary provided by my ISP. It had been updated. But its figure for downloads, although still more than I would have expected a few days back, was much lower: 2867. On the other hand, its figures for daily traffic showed it roughly doubling the day after I webbed the Machinery pdf, an increase of almost 2000 hits a day, and staying up thereafter. Assuming those are all Machinery downloads, that would be a total of about ten thousand, much larger than the ISP's own figure for the file but lower than the Traffic Report figure.
My current (optimistic) guess is that the Traffic Report figure is correct and there is still something wrong with the Webalizer figure. An alternative possibility is that the two sources are reporting on different information. Perhaps, for example, Webalizer has some way of filtering out hits from web spiders—although it's hard to believe that that could represent so large a difference, and I am not seeing a similar difference on the figures for other files.
It occurred to me that some readers of this blog probably know a lot more than I do about analyzing web traffic, hence this post. I'm hoping that one of you can suggest a plausible explanation for the discrepancy between the number produced by Webalizer and the number produced by Traffic Report, and some way of testing whether the explanation is correct.
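One possible way of testing the spider explanation is to split the hits on the file by user agent and by HTTP status code, and see whether any one group is large enough to account for the gap. The sketch below assumes the Combined Log Format, with the user agent as the last quoted field. The list of spider markers is a rough guess and certainly not exhaustive, and the file name and sample lines are invented for illustration.

```python
import re
from collections import Counter

# Request, status, size, then (optionally) the referrer and user-agent fields.
LINE_RE = re.compile(
    r'"(?:[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)

# Assumed substrings that suggest a spider; illustrative only.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

def classify_hits(lines, filename):
    """Tally hits on filename by (visitor kind, HTTP status code)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m or not m.group("path").endswith(filename):
            continue
        agent = (m.group("agent") or "").lower()
        kind = "spider" if any(k in agent for k in BOT_MARKERS) else "other"
        counts[(kind, m.group("status"))] += 1
    return counts

# Made-up sample lines; the file name is a hypothetical stand-in.
sample = [
    '1.1.1.1 - - [20/Jun/2010:10:00:00 -0700] "GET /Machinery.pdf HTTP/1.1" 200 529408 "-" "Googlebot/2.1"',
    '1.1.1.2 - - [20/Jun/2010:10:01:00 -0700] "GET /Machinery.pdf HTTP/1.1" 200 529408 "-" "Mozilla/5.0"',
    '1.1.1.3 - - [20/Jun/2010:10:02:00 -0700] "GET /Machinery.pdf HTTP/1.1" 206 65536 "-" "Mozilla/5.0"',
    '1.1.1.4 - - [20/Jun/2010:10:03:00 -0700] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]
breakdown = classify_hits(sample, "Machinery.pdf")
```

Status 206 is HTTP's "Partial Content" response, sent when a client asks for only part of a file. If the two programs treat spider hits or partial responses differently, a breakdown along these lines should show which group makes up the difference.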
P.S. (6/27/10) Update on downloads:
According to my ISP's software, Machinery has had 4357 hits and a total download of 142264K, for an average of 330K per hit. The file shows on the web site as 517K, which suggests that on average I'm getting about two downloads for every three hits. This may, as some have suggested, reflect the use of a download accelerator or something similar, with some people downloading the file in several pieces. The corresponding figures for Salamander are 255 hits, 142264K downloaded, for an average of 558K; the file is 776K, so a similar ratio.
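The arithmetic behind those ratios, as a small sketch. It uses the per-hit averages and file sizes quoted above, and it assumes, perhaps wrongly, that every byte counted belongs to a download that eventually finished; the function names are mine.

```python
def fraction_of_file_per_hit(avg_kb_per_hit, file_kb):
    """Average share of the file transferred per hit."""
    return avg_kb_per_hit / file_kb

def equivalent_full_downloads(hits, avg_kb_per_hit, file_kb):
    """Complete downloads implied if all transferred bytes went to finished copies."""
    return hits * avg_kb_per_hit / file_kb

# Machinery: 4357 hits averaging 330K each, file size 517K.
machinery_ratio = fraction_of_file_per_hit(330, 517)             # roughly 0.64
machinery_downloads = equivalent_full_downloads(4357, 330, 517)  # roughly 2780

# Salamander: 255 hits, 142264K in total, file size 776K.
salamander_avg = 142264 / 255                                    # roughly 558K per hit
salamander_ratio = fraction_of_file_per_hit(salamander_avg, 776) # roughly 0.72
```

On that assumption, the Machinery hits would correspond to a bit under 2800 complete downloads.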
For the moment I'm assuming those numbers are correct; I don't yet have an explanation of why my earlier analysis of the access log produced figures so much larger. Both the software I used and the software my ISP provides give statistics in terms of hits, not as an estimate of number of downloads, so I would think both would have been affected in the same way by anything that made the number of hits substantially larger than the number of downloads.
It looks as though downloads are continuing at a rate of hundreds, but not thousands, a day.
As before, suggested explanations of my data from those more familiar with the subject are invited.