Today was the day. Having concluded that a more comprehensive approach would be simpler, I generated a list of URLs with post IDs, saved it to a file, and used wget to retrieve the redirection information from the blog site:
wget -i numerical-urls.txt --server-response --spider --max-redirect=0
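For anyone retracing this, the URL list could be generated with a few lines of Python. This is only a sketch: the `?p=<id>` URL pattern, the `blog.example.com` host, and the ID range are assumptions for illustration, not the values I actually used.

```python
def build_urls(base_url, first_id, last_id):
    """Build one URL per post ID, assuming a hypothetical '?p=<id>' pattern."""
    return [f"{base_url}?p={post_id}" for post_id in range(first_id, last_id + 1)]


def write_url_file(path, urls):
    """Write one URL per line, the format that wget -i expects."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(urls) + "\n")
```

Calling `write_url_file("numerical-urls.txt", build_urls("http://blog.example.com/", 1, 5000))` would produce a file suitable for the command above; the upper bound of 5000 is a placeholder.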
wget seems to have trouble sending the server responses to the output file, even with the --save-headers option. My guess is that if wget doesn't see a file as part of the response, it doesn't see fit to write only the server headers to a file. As a result, I had to rely on the scrollback buffer of the terminal window I was running this in, which thankfully on OS X can be set to infinite, because I had almost 200,000 lines of output to copy into a text editor and winnow down to what I wanted.
I wasn’t in the mood to futz with Python today, even though it might have led me more directly to a usable file, but TextWrangler managed to do just fine and by the end of it I had a perfectly submittable CSV file that Disqus handled without a problem.
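Had I gone the Python route, the winnowing might have looked something like the sketch below. It assumes the wget output has been saved as text, that each request starts with a `--<timestamp>--  <url>` line, and that response headers are echoed on indented lines; it also assumes a two-column CSV of old URL and new URL is what's wanted. The function names are my own invention.

```python
import csv
import io
import re

# Assumed log shape: "--YYYY-MM-DD HH:MM:SS--  <url>" opens each request,
# and server response headers appear on indented lines beneath it.
REQUEST_RE = re.compile(r"^--\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}--\s+(\S+)")
LOCATION_RE = re.compile(r"^\s+Location:\s*(\S+)", re.IGNORECASE)


def extract_redirects(log_lines):
    """Return (requested_url, redirect_target) pairs found in wget log lines."""
    pairs = []
    current_url = None
    for line in log_lines:
        match = REQUEST_RE.match(line)
        if match:
            current_url = match.group(1)
            continue
        match = LOCATION_RE.match(line)
        if match and current_url is not None:
            pairs.append((current_url, match.group(1)))
            current_url = None  # keep only the first Location per request
    return pairs


def to_csv(pairs):
    """Render the pairs as two-column CSV text: old URL, new URL."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(pairs)
    return buf.getvalue()
```

Requests that return no Location header (a 404, say) simply drop out of the result. As far as I know, wget writes this log to standard error, so adding `-o wget.log` to the command should capture it without the scrollback detour, though I didn't try that here.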