Printing a Specific Line From a Large File in Linux

Posted in Programming on December 4th, 2009 by Carl Zulauf

I recently had to find a specific line in a large (28GB) file equipped with nothing more than the line number. I thought it would take me just a few seconds to find a cool *nix utility to accomplish this task. Instead, it took me a bit of scouring to find something that works, and works well on large files. That’s OK though since I had to wait for the 28GB file to uncompress from a tarball… which, obviously, takes a while.

What I learned about while I waited was the *nix command ‘sed’. This is a tool built for command line processing of data files. Apparently it was birthed as an evolution of our trusty friend ‘grep’. The forum post I found which hinted that ‘sed’ was my solution didn’t provide much real information, and the Wikipedia article was mostly background, with examples that didn’t help me.

Where I found the most useful info was the sed page on SourceForge… go figure. The docs page pointed me to ‘The sed one-liners’ by Eric Pement. Here I found, through example, the power of ‘sed’ and an example that is more efficient on large files than the ones I found elsewhere.

So here is how you do it:

sed '34005050q;d' filename

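‘34005050’ is the line number. ‘q’ tells sed to quit as soon as it reaches that line, printing it on the way out, and ‘;d’ deletes every line before it so that nothing else gets printed. Because sed quits at the target line, it never reads the rest of the file, which is what makes this efficient on large files. ‘filename’ is of course the file you are trying to coax a specific line out of. To print an inclusive range of lines all at once (lines 8 through 12, for example), do this: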

sed '8,12!d' filename
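
One thing to keep in mind: the range form above prints the right lines, but sed keeps reading to the end of the file afterwards, which matters on a 28GB file. Here is a minimal sketch of both lookups with an early quit added to the range version. The function names `print_line` and `print_range` and the generated sample file are made up for illustration; only `sed`, `seq`, and `mktemp` are real commands.

```shell
#!/bin/sh
# print_line N FILE: print line N of FILE and stop reading right there.
print_line() {
  sed "${1}q;d" "$2"
}

# print_range FIRST LAST FILE: print lines FIRST..LAST, then quit at LAST
# so the rest of the file is never read.
print_range() {
  sed -n "${1},${2}p;${2}q" "$3"
}

# Build a small 20-line sample file to try the functions on.
sample=$(mktemp)
seq 1 20 | sed 's/^/line /' > "$sample"

print_line 5 "$sample"       # prints: line 5
print_range 8 12 "$sample"   # prints: line 8 through line 12
```

With ‘-n’, sed prints nothing unless told to, so ‘p’ prints the lines in the range and ‘q’ on the last line of the range ends the run without printing anything extra.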

I’m still learning about ‘sed’ but it’s already saving my ass. Have fun.

ExaNotes – An Overdue Introduction

Posted in Programming, Technology on June 17th, 2009 by Carl Zulauf

Several months ago I started working on ExaNotes as a lightweight personal tool to write, access, and search notes from any of the many computers I may use throughout the day. I told a good friend of mine about the simple web app I was building and she said it sounded like a great tool for keeping a journal. I asked her if she could help test the app by keeping a journal and she hesitated. As good friends as we were, the idea of me having access to her journal was not a comfortable one. The convenience of being able to access her journal from any computer was appealing, but she didn’t want to trade control of her privacy for convenience. Her desire for privacy gave me the idea of developing a system so secure that even the developer or administrator of the tool could not access the content that users have stored within it.

I decided to use this as a chance to learn much more about building secure web applications and I spent a few hours diagramming the concept. Then, I spoke with my friend again. I explained the design to her and she agreed the design would keep her journal secure enough that she would feel comfortable testing it and using it.

Greasemonkey Script: ArsTechnica Cleaner

Posted in Programming on December 4th, 2008 by Carl Zulauf

This is another fairly simple Greasemonkey script I’ve made to make viewing ArsTechnica much easier. This script hides, removes, and reorganizes superfluous content, widens the main content pane a little, and removes all Flash. This is extremely helpful when you are reading many articles at a time and don’t want the Flash animations on every article slowing down your computer. Scrolling smoothness, tab switching, and general responsiveness of your browser will be improved, especially on older/slower computers. Also, I find the slightly wider content pane to be easier to read.

Like last time, this script is also available through userscripts.org.

Twitter Refresher Script for Greasemonkey/Others

Posted in Programming on September 11th, 2008 by Carl Zulauf

I wrote my first Greasemonkey script today, and it happens to be an automatic Twitter Refresher. The script will refresh Twitter every n seconds (10 seconds by default), unless you have typed anything into the input field, in which case it will not refresh.

I haven’t tested this in Greasemetal (Google Chrome) or GreaseKit (Safari) yet, but it uses very simple and standard JavaScript DOM calls to work, so I’m pretty sure it will work fine in other userscript environments.

Pretty simple, but it’s something I really wanted so I figured I would share in case anyone else needs a tool like this. Also available through userscripts.org.

Let me know if you can think of any other sites this would be useful for.

PHP: Is Output Buffering Significantly Slower Than Strings?

Posted in Programming on August 8th, 2008 by Carl Zulauf

I love PHP. I use it constantly for rapid prototyping and small scripts that help “glue” things together for me. However, one annoyance I have with PHP is related to building large strings. This isn’t particularly difficult in PHP… in fact it’s easier in PHP than in most languages. The problem is that sometimes my entire page output will need to be contained in a string for various reasons (headers still need to be sent, the output needs to be passed to a filter or other function, etc.). Creating multi-line strings in PHP using single or double quotes looks awful, and the heredoc syntax prevents proper indentation. All of these methods also require tons of concatenation if any logic is necessary. Sometimes I have forgone these methods and instead used Output Buffering to prevent output from being sent, then simply grabbed the content of the output buffer and stored it in a string. This has several advantages, including ease of coding and the freedom of using the same code whether I need the output stored in a string or sent to the client immediately. However, its main disadvantage is that it is much slower… right? Well, I decided to find out exactly how much slower Output Buffering really is. Read on for my results.