Not too long ago, I needed to figure out how many total lines I had in a series of files. Normally, this would not be difficult: I can use the Measure-Object PowerShell cmdlet with the -Line parameter and get a line count. The tricky part here, though, was twofold: first, I had approximately 9400 files to parse; and second, all of the files were gzipped. This meant that I would need to decompress each file before counting its lines.

Fortunately, I found a great function on the TechNet gallery: ConvertFrom-Gzip. This function is written well enough that it can take its input from the pipeline, decompress it in memory, and pass the contents along the pipeline, all without touching disk. That meant I never needed to write an unzipped file out to disk, which saved me a lot of time. Here’s my PowerShell one-liner to solve the problem:

Get-ChildItem "C:\temp\CallLog\" -Recurse -Filter "*.gz" | Get-Item | ConvertFrom-Gzip | Measure-Object -Line

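The thing I like most about this function is that it really fits the spirit of PowerShell: I can use it in a pipeline without needing to add extra cruft. Note that if you want to use the function in your own code, it’s not signed, so depending on your execution policy you may need to unblock the script or adjust the policy before PowerShell will load it.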
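If you can't use the gallery function, the same in-memory idea can be sketched with the .NET GZipStream class. To be clear, this is not the ConvertFrom-Gzip source; Measure-GzipLine is a hypothetical name I'm making up here, and the sketch assumes the files are plain text and that you pass full paths (Get-ChildItem supplies them through the FullName property).

```powershell
# Hypothetical helper (not the TechNet ConvertFrom-Gzip function):
# counts the lines in a gzipped text file by streaming it through
# GZipStream, so nothing is ever written to disk.
function Measure-GzipLine {
    param(
        [Parameter(Mandatory, ValueFromPipeline, ValueFromPipelineByPropertyName)]
        [Alias('FullName')]
        [string]$Path
    )
    process {
        $reader = $null
        $file = [System.IO.File]::OpenRead($Path)
        try {
            # Wrap the compressed file stream in a decompressing stream.
            $gzip = New-Object System.IO.Compression.GZipStream(
                $file, [System.IO.Compression.CompressionMode]::Decompress)
            $reader = New-Object System.IO.StreamReader($gzip)

            # Read one line at a time; ReadLine returns $null at end of stream.
            $count = 0
            while ($null -ne $reader.ReadLine()) { $count++ }

            # Emit the per-file line count to the pipeline.
            $count
        }
        finally {
            # Disposing the reader also closes the gzip stream beneath it.
            if ($reader) { $reader.Dispose() }
            $file.Dispose()
        }
    }
}
```

With that in place, something like `Get-ChildItem "C:\temp\CallLog\" -Recurse -Filter "*.gz" | Measure-GzipLine | Measure-Object -Sum` should give a grand total across all files. One caveat: ReadLine counts a final line with no trailing newline as a line, so the total may not match Measure-Object -Line exactly in every edge case.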
