Reading And Writing CSVs

Our fearless Penguatroll, in his spare time, runs an Out of the Park fantasy baseball league, in which I occasionally remember to participate.  As part of this league, we’ve put together a draft site using OOTP Online Utilities.  Said Penguatroll realized that not all players were showing up as expected, and so it was my task to figure out what was going on.

After fixing a few separate issues, I discovered that the CSV has a flag called draft_eligible, which was set to 0 (false) for certain players even though it should have been set to 1 (true).

Throw Excel At The Problem 

My first stab at the problem involved Excel, which failed miserably.  The reason it failed is that OOTP uses somewhat-strange rules for text qualifiers:  it uses quotation marks even if there are no commas, but only does this for a few fields like first/last/nickname, date of birth, etc.  Excel “helpfully” decides that these quotation marks are text qualifiers and strips them out, never to be seen again.  The problem, however, is that I need these quotation marks, so that leaves scripting something.

Research Time

My first stab at this was F#.  I took advantage of the F# CSV type provider and whipped up the following:

open System
open FSharp.Data
type Players = CsvProvider<"C:\\Temp\\players.csv", Quote = '~', HasHeaders = true>

let main argv = 
    let players = Players.Load("C:\\Temp\\players.csv")
    let playerSample = players.Rows
                        |> Seq.take 100

    for player in playerSample do
        printfn "Name:  %s %s" (player.First_name) (player.Last_name)
    let s = Console.ReadLine()
    0 // return an integer exit code

What this let me do was see the problem, but from there, I was limited in options—mostly due to my lack of F# knowledge. I needed to modify a specific attribute based on a few others (if the player’s age is at least 16, if the league is not one of the professional leagues, etc.), and the type provider’s output is read-only. If this were a 10-attribute type, I would have gone forward, but we’re talking about dozens and dozens of attributes.

With the type provider, I needed to use tilde (~) as the qualifier (Quote) to preserve my quotation marks.  I did not have any tildes in my file, so this was a safe qualifier.

Powershell To The Rescue

Hitting a wall in F# (and giving myself a new problem to work through once I have time to learn more about this), I decided to drop back to Powershell.  I decided to use the Import-Csv cmdlet:

$csv = Import-Csv "C:\Temp\players.csv"
$csvtest = $csv[0]

Once I imported the data set, I grabbed the first line and checked out its attributes. And lo, the quotation marks were gone. This frustrated me greatly, as when I started doing Google searches, they all ended up being cases where people wanted to remove quotation marks rather than preserving them.

What I finally had to do was a bit cheap: I replaced all of the quotation marks with tildes (noting that there were none in my data set) and then imported the CSV, ran my script, and exported the results to CSV:

$csv | %{ if($_.retired -eq 0 -and $_.league_id -NotIn 0,105,108,109,110,135 -and $_.age -gt 15) { $_.draft_eligible = 1 } }
$csv | Export-Csv "C:\Temp\players_new.csv"

From there, I swapped those tildes back out to quotation marks and all was well.  You can automate the character swaps, but because this was a one-time thing, I decided just to modify the file externally.


One thought on “Reading And Writing CSVs

  1. Also — if you are a fan of baseball and would like to join, feel free to contact me! 🙂 (Seriously, though, +1 million points to Kevin for sorting out this boondoggle.)

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s