I have fair reason to not like Access very much right now. I’ve been piddling away at an Access database, learning everything on the fly. Somehow, I’ve been able to mimic a major process in roughly two weeks, and I’m going to try to spend some time next week cleaning up a few of the learning curve decisions. The big fix I have to make involves lookups. Our process, as I have stated before, takes a bunch of fields in and then does lookups and translations on many of them. The lookups and translations are done off of Oracle tables, which allows them to do thousands of translations in a few seconds. I, however, had no clue how to access an Oracle table from Access, so what I did was export all of this data into an Excel spreadsheet, import the sheet into the Access database, and try to do a select query.
Why “try”? Because Access doesn’t want you doing select queries in VBA code! Here, for anybody who might be looking for some code to replicate a search process (very slowly) in Access, is a mock-up of what I am doing:
Function ProcessToRun()
Dim flatfile As New ADODB.Recordset
Dim account As New ADODB.Recordset
Dim transfile As New ADODB.Recordset
Dim foundFlag as Boolean
foundFlag = False
flatfile.ActiveConnection = CurrentProject.Connection
flatfile.Open “dcas”, , adOpenKeyset, adLockOptimistic, adCmdTable
accountTrans.ActiveConnection = CurrentProject.Connection
accountTrans.Open “AccountTranslation”, , adOpenKeyset, adLockOptimistic, adCmdTable
transfile.ActiveConnection = CurrentProject.Connection
transfile.Open “Transformed23″, , adOpenKeyset, adLockOptimistic, adCmdTable
While Not flatfile.EOF
While (Not accountTrans.EOF) And foundFlag = False
If ((accountTrans.Fields.Item(”SRC_SYSTEM_CD”) = Trim(flatfile.Fields.Item(”intname”))) And (accountTrans.Fields.Item(”TRNSLTN_ACCT_INACTVE”) = “A”) And (accountTrans.Fields.Item (”SRC_ACCT_CD”) = Trim(flatfile.Fields.Item(”acctno”)))) Then
GNRL_LDGR_ACCT_NUM = accountTrans.Fields.Item(”ACCT_CD”)
OLD_GNRL_LDGR_ACCT_NUM = flatfile.Fields.Item (”acctno”)
foundFlag = True
End If
‘Move along to the next row in the reference data
accountTrans.MoveNext
Wend
If (foundFlag = False) Then
accountTrans.MoveFirst
While (Not accountTrans.EOF) And foundFlag = False
If ((accountTrans.Fields.Item(”SRC_SYSTEM_CD”) = SRC_SYSTEM_CD) And (accountTrans.Fields.Item(”TRNSLTN_ACCT_INACTVE”) = “A”) And ( accountTrans.Fields.Item(”SRC_ACCT_CD”) = Trim(flatfile.Fields.Item(”acctno”)))) Then
GNRL_LDGR_ACCT_NUM = accountTrans.Fields.Item(”ACCT_CD”)
OLD_GNRL_LDGR_ACCT_NUM = flatfile.Fields.Item(”acctno”)
foundFlag = True
End If
‘Move along to the next row in the reference data
accountTrans.MoveNext
Wend
End If
If (foundFlag = False) Then
GNRL_LDGR_ACCT_NUM = “X” & flatfile.Fields.Item(”acctno”)
OLD_GNRL_LDGR_ACCT_NUM = flatfile.Fields.Item(”acctno”)
End If
‘Reset foundFlag for the next loop
foundFlag = False
transFile.AddNew
transfile.Fields.Item(”GNRL_LDGR_ACCT_NUM”) = GNRL_LDGR_ACCT_NUM
transcreated.Fields.Item(”OLD_GNRL_LDGR_ACCT_NUM”) = OLD_GNRL_LDGR_ACCT_NUM
accountTrans.MoveFirst
transFile.MoveNext
flatfile.MoveNext
Wend
flatfile.Close
transFile.Close
accountTrans.Close
Let me explain this a little bit. “dcas” is the name of a table of data I imported from a flat file by using the TransferText function in a macro, “Transformed23″ is the table I am writing to, and “AccountTranslation” is the name of the Excel link I am using. AccountTranslation has 4 major parts to it:
- A Source System Code (SRC_SYSTEM_CD) which comes in.
- Whether or not an account is active (”A”)
- The source account number, which comes in as acctno on the flat file
- The translated account number, ACCT_CD, which exists only on the AccountTranslation table
So what my loop does is it goes through the table and tries to find any combinations of active records with matching source system and account number codes, and then returns the “real” account. Once it finds this account, foundFlag is set to true and the process ends. If we get all the way through the table without finding a match–certainly possible–then we walk into that second if statement. That second if statement does the same thing, except the source system code is different. So if that one fails, too, then we walk into the third if statement, which just sets our account number to X followed by the old number. In any event, foundFlag is set back to False so that it may be used again, and then we add a new record and insert the account numbers (old and new), move the pointer for row numbers (accountTrans goes back to the beginning and transFile and flatfile each move one down), and start over. At the end, we are very polite and close the files. Only Dan would be so much of a jerk as to leave these files open. Tony would probably just be lazy and leave files around like his dirty laundry: all over the floor.
To explain what the problem with this idea is, I would have to explain exactly why relational databases exist. And why do relational databases exist? Because for relational data, they’re a _lot_ faster than going sequentially! So here I am, using a relational database to putter along, doing sequential lookups. Let us just say that this has major speed-related issues. It works, so the rest of the test team is using it, but my #2 priority right now is to replace all of these sequential lookups with Oracle-based lookups.
Because I had some difficulty getting Oracle to work originally, any n00bs in the audience might want a piece of sample code which successfully connects to an Oracle database and pulls data from a table. The following code will do just that:
‘Set up the Oracle connection
Dim jgenOracleConn As ADODB.Connection
Dim jgenOracleRec As ADODB.Recordset
Dim num_recs_oracle
Dim sqlString_oracle As String
sqlString_oracle = “select * from myTable order by field1 asc, field2 desc”
Set jgenOracleConn = New ADODB.Connection
jgenOracleConn.Open “DSN=dsn_name;UID=username;PWD=password;database=db_name”
Set jgenOracleRec = jgenOracleConn.Execute(sqlString_oracle, num_recs_oracle)
While Not jgenOracleRec.EOF
outputfile.AddNew
‘At this point, just pluck out the fields you need
outputfile.Fields.Item(”field1″) = jgenOracleRec.Fields.Item(”field1″)
jgenOracleRec.MoveNext
outputfile.MoveNext
Wend
jgenOracleConn.Close
Now, you can modify this little snippet of code without too much difficulty and do an Oracle lookup on the account, like before. There would be two statements necessary: selecting the account code with the original source system code, and selecting the account code with the secondary source system code. The underlying code itself doesn’t change much and really doesn’t get simpler, but there are two advantages: first, changes in reference data are guaranteed to be fresh (whereas with Excel, you can make it so that the spreadsheets refresh themselves every time you open the file, but this doesn’t guarantee the freshest reference data); and second, it is tremendously faster in comparison. I know that Oracle and Teradata work very similarly from Access VBA’s perspective, so if you use either of those database systems, do it the second way instead of the first. As soon as I get priority #1 finished, I’m going to change all of the old lookups to the significantly faster ones. That is because I am super awesome.
[...] Yesterday, I talked about my plans to enspeedinate the flat file transformation process, in The Need For Speed. Today, I want to give a quick example of exactly how much faster SQL lookups are than sequential searches, and I shall do so with the help of one of my flat file interfaces. I have a file which has 397 rows and is 148K. That’s not a big file by any means, so it shouldn’t really tax any legitimate system. The process itself does some changes to the data, but the killer is that I have to look up three values and translate them. Well, this interface was by far the worst, and the reason for that is that I dumped all of the reference data I would need into one Excel spreadsheet. This spreadsheet is over 25,000 rows and roughly 80 columns, and that is a _lot_ of data to process. So what I do is, for each row in the file, I have to go through all 25,000 rows to pick up each of the translations. Thus, for my 397 rows, I am processing just shy of 10,000,000 rows of reference data, of which I would only need upwards of 1191 (assuming that each translation value occurs on a different line). That is, shall we say, quite inefficient… “,0] ); D([”ce”]); //–> [...]
Pingback by The Need For Speed: Empirics « 36 Chambers - The Legendary Journeys — September 15, 2006 @ 7:05 pm
[...] I got busted on a silly mistake today. I had spent Friday and Monday enspeedinating my processes, as I had described here and here. So today, when the folks go and try to test it out, they get invalid login/password errors because I didn’t actually give them the right login and password in the code. Oops. One mass replace statement later, all is good, but there was one more issue: I double-declared one variable in each of the mappings. Oops-oops. This was slightly more difficult than a mass replace, but only slightly (as I said). Those combined took about 1 minute and 45 seconds to fix, but really, it’s kind of embarrassing to get caught by things like that. I want my bugs big and ludicrous. I want the ones which melt monitors and cause sparks to fly out of the fluorescent lights, not these piddling “if you even dry-ran the thing you’d catch it” bugs. [...]
Pingback by Preview Before You Do! « 36 Chambers - The Legendary Journeys — September 19, 2006 @ 8:22 pm