Friday, May 5, 2006

MS Access: VBA Regular Expressions (RegEx)

As a Perl programmer (hey, I founded Barcelona Perl Mongers), I always am looking for Perlish features in other languages. I know VBA (Visual Basic for Applications) is not a beautiful language, but I run into so very often that I actually use it more than Perl now. And I've found that it has plenty of Perlish features. Two of the most useful for me have been Associative Arrays (i.e. hashes) in the form of the Dictionary object, and Regular Expressions. Recently a friend asked me how to extract ZIP codes from a field in a Microsoft Access table in a SELECT statement. For example, she wanted to say:


select some_function(ZIP_expression)
  from some_table;
on a table with values like "Philadelphia, PA 19107-1234" or "Lumberton, TX 77657" and have it return just the ZIP. I couldn't make an unwieldy version of this using just INSTR, MID, etc. Maybe there is a way. However, there is an elegant solution using regular expressions. You have to include a new Access module, and put this code in it. Prior to putting the code in, you have to include a reference to "Microsoft VBScript Regular Expressions 5.5" (Tools->References):

Function zipfinder(t As String)
    Dim re As New RegExp
    ' Look for five digits.  Optionally, look for a dash and four digits
    re.Pattern = "\b(\d{5}(-\d{4})?)\b"
    ' This version finds the first match.
    ' If you want the last match, set Global to false.
    re.Global = False
    Dim m
    For Each m In re.Execute(t)
        zipfinder = m.Value
    Next
End Function
Then you can use this as a function in a SELECT statement:

select zipfinder( ZIP_expression)
  from some_table;

12 comments:

  1. Third or fourth google link. Thanks for the info :)

    ReplyDelete
  2. This is exciting...
    I tried this several times but I got error. At last I noticed I have to add "Microsoft VBScript Regular Expressions 5.5" reference.

    Thanks...

    ReplyDelete
  3. Hi,
    if you want to get all matches you have to change the code as follows:

    Dim matches As MatchCollection
    'do the pattern stuff

    'this initializes a collection with all matches
    Set matches = re.Execute(t)
    For Each Match In matches
    'do whatever
    Next

    ReplyDelete
  4. It does not work for me . I am using Microsoft Jet OLE DB 4.0.

    SQL = "select zipfinder(poolid) from " & txtFile & " group by poolid order by poolid asc"

    ConnectionString1 = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & textFileDirectoryPath & ";" & _
    "Extended Properties=Text;"
    Set recordset = New ADODB.recordset
    Call recordset.Open(SQL, ConnectionString1, CursorTypeEnum.adOpenForwardOnly, LockTypeEnum.adLockReadOnly, CommandTypeEnum.adCmdText)

    Do Until recordset.EOF
    recordset.MoveNext
    //some processing
    Loop



    recordset.Close





    Function zipfinder(t As String)





    Dim re As New RegExp

    re.Pattern = "b(d{5}(-d{4})?)b"

    re.Global = False

    Dim m

    For Each m In re.Execute(t)

    zipfinder = m.Value

    NextEnd



    End Function

    ReplyDelete
    Replies
    1. Akshay, what error do you get? Did you add the reference to "Microsoft VBScript Regular Expressions 5.5" or its equivalent?

      Delete
    2. I have some experience creating some reasonably complex VBS scripts, as well as a few VBS scripts in excel etc, I also have some experience with Access and also some with regular expressions, although I am not expert.
      But when I try to reference a function name in Access, it give me an error message that says "undefined function "name" in expression"
      I have created the code, declared it as a public function, put it in a module and ticked the box against "Microsoft VBScript Regular Expressions 5.5".....what am I missing?

      Delete
    3. Scooble, can you send me your code and the text of the error message (a screen print of the error message would be great). I'm timothychenallen@gmail.com.

      Delete
    4. I found this page;
      http://www.access-programmers.co.uk/forums/showthread.php?t=214679

      which contained a download. (I changed the name of the function from RegExp to MyRegExp)

      I looked at the SQL and the code in the module and it all kind of makes sense.
      The SQL statement makes a call to a function by submitting to it as a string the Regular expression whilst also specifying which table to use and fields to show.

      This all makes sense, however, when I copy the table, the module, and the query direct from the downloaded file, it fails to work in a new blank database.

      When I try to run or submit the SQL;
      SELECT tblPackageReferenceValuesConcat.ConcatRefNum, MyRegExp([ConcatRefNum],"^(PN|P)[0-9][0-9][A-Z][0-9]{3,4}[A-Z]{5,6}$") AS Exp2
      FROM tblPackageReferenceValuesConcat;

      I get the error message;
      'Ambigigous name in query MyRegExp([ConcatRefNum],"^(PN|P)[0-9][0-9][A-Z][0-9]{3,4}[A-Z]{5,6}$")

      Its odd, it works in the downloaded file, but not in the blank one I created with everything copied across

      Delete
    5. The error message is telling you that Access can't tell what table or query ​[ConcatRefNum] comes from. You can fix this by qualifying the column name (i.e. pre-pending the table name on the column name). Also, you should *not* qualify the function name; the function does not belong to the table :

      SELECT tblPackageReferenceValuesConcat.ConcatRefNum, MyRegExp(tblPackageReferenceValuesConcat.[ConcatRefNum],"^(PN|P)[0-9][0-9][A-Z][0-9]{3,4}[A-Z]{5,6}$") AS Exp2
      FROM tblPackageReferenceValuesConcat;

      - or even neater:

      SELECT tp.ConcatRefNum, MyRegExp(tp.[ConcatRefNum],"^(PN|P)[0-9][0-9][A-Z][0-9]{3,4}[A-Z]{5,6}$") AS Exp2
      FROM tblPackageReferenceValuesConcat as tp

      ​Try that and let me know how it turned out. Thanks!​

      Delete
  5. Yes I added a reference. I tried it on Oracle DB too and it fails. It throws an error saying " ZIPFINDER invalid identifier".


    Please let me know where I am going wrong

    ReplyDelete
  6. To add more to it , I am calling the code from a macro in Excel .

    ReplyDelete

I moderate comments blog posts over 14 days old. This keeps a lot of spam away. I generally am all right about moderating. Thanks for understanding.