Sunday, July 13, 2008

TStringList has poor parsing capabilities

By default, TStringList cannot handle spaces in its DelimitedText property. The workaround is to make sure that any strings containing spaces are enclosed in the character defined by the TStringList QuoteChar property. So what happens when you use the nifty LoadFromFile procedure on a large delimited text file with many spaces but no QuoteChars?

The answer is you get a messed up TStringList, because if you have no QuoteChars, TStringList assumes that spaces are delimiters too. Isn't that fun!
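To make the problem concrete, here is a minimal console sketch of the behaviour described above. The sample line and the program name are made up for illustration, and it assumes StrictDelimiter is left at its default (off), or is absent altogether in older Delphi versions.

```delphi
program DelimitedTextDemo;
{$APPTYPE CONSOLE}

uses
  Classes;  // may need to be System.Classes in newer Delphi versions

var
  Fields: TStringList;
  I: Integer;
begin
  Fields := TStringList.Create;
  try
    Fields.Delimiter := ',';
    Fields.QuoteChar := '"';
    // Hypothetical input: two comma-separated fields, no quote characters.
    Fields.DelimitedText := 'John Smith,42 Main Street';
    // With the default settings, the spaces act as delimiters as well,
    // so this prints more than the two fields you probably wanted.
    for I := 0 to Fields.Count - 1 do
      Writeln(I, ': ', Fields[I]);
  finally
    Fields.Free;
  end;
end.
```

With the default parsing, the list should come back split at every space as well as at the comma.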

The parsing capabilities of TStringList are half-assed at best. In addition to the spaces/QuoteChars fiasco, your delimiter can only be a single character.

And don't worry, I've already programmed around TStringList's shortcomings. I had to load the file and parse it myself, adding quotes where necessary, and then pass the result to TStringList.DelimitedText.
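A rough sketch of that kind of workaround might look like the following. This is not the original code: QuoteFields is a hypothetical helper, the sample line is made up, and it assumes the raw input contains no quote characters that are meant as syntax.

```delphi
program QuoteFieldsDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: wraps every Delim-separated field of Line in Quote
// characters so that DelimitedText no longer splits on the spaces inside.
function QuoteFields(const Line: string; Delim, Quote: Char): string;
var
  I, Start: Integer;
begin
  Result := '';
  Start := 1;
  for I := 1 to Length(Line) + 1 do
    // A field ends at a delimiter or just past the end of the line.
    if (I > Length(Line)) or (Line[I] = Delim) then
    begin
      if Start > 1 then
        Result := Result + Delim;
      // AnsiQuotedStr also doubles any quote characters inside the field.
      Result := Result + AnsiQuotedStr(Copy(Line, Start, I - Start), Quote);
      Start := I + 1;
    end;
end;

var
  Fields: TStringList;
  I: Integer;
begin
  Fields := TStringList.Create;
  try
    Fields.Delimiter := ',';
    Fields.QuoteChar := '"';
    Fields.DelimitedText := QuoteFields('John Smith,42 Main Street', ',', '"');
    for I := 0 to Fields.Count - 1 do
      Writeln(I, ': ', Fields[I]);
  finally
    Fields.Free;
  end;
end.
```

Once each field is quoted, the spaces inside it should survive the trip through DelimitedText. For a whole file, you would run each line through the helper before assigning it.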

5 comments:

  1. Maybe ExtractStrings is of help here?

  2. Ahh, that looks like it =). For my app, I ended up parsing it manually by using strings and treating them as arrays so I could enumerate by index. I wonder if arrays are as fast as pointers in Delphi? I know conceptually they're the same thing, but you never know what a language/platform is doing behind the scenes.

  3. You can easily see what's happening "behind the scenes" using the CPU View (View -> Debug Windows). You can then see each line of code and the corresponding assembly code.
    Accessing an array is very fast: once the address of the array is loaded, it's just an indexed memory access, e.g.

    MOV EAX, [EBX][ECX]

    or something similar.

  4. Set the StrictDelimiter property to True.

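For anyone landing here later, the two suggestions from the comments (StrictDelimiter and ExtractStrings) can be tried side by side with something like the sketch below. The sample line is made up, and StrictDelimiter is assumed to be available, which means Delphi 2006 or later.

```delphi
program StrictDelimiterDemo;
{$APPTYPE CONSOLE}

uses
  Classes;  // may need to be System.Classes in newer Delphi versions

const
  SampleLine = 'John Smith,42 Main Street';  // hypothetical sample data

var
  Fields: TStringList;
begin
  Fields := TStringList.Create;
  try
    // Suggestion from comment 4: only the Delimiter character splits fields.
    Fields.Delimiter := ',';
    Fields.StrictDelimiter := True;
    Fields.DelimitedText := SampleLine;
    Writeln('StrictDelimiter: ', Fields.Count, ' fields');

    // Suggestion from comment 1: split on ',' only, with no characters
    // treated as leading whitespace to skip.
    Fields.Clear;
    ExtractStrings([','], [], PChar(SampleLine), Fields);
    Writeln('ExtractStrings: ', Fields.Count, ' fields');
  finally
    Fields.Free;
  end;
end.
```

Both approaches should report two fields for this input. One difference worth checking against your own data: as far as I know, ExtractStrings does not add empty strings to the list, so blank fields disappear and column positions can shift.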