Flex start conditions

I finally (after a bit of searching and thinking) figured out how to properly match comments with flex/bison. Hooray :)

Some tutorials use this in flex:

#.*$
But since flex chooses longest-match-first for the next token, any line with a # in it might have the remainder of the line accidentally cut as a comment. The right way to do this appears to be:
/* in your .lex file */

%x LEX_COMMENT

%%

"#" { BEGIN(LEX_COMMENT); }
<LEX_COMMENT>[^\n]*  /* ignore comments. */
<LEX_COMMENT>\n   { yylineno++; BEGIN(INITIAL); } /* end comment */
There's a mini state machine going on here. When the lexer matches a "#" it moves into the LEX_COMMENT state (the name is my choice; it could be anything). Because the state is declared with %x, it is exclusive: only rules prefixed with <LEX_COMMENT> are active until the newline rule switches back to INITIAL. Now my config files ignore comments properly: a "#" starts a comment only when it isn't already part of another token (like a quoted string).
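To make the state machine idea concrete outside of flex, here is a rough hand-rolled analogue in JavaScript. It is only a sketch: stripComments is a hypothetical helper, not something flex generates, and it assumes that double-quoted strings are the only token that may legitimately contain a "#".

```javascript
// Hypothetical helper: strips '#' comments from config text using two states,
// loosely mirroring INITIAL and the exclusive LEX_COMMENT start condition.
function stripComments(text) {
  var state = "INITIAL";
  var out = "";
  var i = 0;
  while (i < text.length) {
    var ch = text[i];
    if (state === "LEX_COMMENT") {
      // Only the comment rules are active here: drop everything up to the
      // newline, then keep the newline and return to INITIAL.
      if (ch === "\n") { out += ch; state = "INITIAL"; }
      i++;
    } else if (ch === '"') {
      // A quoted string is consumed as one token, so a '#' inside it
      // never gets the chance to start a comment.
      var end = text.indexOf('"', i + 1);
      if (end === -1) { end = text.length - 1; } // unterminated: take the rest
      out += text.slice(i, end + 1);
      i = end + 1;
    } else if (ch === "#") {
      state = "LEX_COMMENT";
      i++;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

var cleaned = stripComments('name = "a#b" # real comment\nport = 80\n');
// cleaned is 'name = "a#b" \nport = 80\n'
```

The "#" inside the quoted string survives, while the trailing comment is dropped up to (but not including) the newline.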

Details here

Query parsing in JavaScript

For pimp, I want to be able to search a specific column, say, artist, without needing multiple fields for searching. The ability to specify more advanced searches than simple keywords is quite useful. How do we leverage this on the client and turn a search query into a set of key-value pairs?

I must confess I was hesitant to put this kind of logic into JavaScript instead of Python. Furthermore, it makes me feel a little uneasy using /foo/ in anything other than Perl. Nonetheless, doing this in JavaScript was simple and it's still fast (as it should be).

The queries I want to parse come in the following (hopefully intuitive) formats:

  • foo
  • artist:Eminem
  • album:"Across a Wire"
  • artist:"Counting Crows" album:august
The following code does this for me. The parse_query function will return a dictionary of query terms. Values are lists.

Here's an example:

Query
rain baltimore artist:"Counting Crows" album:august
Results of parse_query
    { "artist": ["Counting Crows"],
      "album": ["august"],
      "any": ["rain", "baltimore"]
    }
  
I take the dictionary returned and pass it to jQuery's $.post function to execute an AJAX (I hate that term, it's such a misnomer these days) request. Here's the code:
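// Matches either a bare "quoted phrase" or a run of word characters
// that may include a key:"quoted value" pair.
var query_re = /(?:"([^"]+)")|(?:\b((?:[a-z:A-Z_-]|(?:"([^"]+)"))+))/gi;

function parse_query(string) {
  var dict = {}, m, parts, key, val;
  query_re.lastIndex = 0;  // the /g flag keeps match state between calls
  while ((m = query_re.exec(string)) !== null) {
    // split(":", 2) keeps at most two pieces; anything after a
    // second ":" is dropped.
    parts = (m[1] || m[2]).split(":", 2);
    if (parts[1]) { key = parts[0]; val = parts[1]; }
    else { key = "any"; val = parts[0]; }

    val = val.replace(/"/g, "");

    dict[key] = dict[key] || [];
    dict[key].push(val);  // push() is JavaScript's list append
  }

  return dict;
}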
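
jQuery's $.post accepts a plain object as its data argument and serializes it itself, so the dictionary can be passed straight through. If you'd rather control the wire format, the sketch below shows one way to flatten the lists into repeated keys. The encodeQuery name is hypothetical, and whether the server expects repeated keys is an assumption about your backend.

```javascript
// Hypothetical helper: turn { key: [values] } into a form-encoded string,
// repeating the key once per value.
function encodeQuery(dict) {
  var params = new URLSearchParams();
  Object.keys(dict).forEach(function (key) {
    dict[key].forEach(function (value) {
      params.append(key, value);
    });
  });
  return params.toString();
}

var body = encodeQuery({
  artist: ["Counting Crows"],
  album: ["august"],
  any: ["rain", "baltimore"]
});
// body is "artist=Counting+Crows&album=august&any=rain&any=baltimore"
```

The resulting string could then be handed to $.post as the data argument in place of the object.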