PCRE, and how to not write an API.

From the pcreapi(3) manpage:
The first two-thirds of the vector is used to pass back captured substrings,
each substring using a pair of integers. The remaining third of the vector is
used as workspace by pcre_exec() while matching capturing subpatterns, and is
not available for passing back information. The length passed in ovecsize
should always be a multiple of three. If it is not, it is rounded down.
The 'vector' in question is used by pcre to store offset information for captured groups. It's a good and simple way to figure out where each capture starts and ends.

What doesn't make sense is the portion I put in bold: the remaining third of the vector, reserved as workspace. Why wouldn't pcre_exec simply allocate that scratch space itself? In the meantime, I'm left wondering why I am allocating parts of an array I am told are unusable. I hope there's a good reason. Perhaps some efficiency I'm not aware of is gained from doing it this way.
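To make the layout concrete, here is a minimal sketch in C of the arithmetic the manpage describes. It does not call pcre at all: the helper names (usable_pairs, workspace_ints, copy_capture) are made up for illustration, and the offsets would normally be filled in by pcre_exec() rather than by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of pcre): number of (start, end) pairs
 * a caller-supplied vector of ovecsize ints can pass back. Integer
 * division covers both the "rounded down to a multiple of three" rule
 * and the "first two-thirds, two ints per pair" rule. */
int usable_pairs(int ovecsize)
{
    assert(ovecsize >= 0);
    return ovecsize / 3;
}

/* Hypothetical helper: number of ints at the tail of the vector that
 * the manpage says are reserved as workspace. */
int workspace_ints(int ovecsize)
{
    assert(ovecsize >= 0);
    return ovecsize / 3;
}

/* Hypothetical helper: copy capture n out of subject using the offset
 * pair at ovector[2*n] and ovector[2*n + 1]. Pair 0 is the whole match.
 * Returns the capture length, or -1 if the group is unset (negative
 * offsets) or out does not have room for it. */
int copy_capture(const char *subject, const int *ovector, int n,
                 char *out, size_t outsize)
{
    int start = ovector[2 * n];
    int end = ovector[2 * n + 1];
    if (start < 0 || end < start)
        return -1;
    size_t len = (size_t)(end - start);
    if (len + 1 > outsize)
        return -1;
    memcpy(out, subject + start, len);
    out[len] = '\0';
    return (int)len;
}
```

So a vector of 30 ints gives you 10 usable pairs and 10 ints you allocate but never get to read. A vector of 32 ints gives you exactly the same thing, with two ints ignored entirely.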


2 responses to 'PCRE, and how to not write an API.'


Justin Mason wrote at Wed Jun 4 03:00:19 2008...
Possibly, it's to allow you to perform the memory allocation upfront, so as to avoid the overhead of the internal implementation calling malloc().  I've seen that before.

In this case though, I doubt it -- I would guess they're just recursing and reusing that vector as-is, to store the opaque "capturing subpatterns" data.  Definite bad code smell off that, if so.

Jordan Sissel wrote at Wed Jun 4 10:03:12 2008...
Yeah, I thought it must be to avoid additional malloc() calls, but even then it doesn't make total sense.

Is it really much more efficient to do it this way than, say, keeping your 'magical' vector inside the pcre* struct and realloc()'ing it whenever num_captures > length_of_magical_vector?
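A rough sketch of that alternative, in C. Everything here is hypothetical: the matcher struct and the matcher_reserve / matcher_free functions are invented for illustration and are not how pcre is actually implemented. The idea is only that the handle owns its workspace and grows it when a pattern needs more than is currently allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical matcher handle; NOT the real pcre struct. */
typedef struct {
    int *scratch;        /* internal workspace, owned by the handle */
    size_t scratch_len;  /* number of ints currently allocated */
} matcher;

/* Make sure the handle has workspace for num_captures ints.
 * Only calls realloc() when the existing buffer is too small, so
 * repeated matches with the same pattern do not allocate again.
 * Returns 0 on success, -1 if the allocation fails (the old buffer
 * is left untouched in that case). */
int matcher_reserve(matcher *m, size_t num_captures)
{
    assert(m != NULL);
    if (num_captures <= m->scratch_len)
        return 0;
    int *grown = realloc(m->scratch, num_captures * sizeof *grown);
    if (grown == NULL)
        return -1;
    m->scratch = grown;
    m->scratch_len = num_captures;
    return 0;
}

/* Release the workspace and reset the handle to its empty state. */
void matcher_free(matcher *m)
{
    assert(m != NULL);
    free(m->scratch);
    m->scratch = NULL;
    m->scratch_len = 0;
}
```

One trade-off of this sketch: a handle that mutates its own buffer during a match can no longer be shared between threads without locking, whereas a caller-supplied vector keeps all per-match state on the caller's side.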

It's probably not worth speculating more, since it just makes my brain hurt ;)

