[Opa] working around perf issue with appending data

Alok Menghrajani alok at fb.com
Fri Nov 18 13:30:27 EST 2011


Hi,

While some of this may be obvious to most people, I thought it was worth sharing.

I need to store data incrementally (by appending chunks). My first approach looked something like this (where [...] stands for other, uninteresting fields):

type simple = {
  [...]
  data : string;
}

db /simple : simple

append_first_chunk(first_chunk:string) = (
  id:int = Db.fresh_key(@/simple)
  /simple[id] <- {[...] data=first_chunk}
)

append_chunk(id:int, chunk:string) = (
  /simple[id] <- {[...] data=String.concat("", [/simple[id]/data, chunk])}
)

This worked, but got slower and slower as more data came in. I was testing with 100 chunks of 4k each, and things got noticeably slower after ~30 chunks.
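
A rough way to see why this degrades, as a toy cost model in Python rather than a measurement of Opa's database: assume (and this is only an assumption) that every append writes the whole, ever-growing data field back. The function name and the byte counts are illustrative.

```python
def bytes_written_whole_value(chunk_size: int, n_chunks: int) -> int:
    """Total bytes written if each append re-stores the full value.

    Assumption: the store rewrites the entire field on every write.
    """
    total = 0
    stored = 0
    for _ in range(n_chunks):
        stored += chunk_size  # the field grows by one chunk
        total += stored       # and the whole field is written again
    return total


chunk_size, n_chunks = 4096, 100
payload = chunk_size * n_chunks
rewritten = bytes_written_whole_value(chunk_size, n_chunks)
print(payload, rewritten, rewritten / payload)
```

Under that assumption the total work grows quadratically with the number of chunks: for 100 chunks of 4096 bytes, about 50 times the payload size ends up being written.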

I then tried the following:

type simple = {
  [...]
  data : list(string);
}

db /simple : simple

append_first_chunk(first_chunk:string) = (
  id:int = Db.fresh_key(@/simple)
  /simple[id] <- {[...] data=[first_chunk]}
)

append_chunk(id:int, chunk:string) = (
  /simple[id] <- {[...] data=[chunk | /simple[id]/data]}
)

Again, this performs as poorly as the first case: things get slower as more data comes in, presumably because each append still writes the whole data field back.

Finally, here is how to do it right. I ended up slightly changing append_chunk's signature, but that's not strictly required since intmaps are ordered.

type simple = {
  [...]
  data : intmap(string);
}

db /simple : simple

append_first_chunk(first_chunk:string) = (
  id:int = Db.fresh_key(@/simple)
  data:intmap(string) = Map.empty
  /simple[id] <- {[...] data=Map.add(0, first_chunk, data)}
)

append_chunk(id:int, chunk:string, n:int) = (
  /simple[id]/data[n] <- chunk
)
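
The same layout can be sketched outside Opa. This is a minimal in-memory Python stand-in, not the Opa database API: ChunkStore and read_all are hypothetical names, and the dict plays the role of /simple[id]/data. Each append touches only one key, and the full data is rebuilt by joining the chunks in key order.

```python
class ChunkStore:
    """Hypothetical stand-in for one /simple[id]/data map."""

    def __init__(self) -> None:
        self._chunks: dict[int, str] = {}

    def append_chunk(self, n: int, chunk: str) -> None:
        # Only the entry for key n is written; earlier chunks are untouched.
        self._chunks[n] = chunk

    def read_all(self) -> str:
        # Reassemble the data by visiting the keys in ascending order.
        return "".join(self._chunks[k] for k in sorted(self._chunks))


store = ChunkStore()
store.append_chunk(0, "first ")
store.append_chunk(2, "third")
store.append_chunk(1, "second ")
print(store.read_all())
```

Because the chunks are keyed by position, they can arrive in any order and still be read back in sequence.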

Alok