[Opa] working around perf issue with appending data
Alok Menghrajani
alok at fb.com
Fri Nov 18 13:30:27 EST 2011
Hi,
While some of this may be obvious to most people, I thought it was worth sharing.
I need to store data incrementally (by appending chunks). My first approach looked something like this (where [...] stands for other, uninteresting fields):
type simple = {
  [...]
  data : string;
}
db /simple : simple
append_first_chunk(first_chunk:string) = (
  id:int = Db.fresh_key(@/simple)
  /simple[id] <- {[...] data=first_chunk}
)
append_chunk(id:int, chunk:string) = (
  // Reads the stored string, appends the chunk, and writes the whole record back.
  /simple[id] <- {[...] data=String.concat("", [/simple[id]/data, chunk])}
)
This worked, but got slower and slower as more data came in. I was testing with 100 chunks of 4k each, and things got noticeably slower after ~30 chunks.
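One plausible way to read the slowdown: if every append reads the stored value and writes the whole thing back, the total bytes written grow with the square of the number of chunks. The sketch below is a back-of-the-envelope cost model in Python, not Opa. It assumes, without confirmation from the Opa database internals, that a write costs roughly in proportion to the size of the value written. The function names are made up for illustration.

```python
CHUNK_SIZE = 4 * 1024  # 4k chunks, as in the test described above
NUM_CHUNKS = 100


def bytes_written_rewrite_all(num_chunks: int, chunk_size: int) -> int:
    """Total bytes written when each append rewrites the whole accumulated value."""
    total = 0
    stored = 0
    for _ in range(num_chunks):
        stored += chunk_size  # the stored value grows by one chunk
        total += stored       # and the entire value is written again
    return total


def bytes_written_per_key(num_chunks: int, chunk_size: int) -> int:
    """Total bytes written when each append writes only the new chunk."""
    return num_chunks * chunk_size


if __name__ == "__main__":
    print(bytes_written_rewrite_all(NUM_CHUNKS, CHUNK_SIZE))  # 20684800
    print(bytes_written_per_key(NUM_CHUNKS, CHUNK_SIZE))      # 409600
```

Under that assumption, 100 appends of 4k each write about 20 MB in total when the whole value is rewritten, versus 400 KB when each chunk is written on its own.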
I then tried the following:
type simple = {
  [...]
  data : list(string);
}
db /simple : simple
append_first_chunk(first_chunk:string) = (
  id:int = Db.fresh_key(@/simple)
  /simple[id] <- {[...] data=[first_chunk]}
)
append_chunk(id:int, chunk:string) = (
  // Prepends the chunk to the stored list and writes the whole record back.
  /simple[id] <- {[...] data=[chunk | /simple[id]/data]}
)
Again, this has the same performance problem as the first case: things get slower as more data comes in.
Finally, here is how to do it right. I ended up slightly changing append_chunk's signature, but that's not strictly required since intmaps are ordered.
type simple = {
  [...]
  data : intmap(string);
}
db /simple : simple
append_first_chunk(first_chunk:string) = (
  id:int = Db.fresh_key(@/simple)
  data:intmap(string) = Map.empty
  /simple[id] <- {[...] data=Map.add(0, first_chunk, data)}
)
append_chunk(id:int, chunk:string, n:int) = (
  // Writes only the entry at key n instead of rewriting the whole record.
  /simple[id]/data[n] <- chunk
)
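Reading the data back then means concatenating the chunks in key order. As a rough illustration in Python (not Opa), with a plain dict standing in for intmap(string): because the keys are ordered integers, the next index can be derived from the map itself, which is one reading of why the extra n argument is not strictly required. The helpers next_index and assemble are hypothetical names, not part of any Opa API.

```python
from typing import Dict


def next_index(chunks: Dict[int, str]) -> int:
    """Next free key: one past the largest key, or 0 for an empty map."""
    return max(chunks) + 1 if chunks else 0


def append_chunk(chunks: Dict[int, str], chunk: str) -> None:
    """Store the new chunk under its own key; existing entries are untouched."""
    chunks[next_index(chunks)] = chunk


def assemble(chunks: Dict[int, str]) -> str:
    """Concatenate the chunks in increasing key order."""
    return "".join(chunks[k] for k in sorted(chunks))


if __name__ == "__main__":
    store: Dict[int, str] = {}
    for piece in ["abc", "def", "ghi"]:
        append_chunk(store, piece)
    print(assemble(store))  # abcdefghi
```

Passing n explicitly, as in the Opa version above, avoids having to look up the largest key on every append.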
Alok