FWIW, awk certainly defines 'record' as a single row. From gawk(1): "Normally, records are separated by newline characters. [but you can change the record separator]".
mafintosh
hi everyone!
wking
but the "records mode can be useful" points use words I don't understand ("facet", "fill down", …) :p
sorribas joined the channel
ogd
wking: facet is like a working set based on a filter, fill down is (i think) a column-wise operation that assigns some value/expression to every cell in a column
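A rough sketch of the two terms as ogd describes them, assuming rows are plain dictionaries. The column names and the `facet` and `fill_down` helpers are hypothetical illustrations, not part of dat or any other tool:

```python
# Hypothetical rows; the column names are made up for illustration.
rows = [
    {"route": "12", "stop": "Main St", "agency": ""},
    {"route": "12", "stop": "Oak Ave", "agency": ""},
    {"route": "40", "stop": "Main St", "agency": ""},
]

def facet(rows, predicate):
    """Return the working set: the rows that pass a filter."""
    return [row for row in rows if predicate(row)]

def fill_down(rows, column, expression):
    """Column-wise operation: assign expression(row) to `column` in every row."""
    for row in rows:
        row[column] = expression(row)
    return rows

# Facet on route 12, then fill the "agency" column for that working set only.
route_12 = facet(rows, lambda row: row["route"] == "12")
fill_down(route_12, "agency", lambda row: "metro")
```

Because the facet holds references to the same row objects, filling down on the working set leaves rows outside the facet untouched.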
It looks like the CSV is just the current tip, and not historical revisions. Although maybe there are no revisions in the bus data ;).
ogd
wking: correct on both counts
mafintosh: one thing we cant do right now is say 'go to row 5382842'
mafintosh: cause we dont actually know what key that is
mafintosh: but i dont really care about that query anyway
mafintosh: it just means the ui will have to be based on startkey/endkey/limit instead of row offset
which means i'll probably get rid of the "1 - 10" and make it instead show the current range query options or something
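A minimal sketch of paging by startkey/endkey/limit rather than by row offset, assuming the keys are available in sorted order. The `range_query` and `next_page` helpers and the key format are hypothetical and only mimic the shape of such a query, not dat's actual API:

```python
import bisect

def range_query(sorted_keys, startkey=None, endkey=None, limit=10):
    """Return up to `limit` keys k with startkey <= k <= endkey."""
    lo = 0 if startkey is None else bisect.bisect_left(sorted_keys, startkey)
    hi = len(sorted_keys) if endkey is None else bisect.bisect_right(sorted_keys, endkey)
    return sorted_keys[lo:hi][:limit]

def next_page(sorted_keys, last_key, limit=10):
    """Page forward by starting just after the last key already shown."""
    lo = bisect.bisect_right(sorted_keys, last_key)
    return sorted_keys[lo:lo + limit]

# Hypothetical keys; real keys would be opaque ids, not sequential numbers.
keys = sorted(f"key-{i:03d}" for i in range(25))
page1 = range_query(keys, limit=10)
page2 = next_page(keys, page1[-1], limit=10)
```

The UI only needs to remember the last key it displayed, so it never has to know which key sits at a given row offset.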
wking
there's a hashy (hash-ish? :p) bit in the _id and another in _rev, but they aren't the same. It looks like the _rev hash is an MD5, but what's the _id hash?
mafintosh
we could add forever scrolling as well :)
ogd
wking: we arent using those anymore (that dat is old) but _id was a uuid and rev was a md5 of the contents of the row
wking
ah
ogd
wking: now _id is key and _rev is version and id is a http://npmjs.org/cuid and version is an integer
dybskiy_ joined the channel
wking
so no hashing in _rev anymore? How do you detect and reject parallel edits to the same row?
ogd
wking: we decided to punt on the parallel edits use case for the alpha and add it in later, as part of an overall reduction of complexity
wking
ah, ok
ogd
so dat alpha is more like SVN than Git re: merging
wking
except SVN detects (I think) when you try to push a parallel edit ;).
dybskiy_ has quit
dybskiy_ joined the channel
but punting while the rest of the structure is hashed out seems reasonable. Adding metadata back in to handle/notice parallel edits later should be easy.
ogd
yea im trying to make a small core that is easy to extend w/ experimental modules
mafintosh: i was gonna geek out and go full virtual-dom w/ the ui but i stuck with mustache templates instead :)
we don't need to be bleeding edge in everything we do :)
ogd
yea hehe
wking
I'm trying to work through the differences you'll get from avoiding Git's “all pointers are hashes” approach. Per-row revisions let you roll back individual rows, but I'm not sure how you'd snapshot/roll-back/etc. the table as a whole. Is that possible somehow?
Maybe that's getting off-topic here and I should put it in #121 instead ;).
ogd
wking: i was thinking our 'commits' would just be a list "key x went from version y to z"
that way if you wanna get data by commit hash you could just stream all the rows as they were at that point in time
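One way to picture that idea, as a sketch only: each commit is a list of "key x went from version y to z" entries, and replaying the commits up to a chosen point gives the version of every row at that time. The `history` layout, the use of a list index in place of a commit hash, and the `rows_at` helper are all assumptions made for illustration:

```python
# history[key] maps integer version -> row contents (hypothetical layout).
history = {
    "a": {1: {"name": "alpha"}, 2: {"name": "alpha v2"}},
    "b": {1: {"name": "beta"}},
}

# Each commit lists (key, old_version, new_version); None means "did not exist yet".
commits = [
    [("a", None, 1), ("b", None, 1)],
    [("a", 1, 2)],
]

def rows_at(commit_index):
    """Yield (key, row) pairs as they stood after commits[0..commit_index]."""
    versions = {}
    for commit in commits[: commit_index + 1]:
        for key, _old, new in commit:
            versions[key] = new
    for key in sorted(versions):
        yield key, history[key][versions[key]]
```

Rolling back to an earlier commit then amounts to streaming `rows_at` for that commit, since old row versions are never overwritten.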
wking
ah, and you have timestamps in the keys, so that should work.
ogd
one theoretical use case for a commit is if you wanted to run some sort of functional transform on the db
it'd almost feel like a db transaction
but the end result being a timeline of operations that happened w/ commit messages so you can roll back to a certain one
wking
Even with a daemon, commits are still going to require traversing the whole database checking for key timestamps though. Perhaps a btrfs-style table that tracks changes at update time would help, instead of going back through and looking at commit time.
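A sketch of the update-time tracking wking suggests, assuming integer row versions: every write records the key in a pending set, so a commit only reads that set and never scans the whole table. The `Table` class and its methods are hypothetical and are not how dat or btrfs is implemented:

```python
class Table:
    def __init__(self):
        self.rows = {}      # key -> (version, value)
        self.pending = {}   # key -> version before the first uncommitted write
        self.commits = []   # list of {"message": ..., "changes": [...]}

    def put(self, key, value):
        """Write a row and remember, at update time, that the key changed."""
        old_version = self.rows.get(key, (0, None))[0]
        self.pending.setdefault(key, old_version)
        self.rows[key] = (old_version + 1, value)

    def commit(self, message):
        """Record (key, old_version, new_version) for pending keys only."""
        changes = [
            (key, old, self.rows[key][0])
            for key, old in sorted(self.pending.items())
        ]
        self.commits.append({"message": message, "changes": changes})
        self.pending = {}
        return changes
```

The cost of a commit is then proportional to the number of rows changed since the last commit rather than to the size of the database.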
ogd
that might not be a problem w/ leveldb, especially on SSDs, one of the nice properties is that sequential reads and random reads are nearly the same speed
so as long as we cached what keys we need to get
it should be pretty fast
wking
And not worry about databases that exceed your local memory/SSD capacity?
ogd
disk is cheap right :P
wking
Spinning disks are cheap ;).
SSDs are probably still cheaper than programmers ;). But some sort of shardable storage would be nice ;)
ogd
yes totally, im excited for the point when we can tackle the distributed use cases
todrobbins
what are some other dat repo URIs?
wking
I'll let you get back to work then, so that point is closer ;). Thanks for chatting :)