#dat

      • dat-git-bot
        [dat] wking opened pull request #124: cli.js: Document clone arguments (optional directory) (master...clone-docs) http://git.io/OS-jZQ
      • todrobbins
        this might not be the best place to ask this, but is there a decent triplestore (RDF) in node?
      • and in relation to dat, I suppose you could just store ttl or json-ld in dat
      • dat-git-bot
        [dat] maxogden closed pull request #124: cli.js: Document clone arguments (optional directory) (master...clone-docs) http://git.io/OS-jZQ
      • ogd
        todrobbins: yea levelgraph is pretty good
      • todrobbins: uses the same db as dat, so it should be easy to write a module using levelgraph on top of dat
      • (might require writing a custom replicator or something like that though)
      • todrobbins
        ogd: cool. I’ll check it out
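The "levelgraph on top of leveldb" idea mentioned here boils down to writing each triple under several key orderings, so that any query pattern becomes a prefix scan over the sorted keystore. A toy in-memory sketch of that trick (illustrative only; this is not levelgraph's actual API):

```javascript
// Toy triplestore over a sorted key-value map, illustrating the
// indexing trick levelgraph uses on top of leveldb: each triple is
// written under several orderings so any query pattern becomes a
// prefix scan. (Sketch only -- not levelgraph's API.)
const store = new Map();

function putTriple(s, p, o) {
  // Index the triple under three orderings.
  store.set(['spo', s, p, o].join('::'), true);
  store.set(['pos', p, o, s].join('::'), true);
  store.set(['osp', o, s, p].join('::'), true);
}

function prefixScan(prefix) {
  // leveldb would do this as a cheap sequential read; here we sort keys.
  return [...store.keys()].sort().filter(k => k.startsWith(prefix));
}

putTriple('dat', 'writtenIn', 'javascript');
putTriple('dat', 'storedIn', 'leveldb');

// "what does 'dat' relate to?" -> scan the spo index by subject
const results = prefixScan('spo::dat::');
console.log(results);
// [ 'spo::dat::storedIn::leveldb', 'spo::dat::writtenIn::javascript' ]
```

The real levelgraph module maintains six orderings so every combination of bound/unbound subject, predicate, and object maps to some prefix.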
      • ogd
        revived a project I havent touched in 2 years! https://github.com/maxogden/dat-editor (new name)
      • todrobbins
        ogd++
      • ogd
        ooh interesting difference between a 'row' and a 'record' http://googlerefine.blogspot.com/2012/03/differ...
      • wking
        FWIW, awk certainly defines 'record' as a single row. From gawk(1): "Normally, records are separated by newline characters. [but you can change the record separator]".
      • mafintosh
        hi everyone!
      • wking
        but the "records mode can be useful" points use words I don't understand ("facet", "fill down", …) :p
      • ogd
        wking: facet is like a working set based on a filter, fill down is (i think) a column-wise operation that assigns some value/expression to every cell in a column
      • wking
        ah. So SQL's SELECT and UPDATE?
      • ogd
        yea kinda
      • mafintosh: hah check out dat-editor now
      • my crappy design skills at work
      • mafintosh
        ogd: what was the url to the bus dat repo?
      • i need some data :)
      • ogd
      • mafintosh
        oh - we need to update that so i can clone it :)
      • ogd
        yea
      • it has like 4 million rows atm
      • you could export a csv and import it
      • mafintosh
        i need to downgrade dat to even clone since it doesnt support binary replication :)
      • ogd
      • might take a while
      • mafintosh
        jan wants me to do a talk at jsconf
      • ogd: w000t super nice!
      • ogd: how does the ui scale on big data sets?
      • argh, pagination
      • wking
        It looks like the CSV is just the current tip, and not historical revisions. Although maybe there are no revisions in the bus data ;).
      • ogd
        wking: correct on both counts
      • mafintosh: one thing we cant do right now is say 'go to row 5382842'
      • mafintosh: cause we dont actually know what key that is
      • mafintosh: but i dont really care about that query anyway
      • mafintosh: it just means the ui will have to be based on startkey/endkey/limit instead of row offset
      • which means i'll probably get rid of the "1 - 10" and make it instead show the current range query options or something
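The startkey/endkey/limit paging ogd describes can be sketched over a sorted key list. leveldb provides the range read natively; this toy version emulates it in memory:

```javascript
// Cursor-style paging over sorted keys (the startkey/limit pattern
// described above), instead of "go to row N" offsets.
// Toy in-memory stand-in for a leveldb range read.
const keys = ['a1', 'a2', 'b1', 'b2', 'c1', 'c2', 'c3'];

function range(startkey, limit) {
  // leveldb streams keys >= startkey in order; we emulate with a filter.
  return keys.filter(k => k >= startkey).slice(0, limit);
}

let page = range('', 3);          // ['a1', 'a2', 'b1']
// The next page starts just past the last key we saw.
page = range(page[page.length - 1] + '\x00', 3); // ['b2', 'c1', 'c2']
```

The UI never needs to know absolute row offsets; each page only carries the boundary key of the previous one, which is why a "go to row 5382842" button doesn't fit this model.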
      • wking
        there's a hashy (hash-ish? :p) bit in the _id and another in _rev, but they aren't the same. It looks like the _rev hash is an MD5, but what's the _id hash?
      • mafintosh
        we could add forever scrolling as well :)
      • ogd
        wking: we arent using those anymore (that dat is old) but _id was a uuid and rev was a md5 of the contents of the row
      • wking
        ah
      • ogd
        wking: now _id is key and _rev is version and id is a http://npmjs.org/cuid and version is an integer
      • wking
        so no hashing in _rev anymore? How do you detect and reject parallel edits to the same row?
      • ogd
        wking: we decided to punt on the parallel edits use case for the alpha and add it in later, as part of an overall reduction of complexity
      • wking
        ah, ok
      • ogd
        so dat alpha is more like SVN than Git re: merging
      • wking
        except SVN detects (I think) when you try to push a parallel edit ;).
      • but punting while the rest of the structure is hashed out seems reasonable. Adding metadata back in to handle/notice parallel edits later should be easy.
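The conflict check being punted here is the classic optimistic-concurrency pattern: a write carries the version it was based on and is rejected if the store has since moved on. A toy sketch of what that metadata buys back (not dat's actual code):

```javascript
// SVN-style rejection of parallel edits: a put must name the version
// it was based on, and fails if the stored version differs.
// (Sketch of the punted feature, not dat's implementation.)
const rows = new Map(); // key -> { version, data }

function put(key, data, expectedVersion) {
  const current = rows.get(key);
  const currentVersion = current ? current.version : 0;
  if (expectedVersion !== currentVersion) {
    throw new Error(
      `conflict on ${key}: expected version ${expectedVersion}, ` +
      `store has ${currentVersion}`);
  }
  rows.set(key, { version: currentVersion + 1, data });
  return currentVersion + 1;
}

put('row-1', { name: 'bus 14' }, 0);   // -> version 1
put('row-1', { name: 'bus 14x' }, 1);  // -> version 2

try {
  put('row-1', { name: 'stale edit' }, 1); // parallel edit, rejected
} catch (err) {
  console.log(err.message);
}
```

Since _rev is already an integer version, adding this check later only requires threading the expected version through the write path, which is why deferring it is cheap.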
      • ogd
        yea im trying to make a small core that is easy to extend w/ experimental modules
      • mafintosh: i was gonna geek out and go full virtual-dom w/ the ui but i stuck with mustache templates instead :)
      • mafintosh
        haha - sell out
      • ogd
        i started here https://github.com/maxogden/dat-editor/blob/mas... but that file isnt used atm
      • mafintosh
        yeah you showed me that code
      • we don't need to be bleeding edge in everything we do :)
      • ogd
        yea hehe
      • wking
        I'm trying to work through the differences you'll get from avoiding Git's “all pointers are hashes” approach. Per-row revisions let you roll back individual rows, but I'm not sure how you'd snapshot/roll-back/etc. the table as a whole. Is that possible somehow?
      • Maybe that's getting off-topic here and I should put it in #121 instead ;).
      • ogd
        wking: i was thinking our 'commits' would just be a list "key x went from version y to z"
      • that way if you wanna get data by commit hash you could just stream all the rows as they were at that point in time
      • wking
        ah, and you have timestamps in the keys, so that should work.
      • ogd
        one theoretical use case for a commit is if you wanted to run some sort of functional transform on the db
      • it'd almost feel like a db transaction
      • but the end result being a timeline of operations that happened w/ commit messages so you can roll back to a certain one
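Commits as lists of per-key version changes can be sketched like this (the data layout below is assumed for illustration, not dat's real format): replaying the change lists up to a commit recovers the table as of that point in time.

```javascript
// Commits as "key x went from version y to z" lists, with a checkout
// that replays the log to recover the table at a given commit.
const versions = {            // key -> version -> row contents
  a: { 1: 'a-v1', 2: 'a-v2' },
  b: { 1: 'b-v1' },
};
const commits = [
  { msg: 'import',    changes: [{ key: 'a', from: 0, to: 1 },
                                { key: 'b', from: 0, to: 1 }] },
  { msg: 'fix row a', changes: [{ key: 'a', from: 1, to: 2 }] },
];

function checkout(commitIndex) {
  const head = {}; // key -> version at that point in time
  for (const commit of commits.slice(0, commitIndex + 1)) {
    for (const c of commit.changes) head[c.key] = c.to;
  }
  // Resolve each key's version to its row contents.
  return Object.fromEntries(
    Object.entries(head).map(([k, v]) => [k, versions[k][v]]));
}

console.log(checkout(0)); // { a: 'a-v1', b: 'b-v1' }
console.log(checkout(1)); // { a: 'a-v2', b: 'b-v1' }
```

Rolling back to a commit is then just streaming every row at the version the replayed log says it had, which matches the "stream all the rows as they were at that point in time" description above.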
      • wking
        Even with a daemon, commits are still going to require traversing the whole database checking for key timestamps though. Perhaps have a btrfs-style table that tracks changes at update time, instead of going back through and looking at commit time.
      • ogd
        that might not be a problem w/ leveldb, especially on SSDs, one of the nice properties is that sequential reads and random reads are nearly the same speed
      • so as long as we cached what keys we need to get
      • it should be pretty fast
      • wking
        And not worry about databases that exceed your local memory/SSD capacity?
      • ogd
        disk is cheap right :P
      • wking
        Spinning disks are cheap ;).
      • SSDs are probably still cheaper than programmers ;). But some sort of shardable storage would be nice ;)
      • ogd
        yes totally, im excited for the point when we can tackle the distributed use cases
      • todrobbins
        what are some other dat repo URIs?
      • wking
        I'll let you get back to work then, so that point is closer ;). Thanks for chatting :)
      • todrobbins
        http://nextbus.dathub.org/_csv is giving me 404 on the command line
      • ogd
        todrobbins: we made lotsa breaking changes this week so most of them wont work
      • todrobbins
        ah