#dat

      • ogd
        mafintosh: that way when people google for individual file hashes dat web servers or trackers or something will come up
      • mafintosh: basically we should just make sure to try and expose that metadata, rather than just exposing the merkle dag hash
      • mafintosh
        ogd, karissa public bits could generate html pages for o/ that could be picked up by search engines
      • ogd
        mafintosh: yea good idea, i was also thinking though maybe dat needs a default web index thing that lets you browse + download files over http just so that dats become googleable
      • karissa
        ogd: yeah, this is part of 'what happens when I go to the http endpoint for a dat'
      • ogd: perhaps there are different frontends.. it'd be nice if there was just an http endpoint, like when you go to a filesystem
      • ogd
        i just think it would be cool if you could take any hash of any file, put it into google, and get a bunch of results. not many things expose that info right now, even if they do hash them (like git for example)
      • karissa: yea
      • karissa
        i don't know if i'd ever search for a file hash, but i would totally search for a filename. maybe by default there's an index and then there could be a /metadata.json endpoint
      • or idk, file hashes seem pretty unintuitive to me. i'd love to get away from them as soon as we can
      • kind of advanced usage
      • it'd be cool if we could get semver on publicbits
      • ogd
        karissa: oh yea im not saying this is important as a user facing concept, just that hashes are a Universal Truth Of Computers that we should expose to The Internet :D
      • karissa
        I feel like this should be a T-Shirt
      • ogd
        lol
      • karissa
        XD
      • mafintosh
        i just wanna point out that i learned yesterday that hashes are also post-quantum secure
      • ogd
        mafintosh: lol
      • mafintosh
        ogd: karissa after i'm done with this dht stuff i'm gonna impl storing of files as actual files
      • karissa
        mafintosh: dance dance dance
      • mafintosh
        then i'm fine doing a release
      • stwe has quit
      • stwe joined the channel
      • ogd
        mafintosh: cool
      • stwe_ joined the channel
      • stwe has quit
      • substack
        I've got so many spatial indexes that I want to implement
      • getting all kinds of crazy ideas for p2p stuff to build on top of different indexes now that I built a kdb index over a hyperlog
      • a 1d interval tree would make a good p2p calendar backend, an nd interval tree would make a good bounding box query engine
      • dwins joined the channel
      • mafintosh
        substack: whats an interval tree?
      • substack
        it lets you find intervals that overlap with other intervals
      • so you can compute the intersection of two polygons
      • or in 1d, you can find ranges that overlap
      • mafintosh
        substack: ah i see. nice
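A minimal sketch of the 1d case substack describes (finding ranges that overlap a query range). This is an assumption-laden illustration, not code from any of the modules discussed here: it uses a static balanced tree keyed on interval start, where each node also remembers the largest end in its subtree so whole subtrees can be skipped. The names `Node`, `build` and `overlapping` are made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # closed interval [start, end]


@dataclass
class Node:
    interval: Interval
    max_end: int                      # largest end anywhere in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(intervals: List[Interval]) -> Optional[Node]:
    """Build a balanced tree from intervals sorted by start."""
    def rec(items: List[Interval]) -> Optional[Node]:
        if not items:
            return None
        mid = len(items) // 2
        node = Node(items[mid], items[mid][1])
        node.left = rec(items[:mid])
        node.right = rec(items[mid + 1:])
        for child in (node.left, node.right):
            if child is not None:
                node.max_end = max(node.max_end, child.max_end)
        return node
    return rec(sorted(intervals))


def overlapping(root: Optional[Node], query: Interval) -> List[Interval]:
    """Return every stored interval that overlaps the query, ordered by start."""
    lo, hi = query
    out: List[Interval] = []

    def visit(n: Optional[Node]) -> None:
        if n is None or n.max_end < lo:
            return                    # nothing below here ends late enough
        visit(n.left)
        start, end = n.interval
        if start <= hi and end >= lo:
            out.append(n.interval)
        if start <= hi:
            visit(n.right)            # everything to the right starts at or after start

    visit(root)
    return out
```

For a calendar-style use, `overlapping(build(events), (12, 13))` would list every event whose time range touches the 12 to 13 window.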
      • substack
        and I think the trick from bkdtrees would work for interval trees too, but I haven't seen anything in the literature about that
      • the bkd trick is that you buffer writes until you have N records, you sort in memory to build a balanced tree, then you write out to disk
      • but there are lots of size N, 2N, 4N, 8N, 16N etc trees
      • and if slot N is full, you merge the memory buffer with slot N to fill slot 2N
      • but if 2N is full, you find some combination to merge into an empty slot
      • and there always exists such a combination because the pattern of power-of-2 combination overflows is a binary counter
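The buffering and merging scheme substack describes can be sketched roughly as follows. This is only an illustration under stated assumptions: sorted Python lists stand in for the balanced on-disk trees, records are plain integers, and the class name `LogStructuredIndex` and its methods are invented for the example. Flushing a full buffer behaves like adding 1 to a binary counter: each full slot is merged into the carry and emptied until an empty slot is found.

```python
from bisect import bisect_left
from heapq import merge
from typing import List, Optional


class LogStructuredIndex:
    def __init__(self, n: int) -> None:
        self.n = n                                   # in-memory buffer size N
        self.buffer: List[int] = []
        # slots[i] holds n * 2**i sorted records, or None when empty
        self.slots: List[Optional[List[int]]] = []

    def insert(self, record: int) -> None:
        self.buffer.append(record)
        if len(self.buffer) == self.n:
            self._flush()

    def _flush(self) -> None:
        carry = sorted(self.buffer)                  # sort in memory first
        self.buffer = []
        i = 0
        # Like a binary increment: merge through full slots, clearing them
        while i < len(self.slots) and self.slots[i] is not None:
            carry = list(merge(carry, self.slots[i]))
            self.slots[i] = None
            i += 1
        if i == len(self.slots):
            self.slots.append(None)
        self.slots[i] = carry                        # first empty slot takes the result

    def occupancy(self) -> str:
        """Full/empty slots as binary digits, largest slot first."""
        return "".join("1" if s is not None else "0" for s in reversed(self.slots))

    def contains(self, record: int) -> bool:
        if record in self.buffer:
            return True
        for slot in self.slots:
            if slot is not None:
                i = bisect_left(slot, record)
                if i < len(slot) and slot[i] == record:
                    return True
        return False
```

With `n = 2`, the occupancy string after each flush reads `1`, `10`, `11`, `100`, `101`, which is the binary counter pattern mentioned above.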
      • stwe_ has quit
      • dwins has quit
      • floppy joined the channel
      • floppy has left the channel
      • TheLink
        has anyone ever benchmarked http://papaparse.com/ against mafintosh's csv-parser?
      • mafintosh
        TheLink: wouldn't be surprised if that is faster than mine since it couples the file system
      • TheLink
        ah, ok
      • mafintosh
        TheLink: that means you can save a bunch of memory allocations if you're smart about it.
      • dwins joined the channel
      • stwe joined the channel
      • finnp
        feross: mafintosh Are you in hall g?
      • tbeseda joined the channel
      • ralphtheninja joined the channel
      • stwe has quit
      • floppy joined the channel
      • ralphtheninja has quit
      • floppy has left the channel
      • mafintosh
        finnp: we ended up in a crypto analysis on diffie-hellman
      • we are both pretty depressed now as there seems to be a lot of attack vectors with the current set of common parameters with DH
      • Ogd karissa watch the DH and discrete log talk when it comes out
      • tbeseda has quit
      • dwins joined the channel
      • finnp
        mafintosh: ugh that sucks
      • dat-gitter-bot has quit
      • dat-gitter-bot joined the channel
      • ogd
        karissa: been thinking about your 'move away from hashes' comment earlier, it got me thinking
      • karissa: i think maybe we could remove hashes altogether from the discovery step
      • karissa: and instead pick something user friendly like username/dataset-name or just dataset-name
      • karissa
        ogd: do we need to depend on publicbits for that?
      • ogd
        karissa: yea i think so....
      • karissa: *something* has to authenticate it
      • karissa
        yeah
      • ogd
        karissa: with hashes the hash itself authenticates it
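A small sketch of what "the hash itself authenticates it" means in practice: whoever sends you the bytes, you can recompute the hash and compare it with the one you asked for. SHA-256 and the function names are assumptions for illustration only; the log does not say which hash function dat uses.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex digest used as the content address (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_hash: str) -> bool:
    """True only if the received bytes hash to the address that was requested."""
    return content_hash(data) == expected_hash
```

No trusted third party is needed for this check, which is why a username/dataset-name scheme needs something else to authenticate the name.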
      • karissa
        so if we had a sort of reverse-hash lookup on publicbits would that suffice?
      • ogd
        karissa: if we allow anyone to say they have 'maxogden/genome' then you have to trust on first use (TOFU)
      • but if you can just trust publicbits.org then its easier
      • karissa: ya that would work
      • only mystery to me is what do we do if you arent on the internet
      • e.g. you wanna do a wifi sync but neither of you can talk to publicbits.org
      • someone on your LAN might say "i have maxogden/genome", normally you could verify that claim with publicbits.org but in this scenario maybe it prompts the user "cannot access publicbits.org, do you want to trust anyway?"
      • cc mafintosh
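The offline fallback ogd outlines could look roughly like this. Everything here is hypothetical: `check_claim`, `registry_lookup`, `ask_user` and the `pins` dict are invented for the sketch, and a `ConnectionError` from the lookup is used to stand for "cannot reach publicbits.org".

```python
from typing import Callable, Dict, Optional


def check_claim(
    name: str,
    claimed_hash: str,
    pins: Dict[str, str],
    registry_lookup: Callable[[str], Optional[str]],
    ask_user: Callable[[str], bool],
) -> bool:
    """Decide whether to accept a peer's claim that `name` is `claimed_hash`."""
    try:
        known = registry_lookup(name)      # normal case: ask the trusted server
    except ConnectionError:
        # Offline: fall back to an earlier decision, else trust on first use
        if name in pins:
            return pins[name] == claimed_hash
        prompt = "cannot access publicbits.org, do you want to trust anyway?"
        if ask_user(prompt):
            pins[name] = claimed_hash      # remember the first-use decision
            return True
        return False
    if known == claimed_hash:
        pins[name] = claimed_hash
        return True
    return False
```

Once a name is pinned, a later offline peer offering a different hash for the same name is rejected without prompting again.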
      • ralphtheninja joined the channel
      • mafintosh
        i wouldnt want to move discovery away from hashes just yet. too many unknown variables for me
      • you can always have a central trusted point that resolves username/repo to a hash
      • ogd
        mafintosh: ya agreed, this convo is probably too premature
      • mafintosh: what about reverse lookups, e.g. someone on my LAN claims to have 'maxogden/genome', they send me a hash, how can i verify that hash exists in the history of the trusted versions? i guess i'd need to ask a server to 'verify' a hash is in the history of maxogden/genome
      • mafintosh: or alternatively a crypto sig could be used
      • mafintosh
        ogd: yep some signature. this is a classic hard problem though
      • karissa
        i wouldn't worry about offline + username/repo. for now we could say offline would only work with hashes
      • vespakoen joined the channel
      • the username/repo seems useful for an open publishing flow
      • an org could host a publicbits instance behind a firewall and still have the same private publishing with this method
      • mafintosh
        ogd, karissa: for versioned feeds i would do this. have the swarm / data be identified by an ecc public key. use this key for peer discovery. if a user does dat maxogden/genome first contact public bits to resolve maxogden/genome to a hash/public key. cache the result of this resolution so it works on subsequent offline syncs
      • ogd
        mafintosh: i think for publicbits/dat our approach is to have a centralized trust server with decentralized file transfer and discovery
      • mafintosh
        ogd: as long as the central point of trust is a very thin layer on top i'm +1 on that
      • ogd
        mafintosh: yea it would just resolve "user/repo" and potentially "user/repo@tag" to hashes
      • mafintosh: so re: your proposal, if i want a specific version i would do 'dat maxogden/genome#somehash' and it could also ask publicbits if 'somehash' is valid?
      • mafintosh: but if you ever get the hash you can forever get the data from peers who have it since at that point its totally p2p
      • mafintosh
        ogd: yea.
      • ogd: and public bits can also resolve to a public key instead of a hash so i can sync future updates offline
      • ogd
        mafintosh: yea thats awesome
      • mafintosh
        ogd: for that one feed (use a different public/private key for every feed)
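A rough sketch of the thin resolution layer discussed above: resolve "user/repo" or "user/repo@tag" through a central lookup, cache the answer, and fall back to the cache when the lookup is unreachable. The names `parse_name` and `CachingResolver` are hypothetical, the `lookup` callable stands in for whatever publicbits would expose, and the cache is a plain in-memory dict where a real client would presumably persist it to disk.

```python
from typing import Callable, Dict, Optional, Tuple


def parse_name(spec: str) -> Tuple[str, str, Optional[str]]:
    """Split 'user/repo' or 'user/repo@tag' into (user, repo, tag)."""
    name, _, tag = spec.partition("@")
    user, sep, repo = name.partition("/")
    if not sep or not user or not repo:
        raise ValueError(f"expected user/repo or user/repo@tag, got {spec!r}")
    return user, repo, tag or None


class CachingResolver:
    def __init__(self, lookup: Callable[[str], str], cache: Dict[str, str]) -> None:
        self.lookup = lookup      # hypothetical call to the central trust server
        self.cache = cache        # name -> hash or public key from earlier lookups

    def resolve(self, spec: str) -> str:
        parse_name(spec)          # reject malformed names before any network call
        try:
            key = self.lookup(spec)
        except ConnectionError:
            if spec in self.cache:
                return self.cache[spec]   # offline: reuse the earlier answer
            raise
        self.cache[spec] = key
        return key
```

After one successful online resolution, later syncs can use the cached hash or public key for peer discovery without contacting the central server.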