(pingali) It is post-midnight EST, but if anyone is around: hi from Bangalore!
(pingali) Quick summary of what I do: I work on an open-source project called dgit (will change the name soon!) that is 'git for data'. It extends the git interface to handle data management tasks such as validation, materialization (of queries, etc.), schema change detection, etc.
(pingali) Not being much of a javascript person (I live in the python world), I don't use dat much myself. But I like the concept and the tool experience.
(pingali) Will try to be around this afternoon my time so that I can catch any of you.
mafintosh
@pingali hi!
dat-gitter-bot
(pingali) @mafintosh Hi!
(pingali) A bit early for EST
(pingali) should I be using the irc channel instead?
mafintosh
@pingali i'm on CEST. i live in copenhagen
dat-gitter-bot
(pingali) ah!
mafintosh
@pingali up to you
dat-gitter-bot
(pingali) ok. I am just an eavesdropper with a shared interest.
mafintosh
ironically it's not that early for CEST but i'm a late riser
(pingali) I intend to take a closer look at dat as well. If there is interest, we can discuss how the tools can interoperate
mafintosh
cool thanks. does your project use git underneath?
we've actually pivoted a bit from the use-case of sharing structured data. dat is primarily a file sharing tool now. we simply store data sets as files
dat-gitter-bot
(pingali) yes. It is a wrapper around git that focuses on the content of the repo.
(pingali) Understood. Your motivation, as I understand it, comes more from data journalism and citizen data analysts.
(pingali) I am coming at this from corporate data science.
mafintosh
it is still very science focused as well
but yea the tool is generic
dat-gitter-bot
(pingali) dgit does very little for sharing. So one thought I had was to build/extend/use dat's framework for sharing.
mafintosh
interesting
dat-gitter-bot
(pingali) yes, I am partly motivated by reproducible research
(pingali) (am a recovering academic)
(pingali) Who are the primary users of dat now?
(pingali) Are academics/data scientists open to the unconstrained sharing that dat provides?
mafintosh
we are actually adding some very simple access-control features now
where we can guarantee that only the person you share your dat link with will be able to download the content (no man in the middle). this should enable decent private sharing
dat-gitter-bot
(pingali) two questions: (1) persistence of the link - is it a function of the underlying machine/storage? One requirement that I have is that a year from now the link should still be valid (which I am now ensuring using S3/GitHub etc. URLs)
(pingali) (2) any notion of groups?
(pingali) one reason people seem to like dgit is that it does not require you to give credentials to any third party (e.g., github) and data does not leave the premises
(pingali) they like the control
(pingali) (3) One more. Can I withdraw a link? I think mistakes will happen while sharing. people may want to undo
mafintosh
@pingali the link is a hash of the content (similar to git) and is resolved p2p like bittorrent
you withdraw a link by simply ceasing to share it
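[Editor's sketch] The idea that a link is derived from a hash of the content can be illustrated with a few lines of Python. This is only a general illustration of content addressing: it assumes SHA-256 over raw bytes, and the helper name `content_link` and the sample data are hypothetical. It is not dat's actual link format or its p2p resolution.

```python
import hashlib


def content_link(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in for a content-derived link."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical sample data: two versions of a tiny CSV file.
original = b"id,value\n1,10\n2,20\n"
modified = b"id,value\n1,10\n2,21\n"

link_a = content_link(original)
link_b = content_link(original)
link_c = content_link(modified)

# Identical bytes always map to the same link.
print(link_a == link_b)
# Changing the bytes changes the link.
print(link_a == link_c)
```

Under this assumption the link names the content rather than a machine, so whether it still resolves later depends on some peer continuing to share those bytes.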
dat-gitter-bot
(pingali) Ok. Will dig in some more.
(pingali) any thoughts on groups?
mafintosh
hmm i'm not sure how that applies
because we don't have any centralized storage anywhere
dat-gitter-bot
(Blahah) there are perhaps a few group-like concepts built into hyperdrive/dat
(Blahah) anyone the link is shared with enters a swarm, which could be considered a group
(Blahah) also with live-feeds you could have multiple people able to sign the same feed right?
(Blahah) so a group responsible for pushing to a feed
(pingali) @Blahah Got it. I didn't fully appreciate the p2p-ness of the system. I will study it some more.
(pingali) One question on the use case. @mafintosh mentioned use in academia.
(pingali) How is dat being used there?
karissa
@pingali to reference data in a future-proof way in a paper
@pingali and secondly, to have better big data support as a replacement for git/dropbox in a researcher's workflow
dat-gitter-bot
(Blahah) @pingali @karissa personally (as a scientist) I'm using it for everything that involves moving or distributing data
(Blahah) for example, I use it to distribute teaching materials to everyone in a classroom
(Blahah) and to ship data from my storage machines to a cloud instance for analysis
(Blahah) I'm sure we haven't scratched the surface of what's possible though :)
ogd
karissa: mafintosh and i synced up earlier on hyperdrive/hypercore/meta.dat stuff, i have a good idea of the roadmap there now. but i think you're right that we should do a CLI release asap