I'll kick off in a few minutes, but if no-one is here I'll stick around
I have to be out in ~90 minutes though
pudo
I can't imagine the short trousers working out very well
anyway. would you still be around in a little bit?
pwalsh` has quit
trickvi
pudo: yes, I'll be around
pwalsh joined the channel
pwalsh
hi all
trickvi
and on that note, let's kick this thing off
hi pudo
no wait, that was supposed to be hi pwalsh
pwalsh
:)
trickvi
autocomplete win!
pudo
ok, sorry to drop out :)
trickvi
no problem
pwalsh: maybe you want to introduce yourself
the topic for today is to go through where we're at with the micro-service/monolithic code base
and have a discussion about it
so I can start and go through my thoughts on the micro-services
pwalsh
Hi. I'm Paul. I'm working with Open Knowledge at the moment on some projects (related in a broader sense to OS). I've also previously worked in a similar open budget project with the Public Knowledge Workshop in Israel.
astafish1 has quit
astafish joined the channel
trickvi
the biggest issue of micro-services is the integration of services imo and how to handle that
I stumbled upon a blog post by Eran Hammer recently
where he talks down micro-services, because at some point somebody will have to deal with the complexity
in our case, the person who deploys the stuff and manages everything
pwalsh
in what sense is it the biggest issue? Cognitively for the developer? Operations-wise for deployment?
trickvi
I'm quite willing to take that on
pwalsh
reading link...
trickvi
It's the biggest issue operations-wise
for the developer we're imo making things easier cognitively
we have to be sure things work together
so there are three things I've focussed on when thinking about how to do this
One benefit of the micro-services is that we can expose different components to different developers so that they can hook into the pipeline
terra joined the channel
This creates one problem, e.g. if we do validation early in the process we have to repeat it throughout the process
hi terra
so we need to always validate input (or be ready to just fail)
I've also focussed on simplicity of implementation over efficiency
so something we can get working quickly and worry about things like speeding things up internally later on
an example of that is to just go for http endpoints instead of using an internal message queue, which would be faster and have less overhead
The last thing I've had in mind is to use existing standards/protocols as much as possible, instead of inventing our own
any questions about these?
most of this could of course also be covered by monolithic code
I'm just rambling on about my thoughts, feel free to discuss and raise your points
I'll just continue :)
lay all my cards on the table
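A minimal sketch of the "always validate input (or be ready to just fail)" point above, assuming each service re-checks what it receives instead of trusting an earlier stage. The field names and the aggregate function are hypothetical and only illustrate the idea:

```python
class ValidationError(ValueError):
    """Raised when a record does not have the shape a service expects."""


# Hypothetical required fields; the real set would come from the data spec.
REQUIRED_FIELDS = ("amount", "date")


def validate(record):
    """Return the record unchanged if it looks usable, otherwise fail loudly."""
    if not isinstance(record, dict):
        raise ValidationError("record must be a mapping")
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValidationError("missing fields: " + ", ".join(missing))
    return record


def aggregate(records):
    """Hypothetical downstream service entry point.

    It validates every record again, even if an upstream service already
    did, because other developers may hook in at this point directly.
    """
    return sum(float(validate(record)["amount"]) for record in records)
```

The cost is repeated checks along the pipeline, which fits the stated preference for simplicity of implementation over efficiency.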
pwalsh
I guess one question is, how would the developer-user, who wants to set up OS locally, experience this? Presumably some repo that pulls the parts together and bootstraps it all? I realize that is not directly related to *why* you want to go for micro-services, but it is a tangible result from the change
trickvi
pwalsh: ah yes let me get to that next
pwalsh
ok
trickvi
I've thought about general things
like we should decide on a preferred programming language/framework. We've mostly done Python and pudo has migrated things to Flask, so I say we stick to that (I've also had a look at Eve, the REST API framework built on Flask, but not very closely)
note I say preferred because we shouldn't reject things just because they aren't implemented the way we're used to
then we should have central documentation of all services to show the big picture
and then we come to pwalsh's question
I have been thinking about how to do local development of a single service without having to boot a lot of stuff up
the conclusion I came to was basically to include mocks of all services a particular service integrates with
we will do that no matter what, because of tests
we would also have a staging machine running for integration tests
but allow devs to hook into that
so if you don't want to use the mocks, you can go for the staging setup
so I'm thinking something like a dev mode or something which would switch over to mocks (probably implemented as a "production mode switch" that defaults to false so dev mode with mocks is the default)
pwalsh: does that answer your question?
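A minimal sketch of the "production mode switch" described above, assuming the switch is read from an environment variable and defaults to false so that mocks are used unless production is requested. The variable names OS_PRODUCTION and OS_VALIDATION_URL, the /validate path and both client classes are hypothetical:

```python
import json
import os
import urllib.request


class MockValidationService:
    """Stand-in used in dev mode and in tests; never touches the network."""

    def validate(self, payload):
        return {"valid": True, "source": "mock"}


class HttpValidationService:
    """Talks to a real service over HTTP (hypothetical /validate endpoint)."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def validate(self, payload):
        request = urllib.request.Request(
            self.base_url + "/validate",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))


def production_mode(environ=None):
    """The switch defaults to false, so dev mode with mocks is the default."""
    environ = os.environ if environ is None else environ
    return environ.get("OS_PRODUCTION", "false").lower() in ("1", "true", "yes")


def get_validation_service(environ=None):
    """Pick the mock or the HTTP client depending on the mode switch."""
    environ = os.environ if environ is None else environ
    if production_mode(environ):
        return HttpValidationService(environ["OS_VALIDATION_URL"])
    return MockValidationService()
```

Under these assumptions, pointing OS_VALIDATION_URL at the staging machine would cover the case of a developer who prefers the staging setup over the mocks.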
pwalsh
yes
trickvi
and then the other question... does that make sense or is that just stupid?
pwalsh
specifically the local dev thing? the micro-services sounds good to me, the local dev setup might need some refining
obviously work on different services will be easier
trickvi
how would you go about it?
pwalsh
just the experience of working on OS as a whole locally probably needs more thought I guess.
trickvi
(the local dev setup)
we could of course just use vagrant (which we already use to set up dev stuff)
which would set all of the different services up for development
pwalsh
not sure... yes, I had more in mind something like a vagrant build that just runs everything you need
terra has quit
but you are right that there will be a mocking interface anyway because of tests etc
trickvi
alright, we should then look into vagrant install for all services, we kind of also need that for the staging machine (simple / automated deploy)
vagrant or something else
continue?
so what I've been thinking about in terms of integration is HTTP endpoints for all services
we might support something like ZeroMQ in the future (to speed up internal processing)
but I think we should at least to begin with stick to http
it means overhead, but it is pretty fast for our use case
I'm very fond of REST APIs so I think we should design it as such, but that's just a matter of preference
and then have it properly documented for humans (each service and interface)
REST should be able to take care of that but nobody can do perfect REST (we can aspire to)
pwalsh
and so there is an event dispatcher? how would the actual communication between services work? what would the flow be?
trickvi
pwalsh: yes, that's a good point... so there is always going to be some trigger
and that trigger can either be a human who does something like upload some stuff
or a process that finishes with some output that another service could pick up
in the former case (human trigger) we only have to think about the website frontend (openspending.org)
then we can just call the right microservice directly
in the latter we would need something like a queue, but I'd still want anybody to be able to hook in, so a queue that does not acknowledge messages (leave them in the queue)
so I'm thinking something like pubsubhubbub
this allows us to fetch a backlog of outputs and use them when we boot up
and allows a service to be notified when something new is published (without having to care about how it got there)
I haven't fully fleshed this out, but I think this could be useful for other things as well
for example the inflation data in OpenSpending at the moment is updated every year or so, but OpenSpending loads it at boot
and does not poll for new data
we could have inflation data be published to a pubsubhubbub (PuSH) hub which would notify OpenSpending
and OpenSpending could hot load the new "master data"
now I'm talking about OpenSpending in its current form
in the micro-service case this would be the analytical service that uses inflation data
in that sense the pubsubhubbub would be the event dispatcher for the process
so if there is somebody interested in validated source files or something, they could just listen to that "topic" on the PuSH hub
OpenSpending as a "project" would still continue to process the validated source files, but other projects could also do it
pwalsh: does that answer your question/make sense?
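A minimal in-memory sketch of the dispatcher idea above, assuming a hub that keeps published messages instead of dropping them once read, so that any number of services can listen to a topic and fetch the backlog when they boot. This is not the PubSubHubbub protocol itself, only a stand-in for the behaviour described; the Hub class and the topic name are hypothetical:

```python
from collections import defaultdict


class Hub:
    """Toy event dispatcher: topics keep their messages, nothing is acked."""

    def __init__(self):
        self._log = defaultdict(list)          # topic -> all published messages
        self._subscribers = defaultdict(list)  # topic -> callbacks

    def backlog(self, topic):
        """Everything published on a topic so far, for services booting up."""
        return list(self._log[topic])

    def subscribe(self, topic, callback, replay=False):
        """Register a callback; optionally replay the backlog to it first."""
        if replay:
            for message in self.backlog(topic):
                callback(message)
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Store the message and notify every current subscriber."""
        self._log[topic].append(message)
        for callback in list(self._subscribers[topic]):
            callback(message)


hub = Hub()
openspending_inbox, other_project_inbox = [], []

# OpenSpending listens from the start.
hub.subscribe("validated-source-files", openspending_inbox.append)
hub.publish("validated-source-files", {"file": "budget-2014.csv"})

# Another project hooks in later and still gets the earlier output.
hub.subscribe("validated-source-files", other_project_inbox.append, replay=True)
hub.publish("validated-source-files", {"file": "budget-2015.csv"})
```

Both inboxes end up with the same two messages, which is the point of leaving messages in place: a publisher does not need to know who consumes its output.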
pwalsh
so, something just became clearer to me now. most of the micro-service interaction you are proposing (here and in the OSEP etc.), is for input of data, and processing of data (aggregation, etc.). The front end API, consumed by public-facing apps, would likely be its own micro-service, and it only needs to know about some efficient form of the data that it returns to users (whether it is output from cubes, or rendered files on S3 as rufus was suggesting, and so on)
(that wasn't an answer to your question)
trickvi
yes
pwalsh
re. your question: yeah sure, that is what is most interesting to me - the design for different entry points