anyone here using influxdb as a graphite backend? (e.g. sending metrics via collectd -> relay (carbon-relay-ng) -> influxdb)
just wondering if there are major benefits right off the bat, or if they only become apparent as you scale up the number of data points in the dataset
(sorry, it's monday :) )
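Every hop in that pipeline (collectd -> carbon-relay-ng -> influxdb's graphite input) speaks the Graphite plaintext protocol: one `path value timestamp` line per metric. A minimal Python sketch, with a hypothetical relay host/port:

```python
import socket
import time

def format_metric(path, value, timestamp=None):
    """Build one line of the Graphite plaintext protocol: 'path value timestamp\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)

def send_metric(host, port, line):
    """Ship a formatted line to a relay over TCP (carbon's default port is 2003)."""
    sock = socket.create_connection((host, port), timeout=5)
    try:
        sock.sendall(line.encode("ascii"))
    finally:
        sock.close()

line = format_metric("servers.web01.cpu.idle", 87.5, 1420070400)
# send_metric("relay.example.com", 2003, line)  # hypothetical relay host
```

The relay then fans the same lines out to carbon and/or influxdb, which is what makes running both backends in parallel cheap to try.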
Civil
bulldozer1: I think Dieterbe is :)
bulldozer1: it has its pros and cons
InfluxDB is easier to set up (for me, at least) than all the carbon stuff
you can use grafana directly on influxdb
bulldozer1
i have carbon set up and it seems to be working fine, i just wonder if i shouldn't add influxdb as another output as well
and then switch after testing etc.
Civil
bulldozer1: you can always try, if you have spare hw
karlnp_
i'm using it based on dieterbe's stuff. it's working well for me
bulldozer1
well, it's in jails, so i'll just do that.
Civil
though I think it's better to wait for 0.9
bulldozer1: also it depends on your load.
bulldozer1
when is that slated ?
Civil
bulldozer1: they are saying it should be released this month
karlnp_
i started with carbon & added influx, and influx responded better & had better disk usage patterns (iirc) so now i'm just using that. we're also basically running it on a digital wristwatch so ymmv
bulldozer1
Civil: this is a low-scale testbed right now.. eventually it will be moved to real prod HW when it starts intaking a boatload of metrics
karlnp_: interesting
i am gonna setup influxdb as well, why not.
Thank you for sharing, folks
:)
Civil
bulldozer1: in terms of writing performance, it's better.
karlnp_
dieter has some good posts about the use cases of carbon vs influx
Civil
graphite's default scaling mechanism is broken by design
though influxdb has some problems with sharding right now
that they promise to fix in 0.9
UICTamale
Any of you using influxdb with a LOT of "tags" ?
Civil
but even now it's better
karlnp_
yeah... i really am curious how many of the "will be fixed in 0.9" issue closures will be fixed For Real or if they'll just end up being pushed down the pipe
Civil
bulldozer1: by scaling I mean when you've got more load than one server can handle, influxdb will be better
bulldozer1
Civil: gotcha
Civil
bulldozer1: but influxdb 0.8 is a lot more write-optimized
it's slow when you have a lot of read requests
bulldozer1
however in my thinking, i would probably roll influxdb from day 1 and then scale out hw/compute wise as needed, rather than having to re-engineer that storage later and export/import, etc.
Civil
and quite CPU and RAM hungry for that
bulldozer1
i assume influxdb stores the MRU workingset/etc in memory, and does journaling + flushes to disk in a sane methodology
which is what im looking for
i dont want to go to disk for metrics, even though I am using ZFS so its not that bad.
also whisper/ceres is easier to manage (if you have an unstable environment, where the metric naming scheme can change or software moves from one host to another - it's easier to rename a file than to do a select-delete in influx)
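The rename-a-file point can be sketched like this: whisper maps a metric path `a.b.c` to a file `a/b/c.wsp` under the storage directory, so a rename is a plain filesystem move. The InfluxDB 0.8 side is left as a hedged comment, since there is no rename there (metric names and paths below are illustrative):

```python
import os

def rename_whisper_metric(storage_dir, old_metric, new_metric):
    """Rename a whisper metric by moving its .wsp file:
    metric path 'a.b.c' maps to '<storage_dir>/a/b/c.wsp'."""
    old_path = os.path.join(storage_dir, *old_metric.split(".")) + ".wsp"
    new_path = os.path.join(storage_dir, *new_metric.split(".")) + ".wsp"
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    os.rename(old_path, new_path)

# In InfluxDB 0.8 there is no rename; the equivalent is the copy-then-delete
# ("select-delete") Civil mentions, roughly (syntax varies by version, verify
# against your release):
#   select * from "old.metric" into "new.metric"
#   drop series "old.metric"
```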
karlnp_
those are a few months old tho
Dieterbe
hi
karlnp_ is now known as karlnp
karlnp
-
hmm. weird ghosting thing. hi dieter
bulldozer1
lol
Civil
bulldozer1: for my workload (where data should be stored for several years, with different retention periods, etc) I've chosen Ceres for now, but I'll try again with 0.9
bulldozer1
yeah, im going to keep everything on whisper for now and stream a parallel copy to influxdb and play with that
basically what Dieter did for his setup.
JahBurn1 joined the channel
Dieterbe
yeah
i also don't fully trust influxdb yet, it's nice to be able to wipe data, refill it, etc
We've got a similar test going to what you posted about in one of our production regions
where we split from our application poller and send to traditional graphite and kafka
then a kafka consumer writes to cynaite
er cyanite
So far, cyanite is showing a lot of promise for us (we're on AWS)
Dieterbe
good to hear
UICTamale
we were getting to the point where we needed 4 m1.xlarge carbon servers
Dieterbe
i choose to try influx before cyanite because i think longer term it should become the better solution, but right now cyanite might be much better, i don't know.
piavlo has quit
karlnp
i would have tried it but there are a few ppl here who talk endless shit about cassandra. that was before i realized that people who talk shit about technologies mostly just post on HN and don't do much
here = dayjob
Civil
I've tried cyanite, but quite a long time ago (this summer), and I don't really have much data about it. It seems more mature than influxdb at the moment, but I agree with Dieterbe - it doesn't seem like a good choice in the long-term perspective.
Maybe I just don't know how to "bake" cassandra
UICTamale
That's just it, we're already using cassandra for a lot of other stuff
so it was easy for us to try
In fact, that's my main worry about influx - it's hard to write good distributed fault-tolerant storage
cassandra has that problem licked
Civil
UICTamale: if it's easy - then try :) It's always worth trying, because all storages have their pros and cons
UICTamale
yeah definitely
Civil
and it's always about finding best combination of + and -
UICTamale
I tried influx a while ago and didn't get far
failed to install
I'll give it another shot soon
Civil
UICTamale: for me, the data I'm sending to graphite is not that important - it'd be bad to lose it, but not critical.
that relaxes the requirements for fault-tolerance a lot :)
UICTamale
Indeed
it used to be that way for us, too
Civil
and current company is too small to have more than 2 servers for graphite :)
bulldozer1
Civil: what is your long term storage for metrics you care about ?
UICTamale
but now we're looking at billing off some of this data
Civil
that relaxes "distribution" a lot :)
bulldozer1: just to use metrics to predict something and provide reports like "we've got 2x increase in number of clients this year, we need to have additional servers" :)
just for that purpose
*mainly for that purpose
karlnp
you get more than one server for graphite? i'm jealous
UICTamale
yup, that's similar to what we need
but we need to compare things year over year
which means we can't lose data
bulldozer1
right, im asking what storage are you using for your stats ?
Civil
If I lose an hour or two of stats - it's manageable :)
bulldozer1: Ceres
karlnp
i'm forced to put like 8 services on one box that would be underprovisioned for any of them individually. be thankful imo :P
bulldozer1
interesting
Civil
bulldozer1: I've ended up with: carbon-c-relay (I used carbon-relay-ng before, but needed aggregation; in fact carbon-relay-ng is more predictable from my point of view, and I still have some small problems after switching to c-relay)
7 instances of carbon-writer (patched version of Yandex's megacarbon)
running all of this with pypy2.4
and two servers: one for storing data with 30s precision (to be able to see what happened in the last week or two), and another - the main one - with 3 days of 30s plus a lot of other retention schemas
and graphite-api for accessing the data
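A retention split like that is usually expressed in carbon-style storage-schemas.conf syntax (Yandex's megacarbon fork may configure it differently); the patterns and periods below are illustrative, not Civil's actual config:

```ini
# storage-schemas.conf (illustrative values)
# first matching pattern wins; retentions are precision:duration pairs
[short_term_server]
pattern = ^servers\.
retentions = 30s:14d

[main_server]
pattern = .*
retentions = 30s:3d,5m:30d,1h:2y
```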
bulldozer1
interesting way to chop it up
Civil
bulldozer1: ceres is a bit abandoned, so I'm using a patched version (patches applied to Yandex's version of it)
with a completely rewritten rollup mechanism
bulldozer1
yeah, im trying to stay away from having to roll my own stuff :>
Civil
bulldozer1: it's the only dashboard we have in Grafana :)
bulldozer1: the problem is that currently we have quite mixed workload here
you can see that it's 20rps peak on graphite-api
and avg is around 10
and some queries are quite complex
like request 1k lines for 4 days, compute something strange (sumSeries, asPercent, etc, some times more than one function for that)
and also graphite receives 500k datapoints each minute
and all the things on one server, cheap one
ex-40 - i7-2600, 32GB Ram, 2x OCZ Vertex 3 120GB
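The kind of nested query Civil describes (sumSeries, asPercent over many series and several days) looks roughly like this against a graphite-api render endpoint; the host and metric names are made up:

```python
from urllib.parse import urlencode

# Nested graphite functions: what fraction of total requests were errors,
# over the last 4 days, aggregated across many per-server series.
target = "asPercent(sumSeries(servers.*.errors),sumSeries(servers.*.requests))"
params = urlencode({"target": target, "from": "-4d", "format": "json"})
url = "http://graphite-api.example.com/render?" + params  # hypothetical host
```

Each such request forces the backend to read every matching series before the functions run, which is why a handful of these per second can dominate the load on a single box.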
Dieterbe
Civil: the booking.com people (dgryski et al) have written some interesting stuff. like their recent thing which keeps data in memory and computes the requested function in the storage system itself, offloading graphite
Civil: you should ask him about it, we usually talk in #graphite-ng
Civil
Dieterbe: yeah, I should :)
Dieterbe: I want to continue my experiments with storage, but at home :)
Dieterbe
better if you can do it during company time ;)
Civil
Dieterbe: ah, it's not possible :)
they pushed to stop all the experiments and either set up the best solution available at the moment or revert everything to RRDs
and they don't want to spend money on a testing environment for graphite :)
I'd like to do that during company time :)
Dieterbe
best solution is reverting to RRD's? o_O
Civil
Dieterbe: no, I've convinced them to use Ceres at least :)
but they were insisting on reverting to RRDs for some time
and we still (almost 4 months after graphite with ceres became production-ready for us) support the old graphite+rrd setup on an older server
because it's not allowed to drop it by management :)
Dieterbe: management thought that the maximum amount of time to choose the best backend for graphite, optimize, and test everything should be 2 weeks :)