the design of fluentd (looking at the source code) is actually not too unlike logstash.. logstash has 4x more forks, and 7.5x more commits. logstash's first commit to github was august 2009. fluentd was june 2011
jspeck has quit
jumpkick: well, "HA".. if you have multiple logstash instances, you don't need MQ
lumberjack will rotate between the live ones
jumpkick
does it do some kind of load balancing or does it just pound one till it fails and then move to the next one?
avleen
pretty much that. I'm going to add in some load balancing logic this week
the splunk forwarder rotates connections every so often, so I'll probably do similar. every <timeout> seconds, reconnect to a new random server
jumpkick
avleen: some way of distributing load across the catching servers would probably be the best, not an easy problem to solve
avleen
jumpkick: if all of the clients randomly reconnect, the load would balance out well :)
jumpkick
too much reconnecting (every few seconds) will also be bad for performance
avleen
naa
jumpkick
yeah
avleen
every 60 seconds, would be quite OK
jumpkick
I suppose
sqlnoob joined the channel
avleen
not like, every second :D
but once a minute, or so, it would be quite OK. it's very fast to reconnect
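A rough sketch of the random-reconnect idea discussed above, written in Python purely for illustration. It is not lumberjack's code: `pick_server`, `simulate`, and the server names are hypothetical, and the timer that would trigger each reconnect (the `<timeout>` mentioned above) is replaced by a simple loop over rounds. It only shows that clients which all start on one server spread out roughly evenly after a few random reconnects.

```python
import random
from collections import Counter


def pick_server(servers, rng, current=None):
    """Pick a random server, avoiding the current one when an alternative exists."""
    candidates = [s for s in servers if s != current] or list(servers)
    return rng.choice(candidates)


def simulate(num_clients, servers, rounds, rng):
    """Return how many clients sit on each server after `rounds` reconnects."""
    # Worst case: every client starts out connected to the first server.
    current = [servers[0]] * num_clients
    for _ in range(rounds):
        # Each round stands in for one reconnect timeout expiring on every client.
        current = [pick_server(servers, rng, c) for c in current]
    return Counter(current)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so repeated runs are comparable
    servers = ["ls1", "ls2", "ls3", "ls4"]  # hypothetical logstash hosts
    print(simulate(1000, servers, rounds=5, rng=rng))
```

With four servers and a thousand clients, each server should end up with something close to a quarter of the connections, which is the "load would balance out" point made above.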
jumpkick
yeah that'd work
from lumberjack github - "Redis development refuses to accept encryption support, would likely reject compression as well." :(
avleen
yup :(
micko
you could write a tcp proxy that supports encryption in node.js
or use haproxy?
avleen
i've generally found tho, that as long as your logstash downtime is "short" - less than the time before your logs rotate - then things usually recover automatically and quickly. I found (at our scale, anyway), adding redis / MQ made performance significantly worse
micko: or write a codec which encrypts the data
having recently written a codec and a filter, i have to say that they are super easy to write
micko
avleen: got anything on github you could share?
jumpkick
lumberjack directly to a logstash cluster sounds like the best way to go; simpler and more secure too
ksclarke has quit
I think I read it does client certs, client ssl keys or something, which is good
keeps intruders on the network from flooding logstash with spam
avleen
micko: sure!
moj0rising joined the channel
jumpkick
or at least they'd have to break into a node and then they could be filtered out
it collects multiple events, compresses them, and then sends them to the output
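A minimal sketch of the collect, compress, send pattern described above, in Python for illustration only. This is not the compress_spooler codec's implementation or wire format: the class name `SpoolingCompressor`, the `spool_size` value used here, and the choice of JSON plus zlib are all assumptions made for the example.

```python
import json
import zlib


class SpoolingCompressor:
    """Buffer events and emit them as one compressed batch (illustrative only)."""

    def __init__(self, spool_size=50):
        self.spool_size = spool_size  # assumed batch size for this sketch
        self.buffer = []

    def add(self, event):
        """Buffer an event; return a compressed batch once the spool is full, else None."""
        self.buffer.append(event)
        if len(self.buffer) >= self.spool_size:
            return self.flush()
        return None

    def flush(self):
        """Compress and return whatever is buffered, or None if the buffer is empty."""
        if not self.buffer:
            return None
        payload = zlib.compress(json.dumps(self.buffer).encode("utf-8"))
        self.buffer = []
        return payload


def decode(payload):
    """Reverse of flush(): decompress a batch back into a list of events."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


if __name__ == "__main__":
    spool = SpoolingCompressor(spool_size=3)
    for i in range(3):
        batch = spool.add({"message": f"event {i}"})
    print(len(batch), "compressed bytes ->", decode(batch))
```

Sending one compressed batch instead of many small messages means fewer, larger writes to the queue, which is one plausible reason for the throughput gain mentioned in the conversation.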
micko
cool. i'll have a look now
avleen
this was while I was trying to improve the performance with MQ (which has an upper limit of 40k events/sec)
with the compress_spooler codec, I was able to get.. well, a lot more
I maxed my input at 160k/sec, and MQ was still fine
micko
nice
avleen
it was really cool for shipping data from one logstash to another via zeromq too, if you do that.
jumpkick
lol... that's pretty good, 4x faster
avleen
in the end i went back to just having one layer of logstash servers
but I know others would benefit
micko
zeromq better than redis for queueing?
avleen
eh... depends
jspeck joined the channel
i don't fully understand zeromq, so i don't know how it buffers, how to make the buffer persistent to disk, etc
whack
avleen: re: fluentd; fluentd was heavily derived in design from logstash; I couldn't get a solid answer about why it exists though.
avleen: zeromq doesn't persist to disk
micko
avleen: i'll have a play with both
whack: Redis can now :)
whack
micko: redis has been able to persist to disk for a while
micko
my bad
whack
hehe
micko
i'm getting confused with its new HA I think
jumpkick
whack: yeah, that's what I struggled with too... why the heck did they bother to write fluentd (also written in ruby, also using plugins for in/output, does the same thing from what I read)
avleen
whack: yeah that's what i was seeing too (re: fluentd). eh!
sometimes people want to write code, to write code.
micko
^^^^^
avleen
personally, i try not to discourage creativity. sometimes you just have to reinvent something to satisfy a need in yourself
and that's ok
jumpkick
I don't know why they didn't just fork though
micko
can I quote that :P
avleen
hell, i've done it enough times myself
moj0rising
Hello! Whenever I start up logstash with the Twitter input, I get "A plugin had an unrecoverable error. Will restart this plugin." Can anyone here help?
avleen
micko: sure :)
micko: Attribution: "Avleen Vig"
:-)
micko
lol
avleen
whack: how's the newborn? do we have a name yet? :-)
whack
avleen: indeed; when I asked them why, I only got "Because we only want to support json, nothing else"
jumpkick
maybe there was just a preference for apache config file syntax (yuck)
whack
avleen: which is fun because their plugins support more than json now
some people gonna code and not contribute, it happens.