Those machines will actually be onsite at a facility
NoodlesNZ has quit
rastro
LS doesn't store data. it only forwards it.
xamox
Ahh, k
That was my bad then
I thought it did
danfrincu has quit
It's ES that stores it?
Post filter?
rastro
yes, ES is one possible output from LS.
kepper has quit
xamox
k
Then nevermind
rastro
lol
xamox
Maybe I'll go with your method
haha
kepper joined the channel
Just lsf->LS->ES on cloud
Well that makes it way easier then. :)
rastro
if your typeB servers wrote to your cloud ES, it sounds like you'd be in good shape.
NoodlesNZ joined the channel
yes, exactly.
xamox
Yep, didn't realize that.
The filters just need to live on the typeB machines, correct?
danfrincu joined the channel
rastro
if you only have one LS layer (typeB), then that's where the filters would live.
capt-rogers joined the channel
the one advantage of running an LS in the cloud would be to have your rules centralized there, but...
btobolaski joined the channel
xamox
Okay, nah, that's fine as they may need to be different per machine
Awesome, thanks for the help rastro, I think that is enough to get me going where I need
rastro
but with LS on typeB, you can have different rules per tenant, and only update a tenant's config when needed.
xamox
Yeah, I think I will go that route
rastro
good luck!
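For reference, the layout discussed above (lsf on the type A machines, one LS layer on typeB, ES in the cloud) might look roughly like the sketch below. The port, certificate paths, ES hostname, and the `tenant` field are all hypothetical, and the option names follow the Logstash 1.x plugins of that era (newer elasticsearch outputs take `hosts => [...]` instead of `host`/`protocol`), so check them against the installed version.

```
# Sketch of the single LS layer on a typeB machine. All values are placeholders.
input {
  lumberjack {
    port            => 5043                       # assumed port that lsf ships to
    ssl_certificate => "/etc/logstash/lsf.crt"    # hypothetical path
    ssl_key         => "/etc/logstash/lsf.key"    # hypothetical path
  }
}

filter {
  # Per-tenant rules live here, so each typeB box can carry its own config.
  mutate {
    add_field => { "tenant" => "tenant-a" }       # hypothetical field and value
  }
}

output {
  elasticsearch {
    host     => "es.example.com"                  # hypothetical cloud ES endpoint
    protocol => "http"
  }
}
```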
xamox
I guess I do have a follow up question
I've looked a little bit at datadog
rastro
so soon? :)
xamox
I'm guessing that's basically similar to running ES in the cloud here
dyer
man, it seems like every regex / filter I add in my filters drops my processing rate by 3-4 KiB/s
xamox
What would be the advantage to me say going with a service like that? Just nothing for me to maintain?
coinbird joined the channel
rastro
xamox: i have too many hosts to think about datadog's pricing.
vangap has quit
xamox
Okay, that's what I thought. I may be in a similar situation as there are a decent number of the type A machines feeding into type B
rastro
dyer: make sure your regexps are efficient and that you're using conditionals to make sure stuff is only running when needed.
xamox
Alright, cool, well thanks for all the help rastro
rastro
xamox: i have 500 machines in my POC, so...
yeah, good luck!
dyer
is this an efficient regex? ( if [host] =~ /127\.0\.0\.1/ { mutate { remove_field => "host" } } )
danfrincu has quit
rastro
dyer: does host only contain the ip address, or more than that?
bline79 has quit
dyer
more
danfrincu joined the channel
habanero joined the channel
rastro
dyer: it's a simple literal match, which should be good.
dyer: do you have other fields on which this decision could be based?
dyer
that alone dropped my throughput 5k
xamox
rastro, okay, thanks for tossing out that number. Not quite there yet, but we're rolling a bunch of machines out and will probably be getting not far from that within the next 6-12 months, which is why I'm looking at logstash, as currently we aren't doing any log forwarding and need it to scale up
bline79 joined the channel
pkoro joined the channel
rastro
dyer: do you check the ip in [host] more than once? it might make sense to grok the ip address out first.
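A minimal sketch of that suggestion, assuming [host] contains an IP address somewhere in a longer string: extract it once into its own field, then use cheap equality checks on that field instead of re-running a regex against [host]. The `host_ip` field name is made up for illustration.

```
filter {
  # Pull the first IP address out of [host] one time.
  grok {
    match          => { "host" => "%{IP:host_ip}" }
    tag_on_failure => []          # don't tag events whose host has no IP in it
  }

  # Later decisions compare the extracted field, not the full string.
  if [host_ip] == "127.0.0.1" {
    mutate { remove_field => ["host"] }
  }
}
```

Whether this is faster than the single regex conditional depends on how many times the IP is checked; with only one check it may not help.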
dragun0v
_Bryan_ thanks, will try that out
koendc has quit
dyer
Right now I am just starting to try to optimize my filters, so at this point I have just the following
For something like collectd, is it possible to tag metrics.. and have some be filtered one way, and others filtered another way?
radiocats has quit
sl1pm4t joined the channel
radiocats joined the channel
rastro
dyer: assuming your real input isn't stdin, who is shipping the logs?
jerius joined the channel
dyer
I am debugging crappy performance right now
so I am just cating to stdin
I have 2mil lines of production logs as my test case
my intent is to test filters one by one to find the shitty ones and fix them
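A test harness for that one-filter-at-a-time approach could be as small as the sketch below: stdin in, exactly one candidate filter, and an output that does almost no work so the filter dominates the timing. The filter shown is the one quoted earlier in the channel; swap in each candidate per run.

```
input {
  stdin { }
}

filter {
  # Exactly one candidate filter per run.
  if [host] =~ /127\.0\.0\.1/ {
    mutate { remove_field => "host" }
  }
}

output {
  # Prints one dot per event, which keeps output cost low.
  stdout { codec => dots }
}
```

Running it with an empty filter section first gives a baseline rate to compare each filter against.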
rastro
dyer: but if you're going to be shipping from LSF, you can add a field *there* to identify haproxy/prism rather than scan [message].
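On the Logstash side, that could look like the sketch below, assuming the shipper sets a field named `type` to "haproxy" or "prism" (the field name and values are assumptions, not something stated in the channel). Each branch then runs only for its own log type, and nothing has to scan [message] to decide.

```
filter {
  if [type] == "haproxy" {
    # HAPROXYHTTP is one of the patterns shipped with Logstash; whether it
    # matches these particular logs is an assumption.
    grok {
      match => { "message" => "%{HAPROXYHTTP}" }
    }
  } else if [type] == "prism" {
    # prism-specific filters would go here; the tag is only a placeholder.
    mutate { add_tag => ["prism"] }
  }
}
```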
rtoren joined the channel
dyer: do your logs have a standard prefix (like syslog's)?
dyer
no, unfortunately
rastro
dyer: so no initial date/time?
WhiteHatTux joined the channel
dyer
well we ship everything via syslog but the patterns are all over the place
but the transport from each DC is syslog to AWS, and in AWS we process logs
danzilio has quit
rastro
the point i was trying to make is... if [message] has a common date/time (etc) string at the front, i would suggest grok{}ing that off first, and putting the rest back in [message].
then your regexps would run against a shorter string, which should make them more efficient.
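A sketch of that idea, assuming the lines start with a classic syslog timestamp and hostname (the actual prefix in these logs is unknown, so the pattern and the `log_timestamp`/`log_host` field names are illustrative):

```
filter {
  grok {
    # Anchor at the start, capture the prefix, and keep the remainder.
    match     => { "message" => "^%{SYSLOGTIMESTAMP:log_timestamp} %{SYSLOGHOST:log_host} %{GREEDYDATA:message}" }
    # Without overwrite, grok would append to [message] instead of replacing it.
    overwrite => ["message"]
  }
}
```

Filters placed after this one see only the shortened [message].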
j_t
Is there a way in logstash to buffer say 20 log lines, before I sent them off to output{} ?
rastro
j_t: some outputs have a buffer, like elasticsearch{}'s flush_size.
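For example, with the elasticsearch output of that era the buffering could be set as in the sketch below. Both options existed in the Logstash 1.x/2.x elasticsearch output and were removed in later versions, so treat the names as version-dependent.

```
output {
  elasticsearch {
    flush_size      => 20    # send once 20 events are buffered
    idle_flush_time => 1     # also flush after 1 second with no flush
  }
}
```

Because of the idle timer, a batch can go out with fewer than 20 events.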
dyer
k, ty I will read it right now... one thing... I kinda inherited a lot of this so can you explain to me what these mean in a pattern
%{D}
and %{Q}
rastro
dyer: that should refer to a pattern named "D" or "Q". I don't think they're standard ones, so I'd look around for some custom patterns.
dyer
ah, I found them
## SHORTCUTS:
Q ("){1}
GD %{GREEDYDATA}
D %{DATA}
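For context, shortcut definitions like these only resolve if grok is pointed at the file that holds them. A minimal sketch, assuming the file lives in a hypothetical /etc/logstash/patterns directory and the goal is to capture a double-quoted value into a made-up field named `value`:

```
filter {
  grok {
    patterns_dir => ["/etc/logstash/patterns"]    # hypothetical location of the SHORTCUTS file
    # With the definitions above, this expands to ("){1}%{DATA:value}("){1}
    match        => { "message" => "%{Q}%{D:value}%{Q}" }
  }
}
```

Writing the match as '"%{DATA:value}"' would do the same thing without the custom patterns.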
rastro
wow. your predecessor was lazy :)
BlakeRG
Does anyone know if it's possible to get Kibana to switch between dashboards on some interval? If you've ever used Ducksboard you'd know what I'm talking about
rastro
dyer: have you tuned LS? If not, you'll get a lot of return with that, too.
wrath0r has quit
BlakeRG: haven't seen it. i had the browser do it for me.