is putting a small config snippet into the header (i guess you call it) for the docs ok? I understand why the csv doc lists a regexp as one of the options, but it's not exactly obvious what it means when you're trying to make the thing work
is there anything specific to logstash when considering disk space requirements?
I imported a 5.9MB file and ended up with a 40MB data directory.
phrawzty
blalor: logstash really doesn't use much disk at all. are you asking about elasticsearch?
semiosis
well that's elasticsearch, not logstash, technically
blalor
sorry, yes, I was THINKING elasticsearch, failed to TYPE it.
geez, you can't read a newbie's mind. that's a bug.
semiosis
the rule of thumb is you'll need ~3x raw log size on disk, if you don't do anything special like enable compression or mutate logs to reduce redundancy
the 3x probably varies widely though
phrawzty
blalor: elasticsearch inflates static log data because a bunch of metadata is being generated to go along w/ it.
fetep
blalor: couple pieces of advice: make sure you have the compression stuff on
Electrical
semiosis: thought it was 6x ? ( disk tests from whack )
blalor: and you can experiment with index templates, so you don't do full indexing on @message (assuming you're grokking it out to @fields or otherwise have that populated)
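A rough sketch of what such an index template might look like, built as a Python dict and printed as JSON. This assumes the older Elasticsearch mapping style (a `string` field with `"index": "no"`) and a `logstash-*` index pattern; the `build_template` helper is hypothetical, and newer Elasticsearch releases use a different template and mapping format, so check the docs for your version before using anything like this.

```python
import json


def build_template(pattern="logstash-*"):
    """Build an index template body that keeps @message stored but not indexed.

    Assumption: older Elasticsearch mapping syntax, where a string field
    with "index": "no" stays in _source but is not searchable.
    """
    return {
        "template": pattern,  # assumed index naming scheme
        "mappings": {
            "_default_": {
                "properties": {
                    # skip full-text indexing on the raw line; the parsed
                    # fields from grok are searched instead
                    "@message": {"type": "string", "index": "no"},
                }
            }
        },
    }


print(json.dumps(build_template(), indent=2))
```

The trade-off is that @message can no longer be searched directly, so this only makes sense if the parsed fields carry everything you query on.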
semiosis
Electrical: i'm sure you're right, my info is old
phrawzty
Electrical: it really depends how you've configured ES, what fields you're stripping, etc.
Electrical
phrawzty: very true. but default config ( no stripping, no removing, no compression ) gives about 6x ( according to the tests whack did )
blalor
so far I'm just using the embedded instance as a demo, but sucking in about 770MB of prod logs from the last 4 months or so
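To put the numbers from this discussion side by side, here is a back-of-the-envelope sketch. The 3x and 6x factors are the rough figures quoted above, not measurements, and `estimate_index_size_mb` is a made-up helper name.

```python
def estimate_index_size_mb(raw_mb, factor):
    """Return the estimated on-disk index size for raw_mb of raw logs."""
    return raw_mb * factor


RAW_MB = 770  # the prod logs mentioned above

# Rough multipliers quoted in the channel; real ratios vary with config.
for label, factor in (("~3x rule of thumb", 3), ("~6x default config", 6)):
    print(f"{label}: {estimate_index_size_mb(RAW_MB, factor)} MB")

# The ratio from the earlier import: 5.9MB of logs, 40MB data directory.
observed_ratio = 40 / 5.9
print(f"observed: {observed_ratio:.1f}x")
```

The observed ratio from the 5.9MB import comes out a little under 7x, which is closer to the 6x figure than to the 3x one.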
mortini_
gaah
i'm out of rubygas for the day i think
https://gist.github.com/timconradinc/5273375 <- i added a config snippet for the csv filter, i'm not sure it's the best place for something like that since other modules don't have it
I've got timestamps in this format - "Mar 29 2013 14:19:52". I don't think the grok pattern SYSLOGTIMESTAMP will match on that because the year is in there, so I guess I have to create my own pattern?
techminer: it's quite easy to create your own patterns, as in 'MYSYSLOGLINE %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} %{HOSTORIP:syslog_hostname} %{GREEDYDATA:syslog_message}'
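For anyone who wants to sanity-check the shape of that pattern outside logstash, here is a rough Python equivalent. The regex pieces are simplified stand-ins for grok's MONTH, MONTHDAY, YEAR, TIME, HOSTORIP and GREEDYDATA, not their real definitions; the hostname `web01` and the sample message are made up, and the `timestamp` group is an addition so the date can be parsed.

```python
import re
from datetime import datetime

# Simplified stand-ins for the grok patterns named above (assumption:
# these are looser than grok's own definitions).
LINE_RE = re.compile(
    r"(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{4} \d{2}:\d{2}:\d{2}) "
    r"(?P<syslog_hostname>\S+) "
    r"(?P<syslog_message>.*)"
)


def parse_line(line):
    """Split a line into timestamp, hostname and message, or return None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Collapse repeated spaces (e.g. "Mar  9") before parsing.
    # %b depends on the locale; this assumes English month abbreviations.
    stamp = " ".join(fields["timestamp"].split())
    fields["timestamp"] = datetime.strptime(stamp, "%b %d %Y %H:%M:%S")
    return fields


sample = "Mar 29 2013 14:19:52 web01 something happened"  # hypothetical line
print(parse_line(sample))
```

Unlike the grok line above, this captures the timestamp as a single named field, which is usually what you want if the next step is to parse the date.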