I need to edit the logstash config file and change the value of a field dynamically while the pipeline is running. Can I run a separate Python script that regularly opens, edits, and closes the file, and have the pipeline continue with the changed fields?
Or will this be an issue?
Any help will be appreciated :)
hugh_jass joined the channel
phutchins joined the channel
yellow..
torrancew
truthseeker1990: as I tried to explain last night, that won't work
truthseeker1990
I don't think we ever got to this point torrancew :/ Or I misunderstood what you were saying at the time
Can you just succinctly say why it won't work?
torrancew
Because Logstash won't pick up the changes to that file
Logstash reads the config files at boot
you'd have to change the file, bounce logstash
truthseeker1990
There is an automatic config reload option in the latest version of logstash
torrancew
ah, I see
truthseeker1990
oh i remember, that's why I came back to this option
torrancew
(I did mention that I've not used 5.x)
truthseeker1990
I read about it last night
yeah
Apparently once this option is enabled, it scans for changes to the config file every 3 seconds
torrancew
that may work, but it is very very hacky, and may be a bit brittle (particularly the programmatic editing of the config file)
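One way to make the programmatic edit less brittle is to never let Logstash see a half-written file: write the new config to a temporary file and rename it over the original. The following is a minimal sketch, not a tested Logstash integration. The function name `set_keywords` is hypothetical, and it assumes the setting appears exactly once in the file, written as `keywords => [...]`.

```python
import os
import re
import shutil
import tempfile


def set_keywords(config_path, keywords):
    """Rewrite the `keywords => [...]` setting of a config file atomically.

    Hypothetical helper: assumes the setting appears exactly once.
    """
    with open(config_path, encoding="utf-8") as fh:
        text = fh.read()

    rendered = ", ".join('"{}"'.format(k) for k in keywords)
    # A function is used as the replacement so that backslashes in the
    # keywords are not interpreted as regex escapes.
    new_text, count = re.subn(
        r"keywords\s*=>\s*\[[^\]]*\]",
        lambda match: "keywords => [{}]".format(rendered),
        text,
    )
    if count != 1:
        raise ValueError(
            "expected exactly one keywords setting, found {}".format(count)
        )

    # Write to a temp file in the same directory, then rename it over the
    # original, so a reader sees either the old file or the new one.
    directory = os.path.dirname(os.path.abspath(config_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
        # mkstemp creates the file as owner-only; keep the original mode.
        shutil.copymode(config_path, tmp_path)
        os.replace(tmp_path, config_path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Calling `set_keywords("twitter.conf", ["#a", "#b"])` (the file name is only an example) would replace the old list in one step. Whether the reload then behaves well for a given input plugin is a separate question that this sketch does not answer.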
truthseeker1990
Yeah, it really is. I am wondering if maybe logstash is not the right fit for my problem
I could write a python script that does what I need and writes the tweets to an S3 bucket (which is what I wanna do). It's just not as cool and integrated as using logstash -> S3 -> ElasticSearch
rastro
truthseeker1990: depending on the change needed, can you use the 'translate' filter? it can read from a file (which gets reloaded).
thewarchild joined the channel
truthseeker1990
Hmm, so what I wanted to do was change the 'keyword' field of the twitter input plugin in the config file so that I only get tweets that contain the trending hashtags. These change every hour. It's easy to do if you are working directly with the Twitter API, but for some reason the only place in logstash where you mention the 'keyword' is in the config file, and there doesn't seem to be any dynamic, programmable way to set that field. The translate filter could work, except that it would still be reading a general sample stream of ALL tweets and then filtering them based on the hashtags that I give it. It would kind of work, but it wouldn't get a lot of positive hits coz the sample stream is only a small set of all tweets, and what are the chances it will contain the hashtags that I want?
I am surprised that logstash gives you the ability to read from the twitter stream but doesn't give you a dynamic, programmable way to interact with the twitter API through it.
rastro
i've found that the world is full of disappointments :)
truthseeker1990
Wise words :)
torrancew
truthseeker1990: My advice remains: write a simple thing to scrape twitter for the keywords you care about; your thing can get keywords from wherever you want it to (database, redis, disk, go wild with it)
then your thing can feed data to LS for further processing
via a file, or a socket, or whatever generic mechanism you like
truthseeker1990
Hey torrancew, would I even need LS if I can scrape the tweets? Coz I can do the scraping pretty easily.
torrancew
IDK what you're doing with the data
truthseeker1990
Ah, I am just feeding it to an S3 bucket
torrancew
if you're trying to further process the data, LS may be a really nice thing to use there
or if you want to rely on LS's state-awareness or something, etc
(you'll still have to solve the twitter side of that, but LS could handle the s3 writing, if that's useful to you)
truthseeker1990
Yes! That would be useful.
One last question: assuming I wrote a python script to scrape the tweets, do you have any advice on how to feed the stream to LS? I don't think there's a programmable way to interact with LS
torrancew
you'd use some generic mechanism to pass the data
a few options that come to mind:
write tweets to a file, logstash uses a file input to read them
rastro
truthseeker1990: i love using files for external scripts that gather data. it's the easiest least-common-denominator.
torrancew
write tweets to a socket/http endpoint, logstash listens on that thing to receive them
write tweets to redis or something, if for some reason that is easier
rastro: is there an "exec" input type? if so, is it useful here?
(that you know)
rastro
torrancew: there is an exec. always felt like a hack to me.
torrancew
I could see that
truthseeker1990
Ok cool, Thanks. I think I might just write to a file, seems like the path of least resistance.
rastro
to me, files beat sockets because the data is still there when LS isn't running.
torrancew
agreed
rastro
truthseeker1990: if you write them as json, it'll flow in very easily. i do this with snmptraps.
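The file approach can be sketched in a few lines of Python: append one JSON object per line, so each line is a complete event. This is a minimal sketch. The function name `append_tweets` and the dictionary shape of a tweet are assumptions, and the Logstash side (presumably a file input with a JSON codec) is not shown.

```python
import json


def append_tweets(path, tweets):
    """Append each tweet (a dict) to `path` as one JSON object per line."""
    with open(path, "a", encoding="utf-8") as fh:
        for tweet in tweets:
            # json.dumps escapes embedded newlines, so every event
            # stays on a single line.
            fh.write(json.dumps(tweet, ensure_ascii=False) + "\n")
```

Because the file is opened in append mode, the scraper can call this repeatedly, and anything written while Logstash is down simply waits in the file.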
torrancew
and for situations where you have a lot of indexers you want to fan the data out to, I think a redis-like thing still beats sockets
truthseeker1990 has quit
rastro
torrancew: i had one of those, too. i would ftp data down, have that local logstash round-robin it out to three redis instances, and then had three distributed logstashes reading it from redis.
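The round-robin fan-out described here can be illustrated without any Redis dependency. In this rough sketch each "sink" is just a callable that accepts a serialized event; in a real setup it might wrap a push onto a Redis list, but that wiring is an assumption and is left out. The name `round_robin` is hypothetical.

```python
import itertools
import json


def round_robin(events, sinks):
    """Hand each event to the next sink in turn, wrapping around.

    `sinks` is a list of callables taking one JSON string. In practice a
    sink might push onto a Redis list (assumption, not shown here).
    """
    if not sinks:
        raise ValueError("at least one sink is required")
    targets = itertools.cycle(sinks)
    for event in events:
        next(targets)(json.dumps(event))
```

With three sinks, events 1, 4, 7 go to the first, 2, 5, 8 to the second, and so on, which spreads the load evenly as long as events are of similar size.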
thewarchild
hi guys, good day to all
‘(\n%{GREEDYDATA:parts_message})+’ if I have this, I match one or more… but the field name is being overwritten and I am just getting the ‘last’ message. Is there any way to either give each iteration its own field name OR turn it into an array?
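The overwriting described here is how repeated capture groups generally behave in regex engines: the group keeps only its final iteration. The sketch below reproduces that with Python's `re` module, which is not the engine grok uses, so it is an analogy rather than a grok fix. The sample text is made up, and `.*` stands in for GREEDYDATA.

```python
import re

# Made-up sample: a header followed by three newline-separated parts.
text = "header\nfirst part\nsecond part\nthird part"

# A repeated named group is overwritten on every iteration, so only the
# last iteration's text survives.
repeated = re.search(r"(?:\n(?P<parts_message>.*))+", text)
last_only = repeated.group("parts_message")  # "third part"

# Matching the single-part pattern repeatedly yields every part as a list.
all_parts = re.findall(r"\n(.*)", text)
# ["first part", "second part", "third part"]
```

The second form suggests the general shape of a workaround: capture the whole block once, then split it into parts in a separate step.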