hmm. is there a way to separate out a field as its own event?
I'm storing tweets, and I was thinking, since retweets contain the original tweet with updated like/retweet counts, it would be nice to update the original tweet document
pandaadb
I am not sure if there is a plugin that does that
but it would be easy to write your own to yield new events
There is an issue with the ruby filter: it does not allow you to yield new events (otherwise you could use it for this)
However, updating does not require you to create a new event. E.g. the Elasticsearch output can update previously indexed documents based on the document_id, I believe
LotR
pandaadb: yeah, but the original tweet is in the retweeted_status field of the retweet. I would need to split this out somehow
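A minimal sketch of the split being asked about, written in plain Python rather than as a Logstash plugin. It assumes each tweet is already parsed into a dict and uses the Twitter field names `retweeted_status` and `id_str`; the function name `split_retweet` and the `retweeted_id` back-reference field are hypothetical, added only for illustration.

```python
from typing import Any, Dict, Iterator


def split_retweet(tweet: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield the embedded original tweet (if any) and then the tweet itself."""
    original = tweet.get("retweeted_status")
    if isinstance(original, dict):
        # Emit the embedded original as its own event, so it can be indexed
        # (or used to update an existing document) under its own ID.
        yield dict(original)
        # Strip the retweet down: drop the embedded copy and keep only a
        # reference to the original (hypothetical field name).
        slim = {k: v for k, v in tweet.items() if k != "retweeted_status"}
        slim["retweeted_id"] = original.get("id_str")
        yield slim
    else:
        # Not a retweet: pass the event through unchanged.
        yield tweet
```

Each yielded dict would then be handled as a separate event, with the original's `id_str` serving as the document ID for the update.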
pandaadb
can you show me an example paste?
(of the retweet)
you can likely determine if a tweet is a retweet and then strip the retweet down to only contain what you want, and then update the document with that info
So from the looks of it, LotR, the update API of ES can execute a script instead of just doing a usual upsert
So in the script you have access to the original document. You define the document ID to find it, and create a Groovy script that updates the values the way you want
and then simply run it. Might be worth a try :)
And by the looks of it, it can do anything (create new fields, modify existing fields, overwrite stuff, etc.)
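A rough sketch of what such a scripted update could look like, as a Python function that only builds the JSON body for an Elasticsearch update request and sends nothing. The script object shape (`lang`, `inline`, `params`) is an assumption based on Elasticsearch versions that shipped Groovy scripting; its key names have changed between releases, so check the documentation for the version in use. The function name `build_update_body` and the choice of `retweet_count` and `favorite_count` as the fields to refresh are also assumptions.

```python
import json
from typing import Any, Dict

# Groovy source (assumed syntax): Elasticsearch runs it against the stored
# document, which the script sees as ctx._source. Params are referenced by
# their bare names.
UPDATE_COUNTS_SCRIPT = (
    "ctx._source.retweet_count = retweet_count; "
    "ctx._source.favorite_count = favorite_count"
)


def build_update_body(original: Dict[str, Any]) -> Dict[str, Any]:
    """Build an update body that refreshes the counts on the original tweet."""
    return {
        "script": {
            "lang": "groovy",
            "inline": UPDATE_COUNTS_SCRIPT,
            "params": {
                "retweet_count": original.get("retweet_count", 0),
                "favorite_count": original.get("favorite_count", 0),
            },
        },
        # If the original tweet was never indexed, insert it as-is instead.
        "upsert": original,
    }


if __name__ == "__main__":
    sample = {"id_str": "1", "retweet_count": 5, "favorite_count": 3}
    print(json.dumps(build_update_body(sample), indent=2))
```

The body would be sent to the update endpoint for the document whose ID is the original tweet's `id_str`, so repeated retweets keep overwriting the counts on one document.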
Hi, still looking for conceptual advice. Where would be better to introduce grok filtering - on collector nodes, or on indexing nodes. Using Redis on indexing nodes for buffering.