asksol: could the broker connections leave dangling references in the hub ?
asksol
dangling?
hr: if you use redis/memcached yes
ionelmc
yeah, the Channel or whatever closes connection but does not remove it from the hub
asksol
hr: but obviously the count is correct, what is missing is the results
ionelmc: that would be possible
ionelmc
then there's another operation with the same fd that is something else (cause linux recycles them fds very fast) and gets closed erroneously
i had this issue before
managed to segfault the interpreter by just mismanaging fds :)
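The failure mode described above (a closed fd whose entry stays registered, then the number is reused for something else) can be reproduced in a few lines. This is only a sketch: the `readers` dict is a hypothetical stand-in for the hub's fd table, not kombu's actual data structure, and it relies on the POSIX rule that a new descriptor gets the lowest free number.

```python
import os

# Open a pipe and remember the number of its read end.
r, w = os.pipe()
stale_fd = r

# Hypothetical registry standing in for the hub's fd -> callback table.
readers = {stale_fd: "old connection callback"}

# The connection is closed, but the registry entry is never removed.
os.close(r)

# POSIX hands out the lowest free descriptor, so the number is reused at once.
r2, w2 = os.pipe()
reused = stale_fd in (r2, w2)

# The stale entry now refers to an unrelated, live descriptor.
print(reused, readers.get(stale_fd))

# Clean up the descriptors that are still open.
for fd in (w, r2, w2):
    os.close(fd)
```

Anything that later closes or unregisters `stale_fd` on behalf of the old connection would hit the new pipe instead.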
asksol: any hints on where to look for this issue in the broker ?
i was using redis for broker
also, i was thinking, maybe the hub should log some error or raise one if you try to add a fd that's already in the hub - would raising break any existing code ?
asksol
not sure, everything is in kombu.transport.redis
ionelmc
i think that would *really* help tracking the source of the leak
asksol
you would have to rewrite some parts in that case
several things will replace the fd afair
e.g. from 'wait for outq readable' to 'read from outq'
I was considering the same, but matched what other event loops are doing
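A minimal sketch of the idea being discussed: a registry that either raises or just logs when an fd is re-added with a different callback. `StrictReaders`, `strict`, and the two callbacks are hypothetical names for illustration, not part of kombu; the lenient mode reflects the point that some code legitimately replaces the handler on the same fd.

```python
import logging

logger = logging.getLogger("hub_sketch")


class StrictReaders:
    """Hypothetical fd -> callback table; not kombu's Hub."""

    def __init__(self, strict=False):
        self.strict = strict
        self.readers = {}

    def add(self, fd, callback):
        old = self.readers.get(fd)
        # Re-adding the same callback is harmless; a different one is suspicious.
        if old is not None and old is not callback:
            if self.strict:
                raise ValueError(f"fd {fd} already registered with {old!r}")
            logger.warning("fd %s: replacing %r with %r", fd, old, callback)
        self.readers[fd] = callback

    def remove(self, fd):
        self.readers.pop(fd, None)


def wait_for_outq_readable(fd):
    pass


def read_from_outq(fd):
    pass


# Lenient mode: the replacement is logged and then applied.
lenient = StrictReaders(strict=False)
lenient.add(5, wait_for_outq_readable)
lenient.add(5, read_from_outq)
```

With `strict=True` the second `add` would raise instead, which is what would break callers that swap handlers without removing the fd first.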
ionelmc
well ok, lemme ask differently: is there any place now that will re-add a handler on a fd - with a *different* callback ?
asksol
also would like to register file objects instead of raw fds
but that is also tricky as I think select returns fds not the original objects
hr
that's annoying ::/
asksol
sure
ionelmc
asksol: you can't ... you only get back numbers from epoll or whatever :(
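Since the OS poller only hands back numbers, registering file objects means keeping an fd -> object map on the side. Python's standard `selectors` module does exactly that, as the sketch below shows. It illustrates the general approach only and says nothing about how kombu's hub is implemented.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()
fd_a = a.fileno()

# Register the socket object itself; the selector remembers fd -> object.
sel.register(a, selectors.EVENT_READ, data="reader for a")

# Make `a` readable.
b.send(b"ping")

# select() returns SelectorKey tuples carrying the original object back.
events = sel.select(timeout=1)
ready = [(key.fileobj, key.fd, key.data) for key, _mask in events]

# Unregister before closing so no stale entry is left behind.
sel.unregister(a)
sel.close()
a.close()
b.close()
```

The mapping is only as good as its bookkeeping: the object has to be unregistered before it is closed, otherwise the same stale-fd problem comes back.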
hr
asksol: any suggestion as to where I should look so I stop asking stupid questions ::p
asksol
hr: maybe redis evicted them, this happens to me and nothing is logged
ionelmc
asksol: so then, is there any place now that will re-add a handler on a fd - with a *different* callback ?
negval has quit
asksol
ionelmc: yes
ionelmc
are there many situations like that?
hr
why would that happen?
ionelmc
can you give me some examples?
asksol
I don't remember
hr
any way to fix the timeout (if it would be the problem)
asksol
<asksol> e.g. from 'wait for outq readable' to 'read from outq'
there will be many more in the future
negval joined the channel
ionelmc
asksol: well yeah, but that's a larger change, and btw, not all event loops implement the reactor pattern
tulip does a proactor iirc
hr
this is the only key I see in my redis celery-task-meta-a4cd6f41-e02f-4cb5-8fa7-fd2639cdb4a2
asksol
hr: the timeout value is arbitrary, it shouldn't have to wait since the counter means all results should have been written
ionelmc
asksol: can you give me a highlevel explanation of the `consolidation` thing in the Hub ?
is it something so you can have a 'bulk callback' - eg, there's a callback that gets called with all the fds that were ready in one poller poll ?
asksol
ionelmc: that just means it will call a callback once for all fds instead of one callback for every fd
it's only used when the pool inqueues are readable
ionelmc
and then you need to filter out what you're interested in in said callback
asksol
as we need to choose one to write to
ionelmc
right?
asksol
right, but all callbacks that were flagged as 'consolidate'
ionelmc
aah right
asksol
it's only there for inqueue scheduling
so there is only one list of 'consolidate fds', you cannot have multiples
so it calls schedule_writes(list_of_inqueue_fds_ready_for_read)
instead of inqueue_ready(fd); inqueue_ready(fd); inqueue_ready(fd)
if we did the latter we would have to maintain a set of ready fds and then hub.call_soon(schedule_writes())
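The consolidation behaviour described above can be sketched roughly as follows. `MiniHub`, `add_consolidated`, and `dispatch` are hypothetical names chosen for illustration, not the hub's real API; the point is only that ordinary fds get one callback each, while fds flagged for consolidation are collected and passed to a single callback as a list.

```python
class MiniHub:
    """Hypothetical stand-in for the hub; names are illustrative only."""

    def __init__(self):
        self.readers = {}                 # fd -> callback, called once per fd
        self.consolidate = set()          # fds delivered together in one call
        self.consolidate_callback = None  # single callback for the whole batch

    def add_reader(self, fd, callback):
        self.readers[fd] = callback

    def add_consolidated(self, fd):
        self.consolidate.add(fd)

    def dispatch(self, ready_fds):
        """Handle the fds reported ready by one poll."""
        batch = []
        for fd in ready_fds:
            if fd in self.consolidate:
                batch.append(fd)
            elif fd in self.readers:
                self.readers[fd](fd)
        # One call for all consolidated fds instead of one call per fd.
        if batch and self.consolidate_callback is not None:
            self.consolidate_callback(batch)


calls = []
hub = MiniHub()
hub.consolidate_callback = lambda fds: calls.append(("schedule_writes", fds))
hub.add_reader(3, lambda fd: calls.append(("on_readable", fd)))
for inqueue_fd in (7, 8, 9):
    hub.add_consolidated(inqueue_fd)

# fds 3, 7 and 9 were ready in this poll; 8 was not.
hub.dispatch([3, 7, 9])
print(calls)
```

Because there is a single `consolidate` set and a single callback, only one such group can exist at a time, which matches the "you cannot have multiples" remark.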