#amara

      • sylvainc joined the channel
      • sylvainc
        hi bendk1, are you around?
      • bendk1
        I'm here
      • sylvainc
        oh, cool, safe?
      • bendk1
        yeah I think so
      • it's freezing cold, but I think that's about it
      • sylvainc
        good!
      • bendk1
        I don't pay attention to the weather though :)
      • I guess we'll see
      • sylvainc
        Advantages of working from home!
      • got a question if you have a minute. I think I asked you a while back, but I don't remember the answer and I don't want to do something stupid:
      • when we want to add a field to the index,
      • can we just manually edit schema.xml
      • if I want to add a field to ./apps/teams/search_indexes.py?
      • bendk1
        no
      • so first of all, I think adding fields is tricky
      • because we need to reindex everything, which is slow
      • we could probably speed it up by removing a bunch of unneeded fields, but haven't done that yet
      • so if we can get away without adding fields, that would be better
      • but if you have to add....
      • I think it's adding a field to the search index class
      • then running rebuild index
      • maybe you need to do something to rebuild the schema too
      • basically, I've never done it and am afraid of it
      • sylvainc
        ok, got it. so what do you think would be the best way to do a query where we first get matching items from a solr query
      • then need to filter further using fields that are not in the index
      • the way I did it a while ago was to retrieve all ids of matching items,
      • then do another query to the db, including id__in=[... matching ids]
      • sorry not sure it is clear,
      • and that seems to be too slow for the largest team we have
      • bendk1
        I'm not 100% clear, but I think I get what you're talking about
      • and yeah, that does seem slow
      • is this for the bulk set primary audio language code?
      • sylvainc
        right now the issue is for filtering, in move_videos
      • all items used for filtering are in the index
      • but we also want to filter according to that primary audio language code
      • for which we need to go through teamvideo -> video -> primary_audio_language_code
      • so right now I retrieve all filtered items, then make a db query in teamvideo with that extra filter, with id__in=... already filtered items
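
To make the two-step approach described above concrete, here is a minimal sketch using Python's standard `sqlite3` module instead of the project's real Solr index and Django models. The table layout, the sample rows, and the names `build_demo_db` and `filter_by_language_two_step` are all assumptions made for illustration; only the teamvideo -> video -> primary_audio_language_code path comes from the conversation. The list `matching_ids` stands in for the ids returned by the search index.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema, not the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE video (
            id INTEGER PRIMARY KEY,
            primary_audio_language_code TEXT
        );
        CREATE TABLE teamvideo (
            id INTEGER PRIMARY KEY,
            team_id INTEGER,
            video_id INTEGER REFERENCES video(id)
        );
        INSERT INTO video VALUES (1, 'en'), (2, 'fr'), (3, 'en');
        INSERT INTO teamvideo VALUES (10, 1, 1), (11, 1, 2), (12, 1, 3);
        """
    )
    return conn


def filter_by_language_two_step(conn, matching_ids, language_code):
    """Second step: narrow ids that came from the index with a db-only field."""
    if not matching_ids:
        return []
    # One placeholder per id: the statement grows with the number of matches,
    # which is the part that becomes expensive for a very large team.
    placeholders = ",".join("?" for _ in matching_ids)
    sql = (
        "SELECT tv.id FROM teamvideo AS tv "
        "JOIN video AS v ON v.id = tv.video_id "
        f"WHERE tv.id IN ({placeholders}) "
        "AND v.primary_audio_language_code = ? "
        "ORDER BY tv.id"
    )
    rows = conn.execute(sql, [*matching_ids, language_code]).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    # Pretend the search index matched these three team videos.
    print(filter_by_language_two_step(demo, [10, 11, 12], "en"))
```

Every id from the first step has to be shipped back to the database in the second step, so the cost of this pattern scales with the size of the match set rather than with the size of the final result.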
      • bendk1
        and now we are running into the other issue with the index :)
      • there's a regular index which does have the field we want
      • then the team video index, which doesn't (from what I can tell)
      • it would be better if we just had the regular index and added the team field there
      • sylvainc
        do you mean add index to that field in the database?
      • and then we would make a db query in case we need to filter with the primary audio language?
      • bendk1
        I mean we have 2 celery search indexes
      • sylvainc
        otherwise we use the solr index
      • bendk1
        1 for videos and 1 for team videos
      • but team videos are videos, so it doesn't make so much sense in my mind
      • sylvainc
        oh, i see
      • bendk1
        if you look in apps/videos/search_indexes.py you'll see the other index
      • so maybe we should just delay this ticket until search indexes are fixed
      • sylvainc
        i see, but there we have the same issue which is to add the team?
      • bendk1 is now known as bendk
      • I am thinking, as a temporary fix, of just doing a regular query when we want to filter by primary audio language
      • bendk
        yeah, if that works then go for it
      • not sure about the performance issues of querying all the team video ids though
      • sylvainc
        ok, cool, thanks, I'll try to remember this time
      • i think I would do the whole query at once
      • rather than two
      • bendk has quit
      • bendk joined the channel
      • bendk: in case you missed my last sentences:
      • i think I would do the whole query at once, rather than two, that should fix the timeout we currently have
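
As a rough illustration of the "whole query at once" idea, the sketch below does the team filter and the language filter in a single joined query, so no id list is passed around. It again uses the standard `sqlite3` module with a made-up schema; the table and column layout and the names `make_demo_db` and `team_videos_by_language` are assumptions, not the project's actual models. Whether this is fast enough on real data would still need to be checked on staging, as noted in the conversation.

```python
import sqlite3


def make_demo_db():
    """Create a tiny in-memory database (hypothetical schema, not the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE video (
            id INTEGER PRIMARY KEY,
            primary_audio_language_code TEXT
        );
        CREATE TABLE teamvideo (
            id INTEGER PRIMARY KEY,
            team_id INTEGER,
            video_id INTEGER REFERENCES video(id)
        );
        INSERT INTO video VALUES (1, 'en'), (2, 'fr'), (3, 'en'), (4, 'en');
        INSERT INTO teamvideo VALUES (10, 1, 1), (11, 1, 2), (12, 1, 3), (13, 2, 4);
        """
    )
    return conn


def team_videos_by_language(conn, team_id, language_code):
    """Filter by team and by the related video's language in one query."""
    sql = (
        "SELECT tv.id FROM teamvideo AS tv "
        "JOIN video AS v ON v.id = tv.video_id "
        "WHERE tv.team_id = ? AND v.primary_audio_language_code = ? "
        "ORDER BY tv.id"
    )
    return [row[0] for row in conn.execute(sql, (team_id, language_code))]


if __name__ == "__main__":
    demo = make_demo_db()
    print(team_videos_by_language(demo, 1, "en"))
```

The join lets the database apply both conditions together, which avoids building and sending a large `IN (...)` list.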
      • bendk
        ahh, ok that makes sense
      • yeah I think it should be fast enough, but you never know until you try on that staging db :)
      • good luck
      • sylvainc
        thanks again!
      • sylvainc has left the channel