#hypothes.is


      • MrWoohoo joined the channel
      • MrWoohoo has quit
      • JohnMcLear has quit
      • travis-ci joined the channel
      • travis-ci
        hypothesis/h#8378 (better-login-form - d2cf842 : Nick Stenning): The build passed.
      • travis-ci has left the channel
      • travis-ci joined the channel
      • hypothesis/h#8379 (replace-accounts-forms - 947a8dc : Nick Stenning): The build passed.
      • travis-ci has left the channel
      • travis-ci joined the channel
      • robertknight/h#15 (t93-create_group_refresh - 70cc2dc : Robert Knight): The build is still failing.
      • travis-ci has left the channel
      • vannevar
        Detail threads / replies should highlight on hover, same as bucket: https://github.com/hypothesis/h/issues/15
      • travis-ci joined the channel
      • travis-ci
        robertknight/h#16 (t93-create_group_refresh - 1dd1383 : Robert Knight): The build was fixed.
      • travis-ci has left the channel
      • vannevar
        created time not immediately visible on new replies: https://github.com/hypothesis/h/issues/16
      • hslack has quit
      • hslack joined the channel
      • woah joined the channel
      • woah has quit
      • woah joined the channel
      • hslack
        <seanh> I was gonna make the blocklist of blocked URIs for the Chrome extension be a list of regexes
      • <seanh> Thought it might be useful to let people block an entire site for example, not just one URL at a time
      • <seanh> But now I'm not so sure: http://example.com will block the entire site
      • <seanh> To block the front page you would need something like ^http://example.com$
      • <seanh> That might not be too bad actually - default if they just give the root URL of a site with no regex characters is to block the whole site
      • <seanh> There's also the problem of regex special characters in URIs
      • <seanh> Maybe I should use glob instead
      • <seanh> Yeah fnmatch.fnmatch() seems better
      • <seanh> Fewer special characters to worry about, simpler pattern syntax
      • <nick> @seanh: +1 to globbing rather than regex
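The regex-versus-glob trade-off seanh describes can be sketched roughly as below. This is only an illustration, not the extension's actual code: the `BLOCKLIST` entries and the helper names `is_blocked` and `regex_blocks` are made up for the example.

```python
import fnmatch
import re

# Hypothetical blocklist entries; not taken from the real extension.
BLOCKLIST = ["http://example.com/*", "https://mail.google.com/*"]


def is_blocked(uri, patterns=BLOCKLIST):
    """Return True if the URI matches any glob pattern in the list."""
    # fnmatchcase skips the OS-dependent case normalisation that
    # fnmatch.fnmatch applies, so results are the same on every platform.
    return any(fnmatch.fnmatchcase(uri, pattern) for pattern in patterns)


def regex_blocks(uri, pattern):
    """Regex variant: an unanchored search matches anywhere in the URI."""
    return re.search(pattern, uri) is not None


# A bare root URL used as a regex blocks the whole site ...
print(regex_blocks("http://example.com/some/page", "http://example.com"))
# ... and "." is a wildcard, so it also matches unrelated hosts.
print(regex_blocks("http://exampleXcom/", "http://example.com"))
# Blocking only the front page needs explicit anchors.
print(regex_blocks("http://example.com/page", "^http://example.com$"))

# With globs, "*" is the only wildcard most patterns will need.
print(is_blocked("http://example.com/some/page"))
print(is_blocked("http://example.com/page", ["http://example.com/"]))
```

Globs still treat `*`, `?` and `[...]` specially, so a pattern containing a literal `?` from a query string also matches any other single character in that position.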
      • <conor> Is there a list somewhere of URIs we want to block? Gmail is on my list.
      • JohnMcLear joined the channel
      • woah joined the channel
      • <judell> "Is there a list somewhere of URIs we want to block? Gmail is on my list." Is that because the badge shows a bunch of spurious annotations? That's temporary and will be solved when the doc equivalence work is done. At that point, individual email messages should be annotatable. It'd only make sense to do so privately but could be useful, eh?
      • <nick> @judell: Gmail has the same URL for every user, right?
      • <nick> I don't think enabling Hypothesis on Gmail by default is a good idea...
      • <nick> ...if Gmail URLs are anything like what they once were.
      • <nick> And I've also yet to dig into the bug you reported about badge URI equivalence
      • <nick> which seems very odd to me because the badge is just doing a search query.
      • <judell> Gmail message ids are unique I should think. OTOH they're in a fragment identifier so...never mind :)
      • <nick> well, we might be able to improve the situation in the future
      • <nick> but for now fragments are removed by normalisation
      • <conor> The URL is "https://mail.google.com/mail/u/0/#inbox" for every user
      • <nick> @conor: yep
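A small sketch of why fragment removal makes every Gmail view look like the same document. It uses the standard library's `urldefrag` as a stand-in for the fragment-stripping step only; the real normalisation code is not shown in this log, and the per-message URL shape (`#inbox/MESSAGE_ID`) is a placeholder assumed for illustration.

```python
from urllib.parse import urldefrag


def strip_fragment(uri):
    """Drop the fragment identifier, one step a URI normaliser might perform."""
    return urldefrag(uri).url


# The inbox URL quoted in the discussion.
inbox = strip_fragment("https://mail.google.com/mail/u/0/#inbox")
# Hypothetical per-message URL; the id lives entirely in the fragment.
message = strip_fragment("https://mail.google.com/mail/u/0/#inbox/MESSAGE_ID")

# Both collapse to the same URI once the fragment is gone.
print(inbox == message)
```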
      • <judell> "the badge is just doing a search query" I thought of comparing the two queries on the wire in case encoding diffs.
      • <nick> judell: don't worry -- i'll look into the bug when I can
      • <nick> it just doesn't, a priori, make a lot of sense to me
      • <judell> @conor: yes, individual messages have ids, but I agree that's not that interesting/important
      • <conor> Just showing what I meant. Happy to have it solved however.
      • <nick> we do already have a mechanism to block Hypothesis activation on particular URLs
      • <judell> also we have seanh's blocklist all ready to go, so i'll write up a card to proceed with blocking gmail
      • <nick> right
      • <nick> I think we need to do a bit of thinking about how we're going to consolidate these blocklists, though
      • <nick> because I'm not overly keen to maintain two different blocklists for the badge and extension activation
      • <nick> but that's a different discussion
      • <judell> "two different blocklists" what's #2?
      • <nick> there's a site activation blocklist, and https://github.com/hypothesis/h/pull/2573
      • <judell> heh. live and learn.
      • M-bobderbaumeist has left the channel
      • <robertknight> To go over the approach for sending notifications to clients when a user joins a group, I'd need to: 1) Create a new class to represent the group join event, similar to AnnotationEvent, 2) Publish a notification when the user is added to the group, 3) Read those messages off the queue in streamer.py and dispatch a message on the WS with a new message type, 4) Listen for that in the client and fire off an event that triggers an update of the g
      • Any subtleties I should be aware of?
      • <nick> @robertknight: you can ignore AnnotationEvent if it makes things simpler
      • <nick> actually
      • <nick> hmm
      • <nick> hmm hmm
      • <nick> yeah, this is ugly
      • <nick> because I don't really want the streamer to need one nsq connection per message type
      • <nick> but that may be the easiest thing to do in the short term
      • <nick> this probably needs a complete overhaul, to be honest
      • <nick> so that we can, without too much fuss, send messages of various types to all connected clients for a given userid
      • <nick> and at the moment the streamer is really tied to annotation events
      • <seanh> Yeah I haven't thought yet about consolidating the implementation of the two blocklists. I remember that the original site blocklist was done the way it was because it had to work in all circumstances, for example both the Chrome extension and the bookmarklet
      • <seanh> But we can look into it at some point, there might be some way to consolidate them
      • <nick> @seanh: I suspect the right thing to do would be to put everything in the database
      • <nick> and then the chrome extension can download the blocklist once a day into localStorage.
      • <nick> @robertknight: in summary, yes, although you can ignore the `AnnotationEvent` abstraction.
      • <nick> I suggest that you publish to a `control` topic, where each message includes the `userid` it's destined for
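One way to picture the single `control` topic nick suggests: every message carries the `userid` it is destined for plus a type, and the streamer forwards it to whichever sockets that user has open. The sketch below is plain Python with no queue client; `FakeSocket`, `sockets_by_userid`, the `group-join` type name, the payload shape and the userid format are all assumptions made for the example, not the streamer's real interfaces.

```python
import json
from collections import defaultdict


class FakeSocket:
    """Stand-in for a WebSocket connection; records what would be sent."""

    def __init__(self):
        self.sent = []

    def send(self, text):
        self.sent.append(text)


# Hypothetical registry: userid -> sockets currently open for that user.
sockets_by_userid = defaultdict(list)


def encode_control_message(userid, msg_type, payload=None):
    """Serialise one message for publishing to the `control` topic."""
    return json.dumps({"userid": userid, "type": msg_type, "payload": payload or {}})


def handle_control_message(raw):
    """Forward one control message to every socket of the target user.

    Returns the number of sockets the message was delivered to.
    """
    message = json.loads(raw)
    outgoing = json.dumps({"type": message["type"], "payload": message["payload"]})
    # .get avoids creating empty registry entries for unknown users.
    targets = sockets_by_userid.get(message["userid"], [])
    for socket in targets:
        socket.send(outgoing)
    return len(targets)


# Usage: one connected client receives a (hypothetical) group-join message.
alice = FakeSocket()
sockets_by_userid["acct:alice@example.com"].append(alice)
raw = encode_control_message("acct:alice@example.com", "group-join", {"group": "abc123"})
handle_control_message(raw)
```

Because the message type travels inside the payload, adding a new kind of notification would not need another topic or connection, which is the concern raised above about one nsq connection per message type.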
      • travis-ci joined the channel
      • travis-ci
        hypothesis/h#8397 (xgknj2RP-disable-Chrome-badge-on-certain-pages - 829f1d6 : Sean Hammond): The build has errored.
      • travis-ci has left the channel
      • hslack
        <seanh> nick: Well, the Chrome extension is making a request to the server on every page already, right? To get the badge number. So maybe we can put the site blocklist in the db, then add the flag for it to that same response. I haven't thought about the bookmarklet yet though
      • <nick> @seanh: given the blocklist could be quite large, I don't think that's going to be a good idea for long
      • <nick> we already do about 20req/s on our servers just to serve the badge
      • <nick> so let's not turn that from a 2KiB response into a 20KiB response
      • <nick> we can just store cache and last update time in local storage and every time it's queried kick off a background thread to update it
      • <nick> or something like that
      • <seanh> nick: Wait, it just needs to add a True/False for the current tab to the response, not the whole blocklist
      • <seanh> current uri, I mean
      • <nick> ahhh
      • <nick> shit, sorry
      • <nick> yes
      • <nick> i'm being dumb
      • <seanh> Lots of checking URIs against lists of patterns, though
      • <nick> meh
      • <nick> cacheable
      • <nick> @seanh: also, if it's fnmatch you can probably push some of that onto the server
      • <nick> err
      • <nick> database
      • <seanh> How?
      • <nick> I'm pretty sure you can do something like store patterns in the database and then do
      • <seanh> Anyway, let's just leave it simple fnmatch for now, see if it becomes a problem. It won't unless we block a lot of sites anyway
      • <nick> `SELECT COUNT(*) FROM blocked_urls WHERE 'https://my.current.url/' LIKE pattern;`
      • <nick> @seanh: TBH that's probably so easy to do it might be worth doing from the start...
      • <seanh> Depends exactly how the pattern matching works I guess. fnmatch is good for this cause it's so simple. very few special chars
      • <seanh> Well, it's already finished the Python way!
      • <nick> Sure
      • <seanh> Though very easy to change of course
      • <nick> But I do want to remind you that this is currently 75-85% of all our traffic :)
      • <nick> So we do actually need to be a bit careful about what we do on that endpoint.
      • <seanh> Sounds like LIKE is even simpler than fnmatch in the complexity of the patterns
      • <nick> Yep
      • <nick> You could take patterns from the user and replace `*` with `%` and store them in the database.
      • <seanh> Or just have the user give us LIKE patterns directly?
      • <nick> I guess -- it's only an admin interface...
      • <seanh> Exactly
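A runnable sketch of the database-side matching discussed above, under some loud assumptions: it uses an in-memory SQLite database purely so the example stands alone (the project's actual database is not named here), the table and column names are taken from nick's query, and `glob_to_like` is a hypothetical helper for the "replace `*` with `%`" idea. It also escapes `%` and `_`, since both are wildcards in LIKE and `_` is common in URLs.

```python
import sqlite3


def glob_to_like(pattern):
    """Translate a `*` glob into a LIKE pattern, escaping LIKE's own wildcards."""
    escaped = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped.replace("*", "%")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocked_urls (pattern TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO blocked_urls (pattern) VALUES (?)",
    [(glob_to_like(p),) for p in ["https://mail.google.com/*", "http://example.com/"]],
)


def is_blocked(uri):
    """Return True if any stored pattern matches; the matching runs in SQL."""
    (count,) = conn.execute(
        # The URI is the left operand, each stored pattern the right operand.
        "SELECT COUNT(*) FROM blocked_urls WHERE ? LIKE pattern ESCAPE '\\'",
        (uri,),
    ).fetchone()
    return count > 0


print(is_blocked("https://mail.google.com/mail/u/0/"))
print(is_blocked("http://example.com/page"))
```

Note that SQLite's LIKE is case-insensitive for ASCII by default, and other databases may differ, so case handling would need checking against whichever database is actually used.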
      • <seanh> Well, today my wrists are hurting but I can change it tomorrow
      • <seanh> As for 85% of traffic - I don't see any way to reduce it except to distribute the blocklists to be stored client side, and have them refresh it now and then
      • <nick> Caching
      • <nick> And a CDN
      • <seanh> That would also mean we're not reporting to our server every page you visit :) But much harder to implement I think
      • <seanh> Well, that too of course
      • <nick> Frankly just caching the responses will go a long way to reducing the load on the server, I suspect.
      • <nick> That way we're not hitting Elasticsearch for every single request.
      • <seanh> I'm not sure what our setup is with regard to this. Would it be enough to just tell Pyramid to insert Expires and Cache-Control headers?
      • <seanh> Anyway, I'm off for tonight too, ciao
      • <lenazun> good night!
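On seanh's closing question about Expires and Cache-Control: the sketch below only shows what those two header values could look like for a response cached for a fixed period. It is framework-independent standard-library code; `cache_headers` and the 300-second lifetime are assumptions for illustration. Pyramid's view configuration does accept an `http_cache` argument that sets these headers, but whether that alone fits this deployment is exactly the open question in the log.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires values for a cacheable response."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # "public" lets shared caches such as a CDN store the response; it is
        # only appropriate if the response does not vary per user.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # HTTP dates are always expressed in GMT.
        "Expires": format_datetime(expires, usegmt=True),
    }


# Hypothetical five-minute lifetime for a badge response.
headers = cache_headers(300, now=datetime(2015, 1, 1, tzinfo=timezone.utc))
print(headers["Cache-Control"])
print(headers["Expires"])
```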
      • tilgovi joined the channel
      • travis-ci joined the channel
      • travis-ci
        robertknight/h#17 (t105-refactor_notification_client - 607a1c7 : Robert Knight): The build passed.
      • travis-ci has left the channel
      • vannevar
      • travis-ci joined the channel
      • travis-ci
        robertknight/h#18 (t105-refactor_notification_client - 6a52544 : Robert Knight): The build passed.
      • travis-ci has left the channel