1:17 AM
harrywood joined the channel
1:38 AM
odibot has quit
1:38 AM
odibot joined the channel
1:49 AM
harrywood has quit
7:22 AM
pezholio joined the channel
7:22 AM
pezholio has left the channel
7:39 AM
LauraJ joined the channel
7:59 AM
LauraJ has quit
8:10 AM
pezholio joined the channel
8:12 AM
pezholio has left the channel
8:25 AM
JeniT joined the channel
8:26 AM
floppy joined the channel
8:27 AM
JeniT has quit
8:30 AM
floppy has quit
8:41 AM
Davetaz has quit
8:43 AM
Elsmorian joined the channel
8:43 AM
Davetaz joined the channel
8:43 AM
pezholio joined the channel
8:43 AM
pezholio has left the channel
9:02 AM
sfello joined the channel
9:27 AM
otfrom joined the channel
9:31 AM
beauvais joined the channel
9:31 AM
floppy joined the channel
9:36 AM
harrywood joined the channel
9:58 AM
odibot
9:58 AM
benjaminbenben joined the channel
10:05 AM
Davetaz
is it just me or is this the clearest the office hangout has ever been?
10:05 AM
odibot
10:05 AM
pezholio joined the channel
10:11 AM
pezholio
Yo floppy
10:12 AM
JeniT joined the channel
10:12 AM
floppy
yo
10:12 AM
pezholio
Bit worried about OpenAddresses
10:13 AM
Count is stuck at just under a million
10:13 AM
But the script is still running
10:13 AM
And no errors
10:13 AM
WUT
10:14 AM
floppy
joy
10:15 AM
pezholio
Can't see anything in the logs either
10:16 AM
floppy
hmm
10:16 AM
that's bad
10:16 AM
not going up at all any more?
10:16 AM
pezholio
I don't think so
10:16 AM
Just jumping on the console
10:18 AM
Yup. Count is at 994983
10:18 AM
And no errors in the log
10:19 AM
I wonder if we can just kick it off from a later page maybe
10:19 AM
floppy
maybe try locally
10:20 AM
see if anything explodes
10:20 AM
pezholio
This was the last one that was ingested
10:20 AM
10:20 AM
floppy
so at least we know where to start then
10:20 AM
pezholio
At 5:30 last night
10:22 AM
ldodds joined the channel
10:23 AM
10:23 AM
LauraJ joined the channel
10:23 AM
So, I think we should start from a few pages back
10:24 AM
Maybe 111650
10:24 AM
floppy
ok - it won't duplicate, so yes
10:24 AM
going to try from local?
10:24 AM
pezholio
Yup
10:26 AM
I'll just try a few pages first
10:42 AM
There's a surprising number of duplicates, but it seems fine
10:43 AM
I'll kick it off again
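The restart discussed above (going back a few pages and relying on the import not creating duplicates) can be sketched roughly as follows. This is a minimal illustration, not the actual OpenAddresses ingester: `resume_import`, `fetch_page`, the normalisation rule, and the sample addresses are all hypothetical, and only the page number 111650 comes from the conversation.

```python
def resume_import(fetch_page, start_page, end_page, seen=None):
    """Re-run an import from start_page, skipping addresses already stored.

    fetch_page is a hypothetical callable returning the address strings
    on a given page; seen stands in for whatever the real store uses to
    detect an existing address.
    """
    seen = set() if seen is None else seen
    created = skipped = 0
    for page in range(start_page, end_page + 1):
        for address in fetch_page(page):
            # Assumed normalisation: ignore case and repeated whitespace.
            key = " ".join(address.lower().split())
            if key in seen:
                skipped += 1  # already ingested, so re-running is harmless
                continue
            seen.add(key)
            created += 1
    return created, skipped


# Made-up sample data: two pages with overlapping addresses.
pages = {
    111650: ["1 High Street, Leeds", "2 High Street, Leeds"],
    111651: ["2 high street,  leeds", "3 High Street, Leeds"],
}
already_stored = {"1 high street, leeds"}

created, skipped = resume_import(
    lambda page: pages.get(page, []), 111650, 111651, already_stored
)
print(created, skipped)
```

Because existing addresses are skipped rather than re-created, starting a few pages before the stall point costs only some wasted lookups.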
10:43 AM
floppy
pkqk: do you mind if I optimise the JSON response to the datasets search on ODCs?
10:43 AM
I hacked it in a while ago, but it's WAY too slow and doesn't need to be
10:43 AM
so if it won't conflict with you, I'll improve it
10:43 AM
pkqk
sure, I fixed some of it but it’s probably still quite slow
10:43 AM
you mean the json response generation bit?
10:45 AM
just looked through the jbuilder stuff, yeah it’ll be slow. It also doesn’t seem to do pagination, which you might want. What’s it used for? It appears to give a lot more information than the .atom feed
10:46 AM
pezholio
floppy: Right. I've stopped the workers and tweaked the procfile, and added some logging
10:46 AM
Once it's deployed, I'll switch them on again
10:48 AM
floppy
pezholio: ok
10:48 AM
pkqk: yeah, I did a quick hack and reused the full partial, so it's getting loads of info at once
10:48 AM
I'll take a little look
10:48 AM
mainly because I need screenshots for my blog post :)
10:49 AM
pkqk
cool, just keep it on a branch/PR as I might be deploying things today too
10:49 AM
floppy
sure
10:49 AM
I'll run it by you first
10:50 AM
pkqk
👍
10:56 AM
Davetaz
morning all
10:58 AM
pezholio
Mornin' Davetaz
10:58 AM
floppy: It's creating addresses happily now
10:59 AM
floppy
good news
11:00 AM
pezholio
Should get to a million soon!
11:00 AM
996764
11:00 AM
floppy
strange that it stalled
11:00 AM
has it sped up?
11:00 AM
pezholio
Yeah, back to (sort of) normal
11:00 AM
But
11:00 AM
Don't forget, as it goes through, the likelihood of there being duplicates increases
11:01 AM
That's why it probably seems like it's slowing down
11:01 AM
floppy
oh
11:01 AM
pezholio
No idea why it stalled though
11:01 AM
floppy
because companies house isn't unique of course
11:01 AM
many companies registered at same address
11:01 AM
pezholio
H'exactly
11:01 AM
floppy
you mean
11:01 AM
ahaa
11:01 AM
so we're not expecting 3 million final addresses
11:01 AM
pezholio
Yup
11:02 AM
floppy
gotcha
11:02 AM
pezholio
New Wu Tang BTW 996764
11:02 AM
11:02 AM
odibot
Wu Tang Clan - Das neue Album "A Better Tomorrow" (Album Stream) - YouTube
11:02 AM
11:02 AM
pezholio
Even
11:17 AM
We have ONE MILLION ADDRESSES
11:19 AM
Davetaz
and a LAAASSSSEERRR
11:20 AM
pezholio
"LASER"
11:23 AM
floppy
I need that laser
11:23 AM
to DESTROY EVERYTHING
11:23 AM
floppy angers
11:23 AM
pezholio
What up?
11:24 AM
floppy
unnecessary bullshit
11:24 AM
don't worry
11:26 AM
LauraJ has quit
11:28 AM
LauraJ joined the channel
11:43 AM
Davetaz
oh balls
11:45 AM
pezholio
'sup?
11:46 AM
floppy has quit
11:47 AM
Davetaz
i'm in Malaysia the week before xmas
11:50 AM
pezholio
Bad planning that
11:52 AM
Davetaz
grrrrr
11:58 AM
harrywood has quit
11:59 AM
floppy joined the channel
12:06 PM
pezholio
I'm beginning to think github pages doesn't like me
12:06 PM
Page builds are JUST NOT HAPPENING
12:06 PM
When I clone and build locally, it's fine
12:06 PM
And I get no error email
12:09 PM
pkqk
blame it on queues
12:10 PM
pezholio
Yeah, but it's normally a few minutes at the most
12:10 PM
This has been over 10 mins
12:11 PM
pkqk
I’ve learnt from talking to a friend who works at Heroku that things working is an illusion and anything that big is basically failing all the time
12:11 PM
pezholio
Ha!
12:11 PM
That's encouraging anyway
12:13 PM
But then we do know that the whole Internet is held together with Blu Tack and bits of string
12:19 PM
Right. Lunch
12:19 PM
pezholio has quit
12:21 PM
pezholio joined the channel
12:21 PM
harrywood joined the channel
12:23 PM
pezholio has quit
12:24 PM
floppy
12:24 PM
odibot
Optimise JSON search by Floppy · Pull Request #978 · theodi/open-data-certificate · GitHub
12:24 PM
By removing most response set data. This cuts the response time by roughly 2/3 for /datasets.json
12:26 PM
pkqk
deleting things, the best way to speed things up
12:30 PM
floppy
yeah, it was always slow because it was doing too many queries to get info that wasn't necessary
12:31 PM
benjaminbenben has quit
12:31 PM
btw, I'm the only person using this I think. It's not documented
12:31 PM
pkqk
too many queries is most of the app
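The point about the JSON search doing too many queries can be illustrated with a small sketch. It assumes each dataset needs one extra lookup to fetch its response set data; the `Dataset` class, the query counter, and both serialiser functions are hypothetical stand-ins and are not taken from the open-data-certificate code.

```python
class Dataset:
    """Hypothetical record whose response set costs one query to load."""

    query_count = 0  # simulated database queries, shared by all datasets

    def __init__(self, title, answers):
        self.title = title
        self._answers = answers

    def response_set(self):
        # Stands in for a lazy association load: one query per call.
        Dataset.query_count += 1
        return self._answers


def full_json(datasets):
    # Reuses everything, so it triggers one extra query per dataset.
    return [{"title": d.title, "responses": d.response_set()} for d in datasets]


def summary_json(datasets):
    # Returns only what a search listing needs; no extra queries.
    return [{"title": d.title} for d in datasets]


datasets = [Dataset(f"dataset-{i}", {"licence": "ogl"}) for i in range(50)]

full_json(datasets)
queries_full = Dataset.query_count

Dataset.query_count = 0
summary_json(datasets)
queries_summary = Dataset.query_count

print(queries_full, queries_summary)
```

Dropping the response set data removes the per-dataset lookup entirely, which is the same shape of saving described in the pull request.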