[14:20:50] o/
[14:21:21] I'm back and hacking on a slide deck for my trip to U of Michigan tomorrow.
[14:21:25] Let me know if you need anything.
[14:22:02] presente!
[14:22:08] ok, great.
[14:22:18] Looking at "Are we blocking our revscoring 2.0 deployment on RC filters compatibility? Can we deploy service-first?"
[14:22:19] recovering my bearings...
[14:22:21] In the agenda.
[14:22:25] o/ awight
[14:22:26] :)
[14:23:21] the RC filters compatibility should have been rolled out now, which makes it safe to deploy and rollback the revscoring 2.0 service.
[14:25:47] Great!
[14:25:56] Shall we do a deploy today?
[14:25:58] :)
[14:27:20] hell yeah.
[14:27:31] I can do it
[14:27:45] will make a quick check that our MW code went out on the train
[14:30:11] checks out. OK, checking that RC filters is currently working.
[14:31:05] looks good.
[14:31:17] halfak: I’m ready to try a service deployment.
[14:31:44] awight, awesome!
[14:31:48] Now when is our window...
[14:32:18] 2.5 hours
[14:32:31] 1700-1800 UTC
[14:33:30] Hmm ok I didn’t realize that the deployment windows cover things outside of MW. Good thing to learn :)
[14:34:04] I think the code tree is ready to go already, since I’ve been deploying the prod repo to ores*
[14:35:17] awight, sounds good. Should reflect what we have in beta too. If there's an update that's been going to ORES*, it should go out to beta ASAP.
[14:38:49] cool, I’ll start there.
[14:39:04] um. No deployment window needed, true?
[14:39:58] * awight pats self on head to reassure
[14:41:15] halfak: awight oh you are here :)
[14:41:29] Amir1: hi!
[14:41:39] I did some work yesterday but not much
[14:41:47] I have lots of questions :(
[14:41:53] yes and I’m in a better timezone now, UTC-5
[14:41:55] awight, no deployment window needed for beta
[14:42:01] halfak: ty
[14:42:02] awight, utc-5 represent!
[14:42:18] halfak: Amir1: we should push our morning meetings forward so Amir1 can have a night life
[14:42:21] http://www.cbc.ca/radio/ideas/the-1989-cbc-massey-lectures-the-real-world-of-technology-1.2946845
[14:42:25] or just… leave the office at all.
[14:42:50] I agree. Would be easy to push them forward to 1400 UTC.
[14:43:12] awight: oh UTC-5 is way cooler
[14:43:29] lol
[14:43:39] Might want to move it to 1500 UTC shortly though due to daylight savings switch.
[14:43:41] culture follows chronology?
[14:43:50] might be
[14:44:04] Nov 5th is US daylight savings day
[14:44:31] so. I made https://gerrit.wikimedia.org/r/#/c/386078/ that will be useful for us
[14:44:32] Peru doesn’t have daylight savings, fwiw
[14:46:03] halfak: the edit quality campaign for Serbian is ready for review
[14:46:25] * halfak looks at editquality
[14:47:07] Amir1, how many observations in datasets/srwiki.revisions_for_review.5k_2017.json ?
[14:47:22] should be 5k
[14:47:28] I see 800
[14:48:02] also I finished moving all tests to pytest but everything related to pickle serialize and unserialize fails, it's fascinating to see that because if you put "print(foo == pickle.loads(pickle.dumps(foo)))" it prints False
[14:48:11] oh, that should not happen
[14:48:14] let me check again
[14:48:55] https://travis-ci.org/wiki-ai/revscoring/builds/290903087?utm_source=github_status&utm_medium=notification
[14:49:02] assert german.stemmed == pickle.loads(pickle.dumps(german.stemmed))
[14:49:05] this fails
[14:49:14] did we mock pickle?
[14:49:22] awight, nope.
[14:49:25] I highly doubt that
[14:49:26] No need to mock pickle.
[14:49:32] right
[14:49:40] I hate pickle.
[14:49:55] Amir1, what is the error?
[14:50:11] It's likely that pytest is doing something funny with loading in tests as modules.
[14:50:19] I might be able to work around it.
[14:50:22] I know too much about pickle.
[14:50:29] Also we might consider switching to dill
[14:50:33] It's a bit less crazy
[14:50:41] I just didn't see a clearly good reason to do it in the past.
[14:50:42] the error is "the objects are not the same", comparing objects is not easy
[14:50:50] Oh. that's fucked up.
[14:50:53] I R
[14:51:00] It is in this case.
[14:51:05] easy to compare equality
[14:51:13] It's explicitly defined on datasources and features.
[14:51:58] This is a weird pattern anyway, it would make more sense to test serializer reversibility in its own test, IMO.
[14:52:34] halfak: I checked serbian again and it seems out of 20K there are only 800 cases that need review (not made by a bot, etc.). A botpedia, I guess. Should I increase the number and if yes, to what?
[14:53:29] awight, we're testing that the things that need to be pickled support pickling.
[14:53:36] It's not really testing pickle.
[14:53:40] right
[14:53:42] Testing pickleability.
[14:53:57] yes that we don’t throw exceptions & don’t lose information
[14:54:19] Amir1, 20000 * (5000/800) = 125000
[14:54:30] oh boy
[14:54:34] okay
[14:54:35] awight, +1 :)
[14:54:53] Nothing like some algebra in the AM
[14:54:53] :)
[15:02:27] Oof just got waylaid by the wmfall thread
[15:02:59] uhoh
[15:05:17] halfak: o/
[15:05:25] o/ codezee
[15:05:34] halfak: did the weekly sync up happen?
[15:08:01] It looks like it did from the agenda. :)
[15:08:10] I was AFK sick yesterday
[15:08:35] oh... i see
[15:10:01] had a question regarding sample page_ids on L53 of https://etherpad.wikimedia.org/p/scoring_platform
[15:10:33] I wrote a small query that gives me pages associated with a WikiProject given a project name - https://quarry.wmflabs.org/query/22464
[15:11:24] but as it's clear, each WikiProject has several pages associated with it and we have several WikiProjects for each mid-level category
[15:11:54] so are we taking all pages on Wikipedia or sample_ids represents some smaller set?
[15:13:51] codezee, a random sample of pages.
[15:14:01] And all of the WikiProjects tagging the page are relevant.
[15:15:20] halfak: any guesses of how big a sample size to start with?
[15:15:32] Try 20k
[15:15:48] We'll see how the mid-level topics get distributed and maybe do some supplementing.
[15:16:24] so basically 20k random page_ids in (1, max page id in Wikipedia), right?
[15:17:20] Right. You'll want to limit it to articles (page_namespace = 0) and non-redirect pages (page_is_redirect = 0).
[15:17:37] We'll likely pick up some disambiguation pages too. We'll probably need to trim those after the fact.
[15:19:05] ok, let me try some stuff on quarry then...
[15:27:19] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q2-Oct-Dec-2017): UX check RC Filters in beta (revscoring 2.0/thresholds release) - https://phabricator.wikimedia.org/T178395#3706887 (Halfak) Just to be clear, @Ladsgroup != @Halfak. :) Thanks for getting to th...
[15:28:51] halfak: it didn't
[15:29:09] I added to talk about but no one was online so I went for shopping :D
[15:31:28] Boo.
[15:37:16] Halfak & awight hows you all doing
[15:38:42] Good! Catching up on work I missed yesterday.
[15:39:30] Anything i can do to help, sort phab stuff out, send emails, correspond with other teams etc?
[15:39:37] Halfak ^
[15:40:24] Zppix, would you review [[:mw:JADE]]?
[15:40:25] [1] https://meta.wikimedia.org/wiki/:mw:JADE
[15:40:37] I just rewrote the page to be more clear.
[15:40:43] Halfak sure...
[15:42:55] awight & Amir1, I'm going to miss Staff tomorrow because I'll be at U of Michigan.
[15:43:10] You could hold it without me or we could reschedule to Friday
[15:43:19] Oh... Friday won't work for either of you. :/
[15:43:19] Hmm
[15:43:37] Could do late on Thursday?
[15:43:51] fine for me
[15:44:20] Halfak read it makes sense to me..
[15:46:03] Good deal. No glaring typos?
[15:46:23] No or i would have fixed them by now...
[15:59:09] late Thursday works for me
[16:00:41] 2100 UTC?
[16:00:45] awight, Amir1 ^
[16:00:51] That's pretty late :S
[16:01:11] It's not super late for me
[16:01:28] but the daylight saving time is around the corner
[16:01:31] I can do it… have to set up internet at the new apartment still, but I think I can find something or call in.
[16:01:34] so I don't know exactly what time
[16:02:49] Amir1, not until Nov 5th in US.
[16:03:00] I mean in Germany
[16:03:02] Do you know what it is for CEST?
[16:03:17] it's 29 of october
[16:03:57] it's okay for me
[16:04:40] Gotcha. We'll have one week of derpyness.
[16:23:41] halfak: updated
[16:35:46] Amir1, merged
[16:36:07] halfak: thanks, then I can deploy it in wikilabels :)
[16:36:24] Thanks :)
[16:36:29] I need coffee first, and some sort of dinner
[16:37:08] * Zppix gives Amir1 some joe
[16:41:34] Scoring-platform-team, AbuseFilter, ORES: [Spec] Suppression system for JADE freeform text fields - https://phabricator.wikimedia.org/T153142#3707180 (Halfak)
[16:42:08] Scoring-platform-team, AbuseFilter, ORES: [Spec] Suppression system for JADE freeform text fields - https://phabricator.wikimedia.org/T153142#2871070 (Halfak) See https://www.mediawiki.org/wiki/Topic:Tzw4ebq17wbdog74 for discussion
[16:44:29] halfak: Amir1: coming up on our deployment window. I’m planning to push revscoring 2 to production, cross your fingers!
[16:44:45] * Amir1 hides under the table
[16:45:07] Hey folks. Looks like I'm going to be AFK for this deployment.
[16:45:14] At least the first 30 minutes.
[16:46:18] * Zppix runs to the bunker
[16:48:42] lol.
[16:48:58] * awight reads the directions for the Holy Hand Grenade.
[16:49:08] “counteth not to four. Five is right out."
[17:00:19] Godspeed. I'm on my bike for a bit.
[17:00:36] Will be back online in 40 minutes. If in doubt, roll it all back!
[17:01:36] +1
[17:01:37] awight: can i come out of my bunker now the ac wasnt installed out
[17:01:39] that’s my style.
[17:02:47] damn, our service is currently being overloaded. This should be fun.
[17:03:51] Yay...
[17:04:19] awight: there's a high level of vandal activity at enwiki so that could be why
[17:04:31] Zppix: aha, ty for the info
[17:04:36] Np
[17:05:08] Pays off to have a rollbacker for enwiki in here xD
[17:07:38] Oh for sure, thanks for all the help!
[17:12:54] Canary looks good, continuing...
[17:27:28] that is a slooow clone
[17:38:58] Lol
[18:02:24] Amir1: do you ever find the ORES service deploy hangs like hell during the fetch stage?
[18:02:38] I’ve been waiting 45 minutes and it’s still on 5/9 provisioned.
[18:02:43] awight: yeah it does all the time
[18:02:51] the submodules are super big
[18:02:55] it's a known issue
[18:03:11] some were going fast-ish and others at a crawl or not at all. ok, I’ll restart the deploy.
[18:03:20] Scoring-platform-team (Current), Wikilabels, User-Ladsgroup: Edit quality campaign for Serbian Wikipedia - https://phabricator.wikimedia.org/T178108#3707549 (Ladsgroup)
[18:03:51] Scoring-platform-team (Current), Wikilabels, User-Ladsgroup: Edit quality campaign for Serbian Wikipedia - https://phabricator.wikimedia.org/T178108#3681018 (Ladsgroup) It's now ready for labeling, you can keep track of labels in http://labels.wmflabs.org/stats/srwiki/
[18:04:11] oooh I was disconnected perhaps
[18:04:13] argh.
[18:07:34] [random] https://meta.slashdot.org/story/17/10/22/1714246/when-an-ai-tries-writing-slashdot-headlines
[18:08:40] o/ Amir1 & awight. Hows the deploy?
[18:08:44] Sorry I'm late
[18:08:50] hung. I’m asking for help in -operations.
[18:09:13] The fetch took at least 20 minutes, then I didn’t realize that my SSH connection had broken.
[18:09:30] now I’m stuck with no logs and locked out of redeploying. Unsure of the status.
[18:16:54] Fetch on tin?
[18:17:09] Or fetch through scap?
[18:22:30] halfak: btw. The patch that I had for getting usernames based on id is merged now, I wait for it to deploy and then I can implement username lookup ^_^
[18:22:54] in http://labels.wmflabs.org/stats/enwiki/41
[18:28:19] halfak: fetch through scap was unbearably slow.
[18:28:36] halfak: Amir1: Deployment seems to have succeeded, but take a look at the graphs, https://grafana-admin.wikimedia.org/dashboard/db/ores?orgId=1&from=now-3h&to=now-1m&refresh=10s
[18:28:56] I’m considering rolling back. Unfortunately, our error graphs don’t make sense.
[18:30:55] I think there is something wrong that I noticed and made a phab card for too
[18:31:00] https://ores.wikimedia.org/scores/enwiki/damaging/?model_info=test_stats&format=json
[18:31:12] such requests come from mediawiki I guess
[18:31:13] Amir1: a blocker we should rollback for?
[18:31:18] * awight fingers the red button
[18:31:28] https://grafana.wikimedia.org/dashboard/db/ores-extension?orgId=1
[18:31:38] This is fine which is the important part for me
[18:31:56] but I would be super happy if we can fix this ASAP and deploy it really fast
[18:32:47] OK so this deployment didn’t break it, at least.
[18:34:13] k I’m reassured that recentchanges is still getting scores and thresholds.
[18:36:11] * awight brushes broken glass and dirt from hands
[18:36:58] halfak: Should I get back to the stress testing now, or is there anything more urgent?
[18:37:11] and congratulations all around on revscoring 2!
[18:39:39] I'm half paying attention because I'm in another meeting. Are we all good?
[18:40:08] Looks like we
[18:40:12] 're still erroring
[18:40:12] https://ores.wikimedia.org/scores/enwiki/damaging/?model_info=
[18:40:36] it errors but it has been broken for a while it seems
[18:40:45] looking...
[18:40:54] This is really bad
[18:41:00] What do you mean it's broken for a while?
[18:41:58] Hmm... now looking OK
[18:42:09] Oh wait.. no the v1 path is totally borked.
[18:42:11] Can ORES report its own version?
[18:42:20] /version
[18:42:24] ty
[18:42:30] https://ores.wikimedia.org/versions
[18:42:36] I’m looking for an old version to compare
[18:42:59] Looks like we didn't merge and deploy this: https://github.com/wiki-ai/ores/pull/225
[18:43:03] It's super old.
[18:43:13] k rolling back.
[18:43:42] I'm surprised we didn't catch this in beta.
[18:43:57] I guess the old v1 path would fail as expected, but not in the way we anticipated.
[18:44:13] I was overly focused on making the extension and service comatible.
[18:44:14] I've confirmed the deployment happened as expected though.
[18:44:17] *compatible
[18:44:23] Right. Was good work. This just got missed.
[18:44:26] halfak: I redeployed, either way
[18:44:49] Want to get that merged and I'll get a patchset for the prod config?
[18:44:57] Then you can merge that and try a deploy tomorrow?
[18:47:37] halfak: I had it in beta and reported
[18:47:57] Scoring-platform-team (Current), ORES, Patch-For-Review: Deploy ORES (revscoring 2.0) - https://phabricator.wikimedia.org/T175180#3585227 (awight) Rolled back due to a bug in the v1 routes.
[18:48:18] halfak: it's merged now, we can deploy to beta soon
[18:48:53] OK great.
[18:48:59] Arg Amir1 sorry to miss that
[18:49:11] Pinning us to ab88a74d087efff620a3eeb0e5aad1540d2a838b fwiw
[18:50:17] (PS1) Halfak: Bumps submodules/ores -- fixes v1 model_info [services/ores/deploy] - https://gerrit.wikimedia.org/r/386252
[18:50:25] https://gerrit.wikimedia.org/r/386252
[18:50:34] awight & Amir1: ^^
[18:50:37] Oh let me add the bug ID
[18:51:01] (CR) Awight: [C: 2] Bumps submodules/ores -- fixes v1 model_info [services/ores/deploy] - https://gerrit.wikimedia.org/r/386252 (owner: Halfak)
[18:51:05] (PS2) Halfak: Bumps submodules/ores -- fixes v1 model_info [services/ores/deploy] - https://gerrit.wikimedia.org/r/386252 (https://phabricator.wikimedia.org/T175180)
[18:51:08] {{done}}
[18:51:09] You rule, halfak!
[18:51:15] (CR) Awight: Bumps submodules/ores -- fixes v1 model_info [services/ores/deploy] - https://gerrit.wikimedia.org/r/386252 (https://phabricator.wikimedia.org/T175180) (owner: Halfak)
[18:51:25] (CR) Ladsgroup: [V: 2] Bumps submodules/ores -- fixes v1 model_info [services/ores/deploy] - https://gerrit.wikimedia.org/r/386252 (https://phabricator.wikimedia.org/T175180) (owner: Halfak)
[18:52:14] k, I’ll deploy that to beta after lunch item.
[18:52:27] urgh, the rollback is sluggish
[18:54:39] halfak: srwiki wikilabels campaign is deployed and I have very few things to do tomorrow morning :)
[18:54:58] Nice work Amir1
[18:55:22] Amir1, I'll help out later today and get some stuff on your work sheet.
[18:55:22] <3
[18:55:24] Also persian article quality :P
[18:56:37] I’ll get back to testing the Celery 4 change on the new cluster.
[18:57:31] We’re fully rolled back.
[18:58:20] halfak: yeah, for that I wanted you to clarify your comment there
[18:58:47] I didn't understand, should I do a sample 250K and clean up or I should do the sample 250 revs / class
[18:59:12] 250 revs per class
[18:59:27] because fawiki doesn't have explicit classification like enwiki so I can't find 250 revs that are B-class
[18:59:52] that's the reason I'm saying we need to do a campaign (like wikidata)
[19:01:40] halfak: ^
[19:04:28] Scoring-platform-team (Current), DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3707817 (madhuvishy) Resolved>Open Reopening since we are scheduling the labsdb1001 and 1003 reboots over the next couple weeks.
[19:53:25] Scoring-platform-team (Current), DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3707945 (madhuvishy) Proposed timing for the 2 reboots: labsdb1001: Monday Oct 30 2017, 14:30 UTC (16:30 Madrid, 10:30 EST, 07:30 PT) labsdb1003:...
[19:55:10] Scoring-platform-team (Current), DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3707948 (Marostegui) Looks good to me! Thanks for getting this arranged
[20:14:29] Scoring-platform-team (Current), DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3708038 (madhuvishy) Thanks @Marostegui. I've updated the lists, and our wiki here -https://wikitech.wikimedia.org/wiki/Wiki_Replica_c1_and_c3_sh...
[20:15:49] Amir1, sorry to drop off. meeting meeting meeting. Generally, I'm in agreement that we'll need to do labeling. I'll respond on the task.
[20:19:57] Scoring-platform-team, articlequality-modeling, artificial-intelligence: Article quality campaign for Persian Wikipedia - https://phabricator.wikimedia.org/T174684#3708087 (Halfak) @Ladsgroup and I discussed this in IRC. I agree that we'll need to do some labeling. I think it would be good to analy...
[21:25:04] back in half an hour
[21:26:31] tgr or anyone else who might know:
[21:26:32] do you have a sense why https://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Facebook&rvprop=timestamp|user|comment|content|ids|oresscores produces stuff with scores but https://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=General%20Mills&rvprop=timestamp|user|comment|content|ids|oresscores throws an exception?
[21:37:53] caching, probably?
[21:47:34] dr0ptp4kt: the data for the first is looked up in the enwiki DB and there are only two models there
[21:48:06] the second is not in the DB (probably not in recent changes anymore) so it is fetched from ORES which has more model types
[21:48:37] the ORES response for the two articles has the same structure so probably the other will throw exceptions as well once it falls out of RC
[21:48:55] tgr: ah, right...that.
[21:49:30] but given it says no model for 'models' and ORES does not use that name, this will be a PHP bug of some sort
[21:56:01] Amir1: ^ looks like https://github.com/wikimedia/mediawiki-extensions-ORES/blob/master/includes/Hooks/ApiHooksHandler.php#L303 should be something like getScores(..)[wfWikiID()]['scores']
[22:03:55] awight|afk, joining us for docs meeting?
[22:42:40] dr0ptp4kt: would you mind filing a bug for that?
[22:42:52] Looks like the problem might be in Extension:ORES
[23:00:54] Scoring-platform-team, MediaWiki-extensions-ORES: "No model available for [models]" error for API access - https://phabricator.wikimedia.org/T178962#3708463 (dr0ptp4kt)
[23:01:12] Scoring-platform-team, MediaWiki-extensions-ORES: "No model available for [models]" error for API access - https://phabricator.wikimedia.org/T178962#3708480 (dr0ptp4kt)
[23:01:42] ^ there we go awight|afk and tgr . thanks. wrapping up for the evening. 'night
[23:01:49] Thank you!
[23:06:43] PROBLEM - check http on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:06:45] Scoring-platform-team, MediaWiki-extensions-ORES: "No model available for [models]" error for API access - https://phabricator.wikimedia.org/T178962#3708463 (Tgr) Presumably the response format of ORES changed (revision data is now a subfield of the array), and the extension has not been updated. ORES is...
[23:08:01] PROBLEM - ORES home page on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:08:06] uh oh
[23:08:12] it just happened to me
[23:08:15] should recover
[23:08:26] i think labs dns just flapped
[23:08:51] RECOVERY - ORES home page on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 420 bytes in 0.005 second response time
[23:08:51] RECOVERY - check http on ores.wmflabs.org is OK: OK - Certificate '*.wmflabs.org' will expire on Fri 16 Nov 2018 03:41:05 PM GMT +0000.
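
[editor's note] A rough Python sketch of the failure mode tgr describes at 21:56:01 and in T178962: if revision data now sits under a 'scores' subfield of the per-wiki response, code that still reads the per-wiki level directly would mistake the wrapper keys for model names. The response shape, the KNOWN_MODELS set, the revision ID and both helper functions are assumptions made up for illustration. This is not the extension's PHP code and not real ORES output.

```python
# Hypothetical response shape, inferred from the chat: revision data lives
# under response[wiki_id]["scores"], next to a sibling "models" key.
response = {
    "enwiki": {
        "models": {"damaging": {"version": "0.0.0"}},
        "scores": {
            "12345": {"damaging": {"score": {"prediction": False}}},
        },
    }
}

# Assumed set of model names the caller knows about.
KNOWN_MODELS = {"damaging", "goodfaith"}


def unknown_models(names):
    """Return the names that are not known models, sorted for stable output."""
    return sorted(name for name in names if name not in KNOWN_MODELS)


def scores_by_revision(resp, wiki_id):
    """Unwrap the 'scores' subfield first, then read each revision's models."""
    result = {}
    for rev_id, models in resp[wiki_id]["scores"].items():
        result[rev_id] = {name: data["score"] for name, data in models.items()}
    return result


# Reading the per-wiki level directly treats the wrapper keys as model names.
print(unknown_models(response["enwiki"]))
# Unwrapping "scores" first yields only real model names per revision.
for rev_id, models in scores_by_revision(response, "enwiki").items():
    print(rev_id, unknown_models(models), models)
```

The point of the sketch is only the extra ['scores'] step; which keys the real response carries should be checked against the service before relying on it.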