[05:39:56] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.015 second response time https://wikitech.wikimedia.org/wiki/ORES
[05:41:20] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 1011 bytes in 0.038 second response time https://wikitech.wikimedia.org/wiki/ORES
[08:11:50] Scoring-platform-team (Current), editquality-modeling, Chinese-Sites, artificial-intelligence: Train/test zhwiki editquality models - https://phabricator.wikimedia.org/T224481 (Viztor) sweet!
[10:15:36] Scoring-platform-team (Current), Wikilabels, editquality-modeling, Serbian-Sites, artificial-intelligence: Investigate srwiki goodfaith model, why is it so bad? - https://phabricator.wikimedia.org/T199355 (Acamicamacaraca) I informed the community. I hope you can deploy this till next week si...
[13:35:01] o/
[13:38:28] Scoring-platform-team (Current), Wikilabels, editquality-modeling, Serbian-Sites, artificial-intelligence: Investigate srwiki goodfaith model, why is it so bad? - https://phabricator.wikimedia.org/T199355 (Halfak) Actually, we just deployed from our side on Monday. We're now waiting on the #...
[13:39:20] ORES, Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Visualize the relationship between the probability of reversion and ores scores - https://phabricator.wikimedia.org/T224918 (Halfak)
[13:39:21] Flood coming!
[13:39:22] ORES, Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Look at recent changes filters event log to track usage - https://phabricator.wikimedia.org/T225133 (Halfak) Open→Resolved
[13:39:25] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Re-train nlwiki models (better language assets) - https://phabricator.wikimedia.org/T224483 (Halfak) Open→Resolved
[13:39:25] Woo!
[13:39:27] ORES, Scoring-platform-team (Current), editquality-modeling, revscoring, artificial-intelligence: ORES deployment: Early June - https://phabricator.wikimedia.org/T224484 (Halfak)
[13:39:29] ORES, Scoring-platform-team (Current), editquality-modeling, revscoring, artificial-intelligence: ORES deployment: Early June - https://phabricator.wikimedia.org/T224484 (Halfak)
[13:39:29] Batch resolving tasks
[13:39:31] Scoring-platform-team (Current), editquality-modeling, Chinese-Sites, artificial-intelligence: Train/test zhwiki editquality models - https://phabricator.wikimedia.org/T224481 (Halfak) Open→Resolved
[13:39:31] \o/
[13:39:33] ORES, Scoring-platform-team (Current), Epic, Wikimedia-Hackathon-2019 (Newcomer friendly): Improvements to ORES localization and support - https://phabricator.wikimedia.org/T223382 (Halfak)
[13:39:35] ORES, Scoring-platform-team (Current), Wikimedia-Hackathon-2019 (Newcomer friendly): Improve dutch language assets - https://phabricator.wikimedia.org/T224168 (Halfak) Open→Resolved
[13:39:37] Scoring-platform-team (Current), Wikilabels: Migrate Wikilabels to new DB server - https://phabricator.wikimedia.org/T224163 (Halfak) Open→Resolved
[13:39:39] Scoring-platform-team, Data-Services, Wikilabels, Patch-For-Review, cloud-services-team (Kanban): clouddb1002 low on space -- move wikilabelsdb - https://phabricator.wikimedia.org/T224062 (Halfak)
[13:39:41] ORES, Scoring-platform-team (Current), Epic, Wikimedia-Hackathon-2019 (Newcomer friendly): Improvements to ORES localization and support - https://phabricator.wikimedia.org/T223382 (Halfak) Open→Resolved a:Halfak
[13:39:43] Scoring-platform-team (Current): Generate datasets for sociative (vital 10k and some wikiprojects) - https://phabricator.wikimedia.org/T221780 (Halfak) Open→Resolved a:Halfak
[13:39:45] ORES, Scoring-platform-team (Current): TEC5 2019Q4 goal: Improve monitoring of ORES components in grafana - https://phabricator.wikimedia.org/T220195 (Halfak)
[13:39:47] Scoring-platform-team (Current), ORES-Support-Checklist, User-Ladsgroup: ores-support-checklist is down - https://phabricator.wikimedia.org/T222270 (Halfak) Open→Resolved
[13:39:49] ORES, Scoring-platform-team (Current), Icinga, User-Ladsgroup: ORES worker icinga tests complain during testwiki deployments - https://phabricator.wikimedia.org/T219930 (Halfak) Open→Resolved
[13:39:51] ORES, Scoring-platform-team (Current): TEC5 2019Q4 goal: Improve monitoring of ORES components in grafana - https://phabricator.wikimedia.org/T220195 (Halfak) Open→Resolved a:Halfak
[13:39:53] Scoring-platform-team, Growth-Team: Update fiwiki thresholds based on new model - https://phabricator.wikimedia.org/T223164 (Halfak)
[13:39:55] Scoring-platform-team (Current), Wikilabels, editquality-modeling, Spanish-Sites, and 2 others: Create editquality campaign for Spanish Wikiversity - https://phabricator.wikimedia.org/T209670 (Halfak) Open→Resolved
[13:39:57] Scoring-platform-team (Current), editquality-modeling, User-Ladsgroup, artificial-intelligence: Train fiwiki damaging and goodfaith models with the second campagin data added - https://phabricator.wikimedia.org/T220216 (Halfak) Open→Resolved
[13:39:59] Scoring-platform-team, Growth-Team (Current Sprint), Serbian-Sites: Enable srwiki edit quality filters in RecentChanges - https://phabricator.wikimedia.org/T197012 (Halfak)
[13:40:03] Scoring-platform-team (Current), Wikilabels, editquality-modeling, Serbian-Sites, artificial-intelligence: Investigate srwiki goodfaith model, why is it so bad? - https://phabricator.wikimedia.org/T199355 (Halfak) Open→Resolved
[13:40:05] ORES, Scoring-platform-team (Current), editquality-modeling, revscoring, artificial-intelligence: ORES deployment: Early June - https://phabricator.wikimedia.org/T224484 (Halfak)
[13:40:07] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Simplify and modularize the Makefile template - https://phabricator.wikimedia.org/T190968 (Halfak) Open→Resolved
[13:40:09] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Train/test edit quality models for jawiki - https://phabricator.wikimedia.org/T130288 (Halfak) Open→Resolved
[13:40:11] Scoring-platform-team (Current), Wikilabels, articlequality-modeling, User-Sebastian_Berlin-WMSE, and 2 others: Build article quality model for svwiki - https://phabricator.wikimedia.org/T202202 (Halfak) Open→Resolved
[13:40:13] Scoring-platform-team, editquality-modeling, Epic, artificial-intelligence: [Epic] Edit quality models (damaging/goodfaith) - https://phabricator.wikimedia.org/T130213 (Halfak)
[13:40:15] Scoring-platform-team (Current), Wikilabels, editquality-modeling, artificial-intelligence: Complete jawiki edit quality campaign - https://phabricator.wikimedia.org/T130266 (Halfak) Open→Resolved a:Halfak
[13:40:19] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Train/test edit quality models for jawiki - https://phabricator.wikimedia.org/T130288 (Halfak)
[15:32:16] Scoring async! (Due to a schedule conflict, we do Tuesday standup on IRC, asynchronously.) halfak groceryheist accraze
[15:32:38] Nothing new from me. Doing what I was doing before.
[15:32:47] Thanks for the ping harej. I'm deep in tech management stuff, so I'll get my async posted in 30 mins.
[15:33:00] 👍
[15:33:47] gcal apparently has taken a dump, so accraze probably doesn't know about async standup.
[15:33:56] I tried to invite him and gcal is exploding.
[15:37:14] i'm here! not much to update, more onboarding today. Just got my laptop -- setting it up now, have a call with IT in ~1hr.
[15:38:01] Oh yes, in staff IRC we are all going nuts over the calendar going down. It’s like anarchy.
[15:43:23] awesome accraze
[15:51:50] anarchy would be a lot more fun :-(
[16:02:38] accraze, did you get my invitation to #wikimedia-staff?
[16:03:05] I just sent another. I don't know how it shows up.
[16:05:48] Ok for my update, I got the deployment done! Woo Looks like everything is looking good.
[16:06:10] My update: yesterday: visited round of talk pages with everyone I talked to before. Started a couple new conversations from the snowball. Pulled data out of hadoop, didn't get very far on plotting and modeling, but I'm setup to do that today
[16:06:13] I also did some work for onboarding accraze.
[16:06:28] halfak yeah I got the invite but when I join it says I'm not authorized to be on the channel
[16:06:46] accraze, gotcha. Let's confirm your cloak and then try again.
[16:07:07] Finally, I picked up a bit of volunteer work to help out the maintainers of quarry. They are implementing some rate-limiting so I helped them figure out how fast humans are really re-running queries.
[16:07:08] getting data out of hadoop remains a pain point since my strategy is to restart my kernel with increasingly inappropriate amounts of memory until it works
[16:07:39] groceryheist, hmm. Are you writing the data from hadoop to the local disks?
[16:08:42] using spark, i'm collecting to a pandas dataframe and then writing a local csv
[16:09:22] I'm using parquet as my intermediate file format
[16:09:57] maybe if I write hadoop tables and then used hive to get the data out it would be better?
[16:12:23] aha! Could you write the CSV without loading the whole thing into a giant dataframe?
[16:12:30] E.g. streaming write it as it comes out of hadoop?
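A minimal sketch of the streaming-write idea raised at 16:12, assuming the rows arrive from some lazy iterator rather than one large in-memory dataframe. The names `stream_rows_to_csv` and `fake_rows` are hypothetical and do not appear in the log; `fake_rows` only stands in for a real row source. With PySpark, `DataFrame.toLocalIterator()` is one such source, since it hands rows to the driver incrementally. This limits memory on the collecting side only and would not by itself address memory errors on the Spark workers.

```python
import csv
import os
import tempfile
from typing import Iterable, Iterator, Sequence, Tuple


def stream_rows_to_csv(rows: Iterable[Sequence], header: Sequence[str], path: str) -> int:
    """Write rows to a CSV file one at a time and return how many were written."""
    written = 0
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for row in rows:  # rows are consumed lazily, never gathered into a list
            writer.writerow(row)
            written += 1
    return written


def fake_rows(n: int) -> Iterator[Tuple[int, str]]:
    """Hypothetical stand-in for a lazy row source such as df.toLocalIterator()."""
    for i in range(n):
        yield (i, f"page_{i}")


if __name__ == "__main__":
    out_path = os.path.join(tempfile.mkdtemp(), "rows.csv")
    count = stream_rows_to_csv(fake_rows(1000), ["rev_id", "title"], out_path)
    print(f"wrote {count} rows to {out_path}")
```

The column names `rev_id` and `title` are illustrative only. Because the writer pulls from the iterator one row at a time, the process never needs to hold the whole result set at once.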
[16:13:10] the problem isn't memory on the notebook machine
[16:13:40] it's java.lang.OutOfMemoryError: GC overhead limit exceeded
[16:14:12] plus a mysterious error from spark that I interpret to mean the workers run out of memory
[16:14:37] I think upping memory on workers leads to the GC overhead
[16:14:49] so then I change the config to increase that too
[16:15:09] it's weird, but I've worked it out
[16:15:18] like i'm not blocked on it
[16:15:26] Well I am glad you have a solution, but it sounds backwards :|
[16:15:30] yeah
[16:15:32] it's not great
[16:15:54] maybe collecting as pandas is the problem
[16:15:54] I'm mostly in meetings today except for lunch (which I'm going to run to in a minute)
[16:16:51] But otherwise, I have time today to look at/think about stuff with you groceryheist. Let me know if there is anything in particular you want to bounce off of me.
[16:17:28] ok
[16:17:45] I hope to have some results worth talking about soon
[16:17:58] but it might be tomorrow your time unless you're around late today
[16:18:41] thanks!
[16:21:16] groceryheist, sounds good.
[16:21:41] I have a bike thing at 4PM PDT today so I probably won't be around late.
[16:21:54] ok
[16:22:02] * halfak heads out to lunch
[16:22:05] see you!
[20:18:45] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[20:45:19] PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[20:47:15] RECOVERY - puppet on ORES-worker02.experimental is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[20:57:03] accraze: are you available for backlog grooming in a few minutes or are you busy with onboarding? Same question for you Halfak
[20:58:12] I'm available. I dropped accraze an invite in case he might be available :)
[20:58:24] I think it's a good opportunity to talk through what's on our backlog :D
[20:58:27] i'll be there!
[20:58:31] Awesome
[21:03:00] harej, you coming?
[21:11:18] ORES, Scoring-platform-team (Research), Wikidata: Explore using ShEx to support ORES in Wikidata - https://phabricator.wikimedia.org/T225944 (Halfak)
[21:14:11] RECOVERY - puppet on ORES-redis02.experimental is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[21:15:46] Scoring-platform-team, editquality-modeling, revscoring, artificial-intelligence: Write proof-of-concept good-faith newcomer detection - https://phabricator.wikimedia.org/T141779 (Halfak) Related: {T211434}
[21:16:50] Scoring-platform-team, editquality-modeling, artificial-intelligence: Prototype a bot framework that utilizes newcomerquality - https://phabricator.wikimedia.org/T211434 (Halfak) p:Triage→Lowest
[21:22:52] ORES, Scoring-platform-team, Growth-Team, WMF-JobQueue, Wikimedia-production-error: Fatal error during RecentChange::notifyEdit (deferred update) from ORES/RecentChangeSaveHookHandler - https://phabricator.wikimedia.org/T225199 (Halfak) @krinkle, is this still happening? Is it causing a prob...
[21:25:37] Scoring-platform-team, WMF-Legal, Wikilabels: Add legal language to wikilabels - https://phabricator.wikimedia.org/T223313 (Halfak) On May 22nd, @DFoy said he'd set up a meeting with legal, but they were very busy at the time. I think we are still waiting on that.
[21:29:28] MediaWiki-extensions-ORES, Scoring-platform-team: Investigate spikes in threshold lookup requests in ORES-ext - https://phabricator.wikimedia.org/T218744 (Halfak) Open→Resolved a:Halfak It looks like we get a spike when we do a deployment and it corresponds to how many models were updated. S...
[21:34:12] ORES, Scoring-platform-team: Implement sentinel for ORES production Redis - https://phabricator.wikimedia.org/T122676 (Halfak) @akosiaris, I'm checking in on this task. Have we come to any conclusions about investing in Sentinel or not from the ops side of things? I.e., should we pick up this task or d...
[21:36:54] Jade, Scoring-platform-team: Regression: Judgment validation allows for multiple judgments with the same value e.g. 2x {damaging, badfaith} - https://phabricator.wikimedia.org/T210804 (Halfak) p:High→Normal
[21:38:46] Scoring-platform-team, Wikilabels: Information about finished campaigns should be accessible in Wikilabels - https://phabricator.wikimedia.org/T223899 (Halfak)
[21:43:24] ORES, Scoring-platform-team, Cloud-Services: ORES.wmflabs.org - ORES web node labs ores-web-02 - 502 Bad Gateway - https://phabricator.wikimedia.org/T220839 (Halfak) Open→Resolved a:Halfak I'm guessing this was a blip with the host that this VM was running on. It seems to have been runni...
[21:46:34] ORES, Scoring-platform-team, Continuous-Integration-Config: ORES wheels deployment repo Jenkins job should gate-and-submit - https://phabricator.wikimedia.org/T211041 (Halfak) This is probably something that #releng can help us with when we get there. https://gerrit.wikimedia.org/r/#/admin/project...
[21:52:15] ORES, Scoring-platform-team, Gerrit: Write a cookbook for the workaround for getting LFS to gerrit - https://phabricator.wikimedia.org/T226055 (Halfak)
[21:52:23] ORES, Scoring-platform-team, Gerrit: Write a cookbook for the workaround for getting LFS to gerrit - https://phabricator.wikimedia.org/T226055 (Halfak) p:Triage→High
[23:52:40] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues