[06:28:57] congrata
[06:29:00] congrats
[06:36:04] PROBLEM - check load on ORES-web02.Experimental is CRITICAL: connect to address 172.16.6.234 port 5666: Connection refusedconnect to host ores-web-02.ores.eqiad.wmflabs port 5666: Connection refused
[06:36:43] PROBLEM - check users on ORES-web02.Experimental is CRITICAL: connect to address 172.16.6.234 port 5666: Connection refusedconnect to host ores-web-02.ores.eqiad.wmflabs port 5666: Connection refused
[06:37:08] PROBLEM - check disk on ORES-web02.Experimental is CRITICAL: connect to address 172.16.6.234 port 5666: Connection refusedconnect to host ores-web-02.ores.eqiad.wmflabs port 5666: Connection refused
[06:37:39] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: connect to address 172.16.6.234 port 5666: Connection refusedconnect to host ores-web-02.ores.eqiad.wmflabs port 5666: Connection refused
[06:37:54] PROBLEM - check disk on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refusedconnect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:38:35] PROBLEM - check load on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refusedconnect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:39:05] PROBLEM - check users on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refusedconnect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:42:23] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refusedconnect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[07:16:43] RECOVERY - check users on ORES-web02.Experimental is OK: USERS OK - 1 users currently logged in
[07:16:54] RECOVERY - puppet on ORES-web02.Experimental is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[07:17:08] RECOVERY - check disk on ORES-web02.Experimental is OK: DISK OK
[07:18:04] RECOVERY - check load on ORES-web02.Experimental is OK: OK - load average: 0.10, 0.11, 0.15
[07:22:35] RECOVERY - check load on ORES-web01.Experimental is OK: OK - load average: 0.16, 0.07, 0.09
[07:23:05] RECOVERY - check users on ORES-web01.Experimental is OK: USERS OK - 1 users currently logged in
[07:23:54] RECOVERY - check disk on ORES-web01.Experimental is OK: DISK OK
[07:26:20] RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[15:19:57] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:22:40] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:27:23] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:43:37] PROBLEM - puppet on ORES-redis02.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:57:44] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:06:42] Async Update --
[16:06:53] Y: Talked w/ Core Platform about MW schema changes for Jade. Also attempted to deploy ORES, but wound up debugging my ssh config instead.
[16:07:17] T: Actually deploy ORES and retrain huwiki model. Also more looking at Jade extension code this afternoon.
[16:07:25] Ooh I like this protocol
[16:07:44] Y: Mostly worked on email, thesis, and a bit of phab cleanup
[16:08:54] More Y: Reached out to new volunteer. Still looking for a good place to start. Thinking of evaluation statistics code.
[16:09:30] T: Emergency review work for GROUP 2020 (journal paper), Tech management & Roadmap stuff, Deploy ORES!
[16:43:59] RECOVERY - puppet on ORES-worker02.experimental is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[16:46:44] RECOVERY - puppet on ORES-redis02.experimental is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[16:51:00] RECOVERY - puppet on ORES-web02.Experimental is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[16:55:23] RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[16:57:27] RECOVERY - puppet on ORES-worker01.experimental is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:00:37] accraze, deploy time!
[17:01:28] cool let's do it!
[17:01:38] Call when ready
[17:01:49] And join #wikimedia-operations
[17:09:43] Scoring-platform-team, Operations, SRE-Access-Requests: Add accraze to deployment and deploy-service groups. - https://phabricator.wikimedia.org/T228191 (Halfak)
[17:10:31] Scoring-platform-team, Operations, SRE-Access-Requests: Add accraze to deployment and deploy-service groups. - https://phabricator.wikimedia.org/T228191 (Halfak) I approve of Andy being added to these groups. This will allow him to log into deployment.eqiad.wmnet and do ORES deployments. I missed...
[18:45:02] Scoring-platform-team, Operations, SRE-Access-Requests, Patch-For-Review: Add accraze to deployment and deploy-service groups. - https://phabricator.wikimedia.org/T228191 (Halfak) These groups not added as part of {T226204}. I've updated https://www.mediawiki.org/wiki/Wikimedia_Scoring_Platform_...
[19:29:19] Finally digging into the travis-based release strategy
[19:29:31] Looks pretty good. I'm working on figuring out travis encrypt now
[19:36:27] Jade, Scoring-platform-team (Current), DBA, Operations, and 2 others: Review Jade data storage and architecture proposal [RFC] - https://phabricator.wikimedia.org/T200297 (Halfak)
[19:37:08] Jade, Scoring-platform-team (Current), DBA, Operations, and 2 others: Review Jade data storage and architecture proposal [RFC] - https://phabricator.wikimedia.org/T200297 (Halfak) I've renamed this to something that I can feel better about closing. Thank you all for your hard work -- especially...
[19:37:40] Scoring-platform-team (Current), revscoring, artificial-intelligence: Upgrade numpy, scipy, sklearn - https://phabricator.wikimedia.org/T227023 (Halfak) Open→Resolved
[19:37:42] Scoring-platform-team, editquality-modeling, artificial-intelligence: Deploy MSFT editquality model - https://phabricator.wikimedia.org/T227024 (Halfak)
[19:37:45] Scoring-platform-team (Current), Wikilabels, editquality-modeling, artificial-intelligence: Re-label huwiki damaging and badfaith edits - https://phabricator.wikimedia.org/T223882 (Halfak) Open→Resolved
[19:37:49] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Retrain damaging/goodfaith models for huwiki - https://phabricator.wikimedia.org/T228078 (Halfak)
[19:38:03] Jade, Scoring-platform-team (Current), DBA, Operations, and 2 others: Review Jade data storage and architecture proposal [RFC] - https://phabricator.wikimedia.org/T200297 (Halfak) Open→Resolved a:Halfak
[19:38:07] Jade, Scoring-platform-team (Current), DBA, Operations, and 2 others: [Epic] Extension:JADE scalability concerns - https://phabricator.wikimedia.org/T196547 (Halfak)
[19:48:24] accraze, I'm looking at this "travis encrypt" command. I'm struggling to see how it makes sense to commit an encrypted password for PyPI
[19:49:00] Can't the encrypted text just be re-used?
[19:49:10] yeah agreed, probably a better way is to use an envar stored on travis
[19:50:09] Oh interesting. Can we do that safely?
[19:50:49] i believe so - will double check in a bit, about to go grab some lunch
[19:51:56] yeah you can define them in the repository settings: https://docs.travis-ci.com/user/environment-variables/#defining-variables-in-repository-settings
[19:55:47] * accraze goes foraging
[20:23:47] Oh cool. that makes sense.
[20:24:25] I think we might want to set up a pypi account for this just in case a bus hits me.
[20:42:00] Scoring-platform-team (Current), articlequality-modeling, draftquality-modeling, drafttopic-modeling, and 3 others: Decide on criteria for releases/versioning of model repos - https://phabricator.wikimedia.org/T228215 (Halfak)
[20:43:32] Scoring-platform-team (Current), articlequality-modeling, draftquality-modeling, drafttopic-modeling, and 3 others: Decide on criteria for releases/versioning of model repos - https://phabricator.wikimedia.org/T228215 (Halfak) https://github.com/wikimedia/editquality/pull/206 -- Here's a proposa...
[21:36:03] ORES, Scoring-platform-team, Growth-Team, MediaWiki-extensions-WikibaseClient, and 3 others: ORES/ChangesListHooksHandlerTest causing build failures in other repos (e.g. UploadWizard) - https://phabricator.wikimedia.org/T224672 (kostajh) Open→Resolved > We'll check it on our end but like...
[21:40:41] Scoring-platform-team, Growth-Team, editquality-modeling, artificial-intelligence: Update RC Filters for new ORES capacities (July, 2019) - https://phabricator.wikimedia.org/T227094 (kostajh) >> Hey @Halfak is the expectation the > Yes, that's right. @Halfak could you please clarify which aspec...
[21:41:15] Scoring-platform-team, Growth-Team: Update srwiki thresholds for goodfaith model - https://phabricator.wikimedia.org/T223273 (kostajh) Tentatively moving to Q2, but @Halfak if you need it sooner please let us know.
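A minimal sketch of the approach discussed at 19:49-19:51, assuming the PyPI credentials are defined as environment variables in the Travis repository settings instead of being committed in encrypted form. The variable names PYPI_USERNAME and PYPI_PASSWORD and both helper functions are hypothetical; none of them come from the editquality repository or from Travis itself.

```python
import os
from typing import Mapping, NamedTuple


class PyPICredentials(NamedTuple):
    username: str
    password: str


def load_pypi_credentials(env: Mapping[str, str] = os.environ) -> PyPICredentials:
    """Read upload credentials from environment variables.

    PYPI_USERNAME and PYPI_PASSWORD are hypothetical names. The assumption is
    that they are defined in the CI repository settings, so nothing secret is
    stored in the repository.
    """
    required = ("PYPI_USERNAME", "PYPI_PASSWORD")
    # Treat unset and empty variables the same way: both are unusable.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return PyPICredentials(env["PYPI_USERNAME"], env["PYPI_PASSWORD"])


def masked(secret: str) -> str:
    """Return a fixed-width placeholder so the secret never reaches build logs."""
    return "*" * 8 if secret else ""
```

A release script could call load_pypi_credentials() once at startup and fail fast when a variable is absent, and log only masked(creds.password). Using a dedicated shared PyPI account for these variables would also cover the "bus" concern raised at 20:24:25.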
[21:43:36] Scoring-platform-team, Edit-Review-Improvements-Integrated-Filters, Growth-Team, editquality-modeling, artificial-intelligence: Deploy ORES filters for jawiki - https://phabricator.wikimedia.org/T225563 (kostajh) Tentatively scheduling for Q2; if you need it sooner please let us know.
[21:43:55] Scoring-platform-team, Edit-Review-Improvements-Integrated-Filters, Growth-Team, editquality-modeling, and 2 others: Deploy ORES filters for zhwiki - https://phabricator.wikimedia.org/T225562 (kostajh) Tentatively scheduling for Q2; if you need it sooner please let us know.
[21:44:28] Scoring-platform-team, Edit-Review-Improvements-Integrated-Filters, Growth-Team, editquality-modeling, artificial-intelligence: Update ORES thresholds for nlwiki - https://phabricator.wikimedia.org/T225561 (kostajh) Tentatively scheduling for Q2; if you need it sooner please let us know.
[21:46:22] Scoring-platform-team, Growth-Team: Update srwiki thresholds for goodfaith model - https://phabricator.wikimedia.org/T223273 (Halfak) Hmm. Considering that the srwiki community did a bunch of work to help fix this, it seems like it might be important to make it available to them soon. Getting this done...
[21:50:05] hi halfak! I batched a bunch of the ORES updates for Q2. is srwiki the only one you'd like done soon-ish or do the others fall into that category as well?
[21:50:16] Scoring-platform-team, Growth-Team, editquality-modeling, artificial-intelligence: Update RC Filters for new ORES capacities (July, 2019) - https://phabricator.wikimedia.org/T227094 (Halfak) @kostajh, that's a good question. I suspect that @Catrope knows best. The RC Filters code isn't on our p...
[21:50:35] hey kostajh!
[21:50:36] * halfak thinks.
[21:50:44] we unfortunately do have a lot on our schedule, plus engineers transitioning teams + summer holidays & wikimania, etc etc :\
[21:51:09] Right. That makes sense.
[21:51:30] So we had srwiki, zhwiki, jawiki, and nlwiki, right?
[21:52:06] I think the nlwiki model is probably basically the same. I had a zhwiki-pedian do a bunch of work for that one so I think it falls into the same camp as srwiki
[21:52:06] those are the ones I see, yes
[21:52:11] k
[21:52:24] But jawiki is less urgent. I think mostly people are using the API directly -- at least as far as I can tell.
[21:52:42] I mean, if we do one of them soon we should probably do all of them as there's a fair amount of mental energy to get into the right headspace to set up the configuration
[21:52:58] speaking for myself personally, anyway
[21:53:06] Yeah, that's my thought to.
[21:53:09] *too
[21:53:19] I don't think this will be much work, honestly.
[21:53:36] and for T227094, it's not so much new features/capabilities as it is setting up the configuration or updating the configuration for those four wikis
[21:53:37] For some of these, RoanKattouw might just say, "Oh, that already fixed itself and we don't need to do anything."
[21:53:37] T227094: Update RC Filters for new ORES capacities (July, 2019) - https://phabricator.wikimedia.org/T227094
[21:53:47] Right.
[21:54:00] got it. OK, I misread that
[21:54:01] Just config updates -- matching it to the product's intention/use.
[21:54:36] k
[21:55:15] let me check in with others about scheduling after I'm done with triaging
[21:55:52] Makes sense. Thanks for taking this up and getting it triaged :)
[21:56:20] Maybe some time we can have a conversation about "maintenance mode" and inheriting products over a beer ;)
[22:01:39] ha. sounds good :)
[22:02:16] Scoring-platform-team, Growth-Team: Update srwiki thresholds for goodfaith model - https://phabricator.wikimedia.org/T223273 (kostajh) @Halfak and I discussed on IRC; I will come back to this after finishing triage this week.
[22:25:52] wikimedia/editquality#605 (huwiki-retrain - c67d2d6 : Andy Craze): The build passed. https://travis-ci.org/wikimedia/editquality/builds/559695228
[22:54:37] halAFK