[00:59:16] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[01:06:49] (PS3) Awight: [WIP] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297)
[01:07:26] Great. ^ basic machinery to update the summary columns passes my smoke tests
[01:09:34] (CR) jerkins-bot: [V: -1] [WIP] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297) (owner: Awight)
[01:10:26] (CR) jerkins-bot: [V: -1] [WIP] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297) (owner: Awight)
[01:27:16] RECOVERY - puppet on ORES-worker01.experimental is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[04:42:58] (PS4) Awight: [WIP] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297)
[04:43:00] (PS1) Awight: Catch DBError and return Status [extensions/JADE] - https://gerrit.wikimedia.org/r/476446
[04:43:04] (PS1) Awight: [WIP] Add indexes to filter by judgment value [extensions/JADE] - https://gerrit.wikimedia.org/r/476447
[04:45:10] (CR) jerkins-bot: [V: -1] [WIP] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297) (owner: Awight)
[04:45:12] (CR) jerkins-bot: [V: -1] [WIP] Add indexes to filter by judgment value [extensions/JADE] - https://gerrit.wikimedia.org/r/476447 (owner: Awight)
[04:50:22] (CR) jerkins-bot: [V: -1] Catch DBError and return Status [extensions/JADE] - https://gerrit.wikimedia.org/r/476446 (owner: Awight)
[04:51:10] (CR) jerkins-bot: [V: -1] [WIP] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297) (owner: Awight)
[04:54:21] (CR) jerkins-bot: [V: -1] Catch DBError and return Status [extensions/JADE] - https://gerrit.wikimedia.org/r/476446 (owner: Awight)
[06:37:40] Scoring-platform-team, DBA, MediaWiki-Database, Blocked-on-schema-change, and 2 others: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (Marostegui)
[06:38:09] PROBLEM - https://grafana.wikimedia.org/dashboard/db/ores grafana alert on icinga1001 is CRITICAL: CRITICAL: https://grafana.wikimedia.org/dashboard/db/ores is alerting: 5xx rate (Change prop) alert.
[08:13:55] ORES, Scoring-platform-team, Operations: ORES 500s since 2018-11-29 6:25 - https://phabricator.wikimedia.org/T210701 (fgiunchedi)
[08:40:03] Amir1: around ? ^
[08:40:38] akosiaris: in -operations now
[09:09:17] RECOVERY - https://grafana.wikimedia.org/dashboard/db/ores grafana alert on icinga1001 is OK: OK: https://grafana.wikimedia.org/dashboard/db/ores is not alerting.
[09:23:33] PROBLEM - https://grafana.wikimedia.org/dashboard/db/ores grafana alert on icinga1001 is CRITICAL: CRITICAL: https://grafana.wikimedia.org/dashboard/db/ores is alerting: 5xx rate (Change prop) alert.
[09:37:43] RECOVERY - https://grafana.wikimedia.org/dashboard/db/ores grafana alert on icinga1001 is OK: OK: https://grafana.wikimedia.org/dashboard/db/ores is not alerting.
[09:39:01] ORES, Scoring-platform-team, Operations, Release Pipeline (Blubber), Release-Engineering-Team (Backlog): Blubber should be able to make multi docker files per repo - https://phabricator.wikimedia.org/T210267 (zeljkofilipin)
[09:39:17] Scoring-platform-team, Wikibase-Containers, Wikidata, Wikilabels, and 2 others: Stretch in docker registry forces ascii encoding - https://phabricator.wikimedia.org/T210260 (zeljkofilipin)
[10:42:30] o/
[10:42:42] We had an outage today. Will write more
[11:10:53] Scoring-platform-team, Wikibase-Containers, Wikidata, Wikilabels, and 2 others: Stretch in docker registry forces ascii encoding - https://phabricator.wikimedia.org/T210260 (LarsWirzenius) Hm, this is strange now. This works on my host: env LC_ALL=C.UTF8 python3 -c "print('étoile')" This does n...
[11:11:37] Scoring-platform-team, Wikibase-Containers, Wikidata, Wikilabels, and 2 others: Stretch in docker registry forces ascii encoding - https://phabricator.wikimedia.org/T210260 (LarsWirzenius) This does report the C.UTF-8 locale as being available: docker run unitest locale -a
[11:44:20] Scoring-platform-team, Wikibase-Containers, Wikidata, Wikilabels, and 2 others: Stretch in docker registry forces ascii encoding - https://phabricator.wikimedia.org/T210260 (LarsWirzenius) C.UTF8 does not exist. In every other locale I try, a UTF8 suffix is an alias to the UTF-8 suffix (with the...
[12:02:29] Scoring-platform-team, Wikibase-Containers, Wikidata, Wikilabels, and 2 others: Stretch in docker registry forces ascii encoding - https://phabricator.wikimedia.org/T210260 (LarsWirzenius) If we want to set a default locale for images built by Blubber, we can set LC_ALL or LC_CTYPE. The former ov...
[12:27:39] ORES, Scoring-platform-team (Current), Puppet, User-Ladsgroup, Wikimedia-Incident: ORES services should bind to ores config files - https://phabricator.wikimedia.org/T210719 (Ladsgroup)
[12:30:02] ORES, Scoring-platform-team, Operations, Puppet, Wikimedia-Incident: Logrotate should restart services when more people are around - https://phabricator.wikimedia.org/T210720 (Ladsgroup)
[12:39:11] Scoring-platform-team, Operations, Release-Engineering-Team: Contact number of some WMDE staff should be avalible to SRE/RelEng - https://phabricator.wikimedia.org/T210721 (Ladsgroup)
[12:55:44] Scoring-platform-team, Operations, Release-Engineering-Team: Contact number of some WMDE staff should be avalible to SRE/RelEng - https://phabricator.wikimedia.org/T210721 (WMDE-leszek) a:WMDE-leszek I take it on me. I've briefly talked about this topic with @greg during Technical Conference. We'...
[12:55:51] Scoring-platform-team, Operations, Release-Engineering-Team: Contact number of some WMDE staff should be avalible to SRE/RelEng - https://phabricator.wikimedia.org/T210721 (WMDE-leszek) p:Triage>High
[13:25:12] Scoring-platform-team, Wikibase-Containers, Wikidata, Wikilabels, and 2 others: Stretch in docker registry forces ascii encoding - https://phabricator.wikimedia.org/T210260 (akosiaris) >>! In T210260#4784708, @LarsWirzenius wrote: > C.UTF8 does not exist. In every other locale I try, a UTF8 suffi...
[13:27:50] ORES, Scoring-platform-team, Operations, Puppet, Wikimedia-Incident: Logrotate should restart services when more people are around - https://phabricator.wikimedia.org/T210720 (akosiaris) I am afraid we can't really change it. It's been at 06:25am (UTC in our case) forever and people expect th...
[15:03:45] o/ Amir1
[15:03:52] hey
[15:04:51] What was the outage about?
[15:05:16] So do you remember I updated the configs to celery4 type configs?
[15:05:23] Yeah.
[15:05:36] I added all to puppet, then removed them from ores and ores-deploy repos
[15:05:55] then changed in the puppet and then added them back to ores repos
[15:06:50] I dropped those changes in puppet assuming the one in ores is enough but it wasn't because it's under local_celery and not the scoring_system we use in prod
[15:07:07] so it fell back to default of celery4 which is "json"
[15:07:08] woops!
[15:07:43] the big problem is that this patch was merged in puppet 24 hours ago but didn't cause any issue because it didn't restart the services
[15:07:44] How long were we down?
[15:08:00] what caused the restart? logrotate at 6 am UTC
[15:08:36] 2:30 hours until I woke up, checked IRC cloud on my phone, told people to revert the puppet patch and restart services
[15:08:49] I brought the service back up when I was in bed half asleep :D
[15:09:36] Yikes. I missed the ping. For some reason it didn't make it to my phone. I need to figure out why.
[15:10:03] two good things: it went down when the load is at its lowest, and it didn't bring down Wikipedia
[15:13:50] Both good things.
[15:51:09] Aha! It looks like I have never received an icinga ping from production.
[15:51:39] I'm sure I must have decided that at one point in the past, but I changed my mind. I should get these pings.
[15:51:57] Amir1, do you know where icinga contacts are configured?
[15:52:27] halfak: mostly in puppet monitoring
[15:52:36] and the private repo
[15:52:52] Daniel Zahn is the expert on this
[15:53:16] I'll make a task
[15:54:21] Scoring-platform-team, Icinga: Add ahalfaker to ORES-related icinga contacts - https://phabricator.wikimedia.org/T210742 (Halfak)
[15:55:03] Scoring-platform-team, Icinga: Add ahalfaker to ORES-related icinga contacts - https://phabricator.wikimedia.org/T210742 (Halfak) @Dzahn word on the street is that you can help me figure out how to do this :)
[15:55:13] Thanks Amir1
[16:00:17] thank you!
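Editor's note on the outage explanation at 15:06:50 to 15:07:07: a minimal sketch of how a setting stored under one config section is invisible to another section and silently falls back to a library default. It does not use Celery or the real ORES config; the section name `prod_celery`, the key layout, and the "pickle" value are assumptions made for illustration. Only the name `local_celery` and the Celery 4 default of "json" come from the discussion.

```python
# Illustrative only: not the ORES config loader.
# Celery 4 defaults task_serializer to "json", as noted in the chat.
CELERY4_DEFAULT_SERIALIZER = "json"


def resolve_serializer(config, scoring_system):
    """Return the serializer the named scoring system would end up with."""
    section = config.get("scoring_systems", {}).get(scoring_system, {})
    # A missing key does not raise; it quietly yields the default.
    return section.get("task_serializer", CELERY4_DEFAULT_SERIALIZER)


config = {
    "scoring_systems": {
        # The override exists here ("pickle" is an assumed value) ...
        "local_celery": {"task_serializer": "pickle"},
        # ... but the section used in production (hypothetical name) lacks it.
        "prod_celery": {},
    }
}

for name in ("local_celery", "prod_celery"):
    print(name, "->", resolve_serializer(config, name))
```

Because the fallback is silent, nothing fails at merge time; the difference only shows once the services restart and re-read their config, which matches the 24-hour delay described at 15:07:43.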
[16:04:55] Scoring-platform-team, Icinga, Operations: Add ahalfaker to ORES-related icinga contacts - https://phabricator.wikimedia.org/T210742 (Dzahn)
[16:08:57] I need to leave for some stuff, will work a little bit on the bus to Poland
[16:09:08] and will be remote all of Friday
[16:09:21] oh I'm already remote. I mean remote from another country
[16:43:50] Scoring-platform-team (Current): check for multicollinearity in newcomer LogReg model - https://phabricator.wikimedia.org/T210751 (notconfusing)
[17:41:47] ORES, Scoring-platform-team (Current), Operations, vm-requests: New node request: oresrdb[12]003 - https://phabricator.wikimedia.org/T210582 (akosiaris)
[17:41:48] Buffer 12 is empty.
[17:42:04] AsimovBot, wat
[17:42:05] Error: Command “wat” not recognized. Please review and correct what you’ve written.
[17:48:24] wikimedia/wikilabels#440 (master - 1e89e6b : translatewiki.net): The build was broken. https://travis-ci.org/wikimedia/wikilabels/builds/461380018
[20:40:02] (PS5) Awight: Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297)
[20:40:04] (PS2) Awight: [WIP] Add indexes to filter by judgment value [extensions/JADE] - https://gerrit.wikimedia.org/r/476447
[20:46:58] (CR) jerkins-bot: [V: -1] [WIP] Add indexes to filter by judgment value [extensions/JADE] - https://gerrit.wikimedia.org/r/476447 (owner: Awight)
[20:47:37] (CR) jerkins-bot: [V: -1] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297) (owner: Awight)
[21:07:47] wikimedia/articlequality#111 (glwiki - 537b85b : Aaron Halfaker): The build was fixed.
https://travis-ci.org/wikimedia/articlequality/builds/461481683
[21:10:54] awight, https://github.com/wikimedia/articlequality/pull/70 is ready for you again
[21:11:35] great
[21:11:47] What shall we do about the itwiki model?
[21:13:02] (PS2) Awight: Catch DBError and return Status [extensions/JADE] - https://gerrit.wikimedia.org/r/476446
[21:13:04] (PS6) Awight: Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297)
[21:13:06] (PS3) Awight: [WIP] Add indexes to filter by judgment value [extensions/JADE] - https://gerrit.wikimedia.org/r/476447
[21:18:08] (CR) jerkins-bot: [V: -1] [WIP] Add indexes to filter by judgment value [extensions/JADE] - https://gerrit.wikimedia.org/r/476447 (owner: Awight)
[21:19:56] awight, what do you mean?
[21:21:26] (CR) jerkins-bot: [V: -1] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297) (owner: Awight)
[21:22:39] halfak: The editquality patch I merged yesterday included the model but not as LFS, T210678
[21:22:39] T210678: Accidentally merged models in editquality; need to rewrite history - https://phabricator.wikimedia.org/T210678
[21:23:05] awight, arg. When we "converted to LFS" we missed some of the config it seems.
[21:23:16] Let's see if I can amend this directly.
[21:23:34] Scoring-platform-team: Accidentally merged models in editquality; need to rewrite history - https://phabricator.wikimedia.org/T210678 (awight)
[21:23:39] :100%:
[21:23:59] (CR) jerkins-bot: [V: -1] Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297) (owner: Awight)
[21:24:12] I wonder what that'll do to downstream observers.
[21:26:54] hopefully no one has pulled in a bit
[21:27:05] and we don't break any mirrors :|
[21:34:03] (PS7) Awight: Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297)
[21:34:05] (PS4) Awight: Add indexes to filter by judgment value [extensions/JADE] - https://gerrit.wikimedia.org/r/476447 (https://phabricator.wikimedia.org/T200297)
[21:34:17] Thanks for mopping!
[21:40:10] OK I think this is right now.
[21:40:40] yes! IT WORKS
[21:41:08] * awight extinguishes stray sparks
[21:41:31] JADE, Scoring-platform-team (Current), DBA, Operations, and 2 others: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel) We discussed this again in the TechCom meeting the other day. If DBAs are ok with not just the new f...
[21:53:53] JADE, Scoring-platform-team (Current), DBA, Operations, and 2 others: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) @Marostegui Hello! I've added a few summary columns and indexes to the link tables, and the resulti...
[22:01:02] halfak: I intuitively feel like a system for dividing Wikipedia into topic areas for curation purposes is something worth focusing on for next fiscal year. Which is frustrating because intuitions are difficult to communicate. What do you think?
[22:01:22] The way I see it, the work of our team is to help projects scale, and part of scaling a project like English Wikipedia is being able to divide it into manageable chunks.
[22:03:59] I agree harej. However, I'm worried about the different technological needs of maintaining a massive index. In the mid-term, we can continue to work on classifier-based topic modeling.
[22:04:11] In the long term, we can aim for a better big-index style solution.
[22:04:45] Generally, I want this topic-based view to be a strategic priority in 3-5 year planning.
[22:05:27] Speaking of which, I was going to ask you to work on a topic idea for discussion at our offsite. Want to dig into this a bit more.
[22:05:39] What would you like me to investigate?
[22:06:02] Whatever you think is relevant to "a system for dividing Wikipedia into topic areas for curation purposes"
[22:06:25] I was originally going to throw "organizing the tool devs" at you, but this seems like it's got you fired up :)
[22:09:05] Organizing the tool devs is also (half of) my bread and butter
[22:11:52] Technical Engagement is very much about organizing the tool devs, though it's a long term project :)
[22:13:54] halfak: What specifically should I bring to Berlin? A proposal for a project? A list of questions we need to ask?
[22:14:30] Think about it for a few hours. Read and talk to people as you see fit. Bring whatever you've got.
[22:14:40] eventually I want us to write an essay. Nothing fancy.
[22:14:45] Think user-space essay :)
[22:15:15] Makes sense. I'll see what I can do.
[22:15:16] "We should invest in topics. Here's why: "
[22:15:39] I'm going to be working on "Here's how we identify and prioritize projects on the SP team."
[22:15:47] Ooh, that sounds useful.
[22:16:12] Essentially, I want to address the question of how we mature from an on-demand modeling team to having foresight.
[22:16:19] I hope to get you some ammunition. :)
[22:16:52] harej: FYI the summary column features are coded and ready for review :)
[22:17:17] Yup! I'm happy Daniel will let us go to Last Call once this thing is sorted out.
[22:17:27] Now we just need Manuel to render his opinions.
[22:18:08] https://sims.fandom.com/wiki/Reticulating_splines
[22:20:29] ^ +1 feature request
[22:20:41] hahaha!
[22:21:36] When we're building a model, the progress bar should note "biasing the algorithms", "reducing the humans to numbers", "hiding assumptions", etc.
[22:21:47] + Konami code to bypass PoolCounter
[22:21:54] haha yes
[22:23:52] halfak: off-topic, HANDY is the opposite of agent-based, it's a handful of high-level feedback equations.
[22:26:26] They updated to a much more sophisticated model, but it still works at the systems flow level, http://www2.physics.umd.edu/~yakovenk/papers/2016.Motesharrei.NatSciRev.3.470.pdf
[22:29:31] Oh yes.
[22:29:37] Pile-o-math
[22:30:34] JADE, Scoring-platform-team: Regression: Judgment validation allows for multiples e.g. {damaging, badfaith} - https://phabricator.wikimedia.org/T210804 (awight)
[22:30:43] I feel like, in the far future, when skynet takes over the world, this chat is going to be used against me in a court of the human resistance force.
[22:34:24] OK I'm out of here. I'll be available if you need something quick from me tomorrow.
[22:36:05] But I'll be mostly AFK to catch up on house work
[22:36:09] o/
[22:40:30] (PS3) Awight: Catch DBError and return Status [extensions/JADE] - https://gerrit.wikimedia.org/r/476446
[22:40:32] (PS8) Awight: Summarize preferred judgment values in link table [extensions/JADE] - https://gerrit.wikimedia.org/r/475932 (https://phabricator.wikimedia.org/T200297)
[22:40:34] (PS5) Awight: Add indexes to filter by judgment value [extensions/JADE] - https://gerrit.wikimedia.org/r/476447 (https://phabricator.wikimedia.org/T200297)
[22:40:36] (PS1) Awight: Throw a fit and s/JADE/Jade/ [extensions/JADE] - https://gerrit.wikimedia.org/r/476771
[22:40:53] kbye!
[22:45:00] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/ORES] - https://gerrit.wikimedia.org/r/476700 (owner: L10n-bot)
[23:11:00] harej: Note that I didn't write a migration for the pre-release schema--you'll have to drop the jade_diff_judgment and jade_revision_judgment tables on your labs jade.
[23:14:05] Are there any side effects to just keeping those tables intact?
[23:30:31] harej: no they have to be dropped in order to trigger the addExtensionTable from update.php
[23:31:08] The link table values can be repopulated by maintenance script, fwiw
[23:50:54] JADE, Scoring-platform-team: Regression: Judgment validation allows for multiple judgments with the same value e.g. 2x {damaging, badfaith} - https://phabricator.wikimedia.org/T210804 (awight)
[23:59:16] PROBLEM - puppet on ORES-worker01.experimental is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues