[03:16:43] DBA, MediaWiki-Logging: Investigation: Using log_search to query for logged actions against IPs in a given range - https://phabricator.wikimedia.org/T187584 (Krinkle)
[06:28:48] DBA, MediaWiki-Database, Patch-For-Review: Drop blob_tracking and blob_orphans everywhere - https://phabricator.wikimedia.org/T59186 (tstarling)
[07:30:34] DBA, Operations, ops-codfw, Patch-For-Review: Upgrade pc2004 and pc2005 BIOS - https://phabricator.wikimedia.org/T201387 (Marostegui) @Papaul as per our chat yesterday, this host is now depooled, silenced and MySQL is stopped. You are good to go to update the BIOS whenever you arrive at the DC!...
[09:24:26] DBA, Operations, ops-eqiad: Disk #9 with errors on db1068 (s4 master) - https://phabricator.wikimedia.org/T201493 (Marostegui)
[09:24:58] DBA, Operations, ops-eqiad: Disk #9 with errors on db1068 (s4 master) - https://phabricator.wikimedia.org/T201493 (Marostegui) p:Triage>Normal
[09:54:17] It is funny that almost no core host has repl@10.% even though that is what puppet defines as the repl grant
[09:54:35] Only mX hosts do, as you did them from "scratch"
[09:54:43] So they are in a better state
[09:55:03] 10.% may be a little wide
[09:55:17] probably should be the X.X.% subnets
[09:55:19] I am fine with 10.64 and 10.192 actually
[09:55:59] then we should change it in production.sql.erb
[09:56:05] to have consistency
[09:56:06] yes
[09:57:53] Bug #25928471: ONLINE ALTER AND CONCURRENT DELETE ON TABLE WITH MANY TEXT COLUMNS CAUSES CRASH
[09:58:07] sounds familiar?
[09:58:29] It has been fixed?
[09:59:52] I am going to package 10.1.35
[10:02:06] https://gerrit.wikimedia.org/r/#/c/451273/
[10:02:59] not to be done now, but probably we should make those grant files in a single line for easier grepping
[10:03:31] yeah, indeed
[10:03:48] and well, fix many other related things
[10:03:50] but one at a time
[11:00:48] I depooled pc2004, and I am thinking that maybe I can also depool pc2005 at the same time really, nothing uses it
[11:00:53] Any objection?
[11:01:04] it is for T201387
[11:01:05] T201387: Upgrade pc2004 and pc2005 BIOS - https://phabricator.wikimedia.org/T201387
[11:01:21] I had a chat with papaul and he wants to do them today
[11:01:51] And I think it is better to get them depooled at the same time to avoid having to wait for the train to finish and all that, as it is likely it will be at the same hour
[11:04:55] so I said before disconnecting
[11:05:05] [13:01] sure, but keep it working (pointing to the active)
[11:05:07] [13:01] even if it is only 1
[11:05:08] [13:02] it is touched for monitoring purposes
[11:05:10] [13:02] and pybal icinga will complain
[11:05:11] [13:02] *and
[11:06:15] yeah, I will point everything to the same one
[11:09:10] https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/451293/
[11:10:43] see my comment
[11:10:50] I am not depooling pc2006
[11:10:56] oh
[11:11:01] only pc2005
[11:11:05] pc2004 was depooled before
[11:11:19] I am pointing pc2004 and 2005 to pc2006
[11:11:22] oh, sorry
[11:11:33] the comments were confusing to me
[11:11:39] ok +1
[11:11:42] \o/
[11:11:43] Thanks
[11:12:51] well, you should have mentioned the change of pc2004, but minor issue
[11:13:11] it was just amending a previous commit to clarify it
[11:14:14] I am going to do a rolling restart of labsdbs
[11:14:24] to upgrade mysql?
[11:14:35] yes
[11:14:40] cool
[11:15:05] kernel, etc.
[11:15:35] cool, I have also upgraded kernels on pc200X as they had to be rebooted
[11:15:54] DBA, Operations, ops-codfw, Patch-For-Review: Upgrade pc2004 and pc2005 BIOS - https://phabricator.wikimedia.org/T201387 (Marostegui) @Papaul pc2005 is also depooled and with MySQL down. You can upgrade pc2004 and pc2005 at the same time. Thanks!
[11:16:11] you can upgrade mysql on pc2* as they are not too critical
[11:16:21] but I didn't want to do any until tested on another host
[11:18:28] they are running 0.34 already
[11:19:03] pc are still on jessie
[11:19:09] ah
[11:19:26] yes, we were waiting for the hw upgrade
[11:19:32] then no need to touch them
[11:26:04] :)
[11:29:24] I am going to kill a screen of yours on labsdb1009, it seems to be a manual execution of check_private data, successful
[11:30:02] go for it!
[11:30:11] reminder to set up the query killer (I always forget!)
[11:30:26] I just noticed the /home/marostegui/pt-kill-patched
[11:30:34] I will try to productionize it
[11:30:51] yeah, still running that because the new version with the patch hasn't been released
[11:30:54] will call it pt-kill-wikimedia
[11:31:08] like the pt-heartbeat-wikimedia
[11:31:14] and document it
[11:32:34] also I may have to take 3 more coffees to let mysql_upgrade run
[11:32:51] XDDDDDDD
[11:36:37] https://lefred.be/content/php-7-2-8-mysql-8-0/
[11:37:27] the new authentication is causing lots of headaches because people are not running connectors that support it
[11:37:48] I wanted to backport pymysql 0.8.0 for that (and other reasons) from testing
[12:01:12] DBA, monitoring, Patch-For-Review, Wikimedia-Incident: Monitor read_only variable and/or uptime on database masters, make it page - https://phabricator.wikimedia.org/T172489 (Marostegui) I think we should not include the monitoring of the uptime on this task, but just the read_only scope. Don't t...
[12:03:31] DBA, monitoring, Patch-For-Review, Wikimedia-Incident: Monitor read_only variable and/or uptime on database masters, make it page - https://phabricator.wikimedia.org/T172489 (jcrespo) > I think we should not include the monitoring of the uptime on this task, but just the read_only scope. Who sai...
[12:04:47] DBA, monitoring, Patch-For-Review, Wikimedia-Incident: Monitor read_only variable and/or uptime on database masters, make it page - https://phabricator.wikimedia.org/T172489 (Marostegui) "jcrespo renamed this task from Monitor read_only variable and/or uptime on atabase masters, make it page to M...
[12:06:17] DBA, monitoring, Patch-For-Review, Wikimedia-Incident: Monitor read_only variable and/or uptime on database masters, make it page - https://phabricator.wikimedia.org/T172489 (jcrespo) I think the task should be renamed to "Monitor read_only on all databases, make it page for masters"
[12:07:04] so wikireplicas are actually in read_write
[12:07:14] DBA, monitoring, Patch-For-Review, Wikimedia-Incident: Monitor read_only variable and/or uptime on database masters, make it page - https://phabricator.wikimedia.org/T172489 (Marostegui) That is fine by me - I just wanted to give my opinion about what the current task description says.
[12:07:15] not sure if on purpose
[12:08:00] maybe the create view scripts lack SUPER?
[12:09:27] No, maintainviews should have SUPER as far as I remember
[12:09:50] Grants for maintainviews@localhost |
[12:09:53] +----------------------------------------------------------------------------------------------------------------------+
[12:09:56] | GRANT SUPER ON *.* TO 'maintainviews'@'localhost'
[12:09:57] so the config says read_only => ON
[12:10:03] but they are 0 on file
[12:11:09] read_only=0 is hardcoded at labsdb-replica.my.cnf.erb
[12:11:30] I guess a leftover from labsdb1001/3 migration?
[12:11:37] blame says Jaime Crespo 2016-07-26 11:28:52 +0200
[12:11:44] xdddd
[12:11:49] not touched after 2016
[12:12:05] so I guess intended to be 1? not sure
[12:12:13] Yeah, make it 1
[12:12:24] I will make it configurable
[12:12:39] and add cloud people
[12:12:46] so they can evaluate
[12:57:45] DBA, monitoring, Patch-For-Review, Wikimedia-Incident: Monitor read_only on all databases, make it page on masters - https://phabricator.wikimedia.org/T172489 (jcrespo)
[13:26:47] DBA, Operations, ops-codfw, Patch-For-Review: Upgrade pc2004 and pc2005 BIOS - https://phabricator.wikimedia.org/T201387 (Marostegui)
[13:34:37] DBA, Data-Services, Toolforge, Tracking: Certain tools users create multiple long running queries that take all memory and/or CPU from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601 (jcrespo)
[13:39:49] DBA, Operations, decommission, ops-codfw: db2064 crashed and totally broken - decommission it - https://phabricator.wikimedia.org/T195228 (Papaul)
[13:49:21] DBA, Operations, ops-codfw, Patch-For-Review: Upgrade pc2004 and pc2005 BIOS - https://phabricator.wikimedia.org/T201387 (Papaul)
[13:49:55] DBA, Operations, ops-codfw, Patch-For-Review: Upgrade pc2004 and pc2005 BIOS - https://phabricator.wikimedia.org/T201387 (Papaul) Open>Resolved @Marostegui complete closing the task. Thanks
[13:50:35] DBA, Operations, ops-codfw, Patch-For-Review: Upgrade pc2004 and pc2005 BIOS - https://phabricator.wikimedia.org/T201387 (Marostegui) Thanks - I will take it from here, to repool the servers once they've caught up!
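Editor's note on the read_only exchange (12:07 to 12:12): the mismatch found there was between the intended setting (read_only ON) and the value hardcoded in the rendered config file (read_only=0). A minimal sketch of how such drift could be detected from a rendered my.cnf is shown below. It assumes plain rendered INI-style text rather than the .erb template, and the function names `read_only_from_cnf` and `config_drift` are hypothetical, not part of any existing tooling mentioned in the log.

```python
import configparser

# Values accepted here as boolean settings for read_only (an assumption of this sketch).
TRUTHY = {"1", "on", "true"}
FALSY = {"0", "off", "false"}


def read_only_from_cnf(cnf_text):
    """Return the read_only setting found in the [mysqld] section of my.cnf text.

    Hypothetical helper: it only inspects the file contents and never
    connects to a server. A missing option is treated as OFF, which is
    the server default for read_only.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare options such as "read_only"
        interpolation=None,    # do not treat "%" in values as interpolation
        strict=False,          # tolerate repeated sections or options
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return False
    section = parser["mysqld"]
    # Option names may be written with underscores or dashes.
    for key in ("read_only", "read-only"):
        if key in section:
            raw = section[key]
            if raw is None:
                return True  # a bare option enables it
            value = raw.strip().lower()
            if value in TRUTHY:
                return True
            if value in FALSY:
                return False
            raise ValueError(f"unrecognised read_only value: {raw!r}")
    return False


def config_drift(intended_read_only, cnf_text):
    """Return a short description of the mismatch, or None if the file agrees."""
    actual = read_only_from_cnf(cnf_text)
    if actual == intended_read_only:
        return None
    return f"intended read_only={int(intended_read_only)} but file has read_only={int(actual)}"
```

For example, `config_drift(True, "[mysqld]\nread_only = 0\n")` reports the same kind of mismatch discussed above. The file only sets the startup value, so the running server would still need a separate check, for instance with `SELECT @@global.read_only`.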
[15:24:53] DBA, Operations, decommission, ops-codfw: db2064 crashed and totally broken - decommission it - https://phabricator.wikimedia.org/T195228 (Papaul)
[15:47:15] DBA, Operations, decommission, ops-codfw: db2064 crashed and totally broken - decommission it - https://phabricator.wikimedia.org/T195228 (Papaul) ``` show interfaces ge-6/0/12 Physical interface: ge-6/0/12, Administratively down, Physical link is Down Interface index: 1212, SNMP ifIndex: 76...
[15:47:32] DBA, Operations, decommission, ops-codfw: db2064 crashed and totally broken - decommission it - https://phabricator.wikimedia.org/T195228 (Papaul)
[16:07:48] DBA, Operations, decommission, ops-codfw, Patch-For-Review: db2064 crashed and totally broken - decommission it - https://phabricator.wikimedia.org/T195228 (Papaul)
[16:08:08] DBA, Operations, decommission, ops-codfw, Patch-For-Review: db2064 crashed and totally broken - decommission it - https://phabricator.wikimedia.org/T195228 (Papaul) Open>Resolved This is complete resolving it.
[16:08:15] DBA, Operations, decommission, ops-codfw, Patch-For-Review: db2064 crashed and totally broken - decommission it - https://phabricator.wikimedia.org/T195228 (Marostegui) https://gerrit.wikimedia.org/r/#/c/operations/dns/+/451362/ merged and deployed
[16:53:06] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (Harej) I've worked with @awight on a document describing JADE's requirements and possible implementation...
[16:53:45] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight)
[16:54:44] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight)
[16:57:08] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight)
[16:57:40] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight)
[17:09:25] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel) > be curated (patrolled, deleted, etc.) within MediaWiki. The most important question is: how i...
[17:37:30] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) >>! In T200297#4489021, @daniel wrote: >> be curated (patrolled, deleted, etc.) within MediaWiki....
[18:35:49] DBA, Operations, ops-eqiad: Disk #9 with errors on db1068 (s4 master) - https://phabricator.wikimedia.org/T201493 (Cmjohnson) @Marostegui swapped the disk with a new one, please resolve once raid rebuilds
[18:52:13] DBA, Operations, ops-eqiad: Disk #9 with errors on db1068 (s4 master) - https://phabricator.wikimedia.org/T201493 (Marostegui) @Cmjohnson it failed - can you pull the disk out and then back in? We have seen that happening before.
Let's give it a second chance (I assume it is one of the new disks) `...
[18:54:47] DBA, Operations, ops-eqiad: Disk #9 with errors on db1068 (s4 master) - https://phabricator.wikimedia.org/T201493 (Marostegui)
[19:02:58] DBA, Operations, ops-eqiad: Disk #9 with errors on db1068 (s4 master) - https://phabricator.wikimedia.org/T201493 (Marostegui) ``` root@db1068:~# megacli -PDRbld -ShowProg -PhysDrv [32:9] -aALL Rebuild Progress on Device at Enclosure 32, Slot 9 Completed 2% in 5 Minutes. ```
[19:39:57] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel) > That sounds great to us, but we've put quite a bit of time into planning how to prevent abuse a...
[21:30:29] DBA, Toolforge: issue with accessing wbc_entity_usage from tools-db - https://phabricator.wikimedia.org/T201563 (eranroz)
[21:35:51] DBA, Toolforge: issue with accessing wbc_entity_usage from tools-db - https://phabricator.wikimedia.org/T201563 (eranroz) Possibly related to T144010? select view_definition from information_schema.views where table_schema='enwiki_p' and table_name='wbc_entity_usage'; ``` select `enwiki`.`wbc_entity_usa...
[21:39:56] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) >>! In T200297#4489550, @daniel wrote: > My point was that "must be editable and watchable" prett...
[21:57:53] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel) > are you just saying that pages on a central wiki seems reasonable, or that you think judgments...
[22:19:55] DBA, JADE, Operations, TechCom-RFC, Scoring-platform-team (Current): Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) >>! In T200297#4490106, @daniel wrote: > How do you feel about having a public discussion on this...