[00:03:05] DBA, Collaboration-Team-Triage, Growth-Team, StructuredDiscussions, and 2 others: Enable Flow on wikitech (labswiki and labtestwiki), then turn on for Tool talk namespace - https://phabricator.wikimedia.org/T127792 (bd808)
[05:08:28] Blocked-on-schema-change, DBA, Wikidata, Schema-change: Drop eu_touched in production - https://phabricator.wikimedia.org/T144010 (Marostegui) Similar to T144010#2737391 I have renamed the column on db1110 in eqiad, which is a host that receives reads. We'll see if some errors arise. ``` root@db1...
[08:02:41] DBA, MediaWiki-Database, MediaWiki-Debug-Logger: Log the query that caused a lock timeout - https://phabricator.wikimedia.org/T198755 (jcrespo) I don't think "interactive logging" would be easy to implement- the query knows if it cannot continue or if it reaches a timeout, but it is not notified of w...
[08:40:06] DBA, MediaWiki-Database, Operations: Evaluate and decide the future of relational datastore at WMF after the upgrade of MariaDB 10.1 is finished - https://phabricator.wikimedia.org/T193224 (jcrespo)
[08:40:19] DBA, Patch-For-Review: Test MySQL 8.0 with production data and evaluate its fit for WMF databases - https://phabricator.wikimedia.org/T193226 (jcrespo) Open>stalled This is stalled until we implement a way to mix different GTID implementation on the same section: e.g.
T172497#4309959
[08:42:56] DBA, MediaWiki-Configuration, Operations, Patch-For-Review, User-Joe: Create tool to handle the state of database configuration in MediaWiki in etcd - https://phabricator.wikimedia.org/T197126 (jcrespo)
[08:42:59] DBA, monitoring, Epic, Patch-For-Review, Wikimedia-Incident: Reduce false positives on database pages - https://phabricator.wikimedia.org/T177782 (jcrespo)
[08:43:35] DBA, MediaWiki-Configuration, Operations, Patch-For-Review, User-Joe: Create tool to handle the state of database configuration in MediaWiki in etcd - https://phabricator.wikimedia.org/T197126 (jcrespo)
[08:43:48] DBA, monitoring, Epic, Patch-For-Review, Wikimedia-Incident: Reduce false positives on database pages - https://phabricator.wikimedia.org/T177782 (jcrespo)
[08:44:02] DBA, MediaWiki-Configuration, Operations, Patch-For-Review, User-Joe: Create tool to handle the state of database configuration in MediaWiki in etcd - https://phabricator.wikimedia.org/T197126 (jcrespo)
[08:44:05] DBA, monitoring, Epic, Patch-For-Review, Wikimedia-Incident: Reduce false positives on database pages - https://phabricator.wikimedia.org/T177782 (jcrespo)
[08:55:58] jynus: good morning, would you have time to apply a grant change?
(https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/445421/ , not yet puppet-merged, related to the terbium/wasat -> mwmaint* migrations)
[08:56:47] let me see
[08:58:04] please merge, I will deploy the changes
[08:59:00] thanks, doing that now
[09:00:55] merged
[10:33:07] there was another place where the terbium IP was used in a grant definition, there's a comment that this file is not yet used, if it can be confirmed that this is still the case, I could simply merge the patch without DBA involvement: https://gerrit.wikimedia.org/r/445590
[10:34:04] let me check
[10:34:23] there is what should be there, what is puppetized, and reality
[10:34:42] we are right now on baby steps to monitor properly the 3
[10:35:44] I think that should have the host that is serving dbtree
[10:35:49] that you mentioned some time ago
[10:36:06] did you do that change that you mentioned some time ago, moritzm?
[10:36:27] of serving dbtree from mwmaint1001 ?
[10:37:32] mmh, dbtree.wikimedia.org gives me an error, checking
[10:38:04] so that's caused by the outdated grant?
[10:39:54] probably
[10:40:14] so did you repoint the server to mwmaint1001?
[10:40:19] I am asking, it is ok
[10:40:28] just to know which grant I should add
[10:40:40] yeah, that was switched to mwmaint1001
[10:40:43] Uncaught Error: Call to undefined function mysql_connect() in /srv/dbtree/index.php:31\nStack trace:\n#0 /srv/dbtree/inc/sanity.php(279): db()\n#1 /srv/dbtree/inc/sanity.php(951): sql->__construct('tendril.servers')\n#2 /srv/dbtree/inc/tree.php(88): sql::query('tendril.servers')\n#3 /srv/dbtree/index.php(41): Tree->generate()\n#4 {main}\n thrown in /srv/dbtree/index.php on line 31
[10:40:52] cool
[10:40:59] so that has to happen
[10:41:16] but then it has to be substituted by mwmaint1001's ip
[10:41:25] on both backends
[10:42:13] this is why I would like to serve everything from dbmonitor web servers
[10:42:25] as tendril is more frequently tested
[10:42:47] also we may have to add mwmaint2001 ?
[10:42:55] if we do a codfw failover
[10:43:08] currently it only has an eqiad backend configured: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/445400/
[10:43:26] there's https://phabricator.wikimedia.org/T163141 to change that
[10:43:27] sadly, tendril, the application
[10:43:35] doesn't support multiple db backends
[10:43:49] or dbtree, which is the same
[10:44:05] ah, right. that's also mentioned at https://phabricator.wikimedia.org/T163141#4293775
[10:44:06] that is one of the many reasons I want to substitute
[10:44:12] it
[10:44:18] for something that can run on both dcs
[10:46:21] so actually, mwmaint1001 is on grants
[10:46:54] who is then trying to connect and fails?
[10:47:09] maybe it uses a different server
[10:47:13] *user name
[10:47:34] I'm comparing with what's on terbium
[10:47:45] I think I get it
[10:47:59] there are 29 users, probably only 6 are actually useful
[10:49:38] the user configured for DB access in /srv/dbtree is "tendril_web"
[10:49:57] <_joe_> I have a UX question for you, marostegui jynus
[10:50:27] UX lol?
[10:50:30] _joe_: one sec
[10:50:35] <_joe_> the context is db configuration via etcd, and the cli tool :D
[10:50:35] we have a low priority outage
[10:50:42] <_joe_> oh sorry, later :D
[10:50:45] <_joe_> I can wait
[10:50:59] but more priority than I guess new features
[10:51:36] moritzm: so I have added the account with the same name as the existing one, and it still doesn't work
[10:56:03] moritzm: so I can loging manually
[10:56:08] *log in
[10:56:27] I am going to curl from the localhost
[10:56:43] jynus: it's actually an error in the dbtree application
[10:56:54] it needs to be modified for PHP5 -> PHP7 changes
[10:57:36] it uses mysql_connect() which is removed in 7 (while it used to work on terbium which is using PHP 5 still)
[10:58:11] pfff
[10:58:35] I am not going to fix an application that should die
[10:58:41] I'll switch back to tendril until that's fixed on the source level
[10:58:44] especially if it is using mysql
[10:58:54] (the bad extension)
[10:59:21] I prefer to create my replacement in python
[10:59:26] I can do that, but not today
[10:59:50] yeah, it seems to use the mysql extension
[11:00:23] I'll switch back to terbium and open a Phab task?
[11:00:32] I think this is a déjà vu
[11:00:46] I have had this issue before
[11:02:17] didn't at least find any reference to this in Phabricator
[11:02:18] in fact, on my python mini-framework, I abstract .connect() and .execute() to be able to change connector easily
[11:02:32] let me try myself at least once
[11:02:59] but do you understand why I want to get rid of that?
[11:03:09] fully understood :-)
[11:03:11] google references, outdated php code
[11:03:25] it was actually ok a few years ago
[11:03:38] not sure how often dbtree is used, we can also choose to keep it broken until a replacement is available
[11:03:48] I would ask
[11:04:08] for me the privacy issues seem too large to keep it up
[11:04:30] but maybe we can recruit some d3/js wizard to help
[11:04:50] I am comfortable with the backend, but not so much with the frontend
[11:05:59] there's also the new code stewardship process managed by RelEng, maybe this is a good candidate for this
[11:05:59] I wonder, however, how tendril works, if it runs on stretch
[11:06:27] it is not stretch
[11:06:33] so here is my proposal, moritzm
[11:06:40] it will unblock you, but I may need help
[11:06:53] move dbtree to dbmonitor
[11:07:10] which is jessie and has to run anyway mysql for tendril to work
[11:07:37] I don't think there is a good reason to run dbtree on mwmaint
[11:07:54] the same way there was not a good reason to run tendril on einstenium
[11:08:13] this will give us a few months to have a replacement ready
[11:08:29] and we put all problematic code in a single place :-)
[11:09:03] ack, sounds good! I'll first rollback the current setup to terbium and then work on patches to move dbtree to dbmonitor
[11:09:21] I will help, just I may need backup
[11:09:33] I don't know if it is already on a separate profile
[11:10:06] no, it's just a class, but we can fix this on the way
[11:10:16] let me finish a reimage I have on the fly
[11:10:43] so it doesn't alert
[11:10:49] and I will do a patch proposal
[11:11:20] ack, I'm reverting the patch to switch from terbium to mwmaint1001 in the mean time
[11:47:34] FYI, Varnishes picked up the config change, https://dbtree.wikimedia.org/ now served from terbium again
[11:49:31] I've started separating dbtree on its own class
[11:49:53] but you may be more familiar with the apache-related classes?
[11:58:49] ack, I need to wrap up other bits now, I'll have a look later or Monday morning
[11:59:49] I'll still open a task for dbtree/php5 incompatibility (if only to have it on record for the stewardship process or so)
[16:22:41] DBA, Core-Platform-Team, Structured-Data-Commons, Wikidata, and 4 others: Deploy MCR storage layer - https://phabricator.wikimedia.org/T174044 (daniel)
[20:21:14] DBA, Operations, decommission, ops-eqiad: Decommission db1051 - https://phabricator.wikimedia.org/T195484 (RobH)
[20:28:53] DBA, Operations, decommission, ops-eqiad, Patch-For-Review: Decommission db1054 - https://phabricator.wikimedia.org/T197063 (RobH) a:Cmjohnson>None
[20:37:48] DBA, Operations, decommission, ops-eqiad, Patch-For-Review: Decommission db1054 - https://phabricator.wikimedia.org/T197063 (RobH)
[20:41:24] DBA, Operations, decommission, ops-eqiad: Decommission db1054 - https://phabricator.wikimedia.org/T197063 (RobH)
[20:41:37] DBA, Operations, decommission, ops-eqiad: Decommission db1054 - https://phabricator.wikimedia.org/T197063 (RobH) a:Cmjohnson
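Note on the 11:02:18 remark about abstracting .connect() and .execute() so the connector can be swapped: the log does not show that mini-framework, so the following is only a rough sketch of what such a wrapper might look like, not the actual code. The class name `Database`, the `servers` table and the host `db-example` are all made up for illustration, and `sqlite3` from the Python standard library stands in for whatever MySQL driver would really be used, on the assumption that the driver follows the DB-API 2.0 (PEP 249) interface.

```python
import sqlite3
from typing import Any, Callable, List, Sequence


class Database:
    """Hypothetical thin wrapper: callers only use connect() and execute()."""

    def __init__(self, connector: Callable[..., Any], *args: Any, **kwargs: Any) -> None:
        # connector is any DB-API 2.0 style connect function (assumption);
        # swapping drivers means passing a different callable here.
        self._connector = connector
        self._args = args
        self._kwargs = kwargs
        self._conn = None

    def connect(self) -> None:
        """Open the connection using whichever connector was supplied."""
        self._conn = self._connector(*self._args, **self._kwargs)

    def execute(self, query: str, params: Sequence[Any] = ()) -> List[tuple]:
        """Run one statement and return its rows (empty list if none)."""
        if self._conn is None:
            self.connect()
        cursor = self._conn.cursor()
        try:
            cursor.execute(query, params)
            # description is None for statements that produce no result set.
            rows = cursor.fetchall() if cursor.description else []
            self._conn.commit()
            return rows
        finally:
            cursor.close()


# sqlite3 is used only so the sketch runs anywhere; note that placeholder
# style ("?" here) differs between drivers, so queries are not fully portable.
db = Database(sqlite3.connect, ":memory:")
db.execute("CREATE TABLE servers (host TEXT, port INTEGER)")
db.execute("INSERT INTO servers VALUES (?, ?)", ("db-example", 3306))
rows = db.execute("SELECT host, port FROM servers")
print(rows)
```

With this shape, code like dbtree's direct mysql_connect() call would live in one place, so a removed or replaced driver means changing the connector argument rather than every call site.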