[06:48:37] DBA, Operations, ops-codfw, Patch-For-Review: db2060 crashed (RAID controller) - https://phabricator.wikimedia.org/T154031#3024172 (Marostegui)
[06:48:40] DBA, Operations, ops-codfw, Patch-For-Review: db2060 not accessible - https://phabricator.wikimedia.org/T156161#3024170 (Marostegui) Open>Resolved I am going to close this for now, but will leave the server depooled for the next few days. If we see this happening again we'll reopen it Th...
[08:19:01] DBA, Patch-For-Review: Deploy gtid_domain_id flag in our mysql hosts - https://phabricator.wikimedia.org/T149418#3024218 (Marostegui) Hello, I thought I would update the ticket with the last findings about gtid+multi source. Looks like what we faced here: T146261#2744128 is more than an issue that we t...
[08:45:56] DBA: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3024248 (Marostegui) @jcrespo I would like to run a first iteration of pt-table-checksum on m3 maybe, to see how it goes. I would be using the dsns side table as I stated above to see ho...
[08:58:18] Blocked-on-schema-change, DBA, Community-Tech, MediaWiki-extensions-PageAssessments: Update page_assessments_projects schema for subprojects in production - https://phabricator.wikimedia.org/T156305#3024274 (jcrespo) A script has to be run on labs, please file a separate task for them, it is not...
[09:28:29] db1026 has high rate of errors
[09:30:13] it is a bot on a single page
[09:30:16] let's see..
[09:30:17] ah
[09:30:59] doing one bad query every 2 seconds
[09:32:04] is there anything we can do about that?
[09:39:23] well, I do not think it is a problem, as it is only making things fail for itself, not for others
[09:42:05] DBA, Patch-For-Review: Deploy gtid_domain_id flag in our mysql hosts - https://phabricator.wikimedia.org/T149418#3024344 (JAllemandou) @Marostegui : Waow ... This is nasty!
[09:53:38] DBA, Patch-For-Review: Deploy gtid_domain_id flag in our mysql hosts - https://phabricator.wikimedia.org/T149418#3024373 (jcrespo) I think it is just easier to change domain_id everywhere, wait 15 plus days. CHANGE MASTER on the multisource slaves including the coords. Then enabling GTID.
[09:57:13] DBA, Patch-For-Review: Deploy gtid_domain_id flag in our mysql hosts - https://phabricator.wikimedia.org/T149418#3024376 (Marostegui) >>! In T149418#3024373, @jcrespo wrote: > I think it is just easier to change domain_id everywhere, wait 15 plus days. CHANGE MASTER on the multisource slaves including th...
[10:12:27] DBA: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3024425 (jcrespo) Ok, but do not create percona databases on production, it is a source of garbage and security issues. I used separate tables per database called __wmf_checksums in the...
[10:15:34] DBA: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3024428 (Marostegui) >>! In T154485#3024425, @jcrespo wrote: > Ok, but do not create percona databases on production, it is a source of garbage and security issues. I used separate table...
[10:19:52] DBA, Collaboration-Team-Triage, Community-Tech-Tool-Labs, Flow, and 5 others: Enable Flow on wikitech (labswiki and labtestwiki), then turn on for Tool talk namespace - https://phabricator.wikimedia.org/T127792#2054159 (MarcoAurelio) Please, just don't. Wikitext talk pages are working just fine....
[11:20:29] DBA, Wikimedia-log-errors: Under high load, there is replication check pile-ups on coredbs, specially enwiki API servers - https://phabricator.wikimedia.org/T150474#3024666 (jcrespo) a:jcrespo
[11:28:22] DBA, Patch-For-Review: Deploy gtid_domain_id flag in our mysql hosts - https://phabricator.wikimedia.org/T149418#3024674 (Marostegui) I have done a few more tests to see what we can get by even resetting the slaves, but not good news: https://jira.mariadb.org/browse/MDEV-12012?focusedCommentId=91836&page=...
[11:30:01] you changing pt-heartbeat to innodb?
[11:30:15] yes
[11:30:16] not ok?
[11:30:18] \o/
[11:30:24] no no, more than ok! :)
[11:30:30] or just want to be ready in case I break production?
[11:30:46] A bit of both, just making sure I was aware of that change :)
[11:30:49] I trust you!
[11:30:58] (and you will say you don't trust yourself?)
[11:31:24] well, I am monitoring replica lag
[11:31:42] and errors, that should be enough
[11:32:00] I will also convert it 7 times on dbstores and labs, that will make my life easier
[11:33:22] it is causing some contention on high used servers
[11:33:49] but better do it once and getting it fixed forever
[11:37:12] yeah
[11:37:13] agreed
[11:37:14] :)
[12:06:21] Blocked-on-schema-change, Collaboration-Team-Triage, Notifications, Patch-For-Review, Schema-change: Add primary key to echo_notification table - https://phabricator.wikimedia.org/T136428#3024753 (Marostegui) From x1 the following hosts are done dbstore1001,1002 dbstore2001 db2033 db1029 Pe...
[12:18:03] DBA, Operations, Phabricator, Release-Engineering-Team: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#3024779 (Marostegui) Open>Resolved a:Marostegui I am going to close this for now as the short-term solution was to move search to ES and this hasn't...
[12:20:34] DBA, Operations, Phabricator, Release-Engineering-Team: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#3024783 (Paladox) Elasticsearch is actually a long term fix. I think we have permanently moved to it. We were preparing to move to it before. :)
[12:21:45] DBA, Wikimedia-log-errors: Under high load, there is replication check pile-ups on coredbs, specially enwiki API servers - https://phabricator.wikimedia.org/T150474#3024786 (jcrespo) ```lines=10 cat *.hosts | sort | uniq | while read host port; do echo $host $port; mysql --skip-ssl -A -h $host -P $port h...
[12:22:11] DBA, Operations, Phabricator, Release-Engineering-Team: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#3024788 (Marostegui) >>! In T156905#3024783, @Paladox wrote: > Elasticsearch is actually a long term fix. I think we have permanently moved to it. We were prepa...
[12:23:07] DBA, Operations, Phabricator, Release-Engineering-Team: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#3024790 (Paladox) Yep, @mmodell has improved searching by a lot :)
[12:33:45] DBA, Wikimedia-log-errors: Under high load, there is replication check pile-ups on coredbs, specially enwiki API servers - https://phabricator.wikimedia.org/T150474#3024830 (jcrespo) Done the labs ones, db1052, db1095, db1057. The rest will need depooling.
[13:05:54] DBA: Create a whitelist of tables to checksum on all wikis - https://phabricator.wikimedia.org/T104310#1413199 (Marostegui) This list of tables that Jaime mentioned look the same at the time of the ticket, so probably should keep excluded from the list, because they either remain without a PK or are too mass...
[13:10:58] DBA: Create a whitelist of tables to checksum on all wikis - https://phabricator.wikimedia.org/T104310#3024908 (jcrespo) The thing is, revision and text are key tables, and those should be checked.
Many other tables, except maybe user, page and others are not that important, those 2 are. They have PKs, so ma...
[13:21:23] there is something really wrong
[13:21:35] with what?
[13:21:48] a wikiuser connection had a transaction open having read the heartbeat table
[13:21:59] and even if it had been depooled
[13:22:06] it blocked the alter
[13:22:07] which server?
[13:22:10] db1051
[13:22:20] but if it happens there, it will happen everywhere
[13:22:30] it is not that there is a lot of traffic
[13:22:44] it is there is something blocking it even more- that is the source of all alter problems
[13:22:45] and you had to kill it?
[13:22:48] yes
[13:23:11] could it be a race condition between the alter and the already existing connection fighting for the metadata?
[13:23:17] 10.64.32.33:49458
[13:23:23] no, it is depooled
[13:23:25] well actually not because the alter gave up I assume?
[13:23:32] ah, it got it…when depooled?
[13:23:37] yes
[13:23:41] :|
[13:23:45] that is why I said it is horrible
[13:23:46] maybe some server not getting the code?
[13:23:54] some kind of persistent connections, maybe
[13:24:05] but persistent connections are not that a problem
[13:24:14] unless they leave a transaction open
[13:24:52] I even stopped the slave thinking it was replication doing something weird
[13:24:53] it wasn't
[13:24:57] it was wikiuser
[13:25:03] when I killed, instant alter
[13:25:12] did you check if that mw1163 had the latest db-eqiad.php?
[13:25:14] just in case?
[13:25:25] no, I didn't
[13:25:39] I want to finish the alter first
[13:25:41] yeah
[13:25:49] let me check it
[13:26:07] it is depooled there
[13:27:00] 51 is a wl, not a dump/vslow
[13:27:47] yeah I see..
[13:28:34] I'm going for lunch, will continue the rolling apply later
[13:28:43] ok, is there anything you want me to take over to?
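Editor's aside on the db1051 exchange above (13:21 to 13:25): it describes an idle wikiuser connection whose open transaction blocked the ALTER until the connection was killed. Below is a minimal Python sketch of how one might shortlist such connections from `SHOW PROCESSLIST` output. The function name `mdl_suspects`, the thread ids, the idle times and the ALTER text are hypothetical; only the user and host come from the log. The processlist alone cannot prove which session holds the lock, so the result is a list of candidates to inspect, not a verdict.

```python
from typing import Dict, List

# State string MySQL/MariaDB report for a statement queued behind a metadata lock.
MDL_WAIT_STATE = "Waiting for table metadata lock"


def mdl_suspects(processlist: List[Dict], min_idle_seconds: int = 0) -> List[Dict]:
    """Return idle sessions that might hold the lock a DDL statement waits for.

    `processlist` is a list of dicts keyed like the SHOW PROCESSLIST columns
    (Id, User, Host, Command, Time, State, Info).
    """
    blocked = [row for row in processlist if row.get("State") == MDL_WAIT_STATE]
    if not blocked:
        return []  # nothing is waiting, so there is nothing to explain
    idle = [
        row for row in processlist
        if row.get("Command") == "Sleep" and row.get("Time", 0) >= min_idle_seconds
    ]
    # Longest-idle sessions first: the usual place to start looking.
    return sorted(idle, key=lambda row: row.get("Time", 0), reverse=True)


# Hypothetical sample rows; only the user and host appear in the log.
sample = [
    {"Id": 101, "User": "wikiuser", "Host": "10.64.32.33:49458",
     "Command": "Sleep", "Time": 900, "State": "", "Info": None},
    {"Id": 202, "User": "root", "Host": "localhost",
     "Command": "Query", "Time": 120, "State": MDL_WAIT_STATE,
     "Info": "ALTER TABLE heartbeat ENGINE=InnoDB"},  # assumed statement text
]

for row in mdl_suspects(sample, min_idle_seconds=60):
    # Print a KILL statement for a human to review; nothing is executed here.
    print("-- candidate:", row["User"], row["Host"])
    print("KILL {};".format(row["Id"]))
```

The sketch deliberately prints statements instead of running them, since killing the wrong session on a production replica is worse than a slow ALTER.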
[13:29:04] not really, there are only a few left
[13:29:29] ok :)
[13:29:33] enjoy lunch
[13:58:10] DBA: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3025051 (Marostegui) We will need to also make sure we do not include hosts running RBR (ie: sanitarium masters): ``` pt-table-checksum requires statement-based replication, and it sets...
[14:10:11] I have cleaned up labsdb1005 to prepare it for backup
[14:18:33] DBA: Create a whitelist of tables to checksum on all wikis - https://phabricator.wikimedia.org/T104310#3025138 (Marostegui) >>! In T104310#3024908, @jcrespo wrote: > The thing is, revision and text are key tables, and those should be checked. Many other tables, except maybe user, page and others are not that...
[14:43:10] DBA: Create a whitelist of tables to checksum on all wikis - https://phabricator.wikimedia.org/T104310#3025237 (jcrespo) >>! In T104310#3025138, @Marostegui wrote: > Maybe adjusting the chunk-size manually instead of leaving the tool to automatically adjust it itself. As we need to decomm servers from pretty...
[14:47:53] DBA, Operations, ops-codfw, Patch-For-Review: Change rack for servers in s1 in codfw - https://phabricator.wikimedia.org/T156478#3025240 (Marostegui) @Papaul you think we can do db2062 sometime this week? Thanks!
[14:50:18] DBA, Patch-For-Review, Wikimedia-log-errors: Under high load, there is replication check pile-ups on coredbs, specially enwiki API servers - https://phabricator.wikimedia.org/T150474#3025242 (jcrespo) Open>Resolved This is now done- there was indeed contention here- if the conversion to innod...
[14:56:43] DBA, Operations, ops-codfw, Patch-For-Review: Change rack for servers in s1 in codfw - https://phabricator.wikimedia.org/T156478#3025254 (Papaul) We can today.
[14:57:40] DBA, Operations, ops-codfw, Patch-For-Review: Change rack for servers in s1 in codfw - https://phabricator.wikimedia.org/T156478#3025257 (Marostegui) Awesome, I will depool it and get it ready to be moved Thanks!
[15:09:08] DBA, Labs, Labs-Infrastructure, Tool-Labs, Wikimedia-Developer-Summit (2017): Labsdbs for WMF tools and contributors: get more data, faster - https://phabricator.wikimedia.org/T149624#3025336 (jcrespo) Open>Resolved I am going to resolve this as fixed, even if we have yet to finish th...
[15:11:00] DBA, Labs, Labs-Infrastructure, Epic, Tracking: Labs databases rearchitecture (tracking) - https://phabricator.wikimedia.org/T140788#2475959 (jcrespo)
[15:11:03] DBA, Labs, Labs-Infrastructure: Having lots of accounts with separate grants makes auditing difficult. - https://phabricator.wikimedia.org/T141096#3025345 (jcrespo) Open>Resolved a:jcrespo This is now done- users are handled with roles + accounts are handled on a database. Resolving mysel...
[15:13:20] DBA, Labs, Labs-Infrastructure, Patch-For-Review: Create a cronjob/check to run check_private_data data script and report back - https://phabricator.wikimedia.org/T153680#3025371 (Marostegui) I have pushed that change and I have run puppet on labsdb1009 and the change is looking good. We will se...
[15:14:26] DBA, Datasets-General-or-Unknown, Release-Engineering-Team, Patch-For-Review, and 2 others: Automatize the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459#3025377 (jcrespo) a:jcrespo>None Unc...
[15:16:27] DBA, Operations, Patch-For-Review, Prometheus-metrics-monitoring: implement performance_schema for mysql monitoring - https://phabricator.wikimedia.org/T99485#3025383 (jcrespo) Open>Resolved Performance schema is finally on all servers, and so does the sys schema. I will close this becaus...
[15:17:52] DBA, Operations, Patch-For-Review, Prometheus-metrics-monitoring: implement performance_schema for mysql monitoring - https://phabricator.wikimedia.org/T99485#3025389 (jcrespo)
[15:17:55] DBA, Operations, Prometheus-metrics-monitoring: Decide storage backend for performance schema monitoring stats - https://phabricator.wikimedia.org/T119619#3025386 (jcrespo) stalled>Resolved a:jcrespo We will finally go, because of privacy concerns, for a private prometheus instance for the...
[15:18:01] DBA, Operations, Traffic, WMF-Legal, and 2 others: dbtree loads third party resources (from jquery.com and google.com) - https://phabricator.wikimedia.org/T96499#3025390 (jcrespo)
[15:19:33] DBA, Epic, Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921#3025392 (jcrespo) a:jcrespo>None Unclaiming because we will work on the individual tickets, one at a time.
[15:23:33] DBA, Labs, Operations, Tracking: Database replication problems - production and labs (tracking) - https://phabricator.wikimedia.org/T50930#3025409 (jcrespo)
[15:23:36] DBA, Labs, Labs-Infrastructure, Chinese-Sites: Lost database changes on s2 for 3 hours on labs replicas - https://phabricator.wikimedia.org/T129432#3025407 (jcrespo) Open>declined labsdb1001 and labsdb1003 will be decommissioned at some point in the near future- this will be fixed, but no...
[15:24:53] DBA, Labs, Labs-Infrastructure, Chinese-Sites: Lost database changes on s2 for 3 hours on labs replicas - https://phabricator.wikimedia.org/T129432#3025411 (jcrespo) Also, the initial report was fixed, it was reopened because other issues started to get mixed there.
[15:25:55] DBA, Analytics, Patch-For-Review: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#3025412 (jcrespo) a:jcrespo>None
[15:30:32] DBA, Operations, ops-codfw: Several es20XX servers keep crashing (es2017, es2019, es2015, es2014) since 23 March - https://phabricator.wikimedia.org/T130702#3025417 (jcrespo) a:jcrespo>None No crashes in the last 4 months, it seems?
[15:32:23] DBA, Labs, Labs-Infrastructure, Patch-For-Review: Get HA db support for labs internal services - https://phabricator.wikimedia.org/T126251#3025423 (jcrespo) a:jcrespo>None Blocked on performing a failover to the proxy by service owners, although I am not sure if failing over to codfw is a...
[15:33:01] DBA, Operations: Populate the wikishared db on all dbstores - https://phabricator.wikimedia.org/T126252#3025427 (jcrespo) p:Normal>Low a:jcrespo>None
[15:35:03] DBA, Operations: Populate the wikishared db on all dbstores - https://phabricator.wikimedia.org/T126252#3025435 (jcrespo)
[15:36:06] DBA, Operations, Patch-For-Review: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476#3025436 (jcrespo) a:jcrespo>None Let's have a look soon at the decom plan and paste it here when we are happy.
[15:36:56] DBA, Operations: Puppetize grants for mysql analytics servers - https://phabricator.wikimedia.org/T114476#3025440 (jcrespo) a:jcrespo>None
[15:39:51] DBA, Operations: Decommission db1015, db1035 and db1044 - https://phabricator.wikimedia.org/T148078#3025442 (jcrespo) p:Normal>High a:jcrespo>None High because they will complain of lack of space soon.
[15:40:11] DBA, Operations, ops-codfw: Several es20XX servers keep crashing (es2017, es2019, es2015, es2014) since 23 March - https://phabricator.wikimedia.org/T130702#3025445 (Marostegui) >>! In T130702#3025417, @jcrespo wrote: > No crashes in the last 4 months, it seems?
Indeed, the uptimes are looking very...
[15:40:54] DBA, Operations: Decommission db1015, db1035 and db1044 - https://phabricator.wikimedia.org/T148078#3025447 (jcrespo)
[15:40:57] DBA: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3025446 (jcrespo)
[15:40:59] DBA, Operations, Patch-For-Review: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476#3025448 (jcrespo)
[15:41:11] DBA: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#2912830 (jcrespo) p:Normal>High
[15:41:38] DBA, Operations: Decommission db1015, db1035, db1044 and db1038 - https://phabricator.wikimedia.org/T148078#2714228 (jcrespo)
[15:42:09] DBA, Operations: Decommission db1015, db1035, db1044 and db1038 - https://phabricator.wikimedia.org/T148078#3025456 (Marostegui) db1044 is db1095's master, so we need to look for another candidate within the shard (and change it to ROW) and make sure it has the same content as db1044 otherwise we will br...
[15:43:18] DBA, Operations, Chinese-Sites: Drop *_old database tables from Wikimedia wikis - https://phabricator.wikimedia.org/T54932#3025459 (jcrespo) a:jcrespo>None
[15:44:03] DBA: Remove AFT tables from the analytics slaves - https://phabricator.wikimedia.org/T92739#3025464 (jcrespo) a:jcrespo>None
[15:49:58] DBA, Labs: Lots of rows are missing from enwiki_p.`revision` - https://phabricator.wikimedia.org/T115207#3025476 (jcrespo) Open>declined I have given up on trying to resync production to the current labsdbs. It is almost impossible while they are in use. The soon to be setup servers won't have tha...
[15:50:03] DBA, Labs, Operations, Tracking: Database replication problems - production and labs (tracking) - https://phabricator.wikimedia.org/T50930#3025478 (jcrespo)
[15:50:47] DBA, Analytics, Analytics-EventLogging: Potentially decrease db1046's InnoDB buffer pool - https://phabricator.wikimedia.org/T125829#3025480 (jcrespo) a:jcrespo>None
[15:51:44] DBA, Operations: Puppetize grants for mysql backups on dbstore hosts - https://phabricator.wikimedia.org/T111929#3025484 (jcrespo) a:jcrespo>None
[15:53:10] DBA, Labs, Tool-Labs-tools-Other, Patch-For-Review: High replication activity filled up labsdb1004 with binlogs - https://phabricator.wikimedia.org/T150553#3025500 (jcrespo) stalled>Resolved
[15:56:58] DBA, Monitoring, Operations: Display lag on grafana (prometheus) and dbtree from pt-heartbeat instead (or in addition) of Seconds_Behind_Master - https://phabricator.wikimedia.org/T141968#3025531 (jcrespo) a:jcrespo>None
[15:59:00] DBA, Operations, ops-codfw, Patch-For-Review: Change rack for servers in s1 in codfw - https://phabricator.wikimedia.org/T156478#3025537 (Papaul) @RobH we about to move db2062 in row D rack D6 to row B rack 5. I will like for you please if you have time to make some changes on both switches . o...
[15:59:03] DBA, Operations: Adapt wmf-mariadb10 package for jessie or puppetize differently its service to adapt it to systemd - https://phabricator.wikimedia.org/T116903#3025538 (jcrespo) stalled>Open a:jcrespo>None This is now possible for 10.1, and I have packages for stretch that do that. We hav...
[15:59:56] DBA, Operations: Drop the tables old_growth, hitcounter, click_tracking, click_tracking_user_properties from enwiki, maybe other schemas - https://phabricator.wikimedia.org/T115982#3025543 (jcrespo) a:jcrespo>None
[16:02:36] DBA, Operations, Patch-For-Review: Reduce memory commitment on database hosts with many objects, specially s3, dbstore/research and labs - https://phabricator.wikimedia.org/T107282#3025553 (jcrespo) Open>Resolved This was done on dbstore2 manifest. We have not seen reasons to do it on the oth...
[16:03:10] DBA, MediaWiki-Database: blob_tracking indexes apparently unused - https://phabricator.wikimedia.org/T59186#3025557 (jcrespo) a:jcrespo>None
[16:03:39] DBA, Collaboration-Team-Triage, Operations: Move echo tables from local wiki databases onto extension1 cluster for mediawikiwiki, metawiki, and officewiki - https://phabricator.wikimedia.org/T119154#3025558 (jcrespo) a:jcrespo>None
[16:58:14] DBA, Operations: Decommission db1015, db1035, db1044 and db1038 - https://phabricator.wikimedia.org/T148078#3025774 (jcrespo) Have a look at my shard planning, I think I had some options there.
[17:00:51] DBA, Operations: Decommission db1015, db1035, db1044 and db1038 - https://phabricator.wikimedia.org/T148078#3025793 (Marostegui) Yeah, you placed db1064 there as a master for it for s4. We need to make sure they have the same data as otherwise ROW will not like that
[17:05:41] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025811 (chasemp) On all three new servers :) ```2017-02-14 17:03:54,237 DEBUG SQL: CREATE OR REPLACE DEFINER=vie...
[17:08:40] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025834 (jcrespo) gnwikibooks?
[17:09:50] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025835 (jcrespo) Try now on labsdb1009.
[17:11:48] DBA, Operations: Decommission db1015, db1035, db1044 and db1038 - https://phabricator.wikimedia.org/T148078#3025843 (jcrespo) s4 is... special in that regard.
[17:12:51] DBA, Operations: Decommission db1015, db1035, db1044 and db1038 - https://phabricator.wikimedia.org/T148078#3025853 (Marostegui) worst case scenario we can move db1044's data to db1064 :-)
[17:14:00] can I kill your check data?
[17:14:11] yes
[17:14:21] it was a test
[17:14:23] it is making labsdb lag due to
[17:14:30] the cronjob is supposed to run 1 day per week at 5am
[17:14:31] alter table metadata locking
[17:14:38] :(
[17:15:31] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025866 (chasemp) >>! In T153743#3025835, @jcrespo wrote: > Try now on labsdb1009. nope > root@labsdb1009:~# maintain-views --databases enwiki --replac...
[17:16:20] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025868 (jcrespo) Then this is a different issue than the `FLUSH STATUS` one. Checking.
[17:21:39] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025880 (Marostegui) >>! In T153743#3025841, @chasemp wrote: >>>! In T153743#3025834, @jcrespo wrote: >> gnwikibooks? > > It seems order of creation for...
[17:24:09] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025885 (jcrespo) Maybe it lacks the grants for replace (drop).
I have dropped `[ugwiki_p]> drop view abuse_filter_action;`. Can you run the same query o...
[17:26:31] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025893 (chasemp) Yes, it does exist but to update the views it's easier to say "update all of them to match canonical source replacing any that already...
[17:27:09] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025895 (Marostegui) Could be but the grants are ALL as per this (labsdb1009) so I would assume the replace should go through no?: ``` GRANT ALL PRIVILEG...
[17:28:30] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025898 (chasemp) >>! In T153743#3025885, @jcrespo wrote: > Maybe it lacks the grants for replace (drop). I have dropped `[ugwiki_p]> drop view abuse_fil...
[17:33:20] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025952 (Marostegui) Silly question, are you connecting through `localhost` specifically? I am thinking about the issue we had in September where it was...
[17:33:41] yes, he is at localhost
[17:33:46] it complains @localhost
[17:33:49] ok ok
[17:34:30] i remember we had some issues during the offsite with localhost and loopback I believe, I cannot remember all the details and we had to grant something like 127.0.0.1 as well
[17:34:35] that was back on the old labs servers
[17:35:25] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3025971 (chasemp) >>!
In T153743#3025952, @Marostegui wrote: > Silly question, are you connecting through `localhost` specifically? > I am thinking about...
[17:36:13] sorry this is such a pain gents :) I have to lunch for a bit
[17:36:30] I tried to be verbatim on the task if you want to run those commands to recreate
[17:42:51] I can reproduce the problem if I log as the user
[17:45:24] CREATE VIEW enwiki_p.test AS SELECT 1; -> denied
[17:47:05] even after the flush privileges?
[17:47:05] ok, I think I got it
[17:47:10] what is it?
[17:47:20] for some reason, it needs CREATE VIEW and drop
[17:47:26] in addition to SELECT
[17:47:36] :?
[17:47:50] but that wasn't there last week when chase tried :|
[17:47:54] how did it work?
[17:48:02] I think something is changing the grants
[17:50:48] why does it need create view and drop on the original dbs?
[17:50:52] it makes no sense
[17:51:05] but it works with it, it doesn't work without it
[17:51:17] I am seeing that mysql was restarted 6 days ago
[17:51:28] which is before we tried all this
[17:51:37] so nothing that could've been changed after the restart or anything
[17:53:50] ˜/jynus 18:50> why does it need create view and drop on the original dbs? -> I would have never guessed that. select should be more than enough!
[17:57:40] "GRANT SELECT,DROP, CREATE VIEW ON `%wik%`.*" works
[17:58:03] the same on `%wik%\_p`.* doesn't
[18:00:00] I think I may know what is happening
[18:00:23] show some light
[18:00:33] there is a process creating users
[18:00:49] DBA, Operations, ops-codfw, Patch-For-Review: Change rack for servers in s1 in codfw - https://phabricator.wikimedia.org/T156478#3026081 (RobH) >>! In T156478#3025537, @Papaul wrote: > @RobH we about to move db2062 in row D rack D6 to row B rack 5. I will like for you please if you have time to m...
[18:00:49] I am almost sure this user used to have the labsdb user role
[18:01:16] the process may be revoking that role for this user because it is not a valid user?
[18:01:19] let me try
[18:01:58] ah right, we are using roles here
[18:03:22] DBA, Operations, ops-codfw, Patch-For-Review: Change rack for servers in s1 in codfw - https://phabricator.wikimedia.org/T156478#3026085 (Marostegui) db2062 has been moved to B5 DNS updated db-eqiad,codfw files updated mysql started replication started and server catching up tendril updated Than...
[18:10:29] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3026114 (jcrespo) I've added a workaround that makes no sense but that works for now, we need to revisit it in the future. Maybe there is a bug on the la...
[20:04:15] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: Migrate labsdb1005/1006/1007 to jessie - https://phabricator.wikimedia.org/T123731#3026797 (yuvipanda) We announced a while ago we're gonna do this on the 15th.
[20:05:56] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: Migrate labsdb1005/1006/1007 to jessie - https://phabricator.wikimedia.org/T123731#3026807 (jcrespo) Let's use T157358 for this. Postgres is a different beast.
[22:02:51] Blocked-on-schema-change, DBA, Community-Tech, MediaWiki-extensions-PageAssessments: Update page_assessments_projects schema for subprojects in production - https://phabricator.wikimedia.org/T156305#3027389 (kaldari) Created new task at T158097.
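Editor's aside on the grant patterns quoted at 17:57 and 17:58: the two database-name patterns overlap, which may be worth keeping in mind when the workaround is revisited. The Python sketch below only illustrates how the wildcards `%`, `_` and `\_` in such a pattern match names. It is not MySQL's own matcher, the helper names `like_to_regex` and `like_matches` are made up for illustration, and it ignores case-sensitivity settings. It makes no claim about why the narrower grant did not work.

```python
import re


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a MySQL-style LIKE pattern into a compiled regex.

    `%` matches any run of characters, `_` matches exactly one character,
    and a backslash makes the following character literal.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))  # escaped char is literal
            i += 2
            continue
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def like_matches(pattern: str, name: str) -> bool:
    """True when the whole name matches the pattern."""
    return like_to_regex(pattern).fullmatch(name) is not None


# The two patterns from the log, checked against a source db and its _p views db.
for pattern in ("%wik%", r"%wik%\_p"):
    for name in ("enwiki", "enwiki_p"):
        print(pattern, name, like_matches(pattern, name))
```

Under these matching rules the broad pattern covers both the source databases and the `_p` databases, while the narrow one covers only the `_p` databases.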