[00:56:11] 10DBA, 06Operations, 10ops-codfw, 13Patch-For-Review: codfw rack/setup 22 DB servers - https://phabricator.wikimedia.org/T162159#3170256 (10Papaul) [05:56:50] 10DBA, 13Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3170411 (10Marostegui) db1073 is now done: ``` root@neodymium:~# mysql --skip-ssl -hdb1073 enwiki -e "show create table revision\G" *************************** 1. row *... [06:00:56] 10DBA, 13Patch-For-Review: Unify revision table on s7 - https://phabricator.wikimedia.org/T160390#3170412 (10Marostegui) labsdb1001 is done: ``` [root@labsdb1001 05:59 /root] # for i in `cat s7_T160390`; do echo $i; mysql --skip-ssl $i -e "show create table revision\G" ; done arwiki ***************************... [06:49:48] 10DBA, 13Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3170529 (10Marostegui) dbstore2001 is done: ``` root@neodymium:/home/marostegui# mysql --skip-ssl -hdbstore2001.codfw.wmnet enwiki -e "show create table revision\G" ***... [09:10:34] 10DBA: Unify revision table on s2 - https://phabricator.wikimedia.org/T162611#3170691 (10Marostegui) Started the ALTERs on dbstore2001. [10:36:52] 07Blocked-on-schema-change, 10Wikidata, 03Wikidata-Sprint: Deploy schema change for adding term_full_entity_id column to wb_terms table - https://phabricator.wikimedia.org/T162539#3170794 (10aude) I am not sure we are deploying new Wikidata code this week. (we normally deploy every other week) If you really... [12:16:48] 07Blocked-on-schema-change, 10Wikidata, 03Wikidata-Sprint: Deploy schema change for adding term_full_entity_id column to wb_terms table - https://phabricator.wikimedia.org/T162539#3171058 (10Marostegui) >>! In T162539#3170794, @aude wrote: > I am not sure we are deploying new Wikidata code this week. (we nor... 
[12:28:15] 10DBA: Network maintenance on row D - https://phabricator.wikimedia.org/T162681#3171082 (10Marostegui) [12:28:31] 10DBA: Network maintenance on row D - https://phabricator.wikimedia.org/T162681#3171082 (10Marostegui) [12:29:38] 10DBA: Network maintenance on row D (databases) - https://phabricator.wikimedia.org/T162681#3171100 (10ayounsi) [12:33:26] 10DBA, 13Patch-For-Review: Unify revision table on s7 - https://phabricator.wikimedia.org/T160390#3171119 (10Marostegui) db1041, the primary master has been altered: ``` root@neodymium:/home/marostegui# for i in `cat s7_T160390`; do echo $i; mysql --skip-ssl -hdb1041.eqiad.wmnet $i -e "show create table revisi... [12:36:42] 10DBA: Network maintenance on row D (databases) - https://phabricator.wikimedia.org/T162681#3171124 (10Marostegui) We want to do some master switchovers while eqiad is on sby, so we'd need to coordinate it too: T162133 [12:37:45] 07Blocked-on-schema-change, 10DBA, 05MW-1.28-release (WMF-deploy-2016-08-30_(1.28.0-wmf.17)), 05MW-1.28-release-notes, 13Patch-For-Review: Clean up revision UNIQUE indexes - https://phabricator.wikimedia.org/T142725#3171128 (10Marostegui) s7 (arwiki cawiki eswiki fawiki hewiki huwiki kowiki metawiki rowi... [12:49:39] 10DBA, 05codfw-rollout: Analyze if we want to replace some masters in eqiad while it is not active - https://phabricator.wikimedia.org/T162133#3171205 (10Marostegui) Let's coordinate with @ayounsi before attempting any switchover any of the masters to make sure T148506 and T162681 are not in the way of this. [13:11:12] 07Blocked-on-schema-change, 10Wikidata, 03Wikidata-Sprint: Deploy schema change for adding term_full_entity_id column to wb_terms table - https://phabricator.wikimedia.org/T162539#3171242 (10aude) @Marostegui I think starting around the ~20th (or whatever you think best) is good with us. 
[14:47:10] 10DBA, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata, 13Patch-For-Review, 03Wikidata-Sprint: [Task] Remove "all" option for Special:EntitiesWithout*" - https://phabricator.wikimedia.org/T161631#3137680 (10WMDE-leszek) > Done? ping @Lydia_Pintscher @daniel [14:48:35] 10DBA, 06Operations, 10ops-eqiad, 13Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3171536 (10Cmjohnson) @Marostegui | db1096 |a1 **(no available u space — pick another location)** | db1097|d1 **(No issues)** | db1098|a2 **(Will definitely need a decom se... [14:50:59] 10DBA, 10Wikidata, 03Wikidata-Sprint: Wikibase\Repo\Store\Sql\SqlEntitiesWithoutTermFinder::getEntitiesWithoutTerm can take 19 hours to execute and it is run by the web requests user - https://phabricator.wikimedia.org/T160887#3171539 (10Lydia_Pintscher) [14:56:58] 10DBA, 06Operations, 10ops-eqiad, 13Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3171602 (10Marostegui) Thanks @Cmjohnson, what about these changes: ``` db1096 - a6 db1098 - b5 db1099 - d3 ``` [15:02:18] 10DBA, 06Operations: Decommission db1015, db1035, db1044 and db1038 - https://phabricator.wikimedia.org/T148078#3171633 (10jcrespo) [15:02:23] 10DBA, 06Operations, 13Patch-For-Review: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476#3171634 (10jcrespo) [15:02:25] 10DBA, 13Patch-For-Review: run pt-table-checksum on s2 (WAS: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038) - https://phabricator.wikimedia.org/T154485#3171631 (10jcrespo) 05Open>03Resolved The modified task is for me complete- run pt-table-checksum on s2. I have not checked aev... [15:03:00] 10DBA, 13Patch-For-Review: run pt-table-checksum on s2 (WAS: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038) - https://phabricator.wikimedia.org/T154485#3171649 (10Marostegui) Yaaaaaaaay!!! 
:-) Thanks for the hard archeology work! [15:05:43] 10DBA, 13Patch-For-Review: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3171664 (10jcrespo) a:05Marostegui>03jcrespo I am claiming this just for coordination purposes, not meaning I do not recognize you (Manuel) have done most of the work already. [15:05:56] 10DBA, 13Patch-For-Review: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3171666 (10jcrespo) p:05Triage>03High [15:06:47] 10DBA, 13Patch-For-Review: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3171672 (10Marostegui) >>! In T160509#3171664, @jcrespo wrote: > I am claiming this just for coordination purposes, not meaning I do not recognize you (Manuel) have done most of the work already. Do not even need... [15:09:00] 10DBA, 06Operations: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036) - https://phabricator.wikimedia.org/T162699#3171683 (10jcrespo) [15:09:29] 10DBA, 06Operations: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036) - https://phabricator.wikimedia.org/T162699#3171683 (10jcrespo) Not for dc ops yet. 
[15:10:36] 10DBA, 05codfw-rollout: Analyze if we want to replace some masters in eqiad while it is not active - https://phabricator.wikimedia.org/T162133#3171705 (10jcrespo) [15:10:38] 10DBA, 06Operations: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036) - https://phabricator.wikimedia.org/T162699#3171704 (10jcrespo) [15:11:35] 10DBA, 06Operations: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036) - https://phabricator.wikimedia.org/T162699#3171683 (10jcrespo) [15:12:22] 10DBA, 06Operations: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036) - https://phabricator.wikimedia.org/T162699#3171709 (10Marostegui) [15:12:32] 10DBA, 13Patch-For-Review: run pt-table-checksum on s2 (WAS: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038) - https://phabricator.wikimedia.org/T154485#3171711 (10jcrespo) [15:12:34] 10DBA, 06Operations: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036) - https://phabricator.wikimedia.org/T162699#3171710 (10jcrespo) [15:12:53] 10DBA, 13Patch-For-Review: run pt-table-checksum on s2 (WAS: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038) - https://phabricator.wikimedia.org/T154485#3045895 (10jcrespo) [15:12:55] 10DBA, 06Operations: Decommission db1015, db1035, db1044 and db1038 - https://phabricator.wikimedia.org/T148078#3171712 (10jcrespo) [15:21:10] 10DBA, 06Operations, 10ops-eqiad, 13Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3171812 (10Cmjohnson) I need to check b5, it's a 24pt switch not 48. I believe there is 1 more available 1G port. [15:23:30] 10DBA, 06Operations, 10ops-eqiad, 13Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3171835 (10Marostegui) >>! In T162233#3171812, @Cmjohnson wrote: > I need to check b5, it's a 24pt switch not 48. I believe there is 1 more available 1G port. If not, we ca... 
[15:47:00] jynus, marostegui: anything against performing a codfw cache-wipe + warmup to test the procedure from your side? [15:47:31] volans: you'll hit codfw core slaves + es? [15:48:07] I'll hit whatever the mediawiki-cache-warmup hits :D [15:48:10] but yeah, that's the idea [15:48:39] no issues from my side [15:48:47] if it is codfw XD [15:49:24] this is the task [15:49:24] https://github.com/wikimedia/operations-switchdc/blob/master/switchdc/stages/t04_cache_wipe.py [15:50:07] last time joe did it (quite some weeks ago) it was fine, so I assume things have changed, but it should be fine [15:50:37] go ahead I would say [15:50:53] the script should be the same, no changes AFAIK, just that now is coded inside the switchdc stuff [15:50:56] ok, thanks [15:55:19] I am about to run a large replace on db1030 [15:55:36] s6, right? [15:55:41] yes [15:55:51] i am really worried I know that from memory [15:56:25] marostegui: the living tendril-tree :-P [15:56:29] https://www.youtube.com/watch?v=9C4uTEEOJlM [15:57:04] hahaha jynus!!! [15:57:27] test completed, for what is worth (also failed, looking at nodejs logs) [15:57:53] nodejs? [15:58:24] the warmup script was done by timo in nodejs [15:58:30] ah [15:59:00] haha nice https://grafana-admin.wikimedia.org/dashboard/db/mysql-aggregated?panelId=8&fullscreen&orgId=1&var-dc=codfw%20prometheus%2Fops&var-group=core&var-shard=es1&var-shard=es2&var-shard=es3&var-shard=s1&var-shard=s2&var-shard=s3&var-shard=s4&var-shard=s5&var-shard=s6&var-shard=s7&var-role=All [15:59:49] I might need to re-run it after we fix the small issue [15:59:57] what was it? [16:00:11] missing / in a path :( blame _joe_ though :D [16:07:12] jynus, marostegui FYI testing again, if still ok [16:07:18] go! [16:11:52] done [16:12:01] :) [16:19:26] marostegui: so looking at the dashboard, seems that the second time the impact was way less [16:19:43] should we run this warmup also "before" the start of the switchover? 
[16:19:56] seems like it warms the db a bit too and it's quicker the second time ;) [16:20:23] <_joe_> yes [16:23:14] I would go ahead and do it yes [16:23:23] We are thinking about doing some warm ups from our side too [16:23:46] jynus: marostegui we had a bunch of database query errors on ruwiki ( https://logstash.wikimedia.org/goto/37ba8f721daf247e92f001d619e871a1 ) [16:23:50] probably on other wikis [16:24:16] mostly complaining about the server that went away [16:24:23] <_joe_> so, another attempt happening now [16:24:26] and there was one for ruwiki that is quite concerning: Error: 1176 Key 'pl_from' doesn't exist in table 'pagelinks' (10.64.48.152) [16:24:40] <_joe_> marostegui: ^^ codfw will be *hot* :P [16:24:48] pagelinks? [16:24:57] did you have a deploy? [16:25:01] hashar: we are doing some maintenance over s6 at the moment (where ruwiki is) so that could explain the "has gone away" errors, the pagelinks thing is different [16:25:08] but not on pagelinks [16:25:42] Key 'pl_from' doesn't exist is a deployment-related error [16:26:56] I think I know where that is coming from [16:27:09] what? [16:27:10] it is only on db1093, where we dropped the UNIQUE key pl_from and converted it to PK [16:27:14] seems the pl_from missing are glitches from this morning https://logstash.wikimedia.org/goto/7df60cb57d69039502d3c1f3814432da [16:27:25] but that was done some weeks ago [16:27:27] not today [16:27:45] but that was done per request, right? [16:27:56] on merged deploy, right? [16:28:20] I don't think so [16:28:25] or is it the planned changes? [16:28:44] well from logstash link above there were some at 6:40 and a few others at 10:10 [16:30:07] No, I am seeing that error on the 30th of March too on frwiki which is s6 [16:30:33] for that same host (the only one in the shard with that index removed) [16:30:48] well, removed -> converted to PK [16:31:12] which code is forcing that index? 
because that is most likely a problem [16:31:39] ApiQueryLinks::run [16:31:44] yep [16:32:03] so we'd better depool db1093 and revert that [16:32:09] yes [16:32:19] and there is another one "Error: 1176 Key 'img_user_timestamp' doesn't exist in table 'image' " from ApiQueryAllImages.php [16:32:30] (was on (10.64.32.136) ) [16:34:14] I can add an index with that name (and those columns) [16:34:50] yes, a duplicate index [16:34:55] yep [16:34:57] probably the better option [16:35:00] yeah [16:35:04] so that we can transition [16:35:07] I will do that once it gets depooled [16:35:09] document the FORCE [16:35:13] yep [16:35:15] so it also gets changed [16:35:24] and gets listed among the renamed indexes [16:35:32] wilco [16:35:51] thank you :-} [16:35:52] img_user_timestamp [16:35:56] is different [16:36:57] hashar, which wiki? [16:37:14] there are 917 wikis on that host [16:37:37] jynus: foundationwiki [16:37:47] according to https://logstash.wikimedia.org/goto/72a7394fe05e3bc3f7d1c94a5424f59b [16:38:12] most probably you would want some kind of logstash dashboard to highlight those? [16:38:28] we already have one [16:38:30] ah [16:38:32] but it is full of garbage [16:39:01] I guess it lacks reduplication? [16:39:05] deduplication [16:41:44] marostegui, I do not see https://phabricator.wikimedia.org/T160415 applied to s3 [16:44:36] mmmm maybe I forgot s3?? [16:44:51] forgot 800/900 wikis? 
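Editorial sketch of the failure mode discussed above, assuming a simplified pagelinks table. It uses SQLite's `INDEXED BY` clause as a stand-in for a MySQL index hint: SQLite is not what the cluster runs, and the column list here is an illustrative assumption, but the behaviour is analogous. A query that names an index fails once that index has been folded into the primary key, and works again after a duplicate index with the old name is added.

```python
import sqlite3


def run_hinted_query(conn):
    """Run a query that names the index it expects (analogous to an index hint)."""
    return conn.execute(
        "SELECT pl_title FROM pagelinks INDEXED BY pl_from WHERE pl_from = 1"
    ).fetchall()


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table: the old UNIQUE key pl_from is now the PK,
# so no index called pl_from exists any more.
conn.execute(
    "CREATE TABLE pagelinks ("
    " pl_from INTEGER NOT NULL,"
    " pl_namespace INTEGER NOT NULL,"
    " pl_title TEXT NOT NULL,"
    " PRIMARY KEY (pl_from, pl_namespace, pl_title))"
)
conn.execute("INSERT INTO pagelinks VALUES (1, 0, 'Main_Page')")

# The hinted query fails because the named index is gone.
try:
    run_hinted_query(conn)
    hint_failed = False
    error_message = ""
except sqlite3.OperationalError as exc:
    hint_failed = True
    error_message = str(exc)

# Transitional fix: add a duplicate index that carries the old name.
conn.execute(
    "CREATE INDEX pl_from ON pagelinks (pl_from, pl_namespace, pl_title)"
)
rows_after_fix = run_hinted_query(conn)
```

The duplicate index costs extra storage and write work, which is why the log treats it as a transition step until the hint in the code is changed.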
[16:44:58] don't know [16:45:00] give me a sec [16:45:15] 07Blocked-on-schema-change, 10DBA, 06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3172211 (10jcrespo) 05Resolved>03Open [16:45:18] sounds strange that I forgot it [16:45:55] 07Blocked-on-schema-change, 10DBA, 06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3106145 (10jcrespo) p:05Triage>03High I do not see this applied... [16:47:50] on the other hand, hashar, someone added an index hint and didn't add us as reviewers, which is a big no [16:50:34] maybe we can make CI catch that [16:50:43] based on a reference schema living outside the mediawiki repos [16:51:18] no, difficult [16:51:23] ok, I am running the alters on db1093 [16:51:24] that is a pure code thing [16:51:27] I will start working on s3 now [16:52:00] it is not like modifying tables.sql or so [16:52:07] ;( [16:52:14] oh, you mean like [16:52:33] detecting that the index exists when used as a hint [16:52:39] doable [16:53:01] but needs people to create unit tests [16:53:10] when a dev adds a FORCE_INDEX I guess they also add a .sql patch file in the repo [16:53:17] no [16:53:24] that is the part that is independent [16:53:26] they just don't give a shit ? 
(sorry for the bad language) [16:53:46] you can force an index that doesn't exist, or that exists but hasn't been deployed yet [16:54:12] it is just query(,,[force => 'bla bla']) [16:54:23] doesn't require a schema change [16:54:33] what we can do is maybe detect all instances [16:54:41] and document the hell out of them [16:55:01] and force unit tests that point to the use case they solve [16:55:30] I can also maintain a table.sql with the current, rather than the desired schema [16:55:47] to make it easier. Which repo could I use for that? [16:57:03] don't you already have a python script that captures the current state of all schemas? [16:57:21] nope [16:57:25] db1078 is done [16:57:25] I wish [16:57:31] going with the other big server now [16:59:04] jynus: I was referring to the db check tool at https://gerrit.wikimedia.org/r/#/c/256231/ [16:59:08] but that is probably different :D [16:59:09] the other one is also done, so errors should be minimal now as those two servers serve most of the main traffic [16:59:18] hashar, that is horrible [16:59:30] but it is still a draft [16:59:39] well that is a start :} [17:00:02] 07Blocked-on-schema-change, 10DBA, 06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3172287 (10Marostegui) Both main servers have been done now, so err... 
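Editorial sketch of the check floated above (detect index hints that name an index missing from the current schema). This is not the db check tool linked in the log and not an existing CI job. The function names, the schema text, and the hint list format are all hypothetical, and the regex-based parsing only covers simple `CREATE TABLE` statements.

```python
import re

# Matches "CREATE TABLE name ( body ) options;" for simple schema dumps.
CREATE_TABLE_RE = re.compile(
    r"CREATE TABLE\s+`?(\w+)`?\s*\((.*?)\)[^;()]*;", re.S | re.I
)
# Matches "KEY name (", "UNIQUE KEY name (" and "INDEX name (".
KEY_RE = re.compile(r"\b(?:KEY|INDEX)\s+`?(\w+)`?\s*\(", re.I)


def indexes_by_table(schema_sql):
    """Map each table name to the set of index names found in its definition."""
    result = {}
    for table, body in CREATE_TABLE_RE.findall(schema_sql):
        names = set(KEY_RE.findall(body))
        if re.search(r"PRIMARY\s+KEY", body, re.I):
            names.add("PRIMARY")
        result[table] = names
    return result


def missing_hints(hints, schema_sql):
    """Return the (source, table, index) hints whose index is not in the schema."""
    known = indexes_by_table(schema_sql)
    return [h for h in hints if h[2] not in known.get(h[1], set())]


# Hypothetical "current schema" file; columns are simplified assumptions.
schema = """
CREATE TABLE pagelinks (
  pl_from int NOT NULL,
  pl_namespace int NOT NULL,
  pl_title varbinary(255) NOT NULL,
  PRIMARY KEY (pl_from, pl_namespace, pl_title),
  KEY pl_namespace (pl_namespace, pl_title, pl_from)
) ENGINE=InnoDB;
CREATE TABLE image (
  img_name varbinary(255) NOT NULL,
  img_user int NOT NULL,
  img_timestamp varbinary(14) NOT NULL,
  PRIMARY KEY (img_name)
);
"""

# Hints as they might be collected from the code base (collection not shown).
hints = [
    ("ApiQueryLinks::run", "pagelinks", "pl_from"),
    ("ApiQueryAllImages.php", "image", "img_user_timestamp"),
    ("example caller", "pagelinks", "pl_namespace"),
]

missing = missing_hints(hints, schema)
for source, table, index in missing:
    print(f"{source}: index {index!r} not found on table {table!r}")
```

Collecting the hints from the code is the hard part, as the log notes, since a hint is only a query option and involves no schema file change. A check like this only helps once those instances are listed somewhere.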
[17:01:20] anyway have to move out [17:01:23] * hashar waves [17:01:30] thanks for the heads up hashar [17:01:38] hashar, thanks for reporting these [17:01:45] you are welcome [17:02:00] you caught it before it got worse [17:02:20] the dbquery error report actually came from Elena who was doing some QA testing [17:02:27] db1093 was a calculated risk [17:02:32] I cannot believe I forgot s3 [17:02:37] so embarrassing [17:02:43] np [17:02:45] and most probably we could polish up the logstash dashboard and/or mediawiki logging [17:02:56] hashar, someone changed the database logging [17:03:01] ;( [17:03:03] and now it is a mess for me [17:03:06] welcome to our bazaar! [17:03:13] <_joe_> ok, another pass! [17:03:14] but yeah I feel the pain of things ever moving [17:03:19] I have to check 3 different channels and the exception [17:03:21] and hhvm [17:03:26] to get db-related stuff [17:04:31] joe, not sure what you did- but after the latest pass, we have much less db traffic on codfw [17:04:52] maybe it wasn't you [17:05:34] eqiad is altered now, all the serving hosts have the new index [17:05:46] jynus: that higher traffic started yesterday [17:05:52] ? [17:06:00] 6am [17:06:16] <_joe_> jynus: I'll try again now [17:06:18] then increased at 13 [17:07:25] 07Blocked-on-schema-change, 10DBA, 06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3172330 (10Marostegui) All eqiad main servers are now done, so erro... 
[17:08:15] I am looking at lower rows read since 16 UTC today [17:08:50] https://grafana.wikimedia.org/dashboard/db/mysql-aggregated?panelId=8&fullscreen&orgId=1&from=1491844120411&to=1491930520412&var-dc=codfw%20prometheus%2Fops&var-group=core&var-shard=s1&var-shard=s2&var-shard=s3&var-shard=s4&var-shard=s5&var-shard=s6&var-shard=s7&var-shard=x1&var-role=All [17:09:26] for which I do not really have an explanation [17:10:09] mostly s4 and s5 [17:10:22] jynus: yes but go back over the last 2 days [17:10:28] ah [17:10:28] and you'll see it was down before [17:10:31] let's see [17:10:46] I see now [17:10:56] and before that it was higher and lower, it's not constant [17:11:30] that is a mystery, maybe some write process sending reads through replication? [17:12:03] oh, I know [17:12:10] pt-table-checksum would fit there [17:12:16] so no worries [17:12:40] it is good there is some activity- warming up the slaves [17:15:02] eheheh [17:15:38] although your script doesn't touch es2* servers much, which was kind of the original reason [17:17:19] it's not mine :D [17:17:49] I know [17:18:02] 07Blocked-on-schema-change, 10DBA, 06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3172413 (10Marostegui) db1069 (sanitarium) db1095 (sanitarium2), la... [17:18:06] short for "the one you are running on behalf of those that have created it" [17:18:24] :-P [17:18:29] I shared a theory with manuel that the script does nothing [17:18:52] but we say it worked because no problem happened on the last codfw -> eqiad transition [17:18:53] at DB level? or app level? 
[17:19:12] going to repool db1093 after adding the pl_from index [17:19:19] at DB level meaning the issue that es hosts got saturated last time [17:19:39] I'm pretty sure that doesn't help [17:19:39] and that maybe the issues will repeat [17:19:59] it's pre-warming mainly apc and memcache caches [17:20:06] for a few pages [17:20:08] yeah, the question [17:20:18] is if that is enough to not cause es contention [17:20:26] for the es, it is probably easier for you to run some warmup queries? [17:20:29] or it was because eqiad was already warmed up [17:20:42] volans, do not worry [17:20:47] we have been working on that [17:21:28] but with almost 20 TB of data, it is difficult to know how to warm up effectively [17:21:57] yeah! [17:22:08] plus it takes forever- 16 minutes just to go over the last X years of testwiki [17:23:20] what have you chosen to warm up? getting RO query logs from eqiad and replaying them? [17:23:35] enwiki, latest X years [17:24:15] 07Blocked-on-schema-change, 10DBA, 06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3172455 (10Marostegui) dbstore1001 and dbstore1002 are done [17:24:34] ok [17:24:43] replication already warms up the latest edits [17:24:52] we need to do it with older, but still used ones [17:25:48] yeah, that's why I was thinking about replaying the user's traffic, at least a bit of it [17:26:24] but even that is not that useful [17:26:51] given that traffic to the db is different from the one with cached revisions or preparsed ones [17:29:09] right [17:30:55] 07Blocked-on-schema-change, 10DBA, 06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3172480 (10jcrespo) p:05High>03Normal [18:11:46] 07Blocked-on-schema-change, 10DBA, 
06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3172658 (10Marostegui) dbstore2002 and dbstore2001 are done. [18:21:54] 10DBA, 13Patch-For-Review: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3172687 (10jcrespo) Core servers have been all checked an fixed. Only ones missing are: ``` dbstore1001 +--------+----------+------------+--------+ | db | tbl | total_rows | chunks | +--------+---------... [18:25:53] 10DBA, 10Wikidata: Repeated reports of wikidatawiki (s5) API going read only - https://phabricator.wikimedia.org/T123867#3172690 (10Ladsgroup) @Multichill: Do you get such issues recently? My bot stopped getting them [18:51:38] 10DBA, 10MediaWiki-Database, 13Patch-For-Review, 07PostgreSQL, 07Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441#3172790 (10jcrespo) tl_from may need the same fix than pl_from, I am seeing some errors on db1093 on ruwiki. [18:55:27] 10DBA, 10ArchCom-RfC, 10MediaWiki-Database, 07RfC: Should we bump minimum supported MySQL Version? - https://phabricator.wikimedia.org/T161232#3172816 (10Reedy) [18:59:55] 10DBA, 10MediaWiki-Database, 13Patch-For-Review, 07PostgreSQL, 07Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441#3172819 (10Marostegui) >>! In T17441#3172790, @jcrespo wrote: > tl_from may need the same fix than pl_from... [19:02:01] 10DBA, 10MediaWiki-Database, 13Patch-For-Review, 07PostgreSQL, 07Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441#3172825 (10Marostegui) >>! In T17441#3172819, @Marostegui wrote: >>>! In T17441#3172790, @jcrespo wrote: >... 
[19:07:19] 10DBA, 10ArchCom-RfC, 10MediaWiki-Database, 07RfC: Should we bump minimum supported MySQL Version? - https://phabricator.wikimedia.org/T161232#3172850 (10Reedy) Crossposted to mailing lists https://lists.wikimedia.org/pipermail/mediawiki-l/2017-April/046494.html https://lists.wikimedia.org/pipermail/wikit... [19:15:03] 10DBA, 10MediaWiki-Database, 13Patch-For-Review, 07PostgreSQL, 07Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441#3172927 (10jcrespo) I do not think this is an unbreak now- we can probably wait until tomorrow. This went... [19:16:38] 07Blocked-on-schema-change, 10DBA, 06Multimedia, 05MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3172932 (10Marostegui) All codfw slaves are done: ``` db2057 db2050... [20:19:39] 10DBA, 13Patch-For-Review: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3173259 (10jcrespo) dbstore1001 and dbstore1002 fixed, checksums for dbstore2001 pending. dbstore2001 seems with smaller errors, 1002 required almost full table reimports. [21:08:45] 10DBA, 10ArchCom-RfC, 10MediaWiki-Database, 07RfC: Should we bump minimum supported MySQL Version? - https://phabricator.wikimedia.org/T161232#3125897 (10Legoktm) For MariaDB, is it the same version number? [21:26:41] 10DBA, 10ArchCom-RfC, 10MediaWiki-Database, 07RfC: Should we bump minimum supported MySQL Version? - https://phabricator.wikimedia.org/T161232#3173416 (10Reedy) >>! In T161232#3173372, @Legoktm wrote: > For MariaDB, is it the same version number? I think for 5.5 yes.. For when we only support mysql >= 5.... [21:30:02] 10DBA, 10ArchCom-RfC, 10MediaWiki-Database, 07RfC: Should we bump minimum supported MySQL Version? 
- https://phabricator.wikimedia.org/T161232#3125897 (10Kghbln) > While the vast majority (over 65%) is on MySQL 5.5 or higher, MySQL 5.1 at 28% is still quite significant. From my experience: I guess this is... [21:53:22] 10DBA, 10ArchCom-RfC, 10MediaWiki-Database, 07RfC: Should we bump minimum supported MySQL Version? - https://phabricator.wikimedia.org/T161232#3173448 (10Krinkle) >>! In T161232#3173421, @Kghbln wrote: >>>! @Krinkle wrote: >> While the vast majority (over 65%) is on MySQL 5.5 or higher, MySQL 5.1 at 28% is... [21:54:00] 10DBA, 10Wikidata, 07Performance, 15User-Daniel, and 2 others: Use redis-based lock manager in dispatch changes in production - https://phabricator.wikimedia.org/T159826#3173453 (10Ladsgroup) The patch is there (https://gerrit.wikimedia.org/r/#/c/347395/) and I talked to Ops, It seems they are okay with me... [21:54:30] 10DBA, 06Operations, 10ops-codfw, 13Patch-For-Review: codfw rack/setup 22 DB servers - https://phabricator.wikimedia.org/T162159#3173455 (10Papaul) [22:21:14] 10DBA, 10ArchCom-RfC, 10MediaWiki-Database, 07RfC: Should we bump minimum supported MySQL Version? - https://phabricator.wikimedia.org/T161232#3173531 (10Kghbln) > This shows a figure of 8% instead of 28%, which would include your example of a hosting provider upgrading PHP but not MySQL. Yeah, that's tru...