[06:47:24] DBA, Operations, ops-eqiad: Degraded RAID on db1070 - https://phabricator.wikimedia.org/T158969#3105045 (Marostegui) >>! In T158969#3071469, @Cmjohnson wrote: > db1070 is under warranty for 2 more months. Requested new part from Dell > > Congratulations: Work Order SR944780612 was successfully submi...
[06:51:04] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3105046 (Marostegui) dbstore1001 got the change: ``` CREATE TABLE `cx_translations` ( `translation_id` int(1...
[07:05:48] DBA: s5: db1070 not using file per table - https://phabricator.wikimedia.org/T157931#3105060 (Marostegui) db1070 finished importing all the files. It is now replicating and trying to catch up ``` Seconds_Behind_Master: 247481 ```
[07:34:48] DBA: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3105135 (Marostegui) I have just started the check on frwiki.
[07:44:12] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3105144 (Marostegui) s7 is fully done after doing eqiad hosts: db...
[07:56:55] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3105156 (Marostegui) And finally the last host, dbstore2001 got the change too: ``` root@DBSTORE[wikishared]>...
[10:06:19] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3105436 (jcrespo) I finished dbstore1002 too (again, only for the tables with PKs) for s2, I will fix dbstore1001 now.
[10:06:40] ^ awesome archeology job, lots of thanks
[10:22:58] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3105473 (Marostegui) s1 is fully done: ``` root@neodymium:/home/...
[10:33:16] DBA, Analytics, Labs: Discuss labsdb visibility of rev_text_id and ar_comment - https://phabricator.wikimedia.org/T158166#3105483 (ArielGlenn) After my initial ping, I dropped this ball. Picking it back up again.
[13:01:30] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3105883 (KartikMistry) Thanks @Marostegui and @jcrespo !
[13:41:20] DBA, Operations, ops-codfw: es2015 crashed on 2017-03-11 - https://phabricator.wikimedia.org/T160242#3105953 (Marostegui) @Papaul es2015 is now off. Please turn it on once you are done with the main board replacement. Thank you!
[14:05:33] Blocked-on-schema-change, DBA: *_minor_mime are varbinary(32) on WMF sites, out of sync with varbinary(100) in MW core - https://phabricator.wikimedia.org/T73563#770351 (Marostegui) Hi, The current status of this change in s4 (commonswiki): filearchive table - all of them have: `fa_minor_mime varbinar...
[14:26:18] DBA, Schema-change, Tracking: Schema changes for Wikimedia wikis (tracking) - https://phabricator.wikimedia.org/T51188#3106084 (jcrespo)
[14:26:21] Blocked-on-schema-change, DBA: *_minor_mime are varbinary(32) on WMF sites, out of sync with varbinary(100) in MW core - https://phabricator.wikimedia.org/T73563#3106080 (jcrespo) stalled>Open a:Marostegui
[14:26:37] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3106086 (jcrespo) a:Marostegui>jcrespo
[14:36:20] DBA, Patch-For-Review: s5: db1070 not using file per table - https://phabricator.wikimedia.org/T157931#3106122 (Marostegui) So, db1070 has caught up. I have repooled it with a low weight to see how it behaves. Replication had no issues. I will leave it like that (increasing its weight) until Monday or so...
[14:36:29] DBA, Patch-For-Review: s5: db1070 not using file per table - https://phabricator.wikimedia.org/T157931#3106124 (Marostegui) a:Marostegui
[14:39:55] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3106132 (jcrespo) dbstore1001 is fixed for s2, again with the exception of tables without PKs and zhwiki, which is finishing at this point. Going with db1036.
[14:59:20] DBA, Operations, ops-eqiad: Degraded RAID on db1070 - https://phabricator.wikimedia.org/T158969#3106257 (Cmjohnson) @Marostegui disk is rebuilding Enclosure Device ID: 32 Slot Number: 10 Drive's position: DiskGroup: 0, Span: 5, Arm: 0 Enclosure position: 1 Device Id: 10 WWN: 500003978859EC70 Sequenc...
[15:01:36] DBA, Operations, ops-eqiad: Degraded RAID on db1070 - https://phabricator.wikimedia.org/T158969#3106273 (Marostegui) Awesome!! Thank you!
```
root@db1070:~# megacli -PDRbld -ShowProg -PhysDrv [32:10] -aALL
Rebuild Progress on Device at Enclosure 32, Slot 10 Completed 1% in 5 Minutes.
```
[15:08:53] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3106338 (Nuria) ping @jcrespo. Could you give us an ETA on when we could start doing these changes?
[15:09:14] ^ I will reply to that jynus
[15:11:14] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3106339 (Marostegui) >>! In T160454#3102530, @Nuria wrote: > Excellent, let us know what you think is a good time on your end to do this and we will take an outage accordingly. For us, the s...
[15:11:37] "master (db1046) and slave (db1047)"
[15:11:48] there are 2 slaves db1047 and dbstore1002
[15:11:56] :)
[15:12:04] amending!
[15:12:15] thank you
[15:12:35] there used to be a third one, dbstore2002, but they didn't want to touch it, so now there is no cross-dc backup for that
[15:13:38] so once we do the dc switchover…?
[15:13:50] they will keep reading from eqiad anyways?
[15:14:02] there is no stats infrastructure on codfw
[15:14:12] the same way there is no labs infrastructure there
[15:14:16] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3106348 (Nuria) >Could you give us the exact list of tables that would need to be renamed? Yes, will compile today and post here.
[15:15:07] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3106349 (Marostegui) >>! In T160454#3106348, @Nuria wrote: >>Could you give us the exact list of tables that would need to be renamed? > Yes, will compile today and post here. Great - thanks!
[15:41:28] DBA, Analytics, Analytics-EventLogging, ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#2497306 (Nuria) Moving to radar as I think it is taken care of by DBAs
[15:42:12] ^see
[15:42:23] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3106486 (jcrespo) db1036 is now ok. dbstore2001.codfw.wmnet was all ok, but it is missing several rows from `bv20**_edits` tables (they are empty there).
[15:43:06] DBA, Analytics, Analytics-EventLogging, ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#3106493 (Marostegui) >>! In T141407#3106478, @Nuria wrote: > Moving to radar as I think it is taken care of by DBAs Not...
[15:57:04] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3106549 (Marostegui) s5 is fully done now: dewiki: ``` dbstore200...
[16:05:08] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3106582 (Marostegui) s4 will take a bit longer as the image table...
[16:05:43] Blocked-on-schema-change, DBA, Patch-For-Review: *_minor_mime are varbinary(32) on WMF sites, out of sync with varbinary(100) in MW core - https://phabricator.wikimedia.org/T73563#3106583 (Marostegui) Starting with the image table first on: dbstore2001 and db2065
[16:28:23] DBA, Operations: dbstore1001 troubleshoot IPMI issue - https://phabricator.wikimedia.org/T158893#3106763 (Cmjohnson) @marostegui: nothing concrete has been determined since we upgraded the idrac f/w. My POC was on holiday right after we started our conversation and then me. I reached out to them today. Wa...
[16:30:12] DBA, Operations: dbstore1001 troubleshoot IPMI issue - https://phabricator.wikimedia.org/T158893#3106780 (Marostegui) >>! In T158893#3106763, @Cmjohnson wrote: > @marostegui: nothing concrete has been determined since we upgraded the idrac f/w. My POC was on holiday right after we started our conversation...
[17:51:24] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3107208 (Ottomata) Hm, just looked myself, but I don't see any non EventLogging `log` databases on db1046 or db1047. At least, the 'eventlog' user can't see them. Not sure...
[18:16:53] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107353 (chasemp) >>! In T157359#3095669, @jcrespo wrote: > @aude @MaxSem @Kolossos Can you verify your applications (e.g. restarting them) and se...
[18:18:35] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3107357 (Ottomata) @DarTar @leila, @milimetric, @Tbayer: Q for yall. I need to figure out what we actually need to replace MySQL research slaves. I am unfamiliar with what...
[18:19:10] DBA, Labs: page_lang column of the page table is not replicated to Labs - https://phabricator.wikimedia.org/T154355#3107372 (chasemp) >>! In T154355#3099624, @TTO wrote: > `metawiki_p.page` now contains the page_lang column; however, `user_groups` view still does not contain the `ug_expiry` column. Shall...
[18:24:45] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#2987618 (Halfak) We can probably not replace db1047 but we'll want to take backups of the user created dbs there.
[18:25:49] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107415 (dschwen) Nope. My stuff fails now with: ``` ERROR: function setsrid(box3d, integer) does not exist LINE 6: SetSRID('BOX3D(-...
[18:35:35] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#2987618 (Neil_P._Quinn_WMF) I've never used either s1 or s2, only analytics-store and (once or twice) x1. As an extra data point, the [data access docs](https://wikitech.wiki...
[18:41:09] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3107446 (Halfak) I've just backed up the `halfak` DB. I can also be responsible for backing up the `staeiou` DB.
[18:42:41] DBA, Operations, ops-codfw: es2015 crashed on 2017-03-11 - https://phabricator.wikimedia.org/T160242#3107447 (Marostegui) The Dell technician didn't show up and @Papaul has arranged another appointment for Monday. I will be off on Monday, so it will need to be powered off by @jcrespo. I have just pow...
[18:47:24] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3107454 (jcrespo) I have checked all s2 slaves, including codfw and those not part of the specific lists. All either didn't have differences or have been corrected....
[18:55:28] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107475 (dschwen) After fixing those to `ST_SetSRID` my PostGIS query now fails with ``` ERROR: Operation on mixed SRID geometries ```
[19:03:39] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107510 (jcrespo) That is probably a consequence of a PostGIS upgrade. Can you post/point to the whole query/relevant code?
[19:20:46] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107585 (dschwen) Sure the query is here: https://github.com/dschwen/wikiminiatlas/blob/master/tiles/jsontile.php#L115
[19:34:38] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107634 (jcrespo) This could be the reason - having to set the SRID for constants: http://gis.stackexchange.com/questions/68711/postgis-geomet...
[19:38:45] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107645 (dschwen) Looks like the OSM data uses SRID 3857 and I compare to a Bounding Box with SRID 900913
[19:42:11] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107654 (dschwen) Ok, next issue is that suddenly the column `the_geom` does not exist anymore in the `land_polygons` and `coastlines` tables....
[19:44:02] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107655 (dschwen) Ok, I think I'm back in business! Testing a bit more now.
[19:54:55] DBA, Labs: page_lang column of the page table is not replicated to Labs - https://phabricator.wikimedia.org/T154355#3107682 (TTO) I reckon we'd better keep @Aklapper happy and open a new task for the new issue :)
[19:56:06] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: labsdb1006/1007 (postgresql) maintenance - https://phabricator.wikimedia.org/T157359#3107687 (jcrespo) Good to know- feel free to test and communicate issues- we will soon otherwise irrevocably delete the previous instance. Thanks...
[20:00:53] DBA, Labs, Labs-Infrastructure: ug_expiry column of the user_groups table is not present on Labs - https://phabricator.wikimedia.org/T160686#3107713 (TTO)
[20:04:49] DBA, Labs, Labs-Infrastructure: LabsDB replica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3107740 (chasemp)
[20:14:16] DBA, Labs, Labs-Infrastructure: ug_expiry column of the user_groups table is not present on Labs - https://phabricator.wikimedia.org/T160686#3107789 (chasemp) p:Triage>Normal I'll attempt to run the view generation when I can in the next few days
[20:34:18] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3107852 (Nuria) List of tables: CentralAuth_5690875 ChangesListFilters_16174591 ChangesListFilters_16403617 CommandInvocation_15243810 ContentTranslation_11628043 ContentTranslationCTA_16017...
[20:35:45] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3107859 (Nuria) We will need to: - send notice to users - update automated scripts
[20:37:46] DBA: Refreshing testreduce_vd database on ruthenium - https://phabricator.wikimedia.org/T160691#3107862 (ssastry)
[20:44:33] DBA: Refreshing testreduce_vd database on ruthenium - https://phabricator.wikimedia.org/T160691#3107903 (ssastry) Actually, I figured I can convert the json scripts to emit the sql commands and run it via the mysql commandline script. I am going to give that a try.
[21:19:49] DBA: Refreshing testreduce_vd database on ruthenium - https://phabricator.wikimedia.org/T160691#3108045 (ssastry) Okay, that worked. But, I had to run "delete from " since I didn't have truncate permissions. But, I am leaving this ticket open to get any advice for how to go about this in the future an...
[21:35:05] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3108227 (leila) @Ottomata: I'm only using dbstore1002 these days. For my work, I'm fine if db1047 is gone.
[22:00:34] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3108353 (Halfak) I've confirmed that all tables in `staeiou` are cleared for deletion. @staeiou can come here to confirm if he gets the ping.
[22:02:50] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3108355 (Staeiou) >>! In T156844#3108353, @Halfak wrote: > I've confirmed that all tables in `staeiou` are cleared for deletion. > > @staeiou can come here to confirm if...
[22:25:22] DBA, Labs, MediaWiki-extensions-Babel: Replicate babel db table on Labs - https://phabricator.wikimedia.org/T160713#3108521 (Base)