[05:05:13] Blocked-on-schema-change, DBA, Patch-For-Review: Convert unique keys into primary keys for some wiki tables on s7 - https://phabricator.wikimedia.org/T166208#3410449 (Marostegui)
[05:05:25] Blocked-on-schema-change, DBA: Convert unique keys into primary keys for some wiki tables on s1, s2, s4, s5 and s7 (eqiad) - https://phabricator.wikimedia.org/T164185#3410452 (Marostegui)
[05:05:28] Blocked-on-schema-change, DBA, Patch-For-Review: Convert unique keys into primary keys for some wiki tables on s7 - https://phabricator.wikimedia.org/T166208#3288342 (Marostegui) Open>Resolved Everything is done
[05:06:46] DBA, Analytics, Analytics-EventLogging, Contributors-Analysis, and 5 others: Drop tables with bad data: mediawiki_page_create_1 mediawiki_revision_create_1 - https://phabricator.wikimedia.org/T169781#3410454 (Marostegui) As Nuria said, this can be handled by the Analytics team
[05:28:23] DBA, Data-Services: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3410469 (Marostegui) Our sanitarium host (db1069) got replication broken with: ``` Error 'Table 'ukwikimedia.site_stats' doesn't exist' on quer...
[06:22:14] let me know if you want to deploy multinstance role to db1102 or even dbstore2002
[06:44:11] Blocked-on-schema-change, DBA, Patch-For-Review: Apply schema change to add 3D filetype for STL files - https://phabricator.wikimedia.org/T168661#3410571 (Marostegui)
[06:47:02] Blocked-on-schema-change, DBA, Patch-For-Review: Apply schema change to add 3D filetype for STL files - https://phabricator.wikimedia.org/T168661#3410575 (Marostegui) s1 is done. Altering the primary master would be a bit risky, I have seen some slaves getting delayed while doing the alter, specially...
[07:22:54] there is a host on tendril ":3316", could that have been you, by mistake?
[07:23:46] I just want to make sure there is no compromise or anything
[07:23:52] let me check
[07:24:03] ah yes
[07:24:10] probably me when added db1102:3316
[07:24:12] and failed the first time
[07:24:14] I will delete it
[07:24:17] oh, then no problem
[07:24:33] I was worried of some kind of compromise or database corruption
[07:24:38] no :)
[07:24:45] we are good :)
[07:24:47] if it is that, I can take care of it
[07:24:53] don't worry
[07:25:01] ah, ok, thanks! :)
[07:25:09] keep doing the great things you are doing
[07:25:19] that is breaking dbstore2002 :)
[07:25:37] ?
[07:25:59] I was checking dbstore2002 for its second instance
[07:30:01] DBA, Data-Services, Wikimedia-maintenance-script-run: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3410626 (jcrespo) Resolved>Open CC @bd808 ^ Maybe maintenance was not updated, but something else...
[07:31:51] DBA, Analytics, Analytics-EventLogging, Contributors-Analysis, and 5 others: Drop tables with bad data: mediawiki_page_create_1 mediawiki_revision_create_1 - https://phabricator.wikimedia.org/T169781#3410630 (elukey) @kaldari we can do it! If you don't mind I'd wait for Andrew to come back from v...
[07:32:50] DBA, Analytics, Analytics-EventLogging, Contributors-Analysis, and 5 others: Drop tables with bad data: mediawiki_page_create_1 mediawiki_revision_create_1 - https://phabricator.wikimedia.org/T169781#3410631 (Marostegui) I would suggest to rename them before issuing any drop. Leave them renamed w...
[07:37:29] when is a good time to schedule labsdb1005 reboots?
[07:37:40] somewhere mid-next week?
[07:37:51] Did cloud team announce it already?
[07:38:07] I would suggest tuesday or wednesday, in case it breaks we still have days to fix it
[07:40:34] ok
[07:42:02] DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3410634 (jcrespo) I would do labsdb1004 first, which is the slave for the toolsdb, and labsdb1005- I didn't want to pressure you because I knew you had other concerns. I would say Tue...
[07:43:05] DBA, Analytics, Analytics-EventLogging, Contributors-Analysis, and 5 others: Drop tables with bad data: mediawiki_page_create_1 mediawiki_revision_create_1 - https://phabricator.wikimedia.org/T169781#3410635 (elukey) >>! In T169781#3410631, @Marostegui wrote: > I would suggest to rename them befo...
[07:44:16] I am going to deploy https://gerrit.wikimedia.org/r/363375
[07:44:36] cool
[07:48:33] did you see my 10.1.25 upgrade?
[07:49:13] I added support for "systemctl start mariadb@s1"
[07:52:14] yeah
[07:52:19] nice :)
[07:52:42] did it work nicely?
[07:53:54] I haven't tested it yet
[07:59:54] I dropped .eqiad.wmnet:3316 from tendril
[08:00:00] thank you
[08:34:25] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3410711 (jcrespo)
[09:16:34] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3410781 (jcrespo) I have checked puppet, and I do not see any error with the puppet configuration (ip, mac of the new hosts). @ayounsi do you have time to help us check the network config...
[09:18:47] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3410793 (jcrespo) db1098, for example, should have IP `10.64.16.83` and mac `18:66:DA:F8:D5:E0` according to the server and puppet configuration, but PXE doesn't move forward with the ins...
[09:20:01] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3410794 (jcrespo) ``` Link Status ``` This could be a physical issue or a network configuration issue, could you help us check?
[09:21:56] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3410795 (jcrespo) Oh, I think I have it `18:66:DA:F8:D5:E1` says connected. I think we used the wrong port to configure the server. This may still need network check, maybe? but most like...
[09:38:13] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3156297 (ops-monitoring-bot) Script wmf_auto_reimage was launched by jynus on neodymium.eqiad.wmnet for hosts: ``` ['db1098.eqiad.wmnet'] ``` The log can be found in `/var/log/wmf-auto-re...
[09:46:41] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3410856 (jcrespo) Sadly, I still cannot see it booting.
[09:47:42] :(
[09:51:30] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3410893 (jcrespo) @Cmjohnson this is not urgent, but can you check the link of the initially configured device? `18:66:DA:F8:D5:E0` aka network card1. @ayounsi can see link, but cannot se...
[09:57:48] most likely tftp boot only happens on the first network card and that is the wrong active one
[15:26:48] DBA, Analytics, Analytics-EventLogging, Contributors-Analysis, and 5 others: Drop tables with bad data: mediawiki_page_create_1 mediawiki_revision_create_1 - https://phabricator.wikimedia.org/T169781#3411969 (elukey) @Marostegui: The idea would be to: 1) stop eventlogging on eventlog1001 2) sto...
[16:04:15] DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3412075 (Nuria)
[16:09:13] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3412082 (Nuria)
[16:09:33] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#2867045 (Nuria) Not on our end that we know of. are these tables also going to be deleted from analytics store?
[16:14:21] DBA, Analytics-EventLogging, Analytics-Kanban, Contributors-Analysis, and 5 others: Drop tables with bad data: mediawiki_page_create_1 mediawiki_revision_create_1 - https://phabricator.wikimedia.org/T169781#3412109 (Nuria)
[16:40:09] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3412190 (Marostegui) >>! In T153033#3412082, @Nuria wrote: > Not on our end that we know of. are these tables also going to be deleted from analytics store? They do exist on db1047 and dbstore1002 (enwiki db...
[16:56:33] DBA, Operations, ops-codfw: db2044: Disk on predictive failure - https://phabricator.wikimedia.org/T169693#3412422 (Papaul) a:Papaul>Marostegui Disk replacement complete
[17:04:58] DBA, Operations, ops-codfw: db2044: Disk on predictive failure - https://phabricator.wikimedia.org/T169693#3406026 (Volans)
[17:08:14] DBA, Operations, ops-codfw: db2044: Disk on predictive failure - https://phabricator.wikimedia.org/T169693#3412481 (Marostegui) Thank you @Papaul : ``` logicaldrive 1 (3.3 TB, RAID 1+0, Recovering, 17% complete) ```
[18:15:07] DBA, Operations, ops-codfw: db2044: Disk on predictive failure - https://phabricator.wikimedia.org/T169693#3412737 (Marostegui) Open>Resolved All good now: ``` physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 600 GB, OK) ```
[18:21:12] DBA, Data-Services, Patch-For-Review, Wikimedia-maintenance-script-run: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3412750 (bd808) >>! In T169488#3410469, @Marostegui wrote: > Our sanitarium host (d...
[18:22:03] DBA, Data-Services, Patch-For-Review, User-bd808, and 2 others: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3412759 (bd808) a:jcrespo>bd808
[19:09:19] DBA, Data-Services, MediaWiki-extensions-Babel, Patch-For-Review, cloud-services-team (Kanban): Replicate babel db table on Labs - https://phabricator.wikimedia.org/T160713#3108521 (Andrew) This is done on most labsdbs. Remaining steps: 1) Figure out why puppet is disabled on labsdb1009 and...
[19:28:57] DBA, Operations, Performance-Team, Traffic, Wikidata: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3413078 (aaron) >>! In T164173#3343495, @aaron wrote: > @daniel , can you look into the amount of purges happening in...
[20:11:59] DBA, Release-Engineering-Team (Watching / External): Missing / Dropped databases? - https://phabricator.wikimedia.org/T132838#3413255 (greg)
[20:19:27] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3413278 (Nuria) @maarostegui: then , the best way I can think of to find out whether anyone is using them is to send an e-mail to analytics@ (give people a month respond noting date by which you will delete tho...
[20:23:43] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#2867045 (demon) A //month//?! I can't imagine there's //any// useful data to be gathered out of this. MoodBar was a complete and absolute failure.
[20:29:38] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3413301 (Nuria) @demon : a month is the standard time we give users to reply to usage requests, see some data that used this extension: https://meta.wikimedia.org/wiki/Research:MoodBar/First_month_of_activity...
[20:32:18] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3413303 (demon) /me shrugs
[21:22:14] DBA, Operations: Evaluate how hard would be to get aa(wikibooks|wiktionary) databases deleted - https://phabricator.wikimedia.org/T169928#3413499 (MarcoAurelio)
[21:26:27] DBA, Operations: Evaluate how hard would be to get aa(wikibooks|wiktionary) and howiki databases deleted - https://phabricator.wikimedia.org/T169928#3413528 (MarcoAurelio)
[21:35:20] DBA, Wikimedia-Language-setup: Deletion of als Wiktionary, Wikibooks and Wikiquote? - https://phabricator.wikimedia.org/T169450#3413551 (MarcoAurelio) So I guess that, to sum up, the questions here would be: * would you expect major issues if we delete/drop als[.*] tables/wikis? * is it possible to redi...
[21:48:46] DBA, Operations, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3413600 (madhuvishy) @jcrespo, okay, I'll do the announcements. @Halfak We are proposing labsdb1004 reboot (wikilabels db server) for Tuesday 11 July at 1400 UTC. Would that work for...
[21:58:28] DBA, Operations, Scoring-platform-team, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3413650 (Halfak) Yup! That works!
[21:58:32] DBA, Operations, Scoring-platform-team, cloud-services-team: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3413651 (Zppix)
[22:24:06] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3413746 (Milimetric) The thing to consider here is that the people who are likely in the loop about what's happening on the mediawiki dbs are not in the loop about research being done on those clones on dbstore...
[22:25:50] DBA, Operations, Performance-Team, Traffic, Wikidata: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3413753 (aaron) a:aaron>None
[22:35:42] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3413793 (Nuria) > I can write to the typical research lists to ask this question if you like, so this task doesn't stall. But if this situation comes up again we should think about a good process. Let's assume...
[22:42:25] DBA, Operations, Performance-Team, Traffic, Wikidata: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3224448 (Krinkle) ChangeNotificationJob https://github.com/wikimedia/mediawiki-extensions-Wikibase/blob/6cfd514ee9/cl...
[22:42:45] DBA, Operations, Performance-Team, Traffic, Wikidata: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3413807 (Krinkle) p:Normal>High
[22:48:10] DBA, Epic, Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921#3413880 (Krinkle)
[23:28:56] DBA, Operations, Performance-Team, Traffic, Wikidata: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3414111 (aaron) I also wonder why some of those log warnings come from close() and others have the proper commitMaste...