[06:32:52] <wikibugs>	 10DBA: Power supply error on db1055 - https://phabricator.wikimedia.org/T182653#3830049 (10Marostegui)
[06:44:39] <wikibugs>	 10Blocked-on-schema-change, 10DBA, 10Data-Services, 10Dumps-Generation, and 2 others: Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3830063 (10Marostegui)
[06:44:59] <wikibugs>	 10Blocked-on-schema-change, 10DBA, 10Data-Services, 10Dumps-Generation, and 2 others: Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3754580 (10Marostegui)
[06:46:37] <wikibugs>	 10DBA, 10Patch-For-Review: Power supply error on db1055 - https://phabricator.wikimedia.org/T182653#3830065 (10Marostegui) p:05Triage>03Normal
[06:46:55] <wikibugs>	 10DBA, 10Operations, 10Patch-For-Review: Power supply error on db1055 - https://phabricator.wikimedia.org/T182653#3830049 (10Marostegui)
[07:10:35] <wikibugs>	 10DBA, 10Operations, 10ops-eqiad, 10Patch-For-Review: Power supply error on db1055 - https://phabricator.wikimedia.org/T182653#3830085 (10Marostegui) a:03Cmjohnson @Cmjohnson I have been unable to identify which of the PSU is the one failing, the idrac console isn't recording which one is it (sometimes i...
[07:20:11] <wikibugs>	 10DBA, 10Patch-For-Review: Checksum data on s7 - https://phabricator.wikimedia.org/T163190#3830100 (10Marostegui)
[08:27:32] <jynus>	 I have sent wmf-mysql80_8.0.3-rc-1_amd64.deb to install1002:/home/jynus/stretch
[08:29:06] <jynus>	 just the executable is 600MB
[09:11:30] <jynus>	 91% ETA 1:32:50
[09:23:19] <wikibugs>	 10DBA, 10Patch-For-Review: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359#3830312 (10Marostegui)
[09:23:41] <wikibugs>	 10DBA, 10Epic: Meta ticket: Migrate multi-source database hosts to multi-instance - https://phabricator.wikimedia.org/T159423#3830317 (10Marostegui)
[09:23:43] <wikibugs>	 10DBA, 10Patch-For-Review: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662#3830316 (10Marostegui)
[09:23:46] <wikibugs>	 10DBA, 10Patch-For-Review: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359#3803575 (10Marostegui) 05Open>03Resolved
[09:45:42] <moritzm>	 marostegui, jynus: are you reimaging servers ATM? I'm about to refresh the netboot image for stretch (there was a point release for stretch last weekend)
[09:46:40] <jynus>	 not me
[09:46:58] <jynus>	 let me check in case marostegui is busy
[09:47:52] <jynus>	 there is a wmf-auto-reimage -c -- mw1259.eqiad.wmnet
[09:48:09] <jynus>	 (but not running) on a screen
[09:48:20] <jynus>	 but in any cases that should be jessie
[09:48:22] <marostegui>	 Nope, I am not doing anything :)
[09:48:29] <moritzm>	 ah, that's an older session I think, for some reason that hung on Thursday
[09:48:32] <jynus>	 I will check anyway
[09:48:38] <moritzm>	 I did that :-)
[09:48:39] <jynus>	 in case I find something else
[09:48:42] <jynus>	 ah, ok
[09:48:46] <jynus>	 so nope
[09:48:49] <moritzm>	 so mw1259 is fine
[09:48:57] <moritzm>	 k, thanks, updating in 1-2 mins
[09:52:19] <moritzm>	 updated
[10:54:21] <jynus>	 19:09:59 100%
[10:54:57] <marostegui>	 \o/
[10:56:24] <wikibugs>	 10Blocked-on-schema-change, 10DBA, 10Data-Services, 10Dumps-Generation, and 2 others: Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3830446 (10Marostegui) s4 eqiad hosts:  [] labsdb1001.eqiad.wmnet (broken - will not be done) [] labsdb1003.eqiad.wmnet [] db1102.eq...
[10:56:41] <wikibugs>	 10Blocked-on-schema-change, 10DBA, 10Data-Services, 10Dumps-Generation, and 2 others: Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3830447 (10Marostegui)
[11:14:16] <jynus>	 marostegui: I am touching s5/s8, I am going to assume you are not touching those?
[11:14:55] <marostegui>	 yep, no touching any of those
[11:14:57] <marostegui>	 only s4 and s7
[11:15:16] <jynus>	 ok, will ping if I finish with those
[11:15:24] <marostegui>	 cool!
[12:16:50] <jynus>	 I am going to start touching s1, probably only codfw
[13:14:30] <marostegui>	 sorry - I was having lunch
[13:14:37] <marostegui>	 sure, not touching s1 at all myself
[13:45:13] <wikibugs>	 10DBA, 10Patch-For-Review: Checksum data on s7 - https://phabricator.wikimedia.org/T163190#3830929 (10Marostegui)
[14:01:58] <wikibugs>	 10DBA, 10Patch-For-Review: Checksum data on s7 - https://phabricator.wikimedia.org/T163190#3831001 (10Marostegui)
[14:44:04] <jynus>	 dbstore1002 will have replicated almost 1 day at this point without errors
[14:44:26] <marostegui>	 nice
[14:44:30] <marostegui>	 it was failing almost every 24h 
[14:44:34] <marostegui>	 so that looks promisiing
[14:49:46] <wikibugs>	 10DBA, 10Operations, 10ops-eqiad, 10Patch-For-Review: Power supply error on db1055 - https://phabricator.wikimedia.org/T182653#3831142 (10Cmjohnson) @Marostegui  Replaced the PSU and both are now redundant  Date/Time:   12/12/2017 14:43:15 Source:      system Severity:    Critical Description: Power supply...
[14:51:56] <wikibugs>	 10DBA, 10Operations, 10ops-eqiad, 10Patch-For-Review: Power supply error on db1055 - https://phabricator.wikimedia.org/T182653#3831148 (10Marostegui) 05Open>03Resolved That was fast! Thanks a lot ``` RECOVERY - IPMI Sensor Status on db1055 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK ```
[15:00:48] <wikibugs>	 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad, 10Patch-For-Review: Decommission db1044 - https://phabricator.wikimedia.org/T181696#3831175 (10Cmjohnson) p:05Normal>03Low
[15:04:58] <wikibugs>	 10DBA, 10Patch-For-Review: Checksum data on s7 - https://phabricator.wikimedia.org/T163190#3831210 (10Marostegui)
[15:35:24] <wikibugs>	 10DBA, 10Operations, 10Availability (Multiple-active-datacenters), 10Patch-For-Review, 10Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3599592 (10Imarlier) @aaron - see note from Jaime above, he's waiting on answers fro...
[15:49:48] <jynus>	 db2072 was killed because it went over the timeout :-(
[15:50:00] <jynus>	 I think the older packages had a non-infinity timeout
[15:51:58] <wikibugs>	 10DBA, 10Operations, 10ops-eqiad, 10Patch-For-Review: Rack and setup db1111 and db1112 - https://phabricator.wikimedia.org/T180788#3831379 (10Cmjohnson)
[15:57:11] <wikibugs>	 10DBA, 10Operations, 10Patch-For-Review: Rack and setup db1111 and db1112 - https://phabricator.wikimedia.org/T180788#3831395 (10Cmjohnson) assigning to @Marostegui for installs
[16:04:47] <wikibugs>	 10DBA, 10Patch-For-Review: Checksum data on s7 - https://phabricator.wikimedia.org/T163190#3831437 (10Marostegui)
[16:16:02] <wikibugs>	 10DBA, 10Operations, 10Patch-For-Review: Rack and setup db1111 and db1112 - https://phabricator.wikimedia.org/T180788#3769627 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts: ``` db1111.eqiad.wmnet ``` The log can be found in `/var/log/wmf-auto-rei...
[16:22:28] <jynus>	 I am going to go next with the passive phabricator eqiad node
[16:22:45] <marostegui>	 cool!
[16:24:55] <jynus>	 are you making 1111 and co jessie or stretch?
[16:25:18] <marostegui>	 jessie
[16:34:44] <wikibugs>	 10DBA, 10Operations, 10Patch-For-Review: Rack and setup db1111 and db1112 - https://phabricator.wikimedia.org/T180788#3831544 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['db1111.eqiad.wmnet'] ```  Of which those **FAILED**: ``` ['db1111.eqiad.wmnet'] ```
[16:34:53] <marostegui>	 uuuh?
[16:34:55] <marostegui>	 let's see
[16:35:57] <marostegui>	 weird.. the first puppet run works fine, let's try again
[16:38:11] <wikibugs>	 10DBA, 10Operations, 10Patch-For-Review: Rack and setup db1111 and db1112 - https://phabricator.wikimedia.org/T180788#3831568 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts: ``` db1111.eqiad.wmnet ``` The log can be found in `/var/log/wmf-auto-rei...
[16:42:51] <volans>	 marostegui: the first one failed for:
[16:42:51] <volans>	 Failed to reset failed state of unit puppet.service: Unit puppet.service is not loaded.
[16:43:27] <marostegui>	 But I ran puppet manually after that and worked fine
[16:44:14] <volans>	 marostegui: option pxelinux.pathprefix "trusty-installer/";
[16:44:15] <volans>	 ???
[16:44:24] <marostegui>	 uh??
[16:44:38] * volans updating local copy
[16:45:03] <volans>	 seems the same
[16:45:11] <volans>	 in modules/install_server/files/dhcpd/linux-host-entries.ttyS1-115200
[16:45:19] <marostegui>	 maybe chris got confused
[16:45:37] <marostegui>	 errr volans are you looking at db1111 or db1011?
[16:45:38] <marostegui>	 :)
[16:45:52] <marostegui>	 this is what I see
[16:45:55] <volans>	 ah right
[16:45:55] <marostegui>	 host db1111 {
[16:45:55] <marostegui>	     hardware ethernet 80:18:44:DF:D4:D0;
[16:45:55] <marostegui>	     fixed-address db1111.eqiad.wmnet;
[16:45:56] <marostegui>	 }
[16:45:56] <stashbot>	 D4: iscap 'project level' commands - https://phabricator.wikimedia.org/D4
[16:46:04] <volans>	 sorry my bad
[16:46:43] <jynus>	 marostegui: could it be bad options on first install, or not part of the regex?
[16:46:56] <volans>	 if you want jessies, be careful that soon we'll switch to stretch by default: T182215
[16:46:56] <stashbot>	 T182215: install_server: switch to stretch as default install image - https://phabricator.wikimedia.org/T182215
[16:46:59] <jynus>	 I have not checked, just throwing things into the wind
[16:47:09] <marostegui>	 volans: yep, I was aware
[16:47:24] <marostegui>	 jynus: no, the install was fine, what apparently failed was the first puppet run, but doing it manually worked
[16:47:27] <marostegui>	 we will see this time
[16:47:30] <marostegui>	 it is now waiting for it
[16:47:33] <marostegui>	 so we will now soon :)
[16:47:34] <volans>	 also, if a reimage fails AFTER d-i, it can be resumed from there
[16:48:11] <volans>	 in the sense that the various options allow to skip specific parts
[16:48:19] <marostegui>	 Ah, I didn't know that :)
[16:49:41] <wikibugs>	 10DBA, 10Operations, 10Patch-For-Review: Rack and setup db1111 and db1112 - https://phabricator.wikimedia.org/T180788#3831598 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['db1111.eqiad.wmnet'] ```  Of which those **FAILED**: ``` ['db1111.eqiad.wmnet'] ```
[16:49:44] <marostegui>	 yeah, failed again
[16:50:29] <volans>	 Failed to reset failed state of unit puppet.service: Unit puppet.service is not loaded.
[16:50:44] <marostegui>	 yeah, I am running puppet again manually on the host to see what we get
[16:50:59] <volans>	 that's not what failed
[16:51:17] <volans>	 is the systemctl reset-failed puppet.service
[16:51:18] <volans>	 that failed
[16:51:34] <volans>	 jessie or stretch?
[16:51:38] <marostegui>	 jessie
[16:56:21] <volans>	 nope
[16:56:26] <volans>	 Release:	9.3
[16:56:27] <volans>	 Codename:	stretch
[16:57:05] <marostegui>	 mmm
[16:57:14] <volans>	 moritzm: did you change the default image today by any chance? it seems that we got a stretch here without specifying it in modules/install_server/files/dhcpd/linux-host-entries.ttyS1-115200
[16:57:28] <volans>	 same for kafka1023 with elukey
[16:57:42] <marostegui>	 ˜/moritzm 10:45> marostegui, jynus: are you reimaging servers ATM? I'm about to refresh the netboot image for stretch (there was a point release for stretch last weekend)
[16:57:48] <marostegui>	 that is what he said today
[16:57:59] <volans>	 yeah that I knew, refresh for the point release
[16:58:09] <volans>	 unsure if anything else changed though
[16:59:51] <marostegui>	 I don't see anything on gerrit from moritzm today changing anything
[17:00:13] <jynus>	 there was a puppet4 client migration ongoing, wasn't it?
[17:00:34] <jynus>	 and some kind of wikimedia/wikimedia-backports change
[17:01:10] <marostegui>	 but that shouldn't mess up with the installation distro no?
[17:01:43] <volans>	 no, at most could explain the failure if there isn't anymore a puppet service to reset fail
[17:02:07] <volans>	 but with that in mind every stretch reimage should fail, while it was working
[17:02:25] <volans>	 maybe we change the puppet.conf that affected this
[17:02:37] <volans>	 at the moment I think the two things are unrelated
[17:02:44] <volans>	 1) we got stretch instead of jessie
[17:03:16] <volans>	 2) the reimage failed because the "systemctl reset-failed puppet.service" failed with "Unit puppet.service is not loaded"
[17:03:34] <volans>	 (2) I can fix in the reimage to do it only if it's loaded indeed
[17:03:53] <volans>	 (1) I dunno
[17:04:10] <marostegui>	 2) is a valid fix anyways, no?
[17:04:24] <moritzm>	 volans: no, I only ran the scripts to refresh the netboot images for jessie and stretch
[17:04:29] <volans>	 yep, looking at it, although I'd like to know what changed :D
[17:04:55] <moritzm>	 the choice of distros is handled entirely by the install_server puppet module
[17:06:11] <marostegui>	 then that is weird because it got stretch
[17:08:59] <jynus>	 I cannot see anything weird on puppet or on install1002
[17:11:07] <jynus>	 and dns works as intended
[17:11:17] <marostegui>	 volans: you mentioned that also happened to luca?
[17:11:57] <volans>	 # cat /srv/tftpboot/jessie-installer/version.info
[17:11:58] <volans>	 Debian version:  9 (stretch)
[17:11:58] <volans>	 Installer build: 20170615+deb9u2+b1
[17:12:08] <volans>	 seems that something's wrong over there
[17:12:18] <marostegui>	 :/
[17:12:25] <jynus>	 :-)
[17:12:32] <marostegui>	 moritzm: ^
[17:12:32] <volans>	 but in pxelinux.cfg/boot.txt refers jessie
[17:12:41] * volans confused :D
[17:13:25] <volans>	 in the stretch-installer grep -rins jessie returns nothing, and that's correct
[17:13:36] <volans>	 while in the jessie-installer both greps find stuff
[17:15:26] <jynus>	 it is all a trick to upgrade to stretch earlier than intended
[17:15:30] <marostegui>	 haha
[17:15:40] <volans>	 upgrade by error :D
[17:16:54] <moritzm>	 volans: hmm, not sure. I just ran the same command as with all previous jessie point updates as well: /home/faidon/update-netboot.sh on puppetmaster1001
[17:17:14] <moritzm>	 ah, found the error I think
[17:17:27] <volans>	 it's paravoid fauld then, it's in his home :-P
[17:17:31] <volans>	 should be puppetized
[17:17:33] <marostegui>	 XD
[17:17:33] <volans>	 :D
[17:17:45] <moritzm>	 line 11 refers to stable, but jessis is now oldstable
[17:18:01] * moritzm wonders why this didn't break for the 8.9 update, but maybe that one didn't have a rebuilt d-i
[17:18:07] <volans>	 yeah rolling names bites
[17:23:50] <moritzm>	 volans, marostegui, jynus, elukey: I fixed the code name and re-ran the script on puppetmaster1001
[17:24:11] <marostegui>	 \o/
[17:24:12] <marostegui>	 thanks a lot
[17:24:19] <marostegui>	 I will issue a reimage then again :)
[17:24:40] <moritzm>	 let me run puppet on install1002 first :-)
[17:24:45] <volans>	 thanks moritzm !
[17:24:51] <marostegui>	 I was about to do it :)
[17:25:48] <moritzm>	 puppet run completed, should be fine now
[17:25:55] <marostegui>	 let's go!
[17:26:07] <wikibugs>	 10DBA, 10Operations, 10Patch-For-Review: Rack and setup db1111 and db1112 - https://phabricator.wikimedia.org/T180788#3831686 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts: ``` db1111.eqiad.wmnet ``` The log can be found in `/var/log/wmf-auto-rei...
[17:26:29] <moritzm>	 IIRC Debian also releases firmware-enriched images these days, we should switch to those, I'll open a task
[17:27:35] <volans>	 do they include the bnx2x too?
[17:27:48] <volans>	 I remember for the 10G I had to add them manually in the past
[17:28:27] <marostegui>	 Oh the fun of bnx2 and e1000...
[17:30:13] <moritzm>	 I don't know, needs to be researched, for now I opened https://phabricator.wikimedia.org/T182699
[17:30:32] <moritzm>	 I only saw this mentioned on debian-devel last week, haven't looked into this myself
[17:30:38] <volans>	 ok
[17:30:45] <volans>	 would be nice
[17:31:38] <volans>	 moritzm: you're the only one that reimaged today also, mw1260 was supposed to be stretch or jessie?
[17:31:46] <volans>	 because I guess you got stretch
[17:33:24] <moritzm>	 yeah, that was intentional, it used to use jessie and I wanted to migrate to stretch
[17:33:59] <volans>	 so that worked because it was the "correct" stretch
[17:34:04] <volans>	 I assume
[17:34:05] <jynus>	 do we have an s7 outage?
[17:34:11] <marostegui>	 uh?
[17:34:40] <marostegui>	 there is lag yep
[17:34:59] <jynus>	 is it gone?
[17:35:19] <marostegui>	 yeah, in eqiad yes
[17:35:22] <marostegui>	 in codfw is recoverying
[17:35:38] <jynus>	 normal, it has to take 2x the time
[17:35:46] <marostegui>	 i know
[17:36:55] <marostegui>	 https://grafana.wikimedia.org/dashboard/db/mysql?panelId=2&fullscreen&orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1062&var-port=9104
[17:42:58] <wikibugs>	 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad, 10Patch-For-Review: Decommission db1015 - https://phabricator.wikimedia.org/T173570#3831730 (10Cmjohnson)
[17:44:36] <wikibugs>	 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad, 10Patch-For-Review: Decommission db1021 - https://phabricator.wikimedia.org/T181378#3831735 (10Cmjohnson)
[17:45:18] <wikibugs>	 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad, 10Patch-For-Review: Decommission db1026 - https://phabricator.wikimedia.org/T174763#3831736 (10Cmjohnson)
[17:45:57] <wikibugs>	 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad, 10Patch-For-Review: Decommission db1045 - https://phabricator.wikimedia.org/T174806#3831739 (10Cmjohnson)
[17:46:23] <wikibugs>	 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad: Decommission db1049 - https://phabricator.wikimedia.org/T175264#3831741 (10Cmjohnson)
[17:49:25] <wikibugs>	 10DBA, 10Operations, 10Phabricator, 10hardware-requests, 10ops-eqiad: Decommission db1048 (was Move m3 slave to db1059) - https://phabricator.wikimedia.org/T175679#3831747 (10Cmjohnson) All non-interruptible steps have been completed.  Still needs wiping/removal from rack
[17:49:39] <jynus>	 marostegui: was something heavy  runing on s7 that could have contributed to it? I assume not on the master or codfw?
[17:49:55] <wikibugs>	 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad, 10Patch-For-Review: Decommission db1050 - https://phabricator.wikimedia.org/T178162#3831749 (10Cmjohnson)
[17:50:34] <marostegui>	 jynus: nope, nothing running from my side
[17:50:51] <jynus>	 I was hoping for something to explain it
[17:51:13] <wikibugs>	 10DBA, 10Operations, 10hardware-requests, 10ops-eqiad, 10Patch-For-Review: Decommission db1044 - https://phabricator.wikimedia.org/T181696#3798986 (10Cmjohnson)
[17:51:26] <elukey>	 were you guys able to reimage a jessie host?
[17:51:45] <elukey>	 I am getting a kernel panic while pxe installing
[17:52:37] <elukey>	 (before d-i)
[17:52:48] <wikibugs>	 10DBA, 10Operations, 10Goal: Migrate MySQLs to use ROW-based replication - https://phabricator.wikimedia.org/T109179#3831770 (10jcrespo) We believe that since s5 was accidentally migrated to ROW, the lag is improved; so it did on labsdbs despite not having any kind of replication control, unlike production.
[17:57:03] <wikibugs>	 10DBA, 10Operations, 10Patch-For-Review: Rack and setup db1111 and db1112 - https://phabricator.wikimedia.org/T180788#3831781 (10ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['db1111.eqiad.wmnet'] ```  Of which those **FAILED**: ``` ['db1111.eqiad.wmnet'] ```
[17:57:16] <marostegui>	 it timedout when trying pxe
[17:57:32] <marostegui>	 elukey: I just got the kernel panic too
[17:57:47] <jynus>	 :-)
[17:57:59] <jynus>	 maybe time to leave it for tomorrow
[17:58:02] <marostegui>	 yeah
[17:58:06] <marostegui>	 I was thinking about that
[17:58:09] <jynus>	 and research the issues
[17:58:20] <jynus>	 downtime/disable alerts just in case
[17:58:30] * elukey cries in a corner
[17:58:32] <marostegui>	 https://phabricator.wikimedia.org/P6451
[17:58:41] <elukey>	 moritzm: --^
[17:58:51] <elukey>	 yeah got again too
[18:01:31] <jynus>	 marostegui: looking better
[18:01:46] <jynus>	 it only affected db1034 and db1039
[18:02:14] <jynus>	 which I am going to assume are not pooled/pooled with 0 weight
[18:02:29] <jynus>	 that is correct
[18:02:33] <jynus>	 so actually, no issue
[18:02:45] <jynus>	 codfw got behind, but at this point this is "normal"
[18:03:16] <jynus>	 I only got worried because last time that happened, codfw complaind first on master breakage
[18:03:33] <jynus>	 all other hosts worked nicely, we should just decomm the older servers
[18:03:57] <jynus>	 proof: https://grafana.wikimedia.org/dashboard/db/mysql-replication-lag?panelId=7&fullscreen&orgId=1&from=1513097990213&to=1513101633587
[18:04:18] <jynus>	 vs: https://grafana.wikimedia.org/dashboard/db/mysql-replication-lag?panelId=7&fullscreen&orgId=1&from=1513097990213&to=1513101633587&var-dc=codfw%20prometheus%2Fops
[18:04:34] <moritzm>	 marostegui, elukey: hmm, not sure. 3.16.51-2 is the new kernel in the jessie 8.9 update, maybe something has regressed there
[18:04:50] <jynus>	 3.16.51-2?
[18:05:02] <jynus>	 I though there was some newer?
[18:05:47] <jynus>	 oh, we have 4.9 on jessies
[18:06:18] <moritzm>	 the jessie installer is first using the default kernel shipped by jessie (3.16), the switch to 4.9 only occurs after the initial installation
[18:06:28] <jynus>	 yes, I understand
[18:06:41] <jynus>	 I wonder if it is worth updating the installer at this time
[18:06:59] <jynus>	 I mean the already done upgrade
[18:07:31] <moritzm>	 it's not so simple to update the jessie installer to use 4.9 from the start unfortunately
[18:07:43] <jynus>	 no no
[18:07:45] <jynus>	 I meanr
[18:07:52] <jynus>	 not to update the installer, period
[18:07:59] <jynus>	 and update packages afterwards
[18:08:21] <jynus>	 aka install with 8.8
[18:08:42] <moritzm>	 yeah, but we need to upgrade the installer, it's essentially broken after every jessie or stretch point release (until we implement https://phabricator.wikimedia.org/T182699)
[18:08:50] <jynus>	 ah, ok
[18:09:05] <jynus>	 :-(
[18:09:08] <moritzm>	 I only ran the upgrade script since elukey pinged me on IRC with the error earlier the day
[18:21:59] <elukey>	 marostegui: what was the host that failed with kernel panic?
[18:23:21] <elukey>	 (let's check tomorrow :)
[18:28:21] <marostegui>	 db1111
[18:29:19] <wikibugs>	 10DBA, 10Operations, 10Availability (Multiple-active-datacenters), 10Patch-For-Review, 10Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3831865 (10aaron) >>! In T175672#3778177, @jcrespo wrote: > @aaron the proxy is inst...
[18:30:43] <elukey>	 just opened https://phabricator.wikimedia.org/T182702
[18:31:24] <marostegui>	 thanks!
[18:47:41] <wikibugs>	 10DBA, 10Operations, 10Availability (Multiple-active-datacenters), 10Patch-For-Review, 10Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3831913 (10jcrespo) > A local and foreign replica would do  it is installed on both...
[19:39:41] <wikibugs>	 10DBA, 10Data-Services: Make Dispenser's principle_links table accessible in new Wiki replica cluster - https://phabricator.wikimedia.org/T180636#3832045 (10Dispenser) @jcrespo The current pipeline is: # A bash/python script on ToolForge which makes 275,000 MW API requests and bundles JSON responses in a `.tar...