[00:46:54] Traffic, Operations, Performance-Team, Wikimedia-General-or-Unknown, SEO: Search engines continue to link to JS-redirect destination after Wikipedia copyright protest - https://phabricator.wikimedia.org/T199252 (Imarlier) I'm going to start with the simple case, I'll make it more complicated...
[07:20:01] netops, Analytics, Analytics-Kanban, Operations, Patch-For-Review: Review analytics-in4/6 rules on cr1/cr2 eqiad - https://phabricator.wikimedia.org/T198623 (elukey) Some https calls still registered, I tried to add more https_proxy settings and I opened T201134 to follow up with the author o...
[08:28:46] netops, Operations: Intermitent connectivity issues between eqiad servers? - https://phabricator.wikimedia.org/T201139 (jcrespo)
[08:31:17] netops, Operations: Intermitent connectivity issues between eqiad servers? - https://phabricator.wikimedia.org/T201139 (jcrespo)
[08:47:40] netops, Operations: Intermitent connectivity issues between eqiad servers? - https://phabricator.wikimedia.org/T201139 (Joe)
[09:04:37] netops, Operations: asw2-a-eqiad FPC5 gets disconnected every 10 minutes - https://phabricator.wikimedia.org/T201145 (faidon) p:Triage>High
[09:15:04] netops, Operations: cr1/2-eqiad PFE_FW_SYSLOG_IP6_GEN log entries - https://phabricator.wikimedia.org/T201149 (faidon) p:Triage>High
[10:29:18] netops, Operations: connectivity issues between several hosts on asw2-b-eqiad - https://phabricator.wikimedia.org/T201039 (faidon) So... what's the status of this? What else has been observed, what has been done to troubleshoot and what's the latest from Juniper? I tried to access the Juniper case for m...
[12:58:00] Traffic, Operations, ops-eqiad: cp1080 uncorrectable DIMM error slot A5 - https://phabricator.wikimedia.org/T201174 (BBlack)
[12:59:19] Traffic, Operations, ops-eqiad: cp1085 bad DAC/SFP? - https://phabricator.wikimedia.org/T201175 (BBlack)
[12:59:55] Traffic, Operations, ops-eqiad, Patch-For-Review: rack/setup/install cp1075-cp1090 - https://phabricator.wikimedia.org/T195923 (BBlack) a:Cmjohnson>BBlack Hmmm maybe subtasks are better, setting some of those up: T201174 + T201175
[13:00:12] Traffic, Operations, ops-eqiad: cp1080 uncorrectable DIMM error slot A5 - https://phabricator.wikimedia.org/T201174 (BBlack)
[13:00:15] Traffic, Operations, ops-eqiad: cp1085 bad DAC/SFP? - https://phabricator.wikimedia.org/T201175 (BBlack)
[13:00:18] Traffic, Operations, ops-eqiad, Patch-For-Review: rack/setup/install cp1075-cp1090 - https://phabricator.wikimedia.org/T195923 (BBlack)
[13:01:18] wow this channel is really growing in stature, we attract spammers now
[13:02:24] /o\
[13:03:23] 91% of test coverage for x509.py and 77% for acme_requests.py
[13:03:36] I need to increase a little bit the coverage in acme_requests
[13:03:39] but it's looking good
[13:20:00] netops, Operations: Intermitent connectivity issues in eqiad's row C - https://phabricator.wikimedia.org/T201139 (jcrespo)
[13:41:17] Traffic, Operations, ops-eqiad: cp1085 bad DAC/SFP? - https://phabricator.wikimedia.org/T201175 (Cmjohnson) @bblack I replaced both sfp+'s please try again and let me know if the problem persists.
[13:42:40] Traffic, Operations, ops-eqiad: cp1080 uncorrectable DIMM error slot A5 - https://phabricator.wikimedia.org/T201174 (Cmjohnson) Description: A problem was detected in Memory Reference Code (MRC). ------------------------------------------------------------------------------- Record: 79 Date/Time...
[13:48:04] Traffic, Operations, ops-eqiad: cp1080 uncorrectable DIMM error slot A5 - https://phabricator.wikimedia.org/T201174 (Cmjohnson) I swapped DIMM in A5 with DIMM in B5 to see if the error follows the DIMM. Cleared the log
[13:49:33] Traffic, Operations, ops-eqiad, Patch-For-Review: rack/setup/install cp1075-cp1090 - https://phabricator.wikimedia.org/T195923 (ops-monitoring-bot) Script wmf-auto-reimage was launched by bblack on neodymium.eqiad.wmnet for hosts: ``` ['cp1085.eqiad.wmnet'] ``` The log can be found in `/var/log/w...
[13:55:28] Traffic, Operations, ops-eqiad: cp1085 bad DAC/SFP? - https://phabricator.wikimedia.org/T201175 (BBlack) Installer launches over PXE fine now, fixed :)
[14:01:32] bblack: seems like fixing cp1085 fixed the ICMP errors on https://grafana.wikimedia.org/dashboard/db/network-performances-global?panelId=20&fullscreen&orgId=1&from=now-30m&to=now
[14:02:40] XioNoX: meeting
[14:38:34] netops, Operations, ops-eqiad: asw2-a-eqiad VC link down - https://phabricator.wikimedia.org/T201095 (Cmjohnson) If we need to swap this cable, we will need to order more 5M 40G QFSP+ cables. I only have 3M spares. I don't know if we want to use the Fiberstore brand....we had a few bad cables duri...
[14:48:08] Traffic, Operations, ops-eqiad: cp1068 memory correctable errors - https://phabricator.wikimedia.org/T194757 (Cmjohnson) The server will need to be powered down to reseat DIMM...please schedule a day/time with me.
[15:04:19] netops, Operations, ops-eqiad: replace mr1-eqiad - https://phabricator.wikimedia.org/T185171 (Cmjohnson)
[15:05:10] bblack: I just confirmed with jynus... 9 IPSec tunnels per non-primary dc, between the proxysql node and the db masters
[15:05:11] Traffic, Operations, ops-eqiad, Patch-For-Review: rack/setup/install cp1075-cp1090 - https://phabricator.wikimedia.org/T195923 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['cp1085.eqiad.wmnet'] ``` and were **ALL** successful.
[15:05:42] Traffic, Operations, ops-eqiad: cp1068 memory correctable errors - https://phabricator.wikimedia.org/T194757 (BBlack) Open>declined Let's just skip this, it's one of the servers we'll be decomming once cp1075-90 are rolled into service.
[15:06:02] netops, Operations, ops-eqiad: replace mr1-eqiad - https://phabricator.wikimedia.org/T185171 (Cmjohnson) Open>Resolved @faidon I did not change racktables until it was removed from the rack. The old srx is now removed and added to the decom tracking sheet. I updated racktables to reflect the...
[15:06:11] vgutierrez: not too bad :)
[15:10:43] Traffic, Operations, ops-eqiad, Patch-For-Review: rack/setup/install cp1075-cp1090 - https://phabricator.wikimedia.org/T195923 (BBlack)
[15:10:47] Traffic, Operations, ops-eqiad: cp1085 bad DAC/SFP? - https://phabricator.wikimedia.org/T201175 (BBlack) Open>Resolved
[15:12:07] Traffic, Operations, ops-eqiad: cp1050 apparently stuck while "Initializing firmware interfaces..." - https://phabricator.wikimedia.org/T171168 (BBlack) Open>declined To be decommed in the next couple of weeks, no point
[15:12:28] Traffic, Operations, ops-eqiad: cp1053 possible hardware issues - https://phabricator.wikimedia.org/T165252 (BBlack) Open>declined To be decommed in the next couple of weeks, no point!
[16:02:03] netops, Operations, ops-eqiad: asw2-a-eqiad VC link down - https://phabricator.wikimedia.org/T201095 (ayounsi) We need to always have spares of the cables we use in production. Are spares being tracked somewhere? As this one is urgent to replace and Fiberstore is quick, I'd say go with them as well.
[16:28:21] netops, Operations: asw2-a-eqiad FPC5 gets disconnected every 10 minutes - https://phabricator.wikimedia.org/T201145 (ayounsi) a:ayounsi Juniper ticket 2018-0803-0360 created. According to Kibana, this started ~6h after T201095 got created. As it outputs Critical and Emergency logs, another questio...
[17:01:51] netops, Operations: connectivity issues between several hosts on asw2-b-eqiad - https://phabricator.wikimedia.org/T201039 (ayounsi) CCed you to the JTAC case, not sure how to make sure you have default access to all the cases. So far poor replies from JTAC, I'll escalate if it doesn't get proper respons...
[17:21:06] netops, Operations: cr1/2-eqiad PFE_FW_SYSLOG_IP6_GEN log entries - https://phabricator.wikimedia.org/T201149 (ayounsi) a:ayounsi `/kernel: Nexthop index allocation failed` due to router limitation and the design of our mgmt network (see description of T174397) `PFE_FW_SYSLOG_IP6_GEN` is temporary l...
[18:03:24] netops, Operations: Intermitent connectivity issues in eqiad's row C - https://phabricator.wikimedia.org/T201139 (ayounsi) a:ayounsi
[20:06:51] netops, Operations: connectivity issues between several hosts on asw2-b-eqiad - https://phabricator.wikimedia.org/T201039 (ayounsi) JTAC came back with troubleshooting and data gathering commands/configuration to do if the issue happen again.
[20:16:23] netops, Operations: asw2-a-eqiad FPC5 gets disconnected every 10 minutes - https://phabricator.wikimedia.org/T201145 (ayounsi) JTAC recommendation is to format and re-install the switch member using: https://kb.juniper.net/InfoCenter/index?page=content&id=KB20643 In their emails they say that only usb i...
[22:16:35] Traffic, DNS, Operations, ops-eqiad: rack/setup/install dns100[12].wikimedia.org - https://phabricator.wikimedia.org/T196691 (RobH)
[22:39:25] Traffic, DNS, Operations, ops-eqiad: rack/setup/install dns100[12].wikimedia.org - https://phabricator.wikimedia.org/T196691 (RobH) a:RobH>BBlack So these two systems fail their puppet runs, but fail for the following: Error: Could not retrieve catalog from remote server: Error 500 on SE...
[22:39:40] Traffic, DNS, Operations: rack/setup/install dns100[12].wikimedia.org - https://phabricator.wikimedia.org/T196691 (RobH)