[00:04:43] PROBLEM - Puppet run on deployment-ms-fe01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [00:23:08] RECOVERY - Long lived cherry-picks on puppetmaster on deployment-puppetmaster is OK: OK: Less than 100.00% above the threshold [0.0] [00:41:57] 10Beta-Cluster-Infrastructure, 07Puppet: Puppet failing on deployment-conf03 due to missing files - https://phabricator.wikimedia.org/T144703#2627269 (10AlexMonk-WMF) a:03AlexMonk-WMF [00:44:43] RECOVERY - Puppet run on deployment-ms-fe01 is OK: OK: Less than 1.00% above the threshold [0.0] [00:51:05] RECOVERY - Puppet run on deployment-conf03 is OK: OK: Less than 1.00% above the threshold [0.0] [02:12:45] PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [02:23:33] PROBLEM - Puppet run on deployment-mediawiki02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [02:45:12] 10Continuous-Integration-Config: stylelint:src CI job fails on custom less methods - https://phabricator.wikimedia.org/T145348#2627286 (10Tgr) [02:52:42] RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [03:03:33] RECOVERY - Puppet run on deployment-mediawiki02 is OK: OK: Less than 1.00% above the threshold [0.0] [03:13:30] PROBLEM - Puppet run on deployment-cache-text04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [03:53:28] RECOVERY - Puppet run on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0] [04:00:26] RECOVERY - Host deployment-parsoid05 is UP: PING OK - Packet loss = 0%, RTA = 0.58 ms [04:16:44] PROBLEM - Host deployment-parsoid05 is DOWN: CRITICAL - Host Unreachable (10.68.16.120) [04:18:15] Yippee, build fixed! 
[04:18:15] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #139: 09FIXED in 22 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/139/ [05:36:08] (03PS1) 10Tim Starling: Update location of parserTests.php [integration/config] - 10https://gerrit.wikimedia.org/r/309935 [05:55:54] (03CR) 10Legoktm: [C: 032] Update location of parserTests.php [integration/config] - 10https://gerrit.wikimedia.org/r/309935 (owner: 10Tim Starling) [06:27:14] PROBLEM - Puppet run on deployment-ores-redis is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [06:28:43] (03PS2) 10Tim Starling: Update location of parserTests.php [integration/config] - 10https://gerrit.wikimedia.org/r/309935 [06:28:50] (03CR) 10Legoktm: [C: 032] Update location of parserTests.php [integration/config] - 10https://gerrit.wikimedia.org/r/309935 (owner: 10Tim Starling) [06:31:12] (03Merged) 10jenkins-bot: Update location of parserTests.php [integration/config] - 10https://gerrit.wikimedia.org/r/309935 (owner: 10Tim Starling) [06:32:23] PROBLEM - Puppet run on integration-slave-trusty-1006 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [06:56:15] 06Release-Engineering-Team, 10Phabricator, 05acl*phabricator: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2627544 (10jcrespo) [06:58:05] 06Release-Engineering-Team, 10Phabricator, 05acl*phabricator: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2627544 (10Legoktm) I unarchived it for now. 
[07:02:11] 06Release-Engineering-Team, 10Phabricator, 05acl*phabricator: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2627558 (10jcrespo) I will wait to see if I did something wrongly or was just a mistake. [07:02:13] RECOVERY - Puppet run on deployment-ores-redis is OK: OK: Less than 1.00% above the threshold [0.0] [07:07:22] RECOVERY - Puppet run on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0] [07:54:36] 10Beta-Cluster-Infrastructure, 13Patch-For-Review, 07Puppet: Puppet failing on deployment-conf03 due to missing files - https://phabricator.wikimedia.org/T144703#2627614 (10hashar) [08:01:40] 10Beta-Cluster-Infrastructure, 03Scap3: Fixup beta scap3 keyholder problems - https://phabricator.wikimedia.org/T144647#2606197 (10hashar) p:05Triage>03Normal [08:37:53] PROBLEM - Puppet run on deployment-redis01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [08:38:01] 10Beta-Cluster-Infrastructure: deployment-pdf01 low free space warning - https://phabricator.wikimedia.org/T145343#2627173 (10hashar) ``` # du -m --max-depth 2 --one-file-system / |sort -rn|head -n10 6477 / 2738 /usr 2081 /usr/share 1854 /lib 1691 /lib/modules 1039 /home/cscott 1039 /home 54... [08:38:09] 10Beta-Cluster-Infrastructure: deployment-pdf01 low free space warning - https://phabricator.wikimedia.org/T145343#2627694 (10hashar) p:05Triage>03Normal [08:48:19] 10Beta-Cluster-Infrastructure, 07Puppet, 07Tracking: Deployment-prep hosts with puppet errors (tracking) - https://phabricator.wikimedia.org/T132259#2627710 (10hashar) [08:55:40] 10Deployment-Systems, 03Scap3, 13Patch-For-Review: Update Debian Package for Scap3 - https://phabricator.wikimedia.org/T127762#2627727 (10fgiunchedi) 05Open>03Resolved @thcipriani yup! 
I've updated the package on carbon to 3.2.5-1 [09:06:59] 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2627760 (10elukey) [09:12:54] RECOVERY - Puppet run on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0] [09:17:20] 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2627786 (10elukey) I wanted to spin up a new Debian instance with Horizon for deployment-prep but it seems that we are already hitting the resource limits: {F4459123} Maybe... [09:38:15] 03Scap3, 10Mathoid, 06Services, 13Patch-For-Review, 15User-mobrovac: Enable Scap3 config deploys for Mathoid - https://phabricator.wikimedia.org/T144755#2627816 (10mobrovac) [09:38:16] 03Scap3: Scap fails to force-deploy the config - https://phabricator.wikimedia.org/T145194#2627813 (10mobrovac) 05Open>03Resolved a:03thcipriani Fixed in v3.2.5-1. Thnx @thcipriani ! [09:40:29] 03Scap3: Scap fails to force-deploy the config - https://phabricator.wikimedia.org/T145194#2627817 (10mobrovac) [09:42:30] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T143328#2564761 (10Legoktm) @demon: special request this week, could {b43ac35351e70f3b6429cc527509ac33f52c6404} / https://gerrit.wikimedia.org/r/#/c/309061/ be reverted in the w... 
[09:55:00] PROBLEM - Puppet run on deployment-eventlogging03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [10:13:43] PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [10:34:58] RECOVERY - Puppet run on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0] [10:41:19] 03Scap3, 10Mathoid, 06Services, 13Patch-For-Review, 15User-mobrovac: Enable Scap3 config deploys for Mathoid - https://phabricator.wikimedia.org/T144755#2627943 (10mobrovac) 05Open>03Resolved [10:41:39] 03Scap3, 10Mathoid, 06Services, 15User-mobrovac: Enable Scap3 config deploys for Mathoid - https://phabricator.wikimedia.org/T144755#2609180 (10mobrovac) [10:53:43] RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [11:03:26] 03Scap3: Local config deploys should use the target's current version - https://phabricator.wikimedia.org/T145373#2627981 (10mobrovac) [11:04:28] 06Release-Engineering-Team, 10Phabricator, 05acl*phabricator: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2627994 (10Aklapper) > @Aklapper archived Blocked-on-schema-change, despite being created with his ok, now WMF schema creation... [11:05:01] 06Release-Engineering-Team, 06Project-Admins: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2627998 (10Aklapper) [11:07:36] 10Browser-Tests-Infrastructure, 06Release-Engineering-Team, 10MediaWiki-extensions-Examples, 07Documentation, and 5 others: Improve documentation around running/writing (with lots of examples) browser tests - https://phabricator.wikimedia.org/T108108#2628024 (10zeljkofilipin) Reported redirect as {{T145374}}. 
[11:25:29] 06Release-Engineering-Team, 06Operations, 06Project-Admins: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2628076 (10jcrespo) @Aklapper you literally told me to create this project: T119751#1835024 Now this and #Blocked-on-operation... [11:45:29] good morning hashar! [11:51:10] addshore: o/ :) [11:51:18] :D [11:51:31] hashar: is there any chance I could get you to create a 'wmde' group on gerrit? [11:52:02] right now we have wikidata & tcb-team but we are trying out some new teams and it will be best to have everyone in 1 group (as well as keeping these legacy ones) [11:52:09] addshore: Gerrit supports LDAP group lookup with a syntax such as ldap/wmde [11:52:16] and there is a wmde group in the labs LDAP [11:52:23] oooh [11:52:38] aude and I had the "wmde" LDAP groups created specially for that [11:52:41] but we can't easilyl add people to ldap groups. how can I see who is in the ladap group already? [11:52:42] similar to the wmf group [11:52:44] and later nda group [11:53:14] so in existing Gerrit groups, you can try to add the 'person' ldap/wmde [11:53:20] and it might just do the right thing [11:53:59] just looked on gerrit and ldap/wmde didn't seem to work :/ [11:54:15] example for operations/puppet.git , the Owner is ldap/ops [11:54:15] https://gerrit.wikimedia.org/r/#/admin/projects/operations/puppet,access [11:54:25] ooooh [11:54:32] maybe ldap groups cant be added to Gerrit groups [11:54:42] okay, then I guess I just need to find the list of people in the ldap group! [11:54:56] * hashar tries on wikidata group https://gerrit.wikimedia.org/r/#/admin/groups/32 [11:55:10] yeah works [11:55:18] gotta head to "Included Groups" at the bottom [11:55:24] awesome [11:55:32] granted folks are in LDAP [11:55:40] how can I check that? [11:55:43] you can remove them from the list of members [11:55:44] hmm [11:55:48] I always find ldap a struggle to get data out of... 
[11:55:53] some complicated LDAP query on a labs instance [11:56:12] we had ldap-list utility as a wrapper, but apparently ops did not like it and it is going to disappear [11:56:28] so you end up having to rely on ldapsearch [11:57:00] ldapsearch -xLLL -H ldap://server.domain.net \ [11:57:00] -b "cn=users,dc=server,dc=domain,dc=net" uid=username1 \* + [11:57:05] !log Gerrit: added ldap/wmde as an included group of the 'wikidata' group. Asked by and demoed to addshore [11:57:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [11:57:29] addshore: yeah something like that ;D [11:57:32] 06Release-Engineering-Team, 06Operations, 06Project-Admins: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2628197 (10jcrespo) [11:57:38] https://wikitech.wikimedia.org/wiki/Ldapsearch is ridiculously empty though :( [11:57:47] right, a couple other things! [11:57:50] we used to have: https://wikitech.wikimedia.org/wiki/Ldaplist [11:57:57] or: ldaplist -l passwd johndoe [11:58:01] ldaplist -l group wmde [11:58:25] Can I get added as the owner of https://gerrit.wikimedia.org/r/#/admin/projects/analytics/wmde that I can add some more people (once I figure out this ldap stuff)? [11:58:53] well you are the owner already [11:59:07] project shows the owner is the group analytics-wmde : https://gerrit.wikimedia.org/r/#/admin/projects/analytics/wmde,access [11:59:17] and that group has only a single member https://gerrit.wikimedia.org/r/#/admin/groups/uuid-a3215befbf6301605393aa39dfa6364a05b7fc9b [11:59:26] so you add more members [11:59:30] https://usercontent.irccloud-cdn.com/file/BQwRKuCD/ [11:59:37] or be bold and add ldap/wmde as an included group [11:59:46] oh [12:00:09] also these repos needs to be slightly finer grained and keep my as the only one that can merge into the production branch right now [12:00:16] click the "Members" link on the left? 
[12:00:30] ahh no [12:00:32] https://usercontent.irccloud-cdn.com/file/6uYFmGrD/ [12:00:34] input fields are grey [12:00:45] the group "analytics-wmde" belongs to "Project and Group creators" [12:00:50] let me change that so it is self owned [12:01:06] !log Gerrit: made analytics-wmde group to be owned by themselves [12:01:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [12:01:11] done [12:01:13] retry? :] [12:03:49] !log upgrading elasticsearch to 2.4.0 on deployment-elastic0? - T145058 [12:03:52] 06Release-Engineering-Team, 06Operations, 06Project-Admins: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2628201 (10Aklapper) >>! In T145361#2628076, @jcrespo wrote: > @Aklapper you literally told me to create this project: More rec... [12:03:52] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [12:04:16] hashar: that looks better! [12:04:24] now let me figure out this ldap thing [12:05:05] hashar: any idea what our ldap server is called? :P [12:07:57] 10Browser-Tests-Infrastructure, 07Jenkins, 13Patch-For-Review, 07Ruby, 15User-zeljkofilipin: MEDIAWIKI_URL may be set to incorrect value in mwext-mw-selenium job - https://phabricator.wikimedia.org/T144912#2628210 (10zeljkofilipin) The problem is that `#get_wikitext` uses `index.php`'s `action=raw` mode... [12:11:27] ahh ldaplist -l group wmde still works on labs bastion hashar ! 
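Editor's sketch of the `ldapsearch` route discussed above, for anyone without `ldaplist`. This is not how `ldaplist` itself works; it only shows how a group lookup could be assembled and its output read. The server URI, the base DN and the use of a `member` attribute holding `uid=...` DNs are assumptions (the log gives none of them); only the `-x`, `-LLL`, `-H` and `-b` flags come from the command quoted in the conversation.

```python
import shlex

# Hypothetical placeholders: the log does not name the real LDAP server or base DN.
LDAP_URI = "ldap://ldap.example.org"
GROUP_BASE = "ou=groups,dc=example,dc=org"


def group_search_argv(group, uri=LDAP_URI, base=GROUP_BASE):
    """Build an ldapsearch command line asking for one group's members."""
    return [
        "ldapsearch",
        "-x",              # simple authentication instead of SASL
        "-LLL",            # plain LDIF, without comments or version line
        "-H", uri,         # LDAP server URI
        "-b", base,        # search base
        f"(cn={group})",   # filter: the group entry itself
        "member",          # assumption: membership is stored in 'member'
    ]


def parse_members(ldif):
    """Return the uid from every 'member: uid=...,...' line of LDIF text.

    Folded (continued) lines and base64 'member::' values are ignored
    in this sketch.
    """
    uids = []
    for line in ldif.splitlines():
        if not line.startswith("member: "):
            continue
        dn = line[len("member: "):]
        first_rdn = dn.split(",", 1)[0]          # e.g. 'uid=alice'
        key, _, value = first_rdn.partition("=")
        if key == "uid":
            uids.append(value)
    return uids


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(group_search_argv("wmde")))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; the output of a real `ldapsearch` run could then be passed to `parse_members` to get a plain list of uids to compare against the current team roster.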
[12:13:00] neat [12:13:23] boy is the list out of date :P [12:14:41] PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [12:15:03] addshore: I think the primary use case was to grant you guys rights on Jenkins [12:15:12] so we would fill a RT ticket asking for one to be added [12:15:21] I guess the list can be revisited [12:15:39] and whatever process WMDE uses to enroll / remove people could add a step asking for the person to be added/removed from ldap/wmde [12:15:50] yeh, I might files some phab tickets, at least 2 people should probably be removed and a buck load added! :) [12:16:04] I think request would be handled by ops via #ops-access-requests https://phabricator.wikimedia.org/project/view/956/ [12:16:10] 06Release-Engineering-Team, 06Operations, 06Project-Admins: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2628241 (10jcrespo) @Aklapper, now that I know the context (I wasn't aware of that task, and that it a problem by itself), it wa... [12:16:28] or maybe it is LDAP-Access-Requests https://phabricator.wikimedia.org/project/view/1564/ :D [12:16:48] yup! 
[12:17:02] yeah that later phabricator projects seems more appropriate [12:17:14] the ops-access-requests one is for shell access on the wikimedia prod cluster [12:17:35] guess you can fill in a single task with list of address/uid to remove and list of those to be added [12:17:58] I think some non ops can process the ldap modifications (Chad can definitly) [12:29:37] 10Beta-Cluster-Infrastructure: Set 'cluster' salt grain appropriately for all instances in beta cluster - https://phabricator.wikimedia.org/T87199#2628268 (10hashar) [12:47:06] 10Continuous-Integration-Infrastructure (phase-out-gallium), 13Patch-For-Review: Firewall rules for labs support host to communicate with contint1001.wikimedia.org (new gallium) - https://phabricator.wikimedia.org/T137323#2628340 (10hashar) @mmodell do we still need Harbormaster on iridum to be able to talk to... [12:50:23] !log rolling back upgrading elasticsearch to 2.4.0 on deployment-elastic05 - T145058 [12:50:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [12:54:43] RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [13:07:23] 06Release-Engineering-Team, 06Operations, 06Project-Admins: #blocked-on-schema-change was archived, now schema change workflow is broken - https://phabricator.wikimedia.org/T145361#2628396 (10Aklapper) @jcrespo: Makes a lot of sense! No worries; I just was a bit confused by the choice of words (and probably... [13:22:35] gehel: thank you for taking of beta cluster ElasticSearch boxes :]] [13:23:27] hashar: yeah, not that successful... 
We'll do better next time :P [13:47:38] Project selenium-VisualEditor » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #143: 04FAILURE in 3 min 37 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/143/ [13:47:47] hashar: if you get time amongst your deploying I would appreciate a +2 on https://gerrit.wikimedia.org/r/#/c/309991 (which I accidently removed).... [13:51:21] :D [13:51:42] add done! [13:51:50] thanks! [13:51:59] https://gerrit.wikimedia.org/r/#/admin/projects/analytics/wmde,access [13:52:03] not sure it makes much sense [13:52:12] yeh, I think thats good [13:52:39] so I believe I can exclusively submit on the production branch, even though everyone is able to go and add themselves to it? :P [13:52:43] 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2628562 (10elukey) @greg, @hashar - I just created deployment-mediawiki04 in deployment-prep and the VCPU quota is now maxed out (there were only 4 VCPUs left). Do we need... [13:52:51] addshore: looks like :] [13:53:00] and JenkinsBot can submit as well [13:53:02] on any ref [13:53:13] PROBLEM - Puppet run on deployment-ores-redis is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [13:53:30] not if I mark it as exclusive! ;)_ [13:53:33] i think.. [13:53:42] as I saw some other repos with exceptions to allow jenkinsbot! [13:56:10] 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2628588 (10hashar) We had two 8 CPU / 16 G instances created to migrate the databases to Jessie T138778 that is scheduled for Thursday. Once migrated I guess they will be del... 
[13:57:23] addshore: I have no idea what "exclusive" means :( [13:57:29] well it is in Gerrit [13:57:32] but I dont know what is the effect [13:57:50] it meands what it says there is exclusive / the only thing that applies, and jenkins bot gets its rights form further up the chain (i think) [13:57:55] *menas [13:58:00] **means.. [14:14:42] 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2628704 (10thcipriani) >>! In T135427#2576224, @thcipriani wrote: > 1. Ensure every cherry-picked patch has a 'Bug: TXXXXXX' > 2. Check that task: > * Closed? Remove ch... [14:15:31] 03Scap3, 10ChangeProp, 10EventBus, 06Services, 15User-mobrovac: Enable Scap config deploys for Change Propagation - https://phabricator.wikimedia.org/T144595#2628705 (10mobrovac) 05Open>03Resolved [14:18:19] 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2628717 (10mobrovac) >>! In T135427#2628704, @thcipriani wrote: > Created https://phabricator.wikimedia.org/p/beta-puppetmaster/ bot user to comment on tasks. Refining pu... [14:33:15] RECOVERY - Puppet run on deployment-ores-redis is OK: OK: Less than 1.00% above the threshold [0.0] [14:35:02] Hello everybody, for https://phabricator.wikimedia.org/T144006 I created deployment-mediawiki04.deployment-prep.eqiad.wmflabs (debian) and now I'd need to add the mw appserver role to it [14:36:55] https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&instanceid=e15c7433-de55-4ed2-bf93-a4127fc30b52&project=deployment-prep&region=eqiad !
:) [14:37:28] from the private message, you can copy paste from deployment-mediawiki02 : base::firewall, beta::deployaccess, mediawiki::conftool, role::mediawiki::appserver [14:38:16] then the process should be the same for any other hosts on beta [14:41:35] !log applied base::firewall, beta::deployaccess, mediawiki::conftool, role::mediawiki::appserver to deployment-mediawiki04.deployment-prep.eqiad.wmflabs (Debian jessie instance) - T144006 [14:41:39] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [14:45:13] puppet started :) [14:47:06] will take a while [14:47:27] then once the puppet patch is landed / puppet run [14:47:46] puppet gotta run on tin.deployment-prep.eqiad.wmflabs to update the dsh files scap relies on [14:48:16] then try triggering a scap https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/ to push mw to the new box [14:48:18] and see what happens [14:50:53] hashar hi, i now have an irc bouncer :) [14:52:41] paladox: you will never miss a message: :D [14:52:58] Well actually i will since it's limit is 500 [14:53:09] I need to figure out how to set it for even furthur [14:53:13] But it is free :) [14:56:27] hashar one of my patches was merged upstream in gerrit https://gerrit-review.googlesource.com/#/c/85912/ :) [14:58:51] neat :) [14:59:04] yep [14:59:39] i also have another patch https://gerrit-review.googlesource.com/#/c/86011/ that fixes polygerrit on browsers that doint support Promise :) [15:00:46] theres also https://gerrit-review.googlesource.com/#/c/85340/ which will fix one of our bugs and allow for better code reviews since the diff wont be cut off [15:00:55] as long as it is enabled in your diff preference [15:00:56] :) [15:04:14] 10Continuous-Integration-Infrastructure (phase-out-gallium), 06Operations: Upgrade Zuul on scandium.eqiad.wmnet (Jessie zuul-merger) - https://phabricator.wikimedia.org/T145057#2628851 (10hashar) [15:04:16] 10Continuous-Integration-Infrastructure, 
10Packaging, 07Zuul: Package / puppetize zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#2628850 (10hashar) [15:05:00] 10Continuous-Integration-Infrastructure, 10Packaging, 07Zuul: Package / puppetize zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#1392464 (10hashar) I have build a new Jessie package and it is going to be upgraded on scandium (zuul merger) via T145057 [15:05:53] 10Continuous-Integration-Infrastructure, 07Technical-Debt: Relocate CI generated docs and coverage reports - https://phabricator.wikimedia.org/T137890#2628857 (10hashar) a:05hashar>03None Not working on it for now. It is staying on gallium/contint1001. [15:11:50] 10Continuous-Integration-Config, 13Patch-For-Review, 07Regression, 07Upstream, 07Zuul: integration-zuul-layoutdiff claims difference when there is none - https://phabricator.wikimedia.org/T143966#2628873 (10hashar) 05Open>03Resolved That has been resolved by deploying a newer version of Zuul on galli... [15:15:31] 10Continuous-Integration-Infrastructure (phase-out-gallium), 06Operations, 10Traffic: Move gallium to an internal host? - https://phabricator.wikimedia.org/T133150#2628884 (10hashar) [15:15:33] 10Continuous-Integration-Infrastructure (phase-out-gallium), 03releng-201617-q1, 07Wikimedia-Incident: Phase out gallium.wikimedia.org - https://phabricator.wikimedia.org/T95757#2628885 (10hashar) [15:15:36] 10Continuous-Integration-Infrastructure (phase-out-gallium): Target architecture without gallium.wikimedia.org - https://phabricator.wikimedia.org/T133300#2628882 (10hashar) 05Open>03Resolved That came at length but as an outcome of T140257 the new machine contint1001 is going to use the same architecture as... [15:21:53] hashar: puppet completed on mw04 [15:23:42] hashar: anything against me merging https://gerrit.wikimedia.org/r/#/c/309999/ ? [15:26:20] elukey: go go go ! 
:) [15:26:36] elukey: hmm no :D [15:26:38] sorry [15:26:44] will cause varnish to start routing traffic to it [15:26:55] cause of cache::text::apps https://gerrit.wikimedia.org/r/#/c/309999/2/hieradata/labs.yaml [15:27:08] well yeah but after scap pull will be ok [15:27:16] probably want to add the dsh change first https://gerrit.wikimedia.org/r/#/c/309999/2/hieradata/labs/deployment-prep/common.yaml [15:27:17] if you want I can split the change in two [15:27:22] okok [15:27:25] had the same idea [15:27:25] that can probably be live hacked on the puppet master [15:27:27] run scap [15:27:35] verify that mw works on the new server [15:27:44] then cherry pick the patch / run puppet on varnishes [15:28:01] you can land both files in a single puppet patch, but the deployment would need to be split [15:28:05] or split in two changes [15:28:09] sorry I have just caught that :D [15:30:34] or we can live hack /etc/dsh/group/mediawiki-installation on deployment-tin [15:30:43] to add the new server, run scap manually and see what happens :D [15:31:00] I'd like it :) [15:31:03] at least [15:31:12] it seems that the new mw server has some content on /srv/mediawiki [15:31:18] so maybe puppet magically scap pull [15:31:44] it should do it [15:31:59] if the host is in conftool [15:32:00] mmmm [15:32:08] yeah we might be lucky [15:32:14] let's try if mw/apache works [15:32:22] yup [15:32:57] Giuseppe worked on a patch to eliminate the need to have things in dsh, giving conftool the authority on generating that data [15:33:10] BUT I have no idea if it works for Labs too.. I'd say yes [15:33:49] scap pull works nicely :) [15:34:07] I am in audio with greg [15:34:50] maybe via curl ? [15:35:41] yep trying [16:00:03] bah: curl -H 'Host: en.wikipedia.beta.wmflabs.org' http://127.0.0.1/ [16:00:10] yields

Domain not configured
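Editor's sketch of the same check as the `curl -H 'Host: ...'` call above, in Python: connect to the server by address while presenting a chosen virtual host name, then look for the "Domain not configured" marker. The function names are made up for illustration, and the assumption that the marker text is enough to recognise an unconfigured vhost is only based on the output pasted here.

```python
import http.client


def fetch_as_vhost(address, vhost, port=80, path="/", timeout=5.0):
    """GET `path` from `address`, sending `vhost` as the Host header.

    Returns a (status_code, body_text) tuple.
    """
    conn = http.client.HTTPConnection(address, port, timeout=timeout)
    try:
        # Passing Host explicitly replaces the default header that
        # http.client would otherwise derive from `address`.
        conn.request("GET", path, headers={"Host": vhost})
        resp = conn.getresponse()
        return resp.status, resp.read().decode("utf-8", errors="replace")
    finally:
        conn.close()


def looks_unconfigured(body):
    """True if the response body carries the 'Domain not configured' marker."""
    return "Domain not configured" in body


if __name__ == "__main__":
    status, body = fetch_as_vhost("127.0.0.1", "en.wikipedia.beta.wmflabs.org")
    print(status, "unconfigured" if looks_unconfigured(body) else "served")
```

Run on the new appserver itself, this would report whether the web server answers for the beta hostname before any traffic is routed to it.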

[16:00:11] :( [16:02:32] hashar: sorry I am in meetings, I checked the dsh on deployment-prep-tin and 04 is not there [16:02:35] so we need puppet [16:03:06] elukey: I am in team meeting as well :) lets catch up tomorrow so :D [16:03:27] sure :) [16:26:24] 10Browser-Tests-Infrastructure, 07Jenkins, 13Patch-For-Review, 07Ruby, 15User-zeljkofilipin: MEDIAWIKI_URL may be set to incorrect value in mwext-mw-selenium job - https://phabricator.wikimedia.org/T144912#2629295 (10MBinder_WMF) [16:27:43] 10Browser-Tests-Infrastructure, 07Jenkins, 13Patch-For-Review, 07Ruby, 15User-zeljkofilipin: MEDIAWIKI_URL may be set to incorrect value in mwext-mw-selenium job - https://phabricator.wikimedia.org/T144912#2629301 (10jhobs) [16:28:27] 10Browser-Tests-Infrastructure, 06Reading-Web-Backlog, 07Jenkins, 13Patch-For-Review, and 3 others: MEDIAWIKI_URL may be set to incorrect value in mwext-mw-selenium job - https://phabricator.wikimedia.org/T144912#2629307 (10MBinder_WMF) [16:37:29] 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM, 13Patch-For-Review: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2629355 (10hashar) https://wikitech.wikimedia.org/wiki/HHVM/Troubleshooting has some interesting bits furl http://en.wikipedia.beta.wmflabs.org/wiki/M... [16:38:53] hasharAway: I've split the patch into two (dsh + traffic routing) [16:45:43] PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [16:49:18] stuff I havent said: [16:49:34] flukey is moving the MW Beta appservers to Debian https://phabricator.wikimedia.org/T144006 \O/ [16:49:46] err s/flukey/elukey/ [16:51:02] hashar: ok to pick up the dsh work tomorrow morning? [16:51:09] twentyafterfour: if Harbormaster on iridium does not need to talk to Zuul/Gearman, I guess we can drop the related iptables rule ( https://phabricator.wikimedia.org/T137323#2617144 ) [16:51:18] elukey: yeah together with the zuul upgrade for jessie ? 
[16:51:34] elukey: looks like the new mw server is working. It should be straightforward to add it [16:51:37] hashar: yeah [16:51:43] hashar yep iridium dosent use gearman, i thought it did but it dosent. It uses jenkins http [16:51:58] yeah [16:51:58] we thought about using gearman [16:52:05] then Mukunda found a magic plugin :D [16:52:32] Oh, but i like zuul display and gearman will allow it to go accross several nodes. [16:52:47] will send a puppet patch [16:52:50] 10Continuous-Integration-Infrastructure (phase-out-gallium), 13Patch-For-Review: Firewall rules for labs support host to communicate with contint1001.wikimedia.org (new gallium) - https://phabricator.wikimedia.org/T137323#2629481 (10hashar) Confirmed with @20after4 , there is no more need for Harbormaster/Iri... [16:54:49] hashar: sure, zuul + jessie :) [16:54:49] 10Continuous-Integration-Infrastructure (phase-out-gallium), 13Patch-For-Review: Firewall rules for labs support host to communicate with contint1001.wikimedia.org (new gallium) - https://phabricator.wikimedia.org/T137323#2629494 (10mmodell) Indeed, confirmed. [16:56:28] twentyafterfour: easy https://gerrit.wikimedia.org/r/#/c/310039/ :) thank you! [16:57:08] 10Continuous-Integration-Infrastructure (phase-out-gallium), 06Operations, 13Patch-For-Review: Migrate CI services from gallium to contint1001 - https://phabricator.wikimedia.org/T137358#2629559 (10hashar) [16:57:12] 10Continuous-Integration-Infrastructure (phase-out-gallium), 13Patch-For-Review: Firewall rules for labs support host to communicate with contint1001.wikimedia.org (new gallium) - https://phabricator.wikimedia.org/T137323#2629556 (10hashar) 05Open>03Resolved a:03hashar All rules have been tested and work... 
[16:57:40] 10Continuous-Integration-Infrastructure (phase-out-gallium), 13Patch-For-Review: Firewall rules for labs support host to communicate with contint1001.wikimedia.org (new gallium) - https://phabricator.wikimedia.org/T137323#2629565 (10hashar) [16:58:02] I am off. Have an happy day [17:03:54] PROBLEM - Puppet run on deployment-redis01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:11:07] 06Release-Engineering-Team, 10DBA, 10MediaWiki-Maintenance-scripts, 06Operations, and 2 others: Add section for long-running tasks on the Deployment page (specially for database maintenance) - https://phabricator.wikimedia.org/T144661#2629609 (10greg) Here's what the output looks like from jouncebot when t... [17:13:59] 10Beta-Cluster-Infrastructure, 06Operations, 13Patch-For-Review: /mnt/upload7 does not exist anywhere, yet it is referenced in multiple places in wmf-config - https://phabricator.wikimedia.org/T129586#2629613 (10AlexMonk-WMF) 05Open>03Resolved thanks @hashar [17:20:42] RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [17:25:18] 06Release-Engineering-Team, 10DBA, 10MediaWiki-Maintenance-scripts, 06Operations, and 2 others: Add section for long-running tasks on the Deployment page (specially for database maintenance) - https://phabricator.wikimedia.org/T144661#2629638 (10bd808) >>! In T144661#2629609, @greg wrote: > @bd808: it migh... [17:26:19] 06Release-Engineering-Team, 10DBA, 10MediaWiki-Maintenance-scripts, 06Operations, and 2 others: Add section for long-running tasks on the Deployment page (specially for database maintenance) - https://phabricator.wikimedia.org/T144661#2629655 (10greg) yeah, I'd prefer both because I know some people ignore... [17:27:39] 10Gerrit, 06Developer-Relations: Add a welcome bot to Gerrit for first time contributors - https://phabricator.wikimedia.org/T73357#2629661 (10Aklapper) Thanks @legoktm! 
The text used by OpenStack is pretty friendly, explaining, and great. I have problems to rephrase it to not be plagiarism (and I don't see... [17:30:59] 10Gerrit, 06Developer-Relations: Add a welcome bot to Gerrit for first time contributors - https://phabricator.wikimedia.org/T73357#2629666 (10greg) Pretty sure it's under a free license, so as long as we attribute correctly in our source you don't need to worry about plagiarism :) [17:34:33] 10Gerrit, 06Developer-Relations: Add a welcome bot to Gerrit for first time contributors - https://phabricator.wikimedia.org/T73357#2629671 (10Reedy) >>! In T73357#2629666, @greg wrote: > Pretty sure it's under a free license, so as long as we attribute correctly in our source you don't need to worry about pla... [17:43:54] RECOVERY - Puppet run on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:58:49] hello! I'm having issue deploying config files with scap3 [17:59:04] my config is created correctly in .git/config-files fromthe template, the log says that a symlink is created, but it does not seem to be the case [18:47:08] PROBLEM - Puppet run on deployment-db1 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [18:48:43] 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team, 10Wikimedia-General-or-Unknown: Allow to test a mediawiki-config change to the beta cluster - https://phabricator.wikimedia.org/T136828#2630026 (10Dereckson) 05Resolved>03Open I'm reopening the bug following a discussion with @Urbanecm. Martin w... [19:00:21] 03Scap3: Scap3 config references to deployed directory - https://phabricator.wikimedia.org/T145437#2630067 (10thcipriani) [19:22:07] RECOVERY - Puppet run on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0] [20:00:41] 06Release-Engineering-Team, 10Dumps-Generation, 13Patch-For-Review: getMWVersion, used by dumps, was removed. Please restore. 
- https://phabricator.wikimedia.org/T145336#2630302 (10ArielGlenn) 05Open>03Resolved Thanks Chad! Abstracts are already running on commonswiki as we speak. Closing! [20:04:23] 10Beta-Cluster-Infrastructure, 06Operations, 13Patch-For-Review: /mnt/upload7 does not exist anywhere, yet it is referenced in multiple places in wmf-config - https://phabricator.wikimedia.org/T129586#2630321 (10hashar) Well done @AlexMonk-WMF that the last item of a long tail. Quite an achievement \o/ [20:04:40] /mnt/upload7 is gone at last [20:04:43] rejoice! [20:14:37] hasharAway: awesome! [20:14:50] oh I have done nothing [20:14:55] bd808: it is all Krenair :] [20:15:52] :) The answer to so many problems is "get Krenair interested in solving it" [20:16:20] yeah I have a list of known and reliable fixers [20:16:23] you are on it :D [20:18:42] hasharAway: awww. I do try to fix some broken windows. [20:22:21] thanks hasharAway & bd808 :) [20:27:01] sleep!! sleep ! *wave* [20:35:38] :) [20:43:06] PROBLEM - Puppet run on deployment-db1 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:07:01] PROBLEM - Puppet run on deployment-elastic07 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:14:20] 03Scap3, 10Parsoid: Rollback failed when target is down - https://phabricator.wikimedia.org/T145460#2630755 (10Arlolra) [21:15:25] 10Continuous-Integration-Config, 10MediaWiki-extensions-RelatedArticles, 06Reading-Web-Backlog, 07Browser-Tests, 03Reading-Web-Sprint-81-We-suck-at-coming-up-with-sprint-names: RelatedArticles browser tests should run on a commit basis - https://phabricator.wikimedia.org/T120715#2630772 (10Jdlrobson) [21:23:06] RECOVERY - Puppet run on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0] [21:23:28] 10Continuous-Integration-Config, 10MediaWiki-extensions-RelatedArticles, 06Reading-Web-Backlog, 07Browser-Tests, and 2 others: RelatedArticles browser tests should run on a commit basis - 
https://phabricator.wikimedia.org/T120715#1902206 (10Jdlrobson) a:03Jdlrobson [21:23:38] PROBLEM - Puppet run on deployment-kafka05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [21:30:17] (03PS1) 10Jdlrobson: Run RelatedArticles browser tests on every commit [integration/config] - 10https://gerrit.wikimedia.org/r/310141 (https://phabricator.wikimedia.org/T120715) [21:44:18] 06Release-Engineering-Team, 15User-greg: Update annual plan roadmap thing on office - https://phabricator.wikimedia.org/T139180#2630959 (10greg) 05Open>03Resolved a:03greg [21:47:01] RECOVERY - Puppet run on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0] [21:59:53] PROBLEM - Puppet run on deployment-db03 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:01:39] 10Continuous-Integration-Config, 10MediaWiki-extensions-RelatedArticles, 06Reading-Web-Backlog, 07Browser-Tests, and 2 others: RelatedArticles browser tests should run on a commit basis - https://phabricator.wikimedia.org/T120715#2631000 (10Jdlrobson) Browser tests run via `check experimental` are now apas... 
[22:02:11] PROBLEM - Puppet run on deployment-ircd is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:02:37] PROBLEM - Puppet run on deployment-db04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:03:37] RECOVERY - Puppet run on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0] [22:09:54] RECOVERY - Puppet run on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:12:34] RECOVERY - Puppet run on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0] [22:18:40] PROBLEM - Puppet run on deployment-poolcounter02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:37:10] RECOVERY - Puppet run on deployment-ircd is OK: OK: Less than 1.00% above the threshold [0.0] [22:56:00] PROBLEM - Puppet run on deployment-eventlogging03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:58:40] RECOVERY - Puppet run on deployment-poolcounter02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:24:52] 06Release-Engineering-Team, 15User-greg: Create P&T offsite slides (due 9/12) - https://phabricator.wikimedia.org/T144511#2631283 (10greg) 05Open>03Resolved [23:30:59] RECOVERY - Puppet run on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:47:31] thcipriani twentyafterfour: o/ is it possible to get a jenkins box that is real hardware and not a vm instance? i'm trying to get some of this android emulation shiz working and it looks like this "WebView" component requires a physical GPU [23:52:40] niedzielski: hrm, we've never done that before afaik. The only jenkins box on physical hardware is the master instance. [23:53:05] thcipriani: ok thanks. 
i didn't know if you guys had a closet of instances somewhere [23:57:12] nope, we're a virtual team only ;) [23:59:05] you could probably request a physical box and make it a jenkins slave, though I think they prefer to have all jenkins slaves in labs, and physical boxes in labs is...