[00:02:25] Labs, Horizon: Horizon dashboard for managing instance puppet config - https://phabricator.wikimedia.org/T91990#2221357 (yuvipanda) With more discussion between me, @chasemp and @bd808, I think https://etherpad.wikimedia.org/p/puppet-enc-labs is at a fairly steady solid state.
[00:31:41] Labs, Tool-Labs: Web requests fail after a period of time - https://phabricator.wikimedia.org/T133090#2221474 (Nettrom) I unfortunately have to report that the problem wasn't resolved, when I checked a couple of hours after @valhallasw's update, 503s were again reported. I tried a similar approach, kill...
[01:17:35] Labs, Tool-Labs: Web requests fail after a period of time - https://phabricator.wikimedia.org/T133090#2219634 (bd808) tools.suggestbot's ~/service.log shows a lot of restarts as well: ``` 2016-04-19T22:32:14.457170 No running webservice job found, attempting to start it 2016-04-19T22:45:58.119421 No runn...
[04:55:28] Labs, Tool-Labs, Zero: Tool labs tools should have a method of identifying Zero traffic - https://phabricator.wikimedia.org/T131934#2221665 (jayvdb) >>! In T131934#2220571, @BBlack wrote: > If we dropped Zero-rating for that carrier, you'd probably still be getting abuse from that carrier. Why do yo...
[05:02:07] Labs, Tool-Labs, Zero: Tool labs tools should have a method of identifying Zero traffic - https://phabricator.wikimedia.org/T131934#2221666 (jayvdb) >>! In T131934#2198320, @Gunnex wrote: > ... > 3. most of them are using "video2commons" in combination with "googlevideo.com" as source (typical url: "...
[05:26:50] Labs, Tool-Labs, Zero: Tool labs tools should have a method of identifying Zero traffic - https://phabricator.wikimedia.org/T131934#2221686 (jayvdb) >>! In T131934#2221083, @DFoy wrote: > At this point, we have established that we have the technical capability to identify Zero traffic on tool labs to...
[06:26:26] Labs, Monitoring, Operations, wikitech.wikimedia.org: Bacula recovery of sql files from silver/wikitech fails - https://phabricator.wikimedia.org/T131195#2221790 (jcrespo) alex- I think I was able to recover to a different host, but not from that file. But I may be wrong. In any case, the root pr...
[07:25:04] Labs, Tool-Labs, Zero: Tool labs tools should have a method of identifying Zero traffic - https://phabricator.wikimedia.org/T131934#2221830 (Gunnex) >>! In T131934#2221083, @DFoy wrote: > (...) > I agree here about getting some data together. To proceed, I think we need to know what the typical age o...
[08:34:52] PROBLEM - Host tools-bastion-01 is DOWN: CRITICAL - Host Unreachable (10.68.17.228)
[09:33:28] (CR) Lokal Profil: "recheck" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/281930 (https://phabricator.wikimedia.org/T39422) (owner: Lokal Profil)
[09:43:26] Labs, Tool-Labs, Zero: Tool labs tools should have a method of identifying Zero traffic - https://phabricator.wikimedia.org/T131934#2222200 (jayvdb) >>! In T131934#2221666, @jayvdb wrote: >>>! In T131934#2198320, @Gunnex wrote: >> ... >> 3. most of them are using "video2commons" in combination with "...
[10:00:13] (CR) Hashar: "check experimental" [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/275190 (https://phabricator.wikimedia.org/T128503) (owner: MarcoAurelio)
[10:01:13] (CR) Hashar: "With https://gerrit.wikimedia.org/r/284430 I have made CI to run tox whenever one comments 'check experimental' in Gerrit." [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/275190 (https://phabricator.wikimedia.org/T128503) (owner: MarcoAurelio)
[10:01:46] Tool-Labs-tools-stewardbots, Continuous-Integration-Config, Patch-For-Review: Implement jenkins tests on labs/tools/stewardbots - https://phabricator.wikimedia.org/T128503#2222358 (hashar) a:MarcoAurelio
[11:26:22] Labs, Operations, Patch-For-Review, User-bd808: Setting up bulk proxies pointing to a multiwiki mediawiki-vagrant setup running on a labs vm - https://phabricator.wikimedia.org/T132216#2222729 (akosiaris) p:Triage>Normal >>! In T132216#2193397, @Krenair wrote: > While that may be a workar...
[11:41:10] Labs, MediaWiki-extensions-OATHAuth, Security-Team, wikitech.wikimedia.org, and 2 others: wikitech 2fa provisioning form does so without confirmation - https://phabricator.wikimedia.org/T130892#2222783 (akosiaris)
[11:41:13] Labs, Monitoring, Operations, wikitech.wikimedia.org: Bacula recovery of sql files from silver/wikitech fails - https://phabricator.wikimedia.org/T131195#2222780 (akosiaris) Open>Resolved a:akosiaris >>! In T131195#2221790, @jcrespo wrote: > alex- I think I was able to recover to a di...
[11:52:11] (CR) Lokal Profil: "recheck" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/283108 (owner: Jean-Frédéric)
[11:58:49] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Jberkel was created, changed by Jberkel link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Jberkel edit summary: Created page with "{{Tools Access Request |Justification=Debugging some problems with credit generation on wsexport/tool, see https://github.com/wsexport/tool/issues/57 |Completed=false |User Na..."
[12:34:22] (PS1) Luke081515: Implemented hook "akick" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284448
[12:34:40] (CR) jenkins-bot: [V: -1] Implemented hook "akick" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284448 (owner: Luke081515)
[12:36:45] (PS2) Luke081515: [WIP] Implement hook "akick" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284448
[12:45:14] (PS1) Sebastian Berlin (WMSE): api: Decode URL-encoded link text [labs/tools/heritage] - https://gerrit.wikimedia.org/r/284450 (https://phabricator.wikimedia.org/T132029)
[12:54:17] (CR) Jean-Frédéric: [C: 2] api: Decode URL-encoded link text [labs/tools/heritage] - https://gerrit.wikimedia.org/r/284450 (https://phabricator.wikimedia.org/T132029) (owner: Sebastian Berlin (WMSE))
[12:54:54] (Merged) jenkins-bot: api: Decode URL-encoded link text [labs/tools/heritage] - https://gerrit.wikimedia.org/r/284450 (https://phabricator.wikimedia.org/T132029) (owner: Sebastian Berlin (WMSE))
[13:00:47] !log heritage Deployed latest from Git, 48bce77 and dfbff9b (T132029)
[13:00:47] Did you mean tools.heritage instead of heritage?
[13:00:48] T132029: Html output of api should de-urlencode link titles and pagenames for display - https://phabricator.wikimedia.org/T132029
[13:00:48] heritage is not a valid project.
[13:00:59] !log tools.heritage Deployed latest from Git, 48bce77 and dfbff9b (T132029)
[13:01:00] T132029: Html output of api should de-urlencode link titles and pagenames for display - https://phabricator.wikimedia.org/T132029
[13:01:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master
[13:17:35] Labs, MediaWiki-Vagrant: mwrepl does not load wiki in labs vagrant - https://phabricator.wikimedia.org/T133146#2223281 (Yurik)
[15:07:31] Labs, Labs-Infrastructure, Operations, Traffic: Move californium to an internal host? - https://phabricator.wikimedia.org/T133149#2223741 (Dzahn)
[15:07:54] Labs, Labs-Infrastructure, Operations, Traffic: Move californium to an internal host? - https://phabricator.wikimedia.org/T133149#2223744 (Dzahn) @Andrew Does the horizon host need the public IP ?
[15:21:32] RECOVERY - Puppet staleness on tools-bastion-10 is OK: OK: Less than 1.00% above the threshold [3600.0]
[16:07:08] Labs, Horizon: Need Horizon dashboard for manipulating service groups - https://phabricator.wikimedia.org/T91989#1100569 (bd808) A semi-related idea from T125002#1971966 would be to stop using service groups for Labs projects in general and then build something else ({T128158}) to handle creating the ser...
[16:12:12] Labs, Labs-Infrastructure, Operations, Traffic: Move californium to an internal host? - https://phabricator.wikimedia.org/T133149#2224036 (Andrew) @Dzahn as far as I know it does not, moving it to an internal IP would be fine.
[16:38:32] (CR) Luke081515: "Blocked by a change whoch allows the bot to detect joins." [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284448 (owner: Luke081515)
[16:40:34] Labs, Labs-Infrastructure: Get a real (letsencrypt) cert for labtestwikitech.wikimedia.org - https://phabricator.wikimedia.org/T133167#2224130 (Krenair) technically I am outside the labs team
[17:37:06] valhallasw`cloud, hi, it seems the bot is still not reporting to the #wikimedia-interactive channel :(
[17:40:13] yurik: i'll try restarting it again in 15 mins
[17:40:22] thx
[17:58:49] * valhallasw`cloud reads up on docs again
[18:01:25] yurik: I'll try rebuilding and re-pushing...
[18:07:22] (PS1) Luke081515: Define hook "joinActions" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284496
[18:07:54] YuviPanda: grrrit-wm doesn't want to load the new docker image
[18:08:07] YuviPanda: and I can't actually search images on the docker registry??
[18:08:41] (CR) Luke081515: [C: -1] "typos" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284496 (owner: Luke081515)
[18:09:08] YuviPanda: *and* I seem to be unable to connect to the registry on port 5000, so I'm at a loss
[18:09:20] (PS2) Luke081515: Define hook "joinActions" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284496
[18:10:08] (CR) Luke081515: [C: 2 V: 2] Define hook "joinActions" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284496 (owner: Luke081515)
[18:12:13] (Merged) jenkins-bot: Define hook "joinActions" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284496 (owner: Luke081515)
[18:12:34] (PS3) Luke081515: [WIP] Implement hook "akick" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284448
[18:15:17] (PS4) Luke081515: [WIP] Implement hook "akick" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284448
[18:17:05] (PS5) Luke081515: [WIP] Implement command "akick" [labs/tools/Luke081515IRCBot] - https://gerrit.wikimedia.org/r/284448
[18:26:04] Labs-Kubernetes, grrrit-wm: grrrit-wm update/deployment failing - https://phabricator.wikimedia.org/T133189#2224604 (valhallasw)
[18:47:28] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Mmr was created, changed by Mmr link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Mmr edit summary: Created page with "{{Tools Access Request |Justification=I am working with Adam Holt and Emmanuel Engelbart to help with the offline wikipedia project, enwp10. |Completed=false |User Name=Mmr }}"
[18:49:33] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Mmr was modified, changed by Kelson link https://wikitech.wikimedia.org/w/index.php?diff=450733 edit summary:
[18:49:46] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Mmr was modified, changed by Kelson link https://wikitech.wikimedia.org/w/index.php?diff=450734 edit summary:
[19:17:46] Labs, Tool-Labs, Zero: Tool labs tools should have a method of identifying Zero traffic - https://phabricator.wikimedia.org/T131934#2224940 (DFoy) > In T131934#2221686, @jayvdb wrote: > > In part, I question this assertion. There is a large cohort of people in developing countries who can only affor...
[19:39:00] andrewbogott: hello! I have an instance that had a new IP address assigned by DHCP, and obviously the DNS entry still point to the old IP ..
[19:39:00] is that known ? ;-)
[19:39:06] (I have no clue why it changed of ip
[19:40:18] hashar: that's really not supposed to happen :(
[19:40:29] How recently did the ip change?
[19:40:34] no clue :(
[19:41:00] and I dont think shin ken has history
[19:41:48] ah I can grep wikimedia-relent irc logs
[19:43:27] Labs, Tool-Labs, Zero: Tool labs tools should have a method of identifying Zero traffic - https://phabricator.wikimedia.org/T131934#2225025 (Gunnex) >! In T131934#2224940, @DFoy wrote: > (...) > The users are amazingly adept at the process of switching sim cards around, and some phone models offer a...
[19:43:30] andrewbogott: nothing obvious but maybe since March 26th ~
[19:45:14] hashar: does the instance need a rescue or can you recreate?
[19:46:00] rescue :(
[19:46:02] will fill a task about it
[19:46:14] if we can hack LDAP to change the IP that would do
[19:49:08] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: integration-dev instance changed of IP address - https://phabricator.wikimedia.org/T133207#2225061 (hashar)
[19:49:18] andrewbogott: filled it with more details at https://phabricator.wikimedia.org/T133207
[20:01:30] hashar, so can you ssh in using the new ip?
[20:02:10] hashar: when you say 'LDAP and DNS still point to it' what do mean by the dns part of that?
[20:02:17] LDAP doesn't have anything to do with dns anymore
[20:03:05] bah
[20:03:17] andrewbogott: yeah wrong sorry. DNS gone apparently
[20:04:22] it is gone from DNS
[20:04:28] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: integration-dev instance changed of IP address - https://phabricator.wikimedia.org/T133207#2225160 (hashar) DNS: ``` $ dig +short A integration-dev.integration.eqiad.wmflabs $ dig +short -x 10.68.23.123 $ dig +short -x 10.68.17.81 $ ``...
[20:07:00] Hm, there's not a good way to create a new dns record out of thin air
[20:09:43] RECOVERY - Puppet run on tools-exec-1209 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:25:34] PROBLEM - SSH on tools-exec-1215 is CRITICAL: Server answer
[20:27:40] PROBLEM - SSH on tools-exec-1202 is CRITICAL: Server answer
[20:30:35] RECOVERY - SSH on tools-exec-1215 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2~wmfprecise2 (protocol 2.0)
[20:32:41] RECOVERY - SSH on tools-exec-1202 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2~wmfprecise2 (protocol 2.0)
[21:16:33] valhallasw`cloud: it's on port 443, not 5000.
[21:19:50] Labs-Kubernetes, grrrit-wm: grrrit-wm update/deployment failing - https://phabricator.wikimedia.org/T133189#2225467 (yuvipanda) Ok, so I'm debugging. First find the pod name: ``` yuvipanda@tools-k8s-master-01:~$ kubectl --context=lolrrit-wm --namespace=lolrrit-wm get pods NAME READY STATU...
[21:26:51] Labs-Kubernetes, grrrit-wm: grrrit-wm update/deployment failing - https://phabricator.wikimedia.org/T133189#2225489 (yuvipanda) And I see grrrit-wm in #wikimedia-interactive.
[21:27:21] chasemp: ^ wrote up some of my debugging workflow for k8s stuff, you might find it interesting.
[21:28:04] cool
[21:28:25] Labs-Kubernetes, grrrit-wm: grrrit-wm update/deployment failing - https://phabricator.wikimedia.org/T133189#2225491 (yuvipanda) Confirmed with: ``` yuvipanda@tools-k8s-master-01:~$ kubectl --context=lolrrit-wm --namespace=lolrrit-wm logs grrrit-m1056 | grep interactive info: joining channels 0=#mediawi...
[21:29:12] YuviPanda: #define rebuild from scratch?
[21:29:49] In any case, thanks for documenting
[21:29:50] valhallasw`cloud: I didn't do anything - I just ran the same command and it decided to rebuild the image.
[21:29:56] PROBLEM - SSH on tools-bastion-02 is CRITICAL: Server answer
[21:30:09] valhallasw`cloud: other than me being sudo so I don't know what the differences were.
[21:30:09] Weeiiird
[21:30:24] PROBLEM - SSH on tools-exec-1216 is CRITICAL: Server answer
[21:30:26] Maybe sudo su vs sudo -H?
[21:30:30] valhallasw`cloud: yeah... my vague suspicion is docker somehow decided that config.yaml hadn't changed.
[21:30:31] Anyway, bedtime
[21:30:39] valhallasw`cloud: nighty! I'll write up more structured docs next week
[21:34:54] RECOVERY - SSH on tools-bastion-02 is OK: SSH OK - OpenSSH_6.9p1 Ubuntu-2~trusty1 (protocol 2.0)
[21:35:26] RECOVERY - SSH on tools-exec-1216 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2~wmfprecise2 (protocol 2.0)
[23:03:25] Labs, MediaWiki-Vagrant: mwrepl & hhvmsh do not load wiki in labs vagrant - https://phabricator.wikimedia.org/T133146#2225779 (Yurik)
[23:25:23] (PS1) Yurik: Added #wikimedia-interactive [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284613
[23:31:43] (CR) Legoktm: [C: 2] Added #wikimedia-interactive [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284613 (owner: Yurik)
[23:32:03] (CR) Jforrester: "Can you not just use the -editing channel instead?" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284613 (owner: Yurik)
[23:35:46] (CR) Yurik: "what is editing channel has to do with this? interactive team can benefit from having phab tickets related to our projects show in interac" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284613 (owner: Yurik)
[23:36:38] (CR) Jforrester: "What interactive team? You mean the Multimedia team? We're in -multimedia, which is where eventually we'll stash Graph etc. work, yeah." [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284613 (owner: Yurik)
[23:42:51] (CR) Yurik: "interactive team is a team of the discovery dept, responsible for maps and other interactive features like graphs. If you are proposing a" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284613 (owner: Yurik)
[23:47:14] (CR) Jforrester: [C: 1] "> interactive team is a team of the discovery dept, responsible for maps and other interactive features like graphs." [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/284613 (owner: Yurik)