[02:08:43] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Luke081515 was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=150418 edit summary:
[02:31:38] Tool-Labs: Unattended upgrades are failing from time to time - https://phabricator.wikimedia.org/T92491#1155873 (BBlack) >>! In T92491#1112781, @coren wrote: > Apt tools indeed use proper locking, but do so to ensure exclusive runs not concurrency. But Yuvi is correct that those error messages are not it....
[14:26:07] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Weipengyu was created, changed by Weipengyu link https://wikitech.wikimedia.org/wiki/Nova+Resource%3aTools%2fAccess+Request%2fWeipengyu edit summary: Created page with "{{Tools Access Request |Justification=Hi, I am a PhD student in Data Science and my current project is to study the relationship between Wiki usage and the stock market. I wou..."
[14:36:08] twentyafterfour: There's a bit of a problem. I was also thinking of using phab-02 as a diffusion host but that would need port 222 unblocked
[14:49:53] andrewbogott_afk: mark: taking early lunch for a bank run, bbiab (<2h)
[15:37:28] Tool-Labs-tools-Other: bring back missing-from-wikipedia - https://phabricator.wikimedia.org/T72199#736040 (sumanah) Teresa, is it ok if we assign this to you?
[18:41:42] ^d: Would it be much of a problem if I got admin on project phabricator?
[18:50:48] mutante: ^
[18:51:31] (PS1) Greg Grossmeier: Add Browser-Tests to #-releng notices [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/200217
[18:54:34] mutante: Would it much of a problem if I got admin on the phab project?
[18:55:42] Negative24: to create a new instance?
i wouldn't think so, but we should ask chase/mukunda too
[18:55:51] Negative24: meanwhile i can also make one for you
[18:56:02] well its more of the security policy
[18:56:12] you are a member already, right
[18:56:15] just not admin
[18:56:22] yup. diffusion requires port 222 to be unblocked
[18:56:27] ah
[18:56:32] but
[18:57:01] it should probably only be unblocked for instances that are setup to handle it
[18:57:23] I haven't seen the policies so far
[18:57:38] but I'm guessing that phab-01 and 02 are on the same policy list
[18:57:50] *group
[18:57:58] i would say, in that case it warrants having it's own.. well phab ticket
[18:58:01] to solve this
[18:58:16] I bother so many people :(
[18:58:58] no worries, if we make a ticket people can look at it anytime and don't have do it in real time
[18:59:02] I was thinking about removing phab-02 and recreating it with a different security policy
[18:59:02] that lowers the bother factor:)
[18:59:54] and yea, the security policies would be per project
[19:00:16] project as in diffusion or not or phabricator or not?
[19:01:15] eh, i mean labs project
[19:01:20] so phabricator
[19:01:52] also, it's not like it's hard to create a new project
[19:02:12] but worth having a few more eyes on it
[19:02:12] that seems a bit too much
[19:02:42] just saying that the time to actually make it is not more than making an instance
[19:03:56] when you say "requires port 222 to be unblocked" then unblocked for who?
[19:04:32] who or what needs to be able to connect to it
[19:04:40] Labs, Phabricator: Phabricator security policy open up port 222 - https://phabricator.wikimedia.org/T94217#1157733 (Negative24) NEW
[19:04:49] :)
[19:05:05] ssh
[19:05:17] gotcha
[19:05:24] Its the alternate port for use by ssh so that port 22 can be used for repo cloning
[19:05:39] 222?
ewww
[19:05:46] that's worse than 29415
[19:05:54] 29418 even :)
[19:06:15] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1157742 (Negative24)
[19:06:55] paravoid: How? Its three characters vs five
[19:07:06] it should be 22
[19:07:51] Which would you perfer: everytime you clone you have to specify port #### or everytime phabricator needs maintenance, ssh in with port 222?
[19:07:56] *prefer
[19:08:15] neither
[19:08:28] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1157750 (Dzahn) This is about phabricator in labs, so it's about a project admin using the security policies via the wikitech ui.
[19:08:45] paravoid: You suggest that upstream. For now port 222 makes sense
[19:09:06] I don't have to suggest it anywhere
[19:09:30] phabricator.wikimedia.org does not go to the phabricator machine
[19:09:37] so port 222 won't work anyway
[19:09:47] mutante: Whats you phab username?
[19:09:52] *your
[19:09:56] Negative24: Dzahn
[19:09:56] Dzahn
[19:10:03] Oh
[19:10:08] You just did that :)
[19:10:29] i wanted to clarify it's about phabricator in labs and using the wikitech ui, not prod. firewalling
[19:10:32] right
[19:11:12] presumably this is staging a prod change though, no?
[19:11:29] paravoid: Very far future with #Gerrit-migration
[19:12:15] I want to setup an instance with diffusion setup so that people can learn how to use it.
[19:12:15] well yea, it's about opening that port
[19:12:24] just the different ways we achieve that in labs vs.
prod
[19:13:05] From what I know, this is the first machine to actually be set up with all the phabricator stuff enabled
[19:13:15] yea, true
[19:13:17] *going to be setup (ahem)
[19:13:18] and that's cool
[19:16:01] Going out for lunch so we can discuss later if anything comes up on the task
[19:17:02] sounds good
[19:23:04] https://tools.wmflabs.org/magnustools/multistatus.html
[19:23:16] two of the four services are down
[19:23:26] can someone please help
[19:32:19] GerardM-: Lemme see.
[19:32:31] :) thanks
[19:43:42] GerardM-: 3 out of the 4 are showing for me
[19:43:54] but the second one wasn't migrated from the toolserver
[19:44:38] GerardM-: They should all be up.
[19:45:06] oh wait. never mind about my last comment
[19:45:13] Thanks Coren
[19:45:29] GerardM-: afaict, they overran memory after some hours of being up.
[19:45:57] Autolist and Game are very stable, clearly.
[20:02:52] legoktm: https://www.mediawiki.org/wiki/User:Valhallasw/ProjectChannels
[20:03:23] should probably sort differently for that, but whatever
[20:45:53] PROBLEM - Puppet failure on tools-exec-05 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0]
[20:47:22] PROBLEM - Puppet failure on tools-exec-10 is CRITICAL: CRITICAL: 57.14% of data above the critical threshold [0.0]
[20:48:48] PROBLEM - Puppet failure on tools-exec-09 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[20:49:27] PROBLEM - Puppet failure on tools-exec-06 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0]
[20:49:31] Looks like puppet doesn't like execs :p
[20:50:19] PROBLEM - Puppet failure on tools-exec-03 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[20:50:45] PROBLEM - Puppet failure on tools-master is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0]
[20:50:51] PROBLEM - Puppet failure on tools-webgrid-07 is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0]
[20:51:13] PROBLEM -
Puppet failure on tools-exec-15 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [0.0]
[20:52:55] PROBLEM - Puppet failure on tools-webproxy-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[20:53:57] PROBLEM - Puppet failure on tools-webgrid-tomcat is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[20:54:07] PROBLEM - Puppet failure on tools-redis is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[20:55:10] PROBLEM - Puppet failure on tools-exec-13 is CRITICAL: CRITICAL: 28.57% of data above the critical threshold [0.0]
[20:55:38] Tool-Labs-tools-Other: bring back missing-from-wikipedia - https://phabricator.wikimedia.org/T72199#1158143 (terrrydactyl) It's okay to assign it to me, I'll ask around to see how we can get the site back up.
[20:55:50] PROBLEM - Puppet failure on tools-webgrid-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[20:56:52] PROBLEM - Puppet failure on tools-exec-gift is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[20:57:58] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[20:57:58] PROBLEM - Puppet failure on tools-webgrid-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[20:57:58] PROBLEM - Puppet failure on tools-exec-08 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[20:58:08] PROBLEM - Puppet failure on tools-mail is CRITICAL: CRITICAL: 85.71% of data above the critical threshold [0.0]
[20:58:48] PROBLEM - Puppet failure on tools-dev is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[21:00:32] PROBLEM - Puppet failure on tools-exec-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[21:00:40] PROBLEM - Puppet failure on tools-webgrid-03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[21:00:55] PROBLEM
- Puppet failure on tools-shadow is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[21:02:05] PROBLEM - Puppet failure on tools-exec-04 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[21:02:23] PROBLEM - Puppet failure on tools-exec-cyberbot is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0]
[21:03:57] PROBLEM - Puppet failure on tools-webgrid-04 is CRITICAL: CRITICAL: 85.71% of data above the critical threshold [0.0]
[21:05:27] ^ Those are all my fault, and should resolve on the next pass.
[21:05:55] PROBLEM - Puppet failure on tools-submit is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[21:09:03] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158221 (mmodell) I see no problem with having port 222 opened. @chasemp: should we just make @negative24 an admin on the wikitech "phabricator" project?
[21:11:02] PROBLEM - Puppet failure on tools-exec-catscan is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[21:11:24] PROBLEM - Puppet failure on tools-webgrid-generic-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[21:18:07] RECOVERY - Puppet failure on tools-mail is OK: OK: Less than 1.00% above the threshold [0.0]
[21:20:12] RECOVERY - Puppet failure on tools-exec-13 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:20:51] RECOVERY - Puppet failure on tools-webgrid-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:20:51] RECOVERY - Puppet failure on tools-master is OK: OK: Less than 1.00% above the threshold [0.0]
[21:21:17] RECOVERY - Puppet failure on tools-exec-15 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:22:55] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:22:55] RECOVERY - Puppet failure on tools-exec-08 is OK: OK: Less than 1.00% above the threshold
[0.0]
[21:23:57] RECOVERY - Puppet failure on tools-webgrid-tomcat is OK: OK: Less than 1.00% above the threshold [0.0]
[21:25:39] RECOVERY - Puppet failure on tools-webgrid-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:25:53] RECOVERY - Puppet failure on tools-exec-catscan is OK: OK: Less than 1.00% above the threshold [0.0]
[21:25:54] RECOVERY - Puppet failure on tools-shadow is OK: OK: Less than 1.00% above the threshold [0.0]
[21:27:03] RECOVERY - Puppet failure on tools-exec-04 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:28:47] RECOVERY - Puppet failure on tools-dev is OK: OK: Less than 1.00% above the threshold [0.0]
[21:30:51] RECOVERY - Puppet failure on tools-webgrid-07 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:31:18] RECOVERY - Puppet failure on tools-webgrid-generic-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:32:21] RECOVERY - Puppet failure on tools-exec-cyberbot is OK: OK: Less than 1.00% above the threshold [0.0]
[21:33:44] RECOVERY - Puppet failure on tools-exec-09 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:33:58] RECOVERY - Puppet failure on tools-webgrid-04 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:33:59] RECOVERY - Puppet failure on tools-redis is OK: OK: Less than 1.00% above the threshold [0.0]
[21:35:51] RECOVERY - Puppet failure on tools-submit is OK: OK: Less than 1.00% above the threshold [0.0]
[21:37:19] RECOVERY - Puppet failure on tools-exec-10 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:37:57] RECOVERY - Puppet failure on tools-webproxy-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:39:28] RECOVERY - Puppet failure on tools-exec-06 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:40:18] RECOVERY - Puppet failure on tools-exec-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:46:29] hey, pywikibot service group doesn't work properly gives memory (hdd I think) error
[21:46:39] anyone around that I can use help
[21:46:41] ?
[21:50:51] Amir1: I’m here but I don’t know anything about that bot.
[21:50:57] What do you mean the service group doesn’t work?
[21:51:32] andrewbogott: Hey, it's not a bot
[21:51:40] is the nightly creator for pywikibot
[21:51:43] pywikibot <- not a bot?
[21:52:07] pywikibot service group is not a bot, it's the official nightly creator for the tool
[21:52:24] check out tools.wmflabs.org/pywikibot
[21:52:31] ok
[21:52:49] Want to add me to the group so I can try?
[21:53:07] sure
[21:54:10] andrewbogott: can you give me link to service group management?
[21:54:13] Amir1: /what/ gives an error?
[21:54:41] Amir1: https://tools.wmflabs.org , then 'manage maintainers' under the pywikibot project
[21:54:43] valhallasw`cloud: hey check out
[21:54:51] nightly.err
[21:55:40] Amir1: right. I'm not sure how old those messages are, but if they are recent, try increasing the memory of the SGE jobs
[21:56:08] I think we currently request the default = 256MB or so
[21:57:09] and qacct -j nightly shows maxvmem varying between 100 and 250MB, so that might not be enough
[21:57:18] sometimes even more, 330MB
[21:57:20] hmm let me check
[21:57:29] valhallasw`cloud: thanks
[21:57:32] it's easy to fix
[21:59:59] yep, the default is 256MB, so I'm not sure how those 330M jobs did not get killed
[22:00:23] increased it to 1g
[22:00:30] I think surely it's enough
[22:01:46] is it possible to be notified by email or something if your continuous job stops?
[22:02:45] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158486 (chasemp) >>! In T94217#1158221, @mmodell wrote: > I see no problem with having port 222 opened. @chasemp: should we just make @negative24 an admin on the wikitech "phabricat...
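Editor's note on the memory exchange above (21:55:40 to 22:00:30): the reasoning is simple arithmetic, comparing the peak maxvmem reported by qacct with the job's memory request and leaving some headroom. A minimal sketch of that arithmetic follows. The function names, the 2x headroom factor, and the 256MB rounding step are assumptions made for illustration; they are not anything Tool Labs or SGE prescribes, and the log itself settled on 1g.

```python
import math

# Default request mentioned in the log ("256MB or so"); treated here as an assumption.
DEFAULT_LIMIT_MB = 256


def exceeds_limit(samples_mb, limit_mb=DEFAULT_LIMIT_MB):
    """Return True if any observed peak (maxvmem, in MB) is above the limit."""
    return max(samples_mb) > limit_mb


def suggest_request_mb(samples_mb, headroom=2.0, step_mb=256):
    """Peak usage times a safety factor, rounded up to a multiple of step_mb.

    Both headroom and step_mb are hypothetical knobs chosen for this sketch.
    """
    target = max(samples_mb) * headroom
    return math.ceil(target / step_mb) * step_mb


if __name__ == "__main__":
    # Values quoted in the log: 100 to 250MB, sometimes 330MB.
    observed = [100, 250, 330]
    print(exceeds_limit(observed))
    print(suggest_request_mb(observed))
```

With the figures quoted in the log, the peak is above the default, which matches the suspicion that the default request was too small.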
[22:03:41] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158488 (mmodell) @chasemp: phab-02 and I don't know how to do it either ;)
[22:03:55] PhantomTech: Yeah. Add -m ae to your jstart command
[22:04:03] and maybe -M your@email.address
[22:04:38] -m ae will send an e-mail on abort/reschedule (a) and job end (e)
[22:04:46] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158490 (Negative24) >>! In T94217#1158488, @mmodell wrote: > @chasemp: phab-02 > > and I don't know how to do it either ;) Do what? Open port 222, make me an admin, or configure D...
[22:05:17] so jstart -N name -m ae -M email@address.com [script]?
[22:05:49] I think so, yes. If you want to check whether it works, use -m bae, which also sends an email on job start
[22:05:58] thanks
[22:06:09] see man qsub for more info on that parameter
[22:07:08] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158498 (Dzahn) https://wikitech.wikimedia.org/w/index.php?title=Help:Security_groups&redirect=no
[22:09:42] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158501 (Dzahn) yes, the instances are in the same project. see above link for this quote: "Every project has a 'default' security group that provides access to ssh and Nagios (whic...
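Editor's note on the jstart exchange above (22:03:55 to 22:06:09): the command shape agreed on in the log is `jstart -N name -m ae -M email@address.com [script]`, with `b`, `a` and `e` standing for job start, abort/reschedule and job end. A small sketch that assembles that argument list is below. The helper name `build_jstart_command` and the example job name, script path and address are hypothetical; only the three event letters quoted in the log are accepted here, even though `man qsub` may document others.

```python
import shlex

# Event letters quoted in the log: b = job start, a = abort/reschedule, e = job end.
KNOWN_EVENTS = set("bae")


def build_jstart_command(name, script, email, events="ae"):
    """Build the argument list for a jstart call with mail notification.

    Hypothetical helper: it only arranges the flags mentioned in the log
    (-N, -m, -M) and does not run anything.
    """
    unknown = set(events) - KNOWN_EVENTS
    if not events or unknown:
        raise ValueError("events must be a non-empty combination of b, a, e")
    return ["jstart", "-N", name, "-m", events, "-M", email, script]


if __name__ == "__main__":
    # Placeholder values; "bae" also mails on job start, handy for a first test.
    cmd = build_jstart_command("mybot", "./run.sh", "me@example.org", "bae")
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.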
[22:11:38] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158519 (Dzahn) how to add new rule to default group: https://wikitech.wikimedia.org/w/index.php?title=Help:Security_groups&redirect=no#Individual_rule
[22:15:51] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158540 (Negative24) Essentially all you need to do is click "Add rule" in the default group, put 222 in both the beginning and end port range inputs, tcp as the protocol, and 0.0.0....
[22:18:41] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158544 (chasemp) TBH far more people are going to be annoyed and inconvenienced with git on a weird port than ssh. Only a few people will ever ssh in. If I'm doing it in prod and...
[22:22:57] (PS2) Greg Grossmeier: Add two more projects to #-releng notices [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/200217
[22:24:13] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158559 (demon) >>! In T94217#1158544, @chasemp wrote: > TBH far more people are going to be annoyed and inconvenienced with git on > a weird port than ssh. Only a few people will e...
[22:30:25] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158579 (Negative24) @chasemp @demon Well that was amusing. Yes. SSH for **maintenance** is over port 222. I'm not insane enough to serve cloning from port 222. :) (you guys should r...
[22:31:33] Labs, Phabricator: Phabricator security policy open up port 222 for alternate ssh - https://phabricator.wikimedia.org/T94217#1158582 (Negative24)
[22:33:32] Labs, Phabricator: Phabricator security policy open up port 222 for regular ssh with git on port 22 - https://phabricator.wikimedia.org/T94217#1158589 (Negative24)
[22:51:23] (CR) Ejegg: [C: 1] Correct Fundraising project tag regex [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/199665 (owner: Awight)
[22:55:06] Coren: hey, around?
[22:55:24] Amir1: Somewhat. What can I do for you?
[23:12:08] hi, sumana gave me maintainer access to missing-from-wikipedia, but i'm having trouble figuring out how to get the site ( https://tools.wmflabs.org/missing-from-wikipedia/ ) back up. i tried `webservice start`.
[23:26:42] terrrydactyl, okay
[23:26:45] how can we help?
[23:27:23] so i didn't set up this tool, but i think it was running for a bit and then died, but i didn't do any of the setup so i'm kind of lost
[23:28:07] i signed into tools and `became missing-from-wikipedia`, but not sure what to do next
[23:34:30] sorry, it's probably not much information to go with Krenair
[23:34:51] yeah, I don't know how I can help you
[23:35:44] you said you tried `webservice start`?
[23:35:48] did it not do anything
[23:35:48] ?
[23:36:12] it says "Starting webservice... started."
[23:36:23] maybe the code is wrong?
[23:38:11] terrrydactyl: did sumana write it? cant he help?
[23:39:13] she did, but i think she struggled through it last time. plus she's been busy and hasn't had time to maintain it, hence getting me on board
[23:39:17] she*
[23:39:29] i see
[23:39:54] i can email her, was just wondering it was something that i was missing
[23:40:05] it's weird that the webservice start...
[23:40:07] starts*
[23:40:08] ran the app locally and it seems to be okay
[23:40:11] but then you still get the error
[23:40:16] Maybe Coren knows what that is about?
[23:40:29] Labs, Phabricator: Phabricator security policy open up port 222 for regular ssh with git on port 22 - https://phabricator.wikimedia.org/T94217#1158912 (demon) >>! In T94217#1158579, @Negative24 wrote: > @chasemp @demon Well that was amusing. Yes. SSH for **maintenance** is over port 222. I'm not insane eno...
[23:40:35] * Coren reads scrollback.
[23:40:36] yeah, i'm not sure. webservice restart also "works"
[23:40:52] but nothing shows up on https://tools.wmflabs.org/missing-from-wikipedia/
[23:41:34] terrrydactyl: Check the file 'error.log' in the tool's home directory, it may contain the explanation.
[23:41:42] * terrrydactyl goes to check
[23:42:20] `2015-03-27 23:36:12: (network.c.358) can't bind to port: 14001 Address already in use`
[23:44:04] i'm using the wrong port?
[23:45:14] looks like it also had the same error back in jan, but with a different port: 2015-01-28 01:30:24: (network.c.358) can't bind to port: 4001 Address already in use
[23:47:52] Coren, is there a port i should be using? and how would i change it?
[23:48:16] terrrydactyl: No, that's a bug on tool labs' side.
[23:48:40] Lemme try to find the rogue process.
[23:50:17] terrrydactyl: Try again?
[23:50:59] Coren, same error: 2015-03-27 23:50:28: (network.c.358) can't bind to port: 14003 Address already in use
[23:51:01] different port
[23:51:31] terrrydactyl: Yeah, it's going to be hit-and-miss for a bit; there seems to be a batch of jobs that got confused port information.
[23:51:51] should i just periodically try webservice start then?
[23:52:15] terrrydactyl: I'll tell you when to try as I hunt down the culprits. :-)
[23:52:28] okay, thanks Coren! :)
[23:55:46] im having problems running my bot on labs, it works fine localy but on labs the job's err and out files are blank and qstat says its running
[23:55:51] can anyone help me?
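Editor's note on the error.log lines quoted above (23:42:20 to 23:50:59): "can't bind to port: N Address already in use" means some other process already holds that port, which fits Coren's diagnosis of a rogue process rather than a wrong setting in the tool. The sketch below pulls the port number out of such a line and probes whether a local port can currently be bound. The function names are hypothetical, the regular expression is written only against the three lines quoted in the log, and the probe checks 127.0.0.1 on whatever machine it runs on, which is an assumption and not how Tool Labs assigns webservice ports.

```python
import errno
import re
import socket

# Pattern written against the error.log lines quoted in the log.
BIND_ERROR = re.compile(r"can't bind to port: (\d+) Address already in use")


def port_from_error(line):
    """Return the port number from a bind-failure log line, or None."""
    match = BIND_ERROR.search(line)
    return int(match.group(1)) if match else None


def port_is_free(port, host="127.0.0.1"):
    """Try to bind host:port; False means another socket already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other failure (e.g. permissions) is not "in use"
    return True


if __name__ == "__main__":
    line = ("2015-03-27 23:36:12: (network.c.358) "
            "can't bind to port: 14001 Address already in use")
    port = port_from_error(line)
    print(port, port_is_free(port))
```

A probe like this only reports the state at the moment it runs; as the log shows, the conflicting port changed between attempts.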
[23:58:16] Labs, Phabricator: Phabricator security policy open up port 222 for regular ssh with git on port 22 - https://phabricator.wikimedia.org/T94217#1158942 (Negative24) Keep in mind, this is a Labs instance. We can figure out the production configuration but for now I think that that would be overkill.