[00:27:02] RECOVERY - Puppet failure on tools-exec-1209 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:45:19] Labs, Patch-For-Review: Kill ldapsupportlib.py - https://phabricator.wikimedia.org/T114063#1683718 (scfc) What is the problem with it? It is used by some scripts and works for them. It doesn't stop anyone from writing scripts in Python 3 or using their own support library or …
[00:47:26] Labs, Patch-For-Review: Kill ldapsupportlib.py - https://phabricator.wikimedia.org/T114063#1683721 (yuvipanda) The code is terrible, and at least for me that's reason enough. I'm cleaning up the ldap module and most of the infrastructure level scripts there depend on it, so this is a tracking ticket for c...
[00:50:30] Labs, Patch-For-Review: Kill ldapsupportlib.py - https://phabricator.wikimedia.org/T114063#1683726 (yuvipanda) It also makes the scripts that use it not be standalone nor packaged into a library but dependent on this one file that is placed into place by puppet...
[01:10:50] Is it possible for GNU mailutils to be installed on tools? The current heirloom mailx doesn't allow setting Content-Type header (or any other header), and this means non-ascii text (eg. Chinese) would get an application/octet-stream and be very ugly in mail clients
[01:19:03] jimmyxu: can you file a bug, etc? shouldn't be too hard I think
[01:19:20] sure, will do
[01:19:26] jimmyxu: thanks
[01:19:31] jimmyxu: did your OAuth app get approved?
[01:19:49] nope
[01:20:00] jimmyxu: can you link me to it again? I can probably do it
[01:20:16] yuvipanda: https://meta.wikimedia.org/wiki/Special:OAuthConsumerRegistration/update/176bf094c1f9b699219f27d1d005212e
[01:21:23] jimmyxu: hmm, I get a 'this does not exist' message?
[01:21:27] are you sure that's the correct URL?
[01:21:59] I have no idea if that is only visible to me but here's a different special page https://meta.wikimedia.org/wiki/Special:OAuthListConsumers/view/176bf094c1f9b699219f27d1d005212e
[01:22:49] halfak: ^ what do you think? also can you access those links? I can't seem to despite me also being an oauth admin
[01:22:50] Labs: Install mailutils on tools - https://phabricator.wikimedia.org/T114073#1683750 (jimmyxu) NEW
[01:23:12] yuvipanda: filed ^
[01:23:30] jimmyxu: ok!
[01:23:33] Labs, Patch-For-Review: Kill ldapsupportlib.py - https://phabricator.wikimedia.org/T114063#1683757 (scfc) The dependence on Puppet to install local Python modules is a not-uncommon theme in the repository, so I don't know why that would be unbearable for `ldapsupportlib.py`. (But that's more JFTR.)
[01:24:16] Labs, Patch-For-Review: Kill ldapsupportlib.py - https://phabricator.wikimedia.org/T114063#1683758 (yuvipanda) Where else are individual python files installed by puppet for use as libraries by other scripts?
[01:36:12] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1683788 (yuvipanda) Good question - I'm not sure. We don't know which of the keys supplied is actually matched, so can't just log from there either.
[01:37:10] Krenair: https://gerrit.wikimedia.org/r/#/c/242039/ if you'd like to do some code review :)
[01:40:48] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1683798 (yuvipanda) ^ Cleaned up the ldap script to make modifications easier.
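[editor's note] On the mailutils request filed above (T114073): the complaint is that mail sent without a Content-Type header shows non-ASCII text badly. As a rough illustration only, and not something proposed in the channel, a tool could build the message itself with Python's standard library so the header is set explicitly. The function name and the example.invalid addresses are hypothetical.

```python
from email.message import EmailMessage


def build_utf8_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build a plain-text message whose Content-Type declares UTF-8.

    Hypothetical helper for illustration; addresses are placeholders.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # set_content() on a str adds a text/plain Content-Type header
    # with the given charset and picks a transfer encoding.
    msg.set_content(body, charset="utf-8")
    return msg


msg = build_utf8_message(
    "tool@example.invalid", "user@example.invalid", "测试", "你好，世界"
)
print(msg["Content-Type"])
```

The resulting message could then be handed to whatever sender is available (for example smtplib, or a local sendmail binary); which of those exists on a given host is an assumption to check.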
[01:45:29] andrewbogott: also any objections to https://phabricator.wikimedia.org/T114063
[01:46:31] I don’t object in theory - I do like having a ready made way to slurp in ldap.conf
[01:46:35] but I don’t much care how it happens
[01:46:55] Labs, Patch-For-Review: Kill ldapsupportlib.py - https://phabricator.wikimedia.org/T114063#1683802 (scfc) Besides #Tool-Labs, it's used by [[http://git.wikimedia.org/blob/operations%2Fpuppet.git/c00f88596e4c1146750bdc0a8ecc4509361e8d28/modules%2Fswift%2Fmanifests%2Fproxy.pp#L53|`swift`]] and [[http://git.w...
[01:48:20] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1683803 (yuvipanda) Hmm if we have https://bugzilla.mindrot.org/show_bug.cgi?id=2081 we can use that for logging, since we'll know which key is being asked for.
[01:48:35] Labs, Tool-Labs, Patch-For-Review: Install flex on bastions - https://phabricator.wikimedia.org/T114003#1683804 (scfc) Open>Resolved Removed.
[01:52:58] andrewbogott: ok! yeah, there's now /etc/ldap.yaml...
[01:53:38] andrewbogott: I could eventually build a ldapconf library for reading the ldap conf format and then use that instead of the current splitting code
[01:56:59] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1683808 (yuvipanda) A version with that patch is currently in debian stretch and ubuntu wily.
[01:58:03] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1683817 (yuvipanda) @MoritzMuehlenhoff how insane is the idea of backporting openssh 6.9, just to the tools project at least?
[02:17:53] Getting "AphrontConnectionQueryException: Attempt to connect to phuser@m3-master.eqiad.wmnet failed with error #2003: Can't connect to MySQL server on 'm3-master.eqiad.wmnet' (99)." too many times on Phabricator today.
What's up? << Niharika: not sure how labs related?
[02:26:33] andrewbogott: do you know how I can query LDAP to give me all users who are part of a certain service account?
[02:26:44] No clue. I thought it might be because of "m3-master.eqiad.wmnet". Apparently not. Apologies!
[02:29:21] Niharika: no worries!
[02:29:27] Niharika: .eqiad.wmflabs is how labs hosts are
[02:29:40] Niharika: the .wmnet implies it is the wikimedia production network
[02:30:23] yuvipanda: Ah, okay. Who's to blame if things break on wmnet? :P
[02:30:35] Niharika: -operations, in general :D
[02:30:56] Niharika: in this case, it is phabricator / db. db is dba, who is jynu.s and phab is releng team
[02:31:02] On it.
[02:36:13] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:11:17] RECOVERY - Puppet failure on tools-checker-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:57:22] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1411 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[08:31:47] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1684091 (MoritzMuehlenhoff) I'd like to avoid that for the production cluster (we already diverge from SSH distro packages for precise). The patch isn't straightforward t...
[08:32:25] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1411 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:35:24] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1684092 (valhallasw) Ah, but maybe we can just let the lookup script do the logging? Then we don't need to backport anything, while still having logs. I'm not sure what ss...
[08:52:43] from -ops "starting cloning of labsdb1005 (Tools DB), minimal to no disruption is expected"
[09:08:05] yuvipanda: ?
:D
[09:08:12] I have no idea what timezone you are in :D
[09:27:06] he should be sleeping by now
[09:27:21] ahh, is he back in SF? :)
[10:52:11] Labs, Tool-Labs, Patch-For-Review: Remove dependency of toollabs::checker on toollabs::submit and shut down bigbrother on tools-checker-01/tools-checker-02 - https://phabricator.wikimedia.org/T113744#1684491 (scfc) Open>Resolved
[12:44:06] jimmyxu, still around?
[12:44:47] Labs, Tool-Labs: toolserver.org rejects mail for valid addresses - https://phabricator.wikimedia.org/T114102#1684668 (scfc) NEW
[12:45:18] Labs, Tool-Labs, Patch-For-Review: SMTP service on toolserver.org down - https://phabricator.wikimedia.org/T113756#1684677 (scfc) Open>Resolved a:coren
[12:45:40] Labs, Tool-Labs, Patch-For-Review: SMTP service on toolserver.org down - https://phabricator.wikimedia.org/T113756#1675114 (scfc) Closing this here as the scope was SMTP service not being accessible, which now is the case.
[12:46:12] Labs, Tool-Labs: toolserver.org rejects mail for valid addresses - https://phabricator.wikimedia.org/T114102#1684668 (scfc)
[12:46:14] Labs, Tool-Labs, Patch-For-Review: SMTP service on toolserver.org down - https://phabricator.wikimedia.org/T113756#1684685 (scfc)
[13:42:08] PROBLEM - Puppet failure on tools-bastion-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[13:55:58] Coren: tools-bastion-01 is OOM and I can’t ssh. Do you by chance have an existing session there so we can figure out who to scold?
[13:56:23] Gah, I had one /minutes/ ago.
[13:56:33] ok, I’ll just reboot it :(
[13:56:36] I did manage to log in though
[13:56:39] oh, great
[13:56:49] ... it doesn't look oom.
[13:56:54] hm, I can too now. It must’ve recovered.
[13:57:06] probably whatever was going cray cray died.
[13:57:10] A minute ago it told me '-bash: fork: Cannot allocate memory'
[13:57:36] andrewbogott: Yep, dmesg is full of autoarchiv0.tcl going boom.
[13:57:40] puppet still can’t fork
[13:58:16] andrewbogott: ... there's 1.5G of free ram, so that's not it.
[13:58:59] * Coren stares at the ps list.
[13:59:10] Hm. Still some people running bots in screen sessions.
[14:17:09] RECOVERY - Puppet failure on tools-bastion-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:13:09] PROBLEM - Puppet failure on tools-bastion-01 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[15:48:09] RECOVERY - Puppet failure on tools-bastion-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:49:05] halfak: around now?
[15:49:36] o/ jimmyxu. So. I looked at the application. There's not much information about what the application is supposed to do linked. Can you link me to some docs?
[15:51:02] halfak: Yea.. that was supposed to be not very publicly available, but only for my own bots and from time to time bots from others of which their owners had asked me to run a task on their account for them
[15:51:50] Why would we have bots whose behaviors are not publicly documented?
[15:52:54] halfak: It's doc'd on zhwiki's bot listing https://zh.wikipedia.org/wiki/Wikipedia:%E6%9C%BA%E5%99%A8%E4%BA%BA/%E5%88%97%E8%A1%A8
[15:53:02] halfak: look for Jimmy-bot & Jimmy-abot
[15:53:44] halfak: and we can run text replacements under supervision, and edits in our user spaces without authorisation. This is why I'm asking for a consumer :)
[15:54:05] Labs, Tool-Labs: Unable to boot Ruby app on tool labs - https://phabricator.wikimedia.org/T109322#1685347 (MusikAnimal) @scfc @coren (hope you don't mind the ping Coren) I've had another issue with booting up the app, but I won't get into that as I'm able to eventually get it up after repeated attempts....
[15:54:21] jimmyxu, gotcha.
[15:54:38] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1685349 (yuvipanda) @MoritzMuehlenhoff we would only need them for trusty and Jessie, and yes think doing this for production would be a bad idea. Tool Labs has its own pr...
[15:54:54] Part of my job is to ensure that the rights requested are commensurate with the use of the tool.
[15:55:11] totally understandable
[15:55:16] It looks like you'll primarily be using OAuth to provide authorization via your own account.
[15:55:18] Is that right?
[15:56:49] halfak: my own bots, plus these kind of jobs https://zh.wikipedia.org/?diff=37384217
[15:57:17] halfak: the page owner would like to run the task under his bot, but is python-illiterate
[15:58:22] jimmyxu, {{approved}}
[15:58:26] Thanks for your patience :)
[15:58:39] thx :)
[16:14:41] Labs, Labs-Infrastructure, operations: install/setup labservices1001 - https://phabricator.wikimedia.org/T106584#1685467 (RobH)
[16:24:23] Labs, Labs-Infrastructure, hardware-requests, operations, labs-sprint-116: New server: labservices1001 - https://phabricator.wikimedia.org/T106147#1685502 (RobH)
[16:24:38] Labs, Labs-Infrastructure, operations: rename holmium to labdns1002 - https://phabricator.wikimedia.org/T106303#1685508 (RobH)
[16:24:40] Labs, Labs-Infrastructure, Labs-Sprint-107: holmium is a spof - https://phabricator.wikimedia.org/T106142#1685509 (RobH)
[16:24:42] Labs, Labs-Infrastructure, hardware-requests, operations, labs-sprint-116: New server: labservices1001 - https://phabricator.wikimedia.org/T106147#1460393 (RobH) Open>Resolved a:RobH labservices1001 has been allocated and setup via subtasks
[16:25:12] Labs, Labs-Infrastructure, operations: rename holmium to labservices1002 - https://phabricator.wikimedia.org/T106303#1685519 (RobH)
[16:25:59] Labs, Labs-Infrastructure, operations: install/setup labservices1001 -
https://phabricator.wikimedia.org/T106584#1472236 (RobH)
[16:32:12] Labs, Labs-Infrastructure, operations: install/setup labservices1001 - https://phabricator.wikimedia.org/T106584#1685561 (RobH)
[16:33:02] Labs, Labs-Infrastructure, operations: install/setup labservices1001 - https://phabricator.wikimedia.org/T106584#1685566 (RobH) a:RobH>Andrew I've done everything up to the signing puppet and salt keys for the initial puppet run(s). Assigning this task from myself to @andrew for service implemen...
[17:04:48] (PS1) Andrew Bogott: Added dummy password and cert for openstack designate manifests. [labs/private] - https://gerrit.wikimedia.org/r/242191
[17:05:13] (CR) Andrew Bogott: [C: 2 V: 2] Added dummy password and cert for openstack designate manifests. [labs/private] - https://gerrit.wikimedia.org/r/242191 (owner: Andrew Bogott)
[17:35:18] test
[18:20:20] Coren
[18:20:43] Myes?
[18:22:25] Coren:When I submit a job in the grid, the task run normally, but in some hours after the submit, the job stops
[18:23:20] UA31_: Stops how?
[18:24:00] Coren:???
[18:24:27] UA31_: How does it stop? Is it killed? Does it exit? Are there errors in the log?
[18:26:36] Coren:No errors in file
[18:27:13] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1402 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[18:28:17] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1402 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[18:28:53] Krenair: is testing-shinken- you?
[18:31:43] UA31_: And what is the exit code? (qacct could tell you)
[18:36:59] Coren:qstat -j 99116
[18:36:59] Following jobs do not exist:
[18:36:59] 99116
[18:37:00] : previous job
[18:37:46] qacct, not qstat. lemme check.
[18:42:56] UA31_: exit_status 130 == SIGINT. That normally means a keyboard interrupt, which shouldn't happen to a running job. I know the grid doesn't ever use SIGINT, so it's not clear what did.
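[editor's note] The reading "exit_status 130 == SIGINT" follows the common shell convention that a status above 128 means "terminated by signal (status - 128)", and SIGINT is signal 2. A minimal sketch of that decoding, assuming the accounting record uses this convention as stated in the message above; the function name is hypothetical.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Decode a shell-style exit status.

    Assumes the 128 + N convention for "killed by signal N";
    anything else is treated as an ordinary exit code.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


print(describe_exit_status(130))
```

Signal numbering beyond the well-known ones can differ between platforms, so the names printed for other statuses depend on where the sketch runs.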
[18:44:15] yuvipanda, yes
[18:45:49] Labs, Tool-Labs, Patch-For-Review, labs-sprint-116: Allow direct ssh access to tools - https://phabricator.wikimedia.org/T113979#1686324 (MoritzMuehlenhoff) I'll make the backports this week.
[18:49:55] yuvipanda, we know it works now :)
[18:53:38] Labs, Tool-Labs: toolserver.org rejects mail for valid addresses - https://phabricator.wikimedia.org/T114102#1686392 (coren) a:coren
[18:58:51] Labs, Labs-Infrastructure: New db grants for labservices1001 - https://phabricator.wikimedia.org/T114159#1686416 (Andrew) NEW a:jcrespo
[19:00:18] PROBLEM - Puppet failure on tools-proxy-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:00:54] Labs, Labs-Infrastructure: New db grants for labservices1001 - https://phabricator.wikimedia.org/T114159#1686433 (Andrew)
[19:01:34] Labs, Labs-Infrastructure: New db grants for labservices1001 - https://phabricator.wikimedia.org/T114159#1686416 (Andrew)
[19:01:35] Labs, Labs-Infrastructure, Labs-Sprint-107, Patch-For-Review: holmium is a spof - https://phabricator.wikimedia.org/T106142#1686437 (Andrew)
[19:06:32] PROBLEM - Puppet failure on tools-proxy-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:20:17] RECOVERY - Puppet failure on tools-proxy-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:21:35] RECOVERY - Puppet failure on tools-proxy-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:26:35] legoktm: Argh, we missed an issue in forrestbot.
[19:26:37] legoktm: "ValueError: invalid literal for int() with base 10: '.1'"
[19:34:15] (PS1) Jforrester: Cope with wmf_parts being .1 rather than 1 [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242252
[19:36:15] legoktm: CR appreciated.
:-)
[19:46:10] (Merged) jenkins-bot: Cope with wmf_parts being .1 rather than 1 [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242252 (owner: Jforrester)
[19:51:51] (PS1) Jforrester: Cope with wmf_parts[1] being .1 rather than 1, rather [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242265
[19:52:04] (CR) Jforrester: [C: 2] Cope with wmf_parts[1] being .1 rather than 1, rather [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242265 (owner: Jforrester)
[19:58:31] (Merged) jenkins-bot: Cope with wmf_parts[1] being .1 rather than 1, rather [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242265 (owner: Jforrester)
[20:00:03] !log tools.forrestbot Deploy If6637138 and follow-up fix I573559c3; run immediately to clean-up queue.
[20:00:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.forrestbot/SAL, Master
[20:16:46] Labs, Tool-Labs, Patch-For-Review: toolserver.org rejects mail for valid addresses - https://phabricator.wikimedia.org/T114102#1686865 (coren) Open>Resolved This should now be fixed; same root cause as the previous issue (new default mail config clobbered toolserver.org specific one)
[20:17:01] Labs, Labs-Infrastructure, operations: install/setup labservices1001 - https://phabricator.wikimedia.org/T106584#1686869 (Cmjohnson)
[20:21:34] !log tools.forrestbot Archived log for before today to forrestbot.err.archive2015-09-28
[20:21:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.forrestbot/SAL, Master
[20:21:37] James_F: maybe wmf_parts = wmf_parts[1:] ?
[20:21:55] and thanks for taking care of that :)
[20:22:17] legoktm: Yeah, I dunno, just needed a quick hack. :-)
[20:22:59] legoktm: Also because the branch to REL1_26 was a week late, we now have a lot of things tagged wmf.2 when they should be wmf.1.
[20:23:01] I'll fix it later.
[20:25:02] I'm looking for a place to host some small web app for MediaWiki admins.
How can I get a tool labs box? Just submit stuff here? https://wikitech.wikimedia.org/wiki/Special:FormEdit/Tools_Access_Request
[20:31:15] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Jeroen De Dauw was created, changed by Jeroen De Dauw link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Jeroen_De_Dauw edit summary: Created page with "{{Tools Access Request |Justification=I'd like to host a small web app that allows wiki admins to configure and download extension bundles. |Completed=false |User Name=Jeroen..."
[20:56:32] Labs, Tool-Labs, Patch-For-Review: toolserver.org rejects mail for valid addresses - https://phabricator.wikimedia.org/T114102#1687079 (scfc) Verified with `timl@toolserver.org`; merci!
[20:59:11] yuvipanda: How do you re-run a query in Quarry? Like http://quarry.wmflabs.org/query/4265
[20:59:54] yuvipanda: also, it would be useful if the query status included the date it was run
[21:00:46] do you just create a new query with the same SQL?
[21:04:49] kaldari: someone else's query? You have to copy paste
[21:05:05] cool. thanks
[21:13:22] (PS1) Hashar: Remove py35 from default tox envlist [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242348
[21:15:04] (CR) Hashar: "The reason is to migrate to a single job 'fox-jessie' that just executes 'tox'. This way" [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242348 (owner: Hashar)
[21:15:09] PROBLEM - Puppet staleness on tools-worker-02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]
[21:15:22] (CR) Hashar: "This way you can define envlist however you want."
[labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242348 (owner: Hashar)
[21:26:05] (CR) Jforrester: "This was so I could test locally against python 3; I have py35 but not py34…" [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242348 (owner: Hashar)
[21:28:05] yuvipanda: if Quarry reports back "Sort aborted: Query execution was interrupted" is that probably because it's taking too long to execute?
[21:31:05] (CR) Hashar: "envlist, just define the list of environments that are run by default. py34 or py35 are built-in so you can specify it from the command" [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242348 (owner: Hashar)
[21:53:24] (PS1) Dzahn: ganeti: add fake DSA key to fix compiler runs [labs/private] - https://gerrit.wikimedia.org/r/242361
[21:55:00] (CR) Dzahn: [C: 2] "this is to enable checking ganeti changes in the puppet compiler." [labs/private] - https://gerrit.wikimedia.org/r/242361 (owner: Dzahn)
[21:55:17] (CR) Dzahn: [V: 2] "this is to enable checking ganeti changes in the puppet compiler." [labs/private] - https://gerrit.wikimedia.org/r/242361 (owner: Dzahn)
[21:58:55] Labs, Labs-Infrastructure, Patch-For-Review: New db grants for labservices1001 - https://phabricator.wikimedia.org/T114159#1687300 (Andrew) Open>Resolved
[21:58:56] Labs, Labs-Infrastructure, Labs-Sprint-107, Patch-For-Review: holmium is a spof - https://phabricator.wikimedia.org/T106142#1687301 (Andrew)
[21:59:22] Labs, Labs-Infrastructure, Labs-Sprint-107, Patch-For-Review: holmium is a spof - https://phabricator.wikimedia.org/T106142#1460338 (Andrew) labservices1001 is up and serving dns. The remaining step is to add it as a secondary nameserver to labs instances.
[21:59:49] yuvipanda: what’s the status with https://phabricator.wikimedia.org/T98556?
[22:00:00] It sort of looks like it’s done, or mostly done?
[22:08:59] yuvipanda: https://github.com/wikimedia/nagf
[22:09:13] andrewbogott: not done at all actually
[22:09:20] andrewbogott: I started on it and then never went anywhere
[22:09:23] Krinkle: I think so...
[22:09:29] kaldari: it's got a 20min limit
[22:09:45] kaldari: ^^ not Krinkle
[22:09:50] yuvipanda: want to do it tomorrow? We’re precariously close to finishing that task… without it we’ll have a ‘fail’ in our quarterly report
[22:10:04] hmm
[22:10:06] * yuvipanda ponders
[22:10:56] andrewbogott: ok, do you know where / how wikitech calls out into the domain proxy?
[22:11:13] andrewbogott: I'm looking at it now
[22:11:17] it’s managed by keystone
[22:11:28] so the failover should be just a change in the keystone catalog
[22:11:56] andrewbogott: oh? I mean, how does it know what IP to hit?
[22:12:14] you mean, how does it know where the proxy api is?
[22:12:20] yeah
[22:12:48] !log project-proxy deleting dynamicproxy-01 and -02, starting afresh with jessie
[22:12:51] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Project-proxy/SAL, Master
[22:13:01] gotta wait for a minute I guess
[22:13:26] yuvipanda: keystone
[22:13:40] it’s the same way wikitech contacts nova-api
[22:14:40] andrewbogott: I see. I've no idea how that works... :| If I do the proxies and give you an IP / URL can you change that for me?
[22:15:02] https://dpaste.de/xu3U
[22:15:09] yes, but...
[22:15:20] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Jeroen De Dauw was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=186722 edit summary:
[22:15:40] we don’t need to change it, do we?
Since it uses dns, we would create a new api host and change proxy-eqiad.wmflabs.org to point to that
[22:16:05] and then if we needed to fail over, we could change keystone to point to proxy-eqiad2.wmflabs.org or just change dns again
[22:17:16] sorry, maybe I misunderstand the question
[22:17:17] andrewbogott: ooooh, yes, ok fair enough
[22:17:18] yes
[22:17:23] andrewbogott: let me build the hosts now
[22:17:28] thanks :)
[22:17:51] I’m about done for the day, but I should be able to finish the holmium spof task in the morning. And Jaime is working on db replication.
[22:18:03] ok
[22:20:04] yuvipanda: the keystone catalog is very simple. It has a list of services, and then a list of endpoints - each endpoint binds a url to a service id.
[22:20:16] ah I see
[22:20:20] So, the service name ‘proxy’ gets us the proxy id, which gets us the endpoint, which is that labs box.
[22:20:36] It’s not very interesting, just provides a single place to organize your rest endpoints.
[22:21:09] (PS1) Jforrester: get_slug: Update comment to show what happens to new-style wmf branches [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242368
[22:21:11] (PS1) Jforrester: get_slug_PHID: Catch list being empty and throw explanation [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242369
[22:24:36] legoktm: RTB's backlog of e-mails seems to grow without end. :-(
[22:24:55] James_F: er, what do you mean?
[22:25:43] legoktm: https://tools.wmflabs.org/forrestbot/log.txt – "2015-09-29 21:00:16,975: INFO - 340 e-mails to process (4642 kB)" then "2015-09-29 22:00:18,747: INFO - 344 e-mails to process (4213 kB)"
[22:25:56] legoktm: Is it not marking as read or not iterating or something?
[22:26:17] eh....that's weird
[22:26:24] valhallasw set up all the email stuff
[22:26:27] Yeah.
[22:26:40] It looks like it's just not able to catch up.
[22:26:47] Maybe we should make it grab more in a go?
[22:27:26] it should grab everything
[22:27:33] The code looks like it does.
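[editor's note] The keystone catalog lookup described at 22:20 above (service name, then service id, then endpoint URL) can be pictured with a small in-memory stand-in. This is only a sketch of the two-step lookup as described in the channel, not the keystone API: every identifier, id and URL below is hypothetical.

```python
# Hypothetical stand-in for a keystone-style catalog:
# a table of services, and endpoints that bind a URL to a service id.
SERVICES = {
    "proxy": "svc-proxy-id",
    "compute": "svc-compute-id",
}

ENDPOINTS = [
    {"service_id": "svc-proxy-id", "url": "http://proxy.example.invalid/api"},
    {"service_id": "svc-compute-id", "url": "http://compute.example.invalid/api"},
]


def endpoint_for(service_name: str) -> str:
    """Resolve a service name to its endpoint URL in two steps."""
    service_id = SERVICES[service_name]  # name -> service id
    for endpoint in ENDPOINTS:           # service id -> endpoint URL
        if endpoint["service_id"] == service_id:
            return endpoint["url"]
    raise LookupError(f"no endpoint registered for {service_name!r}")


def repoint(service_name: str, new_url: str) -> None:
    """Fail over by rebinding the service's endpoint to a new URL."""
    service_id = SERVICES[service_name]
    for endpoint in ENDPOINTS:
        if endpoint["service_id"] == service_id:
            endpoint["url"] = new_url
            return
    raise LookupError(f"no endpoint registered for {service_name!r}")


print(endpoint_for("proxy"))
```

In this picture the two failover options mentioned in the channel are either calling something like repoint() on the catalog entry, or leaving the catalog alone and changing what the hostname in the URL resolves to in DNS.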
[22:34:14] (CR) Jforrester: "I'm aware that I can work around this patch, but that doesn't make it a good patch." [labs/tools/forrestbot] - https://gerrit.wikimedia.org/r/242348 (owner: Hashar)
[22:37:02] andrewbogott: looks like I have to repackage invisible unicorn
[22:37:05] * yuvipanda has fond memories
[22:37:07] let me do that
[22:46:51] (PS1) Yuvipanda: Add .gitreview [labs/invisible-unicorn] - https://gerrit.wikimedia.org/r/242388
[22:46:53] (PS1) Yuvipanda: Add debian dir [labs/invisible-unicorn] - https://gerrit.wikimedia.org/r/242389
[22:47:09] (CR) Yuvipanda: [C: 2 V: 2] Add .gitreview [labs/invisible-unicorn] - https://gerrit.wikimedia.org/r/242388 (owner: Yuvipanda)