[00:19:10] Labs, Gerrit: Strange errors when cloning operations/mediawiki-config from gerrit to labs NFS - https://phabricator.wikimedia.org/T142787#2546348 (AlexMonk-WMF)
[00:27:26] RECOVERY - Puppet run on tools-bastion-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:38:04] Hi, I'm having an issue linking my account in Phabricator
[00:40:36] hi
[00:40:48] go ahead
[00:42:06] Wait, I fixed it
[00:42:08] sorry
[00:57:57] Labs, Tool-Labs: New entries in meta_p.wiki are missing a URL - https://phabricator.wikimedia.org/T142759#2546533 (AlexMonk-WMF) It seems my script broke a while ago. Probably when we moved dblist files into the 'dblists' folder in mediawiki-config, or changed InitialiseSettings to use short array syntax...
[01:10:50] PROBLEM - Puppet run on tools-services-02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[01:37:31] Labs, Tool-Labs: New entries in meta_p.wiki are missing a URL - https://phabricator.wikimedia.org/T142759#2546613 (AlexMonk-WMF) I'm about to upload a new version of mine, no related fixes for your bug, as it'd give you this: ```krenair@tools-bastion-03:/tmp/krenair-operations-software/maintain-replicas...
[01:37:49] Labs: New entries in meta_p.wiki are missing a URL - https://phabricator.wikimedia.org/T142759#2546614 (AlexMonk-WMF)
[01:39:40] Labs-project-Wikistats: wikistats (labs project): convert (all) mediawikis to use API instead of parsing old Special:Statistics - https://phabricator.wikimedia.org/T142766#2546628 (Dzahn) conversion mode, round 2 28 wikis successfully converted to API parsing. --- need to restore: 10156, 10145 -- deleted...
[01:40:01] Labs, Labs-Infrastructure, DBA, Operations: Enable access to Wikipedia Tulu (tcywiki) on labs replicas - https://phabricator.wikimedia.org/T142223#2546632 (AlexMonk-WMF) Note that this can be done by any member of the 'ops' group in puppet, it does not need to wait for my maintain-replicas rewrit...
[01:40:22] Labs-project-Wikistats: wikistats (labs project): convert (all) mediawikis to use API instead of parsing old Special:Statistics - https://phabricator.wikimedia.org/T142766#2546636 (Dzahn) If you can find a working URL ending in "api.php" for any of the above, please let me know.
[01:50:50] RECOVERY - Puppet run on tools-services-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:54:46] | GRANT ALL PRIVILEGES ON `u2170\_\_%`.* TO 'u2170'@'%' |
[01:54:54] So why can't I create database u2170_meta_p? :/
[01:55:16] duh
[01:55:18] two underscores
[01:55:21] I may be going blind
[02:02:32] Labs: New entries in meta_p.wiki are missing a URL - https://phabricator.wikimedia.org/T142759#2546658 (AlexMonk-WMF) BTW: I ran my script under my own user (with a couple of naming changes) - log into tools, `sql meta_p` and `use u2170__meta_p.wiki;`
[02:09:49] Labs, Labs-Infrastructure: Track labs instances hanging - https://phabricator.wikimedia.org/T141673#2546674 (yuvipanda)
[02:12:05] Labs, Labs-Infrastructure: Track labs instances hanging - https://phabricator.wikimedia.org/T141673#2546678 (yuvipanda)
[02:25:07] Labs, Labs-project-Phabricator: https://phab-01.wmflabs.org returns a core exception - https://phabricator.wikimedia.org/T137270#2546683 (Negative24) @Paladox Phab-02 doesn't have the same puppet class assigned as all the other instances. It is using `role::phabricator::labs::diffusion` instead of `role:...
[03:25:00] Tool-Labs-tools-Pageviews: Improve Topviews interface - https://phabricator.wikimedia.org/T142802#2546734 (MusikAnimal)
[03:35:06] Labs, Labs-Infrastructure: Track labs instances hanging - https://phabricator.wikimedia.org/T141673#2546759 (yuvipanda) So I dug in and grepped all the logs for all active instances from openstack, and the following 33 instances have some form or other of: ``` [12156856.840147] INFO: task log-log.tcl:77...
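Editorial note on the 01:54 exchange: in a MySQL database-level grant, `_` and `%` act as wildcards and `\_` is a literal underscore, so the pattern `u2170\_\_%` only covers names that begin with `u2170` followed by two underscores. The sketch below illustrates that rule in Python. The helper names `grant_pattern_to_regex` and `grant_covers` are hypothetical, and the translation is a simplified model of the matching (it ignores case-sensitivity settings, for example), not MySQL's own implementation.

```python
import re


def grant_pattern_to_regex(pattern):
    """Translate a database-level grant pattern into a compiled regex.

    `%` matches any run of characters, `_` matches exactly one character,
    and a backslash makes the following character literal.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: take the next one literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def grant_covers(pattern, db_name):
    """Return True if db_name falls under the grant pattern."""
    return grant_pattern_to_regex(pattern).fullmatch(db_name) is not None


PATTERN = r"u2170\_\_%"  # the pattern quoted in the log

print(grant_covers(PATTERN, "u2170_meta_p"))   # name with one underscore
print(grant_covers(PATTERN, "u2170__meta_p"))  # name with two underscores
```

Under this model the single-underscore name is not covered and the double-underscore name is, which matches the "two underscores" realisation in the log.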
[03:59:24] Labs, Labs-Infrastructure: Track labs instances hanging - https://phabricator.wikimedia.org/T141673#2546776 (yuvipanda) I can only ssh with my root key into the following instances: 1. druid103.eqiad.wmflabs 2. tools-exec-1220.eqiad.wmflabs 3. tools-bastion-03.eqiad.wmflabs 4. tools-exec-1207.eqiad.wmfl...
[04:23:52] Labs, Labs-Infrastructure: Track labs instances hanging - https://phabricator.wikimedia.org/T141673#2546800 (yuvipanda) Looking at distribution of labvirts... ``` root@labcontrol1001:/home/yuvipanda/20160811# cat hosts | sort | uniq -c 3 labvirt1001.eqiad.wmnet 1 labvirt1002.eqiad.wmnet...
[05:43:24] PROBLEM - Puppet staleness on tools-webgrid-lighttpd-1208 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [43200.0]
[05:43:57] PROBLEM - Puppet staleness on tools-exec-1211 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [43200.0]
[05:47:59] PROBLEM - Puppet staleness on tools-exec-1213 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [43200.0]
[05:53:11] PROBLEM - Puppet staleness on tools-exec-1204 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [43200.0]
[06:11:15] PROBLEM - Puppet staleness on tools-webgrid-lighttpd-1207 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [43200.0]
[06:16:56] 💝💝 JEM AND ASIMOVBOT LOVE EACH OTHER 💝💝
[06:32:52] Tool-Labs-tools-Erwin's-tools: 502 Bad Gateway - https://phabricator.wikimedia.org/T142637#2546875 (Supernino) Open>Resolved a:Supernino Now they're up again :)
[06:54:37] 💝💝 JEM AND ASIMOVBOT LOVE EACH OTHER 💝💝
[07:28:14] Labs, Labs-Infrastructure, DBA: labsdb* has no automatic failover solution - https://phabricator.wikimedia.org/T141097#2546914 (jcrespo)
[07:28:17] Labs, Labs-Infrastructure, DBA, Epic, Tracking: Labs databases rearchitecture (tracking) - https://phabricator.wikimedia.org/T140788#2546915 (jcrespo)
[07:28:20] Labs, Labs-Infrastructure, DBA, Patch-For-Review: Setup and provision labsdb1009, labsdb1010 and labsdb1011 - https://phabricator.wikimedia.org/T140452#2546913 (jcrespo)
[07:31:35] Labs, Labs-Infrastructure, DBA: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#2546917 (jcrespo)
[07:31:38] Tool-Labs-tools-Erwin's-tools: 502 Bad Gateway - https://phabricator.wikimedia.org/T142637#2541662 (Nemo_bis) Good :)
[07:31:56] Labs, Labs-Infrastructure, DBA: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#2546932 (jcrespo)
[07:31:59] Labs, Labs-Infrastructure, DBA, Patch-For-Review: Setup and provision labsdb1009, labsdb1010 and labsdb1011 - https://phabricator.wikimedia.org/T140452#2546933 (jcrespo)
[07:33:52] Labs, Labs-Infrastructure, DBA: labsdb* has no automatic failover solution - https://phabricator.wikimedia.org/T141097#2546936 (jcrespo) @coren, We cannot wait more on this; @mark has specifically asked me to unblock this so we can go on with T142807.
[08:38:14] Labs, Tool-Labs, DBA: Replication seems to be halted for multiple databases - https://phabricator.wikimedia.org/T142310#2547071 (jcrespo) Open>Resolved a:jcrespo We identified the tool that was probably causing the memory issues- there was a single tool that was not latency-critical; taki...
[08:46:24] Labs, Labs-Infrastructure, DBA: Decommission labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T142807#2547080 (jcrespo) p:Triage>High
[09:41:55] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Trofimovamw was created, changed by Trofimovamw link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Trofimovamw edit summary: Created page with "{{Tools Access Request |Justification=Research, Ontology development |Completed=false |User Name=Trofimovamw }}"
[10:47:36] Labs, Labs-Infrastructure, DBA: labsdb* has no automatic failover solution - https://phabricator.wikimedia.org/T141097#2547381 (mark) @jcrespo, that's not what I said. :) It sounds like this is not an easy decision that we can make without consequences and without gathering more information first. I...
[11:29:32] when I run "python pwb.py replace -fix:poet -page:User:4nn1l2/Sandbox -always" it's OK and the script is executed without any interaction or prompts, but when I add jsub in the command-line "jsub -N poet2poet python pwb.py replace -fix:poet -page:User:4nn1l2/Sandbox -always", nothing happens. poet2poet.err reads "/usr/bin/python2.7: can't open file 'pwb.py': [Errno 2] No such file or...
[11:29:34] ...directory" why is that and how to fix it?
[11:36:24] nn1l2: when you run it manually, do you run it in your home directory?
[11:37:16] no I run it in /project/data/nn1l2bot/pywikibot-core directory
[11:37:46] then you have to prepend that to pwb.py in your jsub-line
[11:38:18] like pywikibot-core/pwb.py instead of just pwb.py
[11:38:52] I cd to that directory and pwb.py is in that directory
[11:40:45] I have no problem running the script without jsub
[11:41:42] have you tried my suggestion?
[11:51:41] thank you gifti, it worked!
[11:52:38] yay
[12:19:03] is there a labsadmin available?
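Editorial note on the 11:29 to 11:52 jsub exchange: the fix that worked was to stop relying on the submitting shell's current directory and give jsub a path to `pwb.py` that still resolves where the job runs. A minimal sketch of that idea follows; it only builds the argument list and does not submit anything. The helper name `build_jsub_command` and the `~/pywikibot-core` location are assumptions for illustration, and `-N` is the only jsub option used because it is the only one that appears in the log.

```python
import os
import shlex


def build_jsub_command(job_name, script, script_args, script_dir):
    """Build a jsub argument list with an absolute script path.

    An absolute path keeps the command valid no matter which directory
    the job starts in.
    """
    script_path = os.path.abspath(
        os.path.join(os.path.expanduser(script_dir), script)
    )
    return ["jsub", "-N", job_name, "python", script_path, *script_args]


cmd = build_jsub_command(
    "poet2poet",
    "pwb.py",
    ["replace", "-fix:poet", "-page:User:4nn1l2/Sandbox", "-always"],
    "~/pywikibot-core",  # assumed checkout location under the tool's home
)

# Print a copy-pasteable command line instead of running it.
print(" ".join(shlex.quote(part) for part in cmd))
```

Whether the script itself also expects a particular working directory is a separate question that this sketch does not address.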
[12:19:14] A tools needs a webservice restart
[12:19:39] * a tool
[12:21:14] Luke081515: there should be a role for you to do that
[12:21:24] :D
[12:23:52] http://tools.wmflabs.org/templatetransclusioncheck/ throws 502, so it needs a restart. the owner has been inactive since June
[12:32:39] Labs, Tool-Labs, Tool-Labs-tools-Other: tools.templatetransclusioncheck hangs - https://phabricator.wikimedia.org/T142834#2547650 (valhallasw)
[12:38:59] Labs, Tool-Labs, Tool-Labs-tools-Other: tools.templatetransclusioncheck hangs - https://phabricator.wikimedia.org/T142834#2547692 (valhallasw) The host has a high load average (4), and several 100% CPU php-cgi processes from `tools.jembot`: {T132880}? Server also doesn't respond on localhost: ``` va...
[12:39:29] Labs, Tool-Labs, Tool-Labs-tools-Other: tools.templatetransclusioncheck hangs - https://phabricator.wikimedia.org/T142834#2547695 (valhallasw) Open>Resolved a:valhallasw
[12:40:00] Luke081515: I can restart it atm but no time to do much debugging there if that doesn't work, webservice thinks it's running fwiw
[12:40:21] !log tools tools.templatetransclusioncheck@tools-bastion-03:~$ webservice restart
[12:40:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[12:40:48] seems back
[12:44:29] chasemp: thank you very much :)
[14:25:38] hi, can I create a webproxy inside a subdomain in horizon? I can see search.wmflabs.com in the domains list but only wmflabs.com is available when I create web proxies
[14:25:58] We don't have wmflabs.com.
[14:26:13] Krenair: sorry, wmflabs.org
[14:26:38] You can control search.wmflabs.org records as you like, point them to any instance inside your project
[14:27:17] SSL certificates do not allow *.*, so we can't have the proxy serve anything not directly under wmflabs.org
[14:27:40] (we don't have it set up to run Let's Encrypt on all those domains)
[14:29:51] Krenair: thanks, looking/trying but not sure I understand yet
[14:30:26] Our proxy instance does not have a *.search.wmflabs.org SSL certificate dcausse
[14:30:47] It has a *.wmflabs.org certificate, so it won't match anything underneath search.wmflabs.org
[14:30:52] ok
[14:31:46] Kelson: but how can I create an enwikitest.search.wmflabs.org:80 => http://10.68.18.222:8080 web proxy?
[14:31:54] I'm not Kelson
[14:31:54] oops ^ Krenair
[14:33:40] If it *must* be under search.wmflabs.org, you'll have to give an instance in your project a public IP, set up your cert there (e.g. using LE) and then have it proxy to relforge-search:8080
[14:34:47] I suppose you can do it without HTTPS support, but I don't intend to add support for domains other than wmflabs.org to the proxy until we can handle HTTPS properly
[14:36:40] Krenair: ok understood, then I think I'll create webproxies with more specific names using dashes (e.g. enwiki-relforge.wmflabs.org) instead of using the search subdomain
[14:36:53] that may be the easiest option for now
[14:37:04] it's what we do for some things in deployment-prep
[14:37:23] Krenair: ok, makes sense, thanks for your help!
[14:37:29] that don't go via our varnish instances (which have public IPs etc.) for whatever reason
[14:38:02] it's not an issue for me I think
[14:38:09] ok
[15:16:40] Labs-project-Wikistats: wikistats (labs project): convert (all) mediawikis to use API instead of parsing old Special:Statistics - https://phabricator.wikimedia.org/T142766#2548001 (Dzahn) p:Triage>Normal
[16:55:05] tom29739 I just learnt about another nice kubectl feature - the kubectl 'explain' command :D try it out! (provides docs for things)
[17:04:58] yuvipanda: ooh, nice
[17:06:04] I wish some other commands were like that.
[17:18:24] Wikibugs, Easy: Change Project-Creators to Project-Admins in channels.yaml - https://phabricator.wikimedia.org/T142851#2548434 (Danny_B)
[17:39:19] going to kill shinken-wm for a while
[18:07:30] (PS2) Dzahn: Replace git.wikimedia.org url with diffusion url [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/296900 (https://phabricator.wikimedia.org/T139089) (owner: Paladox)
[18:11:26] (CR) Dzahn: [C: 2] Replace git.wikimedia.org url with diffusion url [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/296900 (https://phabricator.wikimedia.org/T139089) (owner: Paladox)
[18:11:36] (CR) Dzahn: [V: 2] Replace git.wikimedia.org url with diffusion url [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/296900 (https://phabricator.wikimedia.org/T139089) (owner: Paladox)
[18:12:26] !log crosswatch switched reference to git.wm to diffusion for T139089
[18:12:27] Did you mean tools.crosswatch instead of crosswatch?
[18:12:27] T139089: Fix references to git.wikimedia.org in all repos - https://phabricator.wikimedia.org/T139089
[18:12:27] crosswatch is not a valid project.
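Editorial note on the 14:27 to 14:30 certificate exchange: a wildcard in a certificate name stands in for exactly one leftmost DNS label, which is why `*.wmflabs.org` covers `enwiki-relforge.wmflabs.org` but nothing underneath `search.wmflabs.org`. The sketch below illustrates that single-label rule. The function name `wildcard_matches` is hypothetical, and the check is deliberately simplified (whole-label `*` only, no partial wildcards, no internationalised names); real TLS libraries perform this validation themselves.

```python
def wildcard_matches(cert_name, hostname):
    """Check a hostname against a certificate name.

    A leading `*` label matches exactly one non-empty label, so the
    number of labels on both sides must be equal.
    """
    cert_labels = cert_name.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        # The wildcard replaces one label; the rest must match exactly.
        return bool(host_labels[0]) and cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


for host in ("enwiki-relforge.wmflabs.org", "enwikitest.search.wmflabs.org"):
    print(host, wildcard_matches("*.wmflabs.org", host))
```

This is the reasoning behind the dash-separated names chosen in the log: they keep every proxy hostname one label below `wmflabs.org`.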
[18:12:48] !log tools.crosswatch switched reference to git.wm to diffusion for T139089 (gerrit 296900)
[18:13:08] T139089: Fix references to git.wikimedia.org in all repos - https://phabricator.wikimedia.org/T139089
[18:24:35] !log ores deployed ores-wmflabs-deploy:b015348
[18:24:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[18:25:04] !log ores ran FLUSHALL on ores-redis-02:6380
[18:25:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[19:14:49] !log deployment-prep rebooting deployment-cache-upload04, it's stuck in https://phabricator.wikimedia.org/T141673 and varnish is no longer working there afaict, so trying to bring upload.beta.wmflabs.org back up
[19:14:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL, Master
[19:20:21] !log deployment-prep that fixed it, upload.beta is back up
[19:20:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL, Master
[19:23:25] Krenair: did you notice that stashbot stopped whining at you? :)
[19:24:17] bd808, ... actually I didn't. Did you change that? If so thanks!
[19:24:46] I did. It has a "don't bug Krenair" rule now
[19:25:11] haha
[19:54:17] !log git disabling puppet temp on gerrit-test3 for change https://gerrit.wikimedia.org/r/#/c/302980/ (Owner ostriches)
[19:54:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL, Master
[20:01:53] !log tools migrating tools-grid-master (currently inactive) to labvirt1013 away from crowded 1010
[20:01:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:10:52] !log tools migration of tools-grid-master to labvirt1013 complete
[20:10:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:17:39] !log git reenable puppet on gerrit-test3, test was success :).
[20:17:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL, Master
[20:18:06] Labs-Kubernetes: Install Helm on Kubernetes - https://phabricator.wikimedia.org/T142743#2549156 (yuvipanda) We restrict the containers you can use to only a whitelisted set on our own registry that we build - do you think helm will still be useful?
[20:18:09] Tool-Labs-tools-Xtools: resurrect wikiviewstats tool - https://phabricator.wikimedia.org/T91320#2549160 (Matthewrbowker)
[20:18:12] Tool-Labs-tools-Xtools: Xtools API hits error and returns 'maintenance' - https://phabricator.wikimedia.org/T136482#2549157 (Matthewrbowker) Open>Resolved a:Mabandalone>Matthewrbowker Thanks to @Alfa80 's fix, it appears to be working. I'm closing this for now, feel free to re-open if it bre...
[20:18:14] Labs, Labs-Kubernetes, Tool-Labs: Setup Kubernetes Masters in a HA setup - https://phabricator.wikimedia.org/T142862#2549161 (yuvipanda)
[20:28:12] Labs-project-Wikistats: fix broken links in largest_html (was: Update lietuvai.lt statistics URLs) - https://phabricator.wikimedia.org/T136183#2549207 (Dzahn)
[20:29:53] Labs: (Re-)Create Gitblit->Phabricator testing instance on Labs - https://phabricator.wikimedia.org/T142186#2549243 (Dzahn) The instance has been created but we can't ssh to it.
[20:33:29] Labs: (Re-)Create Gitblit->Phabricator testing instance on Labs - https://phabricator.wikimedia.org/T142186#2549262 (Dzahn) a:Dzahn>None
[20:34:58] Labs, Labs-Infrastructure: Creating new instance failed - https://phabricator.wikimedia.org/T136656#2549266 (Dzahn)
[20:35:00] Labs: (Re-)Create Gitblit->Phabricator testing instance on Labs - https://phabricator.wikimedia.org/T142186#2526306 (Dzahn) Open>stalled
[20:36:55] !log tools delete tools-webgrid-generic-1405, enough things have moved to k8s from that queue!
[20:37:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:37:36] !log tools delete tools-logs-01, going to recreate with a smaller image
[20:37:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:39:32] !log tools delete tools-webgrid-lighttpd-1415, enough webservices have moved to k8s from that queue
[20:39:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[20:53:12] yuvipanda: ah, purging? :D
[20:53:38] Luke081515 nah tools ran up to its quota and so I was just cleaning out rather than increasing it :)
[20:53:44] I don't want us to go over our 1TB quota
[20:53:51] o.O :D
[20:53:58] you mean ram, I guess?
[20:55:14] yeah
[21:02:53] harej earwig you can reboot the stuck wpx instance now, it should come back up
[21:17:05] yuvipanda: that's good, I thought it was a bit hypocritical that labs admins were asking us to be careful with labs resources :D
[21:17:31] "Is tools exempt from that :D"
[21:18:31] tom29739 yeah, to raise it I'd have gone through the same process
[21:18:38] (file a task, etc)
[21:32:05] Labs, Labs-Infrastructure: Upgrade qemu on labvirts - https://phabricator.wikimedia.org/T142866#2549413 (yuvipanda)
[21:33:00] (PS1) BryanDavis: Add django logging channel to default config [labs/striker] - https://gerrit.wikimedia.org/r/304573
[21:34:30] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Setup Kubernetes Masters in a HA setup - https://phabricator.wikimedia.org/T142862#2549431 (yuvipanda) This ran into a bump - we have kube-maintainusers, which is used to populate token auth of all the masters. This should run in only one place,...
[21:36:51] !log gitblit can't ssh to instance 'danny'. we tried rebooting it
[21:36:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Gitblit/SAL, Master
[21:37:16] !log gitblit tried creating a new instance to see if it's a race. 'Failed to create instance.'
[21:37:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Gitblit/SAL, Master
[21:40:10] Labs: (Re-)Create Gitblit->Phabricator testing instance on Labs - https://phabricator.wikimedia.org/T142186#2549432 (Dzahn) I tried to just create a new instance to see if it's a race / consistent. Currently I am getting "Failed to create instance. " though. We should try again later.
[21:42:00] Labs: (Re-)Create Gitblit->Phabricator testing instance on Labs - https://phabricator.wikimedia.org/T142186#2549433 (Dzahn) a:Danny_B Could you try again later. Giving to you since you now have a project where you can create your own instances.
[22:12:26] (PS1) BryanDavis: Change scap3 revision default [labs/striker/deploy] - https://gerrit.wikimedia.org/r/304574
[22:21:55] (PS2) BryanDavis: Change scap3 revision default [labs/striker/deploy] - https://gerrit.wikimedia.org/r/304574
[22:22:05] (CR) BryanDavis: [C: 2] Change scap3 revision default [labs/striker/deploy] - https://gerrit.wikimedia.org/r/304574 (owner: BryanDavis)
[22:22:11] (Merged) jenkins-bot: Change scap3 revision default [labs/striker/deploy] - https://gerrit.wikimedia.org/r/304574 (owner: BryanDavis)
[22:24:23] (CR) BryanDavis: [C: 2] Add django logging channel to default config [labs/striker] - https://gerrit.wikimedia.org/r/304573 (owner: BryanDavis)
[22:25:41] (Merged) jenkins-bot: Add django logging channel to default config [labs/striker] - https://gerrit.wikimedia.org/r/304573 (owner: BryanDavis)
[22:26:50] (PS1) BryanDavis: Bump striker submodule for logging change [labs/striker/deploy] - https://gerrit.wikimedia.org/r/304579
[22:27:10] (CR) BryanDavis: [C: 2] Bump striker submodule for logging change [labs/striker/deploy] - https://gerrit.wikimedia.org/r/304579 (owner: BryanDavis)
[22:27:16] (Merged) jenkins-bot: Bump striker submodule for logging change [labs/striker/deploy] - https://gerrit.wikimedia.org/r/304579 (owner: BryanDavis)
[22:32:11] Labs, Continuous-Integration-Infrastructure: Request increased quota for labs project - https://phabricator.wikimedia.org/T142877#2549610 (yuvipanda)
[22:32:25] Labs, Continuous-Integration-Infrastructure: Request increased quota for labs project - https://phabricator.wikimedia.org/T142877#2549626 (yuvipanda)
[22:33:50] Labs, Continuous-Integration-Infrastructure: Request increased quota for labs project - https://phabricator.wikimedia.org/T142877#2549610 (yuvipanda)
[22:34:57] Labs, Continuous-Integration-Infrastructure: Request increased quota for contintcloud labs project - https://phabricator.wikimedia.org/T142877#2549634 (tom29739)
[22:36:59] Labs, Tracking: Existing Labs project quota increase requests (Tracking) - https://phabricator.wikimedia.org/T140904#2549644 (yuvipanda)
[22:37:01] Labs, Continuous-Integration-Infrastructure: Request increased quota for contintcloud labs project - https://phabricator.wikimedia.org/T142877#2549641 (yuvipanda) Open>stalled Copying from T139771#2549637 > We had an outage for CI 2 night ago and during that we discovered that nodepool seems to...
[22:54:28] !log test
[22:54:28] Message missing. Nothing logged.
[22:57:25] how many floating IPs is tools using?!
[23:00:48] Luke081515: out bound? I knew that once but have forgotten
[23:01:23] bd808: I don't know how many are used, but from the IPs from the instances it feels like there are tons ;)
[23:01:24] it's a pretty big pool of SNAT IPs as I recall (maybe 64?)
[23:01:46] bd808: btw, maybe want to give your stashbot a cloak?
[23:02:11] As I remember all of Labs exits via one SNAT pool except when the instance has a public static
[23:02:39] yeah, but if there is a @instance-tools-exec-1406.tools.wmflabs.org, the instance has one
[23:02:43] Luke081515: I could/should I guess. It has no irc rights so it's never been a priority
[23:25:26] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure, Release-Engineering-Team, User-greg: Create incident report for CI outage on Aug 10th - https://phabricator.wikimedia.org/T142887#2549861 (greg)
[23:29:50] Luke081515: see [Labs-l] Reverse DNS for labs public IPs now working
[23:30:32] legoktm: yep, saw that, but if you don't have an instance with floating IP, you have internal-nat or something like this ;)
[23:30:48] so I'm wondering, that it feels like we have tons with other DNS