[00:13:32] doctaxon: I see some giftbot jobs stuck on the job grid because they are not specifying '-l release=precise'. The giftbot dedicated worker queue (tools-exec-gift.eqiad.wmflabs) only has precise. Job numbers are 548672 and 548688
[00:14:22] they need to have the cron or whatever starts them changed to either specify precise or not specify the giftbot queue
[00:14:52] !log tools.giftbot obs stuck on the job grid because they are not specifying '-l release=precise'. The giftbot dedicated worker queue (tools-exec-gift.eqiad.wmflabs) only has precise. Job numbers are 548672 and 548688
[00:14:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.giftbot/SAL
[00:43:26] RECOVERY - Host tools-secgroup-test-102 is UP: PING OK - Packet loss = 0%, RTA = 3.91 ms
[00:43:35] Tool-Labs-tools-Other: Some yifeibot tasks seem to hang indefinately - https://phabricator.wikimedia.org/T152054#2836927 (zhuyifei1999) The script broke a long time ago, and is being replaced by T147013. I wasn't aware that there are still tasks running. I'll shutdown the web UI soon.
[00:48:22] PROBLEM - Host tools-secgroup-test-102 is DOWN: CRITICAL - Host Unreachable (10.68.21.170)
[00:49:09] Tool-Labs-tools-Other: Some yifeibot tasks seem to hang indefinately - https://phabricator.wikimedia.org/T152054#2836955 (zhuyifei1999) (Oh and slimerjs, which this script depends on, seems to have a strong tendency to segfault on tool labs trusty nodes for some unknown reasons. Both precise and jessie work)
[00:55:57] (CR) BryanDavis: [C: 2] "what's up jerkins?" [labs/striker] - https://gerrit.wikimedia.org/r/316025 (https://phabricator.wikimedia.org/T147024) (owner: BryanDavis)
[00:58:46] (PS4) BryanDavis: Validate new usernames with action=query&list=users&usprop=cancreate [labs/striker] - https://gerrit.wikimedia.org/r/316025 (https://phabricator.wikimedia.org/T147024)
[00:59:08] RECOVERY - Host tools-secgroup-test-103 is UP: PING OK - Packet loss = 0%, RTA = 7.97 ms
[01:00:01] (PS4) BryanDavis: Check request ip for account creation blocks on Wikitech [labs/striker] - https://gerrit.wikimedia.org/r/316026 (https://phabricator.wikimedia.org/T147024)
[01:00:20] (PS3) BryanDavis: Update client side validation for username and shellname [labs/striker] - https://gerrit.wikimedia.org/r/316205
[01:04:07] PROBLEM - Host tools-secgroup-test-103 is DOWN: CRITICAL - Host Unreachable (10.68.21.22)
[01:11:05] (CR) BryanDavis: Validate new usernames with action=query&list=users&usprop=cancreate [labs/striker] - https://gerrit.wikimedia.org/r/316025 (https://phabricator.wikimedia.org/T147024) (owner: BryanDavis)
[01:11:06] (CR) BryanDavis: [C: 2] "3rd time is the charm?" [labs/striker] - https://gerrit.wikimedia.org/r/316025 (https://phabricator.wikimedia.org/T147024) (owner: BryanDavis)
[01:11:06] (CR) BryanDavis: [C: 2] "*grumble* jenkins *grumble*" [labs/striker] - https://gerrit.wikimedia.org/r/316026 (https://phabricator.wikimedia.org/T147024) (owner: BryanDavis)
[01:11:07] (Merged) jenkins-bot: Validate new usernames with action=query&list=users&usprop=cancreate [labs/striker] - https://gerrit.wikimedia.org/r/316025 (https://phabricator.wikimedia.org/T147024) (owner: BryanDavis)
[01:11:08] (Merged) jenkins-bot: Check request ip for account creation blocks on Wikitech [labs/striker] - https://gerrit.wikimedia.org/r/316026 (https://phabricator.wikimedia.org/T147024) (owner: BryanDavis)
[01:11:08] (Merged) jenkins-bot: Update client side validation for username and shellname [labs/striker] - https://gerrit.wikimedia.org/r/316205 (owner: BryanDavis)
[01:11:08] * bd808 throws jerkins a cookie
[01:12:26] bd808, it wants your blood
[01:13:32] It may be something about that repo's gerrit config. It seemed to not want to do a merge commit
[01:16:17] (PS1) BryanDavis: Add support for authenticated Action API use [labs/striker] - https://gerrit.wikimedia.org/r/324637 (https://phabricator.wikimedia.org/T144712)
[01:24:39] RECOVERY - Host secgroup-lag-102 is UP: PING OK - Packet loss = 0%, RTA = 1.13 ms
[01:28:58] hi, I've got an issue, where some syntax such as [[File: ]] and == dont work
[01:29:08] brand new installation, any idea what's wrong?
[01:30:03] asffsa: try asking in #mediawiki ?
[01:30:39] why are there so many different channels
[01:47:27] PROBLEM - Host secgroup-lag-102 is DOWN: CRITICAL - Host Unreachable (10.68.17.218)
[02:50:11] PROBLEM - SSH on tools-exec-1403 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:54:59] RECOVERY - SSH on tools-exec-1403 is OK: SSH OK - OpenSSH_6.9p1 Ubuntu-2~trusty1 (protocol 2.0)
[03:22:26] (PS1) Krinkle: [WIP] Add support for IPv6 in IP info message [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648
[03:23:13] (CR) jenkins-bot: [V: -1] [WIP] Add support for IPv6 in IP info message [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648 (owner: Krinkle)
[03:24:48] (PS2) Krinkle: [WIP] Add support for IPv6 in IP info message [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648
[03:27:37] (PS3) Krinkle: [WIP] Add support for IPv6 in IP info message [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648
[03:35:39] (PS4) Krinkle: [WIP] Add ASN description and range to IP info message (+ IPv6 support) [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648
[03:37:49] (PS5) Krinkle: [WIP] Add ASN description and range to IP info message (+ IPv6 support) [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648
[03:41:38] (PS6) Krinkle: [WIP] Add ASN description and range to IP info message (+ IPv6 support) [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648
[03:45:33] (PS7) Krinkle: Add ASN description and range to IP info message (+ IPv6 support) [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648
[03:46:10] (CR) Krinkle: [C: 2] "Examples:" [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648 (owner: Krinkle)
[03:46:48] (Merged) jenkins-bot: Add ASN description and range to IP info message (+ IPv6 support) [labs/tools/guc] - https://gerrit.wikimedia.org/r/324648 (owner: Krinkle)
[04:03:03] (PS1) Krinkle: [WIP] Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650
[04:05:45] (PS2) Krinkle: [WIP] Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650
[04:06:34] (PS3) Krinkle: [WIP] Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650
[04:08:13] (PS4) Krinkle: [WIP] Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650
[04:22:50] (PS5) Krinkle: [WIP] Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650
[04:24:08] (PS6) Krinkle: [WIP] Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650
[04:25:11] (PS7) Krinkle: Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650
[04:25:22] (CR) Krinkle: [C: 2] Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650 (owner: Krinkle)
[04:25:57] (Merged) jenkins-bot: Add link for more information about the ASN/ISP. [labs/tools/guc] - https://gerrit.wikimedia.org/r/324650 (owner: Krinkle)
[04:31:55] (PS1) Krinkle: Fix hostname lookup limitation [labs/tools/guc] - https://gerrit.wikimedia.org/r/324653
[04:32:42] (PS2) Krinkle: Fix hostname lookup limitation [labs/tools/guc] - https://gerrit.wikimedia.org/r/324653
[04:32:49] (CR) Krinkle: [C: 2] Fix hostname lookup limitation [labs/tools/guc] - https://gerrit.wikimedia.org/r/324653 (owner: Krinkle)
[04:33:14] (Merged) jenkins-bot: Fix hostname lookup limitation [labs/tools/guc] - https://gerrit.wikimedia.org/r/324653 (owner: Krinkle)
[06:23:50] Labs, Labs-Infrastructure, DBA, Datasets-General-or-Unknown, Patch-For-Review: Provision db1095 with at least 1 shard, sanitize and test slave-side triggers - https://phabricator.wikimedia.org/T150802#2837353 (Marostegui) >>! In T150802#2836731, @jcrespo wrote: > I wanted to sanitize this for...
[06:26:33] Labs, Labs-Infrastructure, DBA, Datasets-General-or-Unknown, Patch-For-Review: Provision db1095 with at least 1 shard, sanitize and test slave-side triggers - https://phabricator.wikimedia.org/T150802#2837356 (Marostegui) I think we are ready to sanitize s3 now after dropping all the non priv...
[06:30:12] PROBLEM - Puppet run on tools-exec-1413 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[06:42:07] Labs, Labs-Infrastructure, DBA, Datasets-General-or-Unknown, Patch-For-Review: Provision db1095 with at least 1 shard, sanitize and test slave-side triggers - https://phabricator.wikimedia.org/T150802#2837381 (Marostegui) >>! In T150802#2836731, @jcrespo wrote: > I wanted to sanitize this for...
[07:00:02] PROBLEM - Puppet run on tools-exec-1415 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[07:03:14] Labs, Labs-Infrastructure, DBA, Datasets-General-or-Unknown, Patch-For-Review: Provision db1095 with at least 1 shard, sanitize and test slave-side triggers - https://phabricator.wikimedia.org/T150802#2837422 (jcrespo) > Shall I run the local redact_sanitarium.sh instead of the one we used to...
[07:05:49] PROBLEM - Puppet run on tools-exec-1402 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[07:10:12] RECOVERY - Puppet run on tools-exec-1413 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:29:34] Labs, Labs-Infrastructure, DBA, Datasets-General-or-Unknown, Patch-For-Review: Provision db1095 with at least 1 shard, sanitize and test slave-side triggers - https://phabricator.wikimedia.org/T150802#2837454 (Marostegui) I have started to sanitize s3 using the local script in a local screen...
[07:35:01] RECOVERY - Puppet run on tools-exec-1415 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:40:49] RECOVERY - Puppet run on tools-exec-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:08:26] Labs, Labs-Infrastructure, DBA, Datasets-General-or-Unknown, Patch-For-Review: Provision db1095 with at least 1 shard, sanitize and test slave-side triggers - https://phabricator.wikimedia.org/T150802#2837675 (Marostegui) The script has finished. It took around 1:35h to finish. I am going to...
[10:12:49] Labs, Horizon: Cannot edit Puppet role parameters - https://phabricator.wikimedia.org/T152084#2837775 (scfc)
[11:33:32] Labs, Labs-Kubernetes, Patch-For-Review: Build and validate Ubuntu Trusty image for wikimedia use - https://phabricator.wikimedia.org/T148054#2837902 (yuvipanda) Open>Resolved a:yuvipanda
[11:34:03] Labs, Labs-Kubernetes, Tool-Labs: Build toollabs trusty 'catch all' container - https://phabricator.wikimedia.org/T152089#2837904 (yuvipanda)
[12:34:51] Labs, Labs-Kubernetes, Tool-Labs, Patch-For-Review: Build toollabs trusty 'catch all' container - https://phabricator.wikimedia.org/T152089#2837962 (yuvipanda)
[13:27:38] hi! howdo I scp to a project?
[13:40:35] PROBLEM - Free space - all mounts on tools-docker-builder-03 is CRITICAL: CRITICAL: tools.tools-docker-builder-03.diskspace.root.byte_percentfree (<44.44%)
[13:50:42] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: OpenStack API refuses to launch new instances || Nodepool is out of instance / CI stalled - https://phabricator.wikimedia.org/T152096#2838157 (hashar)
[13:56:28] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: OpenStack API refuses to launch new instances || Nodepool is out of instance / CI stalled - https://phabricator.wikimedia.org/T152096#2838175 (hashar)
[14:01:09] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: OpenStack API refuses to launch new instances || Nodepool is out of instance / CI stalled - https://phabricator.wikimedia.org/T152096#2838181 (hashar) The ImageNotAuthorized reference the images: 84f2fcfb-7ac5-4c3b-9505-ada37cbcaebf...
[14:02:46] Labs, Labs-Infrastructure, DBA, Datasets-General-or-Unknown, Patch-For-Review: Provision db1095 with at least 1 shard, sanitize and test slave-side triggers - https://phabricator.wikimedia.org/T150802#2838183 (Marostegui) The data has been sanitized correctly and I have started replication in...
[14:15:40] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: OpenStack API refuses to launch new instances || Nodepool is out of instance / CI stalled - https://phabricator.wikimedia.org/T152096#2838195 (hashar) Spawning an instance from Horizon as novaadmin works fine (tested by Andrew). Tried...
[14:27:37] PROBLEM - Puppet run on tools-docker-registry-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[14:29:42] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: OpenStack API refuses to launch new instances || Nodepool is out of instance / CI stalled - https://phabricator.wikimedia.org/T152096#2838232 (hashar) The new Jessie snapshot has ID 0d29c97d-390b-439b-9778-6c2171a7020b and fails as w...
[14:30:21] PROBLEM - Puppet run on tools-worker-1019 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[14:40:35] RECOVERY - Free space - all mounts on tools-docker-builder-03 is OK: OK: All targets OK
[14:51:34] PROBLEM - Free space - all mounts on tools-docker-builder-03 is CRITICAL: CRITICAL: tools.tools-docker-builder-03.diskspace.root.byte_percentfree (<10.00%)
[14:55:13] Tool-Labs-tools-Other: Some yifeibot tasks seem to hang indefinately - https://phabricator.wikimedia.org/T152054#2838301 (zhuyifei1999) Open>Resolved a:zhuyifei1999 I killed the rest and disabled web job submission. (The code is really ugly :( )
[14:57:35] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: OpenStack API refuses to launch new instances || Nodepool is out of instance / CI stalled - https://phabricator.wikimedia.org/T152096#2838313 (chasemp) Open>Resolved a:chasemp Somehow nodepool was using an invalid token. We...
[15:01:35] RECOVERY - Free space - all mounts on tools-docker-builder-03 is OK: OK: All targets OK
[15:02:37] RECOVERY - Puppet run on tools-docker-registry-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:03:52] Epantaleo: if you setup your ssh proxycommand to get into the instance scp should follow suit
[15:04:35] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2838332 (yuvipanda) After some more difficulty in the 'omnibus' solution, I am now in the process of adding small utilities to all the language level containers. That might be enough to get autolist working on that. I'll update...
[15:08:20] Labs-Kubernetes: Odd kubernetes error - https://phabricator.wikimedia.org/T141041#2838338 (yuvipanda) Open>Resolved Ah, autolist now just redirects to PetScan :) Either way, all the tools mentioned are now in the base images :) I've switched autolist to run on k8s now, and seems to work ok.
[15:10:18] RECOVERY - Puppet run on tools-worker-1019 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:11:59] chasemp: thanks
[15:25:24] bd808: musikanimal I just merged the ruby container.
[15:25:32] I guess I need to add an entry to the webservices package
[15:59:42] chasemp: the new config file works to get access to the tool, but cause problems when I access another host with ssh
[16:08:37] Labs, Labs-Infrastructure, DBA, Datasets-General-or-Unknown, Patch-For-Review: Provision db1095 with at least 1 shard, sanitize and test slave-side triggers - https://phabricator.wikimedia.org/T150802#2838504 (Marostegui) The server caught up and the data is being sanitized as it comes in, so...
[16:45:59] Labs, DBA: Prepare and check storage layer for new fi.wikivoyage.org - https://phabricator.wikimedia.org/T151756#2838643 (jcrespo) Open>Resolved a:jcrespo From the above patch, this is resolved.
[17:02:03] Labs-project-Phabricator, Patch-For-Review: have a phabricator test instance in labs that uses a working puppet role - https://phabricator.wikimedia.org/T139475#2838697 (Paladox) Patch https://gerrit.wikimedia.org/r/#/c/324408/ and https://gerrit.wikimedia.org/r/#/c/324551/ allows you to configure a diff...
[17:02:31] Labs-project-Phabricator, Phabricator, Patch-For-Review: have a phabricator test instance in labs that uses a working puppet role - https://phabricator.wikimedia.org/T139475#2838698 (Paladox)
[17:08:10] YuviPanda: awesome! how does it work?
[17:17:19] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Shavtay was created, changed by Shavtay link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Shavtay edit summary: Created page with "{{Tools Access Request |Justification=Hi, I started volunteering in wikimedia few months ago, helping the Hebrew wiktionary team with bots that fix many things. I want to up..."
[17:21:19] memory
[17:38:31] musikanimal: there
[17:38:59] bah. there's a pending patch to tie the container into the webservice command
[17:39:49] when that lands and is deployed then we can get you testing it :)
[17:42:29] (PS1) MarcoAurelio: Post current StewardBot.py code [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/324758
[17:42:55] ^will fail dramatically
[17:43:20] (CR) jenkins-bot: [V: -1] Post current StewardBot.py code [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/324758 (owner: MarcoAurelio)
[17:45:43] (CR) MarcoAurelio: [C: 2 V: 2] "Pushing current code to the repo so fixes can start being done." [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/324758 (owner: MarcoAurelio)
[17:51:13] mafk: :/ all of my pep8 fixes blown away
[17:51:48] bd808: we were all working on a lie, that code was not the code that actually worked
[17:52:01] this is the code that is running now
[17:52:23] I need now a test instance to check the code on a StewardBot-test or something
[17:52:25] coolio, will be happy to help wiht testing
[17:52:27] *with
[17:53:11] * mafk forces his brains to guess an idea
[18:04:24] (PS1) BryanDavis: Fix PEP8 violations [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/324759
[18:04:29] mafk: ^
[18:04:40] heh, guess I was doing?
[18:04:42] python -m pip install --upgrade pip
[18:04:46] ;)
[18:05:10] bd808: what I need is a clone so StewardBot is still up while I can test updates
[18:05:20] any easy way to achieve that?
[18:05:57] mafk: make a new tool account?
[18:06:07] I can use mabot or maurelio
[18:06:16] or run it locally. I haven't tried that
[18:07:01] When I'm making big changes to Stashbot I run it from a venv on my laptop
[18:07:27] that's part of what stewardbots needs -- migration to run from a tool local venv
[18:07:48] so that you can use a less than ancient irc lib
[18:07:50] remember that I'm not a coder ;)
[18:08:39] mafk: not yet ;)
[18:09:00] I though about creating a new folder, with the code we merge from Gerrit, to run it independently
[18:09:30] bascially, yes that's the way to do it
[18:10:05] and you can actually test patches pre-merge by cherry-picking the patch from gerrit into the working git clone that you are testing from
[18:12:19] what about an 'unstable' branch on gerrit?
[18:12:31] I commit there and if works, I push to master?
[18:12:36] (CR) BryanDavis: "I ran `autopep8 --in-place --aggressive --aggressive StewardBot/StewardBot.py` and then hand edited to fix a small number of issues report" [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/324759 (owner: BryanDavis)
[18:13:15] mafk: you can do that but usually it's not needed. Find a workflow that works for you though certainly
[18:13:26] Is SULWatcher code from public_html okay?
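Note on the pre-merge testing described at 18:10:05: a minimal sketch of fetching a Gerrit patch set into a scratch clone and cherry-picking it. It assumes Gerrit's usual `refs/changes/<last two digits>/<change>/<patch set>` ref layout and a git remote named `origin`; the helper name `gerrit_change_ref` is made up for illustration and is not part of any tool mentioned in this log.

```shell
#!/bin/sh
# Build the Gerrit ref for a change number and patch set, e.g. for
# change 324759 (the "Fix PEP8 violations" patch above), patch set 1.
gerrit_change_ref() {
    change="$1"
    patchset="$2"
    # Gerrit groups changes by the last two digits of the change number.
    shard=$(printf '%02d' "$((change % 100))")
    printf 'refs/changes/%s/%s/%s\n' "$shard" "$change" "$patchset"
}

# Usage inside a separate working clone (the remote name is an assumption):
#   git fetch origin "$(gerrit_change_ref 324759 1)"
#   git cherry-pick FETCH_HEAD
```

Running the bot from that separate clone leaves the live copy untouched while the patch is being tried out.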
[18:13:31] I've not updated that
[18:13:55] having an unstable branch often leads to waiting too long before actually using the patches live
[18:14:25] and then being surprised about why something doesn't work right and lots of patches to bisect to figure out why
[18:14:26] * mafk headesks -- needs time to understand all this stuff
[18:16:02] (CR) MarcoAurelio: [C: 2] Fix PEP8 violations [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/324759 (owner: BryanDavis)
[18:16:38] starting gate-and-submit
[18:19:28] I don't think our patches should be on the mediawiki queue
[18:19:49] (Merged) jenkins-bot: Fix PEP8 violations [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/324759 (owner: BryanDavis)
[18:19:56] there's some magic way to move to a new queue in the CI setup
[18:20:07] you can open a task and ask the CI folks to help
[18:20:26] I think everything is in the same queue by default
[18:22:14] !log tools.stewardbots [[gerrit:324758|Post current StewardBot.py code]] and [[gerrit:324759|Fix PEP8 violations]] courtesy of Bryan Davis.
[18:22:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[18:24:44] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[18:34:49] Labs, Tool-Labs: enwiki_p replica corruption - https://phabricator.wikimedia.org/T152117#2839007 (Tb)
[18:42:43] Labs, Tool-Labs: enwiki_p replica corruption - https://phabricator.wikimedia.org/T152117#2839043 (jcrespo) Open>Resolved a:jcrespo ``` MariaDB [enwiki]> SELECT * FROM pagelinks where pl_from = 1993506 and pl_title = 'South-Africa'; Empty set (0.00 sec) ```
[18:48:49] Labs, Recommendation-API: Request increased quota for recommendation-api labs project - https://phabricator.wikimedia.org/T152120#2839107 (schana)
[18:57:45] bd808: I tested the code with the pep8 fixes for stewardbot
[18:57:48] it works
[18:58:02] <3
[18:59:40] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:12:37] bd808: SULWatcher new code also works
[19:12:43] <3 <3
[19:17:28] Labs-project-Phabricator, Phabricator, Patch-For-Review: have a phabricator test instance in labs that uses a working puppet role - https://phabricator.wikimedia.org/T139475#2839277 (Paladox) Adding screenshot to show it working {F4931606}
[19:27:26] Labs-project-Phabricator, Phabricator, Patch-For-Review: have a phabricator test instance in labs that uses a working puppet role - https://phabricator.wikimedia.org/T139475#2839305 (Paladox) Open>Resolved a:Paladox CLosing as resolved, as it installs. Creating a doc at {P4527}
[19:28:38] Labs-project-Phabricator, Phabricator, Patch-For-Review: have a phabricator test instance in labs that uses a working puppet role - https://phabricator.wikimedia.org/T139475#2839310 (Dzahn) Can confirm we now have a jessie instance using the regular "role::phabricator::main" role and ``` Linux phabr...
[19:37:02] Labs-project-Phabricator, Operations, Phabricator: have a phabricator test instance in labs that uses a working puppet role - https://phabricator.wikimedia.org/T139475#2839337 (Dzahn)
[20:21:40] Labs-project-Phabricator, Operations, Release-Engineering-Team: Setup test domain for phab2001 - https://phabricator.wikimedia.org/T152132#2839539 (Paladox)
[20:24:03] Labs-project-Phabricator, Operations, Release-Engineering-Team: Setup test domain for phab2001 - https://phabricator.wikimedia.org/T152132#2839556 (Dzahn) agreed. we did it in a similar way for gerrit with "gerrit-new". but gerrit wasn't behind varnish, unlike phab.
[20:24:19] Labs-project-Phabricator, Operations, Release-Engineering-Team: Setup test domain for phab2001 - https://phabricator.wikimedia.org/T152132#2839570 (Dzahn) please link this to the other phab2001 ticket(s) in some way
[20:24:19] Labs-project-Phabricator, Operations, Release-Engineering-Team: Setup test domain for phab2001 - https://phabricator.wikimedia.org/T152132#2839572 (Paladox)
[21:00:18] Labs, Tool-Labs: become should have a better error message when homedir doesn't exist - https://phabricator.wikimedia.org/T149511#2839736 (scfc) p:Triage>Low
[21:03:48] Labs, Tool-Labs: Unconfirm account email addresses for Wikitech accounts that bounced during 2016 survey mailings - https://phabricator.wikimedia.org/T149824#2839755 (scfc) Are the users informed by MediaWiki about the change or do we need to leave messages on their talk pages as well? (For future surve...
[21:05:50] "Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Reading data from Toolsbeta failed: TypeError: Data retrieved from Toolsbeta is String not Hash at /etc/puppet/manifests/realm.pp:71 on node toolsbeta-valhallasw-puppet-compiler-3.toolsbeta.eqiad.wmflabs"
[21:06:12] but line 71 is '$app_routes = hiera('discovery::app_routes')'
[21:08:03] hm, just retrying a while later seems to magically solve it. Puppet...
[21:10:40] random intermittent error...
[21:11:47] and now I'm back to "Error: Could not retrieve catalog from remote server: Error 400 on SERVER: We need either the domain name for DNS discovery or an explicit peers list at /etc/puppet/modules/etcd/manifests/init.pp:56 on node toolsbeta-valhallasw-puppet-compiler-3.toolsbeta.eqiad.wmflabs"
[21:11:49] gaaaaaah
[21:18:33] Labs, Tool-Labs, Patch-For-Review, User-bd808: Reduce Precise OGE exec hosts to 10 - https://phabricator.wikimedia.org/T151980#2839825 (bd808) Just as a sanity check to prevent another T149634#2758566: ``` tools-bastion-02.tools:~ bd808$ sudo qconf -sel|grep -- -12 tools-exec-1212.eqiad.wmflabs t...
[21:21:31] valhallasw`cloud: ugh. cert problems?
[21:22:01] https://phabricator.wikimedia.org/T152142
[21:22:23] and the DNS discovery one was a missing `etcd::peers_list: []` in the hiera config
[21:22:44] basically, puppet did not work because the following was missing from hiera: discovery::app_routes: {}, etcd::peers_list: [], labsdnsconfig: {}
[21:22:49] * valhallasw`cloud is not amused
[21:23:06] *nod* I would guess those are things that role based hiera supplies in prod
[21:23:19] there's a lot of crap like that in the puppet code
[21:23:56] valhallasw`cloud: for a similar game of whack-a-mole, see https://wikitech.wikimedia.org/wiki/User:BryanDavis/Scap3_in_a_Labs_project
[21:24:03] anyhow, I finally have rebuilt my puppet compiler host
[21:24:22] let's see if it still works on labs hosts
[21:39:05] Labs, Tool-Labs: Unconfirm account email addresses for Wikitech accounts that bounced during 2016 survey mailings - https://phabricator.wikimedia.org/T149824#2839931 (bd808) I think the only way to unconfirm is manual db manipulation, so a talk page post would be a good idea. Or a maintenance script that...
[21:41:26] Labs, Tool-Labs, puppet-compiler: toolsbeta: set up puppet-compiler / temporary-apply - https://phabricator.wikimedia.org/T97081#2839948 (valhallasw) Steps to setting up the compiler for tools: 1) create a new m1.large instance in toolsbeta (toolsbeta-valhallasw-puppet-compiler-5) (Jessie, don't for...
[21:42:42] Labs, Tool-Labs, MediaWiki-extensions-WikimediaMaintenance: Make maintance script for sending annual survey emails - https://phabricator.wikimedia.org/T148783#2839954 (bd808) This script should make sure that bounces come back to the wiki so bounce handlers are notified. It would be extra awesome if...
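Note on the three hiera keys listed at 21:22:44: their presence could be checked before a puppet run. A rough sketch only, assuming the project's hiera data has been saved to a flat YAML file with top-level keys; the file path and the function name `check_hiera_keys` are hypothetical and do not come from the log.

```shell
#!/bin/sh
# Report which of the given top-level keys are absent from a YAML file.
# Assumes each key starts at column 0 and is directly followed by a colon.
check_hiera_keys() {
    file="$1"
    shift
    missing=0
    for key in "$@"; do
        if ! grep -q "^${key}:" "$file"; then
            echo "missing: $key"
            missing=1
        fi
    done
    # Exit status 0 when every key was found, 1 otherwise.
    return "$missing"
}

# Example call with the keys named in the log (hiera.yaml is a placeholder path):
#   check_hiera_keys hiera.yaml 'discovery::app_routes' 'etcd::peers_list' labsdnsconfig
```

This only confirms that the keys exist; it does not validate that their values have the type the manifests expect (hash or list).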
[21:49:49] Labs, Tool-Labs: Missing php5-mcrypt module from tools-exec-14xx - https://phabricator.wikimedia.org/T149810#2839970 (scfc) p:Triage>Low The `php5-mcrypt` should be installed on all execution nodes and is indeed (at least on Trusty instances): ``` scfc@tools-puppetmaster-02:~$ clush -g exec-trus...
[21:52:52] Labs, Tool-Labs, Patch-For-Review, User-bd808: Change Python hashbang to `#! /usr/bin/env python -E -s` for user-facing tools - https://phabricator.wikimedia.org/T147350#2839980 (scfc) Resolved>Open The packages still need to be built and deployed.
[21:58:28] Labs, Tool-Labs, Patch-For-Review, User-bd808: Change Python hashbang to `#! /usr/bin/env python -E -s` for user-facing tools - https://phabricator.wikimedia.org/T147350#2840019 (bd808) Thanks @scfc. @yuvipanda reminded me of this but I had not dug up the task and reopened it yet. I'll get this d...
[21:58:53] Labs, Tool-Labs, Patch-For-Review, User-bd808: Change Python hashbang to `#! /usr/bin/env python -E -s` for user-facing tools - https://phabricator.wikimedia.org/T147350#2840022 (bd808) p:Triage>High
[22:00:03] Labs, Tool-Labs, Pywikibot-core: Running a core script fails with 'permission denied' creating a logfile folder - https://phabricator.wikimedia.org/T146996#2840025 (scfc) Open>Resolved a:valhallasw The cause with the ownership/permissions of the directory and a solution has been explained...
[22:13:05] Labs, Tool-Labs, Community-Tech-Tool-Labs: Make a nag system to email maintainers of tools still running on precise gird hosts - https://phabricator.wikimedia.org/T149214#2745525 (scfc) The information is available in `/var/lib/gridengine/default/common/accounting`: ``` scfc@tools-bastion-03:~$ tail...
[22:15:29] RECOVERY - Host tools-secgroup-test-102 is UP: PING OK - Packet loss = 0%, RTA = 1.91 ms
[22:16:31] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Shavtay was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=1062588 edit summary:
[22:23:24] PROBLEM - Host tools-secgroup-test-102 is DOWN: CRITICAL - Host Unreachable (10.68.21.170)
[22:32:11] Labs, Tool-Labs, puppet-compiler: toolsbeta: set up puppet-compiler / temporary-apply - https://phabricator.wikimedia.org/T97081#2840111 (scfc) CMIIW, but is the compiler really working? http://tools-puppet-compiler.wmflabs.org/324623000/ says "Hosts that have no differences" includes "tools-grid-ma...
[23:04:09] RECOVERY - Host tools-secgroup-test-103 is UP: PING OK - Packet loss = 0%, RTA = 1.80 ms
[23:04:44] Labs, Graphite, Operations: Move labs 'instances' data to graphite labs - https://phabricator.wikimedia.org/T143405#2840199 (fgiunchedi)
[23:14:07] PROBLEM - Host tools-secgroup-test-103 is DOWN: CRITICAL - Host Unreachable (10.68.21.22)
[23:26:27] andrewbogott, i am getting dns failures for promethium and some of my vms ..
[23:26:27] [subbu@earth ~] ssh ssastry@promethium.wikitextexp.wmflabs.org
[23:26:28] ssh: Could not resolve hostname promethium.wikitextexp.wmflabs.org: Name or service not known
[23:26:40] [subbu@earth ~] ssh ssastry@mw-expt.wikitextexp.wmflabs.org
[23:26:40] ssh: Could not resolve hostname mw-expt.wikitextexp.wmflabs.org: Name or service not known
[23:27:05] or bd808 ^ .. where do i debug this? or is this something transient?
[23:27:17] this isn't anything we know about atm subbu
[23:27:18] subbu: those are all instances with public ips?
[23:27:27] ah yes
[23:27:30] good thought
[23:27:32] i think so.
[23:28:02] i remember setting up proxies for them .. i logged into them 3+ weeks back.
[23:28:24] web proxies aren't the same thing as ssh proxies...
[23:28:30] alex@alex-laptop:~$ dig +short promethium.wikitextexp.wmflabs.org @labs-ns0.wikimedia.org
[23:28:31] alex@alex-laptop:~$
[23:28:56] subbu: you don't just want @eqiad.wmflabs?
[23:29:07] Labs, Tool-Labs: shinken is too "volatile" and imprecise to be of use - https://phabricator.wikimedia.org/T107297#1491854 (Krinkle) Same issue for te "cvn" project. The notifications are pure noise and have never notified me of anything actionable. For us the main one is "Puppet run is CRITICAL" - which...
[23:29:41] i was logging onto them with those names so far .. so presumably, there were public ips for them till now .. but let me try @eqiad.wmflabs anyway.
[23:29:50] andrewbogott, I think promethium refers to that host which isn't an instance
[23:30:11] yeah, but I don't think that had a public ip either, not sure
[23:30:43] anyway, i got in with just @eqiad.wmflabs
[23:31:17] hm, ok, it looks like there are floating ips assigned to things in wikitextexp
[23:31:22] so I'll investigate a bit
[23:31:48] my memory doesn't tell me much at this time .. if i had requested them for some reason.
[23:32:13] hm, they have floating ips but no domains assigned
[23:32:28] subbu, you have expt.wikitextexp.wmflabs.org and base.wikitextexp.wmflabs.org
[23:32:34] yes.
[23:33:04] *. those pointing to 208.80.155.188 and 208.80.155.182 respectively
[23:33:48] which are instance-mw-base.wikitextexp.wmflabs.org and instance-mw-expt.wikitextexp.wmflabs.org
[23:34:25] https://phabricator.wikimedia.org/T132216 might be relevant?
[23:34:53] no?
[23:35:20] never mind. i am trying to remember why they might have public ips.
[23:36:45] subbu: so you're able to do the things that you immediately need to do?
[23:37:14] i will know once tim release a new version of a package ...
[23:41:10] andrewbogott, but yes, i can log in .. ping the mw-base and mw-expt vms .. and access files there.
[23:48:31] PROBLEM - Host secgroup-lag-102 is DOWN: CRITICAL - Host Unreachable (10.68.17.218)
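Note on the DNS check at 23:28:30: the same `dig +short NAME @SERVER` query can be looped over several names to see which ones the nameserver answers for. A small sketch using only that form of the command; the function name `check_names` is made up here, and an empty answer is simply reported as "no answer" without guessing at the cause.

```shell
#!/bin/sh
# Query one nameserver for each given hostname and print what came back.
check_names() {
    server="$1"
    shift
    for name in "$@"; do
        # +short prints only the answer records, or nothing if there are none.
        answer=$(dig +short "$name" "@$server")
        if [ -n "$answer" ]; then
            echo "$name -> $answer"
        else
            echo "$name -> no answer"
        fi
    done
}

# Example call with names taken from the log:
#   check_names labs-ns0.wikimedia.org \
#       promethium.wikitextexp.wmflabs.org mw-expt.wikitextexp.wmflabs.org
```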