[00:01:44] Labs, Patch-For-Review: Make a labstore module - https://phabricator.wikimedia.org/T93781#1398598 (coren)
[00:10:24] 500 error on Phabricator. Known?
[00:10:44] wrong chan. Going to ops
[00:42:53] booted up a new jessie instance in labs and added labs-vagrant. attempting to provision it though it's not finding particular packages (like ruby2.0-dev) even after running a manual apt-get update.
[00:42:59] do i need to use ubuntu for labs-vagrant to work perhaps?
[00:43:11] yes
[00:43:26] ok
[00:43:32] the mw-vagrant rules don't know how to behave on jessie
[01:15:43] Labs, MediaWiki-extensions-OATHAuth, wikitech.wikimedia.org: MF Special:Login doesn't have a field for 2FA - https://phabricator.wikimedia.org/T103771#1398717 (Negative24) NEW
[01:34:54] Labs, MediaWiki-extensions-OATHAuth, MobileFrontend, wikitech.wikimedia.org, Readership-Web: MF Special:Login doesn't have a field for 2FA - https://phabricator.wikimedia.org/T103771#1398732 (Krenair)
[02:41:59] Labs, MediaWiki-extensions-OATHAuth, MobileFrontend, wikitech.wikimedia.org, Readership-Web: MF Special:Login doesn't have a field for 2FA - https://phabricator.wikimedia.org/T103771#1398868 (Parent5446) The token input will soon no longer be on the login page anyway, but I am not sure how Mobi...
[05:32:02] guys, I am doing rolling salt key regeneration on all instances that aren't in their own little salt cluster by themselves
[05:32:17] this will affect only one instance at a time, should be relatively painless for all concerned
[06:25:23] Labs: Instance osmit.eqiad.wmflabs refuses my public ssh key - https://phabricator.wikimedia.org/T103310#1399137 (Sbiribizio) As stated before, I think the problem is here: mount.nfs: Failed to resolve server labstore.svc.eqiad.wmnet: Temporary failure in name resolution mountall: mount /public/keys [337] te...
[06:38:02] Labs: clean up old ec2id-based salt keys on labs - https://phabricator.wikimedia.org/T103089#1399156 (ArielGlenn) well I went through one run of the script and it tossed many. Seems there are some projects still using the old names; I'll know more later today.
[06:46:13] Labs, Incident-20150617-LabsNFSOutage: Instance osmit.eqiad.wmflabs refuses my public ssh key - https://phabricator.wikimedia.org/T103310#1399159 (valhallasw) Yes, that looks like it's caused by the NFS outage. Labs roots should still be able to log in to your instance.
[06:51:45] Labs, Incident-20150617-LabsNFSOutage: Instance osmit.eqiad.wmflabs refuses my public ssh key - https://phabricator.wikimedia.org/T103310#1399198 (Sbiribizio) So I must wait for a fix? It's enough this ticket or I must open a new ticket? Regards
[07:15:25] Labs, Labs-Infrastructure: add domain alias gomwiki.labsdb and lrcwiki.labsdb for s3.labsdb - https://phabricator.wikimedia.org/T103794#1399303 (Merl) NEW
[07:16:15] Labs, Labs-Infrastructure: add domain alias gomwiki.labsdb and lrcwiki.labsdb for s3.labsdb - https://phabricator.wikimedia.org/T103794#1399312 (Merl)
[07:16:16] Tool-Labs-tools-Other, Tracking: merl tools (tracking) - https://phabricator.wikimedia.org/T69556#1399311 (Merl)
[07:17:11] Labs, Labs-Infrastructure: add domain alias gomwiki.labsdb and lrcwiki.labsdb for s3.labsdb - https://phabricator.wikimedia.org/T103794#1399303 (Merl)
[08:30:14] Labs, Incident-20150617-LabsNFSOutage, Labs-Sprint-103: Recover /data/project/home/botbot on bots project - https://phabricator.wikimedia.org/T103696#1399417 (Petrb) 300GB are you sure? I doubt it's more than 300MB There should be some folder called "src" which contains sources, this one I definitely...
[08:31:09] Labs, Incident-20150617-LabsNFSOutage, Labs-Sprint-103: Recover /data/project/home/botbot on bots project - https://phabricator.wikimedia.org/T103696#1399422 (Petrb) there is somewhere a git repository maybe looking for it will get you to that folder
[08:31:41] Labs, Incident-20150617-LabsNFSOutage, Labs-Sprint-103: Recover /data/project/home/botbot on bots project - https://phabricator.wikimedia.org/T103696#1399424 (Petrb) find botbot | grep .git
[08:37:22] Labs, Discovery, Maps: *.tiles.wmflabs.org should be served over https - https://phabricator.wikimedia.org/T103801#1399439 (valhallasw) NEW
[08:58:05] Labs, Beta-Cluster: Disable NFS home directories on deployment-prep - https://phabricator.wikimedia.org/T102169#1399502 (hashar)
[09:09:33] Labs, Incident-20150617-LabsNFSOutage: Recover cssk tool's log files - https://phabricator.wikimedia.org/T103350#1399510 (Blahma) I have successfully received the files, thank you for your help!
[09:52:09] (PS1) Sitic: Stop click event propagation when necessary [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/220727
[09:52:32] (CR) Sitic: [C: 2 V: 2] Stop click event propagation when necessary [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/220727 (owner: Sitic)
[09:54:28] Labs: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1399592 (MoritzMuehlenhoff) NEW
[09:54:56] Labs: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1399604 (yuvipanda) p:Triage>High
[10:00:10] YuviPanda: can you fix https://phabricator.wikimedia.org/T103310#1399159 ?
[10:00:27] login issues because NFS has not been remounted yet
[10:00:30] ugh
[10:00:33] looking
[10:00:36] <3
[10:11:04] Labs: Identify (and potentially) help move mwoffliner off NFS - https://phabricator.wikimedia.org/T102682#1399661 (yuvipanda) They have moved to using /data/scratch now
[10:12:56] Labs, Labs-Sprint-102, Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1399673 (yuvipanda)
[10:12:57] Labs: Disable NFS for 'planet' project - https://phabricator.wikimedia.org/T102695#1399671 (yuvipanda) Open>Resolved
[10:16:59] valhallasw: fixed, let me update ticket
[10:17:04] <3
[10:19:12] Labs, Incident-20150617-LabsNFSOutage: Instance osmit.eqiad.wmflabs refuses my public ssh key - https://phabricator.wikimedia.org/T103310#1399684 (yuvipanda) Open>Resolved a:yuvipanda Apologies - I assumed these instances were back up after a restart, but due to puppet having fallen behind they w...
[10:21:11] Labs, Incident-20150617-LabsNFSOutage: Instance osmit.eqiad.wmflabs refuses my public ssh key - https://phabricator.wikimedia.org/T103310#1399691 (Sbiribizio) Thanks to you and all the staff. You're special!
[10:26:10] !log orgchart added matanya to project
[10:26:12] orgchart is not a valid project.
[10:26:17] !log orgcharts added matanya to project
[10:26:22] Logged the message, Master
[10:29:19] for those who are wondering, I am still wading my way through salt key regeneration on the instances, it's tedious. again any one instance should be affected for only a short (30 seconds at most) period of time
[10:30:44] Labs: Find alternative solutions for video project's use of NFS - https://phabricator.wikimedia.org/T102402#1399702 (Matanya) I will move to be using /data/scratch for the main video transcoding. other activities will remain within my home dir.
[10:37:53] Labs, Incident-20150617-LabsNFSOutage: Instance osmit.eqiad.wmflabs refuses my public ssh key - https://phabricator.wikimedia.org/T103310#1399706 (yuvipanda) You're welcome :)
[10:39:50] Labs, Beta-Cluster: Disable NFS home directories on deployment-prep - https://phabricator.wikimedia.org/T102169#1399716 (hashar) Still has a bunch: ``` ^[[Aroot@deployment-salt:~# salt '*' cmd.run 'grep /home /etc/fstab|egrep ^labstore' deployment-fluorine.deployment-prep.eqiad.wmflabs: deployment-sca02.d...
[10:43:28] Labs: Get rid of NFS in the packaging project - https://phabricator.wikimedia.org/T103812#1399733 (yuvipanda) NEW a:yuvipanda
[10:43:44] Labs: Get rid of NFS in the packaging project - https://phabricator.wikimedia.org/T103812#1399733 (yuvipanda) @akosiaris seems to be the only person with files on NFS
[10:47:46] Labs: Get rid of NFS in the packaging project - https://phabricator.wikimedia.org/T103812#1399756 (akosiaris) yes please do!
[10:48:57] Labs, Labs-Sprint-102, Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1399776 (yuvipanda)
[10:48:59] Labs: Get rid of NFS in the packaging project - https://phabricator.wikimedia.org/T103812#1399774 (yuvipanda) Open>Resolved Done!
[10:54:27] Labs, Beta-Cluster: Disable NFS home directories on deployment-prep - https://phabricator.wikimedia.org/T102169#1399818 (hashar) a:yuvipanda>hashar I have cleaned up in /etc/fstab the #labstore... lines with: `salt '*' cmd.run "sed -i '/^#labstore/d' /etc/fstab"` Manually cleaned the /home entry on...
[10:54:42] Labs, Beta-Cluster: Disable NFS home directories on deployment-prep - https://phabricator.wikimedia.org/T102169#1399820 (hashar) p:Triage>High
[10:55:02] Labs, Beta-Cluster: Disable NFS home directories on deployment-prep - https://phabricator.wikimedia.org/T102169#1399822 (hashar) Open>Resolved
[10:55:05] Labs, Labs-Sprint-102, Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1399824 (hashar)
[11:00:53] Labs, Beta-Cluster: Disable NFS home directories on deployment-prep - https://phabricator.wikimedia.org/T102169#1399868 (yuvipanda) Thank you very much :)
[11:04:00] Labs, Continuous-Integration-Infrastructure: Continuous integration should not depend on labs NFS - https://phabricator.wikimedia.org/T90610#1399888 (hashar) labstore is gone from /etc/fstab ``` root@integration-saltmaster:~# salt '*' cmd.run 'grep labstore /etc/fstab' integration-dev.integration.eqiad.wm...
[11:05:41] Labs, Continuous-Integration-Infrastructure, Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.integration.eqiad.wmflabs - https://phabricator.wikimedia.org/T103312#1399893 (yuvipanda) Hmm, the bastion-01 IP was added to puppet when it was created, are you sure these were running up to da...
[11:06:34] Labs, Continuous-Integration-Infrastructure: Continuous integration should not depend on labs NFS - https://phabricator.wikimedia.org/T90610#1399897 (hashar) Open>Resolved All fixed as far as I can tell. labstore is no more mounted nor in fstab.
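Editor's note: the fstab checks quoted above (grep for labstore, then delete the commented-out `#labstore` lines) can be sketched offline as a small parser. This is only an illustrative sketch, not what was run on the instances; the function name and the sample fstab contents (including the `/project/example/...` export paths) are invented for the example.

```python
def labstore_fstab_entries(fstab_text, server_prefix="labstore"):
    """Return active (uncommented) fstab lines whose device field starts with server_prefix."""
    entries = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments such as the leftover "#labstore..." lines.
        if not stripped or stripped.startswith("#"):
            continue
        device = stripped.split()[0]  # first fstab field is the device / NFS export
        if device.startswith(server_prefix):
            entries.append(stripped)
    return entries


# Hypothetical fstab contents, invented for illustration only.
sample = """\
UUID=abcd / ext4 defaults 0 1
#labstore.svc.eqiad.wmnet:/project/example/home /home nfs rw 0 0
labstore.svc.eqiad.wmnet:/project/example/project /data/project nfs rw 0 0
"""

for entry in labstore_fstab_entries(sample):
    print(entry)
```

An empty result for a host would correspond to the "labstore is gone from /etc/fstab" state reported in the ticket.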
[11:06:36] Labs, Labs-Sprint-102, Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1399899 (hashar)
[11:07:43] Labs, Continuous-Integration-Infrastructure, Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.integration.eqiad.wmflabs - https://phabricator.wikimedia.org/T103312#1399900 (hashar) For some reason ferm is no more applied on the CI instances, so it could not receive the new rules update.
[11:08:08] Labs, Continuous-Integration-Infrastructure, Labs-Infrastructure: Cant ssh to integration-slave-jessie-1001.integration.eqiad.wmflabs - https://phabricator.wikimedia.org/T103312#1399901 (yuvipanda) Ah, that makes sense :)
[11:17:58] Negative24: around?
[11:18:45] Labs: Disable NFS from performance project - https://phabricator.wikimedia.org/T103824#1399936 (yuvipanda) NEW a:yuvipanda
[11:18:48] Negative24: https://phabricator.wikimedia.org/T103824 :)
[12:16:11] YuviPanda: NFS is gone on both beta and integration
[12:16:19] though some beta instances still have /data/project
[12:16:30] been too lazy to investigate all the use cases
[12:25:33] addshore, have you had a chance to look at the request? :-)
[12:37:21] Is it somehow possible to add a domain to the zone file of WMFLabs proxies?
[12:38:00] at Special:NovaProxy I can only select something.wmflabs.org
[12:38:49] I don't think so, they require extra SSL certs
[12:42:05] zhuyifei1999, I guess an own proxy server somewhere that just modifies the HOST header wouldn't add much trust ... so there is no realistic option to change away from the http://commonsarchive.org/ 301 redirect
[12:42:27] to something transparent
[12:55:49] Labs: Investigate and disable NFS in the ttmserver project - https://phabricator.wikimedia.org/T103840#1400216 (yuvipanda) NEW a:yuvipanda
[12:57:20] (PS1) Sitic: Use cdnjs for sockjs version negotiation [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/220753
[12:57:32] (CR) Sitic: [C: 2 V: 2] Use cdnjs for sockjs version negotiation [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/220753 (owner: Sitic)
[13:07:55] Labs: Labs: Reinstall labstore1001 with Jessie - https://phabricator.wikimedia.org/T103266#1400260 (coren)
[13:07:58] Labs, operations, ops-codfw: Labs: Install the new RAID controller in labstore2002 and test - https://phabricator.wikimedia.org/T103267#1400255 (coren) Open>Resolved a:coren This was done as a consequence of T103356 Controller is well-supported and performs well, but there are quirks with the...
[13:16:27] rillke: https://phabricator.wikimedia.org/T97846
[13:17:08] " I'm pretty sure we do not want to host DNS zones for domain names we do not control, nor add that maintenance to the ops workload."
[13:17:51] Labs: Investigate and disable NFS in the ttmserver project - https://phabricator.wikimedia.org/T103840#1400302 (Nemo_bis) > 15.16 < YuviPanda> Nemo_bis: woah, find is taking forever because you seem to have a full checkout of the entire gnome project in there... Yes, feel free to trash everything and disable...
[13:19:24] Cyberpower678: nope
[13:19:28] thanks, valhallasw, well I understand these concerns
[13:20:51] rillke: I'm also not sure how the proxy would work with https. That would require you (or the wmf) to get a certificate and get it deployed on labs
[13:22:32] * rillke is hoping for Let'sEncrypt
[13:24:59] rillke: if you really really really want to, you can get a public IP address and point an A record to it.
[13:25:06] you'll have to deal with ssl yourself, etc.
[13:25:13] and it's highly unreccomended as ewll
[13:25:14] *well
[13:26:17] addshore, :(
[13:26:42] addshore, do you think I should re-apply to BAG?
[13:26:52] It looks like BAG could use another user
[13:32:12] YuviPanda, it's okay how it currently works -- I know you're short on public IPs so keep them for something that really requires them, just cosmetics for me in this case :)
[13:32:25] rillke: :) cool.
[13:37:24] Labs, operations, ops-codfw: Labs: Install the new RAID controller in labstore2002 and test - https://phabricator.wikimedia.org/T103267#1400393 (coren) A note, this was done on 2001 in the end as we tried to debug that one.
[13:44:24] whoever owns openid-wiki2.openid.eqiad.wmflabs they are out of inodes on / for some time
[13:44:25]
[14:15:56] is the maps project still needing attention after the NFS Outage? I'm unable to ssh into maps-warper from bastion. Should I wait or file a ticket?
[14:17:12] chippy: I’ll look at that instance in a moment.
[14:17:36] many thanks. It's not urgent, but the webserver appears running and is ping able :)
[14:17:42] appreciate it
[14:17:51] andrewbogott: hey!
[14:18:04] 'morning!
[14:18:06] andrewbogott: moritz filed a bug earlier about new instance creation for precise being broken due to cloud-init issues...
[14:18:09] let me find bug
[14:18:13] crazy few weeks :(
[14:18:41] https://phabricator.wikimedia.org/T103808
[14:20:44] chippy: mind if I reboot that instance to see if it cheers up?
[14:21:03] andrewbogott, go ahead :)
[14:22:19] Labs, Mathoid: Allocate an IP for math.math.eqiad.wmflabs - https://phabricator.wikimedia.org/T103853#1400463 (mobrovac) NEW a:yuvipanda
[14:22:23] YuviPanda: ^^
[14:22:52] mobrovac: cool, let me do that now
[14:23:03] <3
[14:23:38] mobrovac: done
[14:23:40] Labs, Mathoid: Allocate an IP for math.math.eqiad.wmflabs - https://phabricator.wikimedia.org/T103853#1400476 (yuvipanda) Granted the math project one floating IP! You can allocate it via 'manage addresses' from the wikitech sidebar.
[14:23:55] YuviPanda: grazie!
[14:24:17] Labs, Mathoid: Allocate an IP for math.math.eqiad.wmflabs - https://phabricator.wikimedia.org/T103853#1400477 (yuvipanda) Open>Resolved
[14:33:16] repeat from earlier: whoever owns openid-wiki2.openid.eqiad.wmflabs they are out of inodes on / for some time
[14:33:29] YuviPanda: newbie question: do i need to kill the proxying before allocating the IP or it makes no difference?
[14:33:36] mobrovac: easiest way, yeah
[14:33:38] Labs, Discovery, Maps: *.tiles.wmflabs.org should be served over https - https://phabricator.wikimedia.org/T103801#1400481 (scfc) For me, http://a.tiles.wmflabs.org/osm-no-labels/12/2105/1346.png gives: > Internal Server Error > The server encountered an internal error or misconfiguration and was unab...
[14:33:40] kk
[14:36:13] chippy: that instance’s disk is full. I’m cleaning up a bit.
[14:37:08] uhh....
[14:37:10] Konsole output ssh -l root trusty-medium-1434577920.contintcloud.eqiad.wmflabs
[14:37:11] Last login: Thu Jun 18 14:22:16 2015 from bastion-restricted-01.bastion.eqiad.wmflabs
[14:37:11] root@puppet-jmm-mathoid:
[14:37:11]
[14:37:14] wtf
[14:37:54] andrewbogott, thanks, in the past the logs and tmp files were able to be cleaned a bit. (I intend to get a larger /srv partition via labs::lvm::srv )
[14:40:00] is there a known ip or hostname issue over on labs? (see above)
[14:41:07] apergos: it works for me. The bastions were rebuilt, is it possible you have an ip hardcoded in /etc/hosts?
[14:41:20] Labs: Disable NFS from performance project - https://phabricator.wikimedia.org/T103824#1400488 (Negative24) Mostly unused because its mostly automatic (testing suites). But yes it doesn't use NFS.
[14:42:04] in what /etc/hosts?
[14:42:34] chippy: try now?
[14:42:45] apergos: wait, I missed a bit. Hang on...
[14:43:10] k
[14:43:18] hm, ok, I see the same thing as you :(
[14:43:19] I’ll look
[14:45:15] andrewbogott, many thanks! I have successfully logged into the server now
[14:45:37] Labs: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1400490 (Andrew) This works for me (and I've been testing it a lot lately due to creating new images.) Is there a hiera setup for that project that may be messing with things?
[14:45:42] chippy: you’ll need to free up space still, I only barely made enough
[14:45:49] yeah I know :/
[14:46:07] thanks!
[14:46:29] andrewbogott, would setting labs::lvm::srv on the configure instance work?
[14:46:58] chippy: that will make you a /srv partition with some additional space, depending on what instance flavor you created.
[14:48:15] How to shut on an instance that was shut off?
[14:48:16] hmm, would it be suitable for database files?
[14:50:25] rillke: let me know and I can turn it on?
[14:50:27] chippy: yes, it should be.
[14:50:34] depending on how much space you need.
[14:53:37] Labs, Database: Add Wikipedia Northern Luri and Wikipedia Goan Konkani to labs replicas - https://phabricator.wikimedia.org/T102647#1400498 (Krenair)
[14:53:40] Labs, Labs-Infrastructure: add domain alias gomwiki.labsdb and lrcwiki.labsdb for s3.labsdb - https://phabricator.wikimedia.org/T103794#1400497 (Krenair)
[14:54:10] Labs, Labs-Infrastructure: add domain alias gomwiki.labsdb and lrcwiki.labsdb for s3.labsdb - https://phabricator.wikimedia.org/T103794#1399303 (Krenair) They'll need to be added to the replicas before there's any point in doing this, see {T102647}.
[14:55:11] apergos: does trusty-medium-1434577920.contintcloud.eqiad.wmflabs still exist? it looks to me like it’s been deleted.
[14:55:20] I don't know
[14:55:23] chippy: sure, it would be a local volume
[14:55:32] it did a couple days ago I guess
[14:55:35] apergos: ok. I think it was deleted, the ip was reclaimed.
[14:55:49] that's fine but dns should fail to resolve that sucker
[14:55:49] apergos: yeah, contintcloud has constantly rebuilding instances.
[14:55:51] I don’t know why the dns entry didn’t get cleaned up. I’ll investigate that soon.
[14:55:59] that would be awesome :-)
[14:56:25] Labs, operations, ToolLabs-Goals-Q4: Investigate kernel issues on labvirt** hosts - https://phabricator.wikimedia.org/T99738#1400512 (yuvipanda) What next for this? We put some hosts on it and run a suspend resume loop?
[14:56:41] thanks YuviPanda, andrewbogott :)
[14:58:34] apergos: yeah, we’re leaking dns entries all over the place :(
[15:00:47] Labs, operations, ToolLabs-Goals-Q4: Investigate kernel issues on labvirt** hosts - https://phabricator.wikimedia.org/T99738#1400545 (MoritzMuehlenhoff) I think so. That should show fairly reliable whether the problem still exists (the previous crashes were caused by the restarts after the VENOM securit...
[15:01:34] Labs: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1400547 (Andrew) My test was wrong; I'm able to see the issue now.
[15:01:39] eewww
[15:01:50] Labs: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1400548 (Andrew) a:Andrew
[15:01:55] ok well I look forward to having that be happier
[15:02:44] YuviPanda: any idea if the 1434363699 names in the contintcloud names are monotonic or have a date stamp in them or something?
[15:02:50] I’m trying to tell when this problem started and when it ended
[15:02:57] andrewbogott: not sure...
[15:03:40] andrewbogott: looks like a unix timestamp?
[15:03:52] oh yeah, probably.
[15:03:53] ok...
[15:03:59] 1434363699 Is equivalent to: 06/15/2015 @ 10:21am (UTC)
[15:08:14] valhallasw: you’re right, thanks
[15:11:52] apergos: ok, I fixed some exception handling on the 17th, and the last leaked instances are on the 17th. So probably this issue is resolved and I just need to clean up.
[15:12:01] * andrewbogott ponders how to detect leaks
[15:13:15] hmm, how would I get back the mount points for /home/glusterdata and /data/project/glusterdata ?
[15:13:46] (PS1) Sitic: Fix multiple-select issue in Firefox [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/220768
[15:13:56] Labs, Tool-Labs, MediaWiki-extensions-SemanticForms, MediaWiki-extensions-SpamBlacklist, and 3 others: "Invalid or virtual namespace -1 given" when submitting access request - https://phabricator.wikimedia.org/T103653#1400665 (Krenair) Open>Resolved And also deployed to wikitech. Thanks for...
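Editor's note: the 15:02-15:08 exchange reads the numeric suffix of a contintcloud instance name as a Unix timestamp. A minimal sketch of that conversion follows; it assumes the suffix is always a ten-digit epoch value directly before the first dot, which the log suggests but does not confirm, and the function name is invented.

```python
import re
from datetime import datetime, timezone


def instance_created_at(hostname):
    """Parse a trailing ten-digit Unix timestamp from the first hostname label, if present."""
    # Assumption: names look like "<image>-<size>-<epoch>.<project>.eqiad.wmflabs".
    match = re.search(r"-(\d{10})(?:\.|$)", hostname)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


# Hostname taken from the log above.
print(instance_created_at("trusty-medium-1434577920.contintcloud.eqiad.wmflabs"))
```

Sorting leaked names by this value is one way to bracket when the problem started and ended, as andrewbogott was trying to do.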
[15:14:01] (CR) Sitic: [C: 2 V: 2] Fix multiple-select issue in Firefox [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/220768 (owner: Sitic)
[15:14:05] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Alex Monk was created, changed by Alex Monk link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Alex_Monk edit summary: Created page with "{{Tools Access Request |Justification=test |Completed=false |User Name=Alex Monk }}"
[15:14:11] that's much better than it could be
[15:14:28] chippy: I think the data from the maps project still hasn't been recovered. Coren would know more.
[15:14:51] YuviPanda, okay, no worries :)
[15:14:57] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Alex Monk was modified, changed by Alex Monk link https://wikitech.wikimedia.org/w/index.php?diff=167855 edit summary:
[15:15:25] chippy: It hasn't yet - we're out of space - but once we manage to shuffle the old broken fs away we'll be able to restore you. Sorry that you guys are the only ones left not fully back up; but as the biggest project you were the designated victim. :-(
[15:16:00] Coren, it's no problem, thanks to all you guys for all your hard work.
[15:16:17] (I just do the historical maps bit)
[15:53:19] Labs, Discovery, Maps: Replacements for a.toolserver.org, b.toolserver.org, c.toolserver.org not available - https://phabricator.wikimedia.org/T103272#1400820 (scfc)
[16:06:06] apergos: I cleaned up all the leaked entries. Please let me know if you see something like this again.
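Editor's note: on the earlier question of how to detect leaked DNS entries, one simple approach is a set difference between the names that still resolve and the instances that still exist. The sketch below is hypothetical: the log does not say how the cleanup was actually done, and both input lists would have to come from whatever DNS and instance inventories are available. The second hostname is invented.

```python
def leaked_dns_names(dns_names, live_instances):
    """Return DNS names that no longer correspond to any live instance, sorted."""
    return sorted(set(dns_names) - set(live_instances))


# Illustrative data: the first name appears in the log, the second is invented.
dns_names = [
    "trusty-medium-1434577920.contintcloud.eqiad.wmflabs",
    "example-live-instance.contintcloud.eqiad.wmflabs",
]
live_instances = ["example-live-instance.contintcloud.eqiad.wmflabs"]

print(leaked_dns_names(dns_names, live_instances))
```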
[16:06:17] thanks, I'll probably find out tomorrow
[16:13:37] Labs, Labs-Infrastructure: do not add wiki info to meta_p.wiki if they do not exist on labs dbs - https://phabricator.wikimedia.org/T103863#1401037 (Merl) NEW a:coren
[16:14:04] apergos: I've cleaned up inodes on the openid instance
[16:14:27] ah that's awesome
[16:14:37] I think I have one more of those but I'll report it tomorrow
[16:14:44] where were those? I didn't see an obvious directory
[16:14:55] I didn't hunt that long but still
[16:16:14] Labs, Labs-Infrastructure, Labs-Sprint-103: designate-sink not cleaning up all entries properly - https://phabricator.wikimedia.org/T103855#1401053 (Andrew)
[16:16:30] Labs, Labs-Infrastructure, Labs-Sprint-103: designate-sink not cleaning up all entries properly - https://phabricator.wikimedia.org/T103855#1401054 (Andrew) Open>Resolved ok, all duplicates purged.
[16:17:57] apergos: old linux kernel sources
[16:17:59] apergos: /usr/src
[16:18:12] ahhh
[16:18:16] gifti: I'm removing NFS from the dwl project. That ok?
[16:18:21] gifti: no downtime or anything.
[16:19:23] Labs: Kill NFS in scrumbugz project - https://phabricator.wikimedia.org/T102704#1401057 (yuvipanda) phab08 isn't recoverable yes. They are all on the same network though, the ip address doesn't make a difference in this case :( Can this bug be closed now?
[16:19:58] Labs, Labs-Sprint-102, Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1401062 (yuvipanda)
[16:20:01] Labs: Investigate if NFS is needed on the language project - https://phabricator.wikimedia.org/T103130#1401060 (yuvipanda) Open>Resolved Alright. re-open if someone needs it? :)
[16:20:30] Labs, Labs-Sprint-102, Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1360119 (yuvipanda)
[16:20:32] Labs: Remove NFS from project chasetest - https://phabricator.wikimedia.org/T102380#1401063 (yuvipanda) Open>Resolved a:yuvipanda Done
[16:22:12] Labs: Disable NFS for dwl project - https://phabricator.wikimedia.org/T103864#1401070 (yuvipanda) NEW a:yuvipanda
[16:22:35] Labs: Remove NFS from fundraising project - https://phabricator.wikimedia.org/T103865#1401079 (yuvipanda) NEW a:yuvipanda
[16:23:39] YuviPanda: can I get an assist? I seem to be locked out of staging-palladium :(
[16:23:45] thcipriani: looking
[16:23:49] thanks
[16:24:17] thcipriani: can you attempt to ssh again?
[16:24:54] YuviPanda: just gave it a shot
[16:25:11] still getting Permission denied (publickey).
[16:25:20] yeah I see it
[16:26:04] Labs: Remove NFS from fundraising project - https://phabricator.wikimedia.org/T103865#1401101 (cwdent) you can nuke mine!
[16:26:25] thcipriani: Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find data item role::deployment::repo_config in any Hiera data file and no default supplied at /etc/puppet/manifests/role/deployment.pp:4 on node staging-palladium.staging.eqiad.wmflabs
[16:26:25] Warning: Not using cache on failed catalog
[16:26:25] Error: Could not retrieve catalog; skipping run
[16:26:33] thcipriani: so puppet not running causing ssh failures...
[16:26:53] ah, the repo config was moved to hiera
[16:27:14] thcipriani: trivial to replicate for staging?
[16:27:37] YuviPanda: not a problem, I can add something to our hiera config
[16:28:33] should work fine now
[16:29:00] need to patch labs.yaml...
[16:29:51] Labs, Labs-Infrastructure: do not add wiki info to meta_p.wiki if they do not exist on labs dbs - https://phabricator.wikimedia.org/T103863#1401125 (coren) The process that updates the metadata uses the global mediawiki-config repo as its source of authority, and I don't believe that there is a way of kno...
[16:30:34] thcipriani: ok, checking
[16:32:11] thcipriani: try now
[16:32:27] YuviPanda: bingo! Thanks for your help!
[16:32:34] thcipriani: yw!
[16:47:53] Hello, is actually something wrong with tools? tools.wmflabs.org loads and loads, also my login into tool labs
[16:47:54] Luke081515: temporary glitch, should go away soon
[16:47:55] ok, thank you
[16:47:55] YuviPanda: do you know what’s happening?
[16:47:55] andrewbogott: yes
[16:49:09] YuviPanda: Please tell me: What do you guess, how long would it take, till I can login?
[16:49:38] Luke081515: now
[16:49:42] Luke081515: all back?
[16:49:53] ok, thanks for your work, I try it
[16:50:31] YuviPanda: Yes, it works, thank you
[16:50:39] yw
[16:51:38] Labs: Remove NFS from fundraising project - https://phabricator.wikimedia.org/T103865#1401210 (awight) Can't see what of mine is in there, but it can't be important. No need to preserve.
[16:52:39] Labs, Tool-Labs: Toolslabs broken - https://phabricator.wikimedia.org/T103869#1401259 (Bugreporter) NEW
[16:55:22] Labs, Incident-20150617-LabsNFSOutage, Labs-Sprint-102, Labs-Sprint-103: Labs: rewrite remaining labstore* scripts - https://phabricator.wikimedia.org/T102520#1401302 (greg)
[16:55:39] Labs, Tool-Labs: Toolslabs broken - https://phabricator.wikimedia.org/T103869#1401304 (yuvipanda) Open>Resolved a:yuvipanda Yes, has been fixed - there was a 12 minute outage that's been recovered from. More details to labs@ soon.
[16:55:41] Labs, Incident-20150617-LabsNFSOutage, Labs-Q4-Sprint-2, Labs-Sprint-100, and 3 others: Disable LDAP and enable admin puppet module on labstore100[12] - https://phabricator.wikimedia.org/T95559#1401307 (greg)
[16:55:53] Labs, Incident-20150617-LabsNFSOutage, Labs-Sprint-102, Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1401310 (greg)
[17:07:02] Labs, Patch-For-Review: Make a labstore module - https://phabricator.wikimedia.org/T93781#1401345 (coren) Open>Resolved This is now the 'labstore' module.
[17:12:03] YuviSheep: whats the plan for nfs when T102240 is resolved?
[17:12:34] Negative24: it's opt in - there'll be a short process where you say why you want NFS, and we enable it for you
[17:12:38] Negative24: similar to getting a floating IP now
[17:16:04] YuviSheep: so the idea is that nfs minimalism should at least help to fix a future issue with nfs?
[17:16:19] hey, my upload session died!
[17:16:22] Negative24: yup. and we'll identify the actual problems people are using NFS for and find more appropriate solutions
[17:16:30] * matanya is looking for Yuvi
[17:16:32] matanya: there was a small NFS outage, I just emailed about it
[17:16:34] sorry about that
[17:16:48] * matanya is looking
[17:17:00] heh
[17:17:06] yuvi, we love you!
[17:17:12] matanya: just emailed
[17:17:26] i want to see one sysadmin it never happened to him
[17:18:02] Labs, Labs-Infrastructure, Labs-Sprint-103: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1401395 (Andrew)
[17:18:54] YuviSheep: you are one of the best sysadmins I've ever seen
[17:19:35] matanya: :)
[17:19:37] Negative24: :D <3
[17:20:08] I'm going to go take a break now before I reboot any more machines :) :(
[17:20:10] bye!
[17:23:56] Labs, MediaWiki-extensions-OpenStackManager, Labs-Vagrant, MediaWiki-Vagrant, wikitech.wikimedia.org: Create Vagrant role for Extension:OpenStackManager - https://phabricator.wikimedia.org/T103874#1401408 (scfc) NEW
[17:24:30] sigh... that'd mean I'd need to set up vagrant
[17:28:10] Labs, MediaWiki-extensions-OpenStackManager, Labs-Vagrant, MediaWiki-Vagrant, wikitech.wikimedia.org: Update Vagrant role for Extension:OpenStackManager - https://phabricator.wikimedia.org/T103874#1401447 (scfc)
[17:30:10] Labs, Incident-20150617-LabsNFSOutage, Labs-Sprint-103: Labs: Salvage, then remove volumes on labstores' raid6 - https://phabricator.wikimedia.org/T103265#1385936 (coren) All but one user request has been satisfied, but we need to keep the broken filesystem around until we are done with an ongoing copy...
[17:30:14] Labs, Labs-Sprint-103, ToolLabs-Goals-Q4: virt1000 SPOF - https://phabricator.wikimedia.org/T90625#1401472 (Andrew)
[17:30:17] Labs, operations, ops-eqiad, Patch-For-Review, ToolLabs-Goals-Q4: Rename virt1000 to labcontrol1002, move to same subnet as labcontrol1001 - https://phabricator.wikimedia.org/T102646#1401470 (Andrew) Open>Resolved Up and puppetized and happy.
[17:30:23] Labs, Incident-20150617-LabsNFSOutage, Labs-Sprint-103: Labs: increase size of the volume for the maps project and restore - https://phabricator.wikimedia.org/T103358#1401475 (coren)
[17:30:26] Labs, Incident-20150617-LabsNFSOutage, Labs-Sprint-103: Labs: Salvage, then remove volumes on labstores' raid6 - https://phabricator.wikimedia.org/T103265#1401474 (coren) Open>stalled
[17:33:12] Labs, operations, ops-eqiad, Patch-For-Review, ToolLabs-Goals-Q4: Rename virt1000 to labcontrol1002, move to same subnet as labcontrol1001 - https://phabricator.wikimedia.org/T102646#1401488 (Cmjohnson) The disks were raided together. I delete the array and the install worked fine. I did not ad...
[17:34:06] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-103: Labs: Make a new backup of the Labs storage to codfw - https://phabricator.wikimedia.org/T103356#1387805 (10coren) This is now in progress: * The broken fs (/mnt/broken/project) is currently being copied to labstore2001:/srv/backup * A snapshot of t... [17:35:20] 6Labs, 6Discovery, 10Maps: Replacements for a.toolserver.org, b.toolserver.org, c.toolserver.org not available - https://phabricator.wikimedia.org/T103272#1401498 (10coren) [17:35:23] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-103: Labs: increase size of the volume for the maps project and restore - https://phabricator.wikimedia.org/T103358#1401494 (10coren) 5Open>3stalled This needs T103265 completed so that we actually have enough disk space to do. [17:46:00] petan: You around? [17:47:47] 6Labs, 3Labs-Sprint-103, 5Patch-For-Review, 3ToolLabs-Goals-Q4: Set up labcontrol1002 as hot spare for labcontrol1001. - https://phabricator.wikimedia.org/T103722#1401528 (10Andrew) This should be all set. I'm going to verify that the glance image sync is working properly and then close this. [17:48:24] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-103: Recover /data/project/home/botbot on bots project - https://phabricator.wikimedia.org/T103696#1401531 (10coren) There are many git repos in the directory, but: project/home/botbot/botbot/src has 14M and is reasonable to inspect and restore. [17:54:12] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-103: Recover /data/project/home/botbot on bots project - https://phabricator.wikimedia.org/T103696#1401552 (10coren) (And yes, I did mean 300M) [18:23:53] hello! I am currently unable to open the crontab, I've tried on two different tools. Any idea what's going on? [18:29:15] Coren: ^ [18:29:17] MusikAnimal: Same problem here [18:29:27] Tools-submit needs a kick? [18:29:32] valhallasw: ^ [18:29:37] First I hear of it, but I'll look. 
[18:29:41] $ crontab -l; sleep forever [18:29:58] That does sound like tools-submit needing a reboot [18:30:16] YuviSheep|zzz: I thought you were sleeping? If not, https://gerrit.wikimedia.org/r/#/c/220134/ will bore you to sleep (and allow me to close a ticket) :-) [18:30:38] I am on my phone now [18:30:46] 'sok. :-) [18:30:55] tools-submit is ill indeed. [18:30:58] I'll take a look later tonight or first thing tomorrow [18:32:09] Entire userspace is dead. Rebooting. [18:32:17] 6Labs, 10Labs-Infrastructure, 3Labs-Sprint-103: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1401659 (10Andrew) ok, I ran yet another test, and I see ImportError: No module named cc_power_state_change in the syslog but I can also log in. So, I think this is interesti... [18:33:28] Hm. Very dead. Rebooting at the low level [18:34:46] MusikAnimal: That work it up. Working better now? [18:35:04] Coren: yep! thank you :) [18:40:56] 6Labs, 10Labs-Infrastructure, 3Labs-Sprint-103: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1401691 (10MoritzMuehlenhoff) You're right. I can in fact now log into that system. This morning I waited something between 5-10 minutes before I reported it (and Yuvi advised... [18:41:35] 6Labs, 10Labs-Infrastructure, 3Labs-Sprint-103: Creating a precise instance on labs fails - https://phabricator.wikimedia.org/T103808#1401695 (10Andrew) thanks for rechecking! I'll rename this ticket. 
[18:42:17] 6Labs, 10Labs-Infrastructure: Precise instances say "ImportError: No module named cc_power_state_change" on startup - https://phabricator.wikimedia.org/T103808#1401698 (10Andrew) p:5High>3Normal [18:51:40] 6Labs, 3Labs-Sprint-103, 3ToolLabs-Goals-Q4: virt1000 SPOF - https://phabricator.wikimedia.org/T90625#1401749 (10Andrew) 5Open>3Resolved [19:16:15] 6Labs, 6Discovery, 10Maps: *.tiles.wmflabs.org should be served over https - https://phabricator.wikimedia.org/T103801#1401832 (10valhallasw) The user on nlwiki uses Firefox, but I could reproduce it with FF this morning. However, the issue might rather lie with the server error noted above. [19:22:18] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-103: Recover /data/project/home/botbot on bots project - https://phabricator.wikimedia.org/T103696#1401860 (10coren) I've extracted all the non-.git files from the repos that had changes in the tree; excluding the .pyc files which I could not manually in... [19:23:03] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-103: Labs: Salvage, then remove volumes on labstores' raid6 - https://phabricator.wikimedia.org/T103265#1401865 (10coren) [19:23:05] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-103: Recover /data/project/home/botbot on bots project - https://phabricator.wikimedia.org/T103696#1401864 (10coren) 5Open>3Resolved [19:23:30] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-103: Labs: Salvage, then remove volumes on labstores' raid6 - https://phabricator.wikimedia.org/T103265#1385936 (10coren) This is now only pending on the copy to labstore2001 (in progress, from labstore1002) [19:27:13] why are fully qualified hostnames now including the project in them? [19:27:32] phab-02.eqiad.wmflabs -> phab-02.phabricator.eqiad.wmflabs [19:31:40] Negative24: That's been in the works for several months, and deployed for a couple weeks now. 
There are a number of reasons, the primary of which is to avoid a number of issues that were caused by the use of EC2 IDs internally for guaranteed-unique hostnames. [19:32:16] Negative24: I'm pretty sure I recall Andrew sending several emails warning about this and explaining it. :-) [19:32:33] ah probably and I didn't pay attention enough [19:32:54] I've seen it for some time, just asking now since my script started sending mismatched errors [19:33:02] Ah. [19:33:44] At any rate, the new names are canonical now (even though the without-project name remains valid for the foreseeable future) [19:35:36] bd808: yt? [19:35:50] ottomata: sup? [19:36:03] I'm testing some eventlogging puppet stuff in beta [19:36:14] i cherry picked some things, and then I wanted to reset them. [19:36:24] so i reset to origin/production, but then realized that there were more cherry picked things :/ [19:36:34] doh [19:36:35] so now I have a bunch of uncommitted changes in the working copy [19:36:40] reflog! [19:36:51] oooo [19:36:58] i can reset to whatever was right before my cherry pick [19:36:59] ja? [19:37:07] should be able to, yes [19:37:22] this is cool! i may have never used reflog [19:37:24] this is useful! [19:37:38] it can be a life saver [19:37:52] until it gets trash collected :/ [19:37:59] hm, there are uncommitted changes that are not mine after the reset [19:38:04] which has happened to me before [19:38:22] for scap [19:38:31] hm and lvs [19:38:39] hmmm... logging in [19:39:23] cherry-picks aren't back on yet [19:39:37] HEAD matches master/production [19:39:46] oh?
[19:40:00] ghm [19:40:02] i reset to 4a1ede2 [19:40:29] oh [19:40:33] hm, maybe I should have done 2bb7996 [19:41:04] 6Labs: Remove NFS from fundraising project - https://phabricator.wikimedia.org/T103865#1401952 (10yuvipanda) 5Open>3Resolved Done [19:41:05] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-102, 3Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1401954 (10yuvipanda) [19:42:31] ottomata: you can look in /var/log/git-sync-upstream.log to see the patches that were cherry-picked [19:42:47] YuviSheep: got into phab-02 and phab-pup. NFS isn't mounted but I didn't run the NFS removal commands. Should I? [19:42:52] you can just put them all back as picks if reflog isn't helping [19:43:03] Negative24: yeah! :) [19:47:42] bd808: like the most recent ones in the list? [19:47:54] the 'local hacks' listed last? [19:48:14] ottomata: yeah, but looks like you got it cleaned up already [19:48:22] i id? [19:48:24] i did? [19:48:31] i didn't! [19:48:36] i got rid of my changes [19:48:41] but i didn't touch the ones I didn't know about [19:48:47] ah [19:49:08] ok, just to be sure [19:49:15] i should: reset to origin/production [19:49:17] and then cherry pick each of the commits listed there? [19:49:19] in order? [19:49:20] YuviSheep: done [19:49:32] ottomata: yeah, that should work great [19:49:35] ok trying [19:49:44] you can just `git cherry-pick HASH` for each hash [19:50:43] ok done, that looks good [19:50:46] ja? [19:51:12] 6Labs, 6Phabricator: Get rid of NFS in the phabricator Labs project - https://phabricator.wikimedia.org/T102703#1401978 (10Negative24) Phab-02 and phab-pup are recovered and NFS has been disabled on both. [19:51:37] !log phabricator disabled NFS on remaining instances (phab-02 and phab-pup) [19:51:42] Logged the message, Master [19:52:28] ottomata: yeah looks ok to me I think.
I at least see my cherry-pick in there :) [19:52:40] * bd808 needs to poke people about that one [19:57:44] ok thanks bd808. i'm going to cherry pick mine again. i'm developing this thing [19:57:48] what's the proper process here? [19:57:58] what if I cherry pick my current [19:58:04] patch [19:58:07] and then make a new patch [19:58:19] and want to cherry pick that? [19:58:27] can I just cherry pick the new sha from gerrit? [20:01:12] 6Labs, 10Beta-Cluster, 10Mathoid, 7Shinken: Shinken is showing HTTP 404 warnings for deployment-mathoid/sca02 mathoid services - https://phabricator.wikimedia.org/T103595#1402018 (10hashar) In operations/puppet.git `modules/beta/files/shinken.cfg` define the check as: ``` define service { service_descr... [20:02:48] 6Labs, 10Beta-Cluster, 10Mathoid, 7Shinken: Shinken is showing HTTP 404 warnings for deployment-mathoid/sca02 mathoid services - https://phabricator.wikimedia.org/T103595#1402042 (10Krenair) Huh. Not sure how I missed that service definition in shinken.cfg. Can we make it GET /_info like prod? [20:05:03] 6Labs, 10Beta-Cluster, 10Mathoid, 7Shinken: Shinken is showing HTTP 404 warnings for deployment-mathoid/sca02 mathoid services - https://phabricator.wikimedia.org/T103595#1402059 (10hashar) a:3hashar [20:05:37] 6Labs, 10Beta-Cluster, 10Mathoid, 7Monitoring, 7Shinken: Shinken is showing HTTP 404 warnings for deployment-mathoid/sca02 mathoid services - https://phabricator.wikimedia.org/T103595#1394154 (10hashar) [20:06:09] 6Labs, 10Beta-Cluster, 10Mathoid, 7Monitoring, 7Shinken: Shinken is showing HTTP 404 warnings for deployment-mathoid/sca02 mathoid services - https://phabricator.wikimedia.org/T103595#1394154 (10hashar) p:5Triage>3Normal [20:07:57] ottomata: git rebase --interactive origin/production to remove your old one (or git reset --hard HEAD^ if you are still on top) and then cherry-pick from gerrit again [20:09:21] origin/production? [20:09:41] that won't lose the other cherry picks? 
[20:09:42] that's the main branch for ops/puppet [20:09:48] yes [20:10:10] oh i guess --interactive will let me choose all the cherry-picked commits to keep? [20:10:12] the --interactive will give you a list and you can edit the one you want to drop out of the list [20:10:13] aye [20:10:16] got it [20:10:17] ok [20:10:23] cool [20:10:25] thanks [20:10:28] yw [20:14:50] 6Labs, 6operations, 10wikitech.wikimedia.org, 7HHVM: Move wikitech to HHVM - https://phabricator.wikimedia.org/T98813#1402098 (10Krenair) [20:17:06] 6Labs, 10MediaWiki-extensions-OATHAuth, 10MobileFrontend, 10wikitech.wikimedia.org, 3Reading-Web: MF Special:Login doesn't have a field for 2FA - https://phabricator.wikimedia.org/T103771#1402109 (10Krenair) [20:17:10] 6Labs, 10Wikimedia-Extension-setup, 10wikitech.wikimedia.org, 7Mobile: Install MobileFrontend on wikitech - https://phabricator.wikimedia.org/T87633#1402108 (10Krenair) [20:18:39] 6Labs, 6Project-Creators, 10wikitech.wikimedia.org: Migrate shell access request process to Phabricator - https://phabricator.wikimedia.org/T72627#1402118 (10Krenair) Should we close this in favour of {T97334} ? [20:20:04] 6Labs, 10wikitech.wikimedia.org, 7Documentation: Update shell account name registration instructions - https://phabricator.wikimedia.org/T88092#1402123 (10Krenair) [20:20:24] 6Labs, 6Project-Creators, 10wikitech.wikimedia.org: Migrate shell access request process to Phabricator - https://phabricator.wikimedia.org/T72627#1402125 (10yuvipanda) 5Open>3declined a:3yuvipanda Yes [20:20:28] 6Labs, 10wikitech.wikimedia.org, 7Documentation: Update shell account name registration instructions - https://phabricator.wikimedia.org/T88092#1003882 (10Krenair) It also provides this RT address: `If you already had a SVN account with the same username, ask it to be created manually by sending this informa...
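The git recovery discussed between 19:36 and 20:10 (a reset that dropped local cherry-picks, then getting them back via the reflog or by re-picking each hash in order) can be sketched roughly as follows. This is only an illustration in a throwaway repository: the branch name `production-standin`, the `commit` helper, and the file names are hypothetical stand-ins, not the actual puppet checkout or its `origin/production` branch.

```shell
set -eu

# Throwaway repository (hypothetical), standing in for the real checkout.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.org"

# Hypothetical helper: create one file and commit it.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit upstream
git branch production-standin   # plays the role of origin/production
commit local-hack-1             # plays the role of a cherry-picked patch
commit local-hack-2

# Note the local commits, oldest first, before anything goes wrong.
picks="$(git rev-list --reverse production-standin..HEAD)"

# The accidental reset that drops the local cherry-picks.
git reset -q --hard production-standin

# Option 1: the reflog still remembers where HEAD was before the reset.
git reflog -n 3
git reset -q --hard "HEAD@{1}"

# Option 2: start again from the upstream branch and re-apply each hash in order.
git reset -q --hard production-standin
for hash in $picks; do
  git cherry-pick "$hash" >/dev/null
done

restored="$(git rev-list --count production-standin..HEAD)"
echo "local commits restored: $restored"
```

Either option alone is enough; the sketch runs both only to show that they end in the same state. Reflog entries expire eventually, so the second option (keeping a list of the picked hashes, as the git-sync-upstream log mentioned above does) is the fallback when the reflog no longer has the entry.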
[20:21:01] Hi, question: My tool -> http://tools.wmflabs.org/jembot <- is getting out of webservice every 1 to 3 days, and I have to webservice restart each time. Supposedly this shouldn't be happening because there is a periodic control, and my service.manifest seems ok. Any help about this? [20:21:25] YuviSheep, andrewbogott, are https://phabricator.wikimedia.org/T70391 and https://phabricator.wikimedia.org/T71135 strictly wikitech operational issues? [20:22:36] Krenair: that can be done via the web UI. [20:22:41] I can do it now, actually, hang on... [20:22:56] I'm wondering whether ottomata already did the first one [20:23:04] omg so many classes [20:23:19] eh? [20:23:19] :) [20:23:44] looks like not [20:24:17] ottomata: you should DIY if you still care. https://phabricator.wikimedia.org/T70391 [20:26:36] 6Labs, 10wikitech.wikimedia.org: Remove Puppet class generic::packages::git-core and replace misc::package-builder with role::package::builder::labs - https://phabricator.wikimedia.org/T71135#1402176 (10Andrew) I've removed generic::packages::git-core. I don't see misc::package-builder in the default list. W... [20:27:14] Hi? 
[20:29:41] 6Labs, 10MediaWiki-extensions-OATHAuth, 10MobileFrontend, 10wikitech.wikimedia.org, 3Reading-Web: MF Special:Login doesn't have a field for 2FA - https://phabricator.wikimedia.org/T103771#1402204 (10Florian) I think the best way of resolving this would be to fix: {T74910} [20:30:18] 6Labs, 10wikitech.wikimedia.org, 7Documentation: Update wikitech customised shell account name registration instructions - https://phabricator.wikimedia.org/T88092#1402209 (10Krenair) [20:30:39] oh right, ok [20:30:43] 6Labs, 10Analytics-Cluster, 10wikitech.wikimedia.org: Include role::analytics::hadoop roles in default list of labs puppet groups - https://phabricator.wikimedia.org/T70391#1402211 (10Ottomata) [20:30:46] 6Labs, 10wikitech.wikimedia.org, 7Documentation: Update wikitech customised shell account name registration instructions - https://phabricator.wikimedia.org/T88092#1003882 (10Krenair) That text does not appear in the default version of the message, caused by a local administrator: https://wikitech.wikimedia.... [20:32:23] 6Labs, 10MediaWiki-extensions-OATHAuth, 10MobileFrontend, 10wikitech.wikimedia.org, 3Reading-Web: MF Special:Login doesn't have a field for 2FA - https://phabricator.wikimedia.org/T103771#1402220 (10Jdlrobson) @florian won't https://gerrit.wikimedia.org/r/#/c/219754/ fix this? 
[20:36:37] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Nikkimaria was created, changed by Nikkimaria link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Nikkimaria edit summary: Created page with "{{Tools Access Request |Justification=Building tools related to The Wikipedia Library (see en.wikipedia.org/wiki/WP:TWL) |Completed=false |User Name=Nikkimaria }}" [20:45:57] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Nikkimaria was modified, changed by Merlijn van Deen link https://wikitech.wikimedia.org/w/index.php?diff=167961 edit summary: [20:59:19] 6Labs, 10wikitech.wikimedia.org: Remove Puppet class generic::packages::git-core and replace misc::package-builder with role::package::builder::labs - https://phabricator.wikimedia.org/T71135#1402314 (10scfc) I assume `misc::package-builder` was removed some time around https://gerrit.wikimedia.org/r/#/c/200525/. [20:59:33] 6Labs, 10wikitech.wikimedia.org: Remove Puppet class generic::packages::git-core and replace misc::package-builder with role::package::builder::labs - https://phabricator.wikimedia.org/T71135#1402315 (10scfc) 5Open>3Resolved a:3Andrew [21:09:48] 10Tool-Labs-tools-Other, 10Phragile, 6TCB-Team: Deploy Phragile on tool-labs - https://phabricator.wikimedia.org/T100192#1402363 (10greg) >>! In T100192#1333343, @Tobi_WMDE_SW wrote: > It is deployed on http://phragile.wmflabs.org now but not jet puppetized. I'm closing this task now and there is a separate... [21:24:05] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402388 (10greg) [21:24:10] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402430 (10Paladox) [21:24:28] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402433 (10greg) >>! 
In T103918#1402403, @Krenair wrote: > This is something that should be in labs, not production. right, I corrected the request. [21:24:31] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402434 (10Paladox) [21:24:43] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402388 (10Paladox) Thanks. [21:28:46] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402456 (10mmodell) don't we already have several test instances? [21:29:05] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402458 (10greg) (Please don't unsubscribe me from tasks) [21:29:38] ^lol [21:29:45] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402466 (10Qgil) For what is worth, anybody with the time can set up a new Phabricator instance in Labs: https://wikitech.wikimedia.org/wiki/Help:Contents + https://www.mediawiki.org/wiki/Phabricator/Code#... [21:32:01] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402480 (10Paladox) @greg sorry i was editing the description when you was so it was using previous version of description not the one you added so it removed you. [21:35:10] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402496 (10greg) :) [21:36:32] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402500 (10mmodell) @paladox: phab-03 is now running the redesign branch [21:39:07] 6Labs, 6Phabricator: Create another instance for phabricator - https://phabricator.wikimedia.org/T103918#1402505 (10Paladox) Thanks. seems the top bar is the same color. 
[22:03:06] twentyafterfour: due to ^ phab-03 is broken due to a rollback of phabricator versions to a version without spaces [22:08:01] (03PS1) 10Sitic: Fix loading spinner [labs/tools/crosswatch] - 10https://gerrit.wikimedia.org/r/220988 [22:08:15] (03CR) 10Sitic: [C: 032 V: 032] Fix loading spinner [labs/tools/crosswatch] - 10https://gerrit.wikimedia.org/r/220988 (owner: 10Sitic) [22:26:28] 6Labs, 5Patch-For-Review: Make a fact for project_id on labs instances - https://phabricator.wikimedia.org/T93684#1402717 (10Andrew) Note that the attached patch will only work once the metadata has been updated for legacy instances. [22:31:36] 6Labs, 6Collaboration-Team: Allow wayback machine to crawl flow-tests.wmflabs.org - https://phabricator.wikimedia.org/T93221#1402780 (10Mattflaschen) a:5Mattflaschen>3None [22:43:50] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-102, 3Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1402848 (10yuvipanda) [22:43:52] 6Labs: Kill NFS in scrumbugz project - https://phabricator.wikimedia.org/T102704#1402847 (10yuvipanda) 5Open>3Resolved [22:47:07] 6Labs, 10Incident-20150617-LabsNFSOutage, 3Labs-Sprint-102, 3Labs-Sprint-103: Audit projects' use of NFS, and remove it where not necessary - https://phabricator.wikimedia.org/T102240#1402856 (10yuvipanda) [22:47:10] 6Labs: Get rid of NFS on 'nginx' project - https://phabricator.wikimedia.org/T102696#1402855 (10yuvipanda) 5Open>3Resolved [23:00:51] 10Wikibugs: Wikibugs should not report notify channels for "added a commit" events - https://phabricator.wikimedia.org/T103929#1402894 (10Krinkle) 3NEW [23:01:36] 10Wikibugs: Wikibugs should not notify channels for "added a commit" events - https://phabricator.wikimedia.org/T103929#1402906 (10Krinkle) [23:06:46] (Second try) Hi, question: My tool -> http://tools.wmflabs.org/jembot <- is currently and every 1 to 3 days offline, and I have to webservice restart each time. 
Supposedly this shouldn't be happening because there is a periodic control, and my service.manifest seems ok. Any help about this? [23:21:00] jem: How does the tool operate? is the source code somewhere [23:21:05] Any errors in the error.log? [23:24:00] Krinkle: It's just a simple web page [23:24:24] jem: There are no "simple" web pages [23:24:41] Can you paste the index page at https://gist.github.com/ ? [23:26:29] Krinkle: I can, but I don't understand [23:26:44] This is happening with all the pages under jembot/ [23:26:51] jem: Are they plain .html files? [23:27:16] They have basic php [23:27:22] But I can try a hello world html [23:27:31] OK. So we're talking about an application and executed code. [23:27:31] So we can be sure [23:27:43] Try creating test.html and access that [23:29:58] Hmmm, hello world works [23:30:18] http://tools.wmflabs.org/jembot/hello.html [23:30:28] Now I'm even more confused [23:30:50] jem: So there's something in the PHP application that is causing a timeout. [23:31:20] try copying your index file to something else.php and remove things until it works [23:31:30] or some other debug mechanism you feel comfortable with [23:32:18] Yes, I guess that's the way [23:32:41] But it's strange that after a webservice restart everything seems perfect [23:34:21] 6Labs, 10Deployment-Systems, 10wikitech.wikimedia.org, 5Patch-For-Review: Merge as many configuration hacks in wikitech.php configuration file as possible into InitialiseSettings.php - https://phabricator.wikimedia.org/T75939#1403089 (10Krenair) (Re-did the above in https://gerrit.wikimedia.org/r/#/c/22084... [23:34:35] Ok, with just a print from PHP it breaks [23:34:42] http://tools.wmflabs.org/jembot/hello.php [23:34:56] Hello

Hello, world

[23:36:34] Any idea, Krinkle? [23:37:33] jem: That suggests the 'webservice' you started is overloaded still with other requests. [23:37:42] so does connecting with a SOCKS5 proxy to bastion still work? [23:37:48] Try disabling your other php files, then stop the webservice and start anew. And then access hello.php [23:38:01] to connect to other instances [23:38:48] Krinkle: What do you mean by "disabling"? [23:39:59] jem: delete them, or rename them to something other people aren't using [23:40:17] e.g. $ mv public_html public_html2; mkdir public_html; edit public_html/hello.php [23:40:25] Ok [23:42:18] But I already know that after a restart it works, so basically I have to test what in my code creates an overload [23:44:47] 6Labs, 10Deployment-Systems, 10wikitech.wikimedia.org, 5Patch-For-Review: Merge as many configuration hacks in wikitech.php configuration file as possible into InitialiseSettings.php - https://phabricator.wikimedia.org/T75939#1403128 (10Krenair) Only other thing that stands out as needing to go is $wgOpenS... 
[23:45:01] 6Labs, 10Deployment-Systems, 10wikitech.wikimedia.org: Merge as many configuration hacks in wikitech.php configuration file as possible into InitialiseSettings.php - https://phabricator.wikimedia.org/T75939#1403129 (10Krenair) [23:45:19] Ok, thanks for the help, Krinkle; it's time to sleep, I'll see tomorrow [23:45:41] jem: Yeah, exactly [23:45:49] but you have to investigate that while it is not already overloaded [23:45:53] after a restart [23:45:58] Ok [23:46:13] because as we saw, a simple hello.php times out even [23:46:18] when it is overloaded [23:48:31] 6Labs, 10Tool-Labs, 7Epic: Convert all Labs tools to use cdnjs for static resources - https://phabricator.wikimedia.org/T103934#1403141 (10Ricordisamoa) 3NEW [23:49:34] 6Labs, 10Tool-Labs, 7Epic: Convert all Labs tools to use cdnjs for static resources - https://phabricator.wikimedia.org/T103934#1403141 (10Ricordisamoa) [23:49:35] 6Labs, 10Tool-Labs: Provide usage statistics for the cdnjs mirror - https://phabricator.wikimedia.org/T103072#1403157 (10Ricordisamoa) [23:55:44] 6Labs, 10Tool-Labs, 7Epic: Convert all Labs tools to use cdnjs for static resources - https://phabricator.wikimedia.org/T103934#1403186 (10Ricordisamoa)
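The isolation step suggested at 23:37 to 23:40 (move the existing pages aside, serve only a minimal hello.php, then restart the webservice) might look something like this. It is a sketch under assumptions: `tool_home` is a temporary directory standing in for the tool account's home, the placeholder index.php is invented, and the `webservice` commands are left as comments because they only exist on Tool Labs.

```shell
set -eu

# Hypothetical stand-in for the tool's home directory.
tool_home="$(mktemp -d)"
mkdir "$tool_home/public_html"
# Placeholder for the existing application (invented for this sketch).
echo '<?php /* existing application */ ?>' > "$tool_home/public_html/index.php"

cd "$tool_home"

# Move the existing application out of the way and start with an empty docroot.
mv public_html public_html2
mkdir public_html

# Minimal test page, so nothing else can generate load.
cat > public_html/hello.php <<'EOF'
<?php print "Hello, world"; ?>
EOF

# On Tool Labs one would now stop and start the webservice (not run here):
#   webservice stop
#   webservice start
# and then request hello.php before anything else has been served.

ls public_html public_html2
```

If hello.php responds right after the restart but times out later, that points at load building up from the other pages; files can then be moved back from public_html2 one at a time to find the one responsible.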