[00:00:14] how about the actual deployment. should we also remove that with it? [00:00:38] just thinking that you may want deployment-prep along with deployment [00:06:15] you're talking about the production server group mutante? [00:06:39] yes, production deployment [00:06:52] just looked at that "restricted" group too [00:07:05] 10Wikimedia-Labs-General: Labs infrastructure work - https://phabricator.wikimedia.org/T41784#2450936 (10Danny_B) [00:07:26] robla: Removed RobLa from deployment-prep. [00:09:40] 10Wikimedia-Labs-General: Labs infrastructure work - https://phabricator.wikimedia.org/T41784#434871 (10Dzahn) I think we should probably not tickets years after they have been resolved. [00:10:03] mutante: I'll think about the production one; may need to ask me again later. [00:10:52] robla: *nod* [00:32:09] 06Labs, 10Tool-Labs, 07Regression: uWSGI webservice terminating unexpectedly - https://phabricator.wikimedia.org/T139020#2450974 (10D3r1ck01) @zhuyifei1999, I did all that but still have the same results (web service terminates unexpectedly). [00:47:55] matanya, zhuyifei1999_: I'm going to cycle power on labvirt1012 in a minute, which means encoding02 will be off for a bit. Should be quick. [02:00:49] 06Labs, 10Horizon, 05Continuous-Integration-Scaling: Labs project admin can not delete per project image on Horizon - https://phabricator.wikimedia.org/T110936#1590423 (10AlexMonk-WMF) ```modules/openstack/files/kilo/glance/policy.json: "delete_image": "rule:admin_or_glanceadmin", modules/openstack/files/... [05:53:22] 06Labs, 10Tool-Labs: No permission after creating a new tool - https://phabricator.wikimedia.org/T140004#2451306 (10Dalba) [06:07:31] 06Labs, 10Labs-Other-Projects: video project: move rendering instances to SSD servers - https://phabricator.wikimedia.org/T139802#2451311 (10zhuyifei1999) >>! In T139802#2450156, @Matanya wrote: > cause the stuff is not puppetized, cause puppet on labs suckssss. I'll try to build some manual (simple) puppet... 
[06:09:24] chasemp: yes [06:35:46] (03CR) 10Legoktm: [C: 04-2] "No, we're not going to add more channels like that." [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/292554 (owner: 10Paladox) [06:58:51] 06Labs, 10Labs-Other-Projects: video project: move rendering instances to SSD servers - https://phabricator.wikimedia.org/T139802#2451508 (10zhuyifei1999) @Matanya @Andrew can you apply puppet role `role::labs::lvm::srv` to the instances? Apparently I cannot call puppet modules from `operations/puppet` with c... [07:15:13] 10Wikibugs: Wikibugs links sometimes to the creation event, not to the mentioned comment - https://phabricator.wikimedia.org/T129246#2099604 (10Nikerabbit) Another example which is not linking to the creation event [10:08:43] wikibugs> MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth, Wikimedi... [11:18:14] !log ores aad92ac goes to staging [11:18:17] 10Labs-project-wikistats, 10Analytics, 10Analytics-Wikistats: Design new UI for Wikistats 2.0 - https://phabricator.wikimedia.org/T140000#2452139 (10Danny_B) [11:18:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master [11:18:41] tom29739 how's uwsgi on k8s so far? [11:18:50] I'm writing a patch that'll make webservice restarts much faster... [11:19:43] yuvipanda, it's working well. [11:20:04] pip should be much faster as well than with gridengine [11:25:22] !log ores deploying aad92ac to web and worker nodes [11:25:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master [11:25:27] musikanimal Matthew_ around? 
I want to see if we can move xtools-ec to k8s [11:26:55] Hi everyone, I have an old and easy question, originally for Coren [11:27:32] The http://tools.wmflabs.org/?tool=xxx information uses .description but should use toolinfo.json [11:27:44] (In order to avoid duplicating the same info) :) [11:28:31] there's a bug for it somewhere, but I think it's unlikely to get fixed anytime soon - nobody has the bandwidth to touch the homepage just now.... [11:28:59] yuvipanda, it also appears to be working faster too (I just did a quick apachebench test): Time per request: 13.112 [ms] (mean) [11:29:11] yuvipanda: was that for me? [11:29:16] jem yup! [11:29:23] Ah, thanks :) [11:29:31] jem: If you're interested in picking it up, https://phabricator.wikimedia.org/diffusion/LTOL/ is the source of the front page [11:29:34] But I'm surprised about the "bandwith" problem [11:29:54] human bandwidth [11:29:59] i.e. time/energy [11:30:03] https://phabricator.wikimedia.org/T115650 is also related [11:30:12] Thanks, valhallasw`vecto [11:30:14] ah yes, what valhallasw said. not network bandwidth [11:30:25] That sounds more like it [11:31:19] Well, I wouldn't mind to help if it's possible [11:31:36] That particular point seems easy to fix [11:33:26] jem, it looks like it's this file: https://phabricator.wikimedia.org/diffusion/LTOL/browse/master/www/content/tool.php [11:33:43] And this line: if ( is_readable( "{$home}/.description" ) ) { [11:33:54] Yes [11:39:35] jem (IRC): https://phabricator.wikimedia.org/rLTOLbde15df2a379c33edfb8350afd2f0c7186705a93 [11:39:56] so I think it /is/ used, but read from the database and synced every now and then? 
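[Editor's note: the front-page logic jem and valhallasw`vecto are discussing lives in PHP (www/content/tool.php); the sketch below is Python, purely to illustrate the proposed precedence — prefer toolinfo.json, fall back to the legacy .description file. The function name and exact behavior are assumptions, not the actual tool.php code.]

```python
import json
from pathlib import Path

def tool_description(home: str) -> str:
    """Sketch only: prefer the description in toolinfo.json, fall back
    to the legacy .description file, return '' if neither is usable."""
    toolinfo = Path(home) / "toolinfo.json"
    if toolinfo.is_file():
        try:
            return json.loads(toolinfo.read_text()).get("description", "")
        except ValueError:
            # malformed JSON: fall through to the legacy file
            pass
    legacy = Path(home) / ".description"
    if legacy.is_file():
        return legacy.read_text().strip()
    return ""
```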
[11:41:30] Hmmm [11:46:28] Ah, yes, it's working :) [11:47:18] Great, removing it from my to-do list [11:47:32] Thanks everyone [11:51:40] (I have another pending task related to OAuth, but let's bother just once a day) :) [12:10:19] 06Labs, 10Labs-project-Phabricator, 13Patch-For-Review, 07Puppet: On labs phabricator references security extension even though it isn't present - https://phabricator.wikimedia.org/T104904#2452335 (10Danny_B) [12:10:43] 06Labs, 10Labs-project-Phabricator: Login to phab-0[124].phabricator.eqiad.wmflabs is broken, even as root - https://phabricator.wikimedia.org/T130693#2452338 (10Danny_B) [12:11:01] 06Labs, 10Labs-project-Phabricator: Upgrade phab-01.wmflabs.org - https://phabricator.wikimedia.org/T127617#2452340 (10Danny_B) [12:11:21] 06Labs, 10Labs-project-Phabricator: https://phab-01.wmflabs.org returns a core exception - https://phabricator.wikimedia.org/T137270#2452344 (10Danny_B) [12:11:43] 06Labs, 10Labs-project-Phabricator: phab-01 and phab-03 to 04 returns a 502 error - https://phabricator.wikimedia.org/T139444#2452347 (10Danny_B) [12:12:10] 10Labs-project-Phabricator: have a phabricator test instance in labs that uses a working puppet role - https://phabricator.wikimedia.org/T139475#2452350 (10Danny_B) [12:12:54] 06Labs, 10Labs-Infrastructure, 10Labs-project-Phabricator: can't log in to phab-01.eqiad.wmflabs - https://phabricator.wikimedia.org/T125666#2452352 (10Danny_B) [12:16:43] 10Labs-project-Phabricator: Phab-02 sending old stylesheet copies - https://phabricator.wikimedia.org/T94413#2452365 (10Danny_B) [12:26:54] 10Labs-project-Phabricator: phab-01.wmflabs.org test instance's statuses are out of date - https://phabricator.wikimedia.org/T76943#2452400 (10Danny_B) [12:27:12] 10Labs-project-Phabricator: Email not working on phab-01.wmflabs.org - https://phabricator.wikimedia.org/T76427#2452401 (10Danny_B) [12:27:47] 10Labs-project-Phabricator: phab-01.wmflabs.org triggers HeraldManiphestTaskAdapter error when commenting - 
https://phabricator.wikimedia.org/T98586#2452404 (10Danny_B) [12:27:58] goddamit danny_b [12:28:02] 10Labs-project-Phabricator: Upgrade phab-01 to use the same version as production Phabricator - https://phabricator.wikimedia.org/T78168#2452405 (10Danny_B) [13:13:36] 10Labs-project-Phabricator: Phabricator on labs has failed cronjob - https://phabricator.wikimedia.org/T1151#2452566 (10Danny_B) [13:17:08] 10Labs-project-Phabricator: Phabricator-Labs project? - https://phabricator.wikimedia.org/T1168#2452589 (10Danny_B) [13:17:18] !log git deleting instance git-redirects-01.git.eqiad.wmflabs (I forgot to do that the other day) [13:17:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL, Master [13:24:08] 10Labs-project-Phabricator: Change phab-03 to 2015 redesign - https://phabricator.wikimedia.org/T103918#2452626 (10Danny_B) [13:28:10] zhuyifei1999_ around? [13:28:23] yeah [13:28:29] I want to walk you through migrating uwsgi to k8s and then use that for writing docs [13:28:34] now a good time? [13:28:49] ok [13:29:39] ok [13:29:43] * zhuyifei1999_ ssh-ing in [13:29:44] so the thing to do is to [13:29:56] 1. webservice --backend=kubernetes python2 shell [13:30:15] ok [13:30:22] 2. create a new venv in say, ~/www/python/venv.new [13:30:29] 3. Install the things you need in here [13:30:47] it says to stop first though [13:31:07] zhuyifei1999_ oh, hmm. is it ok if we stop video2commons for a while while doing this? [13:31:12] if not I can patch webservice to not need that [13:31:48] the tool is being flooded right now [13:31:55] wait [13:33:53] yuvipanda: is there a way to show "this tool in maintenance" while the webservice is down? [13:34:24] unfortunately not really... [13:34:37] but I've just made a patch that removes the requirement to take the gridengine job down [13:34:43] gimme a moment I'll deploy a temp version [13:34:52] ok [13:35:44] zhuyifei1999_ ok, use /tmp/tools/bin/webservice --backend=kubernetes python2 shell? 
[13:36:25] andrewbogott matanya: can you apply the puppet role for /srv to 02 & 03? 01 is severely overloaded [13:36:48] yuvipanda: which bastion? [13:36:59] zhuyifei1999_ tools-login [13:37:53] $ /tmp/tools/bin/webservice --backend=kubernetes python2 shell [13:37:53] Traceback (most recent call last): [13:37:53] File "/tmp/tools/bin/webservice", line 73, in <module> [13:37:53] if 'backend' in tool.manifest and tool.manifest['backend'] != args.backend & args.action != 'shell': [13:37:54] TypeError: unsupported operand type(s) for &: 'str' and 'str' [13:38:21] it should be "and" right? [13:39:21] hmm [13:39:22] yes [13:39:25] I'm an idiot [13:39:32] zhuyifei1999_ try now [13:39:55] 10Tool-Labs-tools-wikibugs-IRC-bot, 10Wikibugs, 06Project-Admins: Merge wikibugs projects - https://phabricator.wikimedia.org/T75765#2452772 (10Danny_B) [13:40:11] ok [13:41:11] the shell is very weird, every time I press up and down the prompt goes up one line o.O [13:41:29] (I mean its location) [13:41:58] right [13:42:09] zhuyifei1999_ type 'stty rows 50 cols 150' [13:42:13] that should fix that [13:42:18] zhuyifei1999_: done [13:42:19] (this is an upstream bug I'm tracking and will deploy a fix once they do) [13:42:28] zhuyifei1999_: (the /srv thing I mean) [13:42:34] yuvipanda: ok [13:42:38] andrewbogott: thx [13:42:50] I'll pool them soon [13:43:38] https://www.irccloud.com/pastebin/tOc88vOM/ [13:43:47] yuvipanda: ^ [13:44:03] should I deactivate virtualenv before doing so? [13:44:41] zhuyifei1999_ you should just ignore the venv that currently exists [13:44:42] yeah [13:44:46] just create a new one [13:44:52] we can move it into place once we verify it works [13:44:57] ok [13:44:58] and yeah, venvs created in trusty don't work on jessie... [13:45:39] ok there's $ deactivate which works pretty well [13:46:14] * yuvipanda nods [13:47:53] wow this pip is fast [13:48:23] yeah [13:50:13] so everything installed, what's the next step? [13:51:36] zhuyifei1999_ make sure app.py loads?
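[Editor's note: the TypeError above is an operator-precedence bug — in Python, `&` binds tighter than `!=` and is not defined for strings, so the check needs the boolean `and` (which is the fix yuvipanda deploys). A minimal reproduction with stand-in values; the variable names are stand-ins for `tool.manifest['backend']`, `args.backend`, and `args.action`, not the real webservice code.]

```python
# Stand-ins for tool.manifest['backend'], args.backend and args.action.
manifest_backend, args_backend, args_action = "gridengine", "kubernetes", "shell"

# Buggy form: '&' binds tighter than '!=', so Python evaluates
# args_backend & args_action first -- bitwise AND of two strings -> TypeError.
try:
    broken = manifest_backend != args_backend & args_action != "shell"
except TypeError as e:
    print("TypeError:", e)

# Fixed form, matching the corrected check: boolean 'and' instead of '&'.
mismatch = manifest_backend != args_backend and args_action != "shell"
print(mismatch)  # False here, because the action is 'shell'
```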
[13:51:40] python app.py [13:51:46] from the python in your new venv [13:51:51] ok [13:52:45] yep [13:52:52] ok [13:52:53] it loads [13:52:57] cool [13:53:03] then just deactivate venv again [13:53:04] and move it [13:53:07] mv venv venv.old [13:53:10] mv venv.new venv [13:53:21] so now we know our venv works we're just moving the old one away... [13:53:29] but keeping it around just in case we need to revert back to gridengine [13:53:36] when you're done with that, exit the shell [13:53:38] and do [13:53:45] 'webservice --backend=gridengine stop' [13:53:52] 'webservice --backend=kubernetes python2 start' [13:54:04] um should I stop webservice before moving? [13:54:09] nope [13:54:11] shouldn't matter [13:54:13] ok [13:55:57] 10Labs-project-Phabricator, 13Patch-For-Review: Stabilize vcs-user owned files and directories in Phab-02 - https://phabricator.wikimedia.org/T95982#2452896 (10Danny_B) [13:56:18] yuvipanda: it's up https://tools.wmflabs.org/video2commons/ [13:56:26] \o/ [13:56:27] test? [13:56:36] 10Labs-project-Phabricator: Admin access to phab-01.wmflabs.org for RobLa-WMF - https://phabricator.wikimedia.org/T85498#2452900 (10Danny_B) [13:57:40] everything looks okay [13:58:02] \o/ cool [13:58:10] 10Labs-project-Phabricator: phab-01 is broken: HTTPFutureCURLResponseStatus - https://phabricator.wikimedia.org/T88272#2452903 (10Danny_B) [13:58:19] zhuyifei1999_ try a restart with the webservice code in /tmp? it should be much faster than gridengine based restarts [13:58:23] 10Labs-project-Phabricator: Registration for phab-01.wmflabs.org broken: AphrontDuplicateKeyQueryException - https://phabricator.wikimedia.org/T88346#2452904 (10Danny_B) [13:58:48] using the /tmp code or the code in path? [13:58:59] the /tmp code [13:59:05] which I'll hopefully deploy later today [13:59:24] ok [13:59:47] /tmp/tools/bin/webservice --backend=kubernetes python2 restart ? 
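[Editor's note: the full uwsgi-to-Kubernetes migration walked through above, gathered in one place for the docs yuvipanda mentions writing. Paths are per-tool; `requirements.txt` stands in for "the things you need"; depending on the deployed version you may need the temporary `/tmp/tools/bin/webservice` build instead of plain `webservice`.]

```
$ webservice --backend=kubernetes python2 shell    # opens a shell in the jessie container
$ stty rows 50 cols 150                            # work around the prompt glitch
$ virtualenv ~/www/python/venv.new                 # venvs created on trusty don't work on jessie
$ ~/www/python/venv.new/bin/pip install -r requirements.txt
$ ~/www/python/venv.new/bin/python app.py          # smoke-test that the app loads
$ exit
$ mv ~/www/python/venv ~/www/python/venv.old       # keep the old venv in case of rollback
$ mv ~/www/python/venv.new ~/www/python/venv
$ webservice --backend=gridengine stop
$ webservice --backend=kubernetes python2 start
```

To revert to gridengine, swap the venv directories back and run the stop/start pair with the backends exchanged.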
[13:59:52] yeah [14:00:54] yep up [14:01:03] \o/ [14:01:16] I definitely like this solution to the uwsgi problems... [14:02:00] :) [14:03:46] 10Labs-project-Phabricator: New tasks on phab-01.wmflabs.org created with conduit aren't visible to others - https://phabricator.wikimedia.org/T91995#2452944 (10Danny_B) [14:04:03] zhuyifei1999_ thanks for being a guinea pig! [14:04:10] lol [14:05:01] yuvipanda hi, creating a instance using precise does not work. Im only trying to create an instance to test zuul running in precise [14:05:12] to find out why a newer version wont work on precise on production [14:05:27] But im getting error like these [14:05:28] Jul 12 14:03:20 gerrit-test-4 nslcd[1069]: [8b4567] ldap_start_tls_s() failed: Connect error: No such file or directory (uri="ldap://ldap-eqiad.wikimedia.org:389") [14:05:28] Jul 12 14:03:20 gerrit-test-4 nslcd[1069]: [8b4567] failed to bind to LDAP server ldap://ldap-eqiad.wikimedia.org:389: Connect error: No such file or directory [14:05:36] paladox you should ping andrewbogott or file a bug, I think. [14:05:40] Oh ok [14:06:41] yuvipanda: btw jessie is systemd right? [14:07:01] yeah but we don't run systemd in the containers [14:07:13] all of these are running in docker containers orchestrated by kubernetes [14:07:37] oh I'm just having trouble repooling the backends [14:07:46] ah unrelated, I see [14:07:46] yes, jessie is systemd [14:07:48] not k8s [14:07:51] right [14:08:10] finding the right one to use from https://github.com/celery/celery/tree/master/extra [14:08:40] ah [14:08:44] yes you need systemd [14:09:07] 06Labs: Creating a instance with precise fails - https://phabricator.wikimedia.org/T140099#2452963 (10Paladox) [14:09:10] yuvipanda ^^ [14:09:52] paladox: creation of precise hosts is almost entirely unsupported — I know how to fix that, but how important is it? 
[14:10:03] Oh not important [14:10:29] Not that important since gerrit is being updated this week including it moving to a new host [14:10:40] Just was wondering why. [14:10:57] 06Labs, 10Labs-project-Phabricator: Phab-02 not serving web pages from hostname linked to IP - https://phabricator.wikimedia.org/T96484#2452980 (10Danny_B) [14:12:48] paladox: probably the precise base image doesn't know about the new ldap servers [14:12:57] Oh [14:14:38] 06Labs: Switch existing and new trusty instances to GRUB 2 - https://phabricator.wikimedia.org/T140100#2452995 (10faidon) [14:15:43] 06Labs, 10Labs-project-Phabricator: Get rid of NFS in the phabricator Labs project - https://phabricator.wikimedia.org/T102703#2452988 (10Danny_B) [14:17:12] legoktm I moved fab-proxy to k8s [14:17:46] bd808 I'm going to move hatjitsu to k8s now [14:21:07] bd808 hatjitsu moved over! [14:24:52] yuvipanda: http://docs.celeryproject.org/en/latest/tutorials/daemonizing.html#usage-systemd <= there's no /etc/conf.d on jessie, what's the jessie equivalent of the dir ? [14:25:52] zhuyifei1999_ it's just convention - that path is explicitly referred in https://github.com/celery/celery/blob/master/extra/systemd/celery.service#L9 for example. I think /etc/defaults is usually where people put it [14:26:01] ok [14:28:53] 06Labs, 10Tool-Labs: Needing help running the webservice for a simple flask application - https://phabricator.wikimedia.org/T140103#2453053 (10Dalba) [14:30:13] 06Labs, 10Tool-Labs: Needing help running the webservice for a simple flask application - https://phabricator.wikimedia.org/T140103#2453053 (10yuvipanda) Do you *really* need 3.5? I think it'll be much easier for everyone if you could just use 3.4 - I don't think we can provide support for 3.5 yet unfortunatel... [14:46:51] andrewbogott: can you rebuild the two instances? 
I think I've created quite a lot of junk experimenting with puppet in the last hour [14:47:03] zhuyifei1999_: sure [14:47:11] right now, or do you want to mess with them more first? [14:47:30] hmm [14:47:38] * zhuyifei1999_ checks [14:49:44] yeah I think everything is okay [14:49:51] ok, will rebuild now [14:49:59] thx [14:50:33] zhuyifei1999_: you will need to fix the proxies again, since IPs will change [14:50:44] oh [14:50:56] well, I'm not projectadmin :/ [14:51:34] ah, ok, I'll do it then [14:58:14] zhuyifei1999_: ok, all set [14:58:30] k [14:58:39] 06Labs, 10Labs-Kubernetes, 10Tool-Labs: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#2453218 (10yuvipanda) [15:02:51] 06Labs, 10Labs-Kubernetes, 10Tool-Labs: Install libmysqlclient-dev in tools python2 kubernetes containers - https://phabricator.wikimedia.org/T140112#2453261 (10yuvipanda) [15:04:46] 06Labs, 10Labs-Kubernetes, 10Tool-Labs: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#2453218 (10yuvipanda) [15:05:53] 06Labs, 10Labs-Kubernetes, 10Tool-Labs: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#2453294 (10yuvipanda) [15:08:48] andrewbogott: both instances are up, but they aren't receiving [15:09:12] (I mean old task that 01 is supposed to handle) [15:09:35] zhuyifei1999_: that's because of a pool someplace, right? [15:10:00] well, idk how celery does this exactly [15:10:36] so I can't risk depooling 01 right now [15:10:54] 06Labs, 10Tool-Labs: Needing help running the webservice for a simple flask application - https://phabricator.wikimedia.org/T140103#2453318 (10Dalba) Not *really*... I'll try python3.4. Just a dumb question: how do you create a virtual environment for python 3.4? Should I compile another python3.4 from the so... 
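[Editor's note: on the celery daemonizing question earlier — the upstream `extra/systemd/celery.service` unit reads its settings from an `EnvironmentFile`, and on Debian (including jessie) the conventional location is `/etc/default`, not Red Hat's `/etc/conf.d`. A hedged sketch; the paths, user, and variable names below follow the upstream example but are assumptions, not the video project's actual config.]

```ini
# /etc/systemd/system/celery.service (fragment, adapted from upstream's
# extra/systemd/celery.service; values here are illustrative)
[Service]
Type=forking
User=celery
Group=celery
# Upstream ships EnvironmentFile=-/etc/conf.d/celery; on Debian the
# conventional spot is /etc/default (any readable path works).
EnvironmentFile=-/etc/default/celery
WorkingDirectory=/opt/celery
ExecStart=/bin/sh -c '${CELERY_BIN} multi start ${CELERYD_NODES} \
  -A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE}'
```

After editing the unit, `systemctl daemon-reload` is needed before `systemctl start celery`.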
[15:11:21] zhuyifei1999_: sounds like we need matanya's help to figure out about pooling — I don't know anything about the internals of that project of course :) [15:12:21] oh well I'm usually the person managing those (unless he's doing a lot of stuffs that idk) [15:14:25] iirc all pending tasks gets rescheduled hourly [15:15:07] 06Labs, 10Incident-20151216-Labs-NFS, 06Operations: Investigate need and candidate for labstore100(1|2) kernel upgrade - https://phabricator.wikimedia.org/T121903#2453330 (10fgiunchedi) adding labs too, ATM this is the situation kernel-wise: ``` $ ssh labstore1001.eqiad.wmnet uname -a Linux labstore1001 3.1... [15:18:31] andrewbogott: the real thing I'm worried about is 01 might run out of disk space http://tools.wmflabs.org/nagf/?project=video, with so much stuffs running [15:21:09] zhuyifei1999_: let me know if I can do anything to help [15:21:16] ok [15:28:55] 06Labs, 10Tool-Labs: Needing help running the webservice for a simple flask application - https://phabricator.wikimedia.org/T140103#2453053 (10tom29739) @Dalba ```tools.piagetbot@tools-bastion-03:~$ virtualenv ~/venv -p /usr/bin/python3 Running virtualenv with interpreter /usr/bin/python3 Using base prefix '/u... [15:30:59] andrewbogott: 03 is receiving :) [15:31:06] 06Labs, 10Tool-Labs: Needing help running the webservice for a simple flask application - https://phabricator.wikimedia.org/T140103#2453424 (10yuvipanda) You can do so with: ``` virtualenv -p python3 venv ``` [15:31:06] cool [15:31:17] 06Labs, 10Tool-Labs: Needing help running the webservice for a simple flask application - https://phabricator.wikimedia.org/T140103#2453425 (10yuvipanda) https://phabricator.wikimedia.org/T104374#1911373 has info too [15:32:18] I'll depool 01 after all pending tasks are being handled [15:37:44] 06Labs, 10Labs-Kubernetes, 10Tool-Labs: Install dependencies for python-lxml in python container - https://phabricator.wikimedia.org/T140117#2453448 (10yuvipanda) [15:39:16] Amir1 around? 
[15:39:33] yuvipanda: yup [15:39:59] Amir1 I see you're involved in the checkdictation-fa tool - I want to move it to k8s. thoughts? [15:40:32] yuvipanda: I'm involved a little bit, but I'm just doing some maintenance and robustness [15:40:48] if I move it to k8s do you think you can verify it works? [15:40:50] the whole thing is being done by Yamaha5 (Reza) [15:40:57] yeah [15:41:09] k [15:41:09] ok [15:43:28] Amir1 actually no, I guess I'm not doing that just yet. I'll do so later and ping you etc :) [15:43:31] thanks tho [15:43:54] yuvipanda: okay, thanks :) [15:59:25] 06Labs, 10Labs-Infrastructure: Review disk overcommit ratio for Nova - https://phabricator.wikimedia.org/T140122#2453627 (10Andrew) [16:00:07] 06Labs, 10Labs-Infrastructure: Review Nova RAM overcommit ratio - https://phabricator.wikimedia.org/T140119#2453645 (10Andrew) [16:01:03] 10Wikibugs, 06Project-Admins: Rename "phawikibugs" project to just "wikibugs" - https://phabricator.wikimedia.org/T1123#2453648 (10Danny_B) p:05Triage>03Low [16:13:49] 06Labs, 10Tool-Labs: Needing help running the webservice for a simple flask application - https://phabricator.wikimedia.org/T140103#2453747 (10Dalba) 05Open>03Resolved a:03Dalba tom29739 and yuvipanda, thank you so much! I could get to work with `plain-uwsgi`. [16:16:43] 06Labs, 10Tool-Labs: Needing help running the webservice for a simple flask application - https://phabricator.wikimedia.org/T140103#2453789 (10Dalba) [16:38:45] yuvipanda: Ping? [16:39:24] hi Matthew_ [16:39:51] Hello! Did you ping me this morning about moving xtools-ec to something? [16:40:30] Matthew_ yes! I want to move it to kubernetes (nothing changes for you!) and want someone who knows it to test if it is ok after I move it [16:42:52] yuvipanda: oh fyi I see someone transcoding a file named "Wikimania 2016, Hackathon- Running bots and executive code on labs with just a web terminal (PAWS)" on my tool. 
I'll watch this one :) [16:43:07] \o/ [16:43:08] niiiice [16:43:18] let me know when a link is available? [16:43:26] although I feel quite embarassed by my voice... [16:43:40] ok lol [16:44:10] yuvipanda: I can do that. Feel free to make the change. [16:44:25] ok! [16:45:20] 06Labs: Make ladsgroup admin on the labs 'fa-wp' project - https://phabricator.wikimedia.org/T138372#2454032 (10Andrew) Huji, any objection? [16:45:27] Matthew_ try now? http://tools.wmflabs.org/xtools-ec/ [16:46:01] Appears to work for me. I'll keep an eye on Phabricator and GitHub and let you know if there are any issues. [16:46:17] 06Labs, 10wikitech.wikimedia.org, 13Patch-For-Review: mariadb doesn't come up properly on silver after reboot - https://phabricator.wikimedia.org/T125987#2454039 (10Andrew) 05Open>03Resolved [16:46:21] Matthew_ \o/ ok! [16:46:33] Matthew_ any other php tools I could move? [16:46:33] :) [16:46:42] Hum... [16:47:24] 06Labs, 06Operations, 10ops-codfw: labtestneutron2001.codfw.wmnet does not appear to be reachable - https://phabricator.wikimedia.org/T132302#2454045 (10Andrew) p:05Normal>03Lowest [16:48:18] zhuyifei1999_: encoding01 drained yet? [16:48:52] yuvipanda: I assume you want big tools right now? [16:49:02] Matthew_ any kind really [16:49:17] not yet, haven't depooled yet. there's still 3 tasks in pending state [16:49:26] You could probably move the rest of the xtools stack. The other ones I'm maintainer for aren't active... [16:49:36] Except for peachy-docs. If you want to :) [16:49:48] and 2 aborting while in pending (which outcome is untested) [16:49:55] Matthew_ looking... 
[16:50:09] Matthew_ moving xtools-articleinfo now [16:50:30] zhuyifei1999_: ok, no problem [16:50:35] and it'll take a few more hours until the running tasks on 01 to finish after I graceful shutdown the service [16:50:42] k [16:50:52] 06Labs: Make ladsgroup admin on the labs 'fa-wp' project - https://phabricator.wikimedia.org/T138372#2454072 (10Andrew) p:05Triage>03Normal [16:51:02] Matthew_ try xtools-articleinfo? [16:51:57] Looks good to me. [16:52:21] Matthew_ moving 'xtools' itself now [16:53:17] OK! [16:54:29] Matthew_ xtools has problems, I'm moving it back to gridengine. try now? [16:57:37] Matthew_ xtools-dev also had similar problems so I just moved them back [16:57:40] but article-info seems ok [16:57:56] so xtools-ec and xtools-articleinfo now run on k8s! \o/ [16:58:21] Yes. The two seem ok. xtools is still working. xtools-dev is a strange beast IIRC... it's fine that it didn't go. [16:58:27] ok [16:58:54] Matthew_ I see you are listed as maintainer for a bunch more tools :D any other I can move? [17:01:19] yuvipanda: Two ticks please. [17:01:22] 06Labs, 10Labs-Other-Projects: video project: move rendering instances to SSD servers - https://phabricator.wikimedia.org/T139802#2454147 (10zhuyifei1999) Andrew applied the puppet role and I got 02 and 03 up with a just-written-today [[https://github.com/Toollabs/video2commons/blob/0b5ce84f59444a2bf1d42cb2eb... [17:01:46] Matthew_ I don't understand the expression.... :) [17:01:56] ticks as in 'minutes'? [17:17:59] yuvipanda: Sorry, I had a co-worker walk up to my desk. You are welcome to move any project I'm listed as maintainer for. Most are unused currently, so it's not a big deal. [17:18:31] ah ok! [17:19:46] The expression is one that I picked up from a book a long time ago... it just means "I'm doing something real fast" Sorry about that... [17:20:30] Matthew_ matthewrbowker-dev and matthewrbowker moved just now [17:21:14] Look good to me. 
[17:22:17] Matthew_ moved http://tools.wmflabs.org/articlerequest-dev/ and http://tools.wmflabs.org/articlerequest/ [17:22:40] Matthew_ you wanted me to not move peachy-docs? [17:23:00] articlerequest ones look good. [17:23:11] It's OK to move peachy-docs [17:23:40] ok! moving now [17:25:12] Matthew_ http://tools.wmflabs.org/peachy-docs/ done [17:25:31] Matthew_ you are part of the 'paste' tool as well, might I move that too? [17:25:40] I am...? [17:25:48] yeah :D [17:27:00] I didn't know that actually. [17:27:09] :D [17:28:15] tom29739 any more tools you can move over? :) [17:29:05] yuvipanda, I think there's a couple/ [17:29:24] \o/ no uwsgi-plain support yet, but I think I want to have python3 [17:29:30] yuvipanda: why are we moving them? just curious, I saw you said there is no change on our end [17:30:07] musikanimal I'd like to eventually deprecate gridengine for webservices before end of 2016, so my current strategy is: 1. move things, 2. if they work fine, Great! 3. if not, fix the issues that happen, 4. go to (1) [17:30:24] gotcha [17:30:33] it gives you better resource isolation + newer software, and it gets rid of a number of racy code that we've written ourselves [17:30:52] well I have a handful of tools you are welcome to try moving [17:30:56] cool [17:31:13] musikanimal sure! which ones? [17:31:26] pure php or pure nodejs or pure python ones are easiest just now [17:31:53] before we go any further, how can I see the state of the webservices? e.g. qstat returns nothing for xtools-ec and -articleinfo [17:32:02] kubectl get pods [17:32:07] ^ [17:32:26] logs are in the same place as before (error.log and access.log) [17:32:29] webservice status also works [17:32:31] I like it [17:32:32] ! [17:32:32] hello :) [17:32:34] kubectl is very powerful [17:33:01] so are restart script needs to be modified? 
[17:33:36] actually it still does "qstat | awk '{ print $1; }' | tail -1 | xargs qmod -rj" not sure if that still would work [17:33:48] for which tool is this/ [17:33:48] ? [17:33:59] kubectl delete pod [17:34:03] ^ that works [17:34:25] yuvipanda: all of the xtools suite [17:34:38] if you recall, how it has that weird thing where it just dies and doesn't automatically restart [17:34:39] I'd think that script won't be needed on k8s [17:34:48] we can also add an actual health check [17:34:50] where it does a http call [17:34:55] and then restarts if that http call fails [17:35:04] rather than this hack [17:35:05] webwatcher.sh [17:35:11] which calls webstart.sh [17:35:13] right [17:35:18] so how about I just add a http health check instead? [17:35:32] where it'll hit the /$toolname page and if it fails, it'll restart the pod [17:35:47] http://kubernetes.io/docs/user-guide/production-pods/#liveness-and-readiness-probes-aka-health-checks [17:35:55] yup! [17:36:00] They call it a "liveness probe" [17:36:04] interesting [17:36:10] I will look into that [17:36:11] I'll add that an option to 'webservice' soon but I can add it directly to the pod just now [17:36:40] musikanimal I'm adding it to xtools-articleinfo now [17:36:49] awesome, thanks [17:39:05] musikanimal done for http://tools.wmflabs.org/xtools-articleinfo/ [17:39:08] verify it works fine? [17:39:28] yup! thank you! [17:40:11] musikanimal ok, doing it to xtools-ec just now [17:40:26] this is a much better solution than webrestarter I guess [17:40:51] sounds like it! :) [17:42:19] Getting the webservice off NFS seems to speed it up greatly. [17:42:30] I was going to say the same [17:42:35] so no more NFS? fo real!? 
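[Editor's note: the HTTP health check yuvipanda describes adding to xtools-articleinfo and xtools-ec above is Kubernetes' livenessProbe. The field names below are standard Kubernetes, but the path, port, and timings are illustrative assumptions, not the values actually deployed.]

```yaml
# Fragment of a webservice pod/deployment container spec (illustrative)
livenessProbe:
  httpGet:
    path: /xtools-articleinfo/   # hit the tool's own front page
    port: 8000                   # port the webservice listens on (assumed)
  initialDelaySeconds: 10
  periodSeconds: 30
  failureThreshold: 3            # restart after three consecutive failed probes
```

The kubelet GETs the path every `periodSeconds`; after `failureThreshold` consecutive failures it restarts the container — replacing the webwatcher.sh/webstart.sh cron hack discussed above.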
[17:42:53] I was running my tool's webservice out of /tmp for a while [17:43:06] musikanimal nope, this is still NFS [17:43:08] just no more gridengine [17:43:24] dah, okay hah [17:43:26] tom29739 uwsgi is much faster on k8s because the NFS options are different [17:43:46] In k8s git repos can be used as volumes: http://kubernetes.io/docs/user-guide/volumes/#gitrepo [17:43:59] musikanimal liveness probe in place for xtools-ec too [17:44:23] works great! [17:44:24] musikanimal I've gotten rid of it from cron [17:44:39] musikanimal so what else can I move? :) [17:44:59] Community Tech has a new PHP tool, there's the test version at plagiabot [17:45:07] production tool is called copypatrol [17:46:47] musikanimal ok, I'm going to move plagiabot now. [17:48:08] musikanimal moved http://tools.wmflabs.org/plagiabot [17:48:24] ok let's stop for a second... [17:48:30] stopping [17:48:39] load /plagiabot and then load /copypatrol [17:48:40] same code [17:48:50] plagiabot is loading a zillion times faster! [17:49:06] :D [17:49:25] it's newer php, and it also has resource guarantees and stuff [17:49:33] maybe it also has less load [17:49:34] I figured it was the SQL queries [17:49:41] that was making it so slow [17:49:43] guess not! [17:49:52] anyway, plz do copypatrol too! :) [17:49:57] ok [17:50:36] musikanimal done [17:50:40] wow [17:50:43] this is amazing! [17:50:52] the team will be very excited about this [17:50:59] haha [17:51:01] nice ;D [17:51:04] please let them know! [17:51:22] I will! let me see what other stuff I can throw your way [17:51:46] my tools are written in Ruby, use a Unicorn web server, e.g. no `qstat` etc, so pretty sure it doesn't apply? [17:52:02] I don't have a ruby environment yet unfortunately [17:52:08] do you use bundler? [17:52:11] that's fine [17:52:15] nah it won't let me :( [17:52:22] ah, I want to let you to use bundler! 
[17:52:30] I manually `gem install --user-install gem-name` [17:52:34] I'll try to work on it next week [17:52:39] yuvipanda: just wondering, how do we setup monitoring ourselves (since I'm a newbie to k8s) [17:52:39] to allow you to use bundler properly [17:52:52] cool, no rush! that would be awesome [17:53:02] zhuyifei1999_ monitoring or the livenessProbe? [17:53:04] zhuyifei1999_ livenessProbe lets you restart the pod based on conditions [17:53:11] now, what about all the pageviews tools?? they all run on the grid, with lightweight PHP in the background [17:53:32] hmm I guess both are cool [17:53:33] if you want, start with pageviews-test, langviews-test, and topviews-test [17:54:21] musikanimal moving pageviews-test now. try? [17:54:49] 502 right now [17:55:02] also what's the future of those multi-purpose tools that's written in multiple languages? [17:56:20] zhuyifei1999_ good question I don't have an answer to right now. also I'll get back to you on livenessProbe shortly after moving the pageview stuff. [17:56:30] looks like we lost some symlinks [17:56:42] you can read more about it in pageviews-test-561416278-hropm and you can edit/play with the YAML by doing 'kubectl edit deployment/$toolname' [17:56:47] wait nvm [17:56:51] musikanimal ah, which ones? where were they linking to? [17:57:12] they're fine, symlinks are only on the other tools, all of them except pageviews and pageviews-test, got confused [17:57:19] I see [17:57:22] weird way that I set it up [17:57:41] (I mean those tools that are collection of scripts written in multiple languages, as php scripts or cgi scripts or static content, using lighttpd on grid) [17:57:41] musikanimal can you tell what's wrong with it from error.log? [17:57:47] looking [17:58:37] complaining about the WhichBrowser\Model\Browser library is missing, which is the first one that gets loaded I think [17:58:45] maybe need to re-run composer update? 
[17:58:50] I see [17:58:54] possibly [17:59:02] musikanimal if you run 'webservice shell' [17:59:11] it gives you a shell with php5.6 [17:59:16] which is the version that'll be running on the webserver [17:59:37] (it'll have a tiny width - please type 'stty rows 50 cols 150' to make that better - I'll have a bugfix for this shortly) [17:59:37] 06Labs, 10Horizon, 13Patch-For-Review: Disable renaming of instances on Horizon - https://phabricator.wikimedia.org/T139768#2454404 (10Andrew) @Luke081515 maybe. Renaming hosts seems generally likely to cause problems, there are too many things to keep in sync. [17:59:47] what version does the new system use? [18:00:02] debian jessie [18:00:21] new system uses php 5.6 on debian jessie [18:01:11] yuvipanda, what PHP version did gridengine use? [18:01:21] oh ok, so no code differences should be needed [18:01:42] tom29739 5.5 on trusty, 5.3 on precise [18:01:50] and 5.5 has been the default for a few years now [18:02:01] andrewbogott: the last pending task is running, I'll depool 01 in a sec [18:03:57] Apparently as of 2 days ago, PHP 5.5 is end-of-life and will receive no more support. [18:04:11] yuvipanda: how might one restart the webservice under kubernetes? 
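Since tools move from PHP 5.5 (trusty) / 5.3 (precise) on the grid to 5.6 on jessie, a tiny helper can sanity-check whether an available version satisfies a tool's minimum requirement. This is a sketch using GNU version sort, not a Tool Labs command.

```shell
# php_at_least REQUIRED AVAILABLE
# Succeeds when AVAILABLE >= REQUIRED, comparing version strings
# with sort -V (so 5.10 would correctly sort after 5.6).
php_at_least() {
  [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}
php_at_least 5.5 5.6 && echo "5.6 satisfies >=5.5"
php_at_least 5.6 5.3 || echo "5.3 is too old for >=5.6"
```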
[18:04:52] musikanimal same ol' way - webservice restart [18:04:56] /usr/bin/python /usr/local/bin/celery multi stopwait 2 --pidfile=/var/run/celery/%N.pid at pid 14519 [18:04:57] if you want to move from gridengine to kubernetes [18:04:59] you do [18:05:04] webservice --backend=gridengine stop [18:05:04] oh okay [18:05:10] webservice --backend=kubernetes start [18:05:16] if you want to move from k8s to gridengine [18:05:18] just do the reverse [18:07:21] !log tools reboot tools-worker-1012, it seems to have failed LDAP connectivity :| [18:07:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master [18:08:36] yuvipanda: on pageviews-test I get "bash: webservice: command not found" [18:10:24] oh wait sorry I was in the shell [18:11:26] 06Labs, 10Horizon, 13Patch-For-Review: Disable renaming of instances on Horizon - https://phabricator.wikimedia.org/T139768#2454477 (10Andrew) p:05Triage>03Normal [18:11:31] sweet, I just restarted and now pageviews-test works great [18:14:45] \o/ [18:14:52] musikanimal what did you have to do? composer again? [18:15:00] nope, just restarted the webserver [18:15:13] in gridengine? [18:15:35] just did `webservice restart` [18:16:21] wanna try langviews-test and topviews-test? [18:16:48] let me look at what happened to pageviews-test [18:16:58] indeed, it seems ok! [18:18:01] musikanimal http://tools.wmflabs.org/langviews-test/ works too [18:19:37] yuvipanda: the only problem I'm seeing is restarts take a bit longer [18:19:57] musikanimal yup, I've a fix for that about to merge [18:20:13] it's only pageviews and its family of tools that I really care about downtime for [18:20:17] right [18:20:31] musikanimal topviews-test also done [18:20:58] so are we otherwise pretty darn confident in kubernetes? that it's stable and what not? [18:21:08] pageviews is all JavaScript, so really all it has to do is load [18:21:16] there's the i18n which is PHP [18:23:10] musikanimal yeah, I think so.
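The backend switch described above boils down to two `webservice` commands. The sketch below is a dry run that only echoes the commands it would run; the real Tool Labs `webservice` CLI is never invoked.

```shell
# migrate_webservice FROM TO
# Echo the stop/start pair that moves a tool between backends,
# exactly as described in the channel: stop on the old backend,
# start on the new one.
migrate_webservice() {
  local from=$1 to=$2
  echo "webservice --backend=$from stop"
  echo "webservice --backend=$to start"
}
migrate_webservice gridengine kubernetes
```

Running the function with the arguments reversed prints the commands for moving back from k8s to gridengine, matching "just do the reverse" above.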
a lot of magnus' tools are on it now [18:23:15] we have 64 tools on there [18:23:24] musikanimal grrrit-wm has been on it for months now [18:24:12] yuvipanda: [18:24:29] ok cool. and you said you'll be able to speed up the restart process? [18:24:30] I guess k8s stuffs could be listed in http://tools.wmflabs.org/?status ? [18:24:53] musikanimal yeah, you can try the new speed already if you use the test install on '/tmp/tools/bin/webservice' [18:26:31] zhuyifei1999_ yeah, that needs to happen. It'll probably end up being in a different place though. [18:26:46] hmm okay [18:27:26] btw your video is 38% done [18:27:32] \o/ [18:33:49] !log wikistats shutting down instance wikistats-southpark, lastlog said not used since February [18:33:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikistats/SAL, Master [18:36:45] musikanimal ok, I've updated webservice to a newer version with faster restarts [18:37:37] do you wanna move them? :D [18:44:19] lemme try it out [18:44:28] ok [18:44:40] nice! that was fast :) [18:45:06] alright, I'm sold. For pageviews we have pageviews, langviews, topviews, siteviews and massviews [18:45:11] I guess we should try one at a time [18:45:25] ok :D [18:46:52] on the hunt to delete more instances [18:47:01] from git project ..that we don't need [18:48:01] !log git deleted instance git-phab [18:48:03] !log git deleted instance git-phab-03 [18:48:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL, Master [18:48:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Git/SAL, Master [18:48:33] mutante \o/ <3 [18:48:41] :) [18:55:30] having an odd occurrence in one of my labs instances, it uses the lxc version of mediawiki vagrant. the setup.sh script installs mediawiki-vagrant-0.14.0.gem every time it's run, and the vagrant command thinks the mediawiki-vagrant gem isn't installed [18:55:58] might just blow the instance away and reload, probably easier?
[18:56:04] musikanimal yeah, one tool at a time sounds great :D [18:56:11] ebernhardson always my recommendation if it's easy [18:56:16] yuvipanda: :) [19:12:59] ebernhardson: is your vagrant command actually /usr/local/bin/mwvagrant? And are you trying to run ./setup.sh manually or letting Puppet do it? [19:23:09] musikanimal I'm gonna be gone in about 10-15mins. Feel free to move them without me being around. if not I'll bug you tomorrow! [19:25:41] oh, didn't realize you were having me do it! what's the procedure? [19:25:55] we can deal with this when you're back, no rush [19:26:29] musikanimal oh, I'd love to get it over with :D https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Web/Kubernetes#Switching_between_GridEngine_and_Kubernetes is the procedure, pretty simple [19:26:34] shouldn't take more than a minute per tool [19:26:37] can we do it now? :D [19:26:47] sure! [19:27:14] ok! [19:27:22] musikanimal do you wanna do it or shall I? :D [19:27:32] I'll give it a try! [19:27:38] ok! [19:28:03] that was really easy [19:28:16] and no noticeable downtime [19:29:04] \o/ [19:29:06] musikanimal which tool did you move? [19:29:15] pageviews [19:29:19] having an issue with topviews, though [19:29:40] https://www.irccloud.com/pastebin/EN7QiTsX/ [19:30:07] musikanimal yeah, there's a race condition. delete the 'service.manifest' file and try again? [19:30:15] musikanimal actually no, I'm looking [19:30:30] I tried again and it worked I think [19:30:35] right [19:30:36] ok [19:30:39] yeah, it's a bit of a race there [19:34:03] alright, all pageviews tools migrated and working :) [19:34:10] thanks, this speeds things up for those tools as well [19:35:20] that's it for my tools. Thanks again! excellent improvement [19:36:07] awesome [19:47:13] musikanimal thanks for the help, and let me know if anything goes wrong. [19:47:23] will do! [19:48:43] Since you're around... Got a quick question. How does one acquire database credentials for the replicas for a labs instance?
Not tool labs but labs proper. [19:48:56] zhuyifei1999_: ready? (I have no reason to be impatient, I just bother you every time I get to a stopping point with my other work) [19:49:28] checking [19:49:45] Matthew_: the "easiest" thing to do is create a tool and use the creds it gets from your project [19:49:49] 2 ffmpegs still running [19:50:04] bd808: That's what I'm doing now :) Is that a problem? [19:50:12] so I guess not [19:50:59] Matthew_: no, I don't think it is really. There might be a tracking task in phab about asking for project credentials. I think a root has to make them manually [19:51:17] bd808: Okay. Then I'll keep it the way it is. Thank you :) [19:51:31] zhuyifei1999_: ok! [19:52:52] andrewbogott: http://tools.wmflabs.org/nagf/?project=video should be able to tell its status [19:53:25] and I have a fork in http://tools.wmflabs.org/yifeibot/nagf/?project=video that shows the load [19:55:58] zhuyifei1999_: 01 is already depooled, so as soon as its load drops that means it's done for good? [19:57:27] well, before it's deleted can you check `ps -A u | grep celery` and make sure no tasks are doing stuffs like uploading (low load stuffs)? [19:57:36] other than that, yeah [19:59:25] 'k [20:00:52] the "Wikimania 2016 Closing Ceremony" video (/srv/v2c/output/db682b4e05e13d51/) will probably take a few more hours [20:02:43] andrewbogott: oh that video might end up in the server-side-upload temporary storage in /srv/v2c/ssu, can you back up that directory? [20:03:42] zhuyifei1999_: I really don't want to get my hands dirty in that project — just let me know when/if I can move things. Later in the week is fine. [20:03:55] ok [20:06:35] 06Labs, 10Labs-Infrastructure: Rebalance labvirt1010 - https://phabricator.wikimedia.org/T137719#2455052 (10Andrew) p:05Triage>03Normal Labvirt1010 is behaving OK right now. We don't want to add anything new, but https://gerrit.wikimedia.org/r/#/c/298480/ should take care of that. So once that patch (or...
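The credentials a tool gets, as bd808 suggests above, land in the tool's `replica.my.cnf`, an ini-style file with a `[client]` section. The values below are fakes; only the format matches what Tool Labs generates.

```shell
# Create a fake replica.my.cnf and pull the user out of it, the way a
# script connecting to the replicas might.
creds=$(mktemp -d)/replica.my.cnf
cat > "$creds" <<'EOF'
[client]
user = u12345
password = notarealpassword
EOF
# Split each line on ' = ' and print the value of the 'user' key.
db_user=$(awk -F' *= *' '$1 == "user" {print $2}' "$creds")
echo "connect as: $db_user"
```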
[20:09:22] andrewbogott: sorry for the spam about the memory overcommit :( [20:09:46] I have really no clue how it is configured / supposed to work. I am just throwing random hints and I should probably stop! [20:10:12] it seems like a cool feature but I fear such complexity :) [20:11:10] oh [20:11:32] I would trust the super intelligent to have something that works after 7 years or so [20:11:39] but definitely fear it completely exploding =] [20:12:00] our puppet code has a check to prevent some linux kernel from being installed, hinting about KSM [20:12:05] so maybe it is in use already [20:13:46] confirmed: It's enabled on all labvirt hosts [20:13:49] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review: Review Nova RAM overcommit ratio - https://phabricator.wikimedia.org/T140119#2453539 (10chasemp) Ok my understanding of our issue is that yes we allowed for overprovisioning...and it happened. I don't know that it's all bad. It's pretty much SoP for any... [20:14:03] oh [20:14:18] andrewbogott: is there a ksmd daemon running as well? [20:14:32] potentially that would be how we can allow RAM overcommitment [20:14:52] https://www.irccloud.com/pastebin/U7w26loM/ [20:15:02] hashar: ^ I think that means it's enabled and running everywhere [20:15:11] I also hinted at a parameter that reserves some amount of memory for the host. Defaults to 512MBytes which is definitely super low. [20:15:23] oh [20:15:32] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review: Review Nova RAM overcommit ratio - https://phabricator.wikimedia.org/T140119#2455085 (10chasemp) FWIW openstack changed to take into account total RAM for overprovision https://github.com/openstack/nova/commit/1b40708287808243be27b83791b7d23f8b51b194 KVM a...
[20:16:44] andrewbogott: https://www.kernel.org/doc/Documentation/vm/ksm.txt hints about a bunch of metrics under /sys/kernel/mm/ksm/ [20:16:54] and indication of which ratios to watch for [20:17:11] the kvm docs on memory management explain overcommit pretty well [20:17:22] maybe that could give an accurate view of memory overcommit by KSM if any [20:17:48] sorry if I disturb you. The topic kind of haunted me much of the week-end :D [20:17:56] so 'Maybe KVM manage to magically share the unused RAM between instances ' is more or less what happens, but no more so than how procs share memory [20:18:05] but either way it's a concern [20:18:24] I explained here https://phabricator.wikimedia.org/T140119#2455070 [20:19:59] ah with links! neat [20:20:30] I also found the memory ballooning concept which is to shrink the guest total memory [20:20:40] but apparently OpenStack does not use that and always allocate the max mem [20:23:41] chasemp: changing it to 1.0 won't really affect the behavior on labvirt1001-1009 anyway, since we'll hit the disk limits before we hit the RAM limits. So it really doesn't cost us much in the near-term… it just means being slightly inefficient with the new servers. [20:23:46] Doesn't bother me in the least, really. [20:24:51] tom29739 you might like http://tools-prometheus.wmflabs.org/tools/ - actual CPU / RAM and other usage statistics per-container, and so per-tool [20:25:01] yeah, bottom line is probably -- it's not actually ok so let's stop doing it :) [20:25:03] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review: Review Nova RAM overcommit ratio - https://phabricator.wikimedia.org/T140119#2455149 (10Andrew) I'm convinced that 1.2 is the optimum answer and that 1.0 is the safest answer. I certainly won't fight against 'safe' -- as the graphs show, the difference is n... [20:25:13] andrewbogott: do you need anything from me regarding the video project ? 
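The counters under /sys/kernel/mm/ksm/ that hashar points at include pages_shared (unique pages KSM keeps) and pages_sharing (guest pages deduplicated onto them); their ratio is one of the effectiveness indicators ksm.txt describes. This sketch fakes the sysfs directory with made-up numbers so the arithmetic is self-contained.

```shell
# Fake the two KSM counters in a temp dir standing in for
# /sys/kernel/mm/ksm/, then compute the sharing ratio.
ksm=$(mktemp -d)
echo 1000  > "$ksm/pages_shared"    # unique pages kept (made up)
echo 15000 > "$ksm/pages_sharing"   # pages mapped onto them (made up)
awk -v sharing="$(cat "$ksm/pages_sharing")" \
    -v shared="$(cat "$ksm/pages_shared")" \
    'BEGIN { printf "KSM sharing ratio: %.1f\n", sharing / shared }'
```

A high ratio means many identical guest pages are being merged, which is what would let RAM overcommit work out in practice.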
[20:25:25] matanya: nope, just waiting for 01 to drain so I can kill/rebuild it [20:25:32] I think zhuyifei1999_ has already taken care of 02 and 03 [20:25:37] ok, cool [20:25:43] 06Labs, 10Labs-Infrastructure: Shrink default quota for labs projects - https://phabricator.wikimedia.org/T140158#2455158 (10yuvipanda) [20:25:51] 06Labs, 10Labs-Infrastructure, 13Patch-For-Review: Review Nova RAM overcommit ratio - https://phabricator.wikimedia.org/T140119#2453539 (10yuvipanda) T140158 is related. [20:26:03] andrewbogott: how does the disk provisioning work exactly though, I'm less clear on it, as I've not seen it before [20:26:25] say we give an xlarge and it's got 180G, even though we partition 18G within for / [20:26:35] chasemp: hang on, I'm confused — looks like you're advocating for 1.0 (RAM) here but just +1'd a patch that does 1.2? [20:26:44] does that 180G now count as part of allocated? [20:27:07] andrewbogott: I made a note 1:1 was probably best and then +1'd the idea I guess in general :) [20:27:22] sorry that was confusing [20:27:25] yuvipanda, ah nice, now I can see what my stuff is using :) [20:27:47] chasemp: so, disk-space... in my chart I have the 'Committed' column. In your example that 180G would appear as part of 'committed' [20:28:00] but not part of 'Actual Used' until the partition is created and filled with data [20:28:43] right [20:28:49] ok that's what I imagined but making sure [20:29:19] it's really going to get messy then w/ a 9:1 ratio on most xlarge for actually used disk and committed [20:29:41] i.e.
over 50% of them that I saw were still only using the 18G / and no /srv [20:29:53] but they are all eating huge portions of committed [20:32:06] 06Labs, 10Labs-Infrastructure: Shrink default quota for labs projects - https://phabricator.wikimedia.org/T140158#2455158 (10chasemp) yes I think for /default/ allocation we do `Shrink it to 8 cores + 16G of RAM - 1 xlarge or equivalent number of smaller instances (upto 4 mediawiki-vagrant instances, for examp... [20:33:06] 06Labs, 10Labs-Infrastructure: Shrink default quota for labs projects - https://phabricator.wikimedia.org/T140158#2455158 (10Andrew) Sounds good to me. The vast majority of projects don't come anywhere close to the quota in any case. [20:33:23] andrewbogott: not a strawman ask, but why 1.2 on RAM? [20:33:41] chasemp: The text in that phab task shows numbers and explains. [20:33:49] but I just re-submitted with 1.0 anyway :) [20:34:36] this https://phabricator.wikimedia.org/T140119#2455149 ? [20:34:56] In the description: "So, at the very least we need to lower that ratio, as 1.5 is clearly too high. Lowering it to 0 would be the conservative choice, but might be overreacting since right now labvirt1010 is stable and it's still exhibiting a 1.3 overcommit." [20:35:12] that doesn't even say 1.2 in it :D [20:35:18] why 1.2 specifically I wasn't getting [20:36:04] Oh — arbitrary, I guess. [20:36:18] Taking 1.3 as an upper bound (since more than that was clearly causing problems) [20:36:25] and adding some slack between us and the upper bound. [20:36:59] at 1.2 are we still playing a game of chance w/ actual VM usage and issues on a host? [20:37:33] which isn't a death trap at all, we depend on load spread everywhere [20:37:53] Yes, as I understand it with anything over 1.0 it's possible to hit a pathological case (where all instances use all their RAM at the same time) and have problems.
[20:38:23] (unless KSM guarantees us a certain amount of safety, and I don't have any data about what that amount would be) [20:38:25] yeah I'm torn, but I think we should play it safe for now [20:38:35] but I see your thinking now thanks [20:38:37] +1 for playing it safe [20:40:34] well, there's 'safe' in both directions. e.g. if we lower ratios so much that nothing can be scheduled anymore, that isn't 'safe' either :) [20:40:44] But in this case, 1.0 is fine, I'm just splitting haiars [20:40:46] hairs [20:41:20] In other news, 75% of the time that I try to type the word 'ratio' I instead type the word 'ration' [20:41:43] sure I don't think we are looking to shut down the system for safety kind of thing [20:43:18] but it seems clear we need to deploy with swap to compensate, set oom ratings to eat disposable VMs first if we could define such a thing, wait for explosion or go safe and rethink [20:44:35] probably better to not be able to spawn an instance than having a labvirt die terribly? [20:46:03] hashar: yeah, that's definitely better. Oddly that is something I've had to fight for among the nova devs :/ [20:46:38] oh and I found out Diamond has a collector for the Linux KSM thing ( /usr/share/diamond/collectors/ksm/ksm.py available on our Jessie version ) [20:48:02] https://github.com/python-diamond/Diamond/blob/v3.5/src/collectors/ksm/ksm.py [20:51:45] andrewbogott: I have no idea what the politics behind nova development is. I guess preparing a spec and attending the dev summit to defend it would work [20:51:50] but that is a big investment in time [20:52:47] hashar: I think they'll accept my patch eventually.
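The ratios under discussion translate directly into schedulable RAM: physical RAM times the overcommit ratio. The 256G host size below is made up for illustration; the ratios are the ones the channel debates.

```shell
# Back-of-envelope overcommit arithmetic for a hypothetical 256G labvirt:
# at ratio 1.0 nova schedules up to physical RAM, above that it
# overcommits and relies on guests not all using their RAM at once.
for ratio in 1.0 1.2 1.5; do
  awk -v r="$ratio" \
      'BEGIN { printf "ratio %.1f -> %dG schedulable on a 256G host\n", r, 256 * r }'
done
```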
I'm just surprised that the status quo solution to prevent scheduling on an already-full host is "keep an eye out for that" [21:04:35] andrewbogott: I have no idea how clouds manage their infra / overcommit really :( [21:12:37] anyway time to sleep with a bunch of kvm/malloc etc related literature [21:15:10] 06Labs, 10Incident-20151216-Labs-NFS, 06Operations: Investigate need and candidate for labstore100(1|2) kernel upgrade - https://phabricator.wikimedia.org/T121903#2455332 (10chasemp) We are basically in a holding pattern as we (...well me I guess) tries to get labstore2003/2004 going so we can shift load so... [21:15:56] 06Labs, 06Operations: Failed drive in labstore2001 array - https://phabricator.wikimedia.org/T139937#2455341 (10chasemp) 05Open>03Resolved ```md0 : active raid1 sdb1[2] sda1[0] 1952839680 blocks super 1.2 [2/2] [UU] bitmap: 1/15 pages [4KB], 65536KB chunk ``` [21:30:58] i shut down (not deleted) an instance earlier today [21:31:11] now i want to power it back up. but it stays status SHUTOFF and "failed to reboot" [21:31:15] when i click "reboot" [21:31:21] anything in the log? [21:31:28] you mean console output? [21:31:36] yeah [21:31:44] it shows how it shut down when i did that earlier [21:31:50] what instance? [21:31:59] can't help w/o that :) [21:32:04] wikistats-southpark.wikistats.eqiad.wmflabs [21:33:26] You have a south park wiki [21:33:41] no, it's the name of a user :p [21:33:44] mutante: it's on now I think [21:33:46] SPF [21:33:51] chasemp: thank you ! [21:33:57] oh [21:34:03] i see it as active, thx [21:34:16] SPF|Cloud = southpark(fan) [21:34:36] Oh [21:34:43] OMG SSH works!?
thanks guys [21:34:46] You're a southpark fan [21:34:53] heh:) [21:37:00] where I grew up is http://www.cityofwaupaca.org/parksnrec/?parks=south-park [21:37:07] so it's sort of confusing to me for a second :) [21:38:29] 👍 https://phabricator.wikimedia.org/p/Southparkfan/ [21:38:37] (needs login) [21:38:55] Oh, south park is a park and tv show [21:39:06] anyway, thanks for fixing it. looks like the instance isn't needed anymore so I'll check with mutante if it can be deleted (saving some labs resources.... :)) [21:39:29] so how'd you do it chasemp? [21:39:44] SPF|Cloud, thank you! [21:39:55] it is on comedy central here. [21:40:03] mutante: the instance can be deleted. [21:40:05] Krenair: nova start $uuid [21:40:25] that's it? okay.. [21:40:36] !log wikistats nova start wikistats-southpark.wikistats.eqiad.wmflabs [21:40:39] I figured you did some really crazy magic or something [21:40:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikistats/SAL, Master [21:40:47] nope [21:41:26] SPF|Cloud: ok, deleting it. thanks for checking [21:41:27] http://www.comedycentral.co.uk/tv-guide [21:42:02] !log wikistats deleted wikistats-southpark instance to free resources [21:42:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikistats/SAL, Master [22:02:34] 06Labs, 10wikitech.wikimedia.org: Labs front-page statistics are very wrong - https://phabricator.wikimedia.org/T139773#2455604 (10Andrew) @ If you delete an instance at horizon, the wikitech won't get deleted That is incorrect -- the pages are deleted by a callback within nova that's triggered by deletion. T...
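The fix chasemp applied was simply `nova start $uuid` against the SHUTOFF instance. The sketch below is a dry run of that decision: it only echoes the command it would run, and the uuid is a placeholder.

```shell
# recover_instance STATUS UUID
# Echo the nova action appropriate for the instance's current status,
# mirroring the troubleshooting above ('reboot' fails on a SHUTOFF
# instance; 'nova start' is what actually powers it back up).
recover_instance() {
  local status=$1 uuid=$2
  case "$status" in
    SHUTOFF) echo "nova start $uuid" ;;
    ACTIVE)  echo "nothing to do for $uuid" ;;
    *)       echo "check console log for $uuid" ;;
  esac
}
recover_instance SHUTOFF placeholder-uuid
```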
[22:16:20] 06Labs, 10Labs-Kubernetes, 10Tool-Labs, 07Tracking: Packages to be installed in Tool Labs Kubernetes Images (Tracking) - https://phabricator.wikimedia.org/T140110#2455680 (10Danny_B) [22:36:51] 06Labs, 10Labs-Infrastructure: Investigate rabbitmq tcp_listen_options setting (and others) - https://phabricator.wikimedia.org/T140175#2455779 (10Andrew) [22:38:56] 06Labs, 10Labs-Infrastructure: Investigate rabbitmq tcp_listen_options setting (and others) - https://phabricator.wikimedia.org/T140175#2455793 (10Andrew) Probably unrelated, but there's also this: WARNING oslo_messaging._drivers.amqpdriver [-] Number of call queues is greater than warning threshold: 20. Ther... [22:54:02] Change on 12wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Fdapuzzo was created, changed by Fdapuzzo link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Fdapuzzo edit summary: Created page with "{{Tools Access Request |Justification=For educational purposes about databases. |Completed=false |User Name=Fdapuzzo }}"