[03:53:25] Labs, Tool-Labs: Restore replica.my.cnf for toolsbeta.admin - https://phabricator.wikimedia.org/T109807#1559682 (scfc) NEW
[04:12:45] Labs, Tool-Labs, Patch-For-Review: Convert updatetools.pl into a puppetized Python service with monitoring - https://phabricator.wikimedia.org/T94858#1559724 (scfc)
[06:10:53] Labs, Tool-Labs: Created tool but not added to group - https://phabricator.wikimedia.org/T55100#1559806 (Ricordisamoa)
[06:11:01] Labs, Tool-Labs: Some of my tools don't have .my.cnf / can't create databases in tools-db - https://phabricator.wikimedia.org/T50950#1559808 (Ricordisamoa)
[07:32:22] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Moss was created, changed by Moss link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/Moss edit summary: Created page with "{{Tools Access Request |Justification=I main plan to use the tool project to test features that are cool like the parliamentdiagram. |Completed=false |User Name=Moss }}"
[07:44:39] legoktm : I am trying to setup a mediawiki installation in labs. but when I try to access it from web, I can not
[07:45:04] this is the url : http://testinstance.wmflabs.org/
[07:45:13] and I get this error : 504 Gateway Time-out
[07:45:23] Could you help me debug the issue?
[08:22:20] ankita-ks: did you create a webproxy?
[08:22:28] yes, i did
[08:22:39] ankita-ks: also, you need to open port 80 (or whatever port your service is listening on) in the default security group
[08:23:09] YuviPanda : That could be it. because currently it is not open. Let me try that.
[08:24:10] Ok
[08:24:41] YuviPanda : thank you so much! This seems to be working so far. :)
[08:25:08] Woo
[08:25:10] Cool
[08:35:42] Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109: Evaluate kubernetes for use on Tool Labs - https://phabricator.wikimedia.org/T107993#1559916 (yuvipanda) (We already use identd to do auth for our proxy registration setup)
[09:16:53] Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109: Evaluate kubernetes for use on Tool Labs - https://phabricator.wikimedia.org/T107993#1559991 (yuvipanda) This allows us to actually directly expose the full kubernetes API to users for them to do as they want.
[09:22:23] YuviPanda: You guys are sending way too many emails to the announce list
[09:22:36] andrewbogott: ^
[09:22:40] the reboot ones?
[09:22:47] I guess these *do* affect a lot of people
[09:26:18] Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109: Evaluate kubernetes for use on Tool Labs - https://phabricator.wikimedia.org/T107993#1560007 (yuvipanda) Reading through docs, it looks like we'll also need https://github.com/kubernetes/kubernetes/blob/release-1.0/docs/design/service_accounts....
[09:35:31] YuviPanda: What I would do next time is make an upgrade schedule and announce that once
[09:36:40] that makes sense. let's do that next time, andrewbogott
[09:36:47] sorry about the excess email, multichill
[09:36:49] Announcing maintenance a day in advance is too late anyway. Should be done a week in advance. Later is usually a sign of bad planning
[09:37:31] For me it just ends up in the same box, but I just noticed the same announcement about 7 (?) times (labvirt0001 - 7)
[09:39:00] See it as constructive criticism. Good luck with the remaining upgrades
[09:43:46] multichill: +1, we'll figure something out. This also ties in to our earlier request for a central point for maintenance notices
[09:43:49] which we still don't have
[09:57:15] YuviPanda : around?
[09:57:39] I finally had the wiki up and running on labs ( http://testinstance.wmflabs.org/mediawiki/ )
[09:57:50] but after I tried to install VisualEditor with Parsed
[09:57:53] *Parsoid
[09:58:02] The screen is just white
[09:58:18] I am trying to debug but please let me know if you have any insights on this
[10:19:24] ankita-ks: not sure. are you using mediawiki vagrant / labs-vagrant?
[10:19:45] nope. Manual mediawiki installation
[10:19:55] ankita-ks: ah, I suggest using https://wikitech.wikimedia.org/wiki/Help:MediaWiki-Vagrant_in_Labs
[10:20:03] should make your life a lot easier
[10:20:25] If I can't debug this, i will switch to vagrant.
[10:20:37] yeah
[10:20:45] if you want to debug this, I suggest asking in the parsoid channel
[10:20:48] #mediawiki-parsoid
[10:20:56] but still strongly recommend using the vagrant stuff
[10:23:18] Alright, I will give it a shot. It's just I find it more difficult to debug issues with vagrant than vanilla mediawiki installation. So i am a little skeptical about it. :/
[11:39:46] YuviPanda : around?
[11:39:54] So Finally everything is working.
[11:40:01] i did not use vagrant after all.
[11:40:20] But there is one more thing I want to do. I need to keep the parsoid service running
[11:40:47] When I am ssh'ed into the instance and running parsoid, visual editor will work
[11:41:04] and if I am logged out and parsoid has stopped, ve won't work either
[11:41:09] how do i get past that?
[12:32:11] Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109: Evaluate kubernetes for use on Tool Labs - https://phabricator.wikimedia.org/T107993#1560339 (yuvipanda) Ok, so we can have processes running as a specific uid, which is great. They still run as gid 0, which should hopefully be fixable. We can...
[13:00:45] Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109: Evaluate kubernetes for use on Tool Labs - https://phabricator.wikimedia.org/T107993#1560404 (yuvipanda) So plan: # Have a ServiceAccount per Tool # The ServiceAccount will have an associated service context that enforces uid # Write an authen...
[13:09:36] Labs, Datasets-General-or-Unknown, Labs-Infrastructure, Wikidata: Wikidata JSON entity dumps not being copied correctly on labs - https://phabricator.wikimedia.org/T109830#1560419 (Addshore) NEW
[13:18:00] multichill, YuviPanda, there was a maintenance schedule, also - https://wikitech.wikimedia.org/wiki/Virt_node_upgrade_schedule
[13:19:04] Ah, great, so next time you just need to send out one email a week in advance with a link to this :-)
[13:20:47] …I did, the others were reminders
[13:21:24] *shrug* I understand your complaint, but I’d rather send too many emails than have more people be surprised by reboots.
[13:23:24] Ah, I guess only 48 hours notice though
[13:28:26] Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109: Evaluate kubernetes for use on Tool Labs - https://phabricator.wikimedia.org/T107993#1560460 (yuvipanda) Ok, so that requires the following missing but planned features: # Authentication plugins # More powerful authorization plugins (currently...
[13:43:29] andrewbogott: got a moment?
[13:44:19] YuviPanda: what’s up?
[13:44:19] andrewbogott: I'm trying to understand how keystone works...
[13:44:40] is there a possibility we can expose keystone to labs instances and allow them to authenticate users / service groups?
[13:44:47] or is that wildly out of scope of how we use keystone?
[13:44:54] (I know absolutely nothing about keystone...)
[13:45:55] It’s probably possible. I’m not sure it would be an improvement over talking to ldap though
[13:46:03] and, of course, service groups don’t exist in keystone at the moment
[13:46:38] keystone is probably exposed already. Lemme find a URL
[13:48:47] hm, nope, firewalled
[13:54:11] andrewbogott: so if I'm user yuvipanda, how does keystone know to authenticate me?
[13:54:18] andrewbogott: do I give it my ldap username and password?
[13:54:37] yeah, keystone uses ldap as its backend identity store.
[13:54:38] andrewbogott: I'm mostly curious because keystone auth might become a default plugin in kubernetes (https://github.com/kubernetes/kubernetes/issues/11626)
[13:54:47] andrewbogott: right. so for service accounts, which don't have a password...
[13:54:54] oh, well, that would be a good reason to use it.
[13:55:13] and I guess you need to provide your ldap password to get a token?
[13:55:17] right
[13:55:27] right, so that'll make automation a bit of a pain I guess
[13:55:34] yeah
[13:55:35] since we want people to have it by dint of having succeeded in ssh
[13:57:33] * YuviPanda ponders
[13:58:58] Yeah, I’m not sure how we would get from here to there. There might be other use cases that apply
[14:06:59] andrewbogott: can you try logging into tools-dev?
[14:07:06] andrewbogott: I think there's a runaway process killing NFS from there
[14:07:36] and I can't ssh in
[14:08:13] ah made it in
[14:09:29] andrewbogott: yeah, looks like keystone is a no go for us
[14:09:34] Labs, Labs-Sprint-107, Labs-Sprint-108, Labs-Sprint-109: Evaluate kubernetes for use on Tool Labs - https://phabricator.wikimedia.org/T107993#1560536 (yuvipanda) Meh, identd is a no-go since the Authenticator interface uses a Go HTTP Request object that doesn't expose the underlying socket, so I can'...
[14:36:13] (PS1) Sitic: Don't throw exception when failure is expected [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/232936
[14:36:15] (PS1) Sitic: Fix padding for the subdivided watchlists [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/232937
[14:36:34] (CR) Sitic: [C: 2 V: 2] Don't throw exception when failure is expected [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/232936 (owner: Sitic)
[14:36:42] (CR) Sitic: [C: 2 V: 2] Fix padding for the subdivided watchlists [labs/tools/crosswatch] - https://gerrit.wikimedia.org/r/232937 (owner: Sitic)
[14:46:25] YuviPanda: sorry, was afk for a moment. I’ll look at tools-dev...
[14:47:52] YuviPanda: tools-dev.wmflabs.org == tools-bastion-02? That one?
[14:49:23] andrewbogott: yup, seems transient, is ok now
[14:49:46] YuviPanda: tools.wmflabs.org has been flapping a bit, maybe it’s a recurring thing
[14:50:18] andrewbogott: hmm, that's possible...
[14:50:42] we're back to looking and guessing tho. iftop doesn't show any obvious offenders and neither does iotop
[14:50:47] need to catch it in the act I guess
[14:50:47] yeah
[14:59:20] something like, sudo lsof -N | awk '$5 == "REG" {freq[$2]++ ; names[$2] = $1 ;} END {for (pid in freq) print freq[pid], names[pid], pid ; }' | sort -n -r -k 1,1
[14:59:27] show me procs with the most open nfs files?
[15:31:34] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/Moss was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=174492 edit summary:
[15:35:48] Labs, Tool-Labs: Make sure gridengine-exec starts on boot - https://phabricator.wikimedia.org/T109728#1560666 (yuvipanda) I guess we should put in a service {} stanza.
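[editor's note] The lsof/awk one-liner posted at 14:59:20 counts, per process, how many regular files that process has open on NFS, then sorts the processes by that count. A minimal Python sketch of the same counting step follows, assuming the default `lsof` column layout (COMMAND, PID, USER, FD, TYPE, ...). The sample output, process names, and paths are invented for illustration and do not come from the log; `count_regular_files` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical sample of `lsof -N` output; processes and paths are made up.
SAMPLE_LSOF_OUTPUT = """\
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python   1234 tools  cwd  DIR  0,40   4096     11   /data/project/example
python   1234 tools  3r   REG  0,40   1024     12   /data/project/example/a.log
python   1234 tools  4r   REG  0,40   2048     13   /data/project/example/b.log
php      5678 tools  5w   REG  0,40   512      14   /data/project/other/out.txt
"""


def count_regular_files(lsof_output):
    """Return [(count, command, pid), ...] sorted by count, highest first."""
    counts = Counter()
    names = {}
    for line in lsof_output.splitlines():
        fields = line.split()
        # Mirror the awk filter: the fifth field (TYPE) must be "REG".
        if len(fields) >= 5 and fields[4] == "REG":
            pid = fields[1]
            counts[pid] += 1
            names[pid] = fields[0]
    # most_common() orders by count descending, like `sort -n -r -k 1,1`.
    return [(n, names[pid], pid) for pid, n in counts.most_common()]


if __name__ == "__main__":
    for n, command, pid in count_regular_files(SAMPLE_LSOF_OUTPUT):
        print(n, command, pid)
```

On a real host the input would come from running `lsof -N` with sufficient privileges rather than from the sample string; the sketch only covers the aggregation step.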
[15:54:59] Labs, Wikimedia-Bugzilla: Add a link to the Phabricator task for bugs on bugs.wmflabs.org - https://phabricator.wikimedia.org/T109840#1560717 (Glaisher) NEW
[16:12:08] Labs, Analytics, Labs-Infrastructure, Labs-Sprint-108, Patch-For-Review: Set up cron job on labstore to rsync data from stat* boxes into labs. - https://phabricator.wikimedia.org/T107576#1560754 (akosiaris) And poked again. I 'd be lying if I said I am not still ambivalent on this. I finally un...
[16:27:50] Labs, Analytics, Labs-Infrastructure, Labs-Sprint-108, Patch-For-Review: Set up cron job on labstore to rsync data from stat* boxes into labs. - https://phabricator.wikimedia.org/T107576#1560819 (akosiaris) Open>Resolved Done and checked. Resolving
[16:58:58] Labs, Tool-Labs: toolsbeta-webproxy is unreachable - https://phabricator.wikimedia.org/T109853#1560965 (scfc) NEW
[17:19:46] Labs, Tool-Labs: document the need and usage patterns for special exec hosts - https://phabricator.wikimedia.org/T99067#1561082 (scfc) Besides the points @yuvipanda made, I'd like to add that the current setup seems unstable to me. Those hosts typically idle for most of the month and then burst into acti...
[17:23:42] Labs, Tool-Labs: Make sure gridengine-exec starts on boot - https://phabricator.wikimedia.org/T109728#1561099 (valhallasw) Sounds like a good way to make sure it stays online, but I'm not sure if we should rely on just puppet for starting services (after all, it might take ~20 mins). On the other hand, we...
[17:28:04] Labs, Tool-Labs: document the need and usage patterns for special exec hosts - https://phabricator.wikimedia.org/T99067#1561119 (valhallasw) The last purge of the accounting log was ``` valhallasw@tools-precise-dev:~$ head -n 1 /var/lib/gridengine/default/common/accounting | cut -d: -f11 1394740992 valhal...
[18:08:18] Labs, Tool-Labs: Decommission tools-exec-catscan - https://phabricator.wikimedia.org/T109871#1561325 (scfc) NEW a:scfc
[18:10:35] Labs, Tool-Labs: Decommission tools-exec-catscan - https://phabricator.wikimedia.org/T109871#1561338 (scfc) ``` scfc@tools-bastion-01:~$ qconf -dq catscan scfc@tools-bastion-01.eqiad.wmflabs removed "catscan" from cluster queue list scfc@tools-bastion-01:~$ qconf -de tools-exec-catscan scfc@tools-bastion-...
[18:10:52] Labs, Tool-Labs: Decommission tools-exec-catscan - https://phabricator.wikimedia.org/T109871#1561339 (scfc)
[18:16:00] Labs, Tool-Labs: Decommission tools-exec-catscan - https://phabricator.wikimedia.org/T109871#1561356 (scfc)
[21:23:56] Labs, Labs-Infrastructure: Support cold-migration or suspended migration, or something, between labvirt hosts - https://phabricator.wikimedia.org/T109902#1562055 (Andrew) NEW a:Andrew
[22:51:05] (CR) Jean-Frédéric: [C: 2] "Merging." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/232657 (owner: Jean-Frédéric)
[22:51:16] (CR) Jean-Frédéric: [V: 2] "Merging." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/232657 (owner: Jean-Frédéric)
[23:13:13] Tool-Labs-tools-Database-Queries, Phabricator: Archive Tool-Labs-tools-Database-Queries project - https://phabricator.wikimedia.org/T107699#1562722 (Aklapper) Open>declined a:Aklapper Declining this request per last comment as there seems to be no consensus.
[23:27:20] (PS1) Jean-Frédéric: Alter PHP includes path with one more dirname [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233075
[23:27:22] (PS1) Jean-Frédéric: Add commonscat to $dbFields in API/Monuments.php [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233076
[23:27:24] (PS1) Jean-Frédéric: Increase memory available to StatsBuilder to 512M [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233077
[23:27:26] (PS1) Jean-Frédéric: Fix link to the heritage API in the database statistics wikitable [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233078
[23:27:28] (PS1) Jean-Frédéric: Open the Labs Commons database using charset UTF-8 instead of latin-1 [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233079
[23:29:18] (CR) Jean-Frédéric: [C: 2 V: 2] "Backporting changes from Labs." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233075 (owner: Jean-Frédéric)
[23:29:42] (CR) Jean-Frédéric: [C: 2 V: 2] "Backporting changes from Labs." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233076 (owner: Jean-Frédéric)
[23:30:01] (CR) Jean-Frédéric: [C: 2 V: 2] "Backporting changes from Labs." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233077 (owner: Jean-Frédéric)
[23:30:24] (CR) Jean-Frédéric: [C: 2 V: 2] "Backporting changes from Labs." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233078 (owner: Jean-Frédéric)
[23:30:55] (CR) Jean-Frédéric: [C: 2 V: 2] "Backporting changes from Labs." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/233079 (owner: Jean-Frédéric)
[23:45:32] Labs, Tool-Labs, Patch-For-Review: Replace all references to "tools" by references to $labsproject in operations/puppet and labs/toollabs - https://phabricator.wikimedia.org/T87387#1562838 (scfc)