[00:02:20] bd808, do you know how to restart all apaches? [00:02:37] i tried salt commands but failed https://bugzilla.wikimedia.org/show_bug.cgi?id=36422 [00:06:58] bleh, apache restart barfs - Syntax error on line 20 of /etc/apache2/wmf/hhvm.conf: [00:06:58] FastCgiExternalServer: redefinition of previously defined class "/usr/lib/cgi-bin/php5-hhvm" [00:07:33] ori, ^ [00:12:03] yurikR: Still need to restart apaches? [00:12:24] bd808, yes, but it fails [00:12:25] :( [00:12:31] it gave me the above error [00:12:49] and i had to ssh directly to 01 & 02 [00:13:36] !paste [00:15:35] yurikR: I'll see if I can fix them. Hopefully it's just some leftover config from a role change [00:19:48] yurikR: I got the apaches running again. [00:20:07] I think our config branch has drifted :( [00:20:15] bd808, thx! [00:20:35] bd808, http://zero.wikimedia.beta.wmflabs.org/ is still not working :( [00:20:43] any ideas? [00:21:08] that's the message from mutliversion not finding the wiki config I think [00:22:04] i just added the zerowiki [00:24:41] * bd808 wonders why /a/common on deployment-bastion has uncommitted changes to dblist files [00:30:13] !log deployment-prep `git stash`ed dirty dblist files found in /a/common on deployment-bastion [00:30:15] Logged the message, Master [00:31:28] yurikR: Ok, now zero.wikipedia.beta.wmflabs.org redirects me to http://zero.zero.wikimedia.beta.wmflabs.org/ [00:31:49] bd808, http://zero.wikimedia.beta.wmflabs.org/ [00:31:51] not wp [00:32:22] actually http://zero.wikipedia.beta.wmflabs.org/ should not even exist, not sure what's doing that [00:33:57] * bd808 wonders if this is missing varnish config [00:41:03] yurikR: I'm not sure what's wrong, but I get the domain not found page directly from the apaches too. [00:41:30] could it be that i got that .conf file wrong? 
or is that file not included somewhere/ [00:41:36] bd808, ^ [00:41:39] I'm randomly guessing that some config needs to be added in the betacluster branch of operations/apache-config [00:42:20] bd808, oh, its missing from sites [00:42:39] adding a patch... [00:43:15] You might try to figure out how to cleanup the hhvm error there too. I hacked the local checkout to get the apaches to start [00:43:56] can you commit it? [00:44:10] bd808, pls +2 https://gerrit.wikimedia.org/r/137843 [00:44:44] bd808, and i will +2 your patch to fix hhvm ) [00:52:36] bd808, apache01 still has wikimedia.conf as a file link [00:52:41] runnig puppets there [00:53:02] Something it very strange about the state of the git repo [00:53:17] yurikR: Look here -- https://github.com/wikimedia/operations-apache-config/tree/betacluster [00:53:33] wikimedia.conf is a symlink but with file contents? [00:54:04] i run git on windows, maybe it messed up somehow [00:54:17] Ah. ok. [00:54:42] Yeah it's messed up. Give me a few minutes and I'll fix (bio break needed) [01:03:19] bd808, https://gerrit.wikimedia.org/r/#/c/137848/ [01:03:33] self +2ing [01:03:58] yurikR: I already did [01:04:02] that patch [01:04:19] https://gerrit.wikimedia.org/r/#/c/137846/ [01:06:57] yurikR: Now I think the problem is varnish has cached the error page [01:07:11] I can get the content direct from an apache with the right curl command [01:08:00] yei! [01:08:10] rebooting varnish is needed [01:09:10] Ok. now the problem is the https redirect [01:09:18] yep [01:09:26] i just restarted appaches on 1&2 [01:09:35] i think that requires apache config change again [01:09:41] i just copied prod [01:10:20] I'll make a patch... [01:10:47] bd808, https://gerrit.wikimedia.org/r/137849 [01:11:02] sorry, just saw your msg :) [01:11:16] You need to keep the RewriteEngine On line [01:11:33] There are other rewrite rules in there. [01:11:55] bd808, done [01:13:45] bd808, its alive!!! 
[01:13:47] http://zero.wikimedia.beta.wmflabs.org/wiki/Main_Page [01:13:55] redirect is still varnished, but otherwise works [01:14:10] I just kicked the varnish too :( [01:14:36] bd808, there are more than one [01:14:45] backend + frontend [01:14:50] on each machine [01:14:52] plus mobil/e [01:15:07] deployment-cache-text02 hit (20), deployment-cache-text02 frontend hit (7) [01:16:30] I restarted varnish and varnish-frontend on deployment-cache-text02. Not sure what's up there [01:25:24] !log deployment-prep Live hacked /etc/apache2/wmf/hhvm.conf on apaches to allow them to start [01:25:27] Logged the message, Master [01:26:12] bd808, want to submit it as a patch? [01:26:53] I could, but I'm instead going to file a bug and leave the live hack to remind me to figure out why it's broken now [01:28:13] bd808, varnish is still not doing it right [01:28:26] i just restarted both varnish & varnish-frontend [01:28:53] oh, need to expire it [01:30:03] Wikimedia Labs / deployment-prep (beta): Apaches refuse to start due to hhvm config - https://bugzilla.wikimedia.org/66234 (Bryan Davis) NEW p:Unprio s:normal a:None I have live hacked this on the apaches by commenting out most of hhvm.conf. Needs a real fix. [01:32:02] bd808|BUFFER, fixed: varnishadm "ban req.url ~ /" [01:32:02] && varnishadm -n frontend "ban req.url ~ /" [02:00:15] yurik: Cool. Glad you got it working. [02:48:57] !log deployment-prep added role::labs::lvm::biglogs to deployment-salt because it is out of room on /var and I don't know what I can delete [02:49:00] Logged the message, Master [03:08:30] Wikimedia Labs / tools: Enable OpenJDK 8 - https://bugzilla.wikimedia.org/66171 (Tim Landscheidt) a:Marc A. Pelletier [03:47:44] mwalker: Thanks for adding the space on deployment-salt. [03:48:40] And additional thanks for logging that you did it. [04:10:28] bd808|BUFFER, doesn't seem to have applied though; I might have to restart it [06:16:13] Good morning. I made a job named task1.sh. 
inside I wrote "#!/bin/bash python compat/lonelypages.py -new -always" and then chmod +x task1.sh and jsub task1.sh. It worked but when I close Putty job stops. Can it work continuously? Thanks [08:06:02] * yurik wonders how much greg-g will be upset if i +2 another labs-only wmf-config file without doing prod sync... [08:07:29] https://gerrit.wikimedia.org/r/#/c/137892/ [08:07:57] ori, any thoughts on that? I really don't want to touch production at this time [08:08:12] hello. does tool-labs support python wsgi? [08:15:16] Hi. I am configuring an instance in wmflabs, is there a documentation done anywhere on how the database should created and managed? [08:16:05] I see the one for tools here: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help [08:16:30] rohit-dua: Yes. [08:17:02] sucheta: _The_ database, or _a_ database? [08:17:39] a930913, _a_database. [08:17:55] I've build my web-application in python-cgi. Should I convert all of it to wsgi-compatible? [08:20:27] rohit-dua: See https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Web_services in particular the example configurations. [08:59:10] a930913, Um, could you help? [09:19:23] sucheta: Sorry, distracted IRL :p [09:21:35] sucheta: You have your own instance, so you can just set up your own db? Unless you're wanting to use the existing dbs? [09:46:38] a930913: Setting up my own db..but as tools had some steps to do that, I don't see a documentation for the same if I want to do it in labs server. [09:46:56] Is it the usual? [09:49:27] sucheta: Tools has a whole instance or something dedicated to being a database server. [09:49:48] a930913, Yeah, and labs? [09:51:55] sucheta: Can't you just install a database daemon locally? [11:15:28] a930913, hmm, what would be the mysql root password in labs ? [11:15:57] sucheta: Labs or tools? [11:16:28] a930913, Labs. [11:16:59] sucheta: What server? [11:17:27] a930913, MySQL server. [11:17:48] sucheta: I mean what instance? [11:18:10] a930913, Oh. 
language-lcmd. [11:18:36] sucheta: Who runs it? [11:51:59] a930913, arrbee, I think. Well I removed and reinstalled. [11:52:18] a930913, BTW, what would be the public IP for language-lcmd? [12:17:05] (PS1) BBlack: add labs copy of zerofetcher auth file [labs/private] - https://gerrit.wikimedia.org/r/137918 [12:17:34] (CR) BBlack: [C: 2 V: 2] add labs copy of zerofetcher auth file [labs/private] - https://gerrit.wikimedia.org/r/137918 (owner: BBlack) [12:19:37] ^ is there some process akin to puppet-merge for labs-private commits? [12:20:06] (to get them deployed on actual hosts post-merge, I mean) [12:41:54] bblack: hello [12:42:13] bblack: there is a cron job that pull the git repositories every minute [12:42:38] bblack: though for the beta cluster it is probably slightly different because we use a local puppetmaster [12:43:18] !log integration unbroken puppet on the jenkins slaves. We had some dupe definitions. Patches uploaded in Gerrit and cherry picked on local puppetmaster [12:43:20] Logged the message, Master [12:44:25] sucheta: Well then, you would need to talk to him for the passwords. [12:44:51] !log deployment-prep Updated labs/private.git on puppetmaster. Brings Brandon Black change "add labs copy of zerofetcher auth file" {{gerrit|137918}} [12:44:53] Logged the message, Master [12:44:58] bblack: updated :] [12:45:35] hashar: thanks! [12:45:47] sucheta: You need to assign a subdomain for the reverse proxy, because there aren't enough IPs to go round. [12:46:04] (I'm really out of my depth here btw.) [12:47:15] sucheta: I can help if needed. [12:47:31] a930913, She wasn't sure. So, I removed and reinstalled. But the with all the database configuration and the relevant clone., I cannot really access lcmd.wmflabs.org which is apparently where the dnsproxy is set to. [12:47:40] hashar, ^ [12:48:18] hashar, I am talking about the instance language-lcmd. [12:49:34] sucheta: can you summarize / give me the context? :D [12:50:59] hashar, Yes. 
My bad. :D OK, So I am configuring the instance language-lcmd. And, I have the clone I need under /var/www, have the database set up. [12:52:00] hashar, And, I was wondering what would be the public IP to access the instance. Apparently there are none, and the dnsproxy is set to lcmd.wmflabs.org. [12:52:01] I am on the instance (kartik gave me access to the 'language' project a few weeks ago) [12:52:15] Oh good. [12:52:26] so yeah instances only have a 10.x.x.x address [12:52:31] which is private / unreachable from outside [12:52:41] then the web proxy should make it accessible [12:53:03] and there is no entry for the instance [12:53:03] https://wikitech.wikimedia.org/wiki/Special:NovaProxy :D [12:53:17] But lcmd.wmflabs.org doesn't really work for me :/ [12:53:27] yup you have to add the entry in the webprox [12:53:28] y [12:53:38] we can do it together with a Google Hangout if you want [12:55:13] That would be nice. [12:55:16] Now? [12:55:24] sure [12:55:43] amusso@wikimedia.org [12:56:11] * a930913 runs to sign hashar up to many spams. [12:56:27] gmail takes care of the spam for us! [12:56:50] Hehe. [12:57:35] hashar: Which is why I've engineered the spam to slowly resemble the regular mail that you get. [12:57:50] So half your real mail ends up in spam too :p [13:02:30] sucheta: magic command : puppetd -tv [13:02:37] sucheta: that runs every half hour or so iirc [13:04:09] sucheta: so you just missed php5-mysql [13:04:26] sorry if I confused you with the puppet config. It was probably not needed. [13:05:41] hashar, It has been thoroughly helpful, many thanks! :D [13:06:03] happy hacking! [13:13:25] !log integration migrating instances to puppet 3 {{gerrit|137898}} [13:13:27] Logged the message, Master [13:34:38] In Service group list I removed my account name "جافيد" from tool "jbbot" because my email and then my account was hacked, please put my name account to the "local-jbbot" again. Thanks [13:44:33] And he's gone already. 
[13:48:18] heh [13:54:41] YuviPanda: BTW, at the moment we hinge whether to add a link to the list of tools at http://tools.wmflabs.org/ on the existence of /data/project/$TOOL/public_html/index.*. This doesn't work for example for rohit-dua who has a different setup with URL rewrites. So I thought about looking for SGE jobs with "lighttpd-$TOOL", but some custom web apps use "lighty-$TOOL" instead (or maybe even something completely different). The "source" [13:54:41] of course is Redis on tools-webproxy, so I'd want to pull the data from there. Question is, can I do this with nginx/lua without setting up yet another server? That is, have nginx on http://tools.wmflabs.org/admin/webservices.json a) if not from 10/8 => 403, b) otherwise dump the Redis table? [13:54:52] scfc_de: I was trying to figure out how to tell him "how are we supposed to know it's you?" [13:55:42] scfc_de: you can make lua easily serve out stats / queries on a prefix [13:56:51] scfc_de: just specify a prefix in urlproxy.conf and have it use the access_by_lua_file directive (or whatever directive we use otherwise as well). It can serve whatever you want [13:58:00] scfc_de: should be fairly trivial. you can test things on tools-trusty-test as well (it serves tools-proxy-test.wmflabs.org but the redis replicates from tools-webproxy so is up to date on routing tables) [13:58:16] Coren: I wouldn't know either :-) (cf. https://wikitech.wikimedia.org/wiki/User_talk:Tim_Landscheidt#Shell_block). Are changes to the mail address in MediaWiki preferences logged? [13:58:29] scfc_de: so reccomended route would be to write / test them live on tools-trusty-test, then submit a patch [13:58:42] scfc_de: only question is what prefix to use. I'd suggest '/proxy' [14:00:37] YuviPanda: Will look into that, thanks. Problem with /proxy could be name collision, that's why I wanted to stay within /admin (or something other established). (Yes, proxy would be a strange tool name ...) 
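The access rule scfc_de proposes for /admin/webservices.json (403 unless the client is in 10/8, otherwise dump the routing table) can be sketched in Python. This only illustrates the logic; it is not the nginx/lua code that would run on tools-webproxy, and the function name and the sample routing table are hypothetical.

```python
import ipaddress
import json

# 10/8 is the internal range named in the discussion.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")


def webservices_response(client_ip, routing_table):
    """Return (status, body): 403 for outside clients, a JSON dump otherwise."""
    if ipaddress.ip_address(client_ip) not in INTERNAL_NET:
        return 403, ""
    # In the real setup the table would be read from Redis on the proxy;
    # here it is passed in as a plain dict (assumption).
    return 200, json.dumps(routing_table, sort_keys=True)


# Hypothetical routing table: tool name -> backend host:port.
sample_table = {"bub": "tools-webgrid-01:4000"}
print(webservices_response("10.4.1.2", sample_table))
print(webservices_response("192.0.2.7", sample_table))
```

Keeping the check as a pure function of the client address makes it easy to try out before porting the same rule to a lua handler.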
[14:00:59] scfc_de: yeah, we could reuse admin as well. [14:03:23] (Current mail address is javedbaker3@ which could either be a hack or a common name.) [14:45:08] Something just unclogged an Echo queue on wikitech. I'm pretty sure I didn't create any instances 18 minutes ago :-). [14:51:57] Does Magnus IRC? [14:57:30] a930913: not mostly [14:59:38] Good day. I logged in to Putty and I wrote "become jbbot" to be a tool but I got this message "sudo: sorry, a password is required to run sudo" how to fix it? [15:00:57] scfc_de: Coren ^ [15:02:44] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#I.27m_being_prompted_for_a_password_when_I_try_to_.27become_my-tool-account.27._What.27s_wrong.3F [15:04:20] javedbaker: How do we know that you are you and not someone else? :-) [15:11:10] hi. i did an scp file to my tools-account. the ownership of that file is for rohit-dua. how do I change it to tools.bub(my tool). chown gives Operation not permitted [15:11:53] rohit-dua: You can, after "become TOOL", use "take FILE" to take ownership of that file. [15:12:42] scfc_de: thank you. ::)) [15:17:15] scfc_de: PM? [15:29:56] scfc_de: thoughts on adding trusty exec nodes, and adding a param to jsub that submits to them? should be an easy way to transition things off 12.04 in the long term [15:31:59] Wikimedia Labs: Implement ability to search wikitext of current Wikimedia wiki pages with regular expressions (regex) - https://bugzilla.wikimedia.org/43652#c10 (Nik Everett) NEW>PATC https://gerrit.wikimedia.org/r/#/c/137733/ The patch isn't really fully ready but its on its way. Before we merge... [15:32:50] Sounds good to me. I'd have to delve deeper into this SGE queue business though, also for the outstanding bugs (limit running jobs per user, provide user_slot resource). [15:33:20] Should they be named differently, or just tools-exec-11, -12, etc.? [15:33:43] scfc_de: hmm, good question. 
my ultimate goal is to move all the nodes to trusty [15:35:02] I think naming the webgrid nodes differently makes sense, but we could cope with exec-nodes >= n being Trusty (and after some time all). [15:35:40] YuviPanda: Trusty nodes should keep the same name; but also we need to have a resource on the grid so that people can set a job to request trusty vs request precise. [15:36:05] scfc_de: yeah, and eventually we can backfill them if needed. [15:38:44] Wikimedia Labs / tools: Create views for user_daily_contribs table - https://bugzilla.wikimedia.org/61300 (Marc A. Pelletier) PATC>RESO/FIX [15:38:59] Wikimedia Labs / tools: Tool Labs: Provide filtered view of user_properties table containing short list of properties, linked to userID - https://bugzilla.wikimedia.org/64115 (Marc A. Pelletier) PATC>RESO/FIX [15:39:15] Wikimedia Labs / tools: Provide filearchive table with fa_storage_key or, if it exists and is sufficiently indexed and populated, fa_sha1 for commonswiki - https://bugzilla.wikimedia.org/57697 (Marc A. Pelletier) PATC>RESO/FIX [15:40:09] "ubuntu_precise"/"ubuntu_trusty"? (=> "I request an ubuntu_precise".) [15:40:52] !log deployment-prep Disabled puppet on deployment-salt to work on disk space issues [15:40:54] Logged the message, Master [15:43:01] scfc_de: why the ubuntu_ prefix? are we considering adding arch exec nodes later? :P [15:43:32] !log deployment-prep Archived deployment-salt:/var/log to /data/project/deployment-salt [15:43:34] Logged the message, Master [15:45:34] !log deployment-prep /var on deployment-salt still at 97% full after moving logs; /var/lib is our problem [15:45:35] Logged the message, Master [15:47:18] YuviPanda: "trusty"/"precise" alone didn't feel self-explanatory to me, and ubuntu_ was the first thing that popped up. But: Bikeshedding :-). [15:47:31] scfc_de: can we call them green and blue instead? 
:) [15:48:52] All of that said, I'd rather we don't do anything w/ trusty on exec nodes before my return from my honeymoon (Jun 26) [15:49:32] There may be delicate issues with the version of gridengine in trusty that might require a prior backport and grid-wide upgrade. [15:49:52] hmm, hadn't thought of that. [15:50:13] More time to think about the resource name :-). [15:51:15] scfc_de: pink? how about pink? [15:56:30] Coren: I need to redo the labs_lvm partition on deployment-salt. I have your instructions to Andrew in backscroll, so thanks! [15:57:23] bd808: Tell me if you hit a snag so I can unsnag you and adjust said instructions. :-) [15:57:33] Coren got married? :o [15:57:36] Excellent, will do [15:57:44] a930913: About to, in 8 days. [15:57:57] Coren: Congrats \o/ [15:58:24] Coren: Making the most of the next 7 days then? [15:59:03] a930913: Heh; after 20 years together, there's very little actual change upcoming except a very good excuse for a big party and some vacation time. :-) [15:59:26] Also: work. :-) [15:59:51] heh [16:00:51] (PS1) BBlack: update beta netmapper password [labs/private] - https://gerrit.wikimedia.org/r/137946 [16:01:51] (CR) BBlack: [C: 2 V: 2] update beta netmapper password [labs/private] - https://gerrit.wikimedia.org/r/137946 (owner: BBlack) [16:03:35] hashar: how did you do the deployment-prep update earlier? [16:03:46] (to push the above? I want to know so I don't have to bug people) [16:05:10] or anyone else know what the correct way is to push labs/private.git updates to betalabs? [16:08:00] bblack: ssh to "deployment-salt", sudo -s, cd /var/lib/git/operations/puppet, git fetch && cherry-pick [16:08:57] labs/private is there too at /var/lib/git/labs/private [16:09:51] You have to do `GIT_SSH=/var/lib/git/ssh git ` in the private repo. 
git fetch; git rebase origin/master should pull in merged things [16:10:26] * bd808 is doing disk surgery on deployment-salt at the moment [16:10:59] i got a notification that i had created instance dwl in prohect dwl today, but i didn't?? [16:16:19] Coren: Instructions to drop the lv seemed to work fine. [16:17:33] bd808: \o/ [16:18:42] !log deployment-prep Changed from role::labs::lvm::biglogs to role::labs::lvm::srv on deployment-salt and made /var/lib a symlink to /srv/var-lib [16:18:43] Logged the message, Master [16:19:00] * bd808 crosses fingers that this works [16:23:30] ori: there's no host deployment-salt in prod dns, what is it really? [16:23:55] deployment-salt.eqiad.wmflabs [16:26:51] !log deployment-prep Updated labs/private.git on puppetmaster. brings in updated zero+netmapper password for beta [16:26:54] Logged the message, Master [16:27:43] !log deployment-prep Made /var/log a symlink to /srv/var-log on deployment-salt [16:27:45] Logged the message, Master [16:28:45] bblack: I'd like to reboot deployment-salt to make sure all the disk changes I just made stick. Let me know when that's safe for what you are doing. [16:29:05] it's safe [16:29:11] cool. thanks [16:30:15] !log deployment-prep Rebooted deployment-salt [16:30:17] Logged the message, Master [16:36:54] gifti: Probably an Echo queue got unclogged. I received those as well, just ignore. [16:46:07] scfc_de: i got that twice [16:46:19] once the original [16:46:29] and once today [16:46:35] hm [16:52:55] gifti: Me, too. [17:02:17] ah [17:17:41] does tool labs have any special rules about slow db queries? [17:18:52] jackmcbarn: Not in particular; though we do occasionally snipe queries that last a long time /and/ lock tables (i.e.: don't write to a temp table from a long query) [17:33:05] Coren: Do we have any stats on web hits per tool/most popular etc? [17:33:35] I've never had them collected, but I could probably munge the logs to find out. 
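A rough sketch of what "munging the logs" for hits per tool might look like. It assumes access-log lines in the common/combined format, where the request appears as "GET /toolname/... HTTP/1.1"; the actual proxy log format is not given in the discussion, and the function name and sample lines are made up for illustration.

```python
import re
from collections import Counter

# Capture the first path segment of the request, e.g. "geohack" in
# "GET /geohack/geohack.php?x=1 HTTP/1.1". Requests for "/" do not match.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) /([^/ ?"]+)')


def hits_per_tool(lines):
    """Count requests per tool, keyed by the first URL path segment."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines; real input would come from the proxy's access log.
sample = [
    '10.0.0.1 - - [01/Jan/2014:00:00:01 +0000] "GET /geohack/geohack.php?x=1 HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2014:00:00:02 +0000] "GET /heritage/api/api.php HTTP/1.1" 200 128',
    '10.0.0.1 - - [01/Jan/2014:00:00:03 +0000] "GET /geohack/ HTTP/1.1" 200 64',
    '10.0.0.1 - - [01/Jan/2014:00:00:04 +0000] "GET / HTTP/1.1" 200 1024',
]
for tool, hits in hits_per_tool(sample).most_common():
    print(tool, hits)
```

Because the function accepts any iterable of lines, an open file object can be passed directly without loading the whole log into memory.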
[17:34:42] Though I can certainly tell you the top one off the top of my head: geohack [17:35:28] Would be nice to make a statistics feed of that and put it in one of the analytics nodes [17:36:59] Can you see how many hits //tools.wmflabs.org/heritage/* gets Coren? I updated some links to point there now [17:38:30] Well, if you want numbers for your /own/ tools you can simply look it its own access.log. :-) [17:39:19] Isn't there a proxy in front of it? Wouldn't that cache at least some of the hits? [17:39:54] multichill: It's not a caching proxy, it anonymises and balances only. [17:40:16] Ah, could it cache if I send the right headers? [17:41:37] multichill: We didn't turn on caching at all, though it's not a bad idea I suppose. It certainly won't harm if you add Cache-Control headers [17:43:44] multichill: Re stats, cf. https://wikitech.wikimedia.org/wiki/User_talk:Petrb#m:toollabs:awstats and https://bugzilla.wikimedia.org/59222. [17:46:18] scfc_de: Ah, nice. Would be nice to have the country codes in the access log btw.... [17:47:03] multichill: Specifically forbidden. Sorry. [17:47:20] By whom and why? [17:48:28] Legal, privacy. For most countries, it wouldn't matter but being able to match geographical location with tool usage is verbotten. Then again, for _aggregated_ stats it might be okay-ish, but it'd need an okay from Legal (which, I expect, wouldn't be too hard to make a case for) [17:50:37] It might be relatively easy, for instance, to have a rarely-to-never used tool being linked to someone who want to figure out the nationality of and correlate the stats to their access even though a project checkuser wouldn't want to do a CU [17:52:08] The tool labs (enduser) privacy policy is actually slightly /more/ stringent than the project policy because we want to be allowed to link/embed to tools from the projects. 
[17:53:15] Aggregation and minimum hits should take away the privacy problems [17:54:27] multichill: Like I said, it's probably possible to make a reasonable case to Legal about a way this could be done. [17:54:41] But by default, atm, it's not allowed. [18:15:53] how can we see which jobs on grid ended with an error. (not with any specific job name/id, but for all jobs) [18:16:50] rohit-dua: You can do 'qadm -u ' but that'll give you /all/ jobs. [18:17:09] So you'll need to grep a bit to find what you want, or process the output in some way. [18:18:56] qadm? not found [18:19:09] Coren: qadm command not found. where [18:19:22] robla: qacct, sorry [18:20:30] um.... is there a big security problem with the labs/private.git - for it is public for some reason ... or is the README file misleading? [18:21:12] yurikR: what does the readme say? [18:21:23] In theory it says "This is public and these are dummy passwords" [18:21:54] http://git.wikimedia.org/blob/labs%2Fprivate.git/1ef907c76c9401012cb6c8aa0dee800eb1b4db19/README [18:22:23] seems largely accurate…? [18:22:40] andrewbogott, so i guess its by design? i think text should say "IT IS PUBLIC!" or something like that :) [18:22:52] all that text about "partially obscure" is kinda weird :) [18:22:54] "Despite the name, the contents of this repo is visible to all Labs account holders." [18:23:10] which is not acurate - it is totally public on the site [18:23:30] "Also this repo is or at least was at one point publicly accessible without any [18:23:30] authentication." 
[18:23:45] andrewbogott: it's currently still visible to the world [18:23:59] yeah, I don't know why it says 'or at least was' [18:24:05] http://git.wikimedia.org/tree/labs%2Fprivate.git [18:24:36] "For that reason, please do not share the contents of this repo outside of Labs" [18:24:44] seems silly given we publish it to the world :) [18:25:02] (PS1) Andrew Bogott: Updated README again [labs/private] - https://gerrit.wikimedia.org/r/137966 [18:25:43] anyways, a different question: I wanted to push an ops/puppet change to betalabs. I have the earlier stuff down about going into the correct directory and fetching, etc [18:25:50] (CR) Yurik: [C: 2] Updated README again [labs/private] - https://gerrit.wikimedia.org/r/137966 (owner: Andrew Bogott) [18:25:54] do we really cherrypick into there? currently it's showing 6 ahead and 33 behind [18:26:29] (CR) Andrew Bogott: [V: 2] Updated README again [labs/private] - https://gerrit.wikimedia.org/r/137966 (owner: Andrew Bogott) [18:27:26] bblack: If it's out of sync then probably you can nag hashar about that… it doesn't update automatically. [18:27:36] multichill: yeah, getting aggregated stats from the proxy is something I've been meaning to work on in a bit [18:27:40] multichill: also, PM? [18:27:43] In theory cherry-picking a test patch there is the right thing to do. [18:27:49] it seems intentional, there are some beta-only commits in there that are unmerged in origin/production [18:28:11] what I have is 3 commits that *are* merged to origin/production I'd like to put in place on betalabs, among the 33 we're behind on [18:28:24] cherry-pick those, or rebase the whole thing onto all 33 and bring it up to date? [18:28:31] I can't think of a reason why there should be any behind patches. [18:28:45] But might be best to check with hashar or another beta admin before rebasing. 
[18:28:59] Wikimedia Labs / deployment-prep (beta): Apaches refuse to start due to hhvm config - https://bugzilla.wikimedia.org/66234#c1 (Antoine "hashar" Musso) Is that the Apache hhvm.conf ? If so we have it in operations/apache-config.git in the betacluster branch https://github.com/wikimedia/operations-apache... [18:29:02] http://paste.debian.net/hidden/0287a500/ [18:30:32] yesterday i was trying to restart apache on labs, and it failed due to errors in hhvm config. hashar did some magic without commiting it [18:33:54] In Service group list I removed my account name "جافيد" (Member) next to "jbbot" (Service group name) because my email hacked, please put my name account to the "local-jbbot" again. Thanks [18:34:34] javedbaker: How do we know that you are you and not someone else? :-) [18:35:07] please mail a DNA sample to the WMF offices :) [18:37:59] Actually this is a good question, but Im not admin to know, but of course you can know easily from IP address. [18:38:54] scfc_de: is there no log of service group membership changes? [18:40:30] ori, i think you were the last one to play with the hhvm.conf on the labs, and there were issues with apache reboots on labs yesterday. Arey ou still working on them? [18:41:59] Wikimedia Labs / deployment-prep (beta): Apaches refuse to start due to hhvm config - https://bugzilla.wikimedia.org/66234#c2 (Bryan Davis) (In reply to Antoine "hashar" Musso from comment #1) > Is that the Apache hhvm.conf ? If so we have it in > operations/apache-config.git in the betacluster branch... [18:42:14] ahhhh. hello world! [18:42:46] javedbaker: Those would all be consistent for an impostor as well :-). Still, I'm inclined to believe "you". What about the tools javedbaker and javed? [18:43:54] valhallasw: I take (hourly, I think) snapshots of those, but here the question is whether javed is javed. That his account was a member of that service group is clear. [18:45:34] scfc_de: that doesn't really matter right? 
If A loses access to his/hers tools, and B requests to re-instate A as admin, there isn't really an issue. [18:46:05] but in general, it would be good if people could not remove the last remaining member of a project :-p [18:46:22] bblack: Our cherry picks should be based on top of the production branch. I can go in and try to fix that as it was "correct" yesterday evening when I last updated. [18:47:42] Hmm. backscroll says Coren gets married. 20 years. Always these hasty decisions :P. Congrats, anyway. [18:47:54] bblack: Looks like someone just did a git fetch without a rebase. It's fixed now [18:48:49] bblack: Instructions on how to do puppet git manipulation for beta at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/How_code_is_updated#Cherry-picking_a_patch_from_gerrit [18:52:06] valhallasw: +1 to your last suggestion. In this case, javed/someone posing as him wrote on my talk page that his account got hacked and his access should be removed (https://wikitech.wikimedia.org/wiki/User_talk:Tim_Landscheidt#Shell_block). So it matters (as far as it matters for Tools accounts which we give away for free) in that B would gain access to A's tools (which I assume doesn't matter as javed doesn't seem to have stored any [18:52:06] passwords in them). [18:52:44] scfc_de: ah, right. [18:53:35] But that he stays in IRC only for minutes at a time doesn't make me feel very cozy. [18:53:44] Wikimedia Labs / deployment-prep (beta): Apaches refuse to start due to hhvm config - https://bugzilla.wikimedia.org/66234 (Andre Klapper) [19:14:34] bd808: fetch doesn't affect the checkout, though. 
I did the fetch [19:15:00] the plan was fetch then either rebase or cherrypick, but that's when I stopped to ask the question (rebase onto 33 new commits or cherrypick out my 3) [19:15:58] I guess the confusion was that all I needed was the rebase, but it sounded dangerous being so far behind (since the changes I wanted were already merged upstream) [19:16:09] bblack: *nod* The working copy state is now "Your branch is ahead of 'origin/production' by 6 commits." which is what I expect it to look like. [19:16:52] We don't have an automatic update process there yet, so it gets updated manually "when needed" [19:17:29] I want to get it updating every half hour via cron/jenkins but haven't gotten that done yet [19:18:07] I think there was a big burst of generic::systemuser changes today that made the diff sound worse than it really was too [19:23:29] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c5 (metatron) Any progress on this thing? As already mentioned, both nginx-proxies (domainproxy & urlsproxy) went live. Thus it should be knickknack to run some sed to sanitize the logs - an... [19:24:13] (PS1) BBlack: change netmapper pw again [labs/private] - https://gerrit.wikimedia.org/r/137985 [19:24:32] (CR) BBlack: [C: 2 V: 2] change netmapper pw again [labs/private] - https://gerrit.wikimedia.org/r/137985 (owner: BBlack) [19:26:06] !log deployment-prep - synced labs/private on deployment-salt again [19:26:08] Logged the message, Master [19:27:18] ok one more labs-noob question: [19:27:51] the answer is 23 :P [19:28:08] on betalabs instance deployment-cache-mobile03, I'm trying to connect to http://zero.wikimedia.beta.wmflabs.org/ - can't ping it all [19:28:18] I'm guessing we don't nat traffic to the outside by default, and that is a public IP [19:28:32] how do we generally deal with this? custom DNS overrides in betalabs to use the instance IP? [19:28:57] ... 
or: if the answer isn't beer or women - it's the wrong question. [19:30:12] technically the answer is always beer or women, as those can be used to bribe someone else to answer your question [19:30:26] :-D [19:30:59] (or wine or men) [19:31:01] bblack: We don't have split horizon dns so the *.beta.wmflabs.org maps to the varnish hosts themselves I think... let me look in wikitech [19:31:44] zero.wm.beta was just set up yesterday, it could be something isn't set up right on it for this [19:31:46] bd808: bblack I can confirm the NAT issue :D [19:32:00] bblack: https://wikitech.wikimedia.org/wiki/Special:NovaAddress shows the mappings [19:32:08] there is a puppet class beta::natfix that lists the map of public IP to the instance private IP [19:32:23] it is used on various instances across labs projects [19:32:34] ah ha. I would have remembered that eventually [19:33:10] ok, so that tells me that zero.wikimedia.beta.wmflabs.org is actually deployment-cache-text02 [19:33:35] luckily bblack wrote a DNS daemon. So I guess we can file a feature request to add split horizon in gdnsd and migrate wmflabs to gdnsd *grin* [19:33:49] that one is the generic text varnish [19:33:55] just like the production text varnish [19:34:06] we also have a mobile varnish that serves the *.m urls [19:34:59] right, well, all that aside... [19:35:02] bd808: puppet 3 is ready! andrewbogott proposed to switch the beta cluster to puppet3 on Monday. [19:35:27] bblack: but the zero.wikiPedia.beta.wmflabs.org is sent to deployment-cache-mobile03 [19:35:32] bblack: might want to add an entry there for wikiMedia [19:35:37] bblack: Actually zero.wikimedia isn't mapped there at all is it? [19:35:55] there's a script on deployment-cache-mobile03 that wishes to make an http connection to hostname "zero.wikimedia.beta.wmflabs.org", which is the public IP for deployment-cache-text02 [19:36:04] (I should say, it's *a* public IP for ...)
[19:36:17] how do I fix it so that hostname resolves as the private IP within betalabs? [19:36:24] you can't [19:36:32] ok :) [19:36:42] that resolves to the public IP so you will need the beta::natfix that rewrites the dest IP from the public IP to the private one [19:37:01] there's no dest IP to rewrite, the configuration in question is by-hostname [19:37:10] IP lookup happens down inside a python script [19:37:34] Change /etc/hosts on that box? /me pukes [19:37:46] this has to be a common issue with a normal solution, right? [19:38:27] bblack: We use private ips in the php config files rather than hostnames [19:39:05] yeah but if I change the script's config to use the IP directly (or the hostname deployment-cache-text02), it won't send the right host header to hit zero.wikimedia.beta.wmflabs.org [19:39:42] nothing we do today does betalabs -> betalabs http requests by-hostname? [19:39:49] It would be pretty easy to add some Host['foo.example.com'] puppet defines in a beta::* manifest [19:39:50] we do [19:39:53] the rewrite is at https://github.com/wikimedia/operations-puppet/blob/production/modules/beta/manifests/natfix.pp [19:40:04] which calls the define https://github.com/wikimedia/operations-puppet/blob/production/modules/beta/manifests/natdestrewrite.pp [19:40:23] so the script on the mobile03 would look up the zero dns entry which is the public IP of cache-text02 [19:40:39] ah! [19:40:49] then establish a connection to the public IP.
The beta::natfix rule rewrites the dest to the private IP of the instance that hosts cache-text02 [19:41:48] YuviPanda: Back, had to watch the World Cup :-) [19:41:54] multichill: hah :) [19:41:59] ok, so the IP in question is already in the list in beta::natfix [19:42:08] bblack: the nat table on an apache server is http://paste.openstack.org/show/83170/ [19:42:09] https://en.wikipedia.org/wiki/2014_Men%27s_Hockey_World_Cup <- the more fun WC : [19:42:26] multichill: for all I'm concerned the most fun WC is in 2015 :P [19:42:34] I guess beta::natfix just isn't applied on deployment-cache-mobile03 [19:42:35] multichill: http://en.wikipedia.org/wiki/2015_Cricket_World_Cup [19:42:50] Not all hosts in beta have the static nat rules applied mostly because it turns on ferm firewall management [19:42:58] bblack: yup i haven't added it on the varnish cause they never have to access the public IP [19:43:06] which then means that you have to make ferm rules for all inbound services [19:43:15] and what bryan said: default ferm policy is to drop [19:43:47] Ha! The World Cup is fun. Both Men's and Women's are here in the Netherlands so I get to watch a fun game every night [19:44:07] * bd808 is glad hashar shoed up and made things less confusing [19:44:17] multichill: :D btw, tools-mongo is a Tools MongoDB instance that should go live next week :) [19:44:19] s/shoed/showed/ [19:44:29] * YuviPanda shoes bd808 [19:44:36] bd808: ho I am sure you will have found out eventually. My bad for not documenting stuff :( [19:44:37] ok that sounds like a pain, I'm gonna go make some coffee :P [19:44:49] * bd808 ties a double knot for YuviPanda [19:44:59] * YuviPanda makes bd808 cut a steak up as well [19:45:01] <3 [19:45:14] Ah nice, monuments database is Mysql. I'm not going to change that.
I'm just going to import everything to Wikidata and abandon it :P [19:45:16] bblack: easiest would probably be to have your script point directly to the IP instead of using the hostname [19:45:26] bblack: hard way is to get us dns split horizon in labs [19:45:35] * bd808 tries to get back to writing "fact based performance evaluations" [19:46:08] it's an HTTP request, which won't get answered correctly without the correct Host header. So I can't just swap in the IP for the hostname, I'd have to upgrade the script to support the idea of separately configuring the IP and the Host-header [19:46:11] Using the ip directly is going to be tricky for him because he needs to make http calls with the right header [19:46:16] multichill: indeed, that does sound like a better plan. [19:46:30] The WLM app should be working again btw Yuvi [19:46:41] The "easy" way would be to add puppet host defines to manage /etc/hosts [19:46:57] hashar: I don't see why split-horizon is hard? usually you can do that in your caches rather than try to do it in an auth server [19:47:12] e.g. the list from beta::natfix -> pdns_recursor config and you're done [19:47:26] Fuck NAT, go ipv6! [19:47:29] :P [19:47:40] fuck IP lets get back to X.25 [19:48:00] guaranteed QoS from end to end! [19:48:04] * bd808 looks for appletalk cabling [19:48:19] (so long as the desired Q is very low) [19:48:32] multichill: coool! :D [19:48:43] bblack: yeah and that list could be built directly from wikitech since it lists the public IP -> instance IP mapping ( i.e. https://wikitech.wikimedia.org/wiki/Special:NovaAddress ) [19:48:45] Ha! Ever used 100.64.0.0/10 hashar? [19:49:24] someone should implement this idea. maybe someone in ops. [19:49:31] * bblack goes to ping ops people about this [19:52:33] Ipv6 support in Openstack isn't that good. Need Neutron for that AFAIK [19:55:30] I think I heard that is the goal for the new datacenter [19:55:33] multichill: Yeah I'm pretty sure that's the ipv6 labs blocker.
There was an email thread about it recently [19:59:46] hashar: so, we're using dnsmasq as our dns server for labs on labnet1001. and it has a config file which takes lines like "alias=208.80.155.135,10.68.16.16" to rewrite addresses in results [20:00:03] I think as a first step I could add some puppet to take the beta::natfix list and stuff it in there [20:05:29] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c6 (Yuvi Panda) I can make redacted logs available in a familiar pattern, with the following stripped out: 1. IP Address 2. Referrer fields The only problem is that currently the proxy's lo... [20:08:21] bblack: frankly, that would be quite awesome [20:08:25] YuviPanda: It looks like the Referrer field is not stripped out in the access.log I get or does that contain an exception for when the referrer is some WMF site? [20:08:39] multichill: hmm, I'm unsure. [20:09:02] multichill: I don't mind showing referrer if WMF Legal is ok with it. [20:09:12] multichill: I guess access.log also keeps UA [20:09:20] yup [20:10:02] YuviPanda: Look in ~heritage/access.log [20:10:12] You'll see a lot of phone agents etc [20:11:29] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c7 (metatron) Great! (UA & referer would be fine though, as they are already present in tools logs). Concerning archive - maybe one could steal some ideas for this from prod.-varnishes ;-) [20:12:44] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c8 (Yuvi Panda) Hmm, I don't see any non WMF Referrers in the access.log (looked at heritage's logs). Can someone verify / confirm? [20:13:27] multichill: yeah, UA looks kosher. referrer I'm unsure about since a bunch of grepping heritage's logs tells me there are no non-WMF referrers [20:16:22] YuviPanda: Hmm, true.
I can also spot only wiki referers https://tools.wmflabs.org/paste/view/d2153936 [20:17:03] hedonil: yeah, me too [20:19:03] YuviPanda: maybe it's because all tools are linked from within wikis or called plain vanilla w/o referer [20:19:17] hedonil: that's possible, but it's also possible that the referrers are stripped [20:19:27] Coren: are referrers stripped in some way? [20:19:30] for access.logs? [20:19:53] That would need to be managed in nginx, wouldn't it? [20:20:12] (So I assume not.) [20:20:55] scfc_de: hmm, perhaps it's just that they are not often used outside of wiki projects [20:21:02] YuviPanda: They're not stripped, but afaik lighttpd's default log format doesn't stow it. [20:21:12] Coren: do we have a legal reason to hide them? [20:21:15] (It defaults to apache's common format, not the extended one) [20:21:24] YuviPanda: I don't believe we do. [20:21:33] Coren: cool [20:21:37] hashar: https://gerrit.wikimedia.org/r/#/c/138017/ [20:21:40] YuviPanda: I don't think so either . easy to test. put a link on my own webserver .... [20:21:48] ^ that's the lame manual lazy version just copying data into files :) [20:22:04] Coren: as seen here referers are shown in the logs (if available) https://tools.wmflabs.org/paste/view/d2153936 [20:22:07] obviously, we could template that and pull data from Nova [20:22:14] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c9 (Yuvi Panda) After conversations with Coren: Lighty's default format doesn't record referrers, but there's no reason for that. So I'll just strip out IPs. [20:23:59] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c10 (Yuvi Panda) So, current plan would be to: 1. Have logrotate set to rotate logs daily 2. Setup a post-processing script that runs after the rotation has happened, and strip IPs (more proba...
[20:24:53] Coren: log format is: 10.68.16.4 tools.wmflabs.org - [06/Jun/2014:18:32:46 +0000] "" "" "" [20:25:17] hashar: (but that part sounds complicated and I'm out of time to work on that this week - but if you wanna merge the manual version as a step in the right direction, and/or make a better patch, feel free!) [20:25:52] bblack: I think that is totally awesome though I have no idea what the impacts are going to be. Might want to do that monday [20:26:21] yeah it could have scary interactions with other workarounds/hacks for the public names/IPs [20:29:45] bblack: commented on the gerrit change and added the labs folks to it (marc, andrew, ryan) [20:30:08] bblack: but yeah that looks fine to me. I can't think about something that will break because of it but we never know [20:30:12] kudos! [20:33:59] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c11 (metatron) Would it be possible to logrotate/process them on an hourly basis? Like: https://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-06/ Just to be compatible and to allow a mo... [20:35:59] bblack: also the beta cluster uses a local puppet master ( deployment-salt.eqiad.wmflabs ) so one has to git pull in /var/lib/git/operations/puppet to have the change applied on the instances. [20:39:58] bblack: I am off [20:40:08] have sweet dreams and a nice weekend folks [20:41:59] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c12 (metatron) (In reply to metatron from comment #11) > Would it be possible to logrotate/process them on an hourly basis? > Like: https://dumps.wikimedia.org/other/pagecounts-raw/2014/2014...
[20:43:08] hashar, could you take a look at the channel log at http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-labs/20140606.txt [20:43:15] [19:15:58 [20:43:36] that's where bblack was having some concerns with rebasing [20:44:01] or if you are gone, have an amazing weekend :) [20:50:07] yurikR: oh I am sure he will figure it out :) [20:50:32] yurikR: apparently it got rebased so all fine. [20:50:35] I am off for real! [20:50:42] gnight! [20:51:59] still the traffic is not identified :( bleh [20:57:59] Wikimedia Labs: Request to access redacted webproxy logfiles of (Tool) Labs - https://bugzilla.wikimedia.org/59222#c13 (metatron) If you need some helping hands, provide me some 100k raw logs and I'll write a bash-script with awk to summarize & format the logs exactly like pageview dumps. [21:09:30] yurikR2: yeah all that's resolved, but the beta instance's zerofetcher script can't connect to the zero.wm.beta hostname to fetch the data (because that maps to a public IP) [21:09:40] hence all the stuff above about split horizon DNS, etc [21:10:36] yurikR2: we'll get something working next week probably [21:11:31] yurikR2: https://gerrit.wikimedia.org/r/#/c/138017/ is the general direction I'm headed for fixing it, but it needs some massaging, and in any case probably isn't a good idea to deploy on a Friday :) [21:12:43] bblack, ouch, looks scary [21:13:17] no rush though, i still have tons of stuff to test otherwise [21:20:06] bblack, btw, saw that you added some IP CIDR libs, will be great to re-use them to manage IPs in Zero and other JsonConfig stores [21:59:40] Coren: you about? [21:59:51] Betacommand: Kinda. [22:00:12] how do I set group write permissions?
some of my files lost it [22:04:30] Coren: ^ [22:04:33] Betacommand: cgmod g+w [22:04:36] chmod* [22:04:43] valhallasw: thanks [22:04:50] or chmod 664 / chmod 775 [22:05:28] Betacommand: What valhallasw said (the first one); the last one sets the entire permissions at once which may or may not be exactly what you need; g+w does exactly just "add write to group" [22:05:55] Coren: thanks and the first is what I was looking for [22:06:17] Betacommand: Also, you can do "chmod -R g+w " to do it recursively. [22:06:27] the docs I was reading a few days ago used g= which never worked [22:06:56] "g=" sets the group permissions to "" (i.e.: no permissions at all). [22:07:16] "g=w" sets to "write only" which is, frankly, almost never used. [22:08:05] Coren: it was g= but I couldn't get it to work [22:08:24] instead of g [22:22:29] (PS1) Yurik: Added JsonConfig to mobile channel [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/138101 [22:26:15] "chmod 775" should usually not be used on tools (sub-)directories as it removes the setgid bit that otherwise ensures that all files are owned by the tool's group.
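(Editor's sketch, not from the channel.) The difference between "g+w", "g=", "g=w" and a full numeric mode discussed above can be modelled as plain bit arithmetic on the mode word. The function names below are hypothetical, and the sketch only models the rwx and setgid bits. It assumes the exact-assignment behaviour of the chmod(2) system call (what Python's os.chmod does) for the numeric case; command-line chmod implementations may treat the setgid bit on directories differently, so check the local man page before relying on either behaviour.

```python
import stat

GROUP_BITS = stat.S_IRWXG  # 0o070: the three group permission bits


def chmod_g_plus_w(mode):
    """Model of `chmod g+w`: add group write, leave every other bit alone."""
    return mode | stat.S_IWGRP


def chmod_g_equals(mode, group_perms):
    """Model of `chmod g=...`: replace the group bits with exactly group_perms.

    group_perms=0 models `g=` (no group permissions at all);
    group_perms=stat.S_IWGRP models `g=w` (write only).
    """
    return (mode & ~GROUP_BITS) | (group_perms & GROUP_BITS)


def chmod_exact(_old_mode, new_mode):
    """Model of an exact assignment such as os.chmod(path, 0o775).

    The old mode is ignored entirely, so a setgid bit that is not part of
    new_mode is lost.
    """
    return new_mode


if __name__ == "__main__":
    setgid_dir = 0o2755  # assumed starting mode: setgid dir without group write
    for label, result in [
        ("g+w", chmod_g_plus_w(setgid_dir)),
        ("g=", chmod_g_equals(setgid_dir, 0)),
        ("g=w", chmod_g_equals(setgid_dir, stat.S_IWGRP)),
        ("exact 775", chmod_exact(setgid_dir, 0o775)),
    ]:
        kept = "kept" if result & stat.S_ISGID else "lost"
        print(f"{label:10} -> {oct(result)} (setgid {kept})")
```

Only the "g+w" form is a pure addition, which is why it is the safe choice when some files merely lost group write.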