[00:00:26] Coren: I maxed out my max_user_connections
[00:00:29] :(
[00:00:31] sorry
[00:00:38] there must be a bug somewhere that doesn't close them
[00:00:53] but since I can't log in, I can't kill any of them
[00:14:43] hm... problem went away
[01:03:17] !log wikimania-support Updated scholarship-alpha to ad26241
[01:03:20] Logged the message, Master
[01:20:29] !log integration Upgraded npm from v1.1.39 to v1.3.18 on integration-slave01
[01:20:32] Logged the message, Master
[01:21:47] !log integration Ensured npm/grunt-cli (0.1.11) is globally available on integration-slave01
[01:21:49] Logged the message, Master
[03:39:38] Coren: +++ b/modules/labs_vmbuilder/files/firstboot.sh
[03:39:40] ?
[04:22:39] mutante: that was me
[04:22:43] did you merge it in?
[04:24:48] Ryan_Lane: yes
[04:25:16] poked you on ops, saw the commit message then, figured it was ok
[04:25:17] cool. thanks
[04:25:21] np
[14:12:18] !log integration installing libsikuli-script-java on integration-selenium-driver for {{bug|54393}}
[14:12:20] Logged the message, Master
[14:12:21] zeljkof ^^^
[14:35:31] Coren, ping
[14:35:42] Pong-ish.
[14:35:50] Coren, http://ganglia.wmflabs.org/latest/?c=tools&h=tools-webserver-02&m=load_one&r=hour&s=by%20name&hc=4&mc=2
[14:36:11] Someone is getting close to DDOSing that server
[14:37:23] Doesn't look like; it's just catscan2 being heavy as usual.
[14:37:54] Look at week-long stats; that's pretty much typical load for -02
[14:37:56] Coren, can that be disabled until it gets its own web server or some of the weight is lifted.
[14:38:01] (which is why there isn't many things on that one)
[15:11:18] Coren: you have a minute to help me with some sql fu ?
[15:11:33] Betacommand: Sure. What be up?
[15:15:27] Coren: see PM
[15:29:36] hashar: thanks, working on getting it working on my machine
[15:29:39] (sikuli)
[15:38:51] zeljkof: yeah so we get the package
[15:38:57] I have NO CLUE how it works though
[15:39:11] potentially you could create a fake change that would attempt to use sikuli and see how it goes
[15:39:16] hashar: I do, but I am not sure how to make it work on a headless server :)
[15:39:19] maybe using ULS since that extension triggers browsertests
[15:39:38] you could send a change that deletes all features in tests/browser and add a single feature to try out sikuli
[15:39:43] ahh
[15:52:31] Coren: is it possible to get a message when a job is killed because of memory issues?
[16:05:47] Betacommand: Annoyingly not, afaik - it's a kill -9
[16:12:11] Coren: can you get the script to send an email to the owner?
[16:22:44] mhoover: so, I've finished all the stuff I was going to do without being blocked on you
[16:22:59] in the meantime I'm going to setup a 2nd region instance using folsom
[16:23:13] so that I can test openstackmanager with keystone
[16:23:38] I think it's possible to proxy neutron commands through nova's api, so we may not need to make immediate changes for it
[16:24:45] once neutron is available I'll switch to using its api directly, though, since it's better to have proper support
[17:27:46] anyone online that can approve an operations/mediawiki-config for the beta cluster?
[17:46:16] (PS1) BryanDavis: Add MySQL admin account for Wikimania Scholarships app [labs/private] - https://gerrit.wikimedia.org/r/102182
[18:00:33] Coren: what are your thoughts of adding an email feature to your memory killer?
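A minimal sketch of why the job itself cannot report a `kill -9`, and where a notification could hook in instead. It assumes a POSIX system and a hypothetical wrapper process that launches the job; `run_and_classify` and `demo_cmd` are illustrative names, and nothing here reflects how the grid engine or the memory killer mentioned above is actually implemented. SIGKILL cannot be caught by the process that receives it, but a parent waiting on that process can see the signal in the exit status.

```python
import signal
import subprocess
import sys


def run_and_classify(cmd):
    """Run cmd and report (returncode, killed_by_sigkill).

    On POSIX, subprocess reports a child ended by signal N as returncode -N.
    """
    proc = subprocess.run(cmd)
    killed = proc.returncode == -signal.SIGKILL
    return proc.returncode, killed


# Hypothetical stand-in for a job hit by a memory limit:
# a child process that sends SIGKILL to itself.
demo_cmd = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
]

code, killed = run_and_classify(demo_cmd)
if killed:
    # A real wrapper could notify the job owner here (assumption, not shown).
    print("job ended by SIGKILL, return code", code)
```

A real wrapper would replace the `print` with whatever notification the tool owner wants, such as an email; that part is deliberately left out.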
[18:13:02] mhoover: we'll need to deal with keystone somehow
[18:14:38] having a unified keystone would be ideal
[18:15:07] otherwise it's necessary to authenticate to each and have separate scoped tokens for each
[18:17:39] well, OpenStackManager is configured to have a single keystone endpoint url
[18:17:41] so that's a plus
[18:18:29] on the other hand, we'd also need to replicate the mysql database for keystone and only keystone
[18:18:44] Ryan_Lane: reading through this, think it's doable with your config? https://ask.openstack.org/en/question/16/how-to-setup-an-openstack-with-multi-region-support-with-single-keystone/
[18:19:18] yeah
[18:19:30] OpenStackManager was written to support this
[18:20:04] so, one possibility is to not replicate the databases and to use redis with replication for the tokens
[18:20:21] then to just ensure the endpoint urls and such are configured the same in both regions
[18:20:51] then we don't need to replicate mysql
[18:21:17] Ryan_Lane: yes. you are already using redis for other things?
[18:21:27] not for openstack
[18:21:36] but we do have redis modules in puppet
[18:22:23] hm. I wonder if dogpile driver exists in folsom
[18:23:02] I can't test this otherwise. heh
[18:24:21] Ryan_Lane: this might be useful https://github.com/icgood/keystone-redis
[18:24:54] ah, cool
[18:25:07] looks like dogpile was added in grizzly or havana
[18:25:16] so this will work till then :)
[18:27:47] mhoover: so, I was thinking OSM would point to a primary keystone
[18:27:57] and services in each region would point to their local keystone
[18:28:18] token requests would always be generated on the primary
[18:28:28] and redis will replicate them
[18:28:43] which will make it possible for the services in each datacenter to validate tokens
[18:29:23] of course, we could also just replicate mysql and get the same thing, but we'd need to rename some databases for that
[18:29:33] databases would need to be named per-datacenter
[18:29:55] or we'd need to split keystone databases away from the rest
[18:30:33] Ryan_Lane: redis sounds better for just this one job. better than polluting mysql
[18:30:46] well, we're already writing tokens into mysql
[18:30:53] we're not using memcache backend right now
[18:33:46] Ryan_Lane: well, we could test each config. one mysql, one redis, see what kind of lag we get and pick the faster one. redis is most likely easier to manage for this one task
[18:33:59] yeah. probably
[18:35:02] well, redis is better for the tokens for a number of reasons
[18:35:15] it would be ideal to replicate mysql for keystone either way, though
[18:35:30] so that we don't have to manually keep endpoint/service info up to date
[18:38:01] mhoover: seems we have some misc. databases in pmtpa and eqiad
[18:38:10] and replication may be selective by db
[18:38:27] db1001 (eqiad) and db9 (pmtpa)
[18:38:50] need to wait till australia wakes up to verify
[18:40:05] either way, I'll set up redis for this for now
[18:41:02] Ryan_Lane: ok. should def have both avail. i'm running through puppet stack install, i'll add both to my setup
[18:41:28] you may want to see this horrific change I made: https://gerrit.wikimedia.org/r/#/c/102185/
[18:41:31] heh
[18:55:39] Ryan_Lane: I'm also setting up labs-specific databases, those might be suitable homes too at need.
[18:55:59] what do you mean?
[18:56:06] labs specific databases for...?
[18:57:22] like user ones?
[18:57:32] those definitely shouldn't go on db9/db1001 :)
[18:59:22] Coren: ?
[18:59:23] No, no; I'm setting up labsdb100[45] for user databases; my point was that there'd be room on there if you needed to store labs stuff.
[18:59:38] ewww, no
[18:59:40] :)
[18:59:53] there's a possibility of users breaking those
[19:00:14] I don't want openstack services to die because a user wrote too much into a database
[19:00:20] True. I was just giving you the extra option, not counselling its use. :-)
[19:00:46] if db9/db1001 aren't usable we'll continue using virt0/1000
[19:00:52] I was hoping to get mysql off of them, though
[19:05:00] mhoover: hm. glance is also something that needs to be dealt with
[19:05:15] otherwise we need to handle images in multiple locations
[19:06:14] which would have different image ids
[19:06:51] I could modify OSM to have default images per region. that's annoying, though
[19:07:10] hm. I wonder if we could modify the image metadata to specify an image is default
[19:08:19] you can definitely set a property
[19:09:04] oh man, being able to define a banned image that way would rock too
[19:12:00] if that's the case then I don't care if we deal with glance. adding images in two spots is easy
[19:29:40] yep. that works and it's an incredibly simple way of handling this
[19:34:08] is there a problem with the continuous job queue on tools? My jobs are running fine, but when I run qstat -j on them I see:
[19:34:09] scheduling info: queue instance "task@tools-exec-05.pmtpa.wmflabs" dropped because it is temporarily not available
[19:34:09] queue instance "continuous@tools-exec-05.pmtpa.wmflabs" dropped because it is temporarily not available
[19:34:22] just wondering ...
[19:41:59] andrewbogott: https://gerrit.wikimedia.org/r/102285
[19:43:04] I've tested on nova-precise2 already
[19:43:12] I'm going to set the metadata in production
[19:45:40] hm. even better. I should set a property for images to show, then we can just remove the property when we want to hide it
[19:46:24] (CR) Ori.livneh: [C: 2 V: 2] "I made the corresponding change in the private repo." [labs/private] - https://gerrit.wikimedia.org/r/102182 (owner: BryanDavis)
[20:26:20] !log wikimania-support Updated scholarship-alpha to ee0f62b
[20:26:22] Logged the message, Master
[20:27:37] weee
[20:32:02] greg-g: Have you checked out the new hotness that is https://wikimania-scholarship.wmflabs.org/alpha/apply
[20:32:36] bd808: pretty hawtt
[20:32:49] Compared to https://wikimania-scholarship.wmflabs.org/app/ I think it's an improvement
[20:33:13] agreed
[20:33:46] definitely, well done!
[20:33:59] I can't take any credit for the fancy graphics though other than having been smart enough to borrow them from the 2014 wiki
[20:34:35] creativity is more than just art ;)
[20:35:32] Now if I can just get Ellie to tell the committee that they need to stop changing the form content…
[21:28:49] bd808|MEETING: An amateur borrows, an artist rips off. :-)
[21:35:56] Coren: some wikis have the wrong name: mysql --defaults-extra-file=~/replica.my.cnf -h enwiki.labsdb meta_p -e 'select dbname,lang,name,family,url,is_closed from wiki;' | fgrep -e '\\' | column -t -s $'\t'
[21:36:11] (but some seem more sane even with non-latin)
[21:37:50] jeremyb: That data comes from the wikis themselves.
[21:38:04] Coren: huh?
[21:38:15] presumably someone would have complained if they look like that
[21:38:35] * jeremyb goes to spot check one. but idk what on the wiki to compare it to
[21:38:36] What wiki do you see as being incorrect?
[21:38:59] do you see my fgrep param?
[21:39:07] iswikibooks is Wikib\\xE6kur wikibooks http://is.wikibooks.org 0
[21:39:49] $ curl -sSL 'https://is.wikibooks.org/wiki/' | fgrep ''
[21:39:49] <jeremyb> <meta charset="UTF-8" /><title>Wikibækur
[21:40:01] Hm.
[21:41:03] And yet, what was entered into the DB is the exact output of an API call.
[21:42:00] idk what to tell you :)
[21:42:27] http://is.wikibooks.org/w/api.php?action=query&meta=siteinfo&siprop=general&format=json
[21:43:34] And it's not just a charset issue; zhwiki reports '維基大典' and if han works, few things shouldn't break.
[21:43:39] * Coren tries to find the difference.
[21:44:49] Hm. They're both doing \u escapes for unicode codepoints.
[21:44:52] * Coren boggles a little.
[21:45:12] few things should break*
[21:45:13] ?
[21:45:44] Han is usually the first thing to break when there are transcoding problems.
[21:45:57] right so few things should break
[21:46:12] btw, as long as you're looking at that, maybe fix is_closed? :)
[21:46:16] Urdu also works: ویکیپیڈیا
[21:46:24] What's wrong with is_closed?
[21:46:34] it's the same value for the whole list?
[21:46:48] That's normal; we're not actually replicating any closed projects. :-)
[21:47:20] Same deal with is_private.
[21:47:21] errr
[21:47:41] not even replicating on initial cluster import?
[21:48:41] * jeremyb doesn't get it
[21:48:57] you have a row for aawiki which says is_closed=0
[21:49:04] how could that be justified?
[21:49:35] Hm.
[21:49:45] By a bug, obviously.
[21:50:16] For some reason the check against closed.dblist isn't working right.
[21:50:50] again data leakage?
[21:51:19] Oh duh!
[21:51:27] * Coren facepalms.
[21:51:46] Ignore me; I was talking about is_deleted not is_closed.
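A small sketch of the two observations in the exchange above, assuming a trimmed-down, hypothetical siteinfo payload (`raw`) and a helper name (`looks_mangled`) invented for illustration. It shows that a JSON `\u` escape decodes to the proper character, and that a name still carrying a literal backslash is the kind of value the `fgrep -e '\\'` filter is hunting for. It does not explain how the bad rows ended up in meta_p.

```python
import json

# Hypothetical, heavily trimmed siteinfo response; only "sitename" is kept.
raw = '{"query": {"general": {"sitename": "Wikib\\u00e6kur"}}}'

# json.loads turns the \u00e6 escape into the real character.
sitename = json.loads(raw)["query"]["general"]["sitename"]


def looks_mangled(name: str) -> bool:
    """Return True if the name still contains a literal backslash escape."""
    return "\\" in name


# A value like the one reported for iswikibooks, with a literal backslash.
stored_bad = "Wikib\\xE6kur"

print(sitename, looks_mangled(sitename), looks_mangled(stored_bad))
```

Running the same kind of check over every `name` in the wiki table would flag only the rows that were stored with an unexpanded escape.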
[21:51:57] That is_closed isn't properly reflected is just a dumb bug.
[21:52:37] errr, is_deleted would mean it's impossible to replicate, right? :)
[21:52:49] you just expose a snapshot. optionally
[21:53:12] jeremyb: No, actually, it means the /project/ is gone, not necessarily the database itself. Those we do not replicate.
[21:56:20] As for the transcoding error, I can't reproduce it -- I'm getting the right data when I run the update process again. I'll just push an update and that'll fix everything.
[21:58:13] * Coren wonders how that came about.
[21:58:15] Coren: i get no \\ now
[21:59:00] Yeah, I'm not sure how that happened in the first place. Apparently, about 35 projects were saved with transcoding errors.
[21:59:19] I'm in the process of pushing an update on all shards.
[21:59:49] That should also update is_closed. I was simply not putting the value in the table. :-)
[22:01:09] {{done}}
[22:04:04] where's the script that does that?
[22:04:41] විකිපීඩියා, නිදහස් විශ්වකෝෂය is arguably the neatest project name.
[22:06:09] what is that?
[22:09:20] Jeff_Green: https://git.wikimedia.org/blob/operations%2Fsoftware/HEAD/maintain-replicas%2Fmaintain-replicas.pl
[22:09:28] Bah. jeremyb ^^
[22:13:07] Coren: that does meta_p??
[22:13:23] Coren: and what project is the neatest?
[22:13:35] විකිපීඩියා, නිදහස් විශ්වකෝෂය
[22:13:44] You probably don't have the font for it. :-)
[22:13:55] siwiki i guess
[22:14:06] i have the font. just looks weird (squished)
[22:14:09] Yep. siwiki.
[22:16:20] 37 wikis not on s3
[22:16:27] all are open
[22:17:43] Such as?
[22:18:04] huh?
[22:18:08] e.g. enwiki
[22:20:32] jeremyb: I don't get what you mean; enwiki is in meta_p.wiki on s3 too.
[22:20:44] hah!
[22:21:20] i mean that meta_p.wiki says that 37 wikis are not on s3
[22:21:27] Ah!
[22:21:38] Well yeah, s3 is the "everything else" slice. :-)
[22:22:38] right. that's why i was filtering it out
[22:33:00] Coren: s7.dblist is missing centralauth
[22:34:12] centralauth is handled specially.
[22:34:27] huh?
[22:34:40] centralauth is not in special.dblist either fwiw
[22:34:42] :P
[22:44:04] Coren: so centralauth? should i just bug springle?
[22:44:09] Coren: also, why perl???????
[22:45:16] Because perl rules. But centralauth is replicated and in the wiki table. What's missing?
[22:45:53] centralauth isn't in any .dblist because it's not a project database.
[22:45:54] right, it is in the wiki table. it's not in s7.dblist though
[22:46:06] huh?
[22:46:47] Perhaps I don't get what you mean?
[22:49:00] hrmmmm, so what else is not a project?
[22:52:27] There are a couple others, but those aren't replicated.
[22:55:24] mhoover: ok, so I just did mysql replication via replicate-wild-do-table for now
[22:55:49] I also dealt with multi-region glance
[22:56:30] I've gotten as far as verifying the region appears properly in the OSM interface when the new region endpoints are added via keystone
[22:59:33] heh. well it's apparently all broken :D
[23:00:06] ah. right. I set the endpoint ips to their ips rather than 127.0.0.1 and none of them are bound to that
[23:01:40] Ryan_Lane: you know wild-do-table ain't perfect... only pays attention to `use db` but not fully qualified db refs
[23:01:56] this is only for testing
[23:02:16] k. just was warning :)
[23:05:49] Ryan_Lane: that rules. would keystone be writing to one instance in pmtpa while eqiad reads? or vice versa
[23:06:11] at first writing to pmtpa and reading in eqiad
[23:06:18] but we'd reverse that when we switch
[23:06:22] (PS1) Jforrester: Add new VisualEditor/*.git repos [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102340
[23:06:48] we could do one of two things: have a single keystone hostname and switch it in DNS when we're ready
[23:06:57] or, just switch the config in MW
[23:07:23] it's likely a good idea to give it a proper endpoint
[23:07:28] if we ever want to expose the API
[23:10:57] (PS1) Krinkle: Direct VisualEditor/* streams to #mediawiki-visualeditor [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102342
[23:12:35] (CR) Catrope: [C: -1] Add new VisualEditor/*.git repos (1 comment) [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102340 (owner: Jforrester)
[23:13:38] (PS2) Jforrester: Add new VisualEditor/*.git repos [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102340
[23:14:29] (Abandoned) Jforrester: Direct VisualEditor/* streams to #mediawiki-visualeditor [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102342 (owner: Krinkle)
[23:14:40] (CR) Krinkle: "Abandoned" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102342 (owner: Krinkle)
[23:15:20] (CR) Catrope: [C: 2] Add new VisualEditor/*.git repos [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102340 (owner: Jforrester)
[23:17:08] (PS1) Jforrester: Send grrrit pings about grrrit to #wikimedia-dev as well [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102346
[23:17:42] (CR) Legoktm: [C: 2] "So meta." [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/102346 (owner: Jforrester)
[23:18:25] Who's restarting grrrit?
[23:18:33] me, sorry
[23:18:44] legoktm: !log that shit, man
[23:19:15] !log tools rebooted grrrit-wm with new config stuffs
[23:19:17] Logged the message, Master
[23:19:32] legoktm: Too bad you can't specify comic sans when you're doing a gerrit review.
[23:19:37] :D
[23:55:27] i cannot seem to ssh into my instance, unicorn.wmflabs.org
[23:56:54] Coren: Any luck with the permissions/groups issue from yesterday?