[00:02:30] scfc_de: my first thought was about bigbrother, how to track which service is running with high or low limit
[00:03:11] Coren: scfc_de: and yes, it should be simple for both sides to configure and control things
[00:05:02] So, the current thing with: ask and you get quickly assigned a variable in data/project/.system/config seems to be the most satisfying solution for both sides
[00:05:47] and easy to put in a script
[00:08:08] you don't have to care about multiple queues or track settings; $webservice (re)start will always be the same
[00:08:40] and I'd be glad to remove the custom lighty-scripts
[01:08:37] Wikimedia Labs / tools: Old version of webservice script on tools-dev - https://bugzilla.wikimedia.org/66845 (Tim Landscheidt) NEW>ASSI a:Marc A. Pelletier>Tim Landscheidt
[01:08:40] (PS2) Tim Landscheidt: Package webservice [labs/toollabs] - https://gerrit.wikimedia.org/r/122841 (https://bugzilla.wikimedia.org/66845)
[01:10:05] Wikimedia Labs / tools: tools-submit has no database aliases/NAT - https://bugzilla.wikimedia.org/65308 (Tim Landscheidt) NEW>RESO/FIX
[01:14:21] Wikimedia Labs / tools: install gawk libyaml-dev libgdbm-dev libncurses5-dev bison libffi-dev - https://bugzilla.wikimedia.org/65974#c1 (Tim Landscheidt) NEW>ASSI a:Marc A. Pelletier>Tim Landscheidt Do you need those packages installed only for development on the bastion hosts (tools-login,...
[01:42:50] Wikimedia Labs / tools: Install git review on tools - https://bugzilla.wikimedia.org/62871#c4 (Tim Landscheidt) REOP>RESO/FIX It is already included in exec_environ (cf. http://git.wikimedia.org/blob/operations%2Fpuppet.git/production/modules%2Ftoollabs%2Fmanifests%2Fexec_environ.pp#L283) and, in...
[02:07:51] Wikimedia Labs / tools: install gawk libyaml-dev libgdbm-dev libncurses5-dev bison libffi-dev - https://bugzilla.wikimedia.org/65974#c2 (Asaf Bartov) I'll need the libraries on the grid nodes as well, thanks. A.
[02:59:18] !log deployment-prep promoted legoktm to project-admin
[02:59:22] Logged the message, Master
[03:36:49] andrewbogott_afk: I haven't gotten any echo notification emails from wikitech since Sun, 20 Jul 2014 21:38:05 +0000. I'm more than a bit worried that my patch to change the echo emails broke something.
[03:41:17] andrewbogott_afk: < legoktm> umm, is wikitech running processEchoEmailBatch.php?
[03:41:34] andrewbogott_afk: < legoktm> by adding "email-body-batch-params" and "email-body-batch-message" I think you enabled batching support, which requires a script to be run
[03:41:45] okay, I'll just talk in here now :P
[03:42:55] legoktm: I was just queuing up things for Andrew's bouncer to tell him in the morning :)
[03:44:45] bd808: is wikitech's configuration public anywhere?
[03:44:54] $wgEchoEnableEmailBatch = false; should fix it
[03:45:12] but people would miss emails...setting up a cronjob is probably better
[03:45:20] I don't think the config is public. At least I've never seen it.
[03:45:29] Indeed, not public and not versioned.
[03:45:39] poor wikitech
[03:45:51] the cobbler's children...
[03:46:26] :(
[06:33:08] Hello, I'm trying to add myself an account on wikitech, but I'm getting the following error: Account creation error The user name "DChan (WMF)" has been banned from creation. It matches the following blacklist entry: ^(User:).*(WMF).*$
[06:33:21] Does that mean someone else has to create it for me?
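For the cron approach mentioned at 03:45:12, a hypothetical crontab entry for the batch script legoktm asks about at 03:41; the install path and PHP binary location below are assumptions, not wikitech's actual layout:

    # run Echo's batch delivery once a day (adjust path to wikitech's MediaWiki tree)
    0 0 * * * /usr/bin/php /path/to/wikitech/extensions/Echo/maintenance/processEchoEmailBatch.php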
[06:33:37] probably, yes
[06:35:39] you probably don't need (WMF) on wikitech
[06:35:39] I can't think of anyone else who has it
[06:37:06] Ok, thanks
[08:05:53] Wikimedia Labs / tools: 403 error on PHP script - https://bugzilla.wikimedia.org/68499 (Magnus Manske) NEW p:Unprio s:blocke a:Marc A. Pelletier I'm getting a "403 forbidden" error on http://tools.wmflabs.org/mix-n-match/api.php It worked fine yesterday; haven't touched it since three day...
[08:41:51] Wikimedia Labs / tools: 403 error on PHP script - https://bugzilla.wikimedia.org/68499#c1 (metatron) Your webservice has been restarted 2014-07-24 07:47:05: (server.c.1512) server stopped by UID = 51434 PID = 23122 2014-07-24 07:47:33: (log.c.166) server started 2014-07-24 08:02:19: (server.c.1512) s...
[09:58:36] (PS1) Hedonil: webserviće: Bug: 68431, Bug: 68499 - allow setting individual tools memory limits via config files in /data/project/.system/config - create blank .lighttpd.conf, if it doesn't exist [labs/toollabs] - https://gerrit.wikimedia.org/r/148977
[11:30:55] Wikimedia Labs / tools: 403 error on PHP script - https://bugzilla.wikimedia.org/68499#c3 (Magnus Manske) (In reply to metatron from comment #1) > Your webservice has been restarted > 2014-07-24 07:47:05: (server.c.1512) server stopped by UID = 51434 PID = > 23122 > 2014-07-24 07:47:33: (log.c.166) se...
[11:32:09] Wikimedia Labs / tools: 403 error on PHP script - https://bugzilla.wikimedia.org/68499#c4 (Magnus Manske) (In reply to Magnus Manske from comment #3) > (In reply to metatron from comment #1) > > Your webservice has been restarted > > 2014-07-24 07:47:05: (server.c.1512) server stopped by UID = 51434 PI...
[11:33:53] Wikimedia Labs / tools: 403 error on PHP script - https://bugzilla.wikimedia.org/68499#c5 (metatron) Yeah. Karma & Coincidence!. See labs-l: DB-upgrade in progress my backlog right now http://tools.wmflabs.org/newwebtest/xtools-status.html Just terminated the service for now.
[11:45:50] Wikimedia Labs / tools: 403 error on PHP script - https://bugzilla.wikimedia.org/68499#c6 (Magnus Manske) PATC>RESO/FIX Huh. Now it works. Thanks if you fixed it, otherwise - magick! ;-)
[11:55:38] Wikimedia Labs / tools: 403 error on PHP script - https://bugzilla.wikimedia.org/68499#c7 (metatron) Ha! magic it is ;) Yep it's better as you're getting a quick error-response and not a 100sec timeout, if you try to access affected databases.
[12:00:56] Wikimedia Labs / tools: view globalimagelinks missing at db commonswiki_p on new mariadb 10 sql server - https://bugzilla.wikimedia.org/68505 (merl) NEW p:Unprio s:enhanc a:Marc A. Pelletier Table globalimagelinks is missing on commonwiki database mysql -hs5.labsdb -e "show tables like 'gl...
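The command in that last bug report is truncated; purely as an illustration, a check like the following from tools-login would show whether the globalimagelinks view exists on the replica (the host alias and database name are taken from the bug title and should be treated as assumptions):

    # list the replicated view if it exists; an empty result means it is missing
    mysql -h commonswiki.labsdb commonswiki_p -e "SHOW TABLES LIKE 'globalimagelinks'"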
[12:00:56] Wikimedia Labs / (other): (Tracking) Database replication services - https://bugzilla.wikimedia.org/48930 (merl)
[12:01:11] Tool Labs tools / [other]: merl tools (tracking) - https://bugzilla.wikimedia.org/67556 (merl)
[12:01:11] Wikimedia Labs / tools: Missing Toolserver features in Tools (tracking) - https://bugzilla.wikimedia.org/58791 (merl)
[12:03:23] Wikimedia Labs / tools: Add some of the missing tables in commonswiki_f_p - https://bugzilla.wikimedia.org/59683 (merl)
[12:03:23] Wikimedia Labs / tools: view globalimagelinks missing at db commonswiki_p on new mariadb 10 sql server - https://bugzilla.wikimedia.org/68505 (merl) p:Unprio>High
[12:03:24] Wikimedia Labs / tools: Table "globalimagelinks" is missing from replicated commons database - https://bugzilla.wikimedia.org/48899 (merl)
[12:20:39] (PS2) Hedonil: webserviće: Bug: 68431, Bug: 68499 - enable individual memory limits via config files in /data/project/.system/config - create blank .lighttpd.conf, if it doesn't exist [labs/toollabs] - https://gerrit.wikimedia.org/r/148977
[12:27:06] Finally Toolscript is working again .... the mariadb migration has been a real pain
[12:55:12] What can be the fix for https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=consoleoutput&project=language&instanceid=92d010af-066b-4c52-9beb-d4f6697cc3f5&region=eqiad
[12:55:22] hashar: ^ ?)
[12:55:23] :)
[13:04:32] (CR) Tim Landscheidt: "Why is an empty .lighttpd.conf needed? It seems like the evaluating logic is in toollabs/files/lighttpd-starter, and if there's a bug, th" (1 comment) [labs/toollabs] - https://gerrit.wikimedia.org/r/148977 (owner: Hedonil)
[13:05:06] I'm having trouble accessing commonswiki.labsdb from wikimetrics
[13:05:19] enwiki.labsdb works, here's my /etc/hosts:
[13:05:58] 192.168.99.1 s1.labsdb enwiki.labsdb
[13:05:58] 192.168.99.4 s4.labsdb commonswiki.labsdb
[13:06:36] wikimetrics is on the instance wikimetrics1.eqiad.wmflabs
[13:06:49] milimetric: Have you seen the announcement on the DB move in labs-l?
[13:07:34] (Don't know *if* it is related, but the timing'd be right.)
[13:07:38] (CR) Hedonil: "You're absolutely right, but seemed to me like the quickest solution. If it's absolutely out of place, I'll remove it here and patch it th" [labs/toollabs] - https://gerrit.wikimedia.org/r/148977 (owner: Hedonil)
[13:11:13] kart_: Only members of the project "language" (and Labs admins) can see and thus help you with that.
[13:12:36] scfc_de: Jul 24 13:01:02 language-mleb2 /etc/init/mysql.conf[2640]: ERROR: The partition with /var/lib/mysql is too full!
[13:12:40] that
[13:14:37] (PS3) Hedonil: webserviće: enable individual memory limits [labs/toollabs] - https://gerrit.wikimedia.org/r/148977 (https://bugzilla.wikimedia.org/68431)
[13:15:36] kart_: blank page
[13:15:52] kart_: I can't access the console of instances in language project :D
[13:16:18] kart_: and yeah /var in labs instance is only 2GB
[13:16:20] kart_: An instance's additional disk space isn't available until you configure it that way, and MySQL was probably not content with the default 2 GByte. But fixing this isn't simple.
[13:17:03] the instance should have extended disk space mounted using role::labs::lvm::srv
[13:17:16] then migrate the data somewhere under /srv i.e. /srv/mysql
[13:17:22] and adjust your mysql data_dir to point there
[13:18:15] scfc_de: thanks, that sounds related but I can't find the message, what was the subject / date?
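A minimal sketch of the data-directory move hashar outlines at 13:17, assuming role::labs::lvm::srv already provides /srv and a stock /etc/mysql/my.cnf; on Ubuntu the AppArmor profile for mysqld may also need the new path added:

    sudo service mysql stop
    sudo mkdir -p /srv/mysql
    sudo rsync -a /var/lib/mysql/ /srv/mysql/
    # point MySQL at the new location, e.g. in /etc/mysql/my.cnf:  datadir = /srv/mysql
    sudo service mysql start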
[13:18:31] Thanks hashar, scfc_de
[13:19:24] ah! what? labs-l is different from wikitech-l - doy!
[13:22:02] ok scfc_de, and anyone that has the same problem, ssh into tools-login and cat /etc/hosts to get an updated version
[13:22:06] thanks!
[13:33:32] scfc_de: what's your suggestion for this fix in lighttpd-starter? Just insert $touch -a ~/.lighttpd.conf or modify the whole condition? the former would be real quick.
[13:38:30] scfc_de: hm, I updated the /etc/hosts and I still can't really get to commonswiki, now the error changed from 111 (connection refused) to 110 (timeout)
[13:38:55] oh... my ipfilters would have to change too right?
[13:42:10] does anyone know how I can get iptable rules saved from tools-login, I don't have sudo there
[13:50:16] milimetric: http://paste.ubuntu.com/7847723/ ("sudo iptables -t nat -L").
[13:50:32] thanks very much scfc_de
[13:52:31] hedonil: IIUIC you want to add the default fastcgi.server config if there's no .lighttpd.conf or it doesn't contain a fastcgi.server config?
[13:53:17] scfc_de: add the default fastcgi.server config if there's no .lighttpd.conf
[13:56:26] hedonil: Then I would replace the "if [[ ! $(cat "$home/.lighttpd.conf" | grep -P '^(?:[ \t]*fastcgi.server[ \t\+\(=]+".php"|#no-default-php)') ]]; then" with an "else" (and remove a "fi").
[13:57:14] scfc_de: ok. will try that. thanks
[14:21:12] !log deployment-prep killed hhvm process on deployment-mediawiki01 and 02. init script does not work.
[14:21:17] !log deployment-prep killed hhvm process on deployment-mediawiki01 and 02. init script does not work.
[14:21:19] Logged the message, Master
[14:26:09] Wikimedia Labs: Report when an instance has finished its initial Puppet run - https://bugzilla.wikimedia.org/68508 (Tim Landscheidt) NEW p:Unprio s:enhanc a:None When someone "builds" an instance, immediately an Echo message gets generated "$USER built an instance [...]". Soon after the insta...
[14:35:37] The replication of Wikidata has stopped for Reasonator ... is there something you can do ?
[14:51:29] !log tools Removed ignored file /etc/apt/preferences.d/puppet_base_2.7 on all hosts
[14:51:32] Logged the message, Master
[14:57:04] Coren: would you mind reviewing a hotfix https://gerrit.wikimedia.org/r/#/c/149015/
[15:26:34] scfc_de: I'm still seeing 'libvips-dev : Depends: libtiff5-alt-dev but it is not installable' on a couple of tools boxes. Do I recall that you understand that problem and didn't like my fix last time?
[15:33:42] andrewbogott: One moment, please.
[15:33:52] no rush :)
[15:43:29] !log deployment-prep Updated hhvm-luasandbox to 2.0-3 and restarted hhvm instances
[15:43:31] Logged the message, Master
[15:44:12] !log deployment-prep Updated MW config to re-enable luasandbox mode
[15:44:15] Logged the message, Master
[15:44:59] andrewbogott: Situation is: User (https://bugzilla.wikimedia.org/52717) required inter alia libvips-dev. That has a broken dependency in the WMF apt repo for libtiff5-alt-dev. I worked around that in https://gerrit.wikimedia.org/r/102609/ which got shot down by paravoid. Coren fixed the problem properly in the Git deb repo (https://gerrit.wikimedia.org/r/102617/), but the fixed package was never recompiled and uploaded to WMF apt
[15:45:00] (https://rt.wikimedia.org/Ticket/Display.html?id=7852). YuviPanda removed (and you +2ed) the package libvips (!= libvips-dev) with https://gerrit.wikimedia.org/r/146259/.
So Puppet is *still* failing because libvips-dev is not installable due to the broken requirement libtiff5-alt-dev, and I reverted (propose to revert) YuviPanda's change with https://gerrit.wikimedia.org/r/146348/.
[15:45:50] scfc_de: ok -- do you have a moment to rebuild the fixed package? Or coach me through it?
[15:46:24] scfc_de: I'd offer the package rebuilt/etc but I could only get to it by next week atm.
[15:46:50] I can maybe do it right now, lemme see how it goes.
[15:47:53] andrewbogott: Incidentally, s2, s4 and s5 are now tokudb and federation free. I'm still working some kinks out with Sean but it looks +good.
[15:48:10] cool!
[15:48:58] Coren: is building that package just a simple 'debuild'?
[15:48:59] * gifti waits for double plus good
[15:49:22] oh, it's configure/make
[15:49:45] andrewbogott: I don't know *how* packages are built in ops (i. e. if you have a special build box); for the most part, it was just checking out the deb repo, downloading the source .tar.gz "from the InterNet" (I assume that's different in ops), running "dpkg-buildpackage -S -sa -rfakeroot -us" to get a source package and then feeding that to pbuilder. I assume there are some more stringent SOPs in ops :-).
[15:51:10] MySQL, MariaDB, TokuDB, those names keep changing fast :-).
[15:52:36] Wikimedia Labs / deployment-prep (beta): HHVM: crashes with "boost::program_options::invalid_option_value" exception - https://bugzilla.wikimedia.org/68413#c8 (Bryan Davis) Updated beta servers to hhvm-luasandbox 2.0-3 build, changed config back to luasandbox and restarted hhvm fcgi container. Still se...
[15:52:56] scfc_de: tl;dr: MySQL you know. MariaDB->fork of MySQL we use. TokuDB->the actual database engine now in use for the replicas, and the reason we switch from MariaDB 5 to MariaDB 10.
[15:53:13] http://www.tokutek.com/products/tokudb-for-mysql/
[15:53:39] Replicas were InnoDB before, TokuDB now.
[15:54:29] andrewbogott: Yeah, you should be able to just debuild it and push it to our repo. Worst case, we can stuff it in the tools' local repo even.
[15:54:54] andrewbogott: But since prod uses the package with the broken dependency too, I think, the better fix is to really send it to our apt repo.
[15:55:23] Yeah, I'm assuming I'll update the version on carbon once it's built
[16:09:48] !log deployment-prep Reverted MW config to re-enable luasandbox mode; back to luastandalone for now
[16:09:51] Logged the message, Master
[16:15:24] Wikimedia Labs / deployment-prep (beta): beta labs no longer listens for HTTPS - https://bugzilla.wikimedia.org/68387#c7 (Chris McMahon) Zelko had this issue also.
[16:37:02] Coren: what is lab's IP and is it constant.
[16:37:28] Define "lab's IP"? There are very many of them.
[16:37:56] The external IP that is visible to the public.
[16:38:51] It depends a lot. You mean for outbound connections? Tool labs has several possible (every grid node has one); tools-login and tools-dev each have one, and there is a default NAT for most everything else.
[16:40:16] Coren: Alright, say my user analysis tool (supercount) web tool queries another site. What IP will that site see?
[16:41:40] If it's a web tool, it'll get the default NAT. 208.80.155.255
[16:41:52] Is it static?
[16:41:55] * andrewbogott back after lunch
[16:42:57] Coren: is the IP static?
[16:43:39] CP678|iPhone: Yes.
[16:43:55] Coren: thanks. :-)
[17:02:13] Coren: would you mind reviewing https://gerrit.wikimedia.org/r/#/c/148977/ and adding these entries to /data/project/.system/config/ ?
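A rough reconstruction of the source-package workflow scfc_de describes at 15:49:45; the repository location and file names below are placeholders rather than the actual ones used for libvips:

    git clone <deb-repo-url> && cd <package>          # check out the packaging repo
    cp <package>_<version>.orig.tar.gz ..             # upstream source tarball, downloaded separately
    dpkg-buildpackage -S -sa -rfakeroot -us           # build the source package
    sudo pbuilder build ../<package>_<version>*.dsc   # build the binary .debs in a clean chroot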
[17:02:55] Coren: I could quit my gerrit excursion then ;)
[17:03:48] Coren: these entries https://tools.wmflabs.org/paste/view/70e69994
[17:23:51] (PS4) Tim Landscheidt: webservice: Enable individual memory limits [labs/toollabs] - https://gerrit.wikimedia.org/r/148977 (https://bugzilla.wikimedia.org/68431) (owner: Hedonil)
[17:24:43] (CR) Tim Landscheidt: "Added the Debian packaging tinsel and fixed the "c" in the commit message." [labs/toollabs] - https://gerrit.wikimedia.org/r/148977 (https://bugzilla.wikimedia.org/68431) (owner: Hedonil)
[18:23:52] scfc_de: ok, I've built many a deb but apparently this is different… I can configure and make the software, but when I build the debian it says 'no upstream tarball found'
[18:24:19] which… what does it mean by 'upstream'? Why isn't it just making a package out of the thing I just built?
[18:29:32] (CR) coren: [C: 2] "Sane." [labs/toollabs] - https://gerrit.wikimedia.org/r/148977 (https://bugzilla.wikimedia.org/68431) (owner: Hedonil)
[18:32:40] hm… Coren, same question. do I really need an 'upstream tarball' to build this?
[18:32:44] andrewbogott: That's what I meant by "downloading the source .tar.gz 'from the InterNet' (I assume that's different in ops)". *I* went to the libvips project page, downloaded the .tar.gz for the version that was to be packaged and placed it in -- I think, ".." (wherever dpkg-* expects it). But I assume that in ops those sources are already available somewhere.
[18:33:08] Oh...
[18:33:09] hm
[18:33:21] I assumed that what I checked out from gerrit /was/ the source. And the debian files.
[18:33:29] Since it seems to contain code which builds...
[18:33:44] andrewbogott: It's /our/ source; but building a deb needs to make diffs versus upstream.
[18:34:14] Really? Is that something that all debs do or is this one just configured to track the diff?
[18:35:13] And, in this context, what is 'upstream'? Just the most recent version that we packaged?
[18:38:05] andrewbogott: That's part of how the whole deb thing works. Upstream should be the most recent version we packaged, yes; the one whose version precedes the -wmf*
[18:39:01] I am feeling slightly disoriented by having built .debs for a decade and never encountered this before :(
[18:39:26] ... what?
[18:39:33] exactly!
[18:39:49] So, anyway, it's asking me for a tarball and not a .deb. So I can't literally provide it with the older package, right?
[18:40:15] Wait, that's the difference between building one's own package vs building one against an upstream :-)
[18:40:26] And yes, it wants a tarball.
[18:42:48] And, ok, still feeling stupid… where would that tarball come from? The nice folks at ubuntu provide .debs, do they also provide tarballs just in case I want to do this?
[18:43:18] andrewbogott: Yes, you should have a link to the .tar.gz.orig from the package page.
[18:44:45] I think you can also pass --no-tgz-check to debuild if you don't need to build the source deb, but I don't recall if that has other impacts.
[18:45:08] Also, .orig.tar.gz not .tar.gz.orig
[18:47:16] So, does that mean download vips-7.32.3.tar.gz and rename it to vips-7.32.3.orig.tar.gz, or is the .orig. something different?
[18:48:05] * andrewbogott clearly has a lot of reading to do
[18:48:41] Hah! Utopic Unicorn. Ryan's own!
[18:49:18] andrewbogott: Just write down a nice 1., 2., 3. afterwards for the next guy :-).
[18:49:28] Are there really no non-fictional U animals? Seems like they're breaking new ground.
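A sketch of the tarball step under discussion; the upstream download location is left unspecified, and the underscore naming follows the usual Debian convention for orig tarballs, which differs slightly from the renaming spelled out above:

    # place the pristine upstream tarball next to the packaging checkout, in the parent directory
    cp vips-7.32.3.tar.gz ../vips_7.32.3.orig.tar.gz
    # then build; --no-tgz-check (mentioned above) would skip the tarball lookup entirely
    debuild -us -uc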
[18:50:35] andrewbogott: The download-and-rename should work provided that is the exact upstream version this was built with.
[18:50:45] ok...
[18:50:52] andrewbogott: Alternately, try --no-tgz-check. :-)
[18:52:56] Coren: Ah great. mind also updating /data/project/.system/config/ with new settings files https://tools.wmflabs.org/paste/view/70e69994 ?
[18:53:10] Coren: I'd be all set. Then. I think.
[18:53:19] ;-)
[18:55:47] Coren: Woo. thanks
[18:57:28] bigbrother has now 6 new customers. hopefully he's watching with lovin' grace
[18:57:35] I haven't rebuilt the package yet, soon.
[19:08:58] how to resolve "Enable irc feed for wikitech.wikimedia.org site" ?:)
[19:09:37] can help on https://bugzilla.wikimedia.org/show_bug.cgi?id=34685
[19:11:44] andrewbogott: ping? did we get a deploy window for OAuth on wikitech?
[19:12:19] YuviPanda: I haven't thought about it a lot. Is there more involved than just adding a line to the config?
[19:12:28] I had the impression from early conversation that it was complicated...
[19:13:34] andrewbogott: I... don't think it's more complicated. Add a couple of lines of config + hand out the user right
[19:13:42] Reedy: ^ OAuth config? I don't think it's that messy
[19:14:26] I'd have to check
[19:14:42] andrewbogott: Wikitech should probably catch up with production again sometime ;)
[19:14:57] considering it's still 1.23 (wmf22)
[19:15:13] Reedy: that is never as easy as it sounds :(
[19:16:30] haha, yeah :(
[19:17:01] Reedy: andrewbogott OAuth says 1.22+ tho
[19:17:32] Yeah, it's just had some updates and such
[19:17:34] andrewbogott: http://p.defau.lt/?gF8gsZ_9D4g7MbqzCaY8Xw
[19:17:38] That might be enough to begin with
[19:17:39] yeah
[19:22:04] Reedy, that sets up wikitech as a consumer or a provider?
[19:22:08] Maybe I'm confused about which is which
[19:22:48] Nice new feature: Query State: Queried about 2130000 rows
[19:22:50] provider
[19:22:52] provider
[19:23:08] ok
[19:34:48] Woo! \o/ and *unprivileged* show explain - command for long running queries. Incredible.
[19:35:13] hedonil: does EXPLAIN SELECT WORK?
[19:35:14] err
[19:35:15] work?
[19:35:39] YuviPanda: Explain itself didn't work for me
[19:35:51] hedonil: heh, that's sad
[19:35:54] stupid mysql
[19:36:09] YuviPanda: yeah privilege thingy..
[19:37:50] YuviPanda: if you have a long-running query (> ~ 5 sec or so), you can issue > show explain for 179739; <- running thread
[19:42:39] provides the infos you want to inspect, finally! http://tools.wmflabs.org/tools-info/misc/show_explain.txt
[19:43:33] Yeah, that stupid rule about "can't explain unless you have select on all the underlying tables" is still there. I mean, otherwise it's a HUGE breach of security, you /might/ be able to estimate the number of rows in a table because of the query plan!
[19:44:30] Coren: yeah, at worst should've made that another grantable thing
[19:48:11] There will be an even more accurate statement https://mariadb.com/kb/en/mariadb/mariadb-documentation/sql-commands/administration-commands/analyze-statement/
[19:48:24] unfortunately 10.1
[19:48:58] hedonil: ah, nice
[19:49:03] hedonil: do you know when that's supposed to be out?
[19:50:16] YuviPanda: I think it is out, but labs is currently 10.0.11
[19:50:29] YuviPanda: yep: Release date: 30 Jun 2014
[19:53:47] Nonetheless, Show explain is a great leap. definitively.
[20:05:19] !help
[20:05:19] !documentation for labs !wm-bot for bot
[20:05:26] hm. why isn't wm-bot in the #salt channel?
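For the SHOW EXPLAIN feature hedonil walks through at 19:34–19:53, a hedged illustration of the two-step usage from a Tools shell; the host alias is arbitrary and the thread id is the one from his example:

    mysql -h enwiki.labsdb -e "SHOW PROCESSLIST"          # find the id of the long-running thread
    mysql -h enwiki.labsdb -e "SHOW EXPLAIN FOR 179739"   # view that thread's live query plan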
[20:05:30] !help
[20:05:30] !documentation for labs !wm-bot for bot
[20:05:37] -_-
[20:05:40] !wm-bot
[20:05:41] http://meta.wikimedia.org/wiki/WM-Bot
[20:06:13] This channel is already in db
[20:06:13] @add #salt
[20:06:21] @part #salt
[20:06:29] @join #salt
[20:07:08] -_-
[20:07:14] petan: ^^ what's going on?
[20:07:36] @join #salt-devel
[20:08:08] !help
[20:08:08] !documentation for labs !wm-bot for bot
[20:08:14] This channel is already in db
[20:08:14] @join #salt
[20:08:33] This channel is already in db
[20:08:33] @join #salt-devel
[20:08:51] @part #salt-devel
[20:10:07] @part #salt
[20:10:16] @add #salt
[20:14:31] Hoi Coren ... any update for the dumps ?
[20:15:02] Chris sez: "wipes are currently in progress, probably be done today"
[20:15:42] Lots 'o' disks to wipe; this is a big server with three shelves of disks.
[20:16:45] Coren: how much storage are we getting?
[20:17:29] YuviPanda: *way* more than we need. I don't have the exact specs yet but on the order of 120T
[20:17:50] can someone restart wmbot?
[20:17:54] it's obviously broken
[20:17:59] it's not in any of the channels its supposed to be in
[20:18:05] and hasn't been for days
[20:18:17] https://wikitech.wikimedia.org/wiki/Wm-bot
[20:19:13] Wikimedia Labs / wikitech-interface: Enable irc feed for wikitech.wikimedia.org site - https://bugzilla.wikimedia.org/34685#c11 (Daniel Zahn) i can confirm that virt1000 (wikitech) can talk UDP to argon. checked with netcat the wiki config of wikitech is not currently puppetized though and it would ne...
[20:19:59] Wikimedia Labs / wikitech-interface: Enable irc feed for wikitech.wikimedia.org site - https://bugzilla.wikimedia.org/34685#c12 (Daniel Zahn) config is at: root@virt1000:/srv/org/wikimedia/controller/wikis/config/ Debug.php Local.php oauth.sql Private.php Settings.php
[20:21:42] could I get a picture of that server ..
[20:21:43] Coren: cool :)
[20:21:46] as an illustration
[20:22:14] ssh bots-labs
[20:22:19] ssh_exchange_identification: Connection closed by remote host
[20:23:13] hm. I wonder if it OOM'd
[20:28:18] Ryan_Lane: AFAIUI, wm-bot still runs in the Bots project, so I assume you're still root there?
[20:28:41] (I don't think anyone apart from petan really knows the setup, though.)
[20:28:42] I don't have ssh access from here
[20:28:42] Looks to me like it lives on instance wm-bot
[20:28:46] Wikimedia Labs / wikitech-interface: Enable irc feed for wikitech.wikimedia.org site - https://bugzilla.wikimedia.org/34685 (Andre Klapper) a:Ryan Lane>None
[20:28:49] bots-labs doesn't exist anymore
[20:29:08] but I need it working for the #salt channel
[20:29:40] I can log into that box but don't know how wm-bot is handled...
[20:29:45] doesn't seem to have an init.d
[20:29:49] * andrewbogott looks
[20:30:11] mono wmib.exe .. wow
[20:30:35] yeah, it's a C# project
[20:32:34] killed it
[20:32:40] and some restart.sh brought it back
[20:33:00] eh.. "Bot is using external bouncers. That means, you first start the bouncers, then you start restart script which keep core running. The core is connected to bouncers, which are connected to network. In case you need to reboot the core, bot stays connected to network. " .. shrug
[20:33:30] mutante, are you working on this? If so I'll stand back...
[20:33:38] the directions on petan's page say to kill restart.sh first
[20:34:06] i only see that in the "complete shutdown" part
[20:34:18] ah
[20:35:32] eh, andrewbogott but i did not do this part:
[20:35:38] wm-bot 23814 0.0 0.0 4408 660 ?
S Jun07 0:00 sh restart.sh
[20:35:45] -bash: kill: (23814) - No such process
[20:36:32] which instructions are we following?
[20:36:32] If you need to restart the bot because you changed the binaries and you need it to completely reload, you just need to execute "halt" or @restart in any channel
[20:36:40] Permission denied
[20:36:40] @restart
[20:36:54] Ryan_Lane: do you have permissions for that maybe?
[20:37:05] Permission denied
[20:37:05] @restart
[20:37:07] nope
[20:37:12] I think only petan does
[20:49:28] the user apparently changed from wmib to wm-bot
[20:49:29] there we go
[20:49:33] Ryan_Lane: try again
[20:49:42] Ryan_Lane: security?
[20:49:50] Permission denied
[20:49:50] @join #salt
[20:50:13] This channel is already in db
[20:50:13] @join #salt
[20:50:18] This channel is already in db
[20:50:18] @join #salt-bot
[20:50:28] hrrmm
[20:50:33] working now
[20:50:36] oh it is? nice
[20:50:40] it's joining all the channels slowly
[20:51:06] wherever it says "wmib", replace with "wm-bot" then
[20:51:43] and startbouncer.sh as the right user
[20:52:44] thanks!
[20:53:02] yw
[20:53:18] !log tools Set SGE "mailer" parameter again for bug #61160
[20:53:20] Logged the message, Master
[20:53:47] * Ryan_Lane waves
[21:04:35] has anybody an idea why my program gets a wrong ip for s3.labsdb? jdbc:mysql://s3.labsdb:3306/ Can't connect to MySQL server on '10.64.37.4' . problem exists on execution hosts and also on tools-dev, but other bash tools like host or nslookup are returning correct ip 192.168.99.3
[21:31:07] GerardM-: You'd have to ask cmjohnson1; he's the one at the DC
[21:32:41] Merlissimo: I'm guessing the JDBC is trying to be too smart for its own good; try to use 's3.labsdb.' (note the trailing dot) to avoid it trying to append a domain name?
[21:34:52] i'll try your fix. but connection to all other s[1-2,4-7].labsdb is created successfully
[21:36:03] no: java.net.UnknownHostException: s3.labsdb.: Name or service not known
[21:36:15] Wikimedia Labs / wikitech-interface: puppetize wikitech wiki configs - https://bugzilla.wikimedia.org/68535 (Daniel Zahn) NEW p:Unprio s:normal a:None puppetize the mediawiki config of the wikitech wiki, so we can apply changes via gerrit and not live hack. There are a couple files though...
[21:36:19] * Coren boggles a little.
[21:36:43] anyone know how the hhvm debugger works? I see its enabled in mediawiki-vagrant, but when i load it up via `hhvm --debug-port 8091` and use `break start`, it says 'Breakpoint 1 set start of request' but running requests through the hhvm server at that point doesn't result in my breakpoint triggering
[21:38:07] Merlissimo: Sorry, I have no idea. No tool I try seems to treat s3 differently, and the /etc/hosts file never mentions the underlying addresses. I don't even know what could possibly return the 10.64.37.4 address.
[21:38:11] reverse lookup for 10.64.37.4 is labsdb1002.eqiad.wmnet
[21:38:40] mutante: did you update the wm-bot docs accordingly?
[21:39:01] andrewbogott: yes
[21:39:07] thanks!
[21:39:10] Merlissimo: Yeah, that's the DB that used to host s2, s4 and s5.
In other words, not only are you getting a raw address, but you're getting the /wrong/ raw address
[21:39:44] ebernhardson: I set it up for sandbox testing, so it's not really hooked into the live fcgi requests
[21:39:58] It's more a replacement for phpsh
[21:40:09] maybe i should only wait for this nameserver ttl to expire
[21:40:48] Merlissimo: That's just it - no nameserver has ever had that name in them; that's strictly through hosts files.
[21:40:52] ebernhardson: If you want to poke at the config and figure out how to inspect the live fcgi server I'd be glad to review config patches.
[21:41:43] Merlissimo: And no host file has ever held the raw addresses, only the NATed ones.
[21:42:19] I have *no* idea where Java is getting that value from.
[21:42:34] i am not really sure because it sounds so strange: java is
[21:42:41] seems to resolve ipv6
[21:42:58] and mysql converts it back to ipv4
[21:43:18] ... wat.
[21:43:38] YuviPanda: are tools-proxy-test and tools-trusty-test still in use?
[21:43:48] (I'm just cleaning up puppet on tools and those are the stragglers)
[21:44:00] andrewbogott: you can kill them for now
[21:44:12] according to debug java.net.Inet6AddressImpl.lookupAllHostAddr is called, but jdbc returns that this ipv6 address is used
[21:44:13] Sure?
[21:44:18] It's not important, I'm just being compulsive...
[21:44:34] i don't know. maybe i should simply sleep on it
[21:44:38] andrewbogott: yeah, should be ok...
[21:45:00] Merlissimo: That's even more confusing to me because none of labs has any IPv6 support atm, and no host has AAAA rrs
[21:45:20] andrewbogott: hey, I had an issue last night with not getting sudo in a new instance I created (twice) in the deployment-prep project
[21:45:33] ori just ended up using his sudo to add me to the local sudoers file
[21:45:57] legoktm: is there a sudo policy for you, in theory?
[21:45:58] (and urged me to ping you guys about it)
[21:46:05] the instance was deployment-sentry (now deleted) and now deployment-sentry2
[21:46:12] Generally an instance has to finish its first puppet run before sudo works properly...
[21:46:12] andrewbogott: uhhh, I don't know?
[21:46:21] I waited like 3 hours
[21:46:23] ok, I'll look...
[21:46:26] 3 hours should do it :)
[21:47:23] thanks
[21:47:43] legoktm: what's your username on wikitech?
[21:47:46] also re: email on wikitech not working, I think you need to run echo's processEmailBatch.php script
[21:47:49] andrewbogott: "legoktm" :P
[21:47:59] Ah, well --
[21:48:12] * YuviPanda renames legoktm to ktmehta everywhere
[21:48:17] I take it neither you nor ori looked at https://wikitech.wikimedia.org/wiki/Special:NovaSudoer ?
[21:48:24] * legoktm looks
[21:48:36] woah
[21:48:39] I did not know that was a thing
[21:48:41] Seems pretty straightforward, and you're not in there.
[21:48:46] That's where sudo comes from
[21:48:56] (Except for ops who are magically escalated.)
[21:49:11] i didn't know that was a thing either
[21:49:15] ok :)
[21:49:18] It's in the sidebar.
[21:49:21] I didn't know that was a thing!
[21:49:22] if you get projectadmin you get sudo by default
[21:49:26] but that's because of default sudo rule
[21:49:28] 'Manage Sudo Policies'
[21:49:33] but he is projectadmin
[21:49:34] so you usually don't go there at all
[21:49:35] hm, I'm a projectadmin though?
[21:49:47] hmm, projectadmins should get sudo by default...
[21:49:53] maybe that's why i didn't know it existed; i had always just projectadminned folks to grant them sudo and that had always worked
[21:50:04] I am not sure that's right, about project admins.
[21:50:13] Project admins can set sudo policies
[21:50:15] hmm, it's held true so far to me on very different projects...
[21:50:19] But I don't think it's automatic.
[21:50:30] andrewbogott: maybe the default sudoer stuff gives projectadmins sudo?
[21:50:31] New projects have a permissive sudo policy by default.
[21:50:40] So unless you locked it down on purpose, all project members have sudo for everything.
[21:50:48] oh
[21:50:48] lol
[21:50:52] I... didn't know that.
[21:51:37] The spirit of collaboration, etc :)
[21:51:58] heh, so I guess I just added people to projectadmin and assumed that gave them sudo, while they actually had sudo anyway
[21:52:29] so, legoktm, the right solution is for you to add yourself to a group on that page, and remove the by-hand change. That'll get things set right on all instances.
[21:52:57] ok, will do
[21:53:22] it's a bit funny that I can manipulate sudo groups but not use sudo itself :P
[21:54:53] bd808: ok i'll poke around
[21:55:06] ebernhardson: Cool
[21:56:24] ebernhardson: If you're in the SF office, Brett Simmers may be able to help
[21:59:19] andrewbogott: added myself to the group and removed the local override, and still working. thanks for the pointer!
[21:59:28] cool
[21:59:41] I would've been uncertain about that page working, except I just tested it a bunch earlier in the week :)
[22:00:32] Coren: i added extra code to resolve the ip and now jdbc gets an ip instead of a hostname. this works
[22:01:01] Merlissimo: It's still annoying and confusing. I wish I knew what was going wrong.
[22:01:49] btw: dewiki replication has stopped
[22:02:49] Merlissimo: I'll have Sean take a look when he wakes up.
[22:02:57] There may still be kinks in the new setup.
[22:03:18] seems to rise again : 20140724215303
[22:03:48] Might just have been someone holding a lock during a query.
[22:04:40] was 17 minutes lag before
[22:06:00] Wikimedia Labs / tools: libvips-tools, libtiff etc install - https://bugzilla.wikimedia.org/52717 (Tim Landscheidt) PATC>ASSI
[22:18:02] Wikimedia Labs / tools: view globalimagelinks missing at db commonswiki_p on new mariadb 10 sql server - https://bugzilla.wikimedia.org/68505#c1 (merl) NEW>RESO/FIX fixed by Marc
[22:21:44] Coren: can you approve a shell request? username: Arkiver
[22:22:38] Nemo_bis: {{done}}
[22:23:33] thanks
[22:32:44] Wikimedia Labs / tools: Performance problem on database server s5 using commonswiki - https://bugzilla.wikimedia.org/67602#c6 (Marc A. Pelletier) @merl; how did the transition to MariaDB 10 go?
[22:52:57] Wikimedia Labs / tools: Performance problem on database server s5 using commonswiki - https://bugzilla.wikimedia.org/67602#c7 (merl) After the change to mariadb10 nearly all of my script always failed with one of two different errors: ERROR 2013 (HY000): Lost connection to MySQL server during query ERR...
[23:02:58] * ebernhardson got absolutely nowhere with the hhvm debugger ... will have to poke brett when he's in the office
[23:50:56] How do I send an email to an address from tools?
[23:51:23] "echo -e "Subject: Test message subject\n\nTest message" | /usr/sbin/exim -odf -i user@example.com" didn't do what I expected it to do
[23:51:31] scfc_de: ^ maybe you know?
[23:52:41] I guess I could do "toolname.anything@tools.wmflabs.org"
[23:52:46] and create ~/.forward.anything
[23:58:11] legoktm: "echo Test | mail -s Test tim@tim-landscheidt.de" works for me from tools-login's command line. I saw something about sending mail in jobs that end immediately afterwards, but I don't remember the details. What wasn't what you expected it to do?
[23:59:12] I'm trying echo Test | mail -s Test legoktm@wikimedia.org
[23:59:25] but it goes to legoktm.wikipedia@gmail.com which is my email on wikitech
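A sketch of the per-tool alias approach legoktm floats at 23:52, assuming the .forward.<suffix> mechanism works the way he guesses; the tool name and suffix are hypothetical:

    become mytool                                    # switch to the (hypothetical) tool account
    echo someone@example.org > ~/.forward.alerts     # route mytool.alerts@ mail to an explicit address
    echo Test | mail -s Test mytool.alerts@tools.wmflabs.org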