[01:53:56] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2400928 (AlexMonk-WMF) a:AlexMonk-WMF I'm having a go at this. > It blows up and rebuilds all wikis on every run. It truncates the meta_p.wiki...
[02:51:38] apparently this jenkins job is stalled? https://integration.wikimedia.org/ci/job/tox-jessie/8954/ (ping enterprisey)
[07:18:55] Labs, Labs-Infrastructure, DBA, Operations, and 2 others: adywiki and jamwiki are missing the associated *_p databases with appropriate views - https://phabricator.wikimedia.org/T135029#2401040 (jcrespo) labsdb1002 will never get fixed.
[08:07:34] OH-: I aborted it.
[08:09:52] Labs, Labs-Infrastructure, DBA, Operations, and 2 others: adywiki and jamwiki are missing the associated *_p databases with appropriate views - https://phabricator.wikimedia.org/T135029#2401109 (Jdforrester-WMF) stalled>Resolved a:Jdforrester-WMF In that case, I'm declaring this fixed.
[08:09:55] Labs, labs-sprint-116, DBA, Patch-For-Review: Make watchlist table available on labs - https://phabricator.wikimedia.org/T59617#2401114 (Jdforrester-WMF)
[08:22:12] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2400728 (jcrespo) I have to add a view to a newly created labs-only table, so it is created for new wikis, too: ``` MariaDB L...
[08:22:17] (PS2) Lokal Profil: Add wikidata connection to monuments_all and Qid tester to updater [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295594 (https://phabricator.wikimedia.org/T55808)
[08:23:26] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2401137 (jcrespo)
[08:23:29] Labs, labs-sprint-116, DBA, Patch-For-Review: Make watchlist table available on labs - https://phabricator.wikimedia.org/T59617#2401136 (jcrespo)
[08:24:48] (CR) Lokal Profil: "This is just a quick fix to make these slightly easier to find." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295579 (owner: Lokal Profil)
[08:57:14] Labs: Nova_Resource:Puppet.privpol-captcha.eqiad.wmflabs will not go away - https://phabricator.wikimedia.org/T138417#2401204 (Andrew) Open>Resolved a:Andrew There is, indeed, a hook, deep in the guts of designate-sink. I just tested this, and it worked fine for me: https://wikitech.wikimedia.o...
[09:12:46] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Nodepool has trouble taking snapshots on OpenStack labs - https://phabricator.wikimedia.org/T138106#2401255 (hashar) Real way to reproduce what Nodepool is doing would be: ``` ssh labnodepool1001.eqiad.wmnet become-nodepool nodepool im...
[09:42:14] !log tools.heritage Added the fa Wikipedia account for Pywikibot. This should fix the broken unused image job
[09:42:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master
[09:44:44] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Nodepool has trouble taking snapshots on OpenStack labs - https://phabricator.wikimedia.org/T138106#2401318 (Andrew) When I try those commands, it gets stuck on 2016-06-23 09:44:02,341 INFO urllib3.connectionpool: Starting new HTTP co...
[09:59:30] Labs, Labs-Infrastructure, Continuous-Integration-Infrastructure: Nodepool has trouble taking snapshots on OpenStack labs - https://phabricator.wikimedia.org/T138106#2401389 (hashar) Yes sir! Apparently looping forever trying to `GET /v2/contintcloud/images/` which raises a 404 cause the sna...
[09:59:38] andrewbogott: hello :-)
[09:59:46] hi!
[09:59:57] andrewbogott: if you are attending Wikimania, the task about Nodepool not being able to take a snapshot can wait :}
[10:00:16] as you stated, if labs infra currently is over capacity, that is totally understandable
[10:00:20] Well, I have 30 minutes of downtime now… after that I'll be distracted
[10:00:30] and I guess the real fix is going to take way longer than a half hour hack while attending the conf :}
[10:00:32] Well… it really shouldn't 404 in that case
[10:01:01] and I tried manually using the openstack command line, apparently it works
[10:01:03] :(
[10:01:08] huh
[10:01:22] I should try again. Only tried once yesterday and it passed
[10:01:27] might be a false positive
[10:01:52] if I hammer the nodepool image update command, after a while it eventually manages to take a snapshot.
But I refrain from doing that for fear it adds load / stresses the infra
[10:02:50] either glance is wild or compute nodes are somehow unable to snapshot (maybe due to disk)
[10:03:07] the nova-api.log doesn't have any useful info (it does not log the response)
[10:04:49] I tried to turn on some more logging, not sure if it made a difference though
[10:05:09] I ended up using strace
[10:05:41] strace -f -e recvfrom,sendto -s 1024 nodepool image-update wmflabs-eqiad ci-jessie-wikimedia
[10:05:52] that is how I have filled the json response on https://phabricator.wikimedia.org/T138106
[10:06:09] apparently there is a snapshot attempt going on
[10:06:20] with the instance being ACTIVE openstack server show ci-jessie-wikimedia-1466676088
[10:06:47] it is never going to complete though since apparently the snapshot requests failed somehow and there is no image
[10:06:55] so the URL to request the image status always 404s
[10:06:59] (PS1) Lokal Profil: Fix issus with Iranian monuments in Farsi [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295645 (https://phabricator.wikimedia.org/T138377)
[10:16:37] poor rabbitmq :D
[10:23:17] hashar: I've learned nothing, as you predicted
[10:29:50] :-(
[10:45:03] (PS1) Lokal Profil: Add wd_item mappings for datasets on nl.wikipedia [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295653
[10:48:53] (CR) Lokal Profil: "Just checking. This patchset still had the database access issues right?" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/291198 (https://phabricator.wikimedia.org/T136351) (owner: Jean-Frédéric)
[10:49:38] andrewbogott, there's a couple of new instance pages that haven't gone away when the instances were deleted, https://wikitech.wikimedia.org/wiki/Nova_Resource:Captcha-web-04.privpol-captcha.eqiad.wmflabs and https://wikitech.wikimedia.org/wiki/Nova_Resource:Captcha-web-03.privpol-captcha.eqiad.wmflabs need to be deleted. This happened after I opened that task.
:/
[10:59:34] (CR) Lokal Profil: "The downside of the above suggestion is that you could no longer watchlist your nice empty tables to catch when something has gone wrong.." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295579 (owner: Lokal Profil)
[12:01:16] tom29739: I wiped those pages as well… we'll have to see if things are still leaking.
[12:01:54] Thanks.
[12:23:25] !log privpol-captcha Fixed Hiera config making puppet create a public key file on the salt minions without a new line at the end of the file that was preventing the salt minion service from starting.
[12:23:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Privpol-captcha/SAL, Master
[12:27:00] (CR) Multichill: [C: 2] "This should work" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295645 (https://phabricator.wikimedia.org/T138377) (owner: Lokal Profil)
[12:39:38] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2401809 (jcrespo) >> It blows up and rebuilds all wikis on every run. >It truncates the meta_p.wiki table but it doesn't drop...
[12:48:04] (CR) Jean-Frédéric: [C: 2] Fix issus with Iranian monuments in Farsi [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295645 (https://phabricator.wikimedia.org/T138377) (owner: Lokal Profil)
[12:48:43] (CR) Jean-Frédéric: [C: 2] "Sounds useful. We can still transclude everything in one page if needed. FWIW, I do have some of these pages in my watchlist ;)" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295579 (owner: Lokal Profil)
[12:50:32] (CR) Jean-Frédéric: [C: 1] "Looks good to me, one question though."
(1 comment) [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295653 (owner: Lokal Profil)
[12:53:00] (CR) Jean-Frédéric: [C: 2] Add wikidata connection to monuments_all and Qid tester to updater [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295594 (https://phabricator.wikimedia.org/T55808) (owner: Lokal Profil)
[12:53:20] (CR) Jean-Frédéric: [C: 2] "Oh, I read the dependency in the wrong sense. Approving :)" [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295653 (owner: Lokal Profil)
[12:54:50] (Merged) jenkins-bot: Fix issus with Iranian monuments in Farsi [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295645 (https://phabricator.wikimedia.org/T138377) (owner: Lokal Profil)
[13:08:25] (CR) Jean-Frédéric: "No, it should be fine − just takes forever to load the SQL file to the database when starting the container." [labs/tools/heritage] - https://gerrit.wikimedia.org/r/291198 (https://phabricator.wikimedia.org/T136351) (owner: Jean-Frédéric)
[13:08:26] (CR) Jean-Frédéric: [C: 2] Enable PHP CodeSniffer with MediaWiki preset [labs/tools/heritage] - https://gerrit.wikimedia.org/r/293889 (https://phabricator.wikimedia.org/T134764) (owner: Jean-Frédéric)
[13:09:13] (Merged) jenkins-bot: Add category to "Unknown fields" reports [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295579 (owner: Lokal Profil)
[13:10:28] (Merged) jenkins-bot: Add wikidata connection to monuments_all and Qid tester to updater [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295594 (https://phabricator.wikimedia.org/T55808) (owner: Lokal Profil)
[13:12:05] (Merged) jenkins-bot: Add wd_item mappings for datasets on nl.wikipedia [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295653 (owner: Lokal Profil)
[13:14:17] (CR) jenkins-bot: [V: -1] Enable PHP CodeSniffer with MediaWiki preset [labs/tools/heritage] - https://gerrit.wikimedia.org/r/293889
(https://phabricator.wikimedia.org/T134764) (owner: Jean-Frédéric)
[13:23:33] (PS2) Jean-Frédéric: Enable PHP CodeSniffer with MediaWiki preset [labs/tools/heritage] - https://gerrit.wikimedia.org/r/293889 (https://phabricator.wikimedia.org/T134764)
[13:24:49] (CR) Jean-Frédéric: [C: 2] Enable PHP CodeSniffer with MediaWiki preset [labs/tools/heritage] - https://gerrit.wikimedia.org/r/293889 (https://phabricator.wikimedia.org/T134764) (owner: Jean-Frédéric)
[13:39:19] (Merged) jenkins-bot: Enable PHP CodeSniffer with MediaWiki preset [labs/tools/heritage] - https://gerrit.wikimedia.org/r/293889 (https://phabricator.wikimedia.org/T134764) (owner: Jean-Frédéric)
[13:58:07] Labs, Labs-Infrastructure, DBA, Operations, and 2 others: adywiki and jamwiki are missing the associated *_p databases with appropriate views - https://phabricator.wikimedia.org/T135029#2402068 (Krenair) a:Jdforrester-WMF>ori
[14:01:37] !log tools.heritage Deployed latest from Git: 4030533, bb95d23 (T55808), bd96bbd (T138377), 0a3247d, be9b1a9 (T134764)
[14:01:40] T134764: Enable PHP CodeSniffer on Tools.Heritage - https://phabricator.wikimedia.org/T134764
[14:01:40] T55808: Add wd_item to the database - https://phabricator.wikimedia.org/T55808
[14:01:41] T138377: Add Iran in Farsi to the Monuments Database - https://phabricator.wikimedia.org/T138377
[14:01:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master
[14:05:16] (PS8) Jean-Frédéric: Add local dev environment with docker-compose [labs/tools/heritage] - https://gerrit.wikimedia.org/r/291198 (https://phabricator.wikimedia.org/T136351)
[14:24:40] !log tools.heritage Monuments API currently down because of PHP 5.5 syntax, and host running 5.3
[14:24:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master
[14:27:55] !log tools.heritage Stopped webservice, restarted and trying on trusty (`webservice
--release=trusty start`)
[14:27:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master
[14:33:07] !log tools.heritage Running 'ALTER TABLE `monuments_all` ADD COLUMN `wd_item` varchar(255) DEFAULT NULL;' ; taking a while...
[14:33:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master
[14:54:21] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2402245 (AlexMonk-WMF) >>! In T138450#2401809, @jcrespo wrote: >>> It blows up and rebuilds all wikis on every run. >>It trunc...
[14:55:53] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2402249 (jcrespo) > Who in the ops group could be its 'sole owner'? No one else has any access to these systems. Maybe labs a...
[15:06:23] yuvipanda hi, do you know how I can use ssh in jenkins please? I want to try and ssh into the same instance as my test jenkins to test ssh before using gerrit, but it doesn't seem to be working.
[15:06:37] http://gerrit-jenkins.wmflabs.org/computer/Test/
[15:11:33] fyi labsen: labmon1001 just went down for the data copy and reimage
[15:11:59] paladox: yuvi is at wikimania so his replies in irc may be appropriately delayed
[15:12:14] robh, oh, thanks for replying.
[15:33:17] !log tools.heritage Added column wd_item to monuments_all, by copying monuments_all to tmp, alter table, and rename back to avoid locks.
(T55808)
[15:33:19] T55808: Add wd_item to the database - https://phabricator.wikimedia.org/T55808
[15:33:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.heritage/SAL, Master
[16:07:55] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2402468 (chasemp) I'm 100% on board for being on the hook for this process, or at least being a partner. We can coparent :)...
[16:09:50] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2402474 (chasemp) p:Triage>High
[16:42:38] (CR) Lokal Profil: [C: 2] Add local dev environment with docker-compose [labs/tools/heritage] - https://gerrit.wikimedia.org/r/291198 (https://phabricator.wikimedia.org/T136351) (owner: Jean-Frédéric)
[16:43:36] (Merged) jenkins-bot: Add local dev environment with docker-compose [labs/tools/heritage] - https://gerrit.wikimedia.org/r/291198 (https://phabricator.wikimedia.org/T136351) (owner: Jean-Frédéric)
[17:02:07] (PS1) Lokal Profil: Add two missing fields to i18n [labs/tools/heritage] - https://gerrit.wikimedia.org/r/295736
[17:44:18] Labs, Labs-Infrastructure, DBA, Operations, and 2 others: adywiki and jamwiki are missing the associated *_p databases with appropriate views - https://phabricator.wikimedia.org/T135029#2286195 (ksmith) Thanks to everyone who helped get this unstuck and fixed!
[18:08:59] Labs, Labs-Infrastructure: Copy labmon data to new SSDs - https://phabricator.wikimedia.org/T137924#2402864 (RobH) So I went ahead and got the files copied back to usb, there was a huge list of differences but in the screen session I wasnt able to scroll up to copy/paste. The secondary run to pick up ch...
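The copy, alter, rename sequence described in the [15:33:17] tools.heritage log entry can be sketched roughly as below. This is a minimal illustration, not the commands that were actually run: SQLite (through Python's sqlite3 module) stands in for the MariaDB host, and the function name and the `_tmp` / `_old` suffixes are hypothetical. It also assumes nothing writes to the original table while the copy is being made; any such rows would be lost in the swap.

```python
import sqlite3


def add_column_via_copy(conn, table, column_ddl):
    """Add a column by altering a copy of `table`, then swapping names.

    `table` and `column_ddl` are interpolated into SQL, so they must be
    trusted, hard-coded values (hypothetical helper, illustration only).
    """
    tmp = f"{table}_tmp"
    old = f"{table}_old"
    cur = conn.cursor()
    # 1. Copy the rows into a scratch table. Note that CREATE TABLE ... AS
    #    SELECT copies data only, not indexes or constraints.
    cur.execute(f"CREATE TABLE {tmp} AS SELECT * FROM {table}")
    # 2. Run the slow schema change against the copy, not the live table.
    cur.execute(f"ALTER TABLE {tmp} ADD COLUMN {column_ddl}")
    # 3. Swap the names so the altered copy takes the original's place.
    cur.execute(f"ALTER TABLE {table} RENAME TO {old}")
    cur.execute(f"ALTER TABLE {tmp} RENAME TO {table}")
    # 4. Drop the pre-change table once the swap has happened.
    cur.execute(f"DROP TABLE {old}")
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE monuments_all (id INTEGER, name TEXT)")
    conn.execute("INSERT INTO monuments_all VALUES (1, 'example')")
    add_column_via_copy(conn, "monuments_all", "wd_item varchar(255) DEFAULT NULL")
    print(conn.execute("SELECT * FROM monuments_all").fetchall())
```

On MySQL or MariaDB the two renames in step 3 can be combined into a single `RENAME TABLE a TO b, c TO a` statement, so readers never see a moment where the table is missing.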
[18:58:36] Labs, Labs-Infrastructure: Copy labmon data to new SSDs - https://phabricator.wikimedia.org/T137924#2402954 (RobH)
[19:01:55] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2400728 (scfc) I once thought of a tool that does something like `diff -u <(mysqldump --no-data) <(what-views-and-triggers-and...
[19:05:21] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2402968 (jcrespo) @scfc redactatron is a horrible piece of software and we do not want to expand it, but kill it. It has its f...
[19:17:20] labmon1001 update: host is in the process of restoring its archived data, once complete services will be tested and restored. (hopefully!)
[19:37:56] Labs, Labs-Infrastructure: Copy labmon data to new SSDs - https://phabricator.wikimedia.org/T137924#2403017 (RobH) System's been reinstalled and data restoration is in progress: 40.21G 99% 31.87MB/s 0:20:03 (xfr#94363, ir-chk=1011/111912) 298G total, and 40GB in 20 minutes 298 / 40 = 7....
[19:38:07] Labs, Labs-Infrastructure: Copy labmon data to new SSDs - https://phabricator.wikimedia.org/T137924#2403018 (RobH)
[20:05:07] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2403090 (AlexMonk-WMF) >>! In T138450#2401133, @jcrespo wrote: > I have to add a view to a newly created labs-only table, so i...
[20:20:15] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2403157 (AlexMonk-WMF) @jcrespo: I was wrong in my last comment and have uploaded https://gerrit.wikimedia.org/r/295751 which,...
[20:25:18] sweet man
[20:35:18] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2403187 (AlexMonk-WMF) >>! In T138450#2402468, @chasemp wrote: > * For [[ https://phabricator.wikimedia.org/T135029#2400629 |...
[20:47:57] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2403204 (jcrespo) @scfc BTW, I actually documented [[ https://wikitech.wikimedia.org/wiki/MariaDB/Sanitarium_and_Labsdbs | red...
[20:48:18] Q1 goals (July-Sept) for my Tool Labs support work now published at https://www.mediawiki.org/wiki/Wikimedia_Engineering/2016-17_Q1_Goals#Community_Tech
[21:08:27] awesome
[21:08:40] ours fall under ops and say draft but are pretty set https://www.mediawiki.org/wiki/Wikimedia_Engineering/2016-17_Q1_Goals#Technical_Operations
[21:12:53] chasemp: no wonder andrewbogott said he wanted to ask me django questions. :)
[21:13:01] heh
[21:13:21] all aboard the horizon train choo choo
[21:23:09] Labs, Labs-Infrastructure, DBA, Blocked-on-Operations, Patch-For-Review: maintain-replicas.pl unmaintained, unmaintainable - https://phabricator.wikimedia.org/T138450#2403326 (ori) >>! In T138450#2403187, @AlexMonk-WMF wrote: >>>! In T138450#2402468, @chasemp wrote: >> * For [[ https://phabri...
[21:24:32] Labs, Tool-Labs, Community-Tech-Tool-Labs, User-bd808: Develop vision and roadmap for Tool Labs enhancements - https://phabricator.wikimedia.org/T132610#2204284 (bd808) Open>Resolved The vision document on meta will be a living document for the foreseeable future, but the basic outline is...
[21:45:34] Labs, Labs-Infrastructure: Copy labmon data to new SSDs - https://phabricator.wikimedia.org/T137924#2403402 (RobH) Not sure why I did all that math with 298GB, its 893GB. we just hit 300GB.
[21:51:36] Labs, Labs-Infrastructure: Copy labmon data to new SSDs - https://phabricator.wikimedia.org/T137924#2403415 (RobH) There is a root owned screen session running with the command of: rsync -ah --append-verify --info=progress2 /media/backup/carbon/whisper/ /srv/carbon/whisper/ This looks as if it will tak...
[22:01:01] Labs, Labs-Infrastructure: Copy labmon data to new SSDs - https://phabricator.wikimedia.org/T137924#2403465 (RobH) Sent to labs list: > Please note that the expended downtime for labmon1001 has been extended to 2016-06-24 @ 17:00 GMT. Details are noted on https://phabricator.wikimedia.org/T137924. >...
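The T137924 updates above estimate the restore time from "40GB in 20 minutes", first against 298 GB and then against the corrected total of 893 GB. A small sketch of that arithmetic follows; it assumes throughput stays constant for the whole transfer (which real rsync runs over many small files rarely do), and the function name is hypothetical.

```python
def estimate_remaining_minutes(total_gb, done_gb, elapsed_minutes):
    """Linear ETA: assume the rate observed so far holds until the end."""
    if done_gb <= 0 or elapsed_minutes <= 0:
        raise ValueError("need some measured progress to estimate a rate")
    rate_gb_per_min = done_gb / elapsed_minutes
    return (total_gb - done_gb) / rate_gb_per_min


if __name__ == "__main__":
    # Figures quoted in the log: 40 GB copied in 20 minutes.
    for total in (298, 893):
        minutes = estimate_remaining_minutes(total, 40, 20)
        print(f"{total} GB total: about {minutes / 60:.1f} hours remaining")
```

With the rounded figures from the log, the corrected 893 GB total leaves roughly three times as much work as the original 298 GB estimate, which is consistent with the downtime window being extended.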