[00:05:00] Project beta-update-databases-eqiad build #10301: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10301/
[01:05:00] Project beta-update-databases-eqiad build #10302: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10302/
[02:05:00] Project beta-update-databases-eqiad build #10303: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10303/
[03:05:00] Project beta-update-databases-eqiad build #10304: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10304/
[04:05:00] Project beta-update-databases-eqiad build #10305: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10305/
[04:18:42] Yippee, build fixed!
[04:18:43] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #91: FIXED in 22 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/91/
[05:05:00] Project beta-update-databases-eqiad build #10306: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10306/
[06:05:00] Project beta-update-databases-eqiad build #10307: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10307/
[07:05:00] Project beta-update-databases-eqiad build #10308: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10308/
[08:05:00] Project beta-update-databases-eqiad build #10309: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10309/
[08:28:45] !log deploying fedd675 to ores in sca03
[08:28:48] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[08:29:04] wtf
[08:29:07] keeps aborting?
[08:29:21] > 07:20:15 ...Update 'cleanup empty categoriBuild timed out (after 45 minutes). Marking the build as aborted.
[08:29:24] PROBLEM - Puppet staleness on deployment-ms-be01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [43200.0]
[08:39:23] Aand celery doesn't work in ores in beta
[09:05:00] Project beta-update-databases-eqiad build #10310: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10310/
[10:05:00] Project beta-update-databases-eqiad build #10311: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10311/
[11:01:13] Continuous-Integration-Config, Release-Engineering-Team, DBA, Datasets-General-or-Unknown, and 3 others: Automatize the check and fix of object, schema and data drifts between production masters and slaves - https://phabricator.wikimedia.org/T104459#2511287 (ArielGlenn) Well at the risk of being...
[11:05:00] Project beta-update-databases-eqiad build #10312: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10312/
[11:54:30] Continuous-Integration-Config, Release-Engineering-Team, DBA, Datasets-General-or-Unknown, and 3 others: Automatize the check and fix of object, schema and data drifts between production masters and slaves - https://phabricator.wikimedia.org/T104459#2511355 (jcrespo) Do not link to the testing, w...
[12:05:00] Project beta-update-databases-eqiad build #10313: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10313/
[13:02:29] Continuous-Integration-Config, Release-Engineering-Team, DBA, Datasets-General-or-Unknown, and 3 others: Automatize the check and fix of object, schema and data drifts between production masters and slaves - https://phabricator.wikimedia.org/T104459#2511457 (ArielGlenn) Awesome, thanks!
[13:05:00] Project beta-update-databases-eqiad build #10314: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10314/
[14:05:00] Project beta-update-databases-eqiad build #10315: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10315/
[14:18:54] MediaWiki-Releasing, Release-Engineering-Team, Operations, Parsoid: debian signing keyid E84AFDD2 has expired - https://phabricator.wikimedia.org/T141400#2511561 (fgiunchedi) Open>Resolved a:fgiunchedi resolving, key updated on wikitech/mediawiki.org/etc
[14:53:32] ostriches hi I'm wondering if we could get this plugin installed
[14:53:33] https://gerrit.googlesource.com/plugins/labelui/
[14:53:54] brings back the table for reviewers, makes it easy again to tell who has v+2 which means they have c+12
[14:53:57] c+2
[14:54:04] only works for logged in users.
[14:56:00] Seems needlessly complicated if it's just adding a table
[14:56:28] ostriches actually it isn't, you just install the plugin.
[14:56:39] and it adds it to the preference section in the user's preferences
[14:56:45] I've tested it on gerrit-test
[14:56:59] I'm not saying it's complicated to install a plugin.
[14:57:05] I'm saying the plugin is overly complicated.
[14:57:12] And it hasn't been maintained for a year....
[14:57:16] Oh
[14:57:17] yeh
[14:57:25] but it still works with gerrit 2.12.3
[14:57:29] That's nice.
[14:57:37] Yep
[14:57:51] I also found some plugins that let us rename projects
[14:58:09] Renaming and deleting is destructive. That plugin just means I don't have to do it in the database.
[14:58:17] No
[14:58:21] what it does is it imports
[14:58:29] PROBLEM - Puppet run on deployment-sca02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:58:30] it into another project.
[14:58:33] I understand how it works.
[14:58:37] Oh
[14:58:42] I'm saying the act of renaming is destructive.
[14:58:50] Yeh
[14:59:15] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:59:18] ostriches also i have this patch https://gerrit.wikimedia.org/r/#/c/302129/ that fixes phab links
[14:59:32] based on someone else's regex, who i also named in the comment msg.
[15:00:13] Deployment-Systems, Scap3 (Scap3-Adoption-Phase1), scap, Analytics-Cluster, and 2 others: Deploy analytics-refinery with scap3 - https://phabricator.wikimedia.org/T129151#2511674 (thcipriani) >>! In T129151#2499656, @Ottomata wrote: > Hm, will this fix all of the permissions and ownership recurs...
[15:00:21] but it will only break if it has #1 in it, but i uploaded a follow up here https://gerrit.wikimedia.org/r/#/c/302229/ that supports #1 links
[15:00:24] ostriches ^^
[15:01:13] I have tested it, and i made sure not to break anything, but i can't be 100% sure.
[15:03:07] But using the labelui plugin, it will make reviewing easy again, plus it is optional so users can choose which way they want it
[15:03:24] Deployment-Systems, Scap3 (Scap3-Adoption-Phase1), scap, Analytics-Cluster, and 2 others: Deploy analytics-refinery with scap3 - https://phabricator.wikimedia.org/T129151#2511680 (elukey) Update: this task is blocked until the pwstore vault will be usable again to store the new keyholder pass (ho...
[15:05:01] Project beta-update-databases-eqiad build #10316: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10316/
[15:31:26] PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: puppet fail
[15:57:56] PROBLEM - Puppet run on integration-slave-trusty-1013 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[16:02:53] ostriches, I've been looking into phab comment support in gerrit
[16:02:57] don't we have to also update
[16:02:57] this
[16:02:58] [trackingid "bugzilla"]
[16:02:58] footer = Bug:
[16:02:58] match = "\\#?\\d{1,6}"
[16:02:58] system = Bugzilla
[16:03:09] to support the regex for #1 comments
[16:05:00] Project beta-update-databases-eqiad build #10317: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10317/
[16:06:51] Updating match = "\\#?\\d{1,6}" to match = "\\#?\\d{1,6}(#\\d{1,6})?"
[16:06:55] ostriches ^^
[16:11:49] ?
[16:14:31] Browser-Tests, Reading-Web-Sprint-78-T: Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2511937 (Jdlrobson) Open>Resolved a:Jdlrobson Build 99 is passing https://integration.wikimedia.org/ci/view/Mobile/job/selenium-MobileFrontend/99/ a...
[16:14:47] Browser-Tests, Reading-Web-Sprint-77-Segmentation-fault, Reading-Web-Sprint-78-T, Unplanned-Sprint-Work: Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2511940 (Jdlrobson)
[16:14:59] Browser-Tests, Reading-Web-Backlog, Reading-Web-Sprint-77-Segmentation-fault, Reading-Web-Sprint-78-T, and 2 others: [Regression] Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2511941 (MBinder_WMF) Resolved>Open p:Triage>Hi...
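The trackingid change proposed at 16:06:51 can be checked offline. What follows is a minimal sketch in Python, assuming the two `match` values behave the same under Python's `re` module as under Gerrit's regex engine once the config-level double backslashes are unescaped, and assuming whole-string matching purely for illustration (the log does not say how Gerrit applies the pattern to a footer). The `accepts` helper and the sample IDs are hypothetical.

```python
import re

# Patterns as quoted in the log, with the git-config escaping ("\\") removed.
OLD_MATCH = r"\#?\d{1,6}"
NEW_MATCH = r"\#?\d{1,6}(#\d{1,6})?"


def accepts(pattern, value):
    """Return True if the whole value matches the pattern (illustrative assumption)."""
    return re.fullmatch(pattern, value) is not None


if __name__ == "__main__":
    # Hypothetical sample IDs: a plain ID, a '#'-prefixed ID, an ID with a comment anchor.
    for sample in ("12345", "#12345", "12345#678"):
        print(sample, accepts(OLD_MATCH, sample), accepts(NEW_MATCH, sample))
```

Under these assumptions the current pattern rejects an ID that carries a `#` comment anchor, the proposed pattern accepts it, and plain IDs keep matching under both.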
[16:15:44] Browser-Tests, Reading-Web-Backlog, Reading-Web-Sprint-77-Segmentation-fault, Reading-Web-Sprint-78-T, and 2 others: [Regression] Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2505954 (Jdlrobson) a:Jdlrobson
[16:35:25] Beta-Cluster-Infrastructure, Shinken: Shinken alert for beta error rate - https://phabricator.wikimedia.org/T141785#2512020 (thcipriani)
[16:42:46] Browser-Tests, Reading-Web-Backlog, Reading-Web-Sprint-78-Terminal-Velocity, Regression, Unplanned-Sprint-Work: [Regression] Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2512095 (phuedx) a:Jdlrobson>dr0ptp4kt
[16:42:57] Browser-Tests, Reading-Web-Backlog, Reading-Web-Sprint-78-Terminal-Velocity, Regression, Unplanned-Sprint-Work: [Regression] Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2505954 (phuedx)
[17:04:33] mobrovac: you around for deployment cabal this week?
[17:04:58] ostriches I'm not sure if we have to do this match = "\\#?\\d{1,6}" to match = "\\#?\\d{1,6}(#\\d{1,6})?"
[17:05:01] Project beta-update-databases-eqiad build #10318: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10318/
[17:05:02] for trackingid
[17:05:02] ?
[17:26:30] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA, WorkType-Maintenance: Upgrade mariadb in deployment-prep from Precise/MariaDB 5.5 to Jessie/MariaDB 5.10 - https://phabricator.wikimedia.org/T138778#2512365 (greg) a:dduvall Assigning to Dan per meeting discussion.
[17:28:42] ostriches they seem to be releasing gerrit 2.12.4 now.
[17:28:50] Release notes are there
[17:28:59] but they haven't yet released the war
[17:29:14] https://gerrit.googlesource.com/gerrit/+/04e4e2eec95c260e35a2e4726f5f00bdda676cd5
[17:29:20] wrong link
[17:29:27] https://gerrit.googlesource.com/gerrit/+/1112a3f1dce8e61d0be9833713b9e0e91e874deb
[17:32:24] Browser-Tests, Reading-Web-Backlog, Reading-Web-Sprint-78-Terminal-Velocity, Regression, Unplanned-Sprint-Work: [Regression] Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2512417 (jhobs) @Jdlrobson @phuedx Is https://gerrit.wikimedia....
[17:34:45] Eh, nothing super interesting in 2.12.4.
[17:35:09] oh
[17:36:25] ostriches but it would be nice to have the labelui plugin since you just need to install the java file in plugins/ and it will start working once gerrit is restarted
[17:36:42] I don't want to.
[17:36:53] Oh
[17:36:58] Why?
[17:37:11] Because I said the plugin is overly complicated for doing something so simple.
[17:37:16] Oh
[17:37:17] ok
[17:38:02] ostriches would you be able to review https://gerrit.wikimedia.org/r/#/c/302129/ please :)
[17:38:12] and https://gerrit.wikimedia.org/r/#/c/302229/ please
[17:38:35] RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:38:36] No. I don't have the time.
[17:38:42] ok
[17:41:14] paladox: Why do you ping people on IRC about a patch created only a few hours ago?
[17:41:51] I was only asking if he would be able to review it and sorry
[17:42:21] paladox: Gerrit has the concept of reviewers. Reviewers receive notifications.
[17:42:33] ok
[17:42:51] paladox: There is no need to send additional emails, IRC messages etc only a few hours later, and I have asked you before to be more patient.
[17:42:59] paladox, I understand you want to get things done, and I appreciate that.
[17:43:06] Oh sorry.
[17:43:24] paladox: However we all already get lots of notifications in many places.
There's really no need to duplicate notifications.
[17:43:36] Thanks for considering. And keep up the good work! :)
[17:43:38] ok
[17:43:46] :)
[17:50:01] Browser-Tests, Reading-Web-Backlog, Reading-Web-Sprint-78-Terminal-Velocity, Regression, Unplanned-Sprint-Work: [Regression] Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2512452 (Jdlrobson) No. It wasn't needed. I forgot to abandon a...
[18:00:53] Release-Engineering-Team, ArchCom-RfC, Developer-Relations, WMF-Legal, RfC: Create formal process for CREDITS files - https://phabricator.wikimedia.org/T139300#2512489 (ZhouZ) > In T139300#2500359, @ZhouZ wrote: > Is there a potential, perhaps corner case issue, that the commit individual is...
[18:05:00] Project beta-update-databases-eqiad build #10319: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10319/
[18:19:14] thcipriani: sorry, got completely held up in another meeting
[18:20:45] mobrovac: np, notes posted: https://www.mediawiki.org/wiki/Deployment_tooling/Cabal/2016-08-01 only thing I had to mention was that new scap package is rolling pretty soon, live in beta already: probably worth a poke if you have time. (autocompletion is a thing :))
[18:21:40] autocomplete!!!
[18:21:41] yay
[18:21:59] :D
[18:23:33] PROBLEM - Puppet run on deployment-mediawiki02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[18:28:47] Release-Engineering-Team (Long-Lived-Branches), ReleaseTaggerBot: Decide how ReleaseTaggerBot fits into the brave new world of long-lived-branches - https://phabricator.wikimedia.org/T141278#2492749 (greg) >>! In T139210#2492581, @Jdforrester-WMF wrote: >>>! In T139210#2491737, @Aklapper wrote: >>>>! In...
[18:34:36] For the record, there are only 4 UBN! tasks right now: 2 related to Fundraising (they use UBN!
for their stuff in a way that doesn't mean "block the world!"), 1 in VE (ditto), and 1 in pywikibot (ditto).
[18:38:20] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:38:44] PROBLEM - App Server Main HTTP Response on deployment-mediawiki02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:43:13] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 44838 bytes in 1.291 second response time
[18:48:33] RECOVERY - App Server Main HTTP Response on deployment-mediawiki02 is OK: HTTP OK: HTTP/1.1 200 OK - 44483 bytes in 1.323 second response time
[18:52:00] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure: Nodepool has trouble taking snapshots on OpenStack labs - https://phabricator.wikimedia.org/T138106#2512795 (hashar) On 27/07/2016 at 17:25, Andrew wrote: > Andrew added a comment. > > > Is this still failing, or are things resolve...
[18:53:11] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure: Nodepool has trouble taking snapshots on OpenStack labs - https://phabricator.wikimedia.org/T138106#2512797 (hashar) Open>Resolved
[18:57:56] PROBLEM - Host deployment-parsoid08 is DOWN: CRITICAL - Host Unreachable (10.68.18.117)
[19:05:01] Project beta-update-databases-eqiad build #10320: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10320/
[19:05:01] heh greg-g, created an instance for parsoid on jessie 2 minutes ago
[19:06:05] :) :)
[19:06:11] <3 mobrovac
[19:06:32] you have a family, man!
[19:06:35] :D
[19:06:36] PROBLEM - Puppet run on deployment-parsoid09 is CRITICAL: CRITICAL: 16.67% of data above the critical threshold [0.0]
[19:07:10] Release-Engineering-Team, Operations, User-greg: Institute a weekly review of all UBN! tasks - https://phabricator.wikimedia.org/T141130#2512832 (greg) ```lang=irc 18:34 <+ greg-g> For the record, there are only 4 UBN!
tasks right now: 2 related to Fundraising (they use UBN! for their stuff in a...
[19:11:36] RECOVERY - Puppet run on deployment-parsoid09 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:05:00] Project beta-update-databases-eqiad build #10321: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10321/
[20:13:17] hmm...so that job seems to be hungup on a lot of swift posts timing out...
[20:13:47] can't seem to ssh to deployment-ms-be01.deployment-prep.eqiad.wmflabs
[20:18:52] Krenair: is it fine to reboot deployment-ms-be01? Anything manual to do pre/post reboot? Seems to be hung up. Causing beta-update-databases to timeout seemingly.
[20:18:57] Oddly netcat from deployment-ms-fe01 times out for port 22 but succeeds for 6001 (even though all requests are timing out)
[20:19:08] when going to deployment-ms-be01
[20:24:21] hm
[20:24:25] godog: ^
[20:25:10] Keep getting: SwiftFileBackend::setContainerAccess: unexpected rcode value (503)
[20:26:33] yeah should be fine to reboot, nothing manual afaik it runs the same puppet code as production
[20:27:19] okie doke. Thanks.
[20:28:03] !log restarting deployment-ms-be01, not responding to ssh, mw-fe01 requests timing out
[20:28:07] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[20:28:39] there wasn't anything in console was there?
[20:29:22] nah, console was blank on wikitech, timedout in horizon
[20:31:26] (PS1) Urbanecm: Whitelist my second e-mail adresss [integration/config] - https://gerrit.wikimedia.org/r/302314
[20:31:28] nice, I can get in now, listening on 6000-6002, swift running
[20:32:58] (CR) Urbanecm: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/302314 (owner: Urbanecm)
[20:33:21] and https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/10322/ finished right after I rebooted.
[20:33:32] sweet!
when there's more capacity we add more machines to swift too
[20:36:31] (CR) Paladox: [C: 1] Whitelist my second e-mail adresss [integration/config] - https://gerrit.wikimedia.org/r/302314 (owner: Urbanecm)
[20:43:21] godog, I don't think we need more space in beta... Disk usage: space used: 10 GB of 214 GB
[20:44:26] RECOVERY - Puppet staleness on deployment-ms-be01 is OK: OK: Less than 1.00% above the threshold [3600.0]
[20:51:33] hehe indeed, if it stays that low might as well do with smaller instances but more
[20:54:06] ostriches hi, i created a new instance to use puppet only for installing gerrit
[20:54:10] but we're hitting a problem.
[20:54:15] how to handle puppetized system users in labs when they conflict with LDAP users
[20:54:23] when system users are handled in puppet manifests.. and they work fine in prod
[20:54:31] but in labs they conflict with existing LDAP users
[20:54:35] Yep
[20:54:36] Error: Could not set uid on user[gerrit2]: Execution of '/usr/sbin/usermod -u 444 gerrit2' returned 6: usermod: user 'gerrit2' does not exist in /etc/passwd
[20:54:39] what paladox said :)
[20:54:46] root@gerrit-test3:/home/paladox# id gerrit2
[20:54:46] uid=2069(gerrit2) gid=1002(nda) groups=1005(labsadminbots),1002(nda)
[20:54:47] :)
[20:54:55] remember what we did for similar issues in the past?
[20:55:26] also, the gerrit2 user is a member in the "nda" group for humans..
wut
[20:55:33] LOL
[21:04:19] RECOVERY - Free space - all mounts on deployment-stream is OK: OK: All targets OK
[21:08:33] Project selenium-Wikidata » firefox,test,Linux,contintLabsSlave && UbuntuTrusty build #72: FAILURE in 2 hr 18 min: https://integration.wikimedia.org/ci/job/selenium-Wikidata/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=test,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/72/
[21:09:11] PROBLEM - Puppet run on deployment-db01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[21:14:12] RECOVERY - Puppet run on deployment-db01 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:14:20] PROBLEM - SSH on deployment-elastic06 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:15:23] (CR) Niedzielski: "Needs a rebase but looks good to me." [integration/config] - https://gerrit.wikimedia.org/r/301370 (https://phabricator.wikimedia.org/T141440) (owner: Mholloway)
[21:15:45] well.. ran out of time again because there is always fire-fighting
[21:20:46] Beta-Cluster-Infrastructure, VisualEditor: Getting error while uploading image with VisualEditor - https://phabricator.wikimedia.org/T141814#2513148 (Ryasmeen)
[21:22:02] Beta-Cluster-Infrastructure, ContentTranslation-Deployments, Parsoid, Services, Language-Q4-2016-Sprint 2: Migrate BetaCluster Node.JS services to Jessie and Node 4.3 - https://phabricator.wikimedia.org/T125003#2513152 (ssastry) Open>Resolved
[21:31:47] PROBLEM - Puppet run on deployment-aqs01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:33:22] PROBLEM - Puppet run on integration-slave-precise-1002 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:34:38] PROBLEM - Puppet run on integration-slave-precise-1012 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:35:43] PROBLEM - Puppet run on integration-slave-precise-1011 is CRITICAL: CRITICAL: 100.00% of data above the critical
threshold [0.0]
[21:39:11] greg-g hi, there was this message PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: puppet fail at 16:31 pm my time bst.
[21:39:27] I haven't seen any other message to say it was fixed.
[21:39:29] paladox: did it recover?
[21:39:35] I don't think so
[21:39:42] * greg-g looks in icinga webui
[21:40:01] Thanks
[21:40:57] at 21:39 it's OK: Puppet is currently enabled, last run 12 minutes ago with 0 failures
[21:41:18] Oh
[21:41:20] thanks
[21:42:06] looks like it recovered at this point (if I'm understanding icinga correctly):
[21:42:09] Service Ok[2016-08-01 17:38:35] SERVICE ALERT: gallium;puppet last run;OK;HARD;3;OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[21:42:49] PROBLEM - Puppet run on deployment-changeprop is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:45:10] Oh
[21:45:12] thanks.
[21:45:23] Maybe it recovered when i went offline
[21:45:29] but thanks for looking into it
[22:04:50] paladox: on the gerrit2 user, https://gerrit.wikimedia.org/r/#/c/301896/ and https://gerrit.wikimedia.org/r/#/c/299164/ improve how the user is managed.
[22:04:59] Ok
[22:05:01] thanks
[22:05:14] It means the user is only installed by the package (if needed) and not by puppet.
[22:05:16] thanks
[22:06:12] ostriches who needs to review https://gerrit.wikimedia.org/r/#/c/299164/ ?
[22:10:08] ostriches is there a way we can make ipv6 optional
[22:10:25] since it is hard to find a ipv6 address in labs without having to use a test one.
[22:10:28] please
[22:10:45] I suppose...
[22:10:50] Thanks
[22:10:51] :)
[22:11:02] We are installing gerrit all the way from puppet this time
[22:11:02] :)
[22:11:13] which we can use for testing purposes
[22:11:17] to make sure something works.
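The `usermod` failure pasted at 20:54:36 is consistent with gerrit2 being resolvable (`id` reports uid 2069) without having a line in the local /etc/passwd, which is the file `usermod` edits. What follows is a minimal sketch in Python of that distinction, assuming passwd-format text is passed in as a string. `local_passwd_entries`, `classify_user`, and the sample passwd content are hypothetical and are not part of the puppet manifests or the Gerrit package.

```python
def local_passwd_entries(passwd_text):
    """Map user name -> uid for each entry in /etc/passwd-style text."""
    entries = {}
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # passwd format: name:password:uid:gid:gecos:home:shell
        if len(fields) >= 3 and fields[2].isdigit():
            entries[fields[0]] = int(fields[2])
    return entries


def classify_user(name, passwd_text, resolved_uid):
    """Classify a user as 'local', 'directory-only', or 'missing'.

    resolved_uid is whatever `id <name>` reported (None if the lookup failed);
    a user that resolves but has no local passwd line is assumed to come from
    a directory service such as LDAP.
    """
    if name in local_passwd_entries(passwd_text):
        return "local"
    if resolved_uid is not None:
        return "directory-only"
    return "missing"


# Hypothetical local passwd content with no gerrit2 line.
SAMPLE_PASSWD = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
)

if __name__ == "__main__":
    # uid 2069 is the value `id gerrit2` printed in the log.
    print(classify_user("gerrit2", SAMPLE_PASSWD, 2069))
```

A "directory-only" result is the case where a manifest that tries to change the uid with `usermod` would be expected to fail in the way shown in the log, which fits the later remark that the user should be created by the package rather than by puppet.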
[22:12:56] https://gerrit.wikimedia.org/r/302359
[22:13:23] Thank you
[22:13:24] :)
[22:14:07] ostriches also I'm not sure about this bit since i haven't got that far
[22:14:08] but is
[22:14:09] this
[22:14:12] role::gerrit::server::bacula optional
[22:14:18] ?
[22:14:21] Yeah it is.
[22:14:25] It's set to undef by default
[22:14:25] Ok
[22:14:28] thanks
[22:14:29] and :)
[22:15:03] Release-Engineering-Team (Long-Lived-Branches), ReleaseTaggerBot: Decide how ReleaseTaggerBot fits into the brave new world of long-lived-branches - https://phabricator.wikimedia.org/T141278#2492749 (Legoktm) It answers the question "When is the patch that fixes my bug going to be deployed?" The real an...
[22:15:44] ostriches do you have time to review https://gerrit.wikimedia.org/r/#/c/302129/ please.
[22:16:00] Test is at http://gerrit-test.wmflabs.org/gerrit/#/c/16/
[22:16:06] And is in the commit msg box
[22:17:07] Beta-Cluster-Infrastructure, VisualEditor: Getting error while uploading image with VisualEditor - https://phabricator.wikimedia.org/T141814#2513134 (greg) I was trying to find the whole error message (it looks cutoff in your screenshot) but I couldn't reproduce: http://en.wikipedia.beta.wmflabs.org/w/in...
[22:18:32] Anyone know why deployment-elastic06 is broken? ebernhardson?
[22:18:41] nothing in console
[22:19:46] Release-Engineering-Team (Long-Lived-Branches), ReleaseTaggerBot: Decide how ReleaseTaggerBot fits into the brave new world of long-lived-branches - https://phabricator.wikimedia.org/T141278#2513352 (demon) >>! In T141278#2513331, @Legoktm wrote: > It answers the question "When is the patch that fixes my...
[22:19:46] :)
[22:21:59] ostriches i did all the testing and nothing seems broken
[22:22:16] the only way that will break those links again is if you use #1 in the link too
[22:22:30] but we need to merge that patch separately that supports that
[22:22:43] so we can quickly revert in case it breaks again
[22:29:20] ostriches /me has found a bug in gerrit
[22:30:07] If someone updates the commit msg, the grrrit-wm bot reports it as the author of the commit
[22:30:10] not the commit
[22:30:19] for example
[22:30:20] (PS3) Chad: Gerrit: Make IPv6 optional [puppet] - https://gerrit.wikimedia.org/r/302359 (https://phabricator.wikimedia.org/T133070)
[22:30:37] that patch was done by dzahn (mutante) but it says you did it.
[22:30:56] We know.
[22:31:01] There's already a bug filed about it
[22:31:02] oh
[22:31:13] Where is the bug filed
[22:31:13] ?
[22:31:16] please
[22:32:05] https://phabricator.wikimedia.org/T141329
[22:32:12] Release-Engineering-Team, User-greg, Wikimedia-Incident: Identify "first responders" for "all" "components" deployed on Wikimedia servers - https://phabricator.wikimedia.org/T141066#2513427 (greg) >>! In T141066#2496346, @faidon wrote: > I'm still taking this proposal in, but a few preliminary notes:...
[22:32:33] (PS3) Mholloway: Make apps-android-wikipedia-lint voting [integration/config] - https://gerrit.wikimedia.org/r/301370 (https://phabricator.wikimedia.org/T141440)
[22:32:34] Oh
[22:32:36] yeh sorry
[22:32:54] ostriches should we file it upstream which is where gerrit will most likely try and fix the problem
[22:33:04] since it worked in gerrit 2.8 but broke in gerrit 2.12
[22:33:10] unless there is an api change
[22:33:23] It's not a bug in gerrit really.
[22:33:44] Oh
[22:33:54] Basically "rebase" makes you the committer now. Which is mostly correct actually.
[22:34:05] Yep
[22:34:11] Beta-Cluster-Infrastructure, VisualEditor: Getting error while uploading image with VisualEditor - https://phabricator.wikimedia.org/T141814#2513438 (matmarex)
[22:34:14] rebasing still works
[22:34:15] It's not what the bot expects though :)
[22:34:19] oh
[22:34:24] the bot needs updating
[22:34:27] Yep
[22:34:39] Maybe :)
[22:34:41] It's fun!
[22:34:41] Ok
[22:34:45] :)
[22:35:02] Bot should probably report the author & committer, not just the committer.
[22:35:15] I haven't looked at stream-events in forever, so I dunno though
[22:35:17] Ok
[22:35:19] Yeh
[22:37:34] LOL me getting french adverts on youtube, and i don't speak french.
[22:38:01] Deployment-Systems, Operations: dologmsg doesn't work on terbium - https://phabricator.wikimedia.org/T141619#2513444 (greg) p:Triage>Normal
[22:38:13] Pandora started giving me ads in Spanish after I moved to California.
[22:38:26] I'm like...uhhh.... just because I live in CA....
[22:38:28] Oh
[22:38:30] lol
[22:40:21] ostriches we have a pandora shop
[22:40:29] It sells jewellery
[22:41:11] ostriches I'm getting this error now
[22:41:12] Notice: /Stage[main]/Gerrit::Jetty/Exec[install_gerrit_jetty]/returns: Checksum mysql-connector-java-5.1.21.jar OK
[22:41:12] Notice: /Stage[main]/Gerrit::Jetty/Exec[install_gerrit_jetty]/returns: Exception in thread "main" java.lang.RuntimeException: Cannot save secure.config
[22:41:14] Notice: /Stage[main]/Gerrit::Jetty/Exec[install_gerrit_jetty]/returns: at com.google.gerrit.server.securestore.DefaultSecureStore.save(DefaultSecureStore.java:88)
[22:41:19] Notice: /Stage[main]/Gerrit::Jetty/Exec[install_gerrit_jetty]/returns: at com.google.gerrit.server.securestore.DefaultSecureStore.unset(DefaultSecureStore.java:65)
[22:41:42] PROBLEM - Puppet run on deployment-cache-upload04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[22:42:08] Why is gerrit.war writing the secure.config?
Puppet should've created it already
[22:42:52] I'm not sure
[22:44:43] ostriches it seems the file was created
[22:45:11] Or, maybe it's trying to add something to it?
[22:45:28] PROBLEM - Puppet run on deployment-cache-text04 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[22:45:52] now i get
[22:45:53] Aug 01 22:45:04 gerrit-test3 gerrit[6602]: ** ERROR: GERRIT_SITE not set
[22:45:53] Aug 01 22:45:04 gerrit-test3 systemd[1]: gerrit.service: control process exited, code=exited status=1
[22:45:53] Aug 01 22:45:04 gerrit-test3 systemd[1]: Failed to start LSB: Start/stop Gerrit Code Review.
[22:45:53] Aug 01 22:45:04 gerrit-test3 systemd[1]: Unit gerrit.service entered failed state.
[22:46:04] Release-Engineering-Team (Long-Lived-Branches), ReleaseTaggerBot: Decide how ReleaseTaggerBot fits into the brave new world of long-lived-branches - https://phabricator.wikimedia.org/T141278#2513493 (greg) >>! In T141278#2513331, @Legoktm wrote: > It answers the question "When is the patch that fixes my...
[22:46:10] ostriches ^^
[22:46:15] paladox: That gets fixed with the 2 patches I mentioned earlier
[22:46:23] Actually no.
[22:46:24] Oh yeh
[22:46:28] That's just in the debian package one
[22:46:28] i have those patches
[22:46:31] Oh
[22:46:35] Can you merge it
[22:46:37] please
[22:46:41] i spoke with qchris
[22:46:45] in -labs
[22:46:55] qchris_ Hi could you quickly review https://gerrit.wikimedia.org/r/#/c/299164/ please
[22:46:55] ?
[22:46:55] paladox: It's hard to "quickly review" debs :-)
[22:46:55] ost-riches asked me about the mysql-connector remove.
[22:46:55] I have not tried that, but it's worth trying.
[22:46:55] Oh
[22:46:57] IIRC, he said he'll try.
[22:46:59] Ok
[22:47:01] ostriches ^^
[22:47:23] It doesn't seem like anything is wrong with your patch for debian
[22:47:24] :)
[22:48:44] ostriches :) :)
[22:50:44] Yeah well I can merge it, but I still need someone from ops to build the package and upload to apt :)
[22:50:50] Oh
[22:51:55] Could we merge and then find someone in ops to merge please?
[22:51:57] ostriches ^^
[22:54:01] ostriches maybe we can create a task for building the gerrit deb and ask Moritz if he or she can build it please
[22:54:07] and upload it to apt?
[22:54:30] Well he's already on the review list, as is Daniel :)
[22:54:35] Oh
[22:55:38] Beta-Cluster-Infrastructure, VisualEditor: Getting error while uploading image with VisualEditor - https://phabricator.wikimedia.org/T141814#2513544 (Tanveer07)
[22:56:34] ostriches But the patch has been sitting since 15 july, if we merge and create a task
[22:56:42] then it will most likely be dealt with quicker
[22:56:43] ;)
[22:56:44] :)
[22:58:28] ostriches :) :)
[23:21:36] sigh... for how long has nginx been refusing to start on cache-upload04
[23:22:57] Probably for a long time. Puppet broke for that awhile ago because ssl.
[23:23:59] Yes but I fixed it.
[23:24:05] Now someone has broken it *again*.
[23:29:44] Continuous-Integration-Config, Analytics-Wikimetrics: tox runs all tests (including manual ones) - https://phabricator.wikimedia.org/T71183#2513693 (greg) p:Low>Lowest >>! In T71183#2492178, @Nuria wrote: > @harshar: tests cannot be run from depo alone as they require a wikimetrics instance runni...
[23:32:53] ostriches could you merge please, I'm going to ask tomorrow if Moritz can build the dpkg
[23:32:54] :)
[23:33:00] paladox: gerrit 301673 and 302229 are mostly the same thing, the latter just has the negative lookahead for "
[23:33:10] Can you abandon the 301673 then?
[23:33:14] Ok
[23:33:18] Yeh
[23:33:44] Done
[23:34:09] ostriches by merge i mean https://gerrit.wikimedia.org/r/#/c/299164/ merge that please
[23:34:10] ::)
[23:34:12] :)
[23:35:33] I want them to finish reviewing it.
[23:35:37] Self-review is bad ;-)
[23:35:49] oh ok
[23:41:33] ostriches we can build the dpkg from jenkins
[23:41:34] :)
[23:46:19] grrr... taking minutes to move my cursor around nano here
[23:46:28] ping in the tens of thousands
[23:50:30] RECOVERY - Puppet run on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:54:15] (PS4) Awight: Use composer in DonationInterface hhvm tests [integration/config] - https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309)