[00:52:44] 10Beta-Cluster-Infrastructure: Review rights removal at beta-enwiki - https://phabricator.wikimedia.org/T115196#1737212 (10Krenair) 5stalled>3Resolved I've looked through this and I am unhappy, both with the manner in which these rights were removed and with the way this discussion has gone. I'm not going to...
[02:03:25] i know its a bit late...but the cirrus multi-dc deploy earlier today ended up having to roll back the config change due to errors on commonswiki. finally figured out silly me didn't realize phase0 rolled forward but nothing else so a couple key patches are missing in 1.27.0-wmf.3. mind if i cherry-pick those forward and re-deploy the config change?
[02:04:58] the code is all running on phase1 and phase2, just phase0 is SOL
[02:57:22] chasemp: Yes, it is.
[02:57:35] And yes, it does collect.
[02:59:14] All proposed changesets are linked from the Gerrit UI and available for fetching. In practice, most of them aren't used much.
[03:00:45] chasemp: The actual space taken up should be minimal, assuming things diff well.
[03:02:04] 15G is what it takes up on both iridium and ytterbium, so I think the actual reference fetching won't make much of a difference.
[03:18:44] Yippee, build fixed!
[03:18:44] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #754: 09FIXED in 28 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/754/
[03:20:38] 10Continuous-Integration-Infrastructure, 5Patch-For-Review, 7Upstream: Fails npm build failure "File exists: ../esprima/bin/esparse.js" - https://phabricator.wikimedia.org/T90816#1737284 (10Krinkle) >>! In T90816#1728898, @awight wrote: > Side note, it would be nice if we didn't have to keep our CI glue up-t...
[03:37:32] 6Release-Engineering-Team, 6operations: deployment: user trebuchet gets added and removed from group wikidev on every puppet run - https://phabricator.wikimedia.org/T115760#1737288 (10faidon) Well as those puppet runs and logs prove, user trebuchet belongs in group wikidev only for about 1 minute every 30. I t...
[04:26:31] Yippee, build fixed!
[04:26:32] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce build #597: 09FIXED in 34 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce/597/
[05:17:16] PROBLEM - Puppet failure on deployment-cache-parsoid04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[05:17:16] PROBLEM - Puppet failure on deployment-cache-mobile04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[05:27:34] Yippee, build fixed!
[05:27:34] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce build #577: 09FIXED in 25 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/577/
[06:38:23] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK
[08:34:56] Yippee, build fixed!
[08:34:57] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #757: 09FIXED in 24 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/757/
[09:30:37] PROBLEM - Puppet failure on deployment-fluorine is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[09:34:13] 10Continuous-Integration-Infrastructure, 10MediaWiki-Database, 7Database: Enable MariaDB/MySQL's Strict Mode - https://phabricator.wikimedia.org/T108255#1737652 (10jcrespo)
[09:35:45] 10Continuous-Integration-Infrastructure, 10MediaWiki-Database, 7Database: Enable MariaDB/MySQL's Strict Mode - https://phabricator.wikimedia.org/T108255#1516355 (10jcrespo)
[09:36:37] !task is https://phabricator.wikimedia.org/$1
[09:36:37] Key was added
[09:39:10] 10Continuous-Integration-Infrastructure, 10MediaWiki-Database, 7Database: Enable MariaDB/MySQL's Strict Mode - https://phabricator.wikimedia.org/T108255#1737679 (10jcrespo)
[09:40:31] 10Beta-Cluster-Infrastructure, 6Labs, 10Labs-Infrastructure, 6operations: beta: Get SSL certificates for *.{projects}.beta.wmflabs.org - https://phabricator.wikimedia.org/T50501#1737694 (10Chmarkine) >>! In T50501#1669896, @Chmarkine wrote: > [[ https://letsencrypt.org/ | Let's Encrypt ]] provides free tru...
[09:53:20] 10Beta-Cluster-Infrastructure, 10Continuous-Integration-Infrastructure, 10MediaWiki-Database, 7Database: Enable MariaDB/MySQL's Strict Mode - https://phabricator.wikimedia.org/T108255#1737707 (10hashar) Adding #beta-cluster-infrastructure to the loop, since we might well want to enable the strictness there...
[10:03:38] 10Beta-Cluster-Infrastructure, 10Continuous-Integration-Infrastructure, 10MediaWiki-Database, 7Database: Enable MariaDB/MySQL's Strict Mode - https://phabricator.wikimedia.org/T108255#1737719 (10jcrespo) This is not about 5.0 compatibility, if any, 5.7 compatibility. But that doesn't matter (we do not use...
[10:04:33] greg-g: FYI: Meh. Might need a 1.23.12 tarball release, see https://phabricator.wikimedia.org/T115991
[11:18:37] 10Continuous-Integration-Config, 5Patch-For-Review: replace Jenkins job mwext-testextension-zend by mwext-testextension-zend-composer - https://phabricator.wikimedia.org/T115061#1737853 (10hashar) For this repository, we should get rid of the jobs `mwext-testextension-zend` since it relies on `mediawiki/vendor...
[11:19:04] 10Continuous-Integration-Config, 5Patch-For-Review: replace Jenkins job mwext-testextension-zend by mwext-testextension-zend-composer - https://phabricator.wikimedia.org/T115061#1737854 (10hashar) composer merge plugin: https://github.com/wikimedia/composer-merge-plugin#composer-merge-plugin
[11:42:52] (03PS1) 10Hashar: Use pipeline name as context for Zuul diff [integration/config] - 10https://gerrit.wikimedia.org/r/247543
[11:43:31] (03CR) 10Hashar: [V: 031] "I already refreshed the job for convenience. An example is https://integration.wikimedia.org/ci/job/integration-zuul-layoutdiff/6122/conso" [integration/config] - 10https://gerrit.wikimedia.org/r/247543 (owner: 10Hashar)
[11:48:49] greg-g: ah, maybe I was overcautious. I think you can ignore my previous line
[13:27:58] RECOVERY - Puppet failure on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:31:50] (I thought I fixed everything, but then I turned my back and it all broke itself again behind me)
[13:43:57] PROBLEM - Puppet failure on deployment-cache-text04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[13:46:30] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-OAuth, 10Pywikibot-OAuth, 5Patch-For-Review, and 2 others: "Nonce already used" regularly occurring on beta cluster - https://phabricator.wikimedia.org/T109173#1738126 (10Aklapper) This task has [[ https://www.mediawiki.org/wiki/Phabricator/Project_mana...
[13:55:23] (03PS1) 10Hashar: [RFC] allow paragraphed singled lines comments [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/247565
[13:56:08] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-OAuth, 10Pywikibot-OAuth, 5Patch-For-Review, and 2 others: "Nonce already used" regularly occurring on beta cluster - https://phabricator.wikimedia.org/T109173#1738143 (10XZise) Primarily a response from @jayvdb to the comments in https://gerrit.wikimed...
[13:56:25] (03CR) 10Hashar: "Real world example usage: https://gerrit.wikimedia.org/r/#/c/247560/3/tests/phpunit/phpunit.php,unified" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/247565 (owner: 10Hashar)
[13:57:30] (03CR) 10jenkins-bot: [V: 04-1] [RFC] allow paragraphed singled lines comments [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/247565 (owner: 10Hashar)
[13:58:57] RECOVERY - Puppet failure on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:01:00] (03CR) 10Hashar: "I am wondering if we could allow paragraphed multi line comments such as:" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/224648 (https://phabricator.wikimedia.org/T101872) (owner: 10Galorefitz)
[14:09:28] RECOVERY - Puppet failure on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:14:09] 10Beta-Cluster-Infrastructure, 6operations: Beta Cluster no longer listens for HTTPS - https://phabricator.wikimedia.org/T70387#1738192 (10Krenair) I made a certificate for beta on deployment-puppetmaster and replaced the star.wmflabs.org cert with it there (also had a mess around with some other settings to g...
[14:17:12] RECOVERY - Puppet failure on deployment-cache-mobile04 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:21:04] Ummmm:
[14:21:04] krenair@deployment-cache-parsoid04:~$ puppet agent -tv
[14:21:05] Error: Could not request certificate: getaddrinfo: Name or service not known
[14:21:05] Exiting; failed to retrieve certificate and waitforcert is disabled
[14:21:16] er, right
[14:21:17] needs sudo
[14:21:20] of course
[14:24:43] 5Continuous-Integration-Scaling: Isolate contintcloud nova project from the rest of the wmflabs cloud - https://phabricator.wikimedia.org/T86168#1738239 (10hashar) @andrew do we have an easy way to prevent a given labs project (`contintcloud`) from reaching other projects? Specially we will need a way to set o...
[14:27:09] ostriches: I'm worried more about clutter than disk space, changesets will persist in phab after they are merged? that seems more harmfully confusing and strange than useful, I don't know what the use case is really
[14:27:49] What's confusing? You don't see them anywhere by default unless you explicitly fetch the references.
[14:28:28] I doubt Phab will expose refs/* other than refs/heads/* and refs/tags/*
[14:29:03] (via the UI, at least)
[14:29:11] right but we don't know
[14:29:20] Can find out easily enough.
[14:29:49] I think we should, only because if it does expose them that's where my head is at when I'm saying it's confusing
[14:30:08] but if it doesn't, what's the use case for it?
[14:30:12] not that I care in that case
[14:30:25] but what are we going to do w/ proposed changes from gerrit in phab forever?
[14:30:41] We're replacing Gitblit, and that's one of the things it does.
[14:31:03] what do people use it for on gitblit?
[14:31:06] Those (gitblit) links next to every change.
[14:31:15] They let you view a change in the git viewer.
[14:31:16] I see
[14:31:37] eg: https://git.wikimedia.org/commit/operations%2Fpuppet/05e520667b4fdb75c5cdcdf255d47bf967007ef9
[14:31:40] I get it now, I mean...phab won't work that way? but ok sure
[14:31:57] 10Continuous-Integration-Config, 10Analytics, 6WMDE-Analytics-Engineering, 10Wikidata: Add basic jenkins linting to analytics-limn-wikidata-data - https://phabricator.wikimedia.org/T116007#1738262 (10Addshore)
[14:32:32] chasemp: Yeah I know Phab doesn't work that way, but Gerrit does and if we want to decom Gitblit I don't want a bajillion broken links everywhere.
[14:33:05] you are thinking ppl link to these directly in places and we can do some redirect magic and at least carry over all historical things
[14:33:33] 10Continuous-Integration-Infrastructure, 10pywikibot-core, 5Patch-For-Review: run at least pep8 and pep257 for new changesets submitted to pywikibot/core for any user - https://phabricator.wikimedia.org/T87169#1738273 (10JanZerebecki) We talked about this during the CI weekly and while T86168 is not yet done...
[14:33:33] Something like that
[14:35:01] are the refs still hanging around in gerrit in any way, won't phab miss out on historical refs anyway?
[14:37:12] Yeah they still hang around in Gerrit.
[14:37:15] It doesn't prune those.
[14:37:23] https://phabricator.wikimedia.org/diffusion/OPUP/ - so, it doesn't mess with most things
[14:37:53] It mentions it on a specific commit tho: https://phabricator.wikimedia.org/rOPUP41df0a06658e0b2d2fcbfd3b864be8f9a63c62e9
[14:38:06] Which is...actually kind of cool. Makes it easy to associate which Gerrit change went with something
[14:38:41] neat
[14:41:12] I also already tested this on a repo that doesn't exist in Gerrit, and adding unknown references to its fetch doesn't make it blow up.
[14:41:20] It just...doesn't fetch them! :)
[14:41:49] no errors in phd.log even? it tends to just throw away a job when it wigs out but it will keep retrying
[14:41:56] Error: Failed to apply catalog: Could not find dependency Package[git-core] for File[/etc/gitconfig] at /etc/puppet/modules/phabricator/manifests/vcs.pp:52
[14:42:02] ostriches: ^
[14:42:19] Bleh, package names. Can just remove that dep.
[14:44:42] has anyone changed the password for Selenium_user on en.beta?
[14:45:23] why do you ask, stephanebisson?
[14:46:16] Krenair: because I now get "WrongPass (MediawikiApi::LoginError)" when I run the browser tests against beta. It was working not too long ago.
[14:48:57] I think someone gave it a bot flag recently, but I'm not aware of any password changes
[14:49:01] PROBLEM - Puppet failure on deployment-puppetmaster is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[14:50:45] never mind the password, I was able to log in manually just now
[14:51:47] ostriches: gtg
[14:53:09] 5Continuous-Integration-Scaling, 7Tracking: Investigate using a Squid based man in the middle proxy to cache package manager SSL connections - https://phabricator.wikimedia.org/T116015#1738384 (10hashar) 3NEW a:3hashar
[14:54:41] PROBLEM - Puppet staleness on deployment-restbase01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]
[14:56:12] 5Continuous-Integration-Scaling, 7Tracking: [tracking] Disposable VMs need a cache for package managers - https://phabricator.wikimedia.org/T112560#1738401 (10chasemp) We should definitely shy away from NFS as a solution for this.
[14:56:29] 5Continuous-Integration-Scaling, 7Tracking: Investigate using a cache store/restore system for package managers - https://phabricator.wikimedia.org/T116017#1738413 (10hashar) 3NEW
[15:01:04] 5Continuous-Integration-Scaling, 7Tracking: [tracking] Disposable VMs need a cache for package managers - https://phabricator.wikimedia.org/T112560#1738425 (10hashar) I have filled T116017 about setting up a cache store / restore system. Probably using rsync and a dedicated instance with a bunch of space. O...
[15:04:00] RECOVERY - Puppet failure on deployment-puppetmaster is OK: OK: Less than 1.00% above the threshold [0.0]
[15:07:15] Okay.
[15:07:19] andrewbogott: hello :-} do we have any way to set egress/outgoing security rules in labs ? :-}
[15:07:20] Maybe I'm missing something obvious.
[15:08:04] hashar: labs-wide you mean, or for a specific instance?
[15:08:28] andrewbogott: for a specific project. I would like the Nodepool instances to not be able to reach other labs instances except some specific ones
[15:08:31] if at all possible
[15:08:59] Horizon doesn't let me create any egress filter. Not sure if it is at all possible or if it is a permission that hasn't been granted
[15:09:13] Why would nginx serve old SSL certificates that no longer exist on the filesystem?
[15:09:14] hashar: hm… if you look at horizon.wikimedia.org, rules have a ‘direction’ indicator
[15:09:23] which sort of implies it’s configurable although I don’t immediately see how
[15:09:40] And why do I sometimes get different certificates?
[15:10:07] Allows specifications of ingress and egress (Nova security groups defines ingress rules only) -- https://wiki.openstack.org/wiki/Neutron/SecurityGroups bahh
[15:12:07] hashar: ok, that explains why it’s in the gui :)
[15:12:47] andrewbogott: so we would need OpenStack Networking ?
[15:13:02] neutron, it looks like
[15:14:02] which one are we using ? :D
[15:14:23] nova-network
[15:14:32] Chase is doing some Neutron research but it’s pretty far off
[15:15:51] andre__: :)
[15:15:58] 5Continuous-Integration-Scaling: Isolate contintcloud nova project from the rest of the wmflabs cloud - https://phabricator.wikimedia.org/T86168#1738480 (10hashar) From a discussion with @andrew on IRC. Horizons shows the direction next to rules which seems to indicate we can change it. But from: https://wiki....
[15:17:10] 5Continuous-Integration-Scaling: Isolate contintcloud nova project from the rest of the wmflabs cloud - https://phabricator.wikimedia.org/T86168#1738490 (10hashar) And from https://bugs.launchpad.net/nova/+bug/1267140 : > ... nova-network only supported ingress rules, so nova API matches that. If you want egres...
[15:17:16] > > nova-network only supported ingress rules, so nova API matches that. If you want egress rules, you should use the Neutron API.
[15:17:28] andrewbogott: so yeah not doable / blocked by Neutron
[15:17:42] yep
[15:18:04] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce build #215: 04FAILURE in 3.2 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce/215/
[15:18:35] 5Continuous-Integration-Scaling: Isolate contintcloud nova project from the rest of the wmflabs cloud - https://phabricator.wikimedia.org/T86168#1738497 (10hashar)
[15:18:55] 10Continuous-Integration-Infrastructure, 5Continuous-Integration-Scaling, 10releng-201415-Q3, 10releng-201415-Q4, 7Epic: [EPIC] Run CI jobs in disposable VMs - https://phabricator.wikimedia.org/T47499#1738503 (10hashar)
[15:18:56] 5Continuous-Integration-Scaling: Isolate contintcloud nova project from the rest of the wmflabs cloud - https://phabricator.wikimedia.org/T86168#1738499 (10hashar) 5Open>3stalled stalled / blocked by {T85611}
[15:18:58] andrewbogott: thank you
[15:19:49] hashar: is this to limit CI instances to the caching hosts / certain resources?
[15:20:49] chasemp: yeah
[15:20:58] hashar: How can i fix the extension WebPlatformAuth jenkins tests failing? cause i tried extension.json and it is still failing the test.
[15:21:05] chasemp: aka have the labs project / tenant instances be network isolated from the rest
[15:21:45] paladox: the extension is assuming the dependencies are installed in the extension vendor repository ( /extensions/WebPlatformAuth/vendor/ )
[15:22:02] what if scandium or whatever git-daemon server had a cache for packages, I wonder
[15:22:05] or whatever resources
[15:22:12] paladox: but on Jenkins, we use the composer merge plugin and invoke composer install from mediawiki core. So the dependencies are installed in MediaWiki core /vendor/ .
[15:22:21] hashar: Oh ok. What should i do to get it to look in the right place?
[15:22:39] paladox: I have no idea whether using extension.json will get composer from mediawiki core to install the dependencies of the extension in /vendor/
[15:23:07] hashar: I am trying with the composer merge package and it is still failing cause it can't find vendor/autoload.php
[15:23:08] paladox: one sure thing, the WebPlatformAuth.php entry point should be enhanced to just assert one of the class_exists() instead of attempting to find the .php file
[15:23:20] hashar: How would i do that?
[15:23:52] chasemp: yeah potentially we can use scandium for that. I noticed your reply on the task, sorry I eventually completely forgot about that part of the infra
[15:24:09] hashar: Instead of checking if there is an autoload.php, check if the class for the required package exists. Or a workaround is to create a blank autoload.php file which will be overwritten.
[15:24:32] paladox: a blank autoload.php sounds lame :-D
[15:24:39] I think I finally killed the nginx instances that were serving the old certificate
[15:24:48] I have no idea how they sat around while nginx had supposedly stopped
[15:24:52] paladox: I am really wondering how one can get all extensions' composer dependencies easily installed though
[15:24:57] hashar: Yes. But it could work, checking if the class doesn't exist.
[15:25:28] paladox: ideally one would just download mediawiki/core + extensions, extract the extensions under /extensions/, and run composer install, which takes care of finding all the extensions' composer.json, merging them all and doing the install
[15:25:29] hashar: Yes. Maybe we would have to manually add each extension to a composer file that jenkins tests against.
[15:26:12] paladox: Jenkins currently does a glob of /extensions/*/composer.json and forges a file 'composer.local.json' which is used by the composer merge plugin
[15:26:36] Krenair: I guess none of the puppet patches took care of removing them ?
[15:26:54] the old ones?
[15:27:00] oh no
[15:27:00] that wasn't the issue
[15:27:10] chad recreated them all recently (to migrate varnishes to Jessie)
[15:27:21] hashar: Oh ok. How would i check if a class exists? because i would like to try the idea you suggested. If that doesn't work we would need to fall back to manually creating an autoload.php file for the time being to stop tests failing.
[15:30:32] bd808: I put the oom/hhvm graphite logging thing on puppetswat for today
[15:34:01] +1
[15:36:15] ostriches: I know you don't want to think about gerrit, but do you have any idea what change broke searching? It double URL encodes things now which makes searching for a user's commits and several other things a big pain in the ass.
[15:36:57] Hmm, I haven't changed anything lately.
[15:37:29] it is something in the javascript ui I think.
[15:37:45] if you fix the busted urls it makes, they stay fixed
[15:37:45] Hmm, nothing has changed there in quite some time....
[15:38:41] bd808: link example? I searched yesterday, I thought, for something
[15:38:41] weird. I wonder if it is a browser thing somehow then?
[15:39:09] ...could be
[15:39:21] chasemp: go to the search box and search for 'owner:"BryanDavis <bdavis@wikimedia.org>" '
[15:39:36] the < and > will end up double encoded in the url
[15:39:45] https://gerrit.wikimedia.org/r/#/q/owner:%22BryanDavis+%253Cbdavis%2540wikimedia.org%253E%22,n,z
[15:39:46] 10Beta-Cluster-Infrastructure, 6operations, 5Patch-For-Review: Beta Cluster no longer listens for HTTPS - https://phabricator.wikimedia.org/T70387#1738608 (10Krenair) a:3Krenair Using a real trusted certificate is covered in T50501, T75919 and T97593.
[15:40:03] should be https://gerrit.wikimedia.org/r/#/q/owner:%22BryanDavis+%253Cbdavis%40wikimedia.org%3E%22,n,z
[15:40:22] err https://gerrit.wikimedia.org/r/#/q/owner:%22BryanDavis+%3Cbdavis%40wikimedia.org%3E%22,n,z
[15:40:51] third time's the charm ;)
[15:40:59] it both found the history and displayed an error
[15:41:06] yeah
[15:41:11] User BryanDavis %3Cbdavis%40wikimedia.org%3E not found but
[15:41:14] it did find it :)
[15:41:29] good old gerrit making life interesting
[15:41:39] I just noticed it last week but it may have been around longer
[15:46:13] hashar: MediaWiki manually adds their autoload.php in the vendor dir.
[15:46:37] hashar: Plus how do i recognise the class, meaning does the class exist?
[15:46:50] hashar: In composer or php when loading autoload.php
[15:47:49] paladox: class_exists ( "some class provided by one of WebPlatformAuth dependencies " ) ?
[15:48:08] paladox: seems the author assumes composer to be run in the extension root
[15:48:32] paladox: and I am not sure MediaWiki core actually loads such dependencies, but apparently the extension attempts to do so
[15:48:54] hashar: So if i do ../../vendor/autoload.php that should work, should it?
[15:49:29] paladox: so in the entry point, I would first look whether the class exists. it can have been installed via MediaWiki core and the composer merge plugin in /vendor/
[15:49:50] paladox: else reuse the current logic which is to look up in the extensions/WebPlatformAuth/vendor/
[15:49:58] paladox: finally, die();
[15:50:26] so when one installs the dependencies with the composer merge plugin, hopefully they will be available in the autoloader when invoking class_exists() in WebPlatformAuth.php
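Putting hashar's three steps together, the entry point would look roughly like this. It is a sketch of the suggested logic, not the patch that was eventually merged, and 'GuzzleHttp\Client' stands in for whatever marker class the extension's dependencies actually provide (the right name was still being discussed):

```php
<?php
// Sketch of the suggested WebPlatformAuth.php entry-point logic.
// 'GuzzleHttp\Client' is an assumed marker class; substitute a class
// the extension's composer dependencies really define.
if ( class_exists( 'GuzzleHttp\Client' ) ) {
	// All fine: MediaWiki core's /vendor/ (populated e.g. via the
	// composer merge plugin on CI) already autoloads the dependencies.
} elseif ( is_readable( __DIR__ . '/vendor/autoload.php' ) ) {
	// Reuse the current logic: composer was run in the extension root.
	require_once __DIR__ . '/vendor/autoload.php';
} else {
	die( "WebPlatformAuth: composer dependencies not found.\n" );
}
```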
[15:50:53] hashar: Ok but i thought that the dependencies seem to not load into the vendor folder, instead they load in the mediawiki core vendor folder.
[15:51:51] paladox: that is what Jenkins does
[15:51:59] Hello, is there someone here who knows if this user was created for browser tests? http://en.wikipedia.beta.wmflabs.org/wiki/User:SB_Selenium_User
[15:52:18] paladox: but the extension is hardcoded to look in the extension /vendor/ directory, so that fails. Even if the needed classes are already loaded by MediaWiki core /vendor/
[15:52:23] zeljkof: ^ re Luke081515 's question
[15:52:31] Ok
[15:52:58] paladox: so before the if ( is_readable( __DIR__ . '/vendor/autoload.php' ) )
[15:53:30] hashar: So i should do class_exists before if ( is_readable( __DIR__ . '/vendor/autoload.php' ) )
[15:53:39] paladox: you want to add a if class_exists( 'Guzzle ???? or whatever class' ) {  /**all fine */ } elseif ( is_readable ...
[15:53:42] yeah
[15:53:48] and hopefully that will work
[15:54:07] hashar: Thank you for helping, i will try that now.
[15:55:17] hashar: Would the class be GuzzleHttp? i am looking at https://github.com/guzzle/guzzle/blob/master/composer.json
[15:55:44] greg-g: Thanks, I don't know who knows something about browser tests ;)
[15:56:25] Luke081515: zeljkof is a good first point of inquiry :)
[15:56:28] paladox: I have no idea
[15:56:36] paladox: maybe look at the use of Guzzle in WebPlatformAuth
[15:56:52] paladox: maybe Guzzle\Http\Client
[15:57:38] PROBLEM - Puppet failure on deployment-pdf02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:57:44] hashar: Ok
[15:57:47] time for meeting
[16:01:09] 6Release-Engineering-Team, 6Phabricator, 10Traffic, 6operations: Phabricator needs to expose ssh - https://phabricator.wikimedia.org/T100519#1738690 (10mmodell)
[16:01:33] Luke081515: I think this one is used for browser tests http://en.wikipedia.beta.wmflabs.org/wiki/User:Selenium_user cc greg-g
[16:01:59] so, Selenium_user, not sure who uses SB_Selenium_User
[16:02:21] hm. this account has only one edit at the moment
[16:02:28] hasharMeeting: It seems to pass the test at https://gerrit.wikimedia.org/r/#/c/247572/ but i am not sure whether it has found the class, so if it found the class and loaded the autoload.php correctly - i am not sure if it worked.
[16:02:41] 10Beta-Cluster-Infrastructure, 10Browser-Tests: Make selenium users use botflags at beta-cluster - https://phabricator.wikimedia.org/T116027#1738691 (10Luke081515) 3NEW
[16:03:03] I would prefer this ^ solution, it makes patrolling the recent changes easier ;)
[16:03:15] (for the other accounts)
[16:03:42] I found SB_Selenium_User at the abuse log, he matched a filter
[16:06:03] paladox: yeah apparently that installed it
[16:06:26] paladox: so now maybe you want to split https://gerrit.wikimedia.org/r/#/c/247572/ into smaller / atomic changes that are easier to review and get merged :D
[16:06:39] hasharMeeting: How can you tell if it loaded autoload.php?
[16:06:58] paladox: look at the composer job https://integration.wikimedia.org/ci/job/mwext-testextension-hhvm-composer/167/consoleFull
[16:07:02] hasharMeeting: Thank you for helping me figure out how to fix the error.
[16:07:03] [merge] Merging guzzlehttp/guzzle
[16:07:17] so the mediawiki core composer install managed to find
[16:07:18] it
[16:07:32] because jenkins injects the extension in a composer.local.json file
[16:08:21] hashar: Ok thanks, but what about loading the vendor/autoload.php through php, since that was where it was failing? I am not sure if the class worked. But i would think the elseif works even if the if didn't.
[16:08:32] hasharMeeting:
[16:15:28] paladox: well if mediawiki core composer installed the dependency, the classes it defines should be available by the time WebPlatformAuth.php is included
[16:15:39] paladox: so class_exists( 'Guzzle..xx/yy/' ) should return true
[16:16:21] thus you can skip the is_file check since the class has already been loaded by core
[16:18:17] hasharMeeting: Ok i have done the composer fixes here https://gerrit.wikimedia.org/r/#/c/246710/ to pass extension tests. Please could you review it.
[16:18:43] (03PS1) 10JanZerebecki: Work around cucumber pretty formater bug [integration/config] - 10https://gerrit.wikimedia.org/r/247602 (https://phabricator.wikimedia.org/T110510)
[16:23:41] (03PS2) 10JanZerebecki: Work around cucumber pretty formater bug [integration/config] - 10https://gerrit.wikimedia.org/r/247602 (https://phabricator.wikimedia.org/T110510)
[16:25:11] bd808: We haz production stats!
[16:27:38] 6Release-Engineering-Team, 6operations: Monitor Phabricator and Gerrit availability - https://phabricator.wikimedia.org/T115611#1738843 (10mmodell) icinga has paged me, and opsen, on multiple occasions when phabricator was down. I'm pretty sure that it's working.
[16:28:22] 6Release-Engineering-Team, 6operations: Monitor Phabricator and Gerrit availability - https://phabricator.wikimedia.org/T115611#1738853 (10hashar) 5Open>3Resolved a:3hashar Based on our experience we have good enough monitoring for either Gerrit or Phabricator. The critical bits are monitored via Icinga...
[16:29:41] chasemp: I will follow up on scandium soonish. Gotta context switch to it first :/
[16:37:41] RECOVERY - Puppet failure on deployment-pdf02 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:37:43] (03PS5) 10Paladox: [EventLogging] Add jsduck test [integration/config] - 10https://gerrit.wikimedia.org/r/246773 (https://phabricator.wikimedia.org/T88343)
[16:42:58] 6Release-Engineering-Team, 3releng-201516-q2: QR action item: Phabricator - https://phabricator.wikimedia.org/T115176#1738915 (10greg) From mukunda: https://secure.phabricator.com/w/roadmap/ is the upstream roadmap with stuff that can be prioritized
[16:47:14] twentyafterfour: did you get the Differential plugin installed on the prod Jenkins ?
[16:47:39] (03PS1) 10Jforrester: Revert "Delete failing VisualEditor browsertests Jenkins jobs" [integration/config] - 10https://gerrit.wikimedia.org/r/247612
[16:48:08] ah yeah
[16:49:28] twentyafterfour: the plugin is there and apparently nothing broke. If you get some job already set up I don't mind reviewing it
[16:49:53] https://grafana.wikimedia.org/dashboard/db/releng-production-logging
[16:49:58] twentyafterfour: you might need to specify what kind of nodes to run it on. UbuntuTrusty && contintLabsSlave is probably a safe choice
[16:50:34] ostriches: great :-)
[16:50:59] Once we get some OOM stats, we'll have a panel for OOMs as well. But production isn't running out of memory, grr :p
[16:51:18] ostriches: do the metrics come from the logstash elasticsearch ?
[16:51:22] Yep.
[16:52:00] Well, logstash tells statsd about them.
[16:52:01] 6Release-Engineering-Team, 15User-greg: Create weekly Diffusion/erential migration meeting - https://phabricator.wikimedia.org/T116037#1738947 (10greg) 3NEW a:3greg
[16:52:04] And then I query from that.
[16:52:36] ostriches: ah but it is relayed via statsd/graphite under the logstash. hierarchy
[16:53:27] yeah. logstash -> statsd -> graphite
[16:53:43] 5Continuous-Integration-Scaling, 7Tracking: Investigate using Drydock for CI - https://phabricator.wikimedia.org/T116038#1738961 (10mmodell) 3NEW
[16:53:52] logstash is a stream processor so it can't really count things itself
[16:54:22] ostriches: bd808: the next Grafana version should be able to query directly from ElasticSearch https://github.com/grafana/grafana/issues/1034
[16:54:34] "Elasticsearch as timeseries datasource #1034"
[16:54:44] sure, but that will always be slower than graphite
[16:55:01] and we only keep 30 days of the Elasticsearch data
[16:55:56] ETL via logstash for interesting metrics with graphite as the timeseries db is scalable
[16:56:03] yup
[16:56:29] at least for short time spans for corner cases, you would not have to get the data relayed via graphite
[16:56:53] I think it would be most useful for prototyping
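The division of labour sketched above, logstash emitting an event, statsd counting it, graphite storing the time series, works because a statsd counter is a fire-and-forget UDP datagram, cheap enough to send per log event. A minimal sketch; the host, port, and metric name are placeholders, not the real configuration:

```php
<?php
// Incrementing a statsd counter: one UDP datagram in the
// "<metric>:<value>|c" wire format. Host, port and metric name are
// placeholders chosen for illustration only.
$sock = fsockopen( 'udp://127.0.0.1', 8125, $errno, $errstr );
if ( $sock !== false ) {
	fwrite( $sock, 'logstash.rate.mediawiki.error:1|c' );
	fclose( $sock );
}
```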
[16:58:43] ostriches: that dashboard looks a lot like the sekret bd808-test dashboard :)
[16:58:49] I stole it :p
[16:59:12] Removed memcached rate, added hhvm.
[16:59:16] Will add OOM when it's done.
[16:59:23] +1
[17:02:43] Could someone review https://gerrit.wikimedia.org/r/#/c/247294/ please.
[17:03:30] legoktm: I am escaping for dinner, but I hit a PHP style issue earlier today regarding comments. if you could hint me at the preferred way to do multiline comments: https://gerrit.wikimedia.org/r/#/c/247560/3/tests/phpunit/phpunit.php,unified :-}
[17:03:57] greg-g: you talked a bit about https://gerrit.wikimedia.org/r/#/c/240888 during the meeting right? I should’ve chimed in then...
[17:05:24] bd808: Adding another metric for apache2 syslog rate.
[17:05:36] My thought is that if the change were in production then MZ would be (at least a little bit) justified in worrying about turf issues. But as the patch is only targeting beta, I have a different question
[17:06:05] which is: How do beta people feel about a potentially long-lived patch in beta, and the use of beta as a dev platform?
[17:06:17] * ostriches takes a breath
[17:06:20] (Seems like some days beta is for dev and some days it’s only for patches that are one day away from prod)
[17:06:30] SMalyshev: ^
[17:06:38] I don't like the idea of beta as a platform for developing stuff that doesn't have a clear path to production.
[17:06:53] * andrewbogott nods
[17:07:06] I am really wondering whether we need www.wikipedia.org to be part of the main mediawiki cluster
[17:07:13] when it can be a standalone server
[17:07:41] Some time earlier in the phab task someone tried to steer the discovery folks towards making a labs project for this… why was that passed up?
[17:07:52] andrewbogott: I hope it shouldn't be that long-lived
[17:08:06] hasharMeeting: It gets enough traffic that it needs to be on text-lb and load balanced to the apache.
[17:08:15] ok ok :-}
[17:08:16] SMalyshev: that seems unrealistic, because changing that in production is a political question
[17:08:19] andrewbogott: but right now you can just deploy unmerged stuff on beta, so I'm not sure there's a problem with merged stuff
[17:08:20] and those are never quick to resolve
[17:08:42] SMalyshev: I don't like that you can deploy unmerged stuff on beta.
[17:08:43] SMalyshev: why not just make a labs instance or a vagrant project to demonstrate future portal work?
[17:08:55] andrewbogott: well, I'm not getting in there
[17:09:00] anyway it is dinner time
[17:09:06] hasharMeeting: replace the // with empty newlines
[17:09:08] ..
[17:09:18] andrewbogott: I don't know any way of doing suitable apache configs on a labs instance
[17:09:41] andrewbogott: given that those configs have names hardcoded
[17:09:49] The funny thing is, the apache config has zero shared code between beta and production.
[17:09:59] So doing it in beta -> not a canary for production
[17:10:33] ostriches: well, it's a copy of production, also it's a testbed for the process
[17:11:16] ostriches: the hard part there is not the apache config (though that needs to be tested too, and already was on beta) but people interacting with it
[17:12:29] to see that people are ok with it, we need to have it somewhere where it resembles the actual interaction
[17:12:55] with the deployment process, etc.
[17:14:16] ok with what, though? I don’t see that there is a design or a prototype or even a plan for what it is you want to show people.
[17:15:25] SMalyshev: sorry, I’m not trying to be negative. I’m just unclear on whether you have a specific thing you want to demonstrate on beta, or if you want a dev platform.
[17:15:30] Everything in the phab task implies the latter.
[17:15:39] In which case… beta is probably not the right place for it.
[17:16:17] andrewbogott: we want to have a test deployment for the mechanism of the new portal, which I was under the impression beta actually is.
[17:16:48] Ah, so you want to test the gitifying of the portal
[17:17:03] before demonstrating what the gitifying is for
[17:17:08] andrewbogott: not only. Also to test how mxn and others would work with it
[17:17:27] andrewbogott: we know what it is for. We're not doing it just to have something to do with git
[17:17:54] ok
[17:18:06] andrewbogott: and it is actually explained on the task, by both the actual portal maintainer (who is completely on board with it) and Deskana
[17:18:32] Hang on, I didn’t say you didn’t /know/ what it was for
[17:18:38] I said without demonstrating :)
[17:18:57] SMalyshev: sorry, tell me again why you can’t make your own demo portal for this rather than using beta’s?
[17:21:39] andrewbogott: because I don't know how to make our own demo portal that uses scap deployment, beta apache configs, etc. and it seems that the beta portal exists exactly for this purpose - at least if it doesn't then I don't see why we have a beta portal at all
[17:22:16] and beta apache configs are close enough to actual prod configs that if those work we can be reasonably sure prod ones would work too
[17:22:22] same for the deployment processes
[17:23:17] absent that, of course I can deploy a VM and check out the HTML page, but testing if we can publish an HTML page is useless - we know we can, it's the surrounding deployment/process aspects that are tricky
[17:23:33] the HTML page I can test in my local browser...
[17:31:27] SMalyshev: I can only defer to the beta people about the question of whether this is or isn’t an appropriate use.
[17:35:01] As long as it's temporary.
[17:35:14] If it ends up stalling going to prod, we should back things out.
[17:35:48] ostriches: if we decide it's not good for prod, we can always revert it
[17:36:01] that's the point of beta, if it doesn't work, it's out
[17:47:26] 10Continuous-Integration-Config, 10Fundraising Tech Backlog, 10Fundraising-Backlog, 7Technical-Debt: Use the name "grunt-jscs" in all Fundraising CI glue - https://phabricator.wikimedia.org/T115642#1739254 (10atgo)
[17:48:46] getting here late, but in terms of plans for the portal: the initial set is ~5 A/B tests of visual changes to the portal, with more tests and programmatic functionality being built in while working on various tests: https://phabricator.wikimedia.org/T112172
[17:48:52] 10Continuous-Integration-Config, 10Fundraising-Backlog, 5Patch-For-Review, 7WorkType-Maintenance: Switch wikimedia/fundraising/slander to use tox as an entry point - https://phabricator.wikimedia.org/T114250#1739273 (10awight) @hashar Thank you, this is awesome! Just letting you know however that we've pu...
[17:51:11] (03CR) 10JanZerebecki: [C: 031] Use pipeline name as context for Zuul diff [integration/config] - 10https://gerrit.wikimedia.org/r/247543 (owner: 10Hashar)
[18:13:07] (03PS1) 10JanZerebecki: pywikibot/core: also run tox-jessie in check pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/247623 (https://phabricator.wikimedia.org/T87169)
[18:13:30] so there's no branch cut today?
[18:14:09] (03CR) 10jenkins-bot: [V: 04-1] pywikibot/core: also run tox-jessie in check pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/247623 (https://phabricator.wikimedia.org/T87169) (owner: 10JanZerebecki)
[18:14:28] (03CR) 10Legoktm: [C: 031] "Awesome!" [integration/config] - 10https://gerrit.wikimedia.org/r/247543 (owner: 10Hashar)
[18:27:43] (03PS2) 10JanZerebecki: pywikibot/core: also run tox-jessie in check pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/247623 (https://phabricator.wikimedia.org/T87169)
[18:49:16] greg-g: sent a mail about deployments
[18:49:43] but since hoo is in SF, he might do some of it today (like update our stuff in 1.27.3)
[19:03:19] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-OAuth, 10Pywikibot-OAuth, 5Patch-For-Review, and 2 others: "Nonce already used" regularly occurring on beta cluster - https://phabricator.wikimedia.org/T109173#1739719 (10Tgr) [[ https://graphite.wmflabs.org/render/?width=1500&height=1000&_salt=14417588...
[19:12:45] bd808: Hmm, we don't get enough OOMs to make a pretty graph.
[19:12:48] Which is...good
[19:13:08] But it means I can't find a good way to include the stat on my dashboard that makes sense :p
[19:13:36] hard to complain about that :D
[19:15:23] 6Release-Engineering-Team: Reduce production log errors to zero* - https://phabricator.wikimedia.org/T115630#1739825 (10demon)
[19:30:18] 6Release-Engineering-Team, 6operations: deployment: user trebuchet gets added and removed from group wikidev on every puppet run - https://phabricator.wikimedia.org/T115760#1739925 (10thcipriani) So, currently, it doesn't matter if the `trebuchet` user is in the `wikidev` group, this has only been the case sin...
[19:47:00] 6RelEng-Admin, 15User-greg: Create KPIs for #releng-201516-Q2 - https://phabricator.wikimedia.org/T107905#1739950 (10demon)
[19:47:01] 6Release-Engineering-Team: Reduce production log errors to zero* - https://phabricator.wikimedia.org/T115630#1728605 (10demon)
[19:47:02] 6Release-Engineering-Team: Implement "WMF Log Errors count" KPI - https://phabricator.wikimedia.org/T108749#1739945 (10demon) 5Open>3Resolved A little bit of explanation about what stats we're tracking here and what the goals are for them: * **MW error logs by severity** - Ideally anything WARNING/FATAL/ERRO...
[19:47:12] greg-g: ^^
[20:06:51] 5Continuous-Integration-Scaling, 7Tracking: Investigate using a Squid based man in the middle proxy to cache package manager SSL connections - https://phabricator.wikimedia.org/T116015#1740000 (10hashar)
[20:07:16] 5Continuous-Integration-Scaling, 7Tracking: Investigate using a Squid based man in the middle proxy to cache package manager SSL connections - https://phabricator.wikimedia.org/T116015#1738384 (10hashar) I have updated the task description with my attempts at using Squid as a man in the middle proxy.
[20:07:59] (03PS2) 10Niedzielski: Remove redundant Android tests [integration/config] - 10https://gerrit.wikimedia.org/r/247629
[20:24:04] 6Release-Engineering-Team, 6operations: deployment: user trebuchet gets added and removed from group wikidev on every puppet run - https://phabricator.wikimedia.org/T115760#1740032 (10faidon) >>! In T115760#1739925, @thcipriani wrote: > It seems like the Right Thing™ would be to make `wikidev` the primary grou...
[20:25:38] ostriches: ^5, \o/, yippee!, etc etc :)
[20:25:44] well done
[20:28:23] (03CR) 10Niedzielski: "deployment looks good. ready 4 review" [integration/config] - 10https://gerrit.wikimedia.org/r/247629 (owner: 10Niedzielski)
[20:32:13] 6Release-Engineering-Team: Implement "WMF Log Errors count" KPI - https://phabricator.wikimedia.org/T108749#1740050 (10hashar) I have edited the releng main board https://grafana.wikimedia.org/dashboard/db/releng-main-page to list out all dashboards tagged `releng`. Such dashboard list is a feature of Grafana 2...
[20:37:16] (03CR) 10Hashar: "Happy you like it :-} Who's going to +2 it ?" [integration/config] - 10https://gerrit.wikimedia.org/r/247543 (owner: 10Hashar)
[20:39:53] (03PS1) 10Hashar: fundraising/slander now has tox-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/247649 (https://phabricator.wikimedia.org/T114250)
[20:40:48] 10Continuous-Integration-Config, 10Fundraising-Backlog, 5Patch-For-Review, 7WorkType-Maintenance: Switch wikimedia/fundraising/slander to use tox as an entry point - https://phabricator.wikimedia.org/T114250#1740069 (10hashar) Such a pity, poor `slander`. At least there is a test entry point and CI is bei...
[20:43:25] there's no branch being cut today right?
[20:43:34] and it's going to happen next week instead?
[20:47:10] legoktm: right right
[20:47:21] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<30.00%)
[20:48:41] * legoktm merges a scary change
[20:53:32] (03CR) 10Hashar: "I think that was for T104264 "Enable PHPUnit testing on the wikimedia/fundraising/SmashPig repo"" [integration/config] - 10https://gerrit.wikimedia.org/r/228580 (owner: 10Awight)
[20:55:03] (03CR) 10Hashar: [C: 031] "Ugly but that is a workaround so... Does it actually fix the issue?" [integration/config] - 10https://gerrit.wikimedia.org/r/247602 (https://phabricator.wikimedia.org/T110510) (owner: 10JanZerebecki)
[20:57:31] (03PS2) 10Hashar: Revert "Delete failing VisualEditor browsertests Jenkins jobs" [integration/config] - 10https://gerrit.wikimedia.org/r/247612 (https://phabricator.wikimedia.org/T94162) (owner: 10Jforrester)
[20:58:26] (03CR) 10Hashar: "Updated commit message to point to the tasks that were listed in the reverted commit." [integration/config] - 10https://gerrit.wikimedia.org/r/247612 (https://phabricator.wikimedia.org/T94162) (owner: 10Jforrester)
[21:02:23] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK
[21:21:30] Yippee, build fixed!
[21:21:31] Project browsertests-QuickSurveys-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #40: 09FIXED in 5 min 29 sec: https://integration.wikimedia.org/ci/job/browsertests-QuickSurveys-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/40/
[21:42:11] 6Release-Engineering-Team: Implement "WMF Log Errors count" KPI - https://phabricator.wikimedia.org/T108749#1740221 (10demon) Moved to https://grafana.wikimedia.org/dashboard/db/production-logging per Faidon on T81030
[21:43:12] greg-g: apparently l10nupdate scapped the group1 to wmf3 change which was left unsync'd last week.
[21:43:42] That seems incredibly unwise, to have l10nupdate blindly syncing whatever happens to be on tin at the time
[21:45:35] the whole l10n system deserves an overhauling
[21:46:07] but that is a daunting task
[22:03:57] yeah
[23:01:59] PROBLEM - Puppet failure on deployment-restbase02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[23:03:14] ^ this is me, working on it.
[23:16:55] RECOVERY - Puppet failure on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:44:05] twentyafterfour: holy.. that's not good, file a task to track that?
[23:45:28] greg-g: apparently I was wrong, someone scapped it
[23:45:44] huh
[23:45:56] * greg-g is reading backscroll in -ops, just got to discussion about l10nupdate
[23:47:14] ah yeah, I see
[23:49:43] RECOVERY - Puppet staleness on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [3600.0]