[00:25:16] Krinkle: my dev wiki (after a cache warmup) definitely seems a lot faster with the bagostuff fixes [00:25:16] lol [00:25:30] nice [00:25:51] When did you start using php 7.2? [00:25:54] * Reedy tries to find the bug again [00:26:03] it's been a problem since php 7 :P [00:26:35] 16.04? [00:28:49] So the backports into the supported branches will likely help someone else :) [02:19:40] Reedy: ah nvm, the issue wasn't new in 7.2, the issue is with php-memcached [02:19:55] I misremembered the reason being that I'm on 7.1, but the reason is I don't have mc local [05:42:23] <_joe_> TimStarling: still around? I wanted your opinion on bblack's proposal https://phabricator.wikimedia.org/T91820#4254387 [05:42:52] <_joe_> our big problem is, we can't write to the session storage from the non-primary DC at the moment [05:43:47] <_joe_> so, unless we're 100% sure we don't touch sessions in GETs, I would advise we follow that path for now [05:44:58] reading [05:47:23] what will happen if it tries to write to the session? [05:48:18] <_joe_> say a request in codfw writes to the session [05:48:21] <_joe_> it will succeed [05:48:29] <_joe_> but the data written will only be in codfw [05:48:43] <_joe_> and will be overwritten as soon as one request for the same session is made in eqiad [05:48:51] <_joe_> well, one write request [05:50:57] that would need to at least be logged [05:51:49] we do create a new session when there is a GET request to Special:UserLogin [05:52:05] <_joe_> we can also make the codfw redises read-only [05:52:17] and that is a localised page name [05:52:23] <_joe_> sigh [05:53:01] <_joe_> ok so we need to change the session storage before we go multi-dc, it would seem? 
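The failure mode _joe_ describes above (a session write in the secondary DC "succeeds" locally but is clobbered by one-way replication from the primary) can be sketched in a few lines. This is a purely illustrative Python sketch; the store class and `replicate()` helper are made up, not actual infrastructure code.

```python
# Sketch of the lost-write problem with one-way session replication
# (primary -> secondary). All names here are hypothetical.

class SessionStore:
    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

eqiad = SessionStore()   # primary: its writes replicate out
codfw = SessionStore()   # secondary: local writes never replicate back

def replicate(primary, secondary):
    # one-way replication: primary state overwrites the secondary
    secondary.data.update(primary.data)

# a request served in codfw writes to the session: it "succeeds"...
codfw.set('session:abc', {'userId': 42, 'csrfToken': 'x'})

# ...but the next write for the same session in eqiad overwrites it
eqiad.set('session:abc', {'userId': 42})
replicate(eqiad, codfw)

print(codfw.get('session:abc'))  # the codfw-only csrfToken is gone
```

This is why the conversation turns to making the codfw redises read-only: a write that can silently vanish is worse than a write that fails loudly.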
[05:53:16] <_joe_> I thought login happened via a post, and that created the session [05:53:55] there is CSRF protection [05:54:04] <_joe_> or use dynomite, which allows multi-dc smart replication [05:54:21] <_joe_> right, I even know about that, we had an RFC earlier this year [05:57:05] pity the SessionManager refactor did not add replication awareness [05:58:19] <_joe_> we already have in mediawiki a REST session store which can use cassandra+restbase for storing sessions, I would expect that to be the way to go [05:58:34] <_joe_> but ofc that requires hardware, and some more work [05:59:02] <_joe_> but it was the correct solution anyways [05:59:47] how does it help though, if you use a sequence of get, modify, set [06:00:04] <_joe_> well, cassandra helps :P [06:00:12] unless you wait for a quorum on set [06:00:31] <_joe_> yeah, of course you need to [06:00:36] <_joe_> but it's usually very fast [06:01:13] presumably that could be done with other data stores at the application layer, but OK [06:01:47] e.g. 
we have multi-write wrappers, where the application just sends all writes to both data stores, waiting for each [06:02:08] <_joe_> yes [06:02:16] <_joe_> we will have the problem of encryption then [06:02:26] sure [06:02:33] <_joe_> cross-dc writes to the session store will need to be encrypted [06:02:43] <_joe_> I need to run some numbers on that too [06:03:03] I hope you are not saying that we should move sessions to cassandra just for encryption [06:03:09] <_joe_> nope [06:03:32] <_joe_> I think that it's the best option for having a cross-dc data store with reliable write and read performance [06:03:39] it's awesome in other ways [06:03:41] <_joe_> and that we currently use and have expertise for [06:04:18] <_joe_> we could ofc keep using redis, tunnel connections, and solve the problems at the application layer [06:05:02] <_joe_> but that seems to *me* like spending a lot of effort, not necessarily in the right direction [06:05:03] is redis used for anything other than sessions now? [06:05:23] <_joe_> file locking, and the last small bits of the old jobqueue [06:05:23] if we can completely get rid of redis then that is a win for complexity [06:05:57] the reason for using it was for replication, but that hasn't really worked out, has it? 
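The multi-write wrapper TimStarling mentions can be sketched like this (an illustrative Python sketch of the idea, not MediaWiki's actual `MultiWriteBagOStuff`; the class and store names are made up):

```python
# Sketch of a synchronous multi-write wrapper: every write goes to both
# backing stores and we wait for each; reads prefer the local store.
# Illustrative only -- not MediaWiki's real implementation.

class DictStore:
    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)

class MultiWriteStore:
    def __init__(self, local, remote):
        self.local = local
        self.remote = remote

    def set(self, key, value):
        # write to both stores, waiting for each to acknowledge
        ok_local = self.local.set(key, value)
        ok_remote = self.remote.set(key, value)
        return ok_local and ok_remote

    def get(self, key):
        # read locally; fall back to the remote copy on a miss
        value = self.local.get(key)
        if value is None:
            value = self.remote.get(key)
        return value

store = MultiWriteStore(DictStore(), DictStore())
store.set('session:abc', 'payload')
print(store.get('session:abc'))  # 'payload', now present in both backends
```

Waiting for both writes is what makes this safe for sessions, and also what makes cross-DC latency (and, as _joe_ notes next, encryption of the cross-DC hop) the cost to measure.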
[06:06:02] <_joe_> we have other services that use redis anyway: ores and change-prop both use redis, so redis is never going away [06:06:18] <_joe_> Well, the advantage redis gives you over memcached [06:06:41] <_joe_> is replication (which for the sessions, where we don't abuse lua, works), and persistence [06:07:00] <_joe_> if a redis is restarted, sessions will persist on the restart [06:07:22] <_joe_> *if* we could use memcached for sessions, we could solve the problem of cross-dc replication using mcrouter [06:07:22] ok [06:07:31] <_joe_> but we'd lose persistence [06:08:12] <_joe_> you would have multi-dc replicated sessions and we could use that redundancy to cover for loss of persistence, but then you have the ill tendency of memcached to evict data [06:08:35] <_joe_> I tend to prefer not to store sessions in memcached for that reason [06:08:48] and memcached has eviction, in fact IIRC there are some obscure locking cases in memcached where it will throw away your data even if it has enough space, because it can pretend to have evicted it [06:09:35] <_joe_> yes, the slab allocation issue [06:09:47] <_joe_> it's got better in the latest versions, but it's still there [06:10:24] <_joe_> if you have too many objects of the same size, memcached will put them all in the same slab, and once it's full, it will start to evict data [06:11:02] sure, I had epic battles with this in the early days of memcached [06:11:09] <_joe_> it happened to me in the past (circa 2009?) that I had a memcached server with 10% of the RAM used, but it was evicting like crazy [06:11:16] <_joe_> heh [06:11:24] including, IIRC, trolling them on their mailing list about how they should have just used malloc() [06:11:30] <_joe_> ahah [06:16:52] I'm writing on phab [06:17:16] <_joe_> thanks [06:19:14] bbiab [09:19:57] tgr: hey. does php have an operator I don't know, or is this a typo? 
[09:20:01] https://gerrit.wikimedia.org/r/c/405015/91..92/includes/page/WikiPage.php#2931 [09:20:06] $statusRev = $status->value['revision'] ?? null; [09:32:22] DanielK_WMDE_: null coalescing operator, PHP 7 [09:33:06] it's like ?: but lax in the left-side check, like empty/isset [09:34:14] <_joe_> does it work with HHVM? [09:40:22] Ah, I see. [09:40:50] _joe_: I'd hope whoever merged it to master checked that [09:41:40] DanielK_WMDE_: I rewrote the test yesterday to have all the logging checks in the same test, but didn't push yet; should I do that or are you working on the code already? [09:41:48] tgr: it's a bit prettier, but why did you change that line in this patch? It seems unrelated. [09:41:56] tgr: it's done [09:42:05] cool, thanks [09:42:14] the change is from the rebase [09:42:28] someone changed it in WikiPage [09:43:52] ah, ok [09:44:12] i guess i have to set my IDE to php7, then :) [09:48:24] tgr: i pushed a new version [09:48:35] did you resolve the npm problem? [09:50:07] the npm install one, yes, but the tests are still broken [09:50:23] haven't had much time to work on MCR in the last few days [09:50:46] "tests are broken" means they fail, or something else? [09:50:47] I'll fix them today or worst case do a bunch of manual testing if I can't figure that out [09:50:57] yeah, they fail on master [09:51:07] can you tell me how to run the selenium tests on the vps? I have never done that :) [09:51:17] npm run selenium [09:51:57] well, the Node.js ones, there might be extensions which still have Ruby tests [09:53:44] tgr: i'll try that. and i'll ask addshore to break mcr-full with manual tests :) [09:54:29] uh... [09:54:36] http://mcr-full.wmflabs.org/ doesn't exist? [09:54:51] mcr-base and mcr-sdc do... 
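The `??` operator discussed above, PHP 7's null coalescing operator, behaves much like a dict lookup with a default in Python: it yields the left-hand value when the key is set and non-null, and the fallback otherwise, without raising the "undefined index" notice that `?:` would. A small Python analogy (the `coalesce` helper is just for illustration):

```python
# Python analogy for PHP 7's null coalescing operator:
#   $statusRev = $status->value['revision'] ?? null;
# ?? is lenient about missing keys (like isset()), whereas ?:
# would still trigger a notice for an undefined index.

def coalesce(mapping, key, default=None):
    # return mapping[key] unless the key is absent or its value is None
    value = mapping.get(key)
    return default if value is None else value

status_value = {'revision': 'r123'}
print(coalesce(status_value, 'revision'))                    # r123
print(coalesce({}, 'revision'))                              # None, no KeyError
print(coalesce({'revision': None}, 'revision', 'fallback'))  # fallback
```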
[09:58:42] DanielK_WMDE_: uh, sorry, I converted that to a wikifarm and forgot to tell [09:58:45] try http://dev-mcr-full.wmflabs.org/wiki/Main_Page [09:59:03] or http://wikis-mcr-full.wmflabs.org/ [09:59:39] although I haven't created the rest of the proxies yet [10:03:34] DanielK_WMDE_: if I want to get a list of sitelinks from a Q number, would I do something like WikibaseRepo::getDefaultInstance()->getStore()->newSiteLinkStore()->getSiteLinksForItem( new ItemId( 'Q123' ) ) ? [10:05:54] tgr: yes. Well, you'd want to inject the SiteLinkStore. And if you have the Item object loaded anyway, you can also ask it directly. But that should work, yes. [10:06:55] tgr: the list of wikis on http://wikis-mcr-full.wmflabs.org/ is a lie, no? [10:07:40] hm... you converted that one vm to a farm, but not the others, correct? [10:07:52] the list of domains is a lie, the wikis do exist [10:08:00] I'll create the proxies in a sec [10:08:15] ok, thanks! [10:14:37] DanielK_WMDE_: should be working now [10:16:10] tgr: most are, but http://en-mcr-full.wmflabs.org/ isn't [10:16:32] works for me [10:16:44] are you using Firefox? [10:16:58] it tends to cache DNS lookup failures for a few minutes [10:18:13] ...even when you click "try again"? *sigh* [10:19:00] tgr: so... is the wikifarm still using vagrant? I'm a bit confused about the setup. [10:20:35] I'm also confused about where to run npm run selenium. when i ssh into mcr-full.eqiad.wmflabs, there seems to be no npm installed, and no vagrant running... 
[10:20:45] sorry for being a noob ;) [10:23:43] DanielK_WMDE_: yeah, a single Vagrant box with a single MediaWiki checkout, not unlike Wikimedia production (except that uses two checkouts so that the train deploy can be staggered), just a lot more hacky internally [10:24:22] so on the vagrant side, it's just MediaWiki using the Host: header to set the database ID and the config [10:25:01] WMCloud is not really aware of all that, it just has a list of domains and a list of boxes+ports to point them to [10:25:59] you ssh into mcr-full, go to /srv/mediawiki-vagrant, run vagrant ssh, go to /vagrant/mediawiki, run selenium tests there [10:27:56] so this is the https://www.mediawiki.org/wiki/Selenium/Node.js/Inside_MediaWiki-Vagrant setup [10:28:51] as opposed to https://www.mediawiki.org/wiki/Selenium/Node.js/Inside_MediaWiki-Vagrant where you'd run the tests from the WMCloud box [10:30:11] DanielK_WMDE_: oh, sorry, the proper command is MW_SERVER=http://dev-mcr-full.wmflabs.org npm run selenium [10:34:57] oh, right. change to /srv/mediawiki-vagrant before running vagrant ssh. silly me [10:38:19] [10:37:57] [E] [MWBOT] Login failed: Selenium_user@http://dev-mcr-full.wmflabs.org/w [10:49:52] annoyingly, it does not seem possible to X11-forward through Cloud hosts [11:12:42] tgr: o/ [11:12:52] how can I go about giving myself some rights on the test wikis? [11:15:26] addshore: there is a maintenance script [11:15:41] createUserSomething? [11:16:13] how does one run said maintenance script in the vagrant setup though? :) [11:17:19] mwscript createAndPromote.php --wiki=wiki --sysop --bureaucrat --force [11:17:26] inside vagrant [11:17:50] or you can just log into the default account (Admin:vagrant) which is a bureaucrat [11:17:54] it's the inside-vagrant bit I don't know :) [11:18:06] aaah, okay, what's the password for it? [11:18:12] vagrant [11:18:16] oh wait, admin, vagrant :) [11:18:18] awesome, thanks! 
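The Host-header dispatch tgr describes (one MediaWiki checkout serving several wikis, with the HTTP Host header selecting the database ID and config) can be sketched roughly like this. The domain-to-wiki map below is hypothetical, loosely modeled on the mcr-full farm domains mentioned above:

```python
# Sketch of Host-header based wiki-farm dispatch: a single application
# checkout picks its database ID from the request's Host header.
# Domains and wiki IDs here are illustrative, not the real farm config.

WIKI_BY_HOST = {
    'dev-mcr-full.wmflabs.org': 'devwiki',
    'en-mcr-full.wmflabs.org': 'enwiki',
    'commons-mcr-full.wmflabs.org': 'commonswiki',
}

def select_wiki(host_header, default='devwiki'):
    # strip an optional :port suffix, then look the domain up in the map
    host = host_header.split(':')[0].lower()
    return WIKI_BY_HOST.get(host, default)

print(select_wiki('en-mcr-full.wmflabs.org'))        # enwiki
print(select_wiki('dev-mcr-full.wmflabs.org:8080'))  # devwiki
```

The WMCloud proxy layer sits in front of this: it only maps domains to boxes and ports, which is why each new domain needs a proxy entry even though the wiki already exists behind it.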
[11:22:09] tgr: does a vagrant install have both a master and a slave [11:22:09] ? [11:24:25] and mcr-base is just current master? or? [11:25:37] addshore: you mean DB slave? [11:25:42] tgr: yup [11:26:02] no, there is a single local MariaDB server [11:26:37] okay, is it possible to set it up with a slave in vagrant, as many issues we identified before for other mcr-related patches were only visible while using a master & slave [11:26:40] Aaron.Schulz had a hack for setting up a replica, I don't think it ever made it into a proper vagrant role [11:27:44] there is no reason it couldn't be done in theory, I don't know much about DB administration though [11:50:13] addshore: T93047 [11:50:14] T93047: Create vagrant role for master-slave DB setup - https://phabricator.wikimedia.org/T93047 [11:51:48] turning that into a proper vagrant role with configurable replag and such would be cool [11:53:40] hasharAway: can I run selenium in a real browser with npm? [12:24:02] addshore: testing master/slave is more relevant for the RevisionStore stuff than for PageUpdater. [12:24:30] the critical stuff to test for PageUpdater is really extensions using hooks. Because all the hook points got moved. [12:25:30] so, abusefilter, confirmedit, echo, babel, geo... [12:25:41] flow as well [12:39:10] the headless tests fail mostly on login, it seems [12:39:45] I wanted to avoid having centralauth on mcr-full because it complicates things but apparently the flow role pulls it in :/ [12:40:28] and headful (is that a word?) 
tests do not run on my machine at least [12:40:49] I'll put up instructions, would be nice if someone else could try [12:41:20] could be OOM again, maybe we should just scrap mcr-full before we have invested too much work in it, and redo it on a larger box [12:41:48] or we could try the selenium video recording thingie I guess [12:49:24] tgr: should be possible yeah [12:49:33] webdriver.io would for sure if you have DISPLAY set [12:50:30] which requires an X11 server to run [12:50:41] or Xvfb [12:51:02] Xvfb is an X server that has the display purely in memory [12:51:10] you can even watch it by attaching to it: https://amusso.blogspot.com/2017/11/watching-xvfb-frame-buffer.html [12:51:19] /usr/bin/Xvfb :94 -screen 0 1280x1024x24 -fbdir "$HOME" [12:51:21] xwud -in Xvfb_screen0 [12:51:32] then DISPLAY=:94 [12:52:05] though xwud might need an X server as well hehe [12:52:29] but if inside vagrant you have -fbdir mounted from the host, you can watch from the host i guess [12:53:25] another alternative is to run the npm selenium suite from your host and target mediawiki vagrant by setting MW_SERVER and MW_SCRIPT_PATH [13:01:40] hashar: thanks! I've run some circles with zeljkof in #releng in the meanwhile [13:02:33] not quite working yet, but it tries to create the X11 window at least [13:33:12] DanielK_WMDE_: ack, I'll be sure to try and test all the installed extensions [13:33:19] DanielK_WMDE_: just reviewing https://gerrit.wikimedia.org/r/#/c/432980 now [13:35:43] just left 5 comments, all only minor, I think that's probably ready to just get merged, can get it out of the way [13:46:08] addshore: only after we tested it on mcr-full :) [13:47:40] DanielK_WMDE_: that thing is only tests, right? and at the start of the chain [13:48:27] tgr: it would be great to add wikidatawiki and commons to the test setup for mcr-full, would that be hard? 
[13:49:43] commons is not hard [13:49:56] wikidatawiki I am not familiar with, will have a look [13:50:15] there should be a role for it afaik that should work [13:50:29] I vaguely remember it being not fully multiwiki-compatible [13:51:16] addshore: oh, the one that just refactors the tests! yes, sure, that can just go in :) [14:59:17] anomie: DanielK_WMDE_ are we okay to nuke those test dbs now and get them updated then? :) [14:59:22] just confirming before I make it so [14:59:54] no one is still actively using them right now, and we don't need any data currently on them [15:00:38] addshore: The only concern is whether the DBAs are able to update them within whatever timeframe we're looking at. [15:01:16] anomie: should be ~1 hour each [15:01:31] and I believe they will start as soon as I confirm on the ticket :) [15:01:49] That's fine then. I was worried that the time would be measured in days or weeks rather than hours ;) [15:01:56] great! :) [15:02:03] (whether for the copying or for the having time to do it) [15:06:04] anomie: okay, they said it would in total be offline for a day for the data reload and a software update, but I guess that is still in the timeframe [15:07:16] Fine with me. [15:07:49] ack [15:09:02] anomie: It is scheduled in for tomorrow [15:17:04] CindyCicaleseWMF: what date did you say we were aiming to get the train? [15:17:21] June 18 [15:17:28] There is no train that week [15:17:41] D: [15:17:44] Well, that would be a problem :-D [15:17:56] When is the next train after June 11? [15:18:01] week of the 25th [15:18:28] but, that might even be able to work to our advantage, I'll have a discussion with releng [15:18:53] OK, great. I will update the doc and we can discuss in tomorrow's meeting. [18:25:21] Reedy: btw, need to sync the 1.31 notes back to master :) [18:25:25] RE: php-mc 3 [18:25:32] It also kinda needs moving [18:25:54] I'll do it :) [18:33:23] Why is wikibugs double posting? 
[18:33:24] https://gerrit.wikimedia.org/r/#/c/437803/ [18:33:27] https://gerrit.wikimedia.org/r/#/c/437804/ [19:03:02] MaxSem: https://github.com/me-shaon/GLWTPL/blob/master/LICENSE [20:41:43] Reedy quick question if i set wgAuthenticationTokenVersion to 1 would that work? [20:41:54] Define work [20:42:29] It's currently null but it doesn't explain here https://github.com/wikimedia/mediawiki/blob/7793c8acc6d21c451cd5737fd6b98b1a7d9a5e00/includes/DefaultSettings.php#L4929 if it's a boolean or a number. [20:42:36] and by work i mean log everyone out :) [20:43:01] https://github.com/wikimedia/mediawiki/blob/master/maintenance/resetUserTokens.php#L33 says to use it [20:43:19] * @var string|null [20:43:45] ah thanks so '1' [20:43:56] (i missed that param heh) [20:44:30] Why not just run the invalidateUserSessions.php script? [20:44:40] that's per user [20:44:56] Meh [20:45:02] we need all users to have their sessions invalidated due to a stupid issue i introduced by mistake heh [21:03:11] mobrovac: btw, is there a dashboard like https://grafana.wikimedia.org/dashboard/db/job-queue-health?orgId=1&refresh=1m for the new job queue? I found a more generic dashboard, but that seemed more useful for those working on the service, as opposed to devs I think. [21:03:22] there's a few though, I may've missed it [21:04:00] Krinkle: we have https://grafana.wikimedia.org/dashboard/db/jobqueue-eventbus?orgId=1 [21:04:35] mobrovac: Okay. That's closer to what I was looking for, thanks :) [21:04:52] :) [21:04:58] mobrovac: Mind if I add a row on top for a broader overview? (assuming I can get the queries to work) [21:05:28] Krinkle: that'd be awesome! [21:50:00] mobrovac: What are the topics that have an extra ".change-prop.partitioned." between "eqiad" and "mediawiki.job.{type}", and how do those relate to the regular variants of the same job type? 
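The `$wgAuthenticationTokenVersion` exchange above works because the token a client presents is derived from the user's stored secret together with the site-wide version string, so changing the version string invalidates every previously issued token at once. The following is a schematic Python sketch of that idea, not MediaWiki's exact implementation (the function name and HMAC construction are illustrative):

```python
# Schematic sketch of why setting an authentication token version logs
# everyone out: the effective session token combines the per-user secret
# with the site-wide version, so bumping the version changes every token.
# Mirrors the idea behind $wgAuthenticationTokenVersion, not the real code.

import hashlib
import hmac

def session_token(user_secret, token_version):
    if token_version is None:
        # version unset (the default): the per-user secret is used directly
        return user_secret.hex()
    # version set: derive the token from secret + version
    return hmac.new(user_secret, token_version.encode(), hashlib.sha256).hexdigest()

secret = b'per-user-secret'
old = session_token(secret, None)  # tokens issued before the config change
new = session_token(secret, '1')   # tokens computed after setting it to '1'

print(old != new)  # True: every existing session token stops validating
```

The advantage over `resetUserTokens.php` is that no per-user database writes are needed; the advantage over `invalidateUserSessions.php` is that it covers all users in one config change.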
[22:10:27] no_justification: https://gerrit.wikimedia.org/r/#/c/437867/ I left a comment about the patch mode since it looks wrong to me [22:12:06] A new patch release would involve committing to vendor's release branch, updating core submodule (automatically?) then tag and release from core [22:12:12] The core tag is what pins the submodule sha1 [22:12:36] makerelease shouldn't need to clone vendor if vendor is a submodule [22:16:46] ok [22:17:09] so we need to add it as a submodule to REL1_31 [22:17:34] why did we need a patch to makerelease.py again? [22:19:38] To remove the composer bits [22:20:05] We don't need to composer update anymore :) [22:21:54] do we need to worry about older branches? [22:22:29] Eh, only for "supported" [22:22:31] branches [22:22:43] I ripped out a *bunch* of like 1.26 and below support recently [22:26:04] * Krinkle tries to understand [22:26:22] so for a patch release, we only need to do git submodule update --init --recursive [22:26:39] for a new release branch, clone and add submodule (or do we do that outside the script) [22:30:17] no_justification: should I just add the vendor submodule to 1.29 and 1.30 then? [22:36:27] Would be easiest and most consistent [22:36:50] Krinkle: Correct. For new *release* you just clone the repo recursively. [22:36:57] For a new *branch*, you run the branching script [22:37:10] Basically, makerelease is going to clone stuff, tar a few things and gpg sign them [22:37:22] (I wanted to package this even easier with `git archive` but that doesn't do submodules) [22:37:41] 99% of the complexity in makerelease is gone [22:38:05] (I had lots more plans here, but my time runs short) [22:42:04] https://gerrit.wikimedia.org/r/#/q/topic:vendor-submodule [22:48:41] no_justification: Wanna sit https://gerrit.wikimedia.org/r/#/c/421949/ tomorrow? [22:48:44] (puppet swat) [22:50:43] thanks James_F [22:50:45] I can be on IRC. 
It's so freaking trivial :) [22:53:35] Reedy: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/+/0d08079ec056a23d52c53b79d1e339a93d92487e%5E%21/wmf-config/mc-labs.php [23:05:34] legoktm: Thanks for making the world a bit better. :-) [23:07:20] Krinkle: yay, bugs