[14:28:41] mdholloway: niedzielski-afk: bearND|afk: looks like we're in 'unbreak now' mode. i'm unable to log in or save edits in the app.
[14:28:59] coreyfloyd: ^ can you check whether you're able to save an edit in the iOS app?
[14:29:55] dbrant: yeah
[14:30:06] dbrant: i don't believe AuthManager was merged yet but i hope anyone reading this will correct me if i'm mistaken...
[14:30:50] dbrant: no, it hasn't been, it's something else
[14:32:09] dbrant: I was able to save but I was getting weird "no internet connection" as I did it
[14:32:22] I'm mobile right now so I can't say what it is
[14:33:38] dbrant: coreyfloyd: testing with postman, the api appears to be working fine
[14:34:00] dbrant: can you confirm the text on the test wiki article "Test" is "Hello work again!"? I want to make sure it's not a cache issue
[14:34:08] dbrant: coreyfloyd: for login, at least
[14:34:17] mdholloway: or if you can? ^
[14:34:46] "World" not "work"
[14:34:52] yep: "Hello world again!"
[14:34:54] Typo
[14:35:51] mdholloway: ok so I just did that, but I did get weird internet connection errors so it may be something with the flow that is blocking you but the iOS app ignores.
[14:36:31] I'm walking to the gym now so I can help more in like 90 minutes.
[15:03:15] mdholloway bearND dbrant: the login issue repros over here too :(
[15:04:34] niedzielski: dbrant: yeah, i'm unable to log in from the app either, to be clear. i assume we're all seeing the same SocketTimeoutException in logcat?
[15:05:00] mdholloway: that's what it seems to be in all scenarios on my side
[15:05:15] i'm seeing IOException - REFUSED_STREAM
[15:07:42] dbrant: i get refused stream on a second attempt.
the first attempt is just SocketTimeoutException
[15:10:15] niedzielski: dbrant: i hadn't seen the REFUSED_STREAM but just repro'd the same behavior as niedzielski now when attempting to save / retrying
[15:15:28] niedzielski: dbrant: if I hard-code site.secureScheme() to false when constructing the Api object in WikipediaApp.getAPIForSite(), I can log in / edit
[15:15:54] mdholloway: does that mean that HTTP works but HTTPS doesn't?
[15:16:00] niedzielski: yep
[15:20:25] but i'm using HTTPS to log in successfully through the api with postman, so i'm afraid it looks like the problem is somewhere in the app's networking layer
[15:22:19] mdholloway: niedzielski: on iOS, i'm able to log in without issues. When editing, and showing the preview, the iOS app shows an error of "No internet connection", but it seems to tolerate that fault, and allows me to continue saving anyway...
[15:24:30] dbrant: niedzielski: i wonder if they're (still) showing 'no internet connection' as a catchall like we were a while back...
[15:45:17] dbrant: niedzielski: when was the last time you logged in or edited successfully via the app? unfortunately i haven't touched login or editing since before the offsite
[15:45:49] mdholloway: one of my apps was logged in this morning but i'm not sure when i actually re-logged in :/
[15:46:00] same; at least prior to the offsite
[15:46:32] dbrant: niedzielski: btw, looks like 'there's no internet connection' can mean a few things in the iOS app including a timeout: https://github.com/wikimedia/wikipedia-ios/blob/da2b7fadcb39356c5535eddd8926a2f9990a30de/Wikipedia/Code/NSError%2BWMFExtensions.m#L42-L63
[15:47:40] dbrant: niedzielski: and yeah, i can repro the same on iOS (errors being displayed but somehow powering through them)
[16:14:14] mdholloway dbrant bearND: curling the (deprecated) way the app logs in seems to work fine
[16:19:17] niedzielski: dbrant mdholloway: I'm wondering if T134246#2266391 has to do with this.
With wmf.23 being rolled back the associated patch is missing on now active wmf.22
[16:19:17] T134246: Changing the email addresses sends both emails to the old address, none to the new address - https://phabricator.wikimedia.org/T134246
[16:19:37] https://phabricator.wikimedia.org/T134246#2266391
[16:20:09] bearND: hm, good idea
[16:22:37] niedzielski: hmm, never mind. I see https://gerrit.wikimedia.org/r/#/c/287218/ has been merged to wmf.22
[16:22:45] bearND: i was looking at that too but i thought the fix was backported
[16:22:46] yeah
[16:28:37] bearND dbrant mdholloway: if i set explicit timeouts on the okhttp client, it acts the same way but spins a lot longer waiting for a response
[16:36:48] niedzielski: bearND: dbrant: https://github.com/square/okhttp/issues/2543
[16:38:59] mdholloway: hm, does HTTP (not HTTPS) working support this bug as the culprit?
[16:42:57] niedzielski: not sure.
[16:43:04] niedzielski: got there from here: https://github.com/square/okhttp/issues/2506
[16:43:17] niedzielski: although afaik we're a ways off from moving to http/2
[16:47:28] niedzielski: otoh: https://phabricator.wikimedia.org/T96848
[17:01:34] bearND: are you far enough along in the okhttp upgrade patch that you could test login?
[17:02:14] bearND: i'm trying to set logging on the okhttp client but it doesn't seem to be working
[17:03:27] niedzielski: I can run with retrofit2 and okhttp3 fine. The only blocker is getting some proguard settings adjusted to make CI happy.
[17:03:40] bearND: does login work?
[17:06:07] niedzielski: it failed the first time, not it's hanging, probably going to time out
[17:06:15] now*
[17:07:37] bearND: er actually, i guess logging works in general for okhttp2 but i'm not getting any output for the login sequence specifically
[17:08:35] dbrant: mdholloway how is the bug work going?
[17:08:41] niedzielski: LoginTask still uses AsyncTasks, not retrofit
[17:09:09] bearND: right but it uses okhttp
[17:09:33] bearND: sorry, the logging i was talking about was on the okhttp client
[17:09:38] niedzielski: yes
[17:10:41] coreyfloyd: we're still digging. my theory is that the problem has to do with our networking client library not playing nicely with nginx in certain situations over http/2; going to go over to operations and ask bblack about it after our standup, i think
[17:11:51] Does anyone know what version of nginx we're running? Saw some issues reported for 1.9.5
[17:15:09] bearND: from a skim it looks like it was upgraded to 1.9.5 last week as part of https://phabricator.wikimedia.org/T96848
[17:15:25] dammit, finding headphones...
[17:28:51] niedzielski: OkHttp logging is only turned on for Retrofit. I'll add another patch to enable it for the remaining calls, though
[17:29:38] dr0ptp4kt: have you been following the http/2 rollout (and related nginx stuff)?
[17:30:38] mdholloway: not particularly closely
[17:31:15] mdholloway: hmm
[17:31:16] https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Annoying_.22Secure_connection_failed.22
[17:31:32] and i noticed some SPDY error in my chrome console this weekend.
[17:31:47] mdholloway: https://github.com/square/okhttp/issues/1844 mentions the nginx issue I alluded to earlier. Look for 1.9.5 on that page
[17:32:38] all related to 1.9.5 perhaps... sounds a little bit too coincidental
[17:32:45] thedj: interesting... thanks for the heads-up
[17:33:45] mdholloway bearND does the legacy version of the app use okhttp and is it affected?
[17:34:10] that is the pre android os 4.whatever version
[17:35:12] dr0ptp4kt: yes, we've been using okhttp for a long time
[17:37:04] dr0ptp4kt: yeah, the gingerbread release also uses it
[17:37:52] mdholloway bearND dbrant: not sure if because of http or different MW, but http://wikipedia.beta.wmflabs.org works
[17:38:40] btw.
are we using certificate pinning? I believe support for that was added to okhttp3
[17:39:33] not that our apps handle that much confidential data.
[17:41:20] thedj: Not at this time. I'm working on moving our code to use retrofit2 and okhttp3.
[17:41:33] k.
[17:42:31] if you ever do, the documentation for that is not exactly in sync with the implementation my colleague informed me, be sure to double check the implementation :)
[17:46:51] thedj: thanks for the tip
[17:55:10] bearND: in case you filter gerrit emails, i updated the mcs redirect handling patch
[17:55:48] mdholloway: great. thanks
[18:07:56] mdholloway: did you already ping bblack? if not, i think i'll give him a shout
[18:08:15] niedzielski: not yet, go for it
[18:08:39] niedzielski: i'm working away at setting our okhttpclient not to use http2 and see how that goes
[18:08:55] mdholloway: my version of curl doesn't actually support http2
[18:09:05] mdholloway: i'd have to compile in support
[18:09:45] mdholloway: the curl tests i ran were all on http1.1
[18:10:09] niedzielski: maybe that's why it was working?
[18:10:38] mdholloway: yeah, that's what i was wondering
[18:30:02] niedzielski: bearND: dbrant|brb: proposed workaround: https://gerrit.wikimedia.org/r/#/c/287675/
[18:33:07] mdholloway dbrant bearND: do we want to cherry pick this?
[18:34:35] if it works, then totally. I would also propose that we cherry pick my facepalm patches from last sprint...
[18:35:07] niedzielski: i think the default set is Protocol.HTTP_2, Protocol.SPDY_3, Protocol.HTTP_1_1 so could do either way -- yours may be a bit more elegant
[18:35:29] mdholloway: so it doesn't include HTTP_1_0?
[18:36:06] nope, looks like it'll actually throw an exception if you try to set 1.0
[18:36:28] mdholloway: ok then i'm good :)
[18:36:42] sounds good to me
[18:37:00] mdholloway: would you name the phab tasks in the commit msg?
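The workaround discussed above amounts to narrowing the client's protocol list so that it never negotiates HTTP/2 with the server. The sketch below shows the same idea with the JDK 11+ `java.net.http.HttpClient` as a stand-in; it is not the app's code (the app used OkHttp, and the actual patch is the Gerrit change linked above), and the class and method names `Http11ClientSketch` and `buildHttp11Client` are hypothetical.

```java
import java.net.http.HttpClient;

/**
 * Sketch only: builds an HTTP client that prefers HTTP/1.1, so it does not
 * try to negotiate HTTP/2. Uses the JDK 11+ client as a stand-in for OkHttp;
 * this is NOT the Wikipedia app's implementation.
 */
public class Http11ClientSketch {

    /** Hypothetical helper: returns a client whose preferred version is HTTP/1.1. */
    public static HttpClient buildHttp11Client() {
        return HttpClient.newBuilder()
                // Without this call the JDK client prefers HTTP/2 by default.
                .version(HttpClient.Version.HTTP_1_1)
                .build();
    }

    public static void main(String[] args) {
        HttpClient client = buildHttp11Client();
        // Report the protocol version the client will ask for.
        System.out.println("Preferred protocol: " + client.version());
    }
}
```

In OkHttp the corresponding step would be reducing the default protocol set mentioned above (`Protocol.HTTP_2`, `Protocol.SPDY_3`, `Protocol.HTTP_1_1`) to just `Protocol.HTTP_1_1`; the exact call depends on the OkHttp version in use.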
[18:37:15] dbrant: sure
[18:37:31] dbrant: niedzielski: bearND: also +1 to cherrypicking the Facepalm Series
[18:38:04] dbrant: just to confirm facepalm = ccf68e9e3c55c72043ad6614ad5a7dfa53e47c93 b37847710edec27e0be77f6f70094e1ba1bc48ff
[18:38:38] niedzielski: yep
[18:42:39] mdholloway dbrant bearND: looks good. all good for merge?
[18:42:53] niedzielski: think i just need to add a phab task
[18:44:13] dbrant: niedzielski bearND: do we have a phab task for the app issue or should i create one (referencing the http/2 switch task)?
[18:45:11] mdholloway: phab tasks are already there; i just assigned to you.
[18:45:26] coreyfloyd mhurd ^^^ not sure how applicable to ios people, but we're forcing android to HTTP1.1 for now
[18:55:28] mdholloway dbrant bearND: any objections to merging?
[18:55:56] niedzielski: nope, just tested login and editing one last time for good measure and it seems sound
[18:56:03] +
[18:58:14] ok here goes
[18:58:18] niedzielski: interesting... that was for the weird edit connection error issue?
[18:59:19] mhurd: yes, we think the connection issue you guys see in the ios app is related
[18:59:49] niedzielski: go for it
[19:00:01] niedzielski: gotcha thanks!
[19:34:40] bearND: are you planning on deploying today or waiting to do it after puppet swat tomorrow?
[19:35:36] mdholloway: I'd like to deploy it today. I'll come up with a simple patch we could use for tomorrow
[19:37:22] bearND: ok, cool. i think i remember tyler saying having the scap config in the deploy repo before the puppet change won't hurt anything
[20:06:39] mdholloway: hmm, having internet issues right now. I think we better wait until tomorrow. Hope gwicke doesn't mind.
[20:07:44] bearND: i can deploy too
[20:09:30] mdholloway: ok, if you don't mind doing it then go ahead
[20:09:49] bearND: k
[20:10:54] thanks, mdholloway
[20:15:50] mdholloway dbrant bearND: will be pushing beta soon.
release is up here https://phabricator.wikimedia.org/diffusion/APAW/history/master/;beta/2.1.144-beta-2016-05-09
[20:19:17] niedzielski: mdholloway: bearND: when do you think we should push to production? (- pass it through TSG? - let it sit overnight, then push tomorrow morning? - push immediately?)
[20:19:45] dbrant: i'm up for pushing it pronto
[20:19:55] dbrant: maybe give it a spin on your end first though
[20:21:26] niedzielski: dbrant: bearND: ditto
[20:21:45] niedzielski: when following your link i don't see anything useful for the top 4 commits. Just says "Importing..."
[20:22:11] bearND: you might need to check it out locally while diffusion is processing
[20:26:06] dbrant: do we still update release notes per language?
[20:27:35] niedzielski: we do not :( there's no clear process to do this yet, outside of TWN
[20:28:21] dbrant: ok makes sense. should i open a phab to remove the release notes from our strings.xml?
[20:28:40] niedzielski: yep, good call
[20:28:52] dbrant: cool, thanks
[20:31:07] mdholloway dbrant bearND: i'm about to push the beta unless i hear any objections
[20:31:24] niedzielski: dbrant: bearND: none here
[20:31:28] niedzielski: dbrant: bearND: we'll probably want to do a GB maintenance release with the OkHttp protocol workaround as well, right?
[20:31:49] niedzielski: dbrant: bearND: it's hard to say what priority Square is putting on it
[20:31:57] mdholloway: i vote no. this affects editing abilities not browsing
[20:32:06] (i.e., whether it will be fixed in the next day, month, etc.)
[20:32:25] niedzielski: no objections.
[20:32:57] mdholloway: have we heard from server side folks on when we would move to a newer version of nginx?
[20:33:38] bearND: from the logs/gerrit i think we just went up to 1.10.0 last week; haven't pressed the issue
[20:34:32] mdholloway: do you have a link for that?
[20:34:34] dbrant: for release notes, are you cool with something like "Fix login and editing issues due to recent server changes."?
[20:34:49] niedzielski: dbrant: bearND: as for the gingerbread release, i won't lose sleep if we don't patch either
[20:34:59] niedzielski: yep, that's fine
[20:35:31] mdholloway: good thinking on that btw
[20:35:32] bearND: https://gerrit.wikimedia.org/r/#/c/286508/
[20:36:00] bearND: https://phabricator.wikimedia.org/T96848#2257672
[20:36:44] mdholloway: great. thanks
[20:39:38] bearND, mdholloway, niedzielski, dbrant Is there any known problem why I can't login using the Wikipedia app (stable/beta/alpha)? (There's at least one user who has the same problem)
[20:39:39] dbrant: the prod apk is in the alpha tab as per usual. you can make the call as to when to promote it :)
[20:39:40] :/
[20:40:05] FlorianSW: we've just spent the day solving it :)
[20:40:19] FlorianSW: https://phabricator.wikimedia.org/T134759
[20:40:29] :P Great :)
[20:40:44] Ah, that's why I haven't found the task, I was looking for "login" only, not for "log into" :P
[21:06:52] mdholloway: niedzielski-afk: bearND: about to promote to production, unless anyone objects.
[21:07:13] dbrant: no objection
[21:07:16] \o/
[21:45:30] mdholloway: niedzielski: bearND: rolled out to production. Nice work, all!
[21:45:41] \o/
[21:45:51] 👍
[21:59:40] mdholloway: o/ still around by chance? i'm digging into the saved page service review and wanted to batcave if you have a moment. if not, that's cool. i can hit you up tomorrow too!
[21:59:59] niedzielski: still here, let's hit the batcave!
[22:00:07] yay! omw
[22:24:04] who's familiar with the transformations done to mobile html content after it's taken from the parser cache? is that still in php-land or moved to an external service?
[22:24:14] trying to figure out what's up with this srcset stuff
[22:24:39] the current hack 'breaks the wall' of the parser cache, inserting mobile-specific stuff that can affect desktop and vice versa
[22:25:06] and ori's apparently telling me folks don't want to recache data?
[22:25:53] so there's no hash key element to prevent mobile and desktop from interfering when the hack applies
[22:26:02] and my patch that adds one is -1'd for having it
[22:38:59] brion: hi there. there are a couple of different things you might be referring to and i might know about one or more of them -- do you mean the removal of srcset attributes for mobile web page requests? (https://gerrit.wikimedia.org/r/#/c/270793/3/wmf-config/mobile.php)
[22:39:32] brion: or stuff being inserted?
[22:40:25] mdholloway: removal of srcset attributes is currently accomplished in MobileFrontend with a hook into thumbnail image output
[22:40:45] this affects HTML that goes into the parser cache, conditionally on whether we're on mobile or not
[22:41:01] the stored parser cache entry may then be read out either on a future mobile request or a future desktop request
[22:41:09] which feels like a big ol' bug but ori says it's intentional
[22:41:27] see https://gerrit.wikimedia.org/r/#/c/286502/
[22:41:57] curious
[22:42:05] mdholloway: https://gerrit.wikimedia.org/r/#/c/270793/3/wmf-config/mobile.php appears to duplicate the hack that's already in MobileFrontend
[22:42:09] do you know why both are being used?
[22:42:50] in both cases, it would appear to break the parser cache
[22:43:38] brion: i don't; sorry, i rarely work on MobileFrontend.
[22:45:07] man i am super confused
[22:45:21] cause i see immediately following that on current version:
[22:45:22] $wgRenderHashAppend .= '!responsiveimages=0';
[22:45:31] which would seem to be separating the parser cache
[22:45:34] and it was put in by ori
[22:45:36] so .... what?
[22:47:34] if jdlrobson is around he might have more of a clue ...
my involvement with ori's patch consisted of making sure it didn't alter the html sent to the apps, which depend on srcset
[22:48:46] fun times
[22:49:02] ok so i guess ori shouldn't object to a parser cache separation since he put one in already
[22:49:08] i'm just waiting for response from him now :)
[22:51:46] hi mdholloway
[22:53:30] jdlrobson: hey jon -- brion has some questions about srcset removal in MobileFrontend and i'm not familiar enough with what's going on there to answer them -- do you know why https://gerrit.wikimedia.org/r/#/c/270793/3/wmf-config/mobile.php is the way it is when apparently something similar could be accomplished in MF?
[22:57:32] anyway -- i don't have much of value to add here.
[23:01:00] niedzielski: \o/ \o/ \o/
[23:01:07] :D
[23:01:26] * mdholloway goes to run a victory lap around the house
[23:02:30] lol
[23:04:37] all right everyone, good night and good luck.