[00:03:33] it's April 28th now for Phabricator
[00:04:02] yep
[00:04:14] it can tell the time, that's a good start
[00:18:15] "You have made too many recent login attempts. Please wait 5 minutes before trying again." <- while trying to change my password, really?
[00:18:25] I don't think that should be counted as a login attempt.
[00:22:06] It's all lumped in with the same counter
[01:02:56] I'd like the sidebar stuff to be on top or bottom, not on the side. I can't stand layouts that steal horizontal space from content, partly because I use a vertical monitor. Any suggestions?
[01:12:30] hi all
[10:28:25] Hi guys, got a bit of an issue with MediaWiki templates. I've got Semantic MediaWiki turned on, and what I want to achieve ought to be defined by [[{{{1}}}|{{{2}}}]]
[10:28:42] i.e. a link, with a parameterised URL and text
[10:29:13] (using this for links in a table as a result of {{#ask: }})
[10:29:27] any suggestions why the above doesn't work?
[10:29:59] I would say "I'll just give you access to the wiki to see it", but I can't do that as it's internal
[10:31:52] hi
[10:32:17] is there a way to dynamically transclude subpages into a page?
[10:32:49] cm13g09-work: then ask in the SMW IRC channel
[10:33:28] Juandev: ok - I thought this was probably more a generic templating question than an SMW one, though - hence why I came here
[10:33:48] cm13g09-work: I am not pushing you away
[10:34:10] cm13g09-work: wait two more hours, I guess that's the time when the US crew wakes up
[10:35:06] ok Juandev
[13:01:22] can Lua detect the skin the reader is using, e.g. Vector, MonoBook, the mobile skin, etc.?
[13:07:10] Don't think so.
[13:07:38] Lua is meant to be user-independent and page-independent, I believe.
[13:07:52] s/page/instance/ I guess.
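
An aside on the Lua/skin question above: Scribunto runs server-side and its output goes into the parser cache, which is shared across skins, so a module has no reliable way to know the reader's skin. Client-side JavaScript can check, though. A minimal sketch, assuming it runs as a gadget or user script on a page where ResourceLoader's mw object is available; the MonoBook branch is purely illustrative:

    // Client-side only: Lua/Scribunto output is parser-cached
    // independently of skin, so this check has to happen in the browser.
    var skin = mw.config.get( 'skin' ); // e.g. "vector", "monobook", "minerva"
    if ( skin === 'monobook' ) {
        // skin-specific behaviour goes here
    }
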
[14:48:27] Image uploads have URLs that end in .JPG or .PNG, and Google does not seem to index these pages (except if you're Wikipedia). Has anyone had to deal with this?
[16:58:10] anyone know of a good way to save an old revision of a page under a new page name, besides cutting and pasting?
[17:11:39] SUSY: I don't believe there's a way to do a "save as" sort of thing.
[17:16:12] thanks Yaron. I found out that trying to retitle an old revision does not make a copy under a new page name, but instead just reverts the current page to that revision. :-(
[17:16:44] cutting and pasting is not that hard though, but I was looking for the *proper* way to do it, I guess.
[17:53:23] Yaron: I'm having some trouble suppressing the link to view the latest rev of a page with the Approved Revs extension. Can you see my comment on the extension's Discussion page when you get a minute?
[18:14:47] wmat - what version of Approved Revs are you using?
[18:15:28] first time I see Yaron chatting :D
[18:15:40] Really?
[18:15:44] I'm here all the time.
[18:15:58] I've even talked to you.
[18:16:01] I saw that but never saw you chat
[18:16:02] you did?
[18:16:05] omg :|
[18:16:13] I'm sorry, I don't remember
[18:16:16] Yaron: the REL1_22 branch
[18:16:40] biberao: it wasn't even that long ago, I don't think... I mean, I remember your name.
[18:16:53] I apologize then :D
[18:17:00] wmat: what version number shows up on the page Special:Version?
[18:17:53] biberao: I might be wrong, never mind... I mean, you would probably remember.
[18:18:15] Yaron: Version 0.6.5 (075c175)
[18:18:27] Ah, okay.
[18:18:46] well I know who you are
[18:19:09] I'm the guy who never chats. :)
[18:19:37] wmat: well, there's a newer version of AR, 0.6.6, although I don't know if that's the issue...
[18:20:03] Yaron: no, I've googled you before :D
[18:20:23] Alright.
[18:23:45] Yaron: I checked out the master branch (0.7.0) but the msg is still on the page
[18:27:15] wmat: I don't know - I just tried things with your settings, and it works correctly for me.
[18:27:35] Could it be that those settings you have are getting overridden further down in LocalSettings.php?
[18:27:58] I'll check, but I don't think so
[18:28:53] the 'viewlinktolatest' GroupPermissions setting only appears in the one place
[18:29:49] What happens if you comment out those other five settings lines? Does that have an impact?
[18:30:55] aha, the msg is gone now
[18:31:32] Oh.
[18:32:00] Yaron: you could come to my country and advise ppl using MediaWiki :D
[18:32:15] wmat: Well, here comes the fun part... could you isolate which of those lines (assuming it's just one) is causing the problem?
[18:32:27] already trying ;)
[18:32:31] Ah!
[18:32:41] biberao: what's the country?
[18:32:46] Portugal
[18:33:23] Ooh, nice, saudade.
[18:33:46] I don't think I can come anytime soon, though, unfortunately.
[18:34:02] you've been here?
[18:34:48] No, but I really want to visit... maybe "saudade" is the wrong word.
[18:35:16] well
[18:35:31] saudade is the name we give to it
[18:35:41] but then the way you would say it would be "saudades"
[18:35:52] the correct way, I mean
[18:35:54] Ah, okay.
[18:36:21] like imagine you would say "I miss Portugal"
[18:36:34] "Eu sinto saudades de Portugal"
[18:36:48] we have the most beautiful beach in the world ;D
[18:37:04] Man.
[18:37:17] wrong place for chatting, I bet you'll say that
[18:37:19] I'm sorry :D
[18:38:10] Hey, no problem - maybe someone else would say that, though. :)
[18:38:35] But seriously, I would love to go to Portugal.
[18:38:40] I thought about that when you said "Man." - it would mean something bad was coming :D
[18:39:00] Ah, no - I was just thinking about how great it would be to be on a beach right now.
[18:39:02] how do you insert the translation of the page name into the wikicode? I know {{PAGENAME}} doesn't work
[18:43:34] Yaron: $egApprovedRevsBlankIfUnapproved = true;
[18:43:42] that seems to be the culprit
[18:43:47] Ah... that would have been my guess.
[18:43:53] varnent: -> http://www.mediawiki.org/wiki/Help:Extension:Translate/Page_translation_administration ?
[18:44:00] I really need blank unapproved pages though :/
[18:45:24] wmat: ah, yes, I see the same behavior now too.
[18:45:27] biberao: I didn't see anything there - I know I've seen it in wikicode somewhere - but I cannot remember where
[18:45:30] I was looking at the wrong thing.
[18:46:22] wmat: I think this is actually a feature, not a bug... the idea is that people should always know why the page is blank.
[18:46:37] I could be wrong, though - I really don't remember all the details.
[18:47:06] Yaron: the reason I need it is so that the general public can't click through to see revisions
[18:47:23] Well... won't they be able to see the history tab anyway?
[18:47:24] the requirement is for "staging" pages, sort of
[18:47:51] yeah, they will, but that's an acceptable risk
[18:53:11] wmat: okay, this looks like just a bug - there should be a check of 'viewlinktolatest' when it goes to display the message for blank pages, but there isn't.
[19:24:29] Yaron: that's great :)
[19:34:03] Yaron: is that an easy fix?
[19:34:33] Yes... probably just one or two lines; I just need to add it in.
[19:36:20] that would be much appreciated ;)
[19:36:35] also, bought the new rev. of your book. Excellent work.
[19:37:43] huh, wmat has a username
[19:39:46] wmat: is it 2013 or 2014?
[19:40:28] Juandev: username where?
[19:40:45] wmat: you are not Wikimedia AT?
[19:40:51] biberao: the book? 2014
[19:40:55] okay
[19:40:58] thanks
[19:41:36] Juandev: what's that?
[19:42:30] wmat: it's an organisation
[19:42:51] Juandev: what does the AT stand for?
[19:42:57] Austria
[19:43:02] ah
[19:43:15] no, I'm in Canada :)
[19:44:12] sound of music :X
[19:45:39] perhaps I need a new nick: wmca ;)
[19:46:23] sure
[19:46:32] :-)
[20:21:34] wmat: thanks! sorry, I missed what you wrote before.
[20:22:57] np
[20:23:22] wmat: I have a question, actually: right now, when the page is blank, the message at the top says, "No revision has been approved for this page. View the most recent revision."
[20:23:52] yes
[20:24:06] If 'viewlinktolatest' is turned off for that user, do you think it's better for the page to just say, "No revision has been approved for this page", or not to have a message at all?
[20:24:29] I prefer no message at all
[20:25:08] I suppose that makes sense...
[20:26:17] it'd be useful (for me anyway) if History could be suppressed completely for pages with no approved revisions
[20:26:51] That's... more extreme than I was planning with this extension. :)
[20:27:15] I feel like I'm veering into CMS-like functionality, which I want to avoid, tbh
[20:27:59] Well, if the goal is to make the wiki essentially a publishing platform, you could have a 2nd skin - one that doesn't have any of the standard wiki links/tabs - that's just shown to non-registered users.
[20:28:13] That's mentioned in my book, actually.
[20:28:27] oh? I haven't seen that part yet :)
[20:28:45] It's in the "skins" section... right at the end, I think.
[20:29:03] * wmat looks
[20:31:47] interesting
[20:33:43] It may or may not work for you, depending on what exactly the needs are.
[20:35:29] it may at a later date, but right now, I just have to give a small group of users the ability to collaborate on a page until it's ready to be made public, i.e. the approved revision
[20:35:56] and I don't want to go as heavyweight as FlaggedRevs
[20:45:40] wmat: okay, I just checked in what I believe is a fix.
[20:46:30] Yaron: thanks! Indeed, it seems to work.
[20:46:39] Okay, great!
[20:48:26] should non-logged-in users be able to see unapproved pages?
[20:54:27] also, anonymous users can see Special:ApprovedRevs
[21:16:20] I'm having some trouble with the API: I'm specifying revids, and I've specified rvdir="older", but the list of revisions I get is still ordered with the oldest revision first, rather than the newest. Is there an easy way around the problem?
[21:18:34] Nihiltres: do you have an example call?
[21:18:52] Nihiltres: rvdir is for revisions of the same page (when you specify one page and you want all its revisions); I don't think it would work for revids from different articles
[21:19:58] They're all from the same article; I'm just calling them by ID to avoid bad assumptions
[21:20:39] historyCall.get({action: "query", prop: "revisions", revids: revsToGet.join("|"), rvprop: "size", rvdir: "older"})
[21:20:52] ^ That's the main block of the call I'm using
[21:22:30] I'm making two API calls in succession: one to get a list of revisions, and the next to get the size of the revisions from the parentids of the first call's revisions
[21:22:50] That way I can get a size diff for each revision, which isn't offered by the API
[21:23:57] but in testing I discovered that the revids call is being automatically ordered in reverse of the first one, to which rvdir does apply
[21:24:26] which is frustrating, because you'd assume with a list of revids that you'd get them back in the order they were supplied :/
[21:24:48] Nihiltres: reverse the other query, maybe? :-) or just match them up with a dict/hashtable
[21:25:54] Nihiltres: rvdir=older only applies when you're listing multiple revisions of the same page
[21:26:04] Like &titles=Foo&rvlimit=50
[21:26:21] If you have a bunch of revids in the URL, it'll be ignored
[21:27:17] RoanKattouw: Yeah, I know… it's just the way the call is constructed, because they are all revisions of the same page :P
[21:27:36] Is there a way to force the order of a call that uses revids?
[21:29:45] No, I don't think there is
[21:30:01] It's possible that the order in which you specify the revisions matters, but I don't remember if it does
[21:32:26] It doesn't, because I'm specifying them in order and I tried specifying them in reverse
[21:32:54] I guess I get to add an extra preprocessing step of reversing an array :P
[21:34:51] Well, thanks anyway :)
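
A rough sketch of the reordering step Nihiltres settles on above, reusing the names from the pasted call. Assumptions: historyCall is an mw.Api-style client whose .get() returns a promise, all revisions belong to one page, and rvprop gains "ids" so each returned revision carries its revid (the ignored rvdir is dropped):

    historyCall.get( {
        action: 'query',
        prop: 'revisions',
        revids: revsToGet.join( '|' ),
        rvprop: 'ids|size'
    } ).then( function ( data ) {
        // The response groups revisions under query.pages in the API's
        // own order, so index them by revid first...
        var byRevid = {};
        Object.keys( data.query.pages ).forEach( function ( pageId ) {
            ( data.query.pages[ pageId ].revisions || [] ).forEach( function ( rev ) {
                byRevid[ rev.revid ] = rev;
            } );
        } );
        // ...then reassemble in the order the revids were supplied, so
        // ordered[ i ].size lines up with revsToGet[ i ].
        var ordered = revsToGet.map( function ( id ) {
            return byRevid[ id ];
        } );
    } );

Keying each revision by its id sidesteps the ordering question entirely: the response order stops mattering.
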
[22:25:41] Krenair: a bit less noisy in here, if you don't mind
[22:25:45] sumanah, okay
[22:26:14] Krenair: basically I am trying to think through how a MediaWiki developer profiles their code in case they want to watch out for performance issues
[22:26:50] I'm watching http://commons.wikimedia.org/wiki/File:MediaWiki_Performance_Profiling.ogv which might be getting a bit obsolete
[22:26:56] there's https://www.mediawiki.org/wiki/Manual:How_to_debug#Profiling of course
[22:28:07] I'll see what Reedy tells me when he's done flushing old stuff from performance.wikimedia.org/profiler/report
[22:28:31] I feel like, with front-end performance, I gotta look at stuff like http://ljungblad.nu/post/83400324746/80-of-end-user-response-time-is-spent-on-the-frontend and ResourceLoader hints
[22:29:02] but backend stuff - is http://gdash.wikimedia.org/ useful?
[22:29:41] so, I'd like to create a list that self-populates based on, maybe, input to a specified mailing list. I don't want to require that those writing to that list have accounts on the wiki or need to be able to read PHP/HTML in order to fill out the list. That is why I'm trying to find an easy alternative.
[22:29:42] sumanah: non-NDA people don't have access to the fun stuff :(
[22:29:49] I feel this will require some sort of script.
[22:30:07] https://bugzilla.wikimedia.org/show_bug.cgi?id=54713
[22:30:16] I'm hoping there's some advanced built-in feature I don't know about that does something similar?
[22:30:36] MatmaRex: well, that's annoying
[22:30:51] MatmaRex: I guess you could sign an NDA?
[22:31:14] sumanah, gdash looks helpful
[22:31:24] though I don't think I've used it myself
[22:31:27] SUSY: hi there! when you say "create a list" - you mean, like, a bulleted list on a wiki page?
[22:31:28] sumanah: signing NDAs doesn't scale :P
[22:31:37] MatmaRex: I hear ya
[22:31:45] no, more like a table list.
[22:31:52] sumanah.
[22:32:06] My situation with access to stuff like that is not clear at all; I can't see graphite or icinga at the moment
[22:32:06] not an ordered or unordered list.
[22:32:13] it's better that there's an option to sign an NDA now, but yeah, it would be far better if we could just be transparent (while still ensuring security/privacy of user data)
[22:32:45] SUSY: you mean a table kind of like https://www.mediawiki.org/wiki/Requests_for_comment#Seeking_feedback , right?
[22:33:10] btw what is your mailing list software?
[22:33:13] I don't think I meet legal's requirements to have a direct NDA with WMF
[22:33:29] :(
[22:33:50] sumanah: the group is using Google Groups right now, but there is discussion of Mailman.
[22:34:26] maybe having a field in the wiki for people to write in would be easier?
[22:34:39] sumanah, but I have my wikimedia.org email, officewiki, etc. access anyway through my (sub)contract work on VE
[22:34:47] SUSY: so, I can imagine a few ways to do this. if you're okay with everything being public, this feels like something you could potentially do with IFTTT .... there's also Semantic Forms, an extension to MediaWiki
[22:35:12] Krenair: that's ..... a suboptimal incongruence
[22:35:25] sumanah: yes, that is the type of list. we have it as an HTML list right now. I want certain fields to be overwritten/entered by anyone with a valid email (or some other validation scheme).
[22:36:09] * sumanah defers to other, more experienced people in the matter of SUSY's desired feature
[22:36:38] Krenair: you can see gdash, right? like, you can click around and see graphs?
[22:36:43] yes
[22:36:58] ok, that's something
[22:37:01] sumanah: thanks! now I have names of things to look up. I can begin my research with that. it was tough to find anything based on my description.
[22:37:54] SUSY: you could also write a script to read, like, the first line in an email to the list, then use the MediaWiki API to write an additional row to the table in the wiki page
[22:37:59] !api
[22:37:59] The MediaWiki API provides direct, high-level access to the data contained in the MediaWiki databases. Client programs should be able to use the API to login, get data, and post changes. Find out more at < https://www.mediawiki.org/wiki/API >. For client libraries in various languages, see < https://www.mediawiki.org/wiki/API:Client_Code >.
[22:38:39] SUSY: have you heard about the new Mailman 3?
[22:38:48] * sumanah is perhaps disproportionately excited about Mailman 3
[22:38:55] sumanah: no. lol.
[22:39:06] https://mail.python.org/pipermail/mailman-announce/2014-April/000191.html
[22:39:29] * sumanah kind of cannot wait to send her first mail to a list running Mailman 3
[22:40:42] it looks like Mailman 3 has a better API, so I imagine it would be somewhat easier to do fancy stuff like this with Mailman 3 than it was with 2
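
To make the script-plus-API suggestion above concrete, here is a sketch of the wiki-side half in JavaScript (Node). The mail-reading half is deliberately left out, since it depends on the list software; the API URL and page title are hypothetical, and the token/login round trip is only outlined - a real script would lean on one of the client libraries listed at API:Client_Code:

    // Sketch: turn one mailing-list post into one new wikitable row.
    var API = 'https://wiki.example.org/w/api.php'; // hypothetical wiki
    var PAGE = 'Signup_table';                      // hypothetical page

    // Insert "|-  | a || b || c" just before the table's closing |},
    // assuming the page's wikitext ends with the table being maintained.
    function addRow( wikitext, cells ) {
        var row = '|-\n| ' + cells.join( ' || ' ) + '\n';
        return wikitext.replace( /\|\}\s*$/, row + '|}' );
    }

    // Outline of the API round trip (details at API:Edit):
    // 1. GET  action=parse&page=<PAGE>&prop=wikitext -> current wikitext
    // 2. fetch an edit token
    // 3. POST action=edit with title=<PAGE>, text=addRow( ... ), token=<token>
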
[22:42:37] so Krenair, I'm sort of thinking through: let's say that you have written new code and it has just deployed. You are happy! Yay! You maybe even are listening to a pleasant bit of music in celebration. But what graphs should you keep an eye on to see whether your new code has materially affected performance?
[22:42:52] that's great. I need another argument to convert the group over to Mailman. I think the problem has traditionally been that no one in my group knows how to administer it, but I guess I'll just have to take on yet another task myself. ;-) I'll have to work harder at local recruiting.
[22:43:17] :-)
[22:43:42] sumanah, well, gdash is split up into different areas of the code
[22:44:12] nod
[22:45:29] from what I can guess by simply looking at gdash, you'd browse to the right page, and the most recent graphs appear to be those nearer the top
[22:45:43] The "Show Code Deploys" checkbox looks useful
[22:47:31] Krenair: can you point me to a graph where checking that box makes a difference?
[22:47:42] I have perhaps been looking at graphs where it does not
[22:49:19] can't find one
[22:54:25] Krenair: do you ever look at http://ganglia.wikimedia.org ?
[22:54:43] not recently
[22:55:56] I'm rereading https://blog.wikimedia.org/2013/02/05/how-the-technical-operations-team-stops-problems-in-their-tracks/ now to remember what's what :)
[22:57:09] * sumanah also looks at https://wikitech.wikimedia.org/wiki/UDP_based_profiling
[22:58:26] sumanah, Krenair: "show code deploys" in gdash is broken at the moment -- https://bugzilla.wikimedia.org/show_bug.cgi?id=62667
[22:58:35] aha! thank you bd808
[22:59:05] * sumanah was squinting
[23:00:24] bd808: I welcome any thoughts from you on my quest
[23:00:43] sumanah, I can probably find my way around Ganglia. IIRC, it's more about network status than performance of individual parts of the code
[23:01:55] sumanah: I wait until Faidon reverts my change and writes an email to ops-l explaining what I broke ;)
[23:01:58] Ganglia tends to be focused on "what's going on with/on this machine"
[23:02:05] yeah, that makes sense - it's very rare that a change you make in MediaWiki is gonna, like, materially affect how RAM works
[23:02:11] And aggregates machines into clusters
[23:02:28] Most stats it collects are generic things like CPU usage, mem usage, load average, etc.
[23:02:52] But for some boxes it also collects service-specific stats, like # of open connections to the service
[23:03:10] I think for database servers in particular we have stuff like that
[23:04:54] sumanah: Personally, my primary use of Ganglia is, when there's a problem, to look at the general health state of the different components and, based on that, try to reason about what might be going on
[23:05:09] RoanKattouw: so according to https://wikitech.wikimedia.org/wiki/Graphite , gdash and graphite.wikimedia.org are both frontends to the Graphite metrics, right?
[23:05:17] I... think so?
[23:05:28] I have barely used those
[23:05:46] OK! I will ask you the question I asked Krenair
[23:05:46] gdash's link saying 'Data Browser' goes to graphite
[23:05:49] RoanKattouw: let's say that you have written new code and it has just deployed. You are happy! Yay! You maybe even are listening to a pleasant bit of music in celebration. But what graphs should you keep an eye on to see whether your new code has materially affected performance?
[23:06:03] That depends radically on what kind of code it is
[23:06:24] * sumanah listens
[23:06:31] In most cases, I don't worry, because I only worry when I have a feeling that something is potentially impactful
[23:06:33] the initial division I would think of is frontend vs backend
[23:07:06] In cases where I do worry, yeah, frontend vs backend matters
[23:07:21] For frontend things I usually watch network graphs
[23:07:50] Because most frontend performance problems correlate in some way with changes in network patterns on the bits cluster (which serves JS/CSS code to users)
[23:08:39] For other things, I tend to look at general health graphs (load avg, CPU) of the cluster that would be affected (bits Apache (ResourceLoader), API, etc.)
[23:08:52] That's what I used to do a few years ago, anyway
[23:08:56] Back then it was the best you could do
[23:08:57] These sound like Ganglia graphs?
[23:09:00] Yes
[23:09:11] I get pretty lost trying to find things in Ganglia
[23:09:16] Nowadays there's more advanced stuff around, but my graph-reading skills are still stuck in 2010ish
[23:09:29] (maybe 2011/12)
[23:10:15] So Ganglia's organizational hierarchy is clusters -> machines -> measurements
[23:10:25] Where clusters have some aggregate graphs as well
[23:10:56] For frontend stuff, for instance, we often look at the network graphs for the bits Varnish cluster
[23:11:42] The default view is 1hr, though, which is not often very interesting. If you use the bar at the top to get a wider time window, you'll start to see circadian trends and stuff
[23:12:45] For instance: https://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&m=cpu_report&s=by+name&c=Bits+caches+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&hide-hf=false&sh=1&z=small&hc=4
[23:12:47] AHA, ok, that was something I missed
[23:12:52] (swimming in options)
[23:13:22] so the super-umbrella for all of these clusters is "Wikimedia grid"
[23:13:26] Yeah, this is like the dialog box with 47 checkboxes
[23:13:28] Yup
[23:13:40] From there it breaks down into clusters, which are usually (service, location) pairs
[23:13:55] Or (service, subservice, location)
[23:13:58] Like "bits caches eqiad"
[23:14:02] "cpu report" rather than "network report"?
[23:14:16] Oh, that one is largely stupid
[23:14:29] you mean the choice of metric is badly named?
[23:14:45] If you are looking at a view of lots of different things (clusters on the main page, machines on a cluster page), that box decides which graph is used as the one graph they get to show for that cluster/machine
[23:14:59] So the link I gave you shows multiple graphs for the cluster, but only one graph per machine within that cluster
[23:15:12] You can click on a machine's graph to get all of its graphs
[23:15:30] (that is, graphs for all thousand of its metrics?)
[23:15:34] But if you want a quick comparison of the same graph across all of those machines, and it just happens to not be the one it's showing now, that dropdown adjusts which one is used
[23:15:36] Yeah, exactly
[23:15:56] OH
[23:15:58] cpu_report is the stacked graph of how much CPU is used, and by what categories of usage
[23:16:26] so it does not affect the "overview" load/memory/cpu/network quartet at the top
[23:16:30] So, if on the page I linked you change cpu_report to network_report, you'll see how the network usage breaks down per box
[23:16:32] Exactly
[23:16:35] Only the per-box view
[23:16:41] This cluster only has 4 boxes
[23:16:46] But some other clusters have 100+
[23:16:47] * sumanah does not swear aloud, which takes some effort
[23:17:32] sumanah: Perhaps a more instructive example is https://ganglia.wikimedia.org/latest/?r=day&cs=&ce=&m=cpu_report&s=by+name&c=Application+servers+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&hide-hf=false&sh=1&z=small&hc=4
[23:17:40] It's the main Apache cluster, 136 boxes
[23:18:02] so that quartet up top is always load/memory/cpu/network, RoanKattouw?
[23:18:08] Yes, for the aggregate
[23:18:15] that's useful
[23:18:15] So, the bottom is individual views, and the top is the aggregation of those
[23:18:23] Yeah
[23:18:55] The top aggregates the four things you most often care about
[23:19:03] (in theory)
[23:19:34] so, RoanKattouw, I see that a few hours ago, network load jumped and then dropped down again
[23:19:41] Yeah
[23:19:56] This is the sort of thing one looks for, I guess?
[23:20:00] Yeah
[23:20:13] It seems vaguely likely that that could be related to the GettingStarted errors we had today
[23:20:17] was there an outage?
[23:20:27] because all 4 graphs have a weird gap at that time period
[23:20:28] You know, the ones you reported ;)
[23:20:38] The gap is an outage in the Ganglia collector
[23:20:51] There are gaps way more often than there should be
[23:20:54] omg, epistemological issues already
[23:20:59] The collection isn't incredibly robust
[23:21:23] I'm growing up so fast :-)
[23:21:36] Anyways, those GS errors could have caused the increase in CPU + network, because errors are uncacheable
[23:21:53] right!
[23:22:08] Which is a good policy, but it also means that if some fraction of pages have errors, those pages now have a cache hit ratio of 0% instead of 90% or whatever it is
[23:22:28] so yeah, I'm trying to figure out how to check for a big increase or decrease in various kinds of cache hits
[23:22:56] memcached eqiad, etc., all the clusters that have "cache" in their name
[23:23:00] that's what I check, right?
[23:23:49] Those might have hit graphs
[23:23:57] But generally I tend to look at the backends behind those caches instead
[23:24:13] cause they take the hit when the caches stop caching
[23:24:40] that's deep, RoanKattouw :)
[23:24:48] haha yeah
[23:25:19] There is a graph somewhere out there that shows # of fatal errors and PHP warnings against time
[23:25:25] Caches gonna cache.
[23:25:41] So I'd try to correlate that with the other graphs and see if they happened around the same time
[23:26:19] so, like, instead of looking at the Memcached eqiad cluster report, you look at the general Application servers in eqiad report.
[23:26:25] Yeah
[23:27:08] I'd be interested in seeing the error/warning graph
[23:27:31] https://ganglia.wikimedia.org/latest/graph.php?r=hour&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&n=&hreg[]=vanadium.eqiad.wmnet&mreg[]=fatal|exception&gtype=stack&glegend=show&aggregate=1&embed=1
[23:27:36] I don't know how to get there from Ganglia
[23:27:41] I just know Ori gave me that link once
[23:27:54] This graph didn't exist in 2011, and consequently I have no idea how it works
[23:28:08] so I think I understand what you would look at if you were worried about JS/CSS perf issues (the bits servers), or about parsercache or general object cache hits (the app servers). What about reductions in Varnish hits?
[23:29:01] For Varnish hits I'd look at the app servers
[23:29:03] Yay, I changed it into a last-day graph through URL hacking! https://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&n=&hreg[]=vanadium.eqiad.wmnet&mreg[]=fatal|exception&gtype=stack&glegend=show&aggregate=1&embed=1
[23:29:04] For object cache hits, I don't know
[23:29:37] Certain portions of the object cache cache things where, if the hit rate went too low, you'd notice an impact on something
[23:29:40] But that "something" varies
[23:29:57] It could be the app servers or the databases, depending on what it is
[23:30:59] got it
[23:31:48] RoanKattouw: I saw a gdash page that seemed to be about total frontend page load time
[23:32:11] Yes!
[23:32:16] which - I assume - would be a proxy for "did we accidentally reduce the localStorage and/or native browser cache hit rate"
[23:32:21] That's one of Ori's amazing contributions to the universe
[23:32:26] ori is great!
[23:32:27] Not just that
[23:32:44] Also, "did we just deploy something that adds slow code to the critical path for loading a page"
[23:32:56] "Did we just deploy something that causes more stuff to be downloaded on a page view"
[23:33:11] Right
[23:33:41] Ori has said that we have a lot of frog-in-boiling-water stuff going on with that kind of thing
[23:33:45] And he has the graphs to prove it now
[23:34:50] so I think I should put what you have just told me, + links to http://commons.wikimedia.org/wiki/File:MediaWiki_Performance_Profiling.ogv and http://ljungblad.nu/post/83400324746/80-of-end-user-response-time-is-spent-on-the-frontend , into an email to wikitech-l
[23:35:34] RoanKattouw: ori and AaronSchulz and I were hoping to put together some kind of "try profiling this code" exercise that would include frontend + backend stuff
[23:35:38] I feel closer to that goal now
[23:37:16] I think there are some aspects where you can only know how it interacts with everything else once it's in production; your dev env and the beta cluster are only gonna help so much
[23:37:39] Yeah, exactly
[23:38:01] A lot of this stuff is post facto investigation
[23:38:16] Through which we gain experience about what things generally cause problems
[23:38:26] Which enable us to try to prevent people from doing those specific things
[23:38:34] But it's all very inexact
[23:38:50] *enables
[23:38:52] basically my advice would be: in your dev env, follow https://www.mediawiki.org/wiki/Manual:How_to_debug#Profiling for backend and look at your Chrome developer tools for frontend :-) and once it's in production, watch these n graphs
[23:38:59] Yeah
[23:39:17] if this kind of soup-to-nuts advice exists somewhere in thorough form, I think I haven't seen it
[23:39:30] (which is, like, why I'm writing it)
[23:39:34] Yeah
[23:39:43] please embarrass me now by linking to it
[23:39:54] rather than later this week, when I have written a duplicate
[23:39:56] Well, I think the problem is that a lot of it is the kind of thing that we have just learned over time
[23:40:09] It's very difficult to predict how things will scale in production
[23:40:43] nod
[23:41:55] RoanKattouw: ori wants one of the performance guidelines to be "Be scrupulous about measuring performance and know where time is being spent" - as in, the developer is responsible for the performance of her own code and for responding to "hey, your code increases latency" bug reports. So we gotta give people an overview of *how to do that*
[23:42:27] and if that overview exists, then I should just add to it. But I do not think it does
[23:43:01] we have some "how to optimize queries" stuff, but not, like, "when to hit the job queue and when not, how to use the init module, what graphs to look at" stuff
[23:43:05] hey, I'm just catching up
[23:43:07] I missed the pings earlier
[23:43:22] Yeah, frontend latency increase stuff is definitely more measurable
[23:43:24] Oh hi Ori - I'm sorry, they were more mentioning you than pinging you, I think
[23:43:32] For the things that are measurable we should definitely have better docs
[23:43:56] Sorry, I got a bit confused going back and forth to different corners of the performance universe
[23:44:00] Some of these corners are very dark
[23:44:12] * sumanah hands RoanKattouw a comfort object, such as a teddy bear, for the dark moments
[23:44:38] It's OK, I've spent enough time in the dark corners that I've developed night vision
[23:44:58] But for some of the corners we can install light fixtures, rather than requiring everyone to develop night vision over time
[23:45:02] * sumanah laughs
[23:45:03] yes
[23:45:04] While for other corners that's much harder
[23:45:17] I think the other corners may be Plato's Cave
[23:50:08] sumanah: basically my advice would be: in your dev env, follow https://www.mediawiki.org/wiki/Manual:How_to_debug#Profiling for backend and look at your Chrome developer tools for frontend :-) and once it's in production, watch these n graphs
[23:50:15] that's exactly right
[23:51:17] ori: if you have a list of graphs, like, a set of bookmarks in some folder of your browser, I would be grateful
[23:51:29] heck, maybe you already have it up someplace I've missed
[23:52:08] * sumanah suspects Timo already has a nicely laid-out set of "here are all the important graphs" links on a page somewhere she's missed
[23:55:01] sumanah: I'm sorry to ask again, since I think you've already explained this before, but could you clarify how Wikimedia-specific these instructions are supposed to be?
[23:55:55] ori: I think you and I haven't really nailed it down before, just talked in generalities
[23:57:20] I am happy to link to best practices elsewhere for things that apply to all web apps, or all MySQL/MariaDB queries, or what have you. But I want a developer to be able to think systematically through that *plus* RL/jobqueue/our caching layers/etc., stuff specific to Wikimedia's setup
[23:58:31] * ori nods
[23:58:33] so there's no need to reinvent the wheel re "how to optimize DB queries" in general, for instance, but I do want this to be something a new WMF developer can come in and use, to help her see why Faidon is either frowning or smiling :)
[23:58:54] including "this graph is good and you should feel good"
[23:59:15] Does that help, ori?
[23:59:23] sumanah: yeah, just writing up a quick tip
[23:59:57] Thank you
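
A footnote to the "Chrome developer tools for frontend" advice repeated above: for a quick first split of backend versus frontend time, the (2014-era) Navigation Timing API works from any browser console, and it is the same kind of data behind the frontend page-load-time graphs discussed earlier. A minimal sketch; the labels are informal, not an official breakdown:

    // Paste into the browser console after a page has finished loading.
    var t = performance.timing;
    console.log( 'network + backend:', t.responseEnd - t.navigationStart, 'ms' );
    console.log( 'frontend (parse/render/JS):', t.loadEventEnd - t.responseEnd, 'ms' );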