[17:13:27] test [17:29:14] pass [17:29:52] :D [20:41:03] test [20:41:06] haha [20:42:01] qgil: not working? [20:42:47] is anyone here able to help qgil get his voice back? [20:44:48] how do I even know who the ops are here? [20:45:21] pinging brendan_campbell [20:45:29] brendan_campbell: maybe you can help Quim? [20:47:06] he's getting the following error when sending to channel: testing [20:47:06] * #wikimedia-office :Cannot send to nick/channel [20:50:23] James_F: ping [20:50:39] Now? [20:50:44] Right [20:50:50] yep :) [20:51:06] Due to persistent spam, this channel is configured so that only those who are identified to NickServ can speak. [20:51:24] thank you AntiComposite for the explanation [20:51:29] Turns out that users need to be registered to reply to this room. I guess I have been too long without stepping here. :) https://freenode.net/kb/answer/registration [20:51:54] Ah, AntiComposite was faster. Thanks! [20:53:08] are you preparing for a meeting in here? We were planning to talk about https://phabricator.wikimedia.org/T249419 starting on the hour [20:55:47] TimStarling, I am preparing... but it is for tomorrow at 15:00 UTC. Actually, I just wanted to send a heads-up here since it is not an office hour. [20:56:19] What's it going to be about qgil ? [20:56:24] ("tomorrow" might be "today" for you, Tim :D ) [20:58:08] * akosiaris around for https://phabricator.wikimedia.org/T249419 [20:58:36] The 2030 Movement Brand Project has a presentation over video and we are inviting people to ask questions here and on YouTube. More at https://meta.wikimedia.org/wiki/Talk:Brand_Network#April_update_on_the_project_timeline_and_planning [20:58:45] I'll start the meeting now to verify that meetbot is working [20:58:46] When: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200416T15&p1=1440&ah=1 [20:58:52] * qgil shuts down [20:59:03] #startmeeting RFC meeting [20:59:04] Meeting started Wed Apr 15 20:59:03 2020 UTC and is due to finish in 60 minutes.
The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot. [20:59:04] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. [20:59:04] The meeting name has been set to 'rfc_meeting' [20:59:32] #topic RFC: Render data visualizations on the server [20:59:58] Evening all [21:00:28] o/ [21:00:51] I'll quickly summarize and kick off the discussion? [21:01:24] ping Pchelolo kaldari [21:01:31] so, the current graphoid service is being undeployed due to not finding a steward through its stewardship request (https://phabricator.wikimedia.org/T211881) [21:01:32] o/ [21:02:06] howdy [21:02:28] this RFC proposes splitting up what the current graphoid service does for easier maintenance / SRE happiness [21:03:05] if anyone missed it, the meeting topic is https://phabricator.wikimedia.org/T249419 [21:03:18] tgr summarized the split nicely in one of his last comments: we need a fetching service, a rendering service, and a way to store the result of the render [21:03:34] * RhinosF1 ‘ere [21:03:54] the rendering service is dead simple, just a thin wrapper around any graphing library that can render server-side [21:04:42] so the main question is: is it ok to push the rest of the complexity of "graphs on wikis" to the other parts of our stack? And I'll give an example of such complexity: [21:06:23] graph specs are used through templates, so the spec isn't known until PHP parses the page and expands the template. If the new service is so simple it does not call out for external data, and instead needs it all provided in the POST request, then we have a tricky problem to solve (how do you call this service from PHP?) [21:06:42] ok, I'll stop talking now and see if anyone is following [21:07:14] Right now, this complexity is inside graphoid? Or just missing? 
[21:07:25] yes, I'll briefly describe the situation right now: [21:07:31] graphoid takes a vega spec [21:07:44] sanitizes any URLs to make sure they're on a whitelist of domains (but this includes wikidata query service) [21:08:11] calls out for data, which apparently produces a lot of errors, e.g. when wikidata query service was down, and renders the graph [21:09:07] renders it to what format? [21:09:23] png/svg afaik [21:09:23] it can render it to png or svg, but right now it's png [21:09:37] AIUI right now, fetching is inside graphoid (not good because it needs to know too much about MediaWiki), storage is outside but simplistic (no invalidation) [21:10:05] the solution proposed with graphoid being a pure renderer works pretty well for mathoid [21:10:06] hi all. It only renders to png. SVG is disabled. [21:10:39] I thought there were interactive graphs [21:10:47] those can only be rendered client-side [21:10:58] Pchelolo, Mathoid is very different - it has no external data/image dependencies [21:11:08] IIRC the big difference between mathoid and graphoid is that a formula definition fits into an URL, a graph definition does not. [21:11:09] so graphoid would only ever render static snapshots of the graph [21:11:37] the whole of vega runs on the client side as well, including fetching? [21:11:38] Pchelolo: how do images from Mathoid make their way to media storage currently (Swift)? Done by the Node service or MW? [21:11:53] Krinkle: currently they're stored in restbase [21:12:08] TimStarling, correct [21:12:08] right, so it's a magic store/cache due to having been requested [21:12:12] the main difference with mathoid is that data can change without the graph spec changing [21:12:29] Before we go too far into tech, the main issue I see from above is the lack of a steward. Would rewriting the service guarantee stewardship? If so, this would be a valid reason to spend considerable dev resources + deal with initial bugs.
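The sanitization step described above — each data URL checked against a whitelist of allowed domains before Graphoid fetches it — can be sketched roughly as follows. The domain list and function name are illustrative assumptions, not Graphoid's actual code:

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the real service's list is configured in production
# and, per the discussion above, includes the Wikidata Query Service.
ALLOWED_DOMAINS = {"commons.wikimedia.org", "query.wikidata.org"}

def url_allowed(url):
    """Return True if the data URL points at an allowed domain over HTTPS."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_DOMAINS

print(url_allowed("https://query.wikidata.org/sparql?query=..."))  # True
print(url_allowed("https://evil.example/data.json"))               # False
```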
[21:13:04] I will be steward of this thing for as long as possible [21:13:20] but ideally we would find a proper home for it in a dedicated team [21:13:42] Would it resolve the SRE requirements? And what are those that are not being met at the moment? [21:13:49] milimetric: does the "change" ability need to be a primitive in the system, or could it work without it? E.g. would it be fine to require a manual purge or edit to re-render and not need any kind of polling or update service. Also, do we plan to support third party urls for data, or only Data/json pages on Commons? If the latter then we can treat it like any other cascading update. [21:15:10] Currently wikidata queries are a supported data source. Those wouldn't generate anything to purge off of [21:15:13] Krinkle: WDQS changes live without any MW-land data updates. [21:15:18] Yeah, what AntiComposite said. [21:15:45] Krinkle: tough question, not strictly required as a primitive, but one of the problems I'm trying to solve is updating graphs like the COVID graphs on a daily basis. And we could limit data to our data, but more than just commons, for example: wikidata query service query results [21:16:11] AntiComposite: right, but that's also why upon making such queries in wikitext, wikibase client stores which items are used and upon edit on wikidata.org it generates a RecentChange entry on the local wiki and re-parses all consuming pages. [21:16:15] Krinkle, in theory, graphs could rely on pageviews and WDQS api, but in practice cascading updates from data pages & images is more than enough for the regular service, with the optional force-purge requests. There could also be a feature of the graph itself to force refresh it (e.g. daily). [21:17:16] Right, I see. For more complex WDQS we would just have a general ttl, e.g. 24h, like some magic words do already. [21:17:23] exactly [21:17:27] then it would naturally be updated on-demand upon cache miss. [21:17:32] no push updates needed.
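The TTL-based scheme sketched above — no push updates, a stale graph is simply re-rendered on the next cache miss — amounts to something like the following. The 24h figure comes from the discussion; everything else (names, cache shape) is a hypothetical sketch, not production code:

```python
import time

TTL_SECONDS = 24 * 3600  # e.g. 24h, for graphs backed by WDQS queries

_cache = {}  # graph_hash -> (rendered_png_bytes, stored_at)

def get_render(graph_hash, render_fn, now=None):
    """Return a cached render, re-rendering only when the TTL has expired."""
    now = time.time() if now is None else now
    entry = _cache.get(graph_hash)
    if entry is not None and now - entry[1] < TTL_SECONDS:
        return entry[0]        # cache hit: serve the stored image as-is
    png = render_fn()          # cache miss: re-render on demand
    _cache[graph_hash] = (png, now)
    return png
```

No recurring update job or recursive-purge machinery is needed in this model: staleness is simply bounded by the TTL, which is the trade-off Krinkle describes.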
[21:17:48] I think the data problem should be solved in an elegant way with a new data namespace or dataset project, like wikidataset.org or something like that. In the back of my mind in this discussion is that other solution can be temporary until we do that, which I think we should sooner than later. [21:18:59] For directly editable data (unlike page view data), we could eventually optimise with a hook upon edits to commonswiki/Data: edits. But for the initial MVP, I think letting regular wiki re-parse and cache ttl handle it probably suffices? [21:19:51] Krinkle: yeah, I'm ok with lots of compromises. One of the benefits of making a simpler service and splitting it out is that it can be called in more flexible ways, for example by a bot that just has a vega spec and a data dictionary [21:20:18] also in context with SRE/Perf looking at more generally reducing the amount of cascading and reparsing we do in prod, leaning more in general TTLs, e.g. 1 day TTL instead of tracking template edits. So introducing yet another recursive update service would be nice to avoid. [21:20:20] the question I have is, does everyone else here think this will cause more problems than it solves? [21:21:45] There's a clear value and demand in providing this. [21:22:08] Not sure about the longer-term product concept of wikidataset.org/etc. [21:22:43] so the code stewardship page describes three kinds of problems [21:22:43] I feel the bigger issue is still unanswered: is it possible/desirable to make Graphoid a pure POST service (pass in all required data). Vega graph can use images. Example: draw a map (uses maps snapshot service), take top 5 COVID19 countries (data page), and draw country flags using Commons (based on the country codes in the data). There is no way all this data can be prepared without first running Vega. 
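A toy illustration of the complexity yurik describes above: the second round of URLs (country flags) can only be computed from data returned by the first fetch, so no static inspection of the spec can enumerate them up front. All URLs and names here are hypothetical:

```python
def needed_urls(first_dataset):
    """The flag-image URLs are only knowable AFTER the first dataset arrives."""
    return [f"https://commons.example/flags/{row['code']}.svg"
            for row in first_dataset]

# Step 1: fetch the top COVID-19 countries (simulated response).
top_countries = [{"code": "us"}, {"code": "es"}, {"code": "it"}]

# Step 2: only now do we learn which flag images the graph needs.
print(needed_urls(top_countries))
```

This is why, as stated in the discussion, preparing all the data for a pure POST-based renderer effectively requires running Vega once already.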
[21:22:57] - code quality [21:23:16] - graphoid doing calls to other MW endpoints in suboptimal ways [21:23:35] - "unorthodox architecture" which AIUI means not being a lambda service [21:23:55] which are we trying to solve here? [21:24:28] point two and three are really the same, right? [21:24:36] tgr, note that #2 is identical on the client side too, and is much worse because every client would do it rather than one graphoid and the image would be cached. [21:24:38] no, the "unorthodox architecture" is more about the need to way the graphoid and mw interoperate [21:24:52] s/the need to/the way/ [21:25:19] ...beyond "graphoid doing calls to other MW endpoints in suboptimal ways"? [21:25:51] akosiaris: can you be more specific? [21:25:54] I personally want to solve all the problems, they seem mostly related to me [21:26:07] TimStarling: sure, let's see [21:27:07] graphoid needs to be called with the hash of the graph. That hash can in practice only be generated by mediawiki as it is the "data store" for graphoid. Practically speaking the exposed API is not easily callable by anything/anyone else (including monitoring) as they can't know the hash of the graph [21:27:17] which ends up causing the following loop in requests [21:27:23] TimStarling: a simple example might be that graphoid has to call mw to get the graph spec, and so if you want to test the service (for performance or correctness), you'd have a hard time mocking its dependencies [21:27:41] mw -> graphoid -> mw and back [21:28:56] because the hash includes fetched resources? [21:29:22] no, the hash is just an identifier generated by mw when it stores the graph spec into page props [21:29:32] akosiaris, mw does not call graphoid. Browsers call graphoid using the graph hash that MW computed. Graphoid then loads all the needed data from MW and other sources, including the graph itself.
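The store/fetch-by-hash flow described above can be sketched as follows. Per the discussion, the hash is a sha1 of a normalized version of the graph spec; the exact normalization used here (key-sorted, whitespace-free JSON) is an assumption, as is the dict standing in for MW's page_props storage:

```python
import hashlib
import json

def graph_hash(spec):
    """sha1 of a normalized JSON graph spec (normalization is an assumption)."""
    normalized = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

page_props = {}  # stands in for MW's page_props storage: hash -> spec

spec = {"data": [{"url": "https://query.wikidata.org/sparql"}], "marks": []}
h = graph_hash(spec)
page_props[h] = spec  # MW stores the spec at parse time...

# ...and the browser later requests /graphoid/<title>/<hash>; Graphoid must
# call back to the pageprops API with that hash to recover the spec before
# rendering -- the mw -> browser -> graphoid -> mw round trip discussed above.
assert page_props[graph_hash(spec)] == spec
```

This also illustrates akosiaris's monitoring complaint: nothing outside MW can compute a valid hash, so nothing else can call the API directly.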
[21:29:33] the hash is just a sha1 of a normalized version of the graph (including data IIRC) [21:29:46] without data [21:29:59] MW cannot know what data is needed [21:30:13] TimStarling: stores JSON in a page property, which Graphoid JS then fetches from pageprops API by hash. [21:30:16] (I think) [21:30:23] correct [21:30:46] MW validates the URLs doesn't it? how can it not know what data is needed? the data is just those same URLs that it validated isn't it? [21:30:53] so outputs to restbase/graphoid/:title/:hash and it then does the rest [21:31:12] (currently) [21:31:18] TimStarling, no, it doesn't. It only validates that the JSON is ok. It is Graphoid that validates that all external data is allowed. [21:31:47] I'll clear up this point real quick, because it's important [21:31:49] the "unorthodox architecture" section of the stewardship task seems to assume graphoid fetching the data from MediaWiki is avoidable, but I really don't see how it could be [21:31:51] knowing what data is needed requires a full run of Vega [21:31:58] hang on yurik, I got this [21:32:03] :) [21:32:11] ok, so PHP can't know what data is needed to render a graph, because [21:32:25] (in a fairly stretched example) [21:32:36] one datasource can be: urls generated based on data from another datasource [21:32:51] can we stop allowing that? [21:32:59] so vega would load dataset 1, let's say countries in the top 5 pageviews [21:33:12] and then load flag icons for each of those countries, and maybe also population data [21:33:52] we can do anything, but I don't think we should, it doesn't actually simplify much because PHP would still need to run vega somehow to figure out what data it requests [21:34:23] this is the trouble with loose coupling [21:34:31] we could also find another graphing library besides Vega that gives us good enough graphing functionality, the way I see this service so far, we always have that option [21:35:14] we need *something* to fetch the data.
What is it / where does it live? It could be Lua code that renders the graph spec, but that would mean dropping support for AQS/WDQS, and probably lots of trouble with parser limits. [21:35:35] It could be its own service but not sure if that would be an improvement [21:35:58] tgr: that could be a small improvement, because it would let some paths call graphoid directly [21:36:24] (the rendering service, sorry) [21:36:59] not really because it is invoked via image URLs and you probably can't fit a graph spec into an image URL [21:37:21] tgr: ? image urls? [21:38:23] graphoid is invoked by the browser via image URL. I guess you could call it from elsewhere and store the image, not sure it would simplify things. [21:38:56] tgr: yeah, I think that's mixing the storage with the rendering. The call I was envisioning would be a POST with the graph spec and a data dictionary [21:38:59] this example of Vega discovering more dependencies as it goes, doesn't that conflict with the idea of splitting out a fetching service? [21:39:11] tgr: Vega needs to run in order to identify needed data. Vega is a JS code. Restructuring it into two services (render + fetcher) does not seem to make any sense - the fetcher would be a simple proxy. [21:39:16] so basically: POST ( graphSpec, { url1: data, url2: thumbnail } ) [21:39:56] TimStarling: well, so if PHP had a graphSpec, it could call a fetching service and say: POST ( graphSpec ) and get back { url1: data, url2: thumbnail } [21:40:12] and then call the rendering service with that [21:40:24] (not saying that's what should happen, but it could, would that be awful? 
[21:40:40] yeah, having a fetching service (or having MediaWiki do the fetching) would limit what you can do in Vega [21:40:48] the Vega code would only see some static dataset [21:41:03] milimetric, that would mean the fetcher would do exactly the same work as the renderer -- it would load Vega, load the needed data, parse it all, record the data / images it needs, and return that. The second call to the render would be identical, except that the result will be an image. You may as well return both in one call. [21:41:07] tgr: it could do what I say above [21:41:40] yurik: we can have it be two endpoints in the same service, but I think they should be distinct so we can support clients calling it directly [21:42:51] When would this POST happen? From PHP on page save? [21:43:32] maybe instead of templating on top of graph specs we could make a different graph editing interface that would have the same UX as filling out a template but be completely client-side. This would allow the user agent to have the full graph spec and post it to the rendering service [21:43:52] let me phrase it this way, if the choice is between undeploying graphoid and removing the ability for one dataset to depend on another, would that be a meaningful choice? or would that break most use cases either way? [21:44:33] tgr, one dataset depending on another is not a big issue at the moment. The issue is if the dataset drives which images it needs. [21:44:53] tgr: I don't see why that has to be the choice, if something is going to render a vega graph, it pretty much needs to run vega to figure out all the URLs it needs, there's no easy way to statically analyze the graph spec for that, you would be rewriting a bunch of the vega logic [21:44:54] sure, images are a dataset [21:45:21] tgr: I think right now we are at a point where graphoid is going to be undeployed. The question is what we do next.
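The two-endpoint split discussed here — one call that resolves a spec to a data dictionary, a second that renders the spec given that dictionary — might look roughly like this. Both services are simulated with canned responses and a placeholder image; a real implementation would execute Vega for each step, which is exactly yurik's duplicated-work objection:

```python
def fetch_service(graph_spec):
    """POST /fetch: resolve the spec's data pipeline to url -> payload.
    Simulated here with canned responses instead of real HTTP fetches."""
    canned = {"https://data.example/top5.json": b'[{"code": "us"}]'}
    return {d["url"]: canned[d["url"]] for d in graph_spec.get("data", [])}

def render_service(graph_spec, data_dict):
    """POST /render: spec plus a complete data dictionary in, image out.
    Simulated; a real service would hand both to a Vega renderer."""
    assert all(d["url"] in data_dict for d in graph_spec.get("data", []))
    return b"\x89PNG placeholder"  # stand-in for rendered image bytes

spec = {"data": [{"url": "https://data.example/top5.json"}]}
png = render_service(spec, fetch_service(spec))
```

The point of keeping the endpoints distinct, per milimetric, is that a client (a bot, say) that already has both the spec and the data dictionary could call the render endpoint directly and skip the fetch step entirely.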
[21:45:59] if the ability of one external source to depend on another external source needs to be kept, I think the current architecture can't really be changed [21:46:06] sounds like you will end up rewriting almost an identical service TBH. [21:46:31] tgr: but if we split out fetching and rendering, we could refactor the client, like this (is this naive?): [21:46:50] editor presses edit, and they type in a graph tag [21:47:04] they get an editor that lets them pick from curated graph spec templates [21:47:22] they pick one, change params, the client-side JS knows the full spec at all points [21:47:23] milimetric is offering to maintain the extension and thus has to be granted flexibility to take it in a direction he's comfortable with [21:47:34] as yurik said, splitting fetching and rendering doesn't have much value if both are performed by executing the Vega spec [21:47:43] when they click save, client-side the JS simulates a vega render and gets all the URLs requested, posts them to the rendering service [21:48:02] gets back an image URL, saves it with the page [21:48:06] wait, I thought milimetric was offering to maintain the graph rendering service. Is he adopting the Graph extension as well? [21:48:20] rendering service works async to render an image there [21:48:40] it would be possible to come up with some custom format for describing dependencies and Vega would receive those as data files. That would limit functionality but make the architecture simpler. [21:48:55] so tgr, if the above is not naive, then splitting lets us eventually refactor to this, and so makes the fetching service part a temporary crutch [21:49:07] tgr, this would be great, but I still don't see a clear path for any service to generate that list [21:49:31] milimetric: So you're saying that the POST request to the fetching service happens in VisualEditor or another JS tool? Would that break the ability to do graph specs from the wikitext editor / without JS tooling?
[21:49:35] we can't trust the client to generate the image [21:49:45] the extension is only 363 lines of PHP code [21:49:51] tgr: yeah that could be elegant, make editors specify the data dictionary explicitly, try to guess for ourselves client-side based on some lua logic? [21:49:57] RoanKattouw, i think that's what milimetric is suggesting, and I think that will break a lot of logic (bots, etc) [21:50:08] (oh, sorry, you said image url) [21:50:12] RoanKattouw: yes, wouldn't work without JS, but it shouldn't really [21:50:27] actually, tgr's idea is great, here's how: [21:50:41] Where would the full graphspec be stored? In wikitext, or somewhere in the graph service? [21:50:45] milimetric: yeah lua would be the poor man's version of it. But as I said, no AQS, no WDQS, size limits [21:51:13] this extension is simple and cheap, but complex to support in production, it's like a prototype [21:51:15] lua wouldn't be able to do much with the graph, unless you propose to implement Vega in Lua :) [21:51:16] you're expected to statically specify the list of needed URLs. Then you POST that, and if the server tries to fetch a URL you didn't specify a blob for, it just fails and you have to do better :) [21:51:59] "you" is the problem -- you require rich client to do the parser's work [21:52:07] if milimetric wants to dive into Vega internals and split out a new API, I'm happy to encourage that [21:52:16] but I don't see that work scoped in the RFC page [21:52:51] Vega lib allows custom data loaders (that's how it is used now as well). It is very straightforward to do any kind of redirects/custom data loading [21:52:58] yeah, this RFC is about the rendering service only, but approving or opposing it depends on how much work it pushes on everyone else, that's the main question we're trying to work out here [21:53:49] the alternative to statically specifying the URLs is to run Vega twice, right? 
[21:54:12] or once, and fetch/render at the same time, which is what the current graphoid is doing [21:54:15] but yes [21:54:52] A simple mental model that might help -- think of Vega as of a browser. You give it "index.html", and it fetches everything else required. And just like a browser, you cannot really do much with that index.html unless you build most of the browser's code yourself. [21:56:00] yeah, so it's valid to question whether we should support all of vega. For the immediate use cases, like the COVID graphs that need updating and basic charts, we don't need dynamic data, we could just build a rendering service and a basic static graph spec analyzer to POST to it, even from PHP [21:56:05] and what we are discussing is having a mirroring service and running Vega on the mirrored folder instead of giving it internet access [21:56:32] tgr, Vega doesn't have internet access, only WMF internal services [21:56:43] yurik: *graphoid [21:56:48] I was trying to go with your metaphor. [21:57:27] ah, sorry, yes. And that's the problem -- because when it loads the first part (first data), it could ask for a bunch more. The folder wouldn't have it. [21:57:57] yeah so some of the functionality would break [21:58:21] you'd only have graphs where you can fully predict the resources you need when writing the graph spec [21:58:23] i would say most of it. Example -- [21:58:25] https://en.wikipedia.org/wiki/Template:Graph:Street_map_with_marks [21:58:46] so it's almost time, I'm happy to keep talking but I know it's crazy late for some. The question that remains for me is the same, is it ok to push the complexity of the fetching service out of graphoid, and to let a rich client and mediawiki PHP battle as to which is going to handle it? 
I think the discussion so far is super helpful and makes me understand the problem better, and potential work arounds, but doesn't give me [21:58:46] confidence that everyone's ok with the proposed RFC [21:59:26] I suspect there will be a very complex task to make the client perform all the needed tasks. [21:59:33] I'm sorry for not being more prepared for this RFC [21:59:36] (user's browser on "save") [21:59:55] I think the rich client part is unnecessary complexity [21:59:57] it's totally ok, this one's weird because it's somewhat outside of our normal work [22:00:33] tgr: wouldn't it be simpler? Then MW doesn't have to worry about anything, just has a nice graph spec with an image placeholder to parse [22:00:55] I would prefer that the POST to the fetching service be done by MW, not by the client, but I think the concept of having a fetching service makes sense [22:01:00] milimetric, the current editing paradigm is that you don't need to RENDER in order to SAVE [22:01:06] I'd like to see development of the RFC continue and to hear more details about the various options [22:01:07] It would solve the weird loop of requests between the service and MW [22:01:28] It would have to have a fallback for if the service somehow forgets the graphspec associated with a hash though [22:01:58] #endmeeting [22:01:58] Meeting ended Wed Apr 15 22:01:58 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) [22:01:58] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2020/wikimedia-office.2020-04-15-20.59.html [22:01:58] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2020/wikimedia-office.2020-04-15-20.59.txt [22:01:58] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2020/wikimedia-office.2020-04-15-20.59.wiki [22:01:58] RoanKattouw, but per before - there is no "loop", it is more of a "graphoid" using "mw" as the storage/api for everything it needs. MW doesn't call graphoid at all. 
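One idea raised during the meeting was to have clients statically specify the list of needed URLs and POST the corresponding data blobs alongside the spec, with the server failing on any URL it was not given. A minimal sketch of such a strict data loader follows; Vega does support custom data loaders, per yurik, but this code only mimics the fail-fast behaviour and all names are hypothetical:

```python
class MissingDataError(Exception):
    """Raised when the renderer needs a URL the client did not supply."""

def make_loader(data_dict):
    """Return a loader the renderer would use in place of real HTTP fetches."""
    def load(url):
        if url not in data_dict:
            # The client under-specified its dependencies: fail the render
            # rather than reach out to the network at render time.
            raise MissingDataError(f"no blob supplied for {url}")
        return data_dict[url]
    return load

load = make_loader({"https://data.example/top5.json": b"[]"})
load("https://data.example/top5.json")    # ok: blob was supplied
# load("https://data.example/flags.json") # would raise MissingDataError
```

As the discussion notes, this pushes the parser's work onto a rich client, and graphs whose URL set is only discoverable at render time would simply fail.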
[22:01:58] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2020/wikimedia-office.2020-04-15-20.59.log.html [22:02:20] TimStarling: yep, I intend on getting to the bottom of this [22:02:25] yurik: it instructs the client (browser) to do so, it's indirect, but it is there [22:02:31] Sure, it's not technically a loop: MW outputs HTML that instructs the user's browser to call Graphoid which then calls MW [22:02:50] RoanKattouw: how does the client fetching the data make a loop between MW and the service? [22:02:53] In any case, with a fetching service we'd have MW calling the service, not the service calling MW [22:02:56] yes, agree [22:03:14] milimetric: It doesn't, I was contrasting to the present state of affairs with Graphoid, not to the client-does-everything alternative [22:03:14] RoanKattouw: wait, no [22:03:18] Sorry if that was unclear [22:04:07] RoanKattouw: oh got it, yeah, agreed as I implied above [22:04:12] RoanKattouw, not the latter -- some service would need to figure out what it needs first somehow (i.e. with the rich client complexities), and only then the fetcher would get everything needed. [22:05:51] I think the two key questions we haven't really answered are: would this break so many use cases that it's not even worth the effort? is this enough to re-deploy the service? [22:06:29] it doesn't get rid of the loops, the loops are inherent in the fact that we are serving images and the definition is too large to fit into the image URL [22:06:38] tgr: I think deploying the basic service that renders an image seems useful at least on wmflabs or something, to solve immediate use cases that come up after graphoid is undeployed [22:06:50] like static snapshots of graphs on highly trafficked pages [22:06:56] so if the loops are a problem (it's not clear to me why they would be) not sure this helps [22:07:21] making WP pages depend on wmflabs for high-traffic pages?
oh my [22:07:36] OTOH if the problem is graphoid making requests to other services, that could be handled by splitting out the data fetching part [22:08:14] tgr, i think the bigger question is -- what specifically (other than the lack of maintainer) is the problem we are trying to solve with the Graphoid. I agree about code quality (this is a function of maintaining, regular updates, etc, or else you get bitrot). I am not sure about SRE -- it has been mentioned, but no specifics. [22:09:01] splitting the fetcher would only increase complexity [22:09:25] and yes, it would be awesome to only use POST-based graphoid (which it already supports but is not used) [22:10:08] yeah as I said I'm not entirely confident we are solving the real problem here [22:10:14] but POST requires some sort of a proxy that will convert the browser's request for an image into a graph spec [22:11:34] the other thing a separate fetch logic could handle properly is cache invalidation, but no one seems to be concerned about that [22:11:50] or we just did not have the time to dig into this [22:12:52] agree, i have been getting that same feeling ... as in Russian translations of https://en.wikipedia.org/wiki/The_Internationale -- "we shall destroy everything, and then build something better"...
destroying is very easy, but unless there is a very clear idea of what needs to get built, you end up with the same old stuff, possibly worse because you ignore past knowledge [22:12:53] tbh, whenever I think about all of this, it feels like we are trying to shoehorn graphs into mediawiki [22:13:27] I definitely care about refreshing the graphs, I had asked if it's possible to just have a cron-like feature that did time-based re-rendering of the graph, storing it on the same path in Swift so the pages wouldn't even need to re-render, you answered that on the task tgr, I was thinking about your answer [22:13:29] akosiaris, this is true -- essentially we are trying to add an interactive service to a static page system. [22:14:24] akosiaris: maybe it's just that vega is too complex, so it's not graphs, it's shoehorning vega [22:15:03] Vega is in a way a reflection of MW - allows multiple pages (templates) to be combined for a single view. Ignore Vega -- any other graph would have the same issue -- you have external data that needs to be combined. [22:15:14] Tim had a great idea, we could just limit what we support to a handful of templates that are nicely designed and expose a much simpler spec to the editors with vega as the back-end [22:15:37] so we implement: info-maps, bar charts, line charts, pie charts [22:15:56] milimetric, won't work. vega already does that -- i created a bunch of basic templates. 
But the end result was a bunch more lua modules that were generating vega on the fly because each use case required something different [22:16:24] and then you end up with as many template parameters as Vega itself, thus overcomplicating things even further [22:16:24] templates are not the right place for that functionality [22:17:01] if by templates you mean pages on the wiki in the template namespace [22:17:03] and then people post something like { type: map, data: { boundaries: world-topo.json, perCountry: data.json, flags: country-icons.json } } [22:17:53] yurik: I think that milimetric's point is that you have a handful of locked down templates that do 1 thing and do it well. You don't end up exposing much of vega's functionality nor so many parameters of vega [22:18:05] by templates I mean like json schemas that define the simplest possible ways to define a handful of charts [22:18:07] a really dumbed down version of vega, I 'd call it [22:18:21] yeah, like vega-duh [22:18:25] and forget about all the bells and whistles and the niceties [22:18:28] basically, a module system [22:18:34] akosiaris, that's exactly what we already have, and it wasn't enough. As for Vega-duh -- its called Vega-Lite :) [22:18:41] right, that is better [22:18:43] it will break a lot of stuff I guess, but it might be an overall ok system? [22:18:44] vega-lite is still too tricky [22:18:59] hm... or is it... [22:19:14] we could just support vega lite, if we can statically analyze the data easily in PHP [22:19:17] but that's a product problem, mixing it with the architecture only confuses things [22:19:19] milimetric, nope, otherwise you are risking of reimplementing it, making your own version of "tricky" :) [22:19:39] it would maybe make graphs more appealing / accessible to editors. It would not really change the architecture. [22:19:39] true [22:19:39] currently we have pages on the wiki in the template namespace that call Lua modules [22:20:05] I'm ok with that, at least to start. 
See this is the nice flexibility I see with decoupling these three phases that tgr pointed out: fetch/render/store [22:20:35] milimetric, i think you are trying to shoehorn both Vega and MW into that paradigm, and it doesn't work too well [22:20:47] you will end up cutting a lot of corners and dup code for that one :) [22:21:10] TimStarling, not sure i understood your point [22:21:41] most templates i saw simply call #invoke - they function as call wrappers [22:22:35] tgr, agree [22:22:38] anyway, I shall keep working on a basic prototype and take discussion to the RFC when I have a new set of options, based on what we talked about here [22:23:33] milimetric, sounds awesome! If you have some time, could you also try to document the specific goals you are trying to solve about Graphoid -- otherwise we may end up with its clone [22:25:01] I think it would be best to bring back the discussion to the stewardship page and try to arrive at a more specific problem statement [22:25:11] \o/ [22:25:15] the RFC defines the problems I'm trying to solve with the rendering service fairly well, and I will add the problems it pushes on other parts of the stack [22:25:39] I still think splitting rendering and fetching would solve a number of issues, I'm just not sure those are the issues it is getting undeployed for [22:26:27] i think the main issue is "no steward". And that issue can be solved by milimetric rewriting it and becoming a steward, or someone (including milimetric ) deciding to become Graphoid steward [22:27:00] stewardship page already reached a decision, based on nobody wanting to steward it. The reasons seem moot as relates to that decision, so I'm hesitant to continue the discussion there. But I agree this discussion is germain [22:27:19] but unless there is an institutional buy-in (e.g. 
a team), it may end up the same way still [22:27:33] maybe, but it will at least live as long as I do :) [22:27:50] hehe [22:28:15] well, the stewardship issue is solved by you volunteering, so presumably that is not the reason for undeploying [22:28:26] this ^^ [22:29:11] I don't think milimetric volunteered to adopt the service as-is though, right? [22:29:37] if milimetric, after doing a POC, realizes that everything he is trying to build is already in Graphoid, he might change his mind :) [22:29:56] and btw, I am willing to help with any cleanup effort [22:30:24] the Graphoid service in and of itself is not that much bigger than the graph extension [22:30:38] so is the issue that the current code is hard to maintain (for some unspecified reason) so it needs rewriting to make stewardship a manageable task? [22:30:40] most of it is cookie-cutter stuff from the service template [22:30:52] it does sound that way, yes [22:31:32] or is the rewriting trying to address the architectural problems brought up on the stewardship task [22:31:53] tgr: the code is too short to be hard to maintain, the rewrite is not really any harder than maintaining, the main issue to me is the architecture and easing SRE burden [22:33:13] writing a specification for fetching data seems a lot harder than leaving that outsourced to the third-party Vega library [22:33:26] I don't think it's trivial effort [22:33:30] milimetric, please please specify SRE burden.
I have seen it used a lot, but no specifics [22:33:58] yurik: what Alex mentions in the stewardship task, it’s very clear [22:34:33] tgr: leaving it in vega implies somehow solving “how to call vega from mediawiki client or server” [22:35:14] you mean solving differently from how it's solved now [22:35:32] milimetric, i have read that ticket, but not sure which part you refer to [22:35:52] the issues the task lists now: [22:35:59] the way it’s solved now leads to SRE burden, so yes, solving while optimizing for that [22:36:09] - no library version pinning (code quality issue, not architectural) [22:36:57] milimetric, you will have to solve “how to call vega from mediawiki client or server” regardless. Let's try to get to the bottom of it if you have some time. BTW, we can do it in a hangout -- might be faster than typing [22:37:07] - uses abstract protocols instead of raw HTTPS URLs, with no clear documentation of them (seems easy enough to fix) [22:37:32] https://hangouts.google.com/call/YTfSO_RYznb91SlOATfIAEEE [22:37:35] - uses the wrong domain name for some services (seems trivial to fix) [22:38:54] tgr, could you give a link to it? [22:38:57] yurik: I can’t, I’m at dinner :) [22:39:01] - "unorthodox architecture" which is where the whole discussion about render vs fetch vs cache came from [22:39:12] milimetric is multitasking :D [22:39:20] yurik: the task? sure, it's https://phabricator.wikimedia.org/T211881 [22:39:44] tgr, that's the one i have been looking at ... weird, which comment? [22:39:53] the description [22:40:01] this is in the task description (I'm paraphrasing heavily) [22:40:15] also going bottom up for some reason [22:40:15] ah, ok, sorry, misunderstood [22:40:55] so yes, i think the biggest issue is the "unorthodox architecture", which (in my mind) is simply the result of the task at hand, not some evil scheming on my part :) [22:41:19] i.e.
the architecture was dictated by the problem [22:41:43] tgr: are you suggesting just fixing and adopting the current service? I think it’s actually easy enough to rewrite on node 10 with ES6 syntax and await/async type stuff, it’s literally like 40 lines and might as well if I’m going to own it. I think anyone would do the same [22:42:34] even yuri if he were faced with having to deploy a new service [22:42:41] milimetric: as a general rule, engineers always want to rewrite from scratch and it's rarely the right choice, so I would always consider fixing/adopting first [22:42:47] milimetric, you may still save a lot of time if you do it gradually - just replace a few features at a time. [22:43:10] lol... tgr, are you sure you are not my sockpuppet? [22:43:15] tgr: heh, not me, but in this case ... [22:44:10] milimetric, you will see - you will end up with almost the same thing unless you fundamentally change the architecture... and from the above discussion, i haven't heard of any significantly different approach tbh (i wish there was though) [22:44:10] I mean, updating to node 10, sure, if you mean that by rewriting [22:44:35] but that's very different from changing the architecture, pulling functionality out of Vega and reimplementing it [22:45:30] so the problem statement part of "unorthodox architecture" is that it makes benchmarking hard [22:45:41] btw, milimetric, not sure if you are aware, there was a GSoC-style student who did some preliminary work on the "url block" parsing, adding many unit tests, etc. So a lot of this work has already been done [22:45:51] but graphoid does support posting the graph spec, that could be used for benchmarking [22:46:21] is the issue that there's no log to retrieve the potential POST bodies from? [22:46:38] or that it is not easy to start from a wiki page and derive the POST body? [22:46:41] the architecture question is not up to me.
I’m happy to update to node 10 and move on, but I prefer to find consensus so I can find a steward [22:47:15] milimetric, true, so we should figure out the architecture issue ... I think akosiaris was the person? [22:47:34] tgr: the latter [22:47:55] tgr, the POST approach is difficult because it requires a proxy -- something that converts Vega spec hash -> Vega spec [22:48:17] yurik: yes but it’s late so let’s not ping Alex :) [22:48:27] but why do you need the hash? just take the raw wikitext and post it [22:48:29] oh, wow, time flies [22:48:47] or would benchmarking imply using the exact same external data that was used for a hash? [22:48:49] tgr, you can't -- the HTML of the wikipage should have something [22:48:58] (but external data is not represented in the hash anyway) [22:49:45] for benchmarking you can totally do POST. And you can also use it during page editing (if really needed - because you may as well do it locally on the client). [22:50:06] but not for in-production use -- because HTML only contains an image URL, which is a hash [22:51:24] so benchmarking here means crawling Wikipedia pages and checking how long the various URLs take? [22:51:54] is that easier than taking graph specs from the database or something similar? [22:52:20] hmm, now i am not too sure what benchmarking means in our context... [22:52:38] are we talking about integration testing? [22:52:56] I mean, if the current architecture interferes with ops benchmarks, that's a problem. But I don't think the problem is explained well on the task. [22:53:13] I think where the graph spec is stored and how it’s used via templates is a big UX problem, and this is not covered by the stewardship thing because it’s client side. But it’s obvious from no product work and lack of wide adoption [22:53:16] So, as I said, maybe best to take the discussion back there.
[22:54:10] i think the biggest problem with the wider adoption is 1) no Vega-Lite (much easier for simple graphs), and 2) no updates to the latest version - because that's where most of the documentation has been happening [22:54:18] yeah, there's a product problem with it being hard to use. I don't see how that's related to architecture issues. [22:54:46] and that problem can be solved with templates (whatever "template" means in this case) [22:55:08] splitting out the services, at least initially, forces us to consider carefully where to put every piece and what graph engine we want to support. This, in my estimation, will make sure we have the product discussions. So it’s strategic more than technical [22:55:13] maybe a different graph language would be better. Maybe a custom spec that's then used to generate Vega code. All that doesn't really affect the architecture. [22:55:23] it does! [22:55:41] like for example two of those choices may not need a fetch step [22:56:11] (or sorry, they make the fetch step so easy it can happen anywhere) [22:56:30] not really. Any type of a graph lib requires external data -- we are building "data driven visualizations". If the lib supports images (which is a must), those images would be based on data. Thus you already have a dependency [22:56:47] the fetch step depends on how much data you need, not how you process the data [22:57:17] it depends on how easy the spec is to read outside its main implementation [22:57:43] milimetric, you are again falling into the re-implementation trap. The moment you have to look at the spec, you are re-implementing some of the lib [22:57:52] I'm sure it's better to drop image support than to drop graphs altogether, so I don't think there are any musts here. But I still don't see it making the architecture fundamentally simpler. [22:58:10] so: { data: [ url, url1, url2 ] } is easy to read in php, whereas vega specs are not [22:58:53] milimetric, that's not entirely correct.
Reading { data: [ { url: url1}, {url: url2}, ... ] } is just as easy [22:59:03] if we can run the fetch step in php, problems above go away [22:59:10] but you cannot :) [22:59:10] yeah. But can you assume it can be encoded in the image URL? Otherwise, there will still be a fetch step [22:59:36] and plus the whole idea behind Graphoid was to NOT require PHP to do any external access :)))) [22:59:49] a step yes, but one that’s feasible to run in php. Using vega proper is not [23:00:11] what do you gain by running it in PHP, though? [23:00:42] current system: URL hash -> Graphoid -> fetch needed data. Proposed system: URL hash -> MW (assemble all the data) -> Graphoid ? [23:00:53] You don’t have to depend on the client doing it, or the rendering service doing it [23:01:00] better cache invalidation in some cases, because when you use a commons datafile title instead of an URL for example, you can add it to the links table. But no one seems to care deeply about invalidation. [23:01:26] milimetric, is what i wrote what you have in mind? [23:02:00] so if fetch has to happen somewhere, we can pick between php, js client, and service [23:02:27] js client is kind of orthogonal [23:02:28] People seem to hate the last two, so I see choices that make it easy in php as having an impact on architecture [23:02:58] no, js client could do it pre-save every time and fetch a url for an image via a post [23:03:22] and then we’d have all the problems people mentioned [23:03:54] milimetric, sec, let's pretend you know exactly what resources a graph needs [23:04:07] it could, but it would duplicate still-needed server-side behavior for no particular reason [23:04:14] can you describe what happens when an HTML asks for an image by its hash?
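The "easy to read in php" point above can be illustrated with a sketch (written in JavaScript here for brevity): statically walking a spec and collecting every data URL is roughly what a PHP-side fetch step would have to do before handing data to a renderer. Real Vega specs can compute URLs dynamically via signals and transforms, which is exactly why they are harder to analyze than this; the helper and the spec shape below are assumptions for illustration only.

```javascript
// Recursively walk a spec-like object and collect every string `url`
// field. This only handles statically declared URLs -- the simple case
// being argued for in the discussion above.
function collectDataUrls(node, urls = []) {
  if (Array.isArray(node)) {
    node.forEach(child => collectDataUrls(child, urls));
  } else if (node && typeof node === 'object') {
    if (typeof node.url === 'string') urls.push(node.url);
    Object.values(node).forEach(child => collectDataUrls(child, urls));
  }
  return urls;
}
```

Once the URLs are known up front, the fetch step can run anywhere — PHP, client, or a service — which is the architectural flexibility milimetric is after.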
[23:04:53] tgr: still needed is debatable, we could just render a 404 if the client is not able to get an image url from the spec for some reason [23:05:34] yurik: it renders the image, it’s stored directly at that hash in swift [23:06:03] you need to render an HTML page that can be cached by Varnish. That cannot be done client side. No point in coming up with a separate mechanism for page previews, the same mechanism will work fine. [23:06:30] tgr: right now, with graphoid undeployed, that page has a hole in it [23:06:47] ok, so this has nothing to do with POSTing or PHP, because Graphoid could add an image to swift just as well. [23:07:50] tgr: so the difference a js-client-only solution would make is that for most edits, that hole would be filled with a link to a static image, rendered by the simple service I'm proposing [23:07:54] milimetric, if you agree to take over Graphoid, it doesn't need to be undeployed -- esp if there is no clear understanding of why it should be (per tgr 's above comments) [23:08:30] i'm just trying to understand the MVP of a service we are talking about [23:08:30] yurik: no, I don't agree to that, I tried to update dependencies and upgrade it to node 10 and I wasted a day with crazy errors I'd rather not deal with [23:08:49] milimetric, you should have poked me - would have been done in an hour ;) [23:09:02] plus we don't need to support vega 1 or 2 in the new service, just latest vega to start, because 1 and 2 will be client-side only [23:09:08] storing the image to Swift doesn't really change things. If it's not content-addressable (including dependencies), it needs to expire, at which point Swift needs to proxy the request somewhere. If it's content-addressable, that implies fetching the data can be done synchronously. That makes page saving more fragile, and I don't see what benefit you get in exchange.
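tgr's trade-off above can be sketched as a toy store keyed by spec hash: if entries never expire, stale data is baked into the image forever; if they expire, a miss (the "404") has to be proxied to a renderer somewhere. Every name here is invented for illustration, and the synchronous re-render on a miss is a simplification — a real system would more likely serve the 404 and re-render out of band.

```javascript
// Toy image cache with a TTL. renderFn stands in for an HTTP call to the
// rendering service; Swift, Varnish, etc. are abstracted away entirely.
class ImageStore {
  constructor(ttlMs, renderFn) {
    this.ttlMs = ttlMs;
    this.renderFn = renderFn;
    this.entries = new Map(); // hash -> { image, storedAt }
  }

  get(hash, spec, now = Date.now()) {
    const entry = this.entries.get(hash);
    if (entry && now - entry.storedAt < this.ttlMs) {
      return entry.image; // cache hit: serve the stored render
    }
    // Expired or missing: re-render and store (simplified to synchronous).
    const image = this.renderFn(spec);
    this.entries.set(hash, { image, storedAt: now });
    return image;
  }
}
```

The TTL is the knob being debated: an infinite TTL gives content-addressable behavior (and synchronous fetching at save time), a finite one pushes the proxy-on-miss problem onto the storage layer.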
[23:10:10] tgr: it's ok for it to expire at some predetermined interval, and for the 404 to trigger a re-render [23:10:33] milimetric, designing a system that has broken images by design seems weird :) [23:10:55] tgr: or for the 404 to just leave a hole, as is the current case. So my point is, client-side-only fetching strictly improves from the current situation [23:11:08] yurik: well, that's what we have, like it or not, with graphoid undeployed [23:11:38] client-side-only fetching is the status quo once graphoid is shut down, isn't it? [23:11:47] look, we're talking in circles a bit. It is beyond our control to re-deploy graphoid. It's not going to happen. If it were, someone would have fixed graphoid by now, including you, yurik. It just hasn't happened so now that's done. [23:12:16] i was not participating because I was under an assumption that it has to be a WMF employee with server access [23:12:31] i would be totally fine maintaining it -- simple enough [23:12:33] tgr: no, I mean fetching in the context of the architecture here, as in, fetch so a render can happen server-side. Status quo will be client-side-only rendering [23:12:49] client-side-only rendering must be enabled first [23:12:55] ah, I see. [23:13:00] it will be, Seddon is working on that [23:13:01] it's a setting on the graph ext [23:13:09] just flip a switch :)))) [23:13:25] yeah, he's working on the switch, and also I think implementing latest vega [23:13:47] still not clear what's the point. If you can client-side-fetch you can also client-side-render right? [23:14:10] tgr, ok so two very different use cases: [23:14:15] but you can't do that on noJS clients, or on pages with lots of images, or for clients with poor bandwidth...
same issues as client-side rendering [23:14:24] reading: sees hole, client-side renders, slowing everything down [23:14:29] set wgGraphImgServiceUrl=false :) [23:14:36] editing: sees preview, saves, hole is still there for reading [23:14:38] versus: [23:14:48] (so that's the status quo) [23:14:54] versus what I'm talking about: [23:15:15] reading: sees an image, most of the time, unless a bot or something without JS abilities messes with a graph spec [23:15:22] yurik: I know the setting, don't worry :P [23:15:41] editing: sees preview, saves, fetch happens, render is triggered, image url is saved with wikitext [23:16:14] so yes, tgr, you can't edit in the situations you mentioned, but you get rid of a lot of holes [23:16:45] also, you can't edit vega graphs without js... [23:16:54] so basically the graph URL would be user input? [23:17:14] graph URL? You mean static snapshot image URL? [23:17:27] and it would only work when saving pages in VE, not the wikitext editor? [23:17:48] the wikitext editor could have some extension for it, no? It has an onsubmit handler doesn't it? [23:17:53] you seem to be suggesting that the client tells MediaWiki on page save what the URL is [23:18:25] the client just sends wikitext to save, part of that wikitext is a link to a file in swift [23:18:53] milimetric, you do realize that the rendered graph in preview is currently generated by the server side? [23:19:10] I did not, why is that, seems wrong [23:19:12] that would mean the graph applet has to interact with the wikitext or VisualEditor layer in some way [23:19:58] e.g.
you click "preview", the server parses templates, computes the proper content of the tag (pure JSON), attaches that JSON as an extra var in the returned HTML, and then the browser takes that var and gives it to the Vega lib [23:20:07] tgr, yeah, I mean, I guess if people wanted they could POST to it, get the link, and stick it into wikitext themselves [23:20:27] again, seems a lot better than literally holes on every single graph [23:20:49] holes are created by decommissioning a working (but unmaintained) service :) [23:20:51] ...that does not seem like a good way to increase uptake of Graph in the editor community. [23:20:52] yurik: right, because of templates, yeah, yet another reason I'd love to keep everything client-side [23:21:17] but you cannot.... because some graphs are several pages long, and are used in many different pages [23:21:28] tgr: right, which is why I don't love that solution :) I'm just saying how the choice of graphing libraries affects architecture [23:21:30] you are trying to rewrite the whole architecture of mediawiki [23:21:51] per above, any other graphing lib would result in a similar set of problems [23:22:04] mediawiki fundamentally has templates and {{#tag}} feature [23:22:48] I think this conversation got out of hand :) [23:23:38] I think we're just in a very bad position, with graphoid being undeployed [23:23:39] nah, just trying to come up with a reasonable solution that actually addresses the presented problems (and we do have problems, i'm not denying that) [23:23:52] self-inflicted wounds are the best :) [23:24:03] and I think it's unacceptable for doctors to be wasting time with the consequences of that [23:24:58] milimetric, someone had to pull the trigger and cause the problem in the first place... and Covid-19 is an especially good time to cause problems as people are much more data-dependent than ever [23:25:04] so I'd love any help in thinking of a viable short term and long term solution.
I don't know enough about mediawiki, but I don't think we should hack anything in or rewrite mediawiki, but I also don't think we should bend over backwards to support vega as is just to keep from rewriting a tiny handful of graphs that are already in use. [23:25:22] I'll personally rewrite all the graphs if we can get to a better solution that people at WMF are happy to maintain [23:26:01] people are already doing graphs in excel because they cannot use any reasonable data graphing solution with the wiki [23:26:11] yeah, that's not ok [23:26:22] well, this was never a WMF priority AFAIK [23:26:46] (not to be confused with the priorities of individual developers) [23:30:48] so to sum up -- milimetric, i will be happy to work with you (even pair program if needed) to get this resolved. If you propose a new architecture to solve it -- great too, but it should be better than what was used before, and actually solve some of the raised issues. Per tgr -- the issues do not seem to be architectural, but we definitely should try to make it more palatable to everyone. [23:31:36] k :) [23:50:40] https://xkcd.com/2294/ [23:53:00] TimStarling: Too true.