[01:59:11] My name is Oggy.
[01:59:21] Oggy the Otter.
[01:59:49] Yes you are.
[04:11:02] re-hello BRPever
[04:11:48] hello :D
[07:14:54] sjoerddebruin: did you also check the correlation of the WDQS lag with PAWS use? I've been monitoring the lag for some days now, and PAWS seems to tip it over several times... as if it can handle QS alone, but not when PAWS kicks in too...
[07:15:43] AFAIK https://www.wikidata.org/wiki/Special:Contributions/MsynBot is the most active bot on PAWS and he already lowered his edit rate manually
[07:15:55] let me check RC again
[07:16:22] yup https://www.wikidata.org/wiki/Special:RecentChanges?hidecategorization=1&tagfilter=OAuth+CID%3A+429&limit=100&days=1&damaging__likelybad_color=c4&damaging__verylikelybad_color=c5&urlversion=2
[07:19:48] ok, and these items do not seem overly large either
[07:20:25] I will see if I can add these to my R notebook later...
[07:20:32] first the WikidataCon abstract deadlines
[07:28:21] egonw: i still think it's Wikicite related http://wikidata.wikiscan.org/?menu=live&date=24&list=users&sort=tot_size&filter=all
[07:29:17] how do you know?
[07:29:31] I don't know what Arthur has been doing...
[07:29:37] editing items over 1 MB
[07:29:39] https://www.wikidata.org/w/index.php?title=Q56640847&action=history
[07:30:00] I have used QuickStatements a lot this weekend for chemistry
[07:30:30] (but this is just the last 24h, so maybe not a big part of it)
[07:30:40] ah, +1
[07:30:45] okay, clear
[07:31:27] you only have 2.4 GB this month so far http://wikidata.wikiscan.org/?menu=dates&filter=all&sort=tot_size&date=201904&list=users
[07:31:50] I was just looking at http://wikidata.wikiscan.org/utilisateur/Egon_Willighagen :)
[07:32:46] okay, okay...
[07:32:55] I'll shut up now...
[07:33:10] so, we have funding for hardware in our Scholia grant...
[07:33:21] do you see the three Wikicite projects at the top? compare their number of edits and volume :)
[07:33:25] I want to understand all this, the tiny details, so that we can spend that money well...
[07:33:36] Scholia != Wikicite, but clearly closely related
[07:33:56] Scholia needs a lag-free query service, I assume
[07:34:00] KrBot is not Wikicite, is it? I mostly know it for chemistry edits
[07:34:20] lag-free would help, but anything <1h would be quite acceptable, IMHO
[07:34:25] yeah, those are average-sized though
[07:34:32] we're not pitching it as a live system
[07:34:37] though that part has been cool
[07:35:17] I think it is quite important to keep the lag low, otherwise it disrupts a lot of workflows
[07:35:30] ^
[07:36:18] pintoch: sure! but that was not what sjoerddebruin asked
[07:36:38] i.e. there are plenty of reasons the lag should be low, but Scholia is not one of them
[07:37:05] SourceMD is an example for which it seems essential
[07:37:11] ah right, ok
[07:37:47] but thanks egonw, i see what you mean about PAWS
[07:37:49] Daniel has been in contact with the hardware people... I do like to learn how we can contribute to making the Wikidata platform better
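For reference, a minimal sketch (not from the chat) of how the lag/PAWS correlation discussed above could be sampled: it reads the WDQS update lag from the service's schema:dateModified triple and counts recent edits carrying the "OAuth CID: 429" tag taken from the RecentChanges link, so both series could later be loaded into something like the R notebook mentioned at 07:20. The tag, the sampling interval and the script structure are assumptions for illustration only.

```python
# Sketch: sample WDQS lag and tagged edit counts side by side, so the two
# series can later be correlated elsewhere (e.g. in an R notebook).
# Assumption: the "OAuth CID: 429" tag identifies the PAWS/QuickStatements
# edits from the RecentChanges link earlier in the log.
import datetime
import time
import requests

WDQS = "https://query.wikidata.org/sparql"
API = "https://www.wikidata.org/w/api.php"
TAG = "OAuth CID: 429"  # assumed tag, copied from the RC URL above
UA = {"User-Agent": "lag-monitor-sketch/0.1"}

def wdqs_lag_seconds():
    """Lag = now minus the last update time the query service knows about."""
    query = "SELECT ?t WHERE { <http://www.wikidata.org> schema:dateModified ?t }"
    r = requests.get(WDQS, params={"query": query, "format": "json"}, headers=UA)
    r.raise_for_status()
    t = r.json()["results"]["bindings"][0]["t"]["value"]
    last = datetime.datetime.fromisoformat(t.replace("Z", "+00:00"))
    return (datetime.datetime.now(datetime.timezone.utc) - last).total_seconds()

def tagged_edits_last_minutes(minutes=5):
    """Count recent changes carrying TAG within the last few minutes."""
    start = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(minutes=minutes)
    r = requests.get(API, params={
        "action": "query", "list": "recentchanges", "format": "json",
        "rctag": TAG, "rclimit": 500,
        # rcstart defaults to "now"; rcend is the older boundary of the window
        "rcend": start.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }, headers=UA)
    r.raise_for_status()
    return len(r.json()["query"]["recentchanges"])

if __name__ == "__main__":
    # One CSV-ish sample every five minutes: timestamp, lag (s), tagged edits.
    while True:
        now = datetime.datetime.now(datetime.timezone.utc).isoformat()
        print(f"{now},{wdqs_lag_seconds():.0f},{tagged_edits_last_minutes(5)}")
        time.sleep(300)
```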
[07:38:10] https://grafana.wikimedia.org/d/000000170/wikidata-edits?refresh=1m&orgId=1&from=now-3h&to=now
[07:40:02] potentially the pool of WDQS servers could be split into various latency levels (some would have low lag with a low timeout threshold and others a higher timeout threshold) and people could choose their pool depending on their needs
[07:40:33] as in, if I am just resolving items by IDs, the request is cheap to exe
[07:40:35] *execute
[07:40:49] so I would use the low-lag, low-timeout-threshold pool
[07:40:54] pintoch: I was thinking the same thing too...
[07:41:04] it's clearly the old cluster that has the biggest issues...
[07:41:13] so SourceMD should only use the new cluster
[07:41:27] that would solve at least some duplication issues
[07:44:13] or SourceMD could use the search engine with "haswbstatement" to look up items by IDs instead of SPARQL
[07:44:32] yes, which it actually may do...
[07:44:51] I was assuming it used SPARQL because of it incorrectly detecting whether some DOI was already present
[07:45:18] wasn't that some imperfect regex?
[07:46:07] I don't see issues anymore with DOI casing differences, if that is what you mean
[07:46:24] I do see items created occasionally for the same DOI soon after each other
[07:46:45] (not often, and I fix them when I see them)
[07:48:21] mm http://wikidata.wikiscan.org/gimg.php?type=edits&date=6&size=big
[07:49:06] it seems like when stuff goes above 500 edits total per minute, it lags...
[07:50:18] 500x1MB ... maybe some network bandwidth limit on an intermediate router, you think?
[07:51:23] does the lag correlate with the CPU/system load on the (lagging) WDQS servers?
[07:51:48] those are not very readable to me, honestly https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1
[07:52:38] but click wdqs1004 on updates per second and compare it to the lag...
[07:52:46] on none of the servers does the load even hit 60 (I assume %?)
[07:53:07] how do I do that?
[07:53:20] click on 1004 in the legend
[07:53:32] https://www.dropbox.com/s/g1ourzssebzmd58/Schermafdruk%202019-04-29%2009.53.28.png?dl=0
[07:53:40] https://www.dropbox.com/s/8d2rdohvv8lbwf7/Schermafdruk%202019-04-29%2009.53.38.png?dl=0
[07:54:21] same for batch progress
[07:54:35] it takes breaks or something...
[07:56:46] looking at these plots, I cannot see why the lag should go up
[07:59:07] ok, need to finish those WikidataCon abstracts first now
[09:54:22] ok, WikidataCon19 abstracts (2) submitted
[09:56:11] good timing
[09:57:53] for some reason I assumed applications would close around midnight, but apparently it’s at noon UTC+2
[09:57:55] i.e. in three minutes
[10:56:20] PROBLEM - High lag on wdqs1003 is CRITICAL: 3667 ge 3600 https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[10:57:18] welp
[11:04:26] Lucas_WMDE: is this being actively discussed there? I'm running out of ideas for now :|
[11:11:48] SMalyshev: would you like to confirm pintoch's message here? it might convince him https://www.wikidata.org/wiki/User_talk:ArthurPSmith#Query_service_lag
[11:14:37] 1.2 MB per edit in the last 24 hours... hmm
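For reference, a minimal sketch (not from the chat) of the haswbstatement lookup suggested at 07:44: resolving a DOI to an item through the search index instead of SPARQL. Using P356 (DOI) and uppercasing the value follow common Wikidata practice; the example DOI is a placeholder.

```python
# Sketch: resolve a DOI to Wikidata item IDs via CirrusSearch's haswbstatement
# keyword instead of SPARQL, as suggested in the log. P356 is the DOI property;
# DOIs are normally stored uppercase on Wikidata, hence the .upper().
import requests

API = "https://www.wikidata.org/w/api.php"

def items_with_doi(doi):
    """Return the QIDs of items whose P356 (DOI) statement matches `doi`."""
    r = requests.get(API, params={
        "action": "query", "list": "search", "format": "json",
        "srsearch": f"haswbstatement:P356={doi.upper()}",
        "srlimit": 10,
    }, headers={"User-Agent": "doi-lookup-sketch/0.1"})
    r.raise_for_status()
    return [hit["title"] for hit in r.json()["query"]["search"]]

# Placeholder DOI for illustration: a tool would skip item creation if this
# returns anything, and flag the duplicate-items case discussed above if it
# returns more than one QID.
print(items_with_doi("10.1000/EXAMPLE.DOI"))
```

Because this goes through the search index rather than the query service, it is unaffected by WDQS update lag, which is exactly why it was suggested for SourceMD-style duplicate checks.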
[11:52:35] sjoerddebruin: we’ll probably pick up the maxlag task tomorrow, and hopefully get it done before next week’s deployment
[11:52:45] Lucas_WMDE: woah
[11:53:01] well, perhaps that’s too optimistic for the deployment
[11:53:05] but we should start working on it soon ^^
[13:49:30] ACKNOWLEDGEMENT - High lag on wdqs1003 is CRITICAL: 5343 ge 3600 Gehel update lag above threshold, but starting to recover https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?orgId=1&panelId=8&fullscreen
[14:38:47] okay, this is terrible
[15:00:04] sjoerddebruin: it's been far worse :)
[15:02:13] it’s interesting that the three eqiad servers (wdqs100*) are all lagged whereas the three codfw ones (wdqs200*) seem to be more or less fine
[15:02:29] as far as I’m aware, they should need the same updates
[15:02:33] but eqiad sees the majority of the query load
[15:03:16] what is the criterion to route requests to one or the other?
[15:04:47] not sure, https://wikitech.wikimedia.org/wiki/Wikidata_query_service#Hardware seems to indicate it’s geographical
[15:04:55] but eqiad and codfw are both in the US
[15:05:07] fairly close to each other too, if I recall correctly
[15:05:10] one or two states apart
[15:06:19] okay, never mind that last bit, it’s 2000 km after all :) https://www.openstreetmap.org/directions?engine=fossgis_osrm_car&route=39.0437%2C-77.4875%3B32.9537%2C-96.8903#map=6/36.084/-87.189
[15:26:33] Lucas_WMDE, pintoch, sjoerddebruin: ok, we seem to have funding in the Scholia grant to alleviate some of the burning questions... i.e. we need to know how/what to contribute... who can I talk to about this?
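For reference, a minimal sketch (not from the chat) of the client-side half of the maxlag mechanism mentioned at 11:52: edits are sent with the standard MediaWiki maxlag parameter, and the client backs off when the API reports lag. The helper name and payload handling are illustrative, and it assumes the session already carries the needed edit token; the Retry-After handling follows documented API behaviour.

```python
# Sketch: send API edits with the standard maxlag parameter and back off
# politely whenever the servers report lag. Once the maxlag task discussed
# above lands, the same mechanism would also throttle clients on WDQS lag.
import time
import requests

API = "https://www.wikidata.org/w/api.php"

def post_with_maxlag(session, payload, maxlag=5, max_retries=5):
    """POST an API request (payload assumed to include any required token)."""
    payload = dict(payload, maxlag=maxlag, format="json")
    for _ in range(max_retries):
        r = session.post(API, data=payload,
                         headers={"User-Agent": "maxlag-sketch/0.1"})
        r.raise_for_status()
        data = r.json()
        if data.get("error", {}).get("code") == "maxlag":
            # Servers are lagged; Retry-After suggests how long to wait.
            time.sleep(int(r.headers.get("Retry-After", 5)))
            continue
        return data
    raise RuntimeError("still lagged after several retries")
```

Some frameworks (pywikibot, for example) already do this by default; the point is simply that batch tools pause automatically instead of piling more edits onto an already lagged service.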