[00:50:23] Analytics, Analytics-EventLogging, Better Use Of Data, Event-Platform, and 4 others: Modern Event Platform: Stream Configuration: Implementation - https://phabricator.wikimedia.org/T233634 (jlinehan) >>! In T233634#5582558, @Nuria wrote: >>Default behavior: if no config for stream, just return em...
[03:12:30] Analytics, Analytics-Kanban, Chinese-Sites: MediaWiki history dumps have some events in 2025 - https://phabricator.wikimedia.org/T235269 (Shizhao)
[03:48:38] Analytics, Analytics-EventLogging, Better Use Of Data, Event-Platform, and 4 others: Modern Event Platform: Stream Configuration: Implementation - https://phabricator.wikimedia.org/T233634 (Nuria) @jlinehan I think we will need to live with a use case in which no config is required if loading the...
[05:06:20] (CR) Mforns: [V: +2] Fix report name in wmcs config [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/543719 (https://phabricator.wikimedia.org/T232671) (owner: Srishakatux)
[05:37:12] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019), Patch-For-Review: Puppetize reportupdater to run wmcs reports - https://phabricator.wikimedia.org/T235718 (mforns) @srishakatux This is the puppet patch. When it gets merged by one of our ops people, RU will start to execute your job p...
[08:28:15] hello folks, I realized that I didn't update this han
[08:28:19] *chan
[08:28:29] so today I upgraded archiva to 2.2.4
[08:28:35] \o/!
[08:28:44] that worked, but a major problem came up
[08:28:52] ?
[08:29:12] namely that /var/lib/archiva/conf/archiva.xml is overwritten by the deb, AND we don't have it in puppet
[08:29:21] so files are there, but archiva lost its config
[08:29:23] Aouch :(
[08:29:25] ldap, repositories..
[08:29:36] BUT, we should have a copy in bacula
[08:29:45] so hopefully I should restore soon
[08:30:45] :( Let me know if you need any help elukey
[08:31:26] I tried to re-configure manually but it takes a long time + LDAP config is tricky, so I hope that bacula will save the day
[08:31:38] the next step will be to add archiva.xml to puppet of course
[08:31:50] I can imagine
[08:32:09] the last time I think that I didn't get this problem since I created a new VM
[08:32:37] And therefore you could copy the XML config?
[08:33:03] yes exactly, since we backup /var/lib/archiva
[08:33:06] where the xml is
[08:33:12] makes sense
[08:33:35] the main problem is that I scheduled a restore job in bacula (first time ever) and it will take a bit IIUC
[08:33:43] because there are a ton of jobs running
[08:33:47] k
[08:47:00] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Create test Kerberos identities/accounts for some selected users in hadoop test cluster - https://phabricator.wikimedia.org/T212258 (elukey) >>! In T212258#5582604, @Isaac wrote: > @elukey I'm having trouble ssh-ing into an-tool1006.eqia...
[09:04:42] archiva should be working now!
[09:05:57] elukey: a lot faster!!
[09:06:03] (web UI)
[09:13:29] I hope that the builds are ok as well :D
[09:14:33] elukey: artifact serving works indeed :)
[09:16:01] joal: super, thanks a ton for testing
[09:16:05] feeling better now
[09:17:29] joal: if you want I can also attempt to upgrade eventlogging
[09:17:45] sure elukey - no problem with me
[09:17:59] elukey: if you don't mind, let's wait 1/2 hour (I'm in a meeting)
[09:18:08] ah sure
[10:15:29] joal: ok to proceed?
[10:15:52] Yessir ! sorry, meeting took longer than expected - We're closing, I'm in :)
[10:15:58] ack!
[10:18:57] !log move eventlogging to python 3
[10:18:59] done!
[10:18:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:19:25] * joal watches heavily the chan for possible errors
[10:20:41] yeah there are sigh
[10:21:55] never seen this in deployment-prep
[10:22:06] basically there seem to be some json events not serializable
[10:22:11] I guess I have to rollback
[10:22:23] wow
[10:22:25] :(
[10:26:00] !log rollback eventlogging back to Python 2, some errors (unseen in tests) logged by the processors
[10:26:02] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:26:07] all good now
[10:26:35] :S
[10:26:41] Not nice elukey :(
[10:26:49] rollback all good?
[10:28:34] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Upgrade eventlogging to Python 3 - https://phabricator.wikimedia.org/T233231 (elukey) Tried to deploy in prod, but had to rollback due to the following exception logged in the processors (causing them to fail and rebalance etc..): ` Oct...
[10:28:36] yep yep
[10:28:53] I am sad since I didn't see that error in deployment-prep
[10:29:22] of course
[10:30:40] thank you so much for yesterday's deployment joal, I'll begin backfilling tops shortly
[10:31:34] no prob fdans - the train goes on :)
[10:38:13] going to eat something!
[10:38:19] Analytics, Analytics-Kanban: Add Mon Wikipedia to analytics setup - https://phabricator.wikimedia.org/T235747 (jhsoby)
[11:39:42] I think that the el fix is https://gerrit.wikimedia.org/r/#/c/eventlogging/+/543832
[11:40:50] elukey: I was looking at EL code and couldn't trace back to that
[11:40:58] But I trust you
[11:41:37] it is weird, I thought I had fixed it, plus in labs in theory it should have surfaced
[11:42:02] elukey: seems related to error-event only, no?
[11:43:24] joal: it seems so, but it is in the writer.send(), and the logs show a 'u..' event when logging that it is not serializable
[11:44:08] yup
[11:44:24] weird :(
[11:44:52] cause the error message is in the form of L123 in processor code
[11:47:29] joal elukey I don't mean to add more poop to the toaster but I get this when querying the new tops keyspace:
[11:47:34] https://www.irccloud.com/pastebin/KzYcxL06/
[11:48:02] as you can see cqlsh crashes and exits
[11:49:25] fdans: no "granularity" column in that table
[11:49:49] joal: oh you're right, but that's still a pretty weird error
[11:49:56] fdans: "year", "month", "day" however
[11:52:17] joal: yes I'm still getting the same thing, I'll investigate further
[11:52:21] when i curl i get "Error in Cassandra table storage backend"
[11:54:25] joal: so from https://docs.confluent.io/current/clients/confluent-kafka-python/#confluent_kafka.Producer.produce it seems that produce needs a string
[11:54:33] in there we pass a binary string
[11:54:39] fdans: I have the problem for CQL I think: weird UTF8 double quotes in your query
[11:55:28] fdans: this has worked for me: select * from "local_group_default_T_mediarequest_top_files".data where "referer" = 'all-referers' and "media_type" = 'all-media-types' and "_domain"='analytics.wikimedia.org' and year = '2019' and month = '05' and day = '17' limit 10;
[11:56:13] joal: OMG THE APPLE NOTES APP
[11:56:25] joal: I'm sorry :(
[11:56:42] fdans: fairly classical as well unfortunately
[11:56:44] :)
[11:57:32] joal: I started using it this morning for commands and little snippets that are repetitive but easily indexable, like cassandra queries :(
[11:57:50] fdans: Use sublime :-P
[11:58:35] * joal hides before fdans throws tables all over the chan
[11:58:36] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Upgrade eventlogging to Python 3 - https://phabricator.wikimedia.org/T233231 (elukey) Quick test: ` elukey@eventlog1002:~$ python3 Python 3.5.3 (default, Sep 27 2018, 17:25:39) [GCC 6.3.0 20170516] on linux Type "help", "copyright", "cr...
[11:58:42] joal: that's what I've been using so far, but notes is easier to make permanent without having to find a place to store them
[11:58:59] I'd love a nice notes application that is code friendly
[11:58:59] I don't know notes - Was just trolling
[11:59:08] interesting fdans
[11:59:26] notes is pretty wonderful but these encoding shenanigans are a total bummer
[11:59:47] joal: ah no wrong parameter sorry, we need bytes
[11:59:49] ok now I am confused
[11:59:51] ahahaa
[12:00:06] joal: in other news, we can proceed to backfill the whole range
[12:00:14] elukey: I'm not even trying to understand :)
[12:00:51] python strings have always been one thing that discourages me using python (but I still like it for other reasons, so use it sometimes)
[12:01:39] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Upgrade eventlogging to Python 3 - https://phabricator.wikimedia.org/T233231 (elukey) From https://docs.confluent.io/current/clients/confluent-kafka-python/#confluent_kafka.Producer.produce it seems that produce() wants a string bytes pa...
[12:01:53] joal: please support my craziness nodding like you would do with a mad man
[12:02:09] I'll stop I promise :D
[12:02:30] elukey: nodding as you'd do to a mad man to a non-mad man would probably drive him mad?
[12:02:34] :)
[12:04:09] * fdans is testing an oozie job without testing the underlying hive query first
[12:04:22] * fdans the gods are not pleased with my hubris
[12:04:37] * joal keeps quietly hidden
[12:06:14] joal: exactly!
[12:06:19] but we are used to it in here right?
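The CQL failure discussed in this log was traced to typographic ("curly") quotes pasted from a notes app. A minimal sketch of how such a snippet could be normalized before pasting it into cqlsh, assuming only the four common curly quote characters are involved. The function name normalize_quotes and the shortened sample query are hypothetical and only for illustration.

```python
# Map typographic quotes, as inserted by some note-taking apps,
# back to the plain ASCII quotes that CQL expects.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}
_TABLE = str.maketrans(SMART_QUOTES)


def normalize_quotes(query: str) -> str:
    """Return the query with curly quotes replaced by ASCII quotes."""
    return query.translate(_TABLE)


# Hypothetical pasted query containing curly quotes.
pasted = (
    "select * from \u201cexample_keyspace\u201d.data "
    "where \u201creferer\u201d = \u2018all-referers\u2019 limit 10;"
)
print(normalize_quotes(pasted))
```

This only rewrites the quote characters and leaves the rest of the query untouched, so it is safe to run on snippets that already use ASCII quotes.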
[12:06:48] I have no clue what you're talking about elukey B-)
[12:08:31] (PS1) Fdans: Add historical backfilling queries for mediarequests per file [analytics/refinery] - https://gerrit.wikimedia.org/r/543837 (https://phabricator.wikimedia.org/T228149)
[12:16:12] (PS2) Fdans: Add historical backfilling queries for mediarequests per file [analytics/refinery] - https://gerrit.wikimedia.org/r/543837 (https://phabricator.wikimedia.org/T228149)
[12:31:03] * fdans the oozie gods are merciful and did not fail the job this time
[12:32:12] brb
[12:36:46] (PS3) Fdans: Add historical backfilling queries for mediarequests per file [analytics/refinery] - https://gerrit.wikimedia.org/r/543837 (https://phabricator.wikimedia.org/T228149)
[12:56:32] fdans: the oozie error about the test keyspace is me having deleted it yesterday - I'm sorry for that :(
[13:00:47] joal: nono it was a different thing, I noticed the keyspace had been gone earlier :)
[13:01:06] Ah ok - sorry nonetheless, I should have told you
[13:01:10] fdans: --^
[13:01:33] joal: jesus does loading tops to cassandra take a million years
[13:02:39] fdans: Please let me know if I'm wrong but I think what actually takes time is computing the top - Cassandra step should be relatively fast :)
[13:02:56] joal: yes sorry I meant per file
[13:03:11] I've been mixing the names in my head lately
[13:03:18] Yes - computing top values over relatively big datasets is expensive
[13:04:38] While it's actually not so expensive to keep an ongoing-top job over stream of data, and send values at regular intervals (ping ottomata :-P)
[13:09:31] joal: we can start backfilling this if you have no issues with the query - I just tested the job successfully
[13:09:33] * fdans https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/543837/
[13:14:43] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Upgrade eventlogging to Python 3 - https://phabricator.wikimedia.org/T233231 (elukey) >>! In T233231#5583298, @elukey wrote: > Tried to deploy in prod, but had to rollback due to the following exception logged in the processors (causing...
[13:16:25] joal: meeting?
[13:16:49] YES !
[13:18:40] (CR) Joal: "some comments" (4 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/543837 (https://phabricator.wikimedia.org/T228149) (owner: Fdans)
[13:18:43] fdans: --^
[13:21:09] joal: no need for a coalesce on referer because all referers is already included in the map used by the lateral view
[13:21:30] fdans: then maybe no grouping-set?
[13:22:58] oh shit you’re right
[13:58:24] I am missing some trivial bit
[13:58:33] where does eventlogging publish invalid events?
[13:58:47] I thought it was on a topic in kafka (eventlogging-errors)
[13:59:15] I think that's it elukey
[13:59:59] but I can't find it
[14:00:10] also I can see
[14:00:12] Oct 17 10:17:59 eventlog1002 eventlogging-processor@client-side-01[11799]: 2019-10-17 10:17:59,838 [11799] (MainThread) root [INFO] Publishing invalid raw events to
[14:00:16] kafka-confluent:///kafka-jumbo1001.eqiad.wmnet:9092,kafka-jumbo1002.eqiad.wmnet:9092,kafka-jumbo1003.eqiad.wmnet:9092,kafka-jumbo1004.eqiad.wmnet:9092,kafka-jumbo1005.eqiad.wmnet:9092,kafka-jumbo1006.eqiad.wmnet:9092?topic=eventlogging_{schema}&message.send.max.retries=6,retry.backoff.ms=200.
[14:02:01] ah wait
[14:02:02] eventlogging_EventError
[14:11:44] Analytics, Analytics-EventLogging, Better Use Of Data, Event-Platform, and 4 others: Modern Event Platform: Stream Configuration: Implementation - https://phabricator.wikimedia.org/T233634 (Ottomata) We might be able to satisfy both of you! > EventLogging JS API will get stream config from calle...
[14:16:01] need to take my cat to the vet, will be back for standup!
[14:17:06] (PS4) Fdans: Add historical backfilling queries for mediarequests per file [analytics/refinery] - https://gerrit.wikimedia.org/r/543837 (https://phabricator.wikimedia.org/T228149)
[14:17:55] (CR) Fdans: Add historical backfilling queries for mediarequests per file (4 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/543837 (https://phabricator.wikimedia.org/T228149) (owner: Fdans)
[14:18:04] joal: all comments addressed
[14:19:31] gotta run an errand, bbiab
[14:45:03] Analytics, Operations, SRE-Access-Requests: SSH access for Lex Nasser, analytics intern - https://phabricator.wikimedia.org/T235688 (crusnov) p:Triage→Normal a:Dzahn
[14:56:16] Analytics, Analytics-EventLogging, Better Use Of Data, Event-Platform, and 4 others: Modern Event Platform: Stream Configuration: Implementation - https://phabricator.wikimedia.org/T233634 (jlinehan) >>! In T233634#5583898, @Ottomata wrote: > We might be able to satisfy both of you! > >> EventLo...
[14:58:57] b
[15:01:10] back!
[15:04:27] Analytics, Analytics-Kanban, Performance-Team (Radar): Upgrade python-kafka to 1.4.7 - https://phabricator.wikimedia.org/T234808 (Gilles) @elukey what does your Monday look like next week (21st)? Preferably during EU morning for me.
[15:09:08] Analytics, Analytics-Kanban, Performance-Team (Radar): Upgrade python-kafka to 1.4.7 - https://phabricator.wikimedia.org/T234808 (elukey) >>! In T234808#5584214, @Gilles wrote: > @elukey what does your Monday look like next week (21st)? Preferably during EU morning for me. I'll be out next week for...
[15:14:31] Analytics, Research: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188 (leila) @JAllemandou should we expect to see this kind of change recorded in https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest#Changes_and_known_problems_since_2015-03-04 ? That...
[15:40:37] joal when you are back:
[15:40:43] I can't compile refinery-core
[15:40:52] UAParser.java
[15:40:53] cachingParser = new CachingParser(10000);
[15:40:54] but
[15:40:57] https://github.com/ua-parser/uap-java/blob/master/src/main/java/ua_parser/CachingParser.java#L38-L40
[15:41:15] Error:(78, 47) java: incompatible types: int cannot be converted to java.io.InputStream
[15:46:38] does it need archiva? I have upgraded today to 2.2.4
[15:46:56] IIUC yesterday Joseph did a release
[15:49:39] hm
[15:49:54] i'd imagine that i'd get a different error if i was getting a wrong ua parser jar
[15:49:59] actually, i can compile fine via maven directly
[15:50:04] it's just intellij that fails
[15:50:05] ah!
[15:50:06] so probably a problem there.
[15:55:57] Analytics, Analytics-Kanban, User-Elukey: Create test Kerberos identities/accounts for some selected users in hadoop test cluster - https://phabricator.wikimedia.org/T212258 (Isaac) Yep, now able to access -- thanks! I'll do my best to test today and report back with an lgtm if no issues arise.
[16:02:23] nuria: standup!
[16:03:02] isaacj: thanks a lot! Please don't rush, only when you have time
[16:03:28] also you are the first one testing kerberos
[16:03:35] let me know if it is too weird etc..
[16:04:19] sounds good. for my sake, it's easier to just do it soon because otherwise it'll end up on a list of things that I forget to do :)
[16:04:40] ack :)
[16:08:49] ottomata: Our version of the uap-java code has a constructor for int only - does your pom version reflect new ?
[16:09:34] ottomata: Also - Do you mind if I hijack your patch for spark-2.4.4, adding to it the modifications I find?
[16:18:16] Analytics, Research: Taxonomy of new user reading patterns - https://phabricator.wikimedia.org/T234188 (JAllemandou) @leila : Indeed doc is available, but in https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageview_hourly, not in webrequest page. I wonder if we shouldn't always add lines...
[16:18:51] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Upgrade eventlogging to Python 3 - https://phabricator.wikimedia.org/T233231 (elukey) @Ottomata this is surely unrelated, but should we upgrade librdkafka on eventlog1002? ` elukey@eventlog1002:~$ apt-cache policy librdkafka1 librdkafka...
[16:35:42] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Upgrade eventlogging to Python 3 - https://phabricator.wikimedia.org/T233231 (Ottomata) Sure!
[16:45:33] (PS1) Ottomata: Add HDFSCleaner to aid in cleaning HDFS tmp directories [analytics/refinery/source] - https://gerrit.wikimedia.org/r/543897 (https://phabricator.wikimedia.org/T235200)
[16:45:48] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Create HDFS /tmp/ cleaner - https://phabricator.wikimedia.org/T235200 (Ottomata)
[17:20:32] Analytics, Analytics-Kanban, Performance-Team (Radar): Upgrade python-kafka to 1.4.7 - https://phabricator.wikimedia.org/T234808 (Gilles) I can't tomorrow and the week after I'm on vacation. @krinkle could you take care of this the week of Oct 27?
[17:37:04] Trying again - ottomata - Can I hijack your patch for spark-2.4.4, adding to it the modifications I find?
[17:43:35] milimetric: re GEOEDITORS again, i saw your comment here https://phabricator.wikimedia.org/T131280
[17:45:40] nuria: which one (was just reading Asaf's recent comments there)
[17:45:46] milimetric: ok, yes, we are saying the same thing in a case with 15 edits and three country buckets of (1-10, argentina), (1-10, ecuador), (1-10, guayana) gives you a fuzzy probability of (1/15)/3 of an editor being in one of those countries. the real probabilities might be 1/15 and 13/15 in the most extreme cases
[17:46:18] milimetric: but you already acknowledged that on your comment
[17:46:52] milimetric: so let's 1) create documents for this dataset release (for now on our google drive until it is public) and 2) i will merge change
[17:47:43] milimetric: makes sense?
[17:47:58] nuria: I can just make the docs on wikitech and keep a little note on top that says "this is not yet published but will be soon". That seems better, then if anyone has comments we can get them as early as possible
[17:48:13] milimetric: sounds good, too,
[17:48:25] I'll make a subpage of the current dataset and explain:
[17:48:27] the data
[17:48:29] the privacy review
[17:48:36] the country blacklist
[17:48:41] milimetric: this is the comment:
[17:48:42] the bucketing (and its limitations)
[17:48:48] https://www.irccloud.com/pastebin/qA9Ia02A/
[17:49:19] right, where I did some math. I still think that's a good amount of fuzziness that we're adding, like, to me 5% fuzziness would be great
[17:49:57] milimetric: on that i disagree i think the differences are too small given how small the numbers are but that's ok
[17:50:40] (CR) Nuria: [C: +2] "My comments had been addressed by @milimetric on ticket. We shall document the shortcomings of bucketing approach for such a small number," [analytics/refinery] - https://gerrit.wikimedia.org/r/530878 (https://phabricator.wikimedia.org/T131280) (owner: Milimetric)
[17:50:40] :) yep, it's ok to disagree
[17:51:07] I'll focus on the queue today and write docs later if I can
[17:51:30] milimetric: differences of 5% matter in sets of thousands or hundreds, in sets of 10 elements they do not add much
[17:52:21] I'm just thinking from the point of view of an attacker, 90% likelihood would be one thing and 85% likelihood might be reason to pause
[17:52:34] but yeah, I don't think we're going to agree on this :)
[17:53:43] milimetric: this data does not even need an attack to reveal loads of info, that is what makes it useful but also risky
[18:01:33] !log update librdkafka on eventlog1002 and restart eventlogging
[18:01:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:02:47] ottomata: --^
[18:26:56] (PS1) Framawiki: query-status.html: hide Explain button until bug is solved [analytics/quarry/web] - https://gerrit.wikimedia.org/r/543923 (https://phabricator.wikimedia.org/T205214)
[18:28:02] ※\(^o^)/※
[18:30:16] hehe
[18:32:34] ottomata: o/ - is there a gotcha that I have to keep in mind when copying events from eventlogging-client-side to deployment-prep to avoid the Unable to process? (as opposed to unable to validate)
[18:34:03] ah perhaps
[18:34:08] where are you copying them from?
[18:34:24] kafkacat from jumbo
[18:34:32] hm from the topic itself hm
[18:34:37] then copy paste to a file (Tried with and without tabs)
[18:34:45] and then kafkacat -P
[18:34:50] in deployment-prep
[18:34:53] hm
[18:35:02] there shouldn't be
[18:35:03] but i will try too
[18:35:39] no I mean I am trying to insert a prod one (that may fail) into deployment-prep
[18:36:30] aye
[18:36:40] btw i wouldn't even try to produce it to kafka
[18:36:45] just pipe it to el processor
[18:36:53] ahh interesting
[18:37:09] cat raw.event.txt | eventlogging-processor '%q %{recvFrom}s %{seqId}d %D %{ip}i %u' stdin:// /tmp/elce.1.txt stdout://
[18:37:11] i usually do
[18:37:16] export PYTHONPATH=/srv/deployment/eventlogging/analytics
[18:37:20] export PATH=$PATH:/srv/deployment/eventlogging/analytics/bin
[18:37:54] oops
[18:37:55] sorry
[18:38:08] cat raw_events.txt | eventlogging-processor '%q %{recvFrom}s %{seqId}d %D %{ip}i %u' stdin:// stdout://
[18:38:08] like that
[18:40:21] Analytics, Analytics-Kanban, User-Elukey: Shorten the time it takes to move files from hadoop to dump hosts by Kerberizing/hadooping the dump hosts - https://phabricator.wikimedia.org/T234229 (elukey) >>! In T234229#5581817, @Bstorm wrote: > Well, so far, we generally coordinate on maintenance, reboo...
[18:41:44] elukey: does that work?
[18:42:35] trying
[18:44:47] still unable to process
[18:44:57] I am keeping the tabs in the raw file txt
[18:47:36] anyway, will restart tomorrow
[18:47:47] it is getting late and I am clearly not finding the bug today
[18:47:49] uff
[18:47:57] thanks for the help!
[18:47:58] o/
[18:48:35] bye elukey
[18:50:20] ottomata: news on the CachingParser side?
[19:01:39] elukey: if you give me an event i can try
[19:01:48] joal: i have no problem compiling in maven
[19:01:57] i think that i had to maybe update my intellij maven projects
[19:02:06] i still can't compile in intellij, but not due to a different error
[19:02:46] ottomata: ok - looks like I can't really help :(
[19:03:02] stopped working on it tho so no worries!
[19:03:11] nuria: dsaez: mforns: is it fine if I just add the dates of the events along with the information I have? Turnilo is not giving me the graphs I need (or rather, I don't know how to generate them :)
[19:15:24] sukhe, this would be OK to me! Maybe dsaez or nuria disagree?
[19:21:02] Ok for me
[19:40:21] (PS2) Joal: Bump spark.version to Spark 2.4.4 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/542226 (https://phabricator.wikimedia.org/T222253) (owner: Ottomata)
[19:46:32] sukhe: sounds great
[19:46:56] sukhe: i think dates on that wiki would be better
[19:47:07] sukhe: i can help you with turnilo too
[19:47:19] sukhe: make sure you have pageview_daily, not hourly
[19:47:31] sukhe: as that second dataset is only 3 months long
[19:55:38] Gone for tonight team
[20:16:54] nuria: thank you! yeah, I tried pageviews_daily but was not getting the result I wanted. I will try again
[20:17:16] sukhe: if you tell me what you need maybe i can help
[20:19:29] nuria: I wanted filter: 10 Jan 2019-20 Jan 2019, country: VE. split by: time and ISP
[20:22:39] I will put all this in text on the wiki in any case, I wanted to add graphs but that's fine. the text will probably be more useful to you to get the data
[20:24:38] sukhe: there is no ISP in data that is less than 90 days old
[20:24:50] sukhe: that bit is deleted after 90 days
[20:25:12] sukhe: it could be kept with the country , but not with pageview title
[20:26:14] nuria: ah! that would explain it and makes sense. ok, that's fine then. the VE example was just one case in which the ISP data would give us a good picture, we have other country-specific examples
[20:26:15] ottomata: did you solve this with luca? https://phabricator.wikimedia.org/T233231#5583801
[20:26:29] I will also check my email to see if I have some graphs from that time, I should have them
[20:27:15] ottomata: cause i think we just need to change the serialize method
[20:27:19] https://www.irccloud.com/pastebin/xV2Yl00f/
[20:27:41] to: json.dumps(str(a,'utf-8'))
[20:27:55] ottomata: .. it might be pointing out the obvious...
[20:29:02] ah nuria no i didn't read that bit of his comment
[20:30:09] ottomata: same thing happened ( cc milimetric ) in wikimetrics like a million years ago no?
[20:31:35] is the utf-8 stuff still even required in python 3?
[20:31:39] i thought it was the default
[21:53:47] ottomata: i think it is "only" needed in python3
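A minimal sketch of the serialization problem discussed in this log, assuming the failing value is a UTF-8 encoded bytes object that reaches json.dumps under Python 3. The helper name to_json and the sample event are hypothetical and are not taken from the EventLogging code; the decode step mirrors the str(a, 'utf-8') suggestion above.

```python
import json


def to_json(value):
    """Serialize value to JSON, decoding bytes as UTF-8 text first.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, bytes):
        # Same effect as value.decode("utf-8").
        value = str(value, "utf-8")
    return json.dumps(value)


# Hypothetical raw event held as bytes.
raw = b'{"schema": "Example"}'

# Under Python 3 the json module refuses bytes outright.
try:
    json.dumps(raw)
except TypeError as exc:
    print("json.dumps on bytes fails:", exc)

# Decoding first makes the value serializable.
print(to_json(raw))

# If a bytes payload is needed afterwards (the log suggests the Kafka
# producer call wants bytes), encode the resulting text explicitly.
payload = to_json(raw).encode("utf-8")
print(type(payload).__name__)
```

On Python 2 a byte string and a text string were largely interchangeable, which would be consistent with the error only surfacing after the move to Python 3; whether this is the exact cause of the production failure is not confirmed in the log.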