[04:08:53] (03CR) 10Nuria: ">Should we also do that for pageviewInfo ?" (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356307 (owner: 10Nuria)
[05:27:03] (03PS10) 10Nuria: [WIP] UDF to tag requests [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021)
[05:27:40] (03CR) 10Nuria: "@mforns: just a rebase, will address your comments on next patch" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021) (owner: 10Nuria)
[08:09:36] 06Analytics-Kanban, 07Easy, 13Patch-For-Review: Don't accept data from automated bots in Event Logging - https://phabricator.wikimedia.org/T67508#3306830 (10Tgr) What about logging from the job queue (which could in theory happen for `PageDeletion` etc when some job creates/moves/deletes pages)? That will pr...
[08:13:46] 10Analytics, 06cloud-services-team: Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3305259 (10Tgr) Background: T67508 broke logging for this schema. Also it's one of two schemas which don't use the standard JS or PHP logg...
[08:53:38] (03CR) 10Joal: [C: 032] "Tested, looks good to me - Merging" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356307 (owner: 10Nuria)
[08:59:40] (03Merged) 10jenkins-bot: Memoize host normalization [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356307 (owner: 10Nuria)
[09:14:41] (03PS1) 10Joal: Correct bug in last-access uniques monthly [analytics/refinery] - 10https://gerrit.wikimedia.org/r/356552
[09:15:15] (03CR) 10Joal: [V: 032 C: 032] "Self merging bug correction for later deploy today." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/356552 (owner: 10Joal)
[09:21:55] 10Quarry, 10Community-Wikimetrics, 10DBA, 10Icinga, and 2 others: Evaluate future of wmf puppet module "mysql" - https://phabricator.wikimedia.org/T165625#3306961 (10jcrespo) Some modules maybe should install wmf-mariadb101-client ?
[10:19:26] joal: people in ops are really happy about the 1/128 webrequests in pivot :)
[10:19:52] elukey: if ops happy then
[10:20:05] * joal is happyyyYYYYYYY
[10:20:37] elukey: I knew if it was to be, data needed to be loaded regularly first :)
[10:20:44] yay
[10:22:33] it has already been used for https://phabricator.wikimedia.org/T166695
[10:22:55] :)
[11:00:47] hi team :] elukey -> I'm here, in case you want to pair
[11:01:34] mforns: o/ - later on this afternoon, I am fighting with puppet and Riccardo left 22 comments :D
[11:01:51] elukey, :] just ping me then
[11:03:22] mforns: deploy time?
[11:03:28] joal, sure!
[11:03:38] what needs to be deployed?
[11:03:41] mforns: there's actually some stuff :)
[11:03:53] mforns: refinery-source first - at least two patches
[11:04:08] saw your reviews
[11:04:19] Then refinery, with new patches to update refine to use new jars to pick up the new pageview definition
[11:04:30] aha
[11:04:42] mforns: I also think we could add the global uniques, if you're ok merging it
[11:04:54] yes, ok by me
[11:05:05] mforns: let's DO IT !
[11:05:29] mmm joal I also can quickly change the code to take care of your comments and add those patches too
[11:05:32] mforns: about the reviews, I think somebody else (possibly andrew) should review the python code
[11:05:37] oh ok
[11:06:00] mforns: just because I'm not a python person, and andrew is the one who's done most of those scripts
[11:06:06] joal, ok ok
[11:06:11] mforns: But we could ask him when he arrives, sounds good
[11:06:34] mforns: And I completely agree on merging oozie monthly banner
[11:07:00] OK cool will merge banner patch, now, if you want I can take care of the deploy by myself, and ping you if needed
[11:07:55] mforns: sounds great
[11:07:56] I mean, I'd also like to pair, but just letting you know that you don't need to stop doing whatever you were doing
[11:08:04] K
[11:08:19] about banners mforns - are you ok with the comments on comments?
[11:08:42] yes, I read them and agree, will change code to follow them
[11:27:16] * elukey lunch!
[11:28:37] (03PS1) 10Mforns: Update changelog.md to v0.0.46 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356568
[11:30:17] (03CR) 10Mforns: [V: 032 C: 032] "Self-merge for refinery-source deployment" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356568 (owner: 10Mforns)
[11:32:25] Hey mforns - Have you started to deploy ?
[11:32:35] joal, just merged the changelog
[11:32:57] mforns: Do you mind if we make the changelog a bit more descriptive for lines 1 and 3 ?
[11:33:32] joal, not at all, I thought the lines should equal the titles in the change sets
[11:34:17] mforns: not really, we use the changeset as a help, but IMO the changelog should contain enough for us to understand :)
[11:34:30] K np
[11:34:48] And here, in lines 1 and 3, I think it lacks the context (like, "in Webrequest" for instance)
[11:35:52] And actually, if you don't mind, since you're correcting the changelog, do you mind going over the last few messages, making sure they are explanatory enough
[11:36:09] mforns: I just see that 0.0.45 also has lines not really complete (IMO)
[11:36:27] will try to improve them too
[11:36:47] mforns: I can help, please let me know :)
[11:37:18] xD it's OK for now, will shout if in need :]
[11:45:46] (03PS1) 10Mforns: Improve changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356571
[11:46:52] (03CR) 10Mforns: [V: 032 C: 032] "Self-merging for refinery-source deployment" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356571 (owner: 10Mforns)
[11:56:15] Hey mforns, I'm sorry I should have been clearer - The change I was expecting was to add context to the current line - For instance: Memoize host normalization in Webrequest.normalizeHost
[11:56:57] Your patch explains the change better, but still misses the part I was expecting, which is contextualisation of the change (the refinery-source package contains lots of stuff)
[11:57:17] joal, I see
[11:57:40] The line on is_productive is way better in that respect, but the one on memoization is not
[11:58:15] and the one on Czech wiki is not really contextualised either
[11:58:23] k, will do
[11:58:41] sorry to bug you mforns :(
[11:59:19] np, that's the way I will do it right next time :D
[12:04:26] (03PS1) 10Mforns: Further improve changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356575
[12:04:34] joal, how about now ^
[12:05:00] \o/
[12:05:19] :], K will merge thx!
[12:05:25] I can now see, looking at the changelog, where the patches apply !
[12:05:28] Thanks again mforns :)
[12:05:38] np
[12:06:10] (03CR) 10Mforns: [V: 032 C: 032] "Self-merging for refinery-source deployment" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/356575 (owner: 10Mforns)
[12:19:36] taking a break a-team
[12:47:53] !log Deployed refinery-source v0.0.46 using jenkins
[12:47:53] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:26:16] (03PS12) 10Mforns: Update banner monthly job to reuse index [analytics/refinery] - 10https://gerrit.wikimedia.org/r/347653 (https://phabricator.wikimedia.org/T159727) (owner: 10Joal)
[13:26:33] (03CR) 10Mforns: Update banner monthly job to reuse index (033 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/347653 (https://phabricator.wikimedia.org/T159727) (owner: 10Joal)
[13:28:38] (03CR) 10Mforns: [V: 032 C: 032] "Self-merging after Joal's approval, after implementing his suggestions." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/347653 (https://phabricator.wikimedia.org/T159727) (owner: 10Joal)
[13:35:37] * elukey sees 12 peer reviews to do
[13:35:45] /o\
[13:49:03] mforns: sorry I need to start reducing the Namely backlog asap otherwise I will not make it for the 12th :(
[13:49:18] AlexZ, np I'm deploying the cluster
[13:49:27] sorry AlexZ, wrong ping
[13:49:39] elukey, np I'm deploying the cluster
[13:57:10] (03PS1) 10Mforns: Bump up jar version in oozie jobs for v0.0.46 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/356588
[13:57:56] (03CR) 10Mforns: [V: 032 C: 032] "Self-merging for refinery deployment (0.0.46)" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/356588 (owner: 10Mforns)
[14:10:03] joal, yt?
[14:17:10] ottomata: hiiiiiiiiii
[14:17:14] do you have a min?
[14:17:35] hiii
[14:17:35] yes
[14:17:59] thanks!
[14:18:29] so zookeeper_cluster_name would need to be moved from hieradata/eqiad/codfw to all the roles that use it
[14:18:58] since as hiera lookups work, to override it (like for druid) we only have the option of a host-specific hiera override
[14:19:20] am I right to say that all the roles using kafka_config() will need to have it ?
[14:19:37] better: will need to have a hiera role override for zookeeper_cluster_name ?
[14:21:16] hmm, i don't think so
[14:21:42] kafka_config() uses the value of zookeeper_cluster_name defined for the specific kafka cluster in the kafka_clusters hiera hash
[14:21:59] so, it will get the zookeeper_cluster_name from that
[14:22:09] then look up the appropriate zk nodes from the global zookeeper_clusters config
[14:22:12] by that name
[14:22:16] not by a globally defined one
[14:23:00] ahhh yes I confused zk_clusters = function_hiera(['zookeeper_clusters'])
[14:23:27] ya
[14:23:29] cluster['zookeeper_cluster_name']
[14:23:31] is grabbing it out of kafka clusters
[14:23:35] gooooooood
[14:23:38] thanks :)
[14:29:27] !log Deployed refinery using scap, then deployed onto hdfs
[14:29:28] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:43:18] ottomata: other issue to solve if you have time
[14:43:27] node /^druid100[123].eqiad.wmnet$/ {
[14:43:27] role(analytics_cluster::druid::worker,
[14:43:27] analytics_cluster::hadoop::client,
[14:43:29] analytics_cluster::druid::zookeeper)
[14:43:58] so ::worker and ::zookeeper now have zk_cluster_name = druid-eqiad
[14:44:14] meanwhile the second one (hadoop::client) uses main-eqiad (that should be right no?)
[14:44:43] but I get an error from the pcc saying that druid1001 with my change has conflicting zookeeper_cluster_name values
[14:46:10] (03CR) 10Ottomata: "I've got another thought about this new .tag package." (033 comments) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021) (owner: 10Nuria)
[14:47:10] oooo
[14:47:14] interesting
[14:47:30] elukey: where is the conflicting value from?
[14:48:06] is there a zk_cluster_name defined for the hadoop/client role?
[14:48:25] ottomata: no no sorry it is part of https://gerrit.wikimedia.org/r/#/c/354449
[14:48:36] yes yes
[14:48:42] but i don't think hadoop client needs a zk
[14:48:43] OHHh
[14:48:46] but in puppet it does
[14:48:49] since hadoop client is the base class
[14:48:59] for hadoop master, and configs are defined there
[14:49:01] hmmm
[14:49:14] yargh, elukey i think we shouldn't use the global zookeeper_cluster_name to configure druid then
[14:49:40] that is really just intended to be a default
[14:50:08] this works for the kafka stuff i'm doing because just like we talked about, the name is configured specifically for a kafka cluster
[14:50:20] we need to do the same here. not saying we should make a druid_clusters hash like kafka
[14:50:21] but
[14:50:26] probably the param needs to be specific
[14:50:55] ah so something like zookeeper_cluster_name_something
[14:51:31] $zookeeper_cluster_name = hiera('profile::druid::zookeeper_cluster_name') (if we had a druid profile)
[14:51:54] yeah
[14:51:58] right now it is in role::analytics_cluster::druid::common
[14:52:07] so for this patch i guess you can just change it to that scope
[14:52:09] ?
[14:52:16] or even for now
[14:52:20] druid::zookeeper_cluster_name
[14:52:21] if you like?
[14:52:22] dunno
[14:52:51] the main issue is https://gerrit.wikimedia.org/r/#/c/354449/21/modules/role/manifests/analytics_cluster/druid/zookeeper.pp
[14:53:04] that reuses the zk server profile
[14:53:11] but requests zk_cluster_name :(
[14:53:59] but good news is, druid is the last problem for the refactoring
[14:54:05] all the rest is a no-op
[14:58:55] oh
[14:58:58] OH
[14:58:59] hm
[14:59:54] oh elukey but if you don't set zookeeper_cluster_name specifically and differently for druid hosts
[14:59:57] then it should be ok?
[15:00:05] hopefully you can make
[15:00:20] ping ottomata standup
[15:00:22] role/analytics_cluster/druid/zookeeper.yaml and have druid::zookeeper_cluster_name
[15:00:23] ah!
[15:00:24] coming!
[15:04:55] 06Analytics-Kanban, 13Patch-For-Review: Improve processing of host on refine step on Webrequest.java - https://phabricator.wikimedia.org/T166628#3307835 (10Nuria) a:03Nuria
[15:11:21] 06Analytics-Kanban: Improve Oozie error emails for testing - https://phabricator.wikimedia.org/T161619#3307864 (10Nuria)
[15:11:27] 06Analytics-Kanban: Improve Oozie error emails for testing - https://phabricator.wikimedia.org/T161619#3137344 (10Nuria) https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Oozie
[15:20:11] 10Analytics, 10Pageviews-API: Endpoint for average view rate in Pageview API - https://phabricator.wikimedia.org/T162933#3307878 (10Halfak) @Hall1467 & @DarTar FYI, this will make our work for entity usage (view rates) much easier. But it's not coming soon, so this is just an FYI.
[15:28:05] 10Analytics-Tech-community-metrics, 06Developer-Relations (Apr-Jun 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3307909 (10Aklapper) > I'm afraid you are using the latest sortin...
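The lookup pattern ottomata describes above (a Kafka cluster names its own ZooKeeper cluster, and the hosts are resolved from the global zookeeper_clusters hash by that name, not from a globally defined zookeeper_cluster_name) can be sketched as a small simulation. This is an editor's illustration only: the real kafka_config() is a Puppet/Ruby parser function, and all hostnames and hash contents below are made up.

```python
# Illustrative-only simulation of the hiera lookup pattern described above.
# Each Kafka cluster carries its own zookeeper_cluster_name; ZK hosts are
# resolved from the global zookeeper_clusters hash by that name.
# All data below is invented for the example.

zookeeper_clusters = {
    "main-eqiad":  ["conf1001.example", "conf1002.example", "conf1003.example"],
    "druid-eqiad": ["druid1001.example", "druid1002.example", "druid1003.example"],
}

kafka_clusters = {
    "main-eqiad": {
        "zookeeper_cluster_name": "main-eqiad",
        "brokers": ["kafka1001.example", "kafka1002.example"],
    },
}

def kafka_config(cluster_name):
    """Resolve a Kafka cluster's ZK hosts via the cluster's own
    zookeeper_cluster_name, per-cluster rather than global."""
    cluster = kafka_clusters[cluster_name]
    zk_name = cluster["zookeeper_cluster_name"]  # scoped to this cluster
    return {
        "brokers": cluster["brokers"],
        "zookeeper_hosts": zookeeper_clusters[zk_name],
    }

print(kafka_config("main-eqiad")["zookeeper_hosts"])
```

The same scoping idea is what the druid discussion converges on: a role-scoped key like druid::zookeeper_cluster_name instead of overriding the global default.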
[15:31:32] 10Analytics-Tech-community-metrics, 06Developer-Relations (Apr-Jun 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3307924 (10Albertinisg) Great!! I'll wait for that and keep you u...
[15:39:21] fdans: grooskin
[15:39:30] sorry!
[15:41:10] 06Analytics-Kanban: Provide top domain and data to truly test superset - https://phabricator.wikimedia.org/T166689#3307980 (10Nuria)
[15:56:09] 06Analytics-Kanban, 07Easy, 13Patch-For-Review: Don't accept data from automated bots in Event Logging - https://phabricator.wikimedia.org/T67508#3308032 (10Tgr) Re IRC question: for the MediaWikiPingback schema, if the UA is not recorded, it won't be missed. All the information that could be possibly learne...
[17:14:38] a-team: I may have corrupted db1047 while stopping mysql (alter table in progress and I didn't know)
[17:16:04] ah, didn't realize i wasn't on IRC sorry!
[17:16:06] elukey: what's up?
[17:16:29] nothing, I stopped mysql on db1047 while an alter table was running
[17:16:40] classic mistake of people not working on dbs
[17:17:04] oo
[17:17:25] yeah, the work was scheduled and it didn't occur to me that an alter table was running
[17:17:32] occur to think that
[17:17:58] people, Jaime asked if we are interested in https://mariadb.com/products/technology/columnstore
[17:19:51] hmm, interesting, elukey is it distributed?
[17:20:21] no idea, Jaime is the SME
[17:20:23] :)
[17:25:38] ottomata: if you have time - https://puppet-compiler.wmflabs.org/6637/druid1001.eqiad.wmnet/
[17:25:59] so it seems to me that the hadoop clients on druid* nodes are using druid's zookeeper
[17:26:02] not main-eqiad
[17:26:05] is it intended?
[17:27:32] elukey: not intended, but it doesn't matter
[17:27:38] all hadoop nodes, including clients,
[17:27:40] share the same config
[17:27:46] but not all use the config
[17:27:55] those zk settings are only used by the master
[17:28:08] so, this change is good, since it makes the configs match
[17:28:13] but effectively won't do anything
[17:28:31] ottomata: sure sure, I wanted to make sure that the change was fine
[17:28:36] ya should be good :)
[17:28:36] then the refactoring seems good now
[17:28:55] lemme run pcc also for the hadoop master nodes
[17:28:57] JUST IN CASE
[17:32:55] INFO: Compilation failed for hostname analytics1001.eqiad.wmnet with the change. noooooo
[17:33:06] * elukey cries in a corner
[17:33:21] surely I need to add the zk cluster name for it too
[17:48:18] * fdans uses the situation to hug elukey
[18:18:19] ottomata: it seems that db1047 is ok, Riccardo helped a bit but we are not seeing anything weird
[18:18:52] eventlogging_sync is running too
[18:19:54] the alter table seemed not to have big effects, especially since mysql stopped cleanly
[18:25:46] ottomata: joal: If we want to schedule candidates for you to interview for the research scientist position, should we ask for 30 min or 45 min of your time? I'm asking cuz I thought one of you said the other time that 30 min was too short for you? Obviously, whatever works for you.
[18:29:24] joal: ottomata: I requested 45 min. If you end up needing only 30 min, that's fine, too.
[18:29:36] thanks much for your help. /me heads out to a lunch meeting.
[18:37:27] 10Analytics, 10Analytics-EventLogging: whitelist multimedia and upload wizard tables - https://phabricator.wikimedia.org/T166821#3308728 (10JKatzWMF)
[18:38:00] a-team: logging off, db1047 seems fine, just sent an email to recap what's happened. I am going to be off tomorrow for a public holiday, but I'll read IRC/emails so ping me if db1047 gives you headaches
[18:38:04] * elukey off
[18:51:13] 10Analytics, 10Analytics-EventLogging: whitelist mobileapp reading list schema - https://phabricator.wikimedia.org/T166823#3308784 (10JKatzWMF)
[20:14:43] 10Analytics, 10Analytics-Cluster, 06Operations, 10Traffic, 15User-Elukey: Encrypt Kafka traffic, and restrict access via ACLs - https://phabricator.wikimedia.org/T121561#3309109 (10Ottomata)
[20:14:45] 10Analytics, 10Analytics-Cluster: Import Kafka messages into HDFS authenticating with TLS/SSL - https://phabricator.wikimedia.org/T166832#3309097 (10Ottomata)
[20:15:26] 10Analytics, 06Discovery, 10EventBus, 10Wikidata, and 3 others: Create reliable change stream for specific wiki - https://phabricator.wikimedia.org/T161731#3309114 (10Smalyshev)
[20:18:50] 10Analytics, 10Analytics-Cluster: Produce webrequests from varnishkafka to Kafka with Kafka message timestamp set to configurable content field - https://phabricator.wikimedia.org/T166833#3309125 (10Ottomata)
[20:19:20] 10Analytics-Cluster, 06Analytics-Kanban: Provision new Kafka cluster(s) with security features - https://phabricator.wikimedia.org/T152015#3309138 (10Ottomata)
[20:19:22] 10Analytics, 10Analytics-Cluster: Produce webrequests from varnishkafka to Kafka with Kafka message timestamp set to configurable content field - https://phabricator.wikimedia.org/T166833#3309137 (10Ottomata)
[20:19:55] 10Analytics, 10Analytics-Cluster: Produce webrequests from varnishkafka to Kafka with Kafka message timestamp set to configurable content field - https://phabricator.wikimedia.org/T166833#3309125 (10Ottomata) p:05Triage>03Low
[20:25:05] 10Analytics, 10Analytics-Cluster: Import Kafka messages into HDFS authenticating with TLS/SSL - https://phabricator.wikimedia.org/T166832#3309144 (10Ottomata) Until we solve this problem we won't be able to disable PLAINTEXT Kafka use. That means we won't solve the current problem that anyone in prod can cons...
[20:25:37] ottomata: yt?
[20:26:26] 10Analytics-Cluster, 06Analytics-Kanban, 13Patch-For-Review: Update puppet for new Kafka cluster and version - https://phabricator.wikimedia.org/T166162#3287005 (10Ottomata)
[20:26:32] yaaa
[20:27:36] nuria_: what's up?
[20:27:56] ottomata: on the "strongly but respectfully" disagreement of today
[20:28:13] ottomata: i disagree with your comment about tags
[20:28:19] 10Analytics, 06Discovery, 10EventBus, 10Wikidata, and 3 others: Create reliable change stream for specific wiki - https://phabricator.wikimedia.org/T161731#3309158 (10Smalyshev)
[20:28:24] ottomata: regarding moving that to Webrequest.getTags()
[20:29:12] hehe
[20:29:13] i thought you might
[20:29:16] tell me why
[20:32:16] ottomata:
[20:32:18] 10Analytics, 06Discovery, 10EventBus, 10Wikidata, and 3 others: Create reliable change stream for specific wiki - https://phabricator.wikimedia.org/T161731#3309165 (10Nuria) From meeting: * @Smalyshev can consume from either kafka or event stream once we add the ability to consume from a given point in ti...
[20:32:36] ottomata: There is not an entity that is a webrequest. As our code stands now WebrequestData is a POJO (immutable, no behaviour) and Webrequest.java is just a utility class with a pretty name; it is mostly static and thus not properly OO. Even if we had a proper entity for Webrequest and we wanted to co-locate data and behaviour, in this case I do not think it would be a fit: tags are an attribute of the data and thus external to it. If we want to remove tags entirely we will just remove the chain; there should not be any modifications to the webrequest entity.
[20:32:51] makes sense?
[20:33:23] While there is a case to be made for co-locating data and behaviour, this is not behaviour at all, it is an external quality
[20:35:39] ^ ottomata
[20:38:47] 10Analytics, 06Discovery, 10EventBus, 10Wikidata, and 3 others: Create reliable change stream for specific wiki - https://phabricator.wikimedia.org/T161731#3309191 (10Smalyshev) As the result of the discussion, we've arrived at the following conclusion: * After we have a Kafka version installed that allows to...
[20:49:48] nuria_: i buy your argument about the fact that we don't have a good webrequest model class
[20:49:58] but, i don't see why this method couldn't go in webrequest at all
[20:50:13] there are other 'utility' methods in there, that take webrequest fields (or your new pojo class) and return something useful
[20:50:25] ottomata: because tags are external to the entity (if we were to have any)
[20:50:56] ottomata: removal of the tagging code should not imply any changes to entity code
[20:51:09] (if we were to have an entity such as webrequest)
[20:51:29] hmm
[20:51:39] i was going to argue about this vs UA parser
[20:51:42] but UA parser is not in webrequest
[20:51:47] as in, it is not used by webrequest
[20:52:24] nuria_: part B of my comment is more contentious clearly, i don't think i want to argue that one fully, especially since we don't have a good webrequest model class
[20:52:34] but, what do you think about making this new package be .webrequest instead of .tag
[20:52:34] ?
[20:52:40] since these tags are webrequest specific?
[20:52:49] refinery.core.tag sounds really generic
[20:52:56] mmm
[20:53:40] wait, they are request specific because the getTags interface takes as input a webrequest class
[20:54:39] but getTags(request some) would not be specific
[20:55:22] ok, TagChain is webrequest specific though
[20:55:36] and, your reflection looks for anything that is core.tag
[20:56:42] so, everything in core.tag.* has to take a WebrequestData instance
[20:57:12] t.getTag(webrequest);
[20:57:12] ottomata: ya, that is true the way the interface is now
[20:58:23] ottomata: it really should be public class PortalTagger implements Tagger but if so the portal tagger would need to be in a webrequest package
[20:58:25] as is now, all implemented Tag classes MUST have a getTag(WebrequestData data)
[20:58:29] that i concede
[20:59:09] agree, but even TagChain is webrequest specific
[20:59:32] ottomata: ya, but that is because we do not have more types, it is easy to change to generics so the tagger chain is generic
[20:59:41] ottomata: the portal tagger however is specific
[20:59:54] but you can't though, because it is including ALL classes in getTypesAnnotatedWith(org.wikimedia.analytics.refinery.core.tag.Tag.class
[21:00:08] so, ANY tag class will need to work with webrequest
[21:00:31] oh you would change the tagchain
[21:00:40] so that it doesn't work with webrequest?
[21:00:57] public List handleRequest( data){
[21:01:00] ottomata: so it doesn't require a webrequest type
[21:01:06] public class TagChain
[21:01:07] ?
[21:01:31] ottomata: but i did not do that cause it seems overkill
[21:01:34] but, nuria_ you are specifically looking at webrequest fields in TaggerChain
[21:01:36] TagChain*
[21:01:53] in Portaltagger yes
[21:01:58] no
[21:02:00] in TagChain
[21:02:02] https://gerrit.wikimedia.org/r/#/c/353287/10/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/tag/TagChain.java
[21:02:08] assert webrequest.getUriHost()!=null: webrequest;
[21:02:13] // only pass to taggers healthy requests
[21:02:13] if(Webrequest.SUCCESS_HTTP_STATUSES.contains(webrequest.getHttpStatus()) &&
[21:02:41] nuria_: it all seems a little overkilly, i like the reflection to find the tag classes
[21:02:45] cause then we don't need configuration for it
[21:02:52] but, maybe we should stop here and make the whole thing webrequest specific
[21:03:38] ottomata: ok, conceding on that one, will move tagging classes to a webrequest package cause we do not need it to be more generic now (it is easy to do that, though)
[21:03:48] aye
[21:03:48] ottomata: but agree, a bit overkill
[21:04:07] +1 ok nuria_
[21:04:08] thanks
[21:04:22] (03CR) 10Ottomata: [WIP] UDF to tag requests (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021) (owner: 10Nuria)
[21:04:57] ottomata: ok, changes on the way, will add some tests
[21:11:15] https://www.reddit.com/r/IAmA/comments/6epiid/im_the_principal_research_scientist_at_the/
[21:11:24] haha, i can't help but read all of halfak's comments in his voice
[21:14:17] :) me too
[21:14:25] he writes just like he sounds
[21:26:08] (03CR) 10Nuria: "> It would be cleaner from my PoV to be able to have an instance of a webrequest and call >webrequest.getTags()." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021) (owner: 10Nuria)
[21:33:25] ottomata: "src/main/java/org/wikimedia/analytics/refinery/core/webrequest/tag/" correct?
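The tagger-chain pattern being debated above can be sketched compactly. The real patch is Java and discovers taggers via a @Tag annotation plus classpath reflection (getTypesAnnotatedWith); in this editor's Python sketch a registration decorator stands in for that, and every class name, field, and status value is invented for illustration, not taken from the actual refinery code.

```python
# Illustrative-only sketch of the webrequest tagger-chain pattern discussed
# above. A decorator stands in for the Java @Tag annotation + reflection;
# all names and fields below are made up.

TAGGERS = []

def tag(cls):
    """Register a tagger class, standing in for the @Tag annotation."""
    TAGGERS.append(cls())
    return cls

class WebrequestData:
    """Stand-in for the immutable WebrequestData POJO."""
    def __init__(self, uri_host, http_status):
        self.uri_host = uri_host
        self.http_status = http_status

@tag
class PortalTagger:
    # Every registered tagger must accept a WebrequestData instance,
    # which is why the chain ends up being webrequest specific.
    def get_tag(self, webrequest):
        return "portal" if webrequest.uri_host == "www.wikimedia.org" else None

class TagChain:
    SUCCESS_HTTP_STATUSES = {"200", "304"}

    def handle_request(self, webrequest):
        assert webrequest.uri_host is not None
        # only pass healthy requests to the taggers
        if webrequest.http_status not in self.SUCCESS_HTTP_STATUSES:
            return []
        tags = [t.get_tag(webrequest) for t in TAGGERS]
        return [t for t in tags if t is not None]

chain = TagChain()
print(chain.handle_request(WebrequestData("www.wikimedia.org", "200")))  # ['portal']
```

The trade-off in the conversation is visible here: discovery-by-registration avoids configuration, but it forces a single getTag(WebrequestData) shape on every tagger, tying the whole chain to webrequests.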
[21:54:21] nuria_: sounds fine, but maybe even just a webrequest-level package is ok?
[21:54:26] oh
[21:54:27] you do
[21:54:29] because reflection
[21:54:31] ok ya +1 nuria
[21:54:46] you need .tag for reflection
[21:56:11] 10Analytics-Tech-community-metrics: Usable links for specific users or repositories - https://phabricator.wikimedia.org/T164934#3309388 (10Nemo_bis) I find the small icon hard to locate and unintuitive. I suggest filing a separate task about that. I still don't understand why the URLs need to be so long. For t...
[22:17:03] ottomata: k
[23:39:06] Why do some pageview queries for real pages return a 404? For example: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/fr.wikisource/all-access/user/Voyages%2C_aventures_et_combats%2FChapitre_26/daily/2017041200/2017053100
[23:39:13] versus https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/fr.wikisource/all-access/user/Voyages%2C_aventures_et_combats%2FChapitre_22/daily/2017041200/2017053100
[23:39:30] both of those are for redirects:
[23:39:31] https://fr.wikisource.org/w/index.php?title=Voyages,_aventures_et_combats/Chapitre_26&redirect=no
[23:39:39] https://fr.wikisource.org/w/index.php?title=Voyages,_aventures_et_combats/Chapitre_22&redirect=no
[23:40:16] but 26 (among others) gives a 404, while 22 gives a response (with zero views)
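On the closing question: the Pageview API returns 404 when it has no data rows at all for the exact encoded title in the requested range, so titles never recorded in the pageview tables 404 while titles with rows (even all-zero) return data. Since these wikisource subpage titles contain "/" and ",", correct percent-encoding matters; here is an editor's sketch of how such URLs are typically built. The endpoint shape is taken from the URLs quoted above, but the helper function itself is hypothetical.

```python
# Hypothetical helper for building the per-article pageview URLs quoted above.
# The article title must be fully percent-encoded so a "/" inside a subpage
# title (e.g. ".../Chapitre_26") becomes %2F rather than a path separator.
from urllib.parse import quote

PAGEVIEWS = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
             "{project}/{access}/{agent}/{title}/{granularity}/{start}/{end}")

def per_article_url(project, title, start, end,
                    access="all-access", agent="user", granularity="daily"):
    # safe="" forces "/" and "," in the title to be encoded too
    return PAGEVIEWS.format(project=project, access=access, agent=agent,
                            title=quote(title.replace(" ", "_"), safe=""),
                            granularity=granularity, start=start, end=end)

url = per_article_url("fr.wikisource",
                      "Voyages, aventures et combats/Chapitre_26",
                      "2017041200", "2017053100")
print(url)
```

Run against the Chapitre_26 example, this reproduces exactly the URL in the question, so the 404 there is not an encoding problem but an absence of data for that title.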